The term 'machine learning' is everywhere: in the media, in education and in local businesses. The field can seem overwhelming, but you may be surprised by how much of it already surrounds us.
The definition is largely in the name: machine learning is the capability of a machine to learn without being explicitly programmed. In practice, that means a computer uses data to improve its performance at a specific task as it gains experience.
Tom Mitchell provides a very useful definition:
"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."
Take the most common example, playing checkers. E is the experience of playing many games of checkers, T is the task of playing checkers and P is the probability that the program will win the next game. Checkers may not seem like the most advanced of challenges, but back in 1962 a checkers program running on an IBM 7094 played against self-proclaimed checkers master Robert Nealey and won. The same ability to learn from large amounts of historical data is what makes applications such as face detection, email filtering, medical diagnosis and weather prediction increasingly accurate.
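To make E, T and P concrete, here is a minimal Python sketch. It does not play checkers; it uses a much simpler, hypothetical task (guessing a hidden exam pass mark), and the data, function names and learning rule are all assumptions made up for illustration. The point is only that the performance measure P goes up as the program sees more experience E.

```python
# Hypothetical toy task T: decide whether an exam score is a pass.
# The true pass mark is hidden from the learner (assumed value).
TRUE_PASS_MARK = 50

# Experience E: made-up labelled examples (score, passed), seen in this order.
EXPERIENCE = [(5, False), (75, True), (30, False), (60, True), (45, False), (52, True)]


def learn_pass_mark(examples):
    """Estimate the pass mark as the midpoint between the highest
    failing score and the lowest passing score seen so far."""
    highest_fail = max((s for s, passed in examples if not passed), default=0)
    lowest_pass = min((s for s, passed in examples if passed), default=100)
    return (highest_fail + lowest_pass) / 2


def accuracy(estimated_mark):
    """Performance measure P: fraction of scores 0..99 classified correctly."""
    scores = range(100)
    correct = sum((s >= estimated_mark) == (s >= TRUE_PASS_MARK) for s in scores)
    return correct / len(scores)


# Measure P after 2, 4 and 6 examples of experience.
results = [accuracy(learn_pass_mark(EXPERIENCE[:n])) for n in (2, 4, 6)]
print(results)  # [0.9, 0.95, 0.99]
```

With this particular data the estimate gets closer to the hidden pass mark each time, so accuracy rises. With other data a simple rule like this one can get temporarily worse before it improves.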
The two main types of machine learning algorithm are supervised and unsupervised. Supervised learning is when you have both the input data and the answer (the label) for each example, and you want to learn how the input leads to that answer, so that you can predict the answer for new inputs. Supervised learning is further grouped into classification and regression. Classification assigns each input to a category. It can be binary (0 or 1, yes or no, spam or not spam) or multiclass, like recognising characters in an image recognition system (1, 2, 3, 4, 5, 6, 7, 8, 9). Regression, on the other hand, predicts a continuous value, for example the price a house is likely to sell for given features such as its size and location. The objective of supervised learning is to build a model that generalises: one that stays accurate on new data it was not trained on.
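As a rough illustration of regression and classification, the sketch below uses plain Python and made-up data. The techniques (a least-squares straight line and a 1-nearest-neighbour classifier) are simply two easy-to-read choices, and the feature names and numbers are assumptions, not real figures.

```python
# Hypothetical labelled data: every input comes with its known answer.

# Regression: house floor area (square metres) -> sale price (made-up numbers).
areas = [50, 80, 100, 120]
prices = [70_000, 100_000, 120_000, 140_000]


def fit_line(xs, ys):
    """Ordinary least squares for one feature: y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(areas, prices)
predicted_price = slope * 90 + intercept  # estimate for an unseen 90 m^2 house

# Classification: (number of links, number of exclamation marks) -> label.
emails = [((0, 0), "not spam"), ((1, 0), "not spam"),
          ((7, 5), "spam"), ((9, 8), "spam")]


def classify(features, labelled):
    """1-nearest-neighbour: copy the label of the closest training example."""
    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    _, label = min(labelled, key=lambda item: squared_distance(item[0], features))
    return label


print(predicted_price)            # 110000.0
print(classify((8, 6), emails))   # spam
print(classify((1, 1), emails))   # not spam
```

In both cases the model is built from examples with known answers and then asked about an input it has not seen, which is the generalisation that supervised learning aims for.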
Unsupervised learning, by contrast, is when we have only the input data and no answers. The goal is to model the underlying structure or distribution of the data in order to learn more about it. Unsupervised learning can be further grouped into clustering and association problems. A clustering problem is where you want to discover inherent groupings in the data, such as grouping customers by purchasing behaviour. An association rule learning problem is where you want to discover rules that describe large portions of your data, such as people who buy X also tend to buy Y.
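Both ideas can be sketched in a few lines of Python. The example below is a simplified illustration on made-up data: a one-dimensional k-means with two clusters stands in for clustering, and a single "confidence" calculation stands in for association rule learning. The function names and numbers are assumptions chosen for readability.

```python
# Hypothetical unlabelled data: there are no answers, only inputs.

# Clustering: monthly spend per customer (made-up numbers).
monthly_spend = [10, 12, 15, 200, 220, 240]


def two_means(values, iterations=10):
    """1-D k-means with k=2, starting from the smallest and largest value.
    Assumes the values are not all identical."""
    centroids = [min(values), max(values)]
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for v in values:
            # Assign each value to its nearest centroid.
            nearest = 0 if abs(v - centroids[0]) <= abs(v - centroids[1]) else 1
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) for c in clusters]
    return clusters


clusters = two_means(monthly_spend)

# Association: shopping baskets (made-up items).
baskets = [{"bread", "butter"}, {"bread", "butter", "milk"},
           {"bread", "milk"}, {"milk"}]


def confidence(all_baskets, x, y):
    """Of the baskets containing x, the fraction that also contain y."""
    with_x = [b for b in all_baskets if x in b]
    return sum(y in b for b in with_x) / len(with_x)


print(clusters)                                 # [[10, 12, 15], [200, 220, 240]]
print(confidence(baskets, "bread", "butter"))   # 2 of 3 bread baskets have butter
```

Nothing told the program which customers are low or high spenders, or that bread and butter go together; both patterns come from the data alone.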
Anaeko's mission is to enable data insights that deliver better public services to citizens. We specialise in finding the best model for your data problem, and we strive to find insights within data and use them to improve public services. If you want more information about our data services, feel free to get in touch!