Machine Learning Explained by Ahmad Najjar
The minute we hear the term “machine learning”, many thoughts rush to mind, from high-tech robotics and sophisticated algorithms to sci-fi movies (of course!).
Well, we might be partially right if the plan is to assemble a robot or reinvent the self-driving car, but that is no longer the whole story these days!
In recent years, machine learning has become widely used in business scenarios and, to be fair, it is even becoming mainstream! Predictive machine learning models and sentiment analysis services are being used in our day-to-day tools and utilities. For example, there are off-the-shelf SaaS sentiment analysis services that tell whether a tweet from a customer is positive feedback (happy customer) or negative feedback (unhappy customer).
Another off-the-shelf example is face recognition: we feed the service a couple of face photos of a person, then give it a different photo of the same person, and the service in turn tells (predicts) whether it is the same person or not.
Before jumping into the meat of this article, we need a short introduction to how machine learning models are prepared and then built. Preparation is by far the most important step in building a machine learning model, since it can make the difference between a model that predicts correctly and one that does not!
The first step is preparing the data we are going to introduce to the model (hereinafter the dataset). The dataset must have the following characteristics (the face recognition SaaS example above is a good one for explaining them):
- Relevant data: the data must not include a photo of the person’s hand or foot (for example), since such photos are not relevant to the face in any way, right!?
- Connected data: all important facial features must be clear and present in the photos we feed in. E.g. photos showing only half of the face are considered disconnected data.
- Enough data: the machine learning model needs to be introduced to a considerable amount of data in order to be accurate and decisive. The face recognition service asks for 3 to 4 pictures of the person it’s trying to recognize (predict).
Once the dataset is ready, we need to figure out which problem we are trying to tackle or, in other words, which result(s) the machine learning model is expected to predict. We can distinguish four main types of machine learning algorithms, depending on the problem we are trying to tackle:
- Classification: the problem of identifying which of a set of categories a new observation belongs to, based on a training set of observations whose category membership is known. There are two types of classification:
  - Two-class: identifying which of two categories a new observation belongs to. E.g. is someone telling the truth or not? OR is that someone’s face or not?
  - Multi-class: identifying which of three or more categories a new observation belongs to. E.g. which species of iris are we most likely looking at, given the flower’s features?
- Regression: the problem of predicting a value, usually numerical, on a continuum for a new observation. E.g. what is the price of a house, given its post code and size? OR what is the price of a diamond, given its carat weight?
- Anomaly Detection: the problem of identifying observations which do not conform to an expected pattern or to the other items in a dataset. E.g. card fraud, where we have a huge number of legitimate transactions in comparison to fraudulent ones.
- Clustering: the problem of grouping a set of observations in such a way that observations in the same group are more similar to each other than to those in other groups. E.g. shopping patterns of customers on a website OR entertainment recommendations (recommended movies, series and shows) based on usage.
The first three types are generally approached as supervised machine learning (although anomaly detection can also be done without labels), while the last one is unsupervised machine learning.
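To make one of these problem types concrete, here is a minimal regression sketch in Python. It is only an illustration under stated assumptions: the house sizes and prices are made up, the helper names `fit_line` and `predict_price` are hypothetical, and a single input (size) stands in for the richer inputs mentioned above.

```python
# Made-up illustrative data: house size in square metres and its price.
sizes = [50.0, 70.0, 90.0, 110.0]
prices = [150_000.0, 210_000.0, 270_000.0, 330_000.0]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(sizes, prices)


def predict_price(size):
    """Predict a value on a continuum (a price) for a new observation."""
    return slope * size + intercept


print(predict_price(100.0))
```

The output here is a number on a continuum rather than a category, which is what separates regression from classification.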
Unsupervised machine learning
Unsupervised machine learning is the task of inferring a pattern that describes hidden structure in “unlabeled” data. Since the examples given to the learner are unlabeled, there are no known correct answers against which to evaluate the accuracy of the structure that the algorithm outputs, which is one way of distinguishing unsupervised learning from supervised learning and reinforcement learning.
One of the most widely used algorithms for unsupervised machine learning is k-means. K-means aims to partition n observations into k clusters, in which each observation belongs to the cluster with the nearest mean; that mean serves as a prototype of the cluster.
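The idea behind k-means can be sketched in a few lines of plain Python. This is a simplified illustration, not a production implementation: the function name `kmeans` and the sample points are made up, and for reproducibility the sketch assumes the first k points are used as the starting means (real libraries usually pick starting points randomly or with smarter heuristics).

```python
import math


def kmeans(points, k, iterations=100):
    """Alternate between assigning points and updating cluster means."""
    # Assumption for reproducibility: start from the first k points.
    centroids = [tuple(p) for p in points[:k]]
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest mean.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each mean becomes the average of its assigned points.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                # An empty cluster keeps its previous mean.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: the means stopped moving
        centroids = new_centroids
    return centroids, assignments


# Made-up observations forming two visibly separate groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, assignments = kmeans(points, k=2)
print(centroids)
print(assignments)
```

Note that no labels are passed in anywhere: the groups emerge only from the distances between the observations.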
Supervised machine learning
Supervised machine learning is the task of inferring a pattern from labeled training data. The training dataset consists of a set of training examples. Each example is a pair consisting of an input object and a desired output value. A supervised learning algorithm analyzes the training data and produces an inferred pattern, which can be used for mapping new examples.
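As a rough illustration of those (input, desired output) pairs, the sketch below uses a nearest-mean classifier, which is just one simple technique chosen for brevity. The measurements, the labels “small” and “large”, and the function names `train` and `predict` are all made up for this example.

```python
import math
from collections import defaultdict

# Each training example is a pair: (input object, desired output value).
# The numbers and labels are made up for illustration.
training_examples = [
    ((1.4, 0.2), "small"),
    ((1.5, 0.3), "small"),
    ((4.7, 1.4), "large"),
    ((4.5, 1.5), "large"),
]


def train(examples):
    """Infer a pattern: the mean input vector for each label."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(model, features):
    """Map a new example to the label whose mean is closest."""
    return min(model, key=lambda label: math.dist(features, model[label]))


model = train(training_examples)
print(predict(model, (1.6, 0.25)))
```

The dictionary returned by `train` plays the role of the inferred pattern, and `predict` is the mapping of a new example onto it.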
And this is how supervised machine learning is carried out…
Once the dataset is ready, it is split into two sets:
- Training data: the data which the chosen machine learning algorithm absorbs and analyzes in order to build the model.
- Test data: this data is held back from training and works as a benchmark, to check whether the result(s) predicted by the trained model is (are) correct.
A commonly used ratio for splitting the data is 70% (training) to 30% (test); however, the right split also depends on the algorithm used and/or the results/predictions we are trying to produce.
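A 70/30 split could be sketched as follows. The helper name `train_test_split` and the fixed seed are assumptions made for this illustration; shuffling before cutting is a common precaution so that any ordering in the original data does not leak into the split.

```python
import random


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of the rows, then cut it into training and test sets."""
    shuffled = list(rows)
    # A fixed seed (an assumption here) keeps the split reproducible.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Ten stand-in observations; any list of examples works the same way.
dataset = list(range(10))
training_data, test_data = train_test_split(dataset)
print(len(training_data), len(test_data))
```

Changing `train_fraction` is all it takes to try a different ratio when the algorithm or the problem calls for one.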
The training process takes place at this point: the training data is fed into a training algorithm chosen according to the problem, and the algorithm arrives at a learned pattern, which can then be tested against the test data we held back. This testing process is called scoring.
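Here is a minimal sketch of training followed by scoring, assuming a two-class problem with a single numeric feature. The data is made up, and the “algorithm” (a threshold halfway between the two class means) is a deliberately simple stand-in; `train`, `predict` and `score` are hypothetical names.

```python
# Made-up labeled data: (feature value, label), already split into two sets.
training_data = [
    (1.0, "no"), (2.0, "no"), (3.0, "no"),
    (7.0, "yes"), (8.0, "yes"), (9.0, "yes"),
]
test_data = [(1.5, "no"), (2.5, "no"), (7.5, "yes"), (4.0, "yes")]


def train(examples):
    """Learn a threshold halfway between the two class means."""
    no_values = [x for x, label in examples if label == "no"]
    yes_values = [x for x, label in examples if label == "yes"]
    no_mean = sum(no_values) / len(no_values)
    yes_mean = sum(yes_values) / len(yes_values)
    return (no_mean + yes_mean) / 2


def predict(threshold, x):
    """Apply the learned pattern to a single observation."""
    return "yes" if x > threshold else "no"


def score(threshold, examples):
    """Fraction of held-back examples that are predicted correctly."""
    correct = sum(1 for x, label in examples if predict(threshold, x) == label)
    return correct / len(examples)


threshold = train(training_data)
accuracy = score(threshold, test_data)
print(threshold, accuracy)
```

The model never sees `test_data` during training, which is why its score on that set says something about how it would handle new observations.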
This process can be repeated many times, and the results can even be compared with those of other algorithms, until the desired (most accurate) predictions are produced, in a process called evaluation.
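Comparing algorithms on the same held-back data might look like the sketch below. Everything in it is an assumption made for illustration: the data is made up, and the two candidates (a majority-label baseline and a simple threshold rule) were chosen only because they are short to write.

```python
from collections import Counter

# Made-up labeled data: (feature value, label).
training_data = [
    (1.0, "no"), (2.0, "no"), (3.0, "no"), (4.0, "no"),
    (8.0, "yes"), (9.0, "yes"),
]
test_data = [(1.5, "no"), (3.5, "no"), (7.0, "yes"), (8.5, "yes"), (6.0, "no")]


def train_majority(examples):
    """Baseline: always predict the most common training label."""
    most_common = Counter(label for _, label in examples).most_common(1)[0][0]
    return lambda x: most_common


def train_threshold(examples):
    """Predict 'yes' above the midpoint between the two class means."""
    no_values = [x for x, label in examples if label == "no"]
    yes_values = [x for x, label in examples if label == "yes"]
    threshold = (sum(no_values) / len(no_values)
                 + sum(yes_values) / len(yes_values)) / 2
    return lambda x: "yes" if x > threshold else "no"


def accuracy(model, examples):
    """Fraction of examples the trained model labels correctly."""
    return sum(model(x) == label for x, label in examples) / len(examples)


# Train every candidate on the same training data and score it on the same test data.
candidates = {"majority": train_majority, "threshold": train_threshold}
results = {
    name: accuracy(trainer(training_data), test_data)
    for name, trainer in candidates.items()
}
best = max(results, key=results.get)
print(results, best)
```

Keeping the training and test sets identical across candidates is what makes the comparison fair.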
All the steps and processes mentioned above can be implemented easily using Azure Machine Learning, a fully managed cloud service that enables us to easily build, deploy, and share predictive analytics solutions, and which I’ll be writing a separate article about…
Finally, I hope this article explained the basics of how machine learning works, and that it gave a clear understanding of the background needed to start working with machine learning.