Machine Learning is a subfield of Artificial Intelligence that blends techniques and principles from computer science, applied statistics and software engineering. Machine Learning enables computers to make accurate predictions by detecting complex patterns in data. In effect, it derives its own rules from the data rather than having them written by hand, which enables those patterns to be recognised in new data.
People often use the terms Machine Learning and A.I. interchangeably, but a better way to describe the technology is that Machine Learning is a subset of A.I. while Deep Learning is, in turn, a subset of Machine Learning. The line between Machine Learning and Deep Learning is quite blurred: it is largely a matter of the complexity of the toolset being used.
At the heart of this, we are talking about machines that learn and improve over time as they are exposed to new data.
Subsets of Machine Learning
Predictive analytics, natural language processing, speech to text, text to speech and expert systems are commonly treated as subsets of Machine Learning. At the same time, Machine Learning can help to plan and optimise strategies, and it powers robotics and computer vision technologies. All these functions fit the definition of Machine Learning, which is about using data to recognise patterns and learning as the system is exposed to new data.
We can break Machine Learning down into three areas: supervised (task driven classification), unsupervised (data driven clustering), and reinforcement learning (algorithm learns to react to an environment).
Supervised learning is very much a task driven function: that is, you start with some data and you start with a task. “I want you to predict this, I want you to classify that.”
Unsupervised Machine Learning is data driven: you start with a piece of data, you give it to the model and you say, “I want you to group things as you see fit and identify clusters.” That could be very useful in, say, understanding a target business market better. For example, rather than relying only on obvious segments such as ‘all the people under 25’ or ‘all the people living in Swindon’, you can use a clustering system to identify groupings you might not have thought of previously.
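To make the idea of clustering a little more concrete, here is a minimal sketch in Python of k-means, one common clustering method. It is an illustration under stated assumptions rather than a production approach: the customer figures are invented, the two features (age and monthly spend) are hypothetical, and the starting centres are simply the first two customers.

```python
import math

def k_means(points, k, iterations=10):
    """Plain k-means: assign each point to its nearest centre,
    then move each centre to the average of the points assigned to it."""
    centres = list(points[:k])  # naive starting guess: the first k points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: math.dist(point, centres[i]))
            clusters[nearest].append(point)
        # Recompute each centre; keep the old one if a cluster ended up empty.
        centres = [
            tuple(sum(column) / len(cluster) for column in zip(*cluster))
            if cluster else centres[i]
            for i, cluster in enumerate(clusters)
        ]
    return clusters

# Invented toy data: (age, monthly spend) for six hypothetical customers.
customers = [(22, 15), (24, 18), (23, 14), (51, 62), (49, 58), (53, 65)]

clusters = k_means(customers, k=2)
for number, group in enumerate(clusters, start=1):
    print(f"Group {number}: {group}")
```

Nobody tells the model what the groups should be. With this toy data it separates the three younger, lower-spending customers from the three older, higher-spending ones purely from the numbers.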
Lastly, we have reinforcement learning, which underpins the driverless car scenario that the media have so much fun talking about. This subset of Machine Learning is also the approach that famously won a match of Go recently against the Number 2 player in the world. It relies on trial and error: the machine tries an action and is told each time whether that was right or wrong.
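The trial-and-error loop can be sketched in a few lines of Python. This is a deliberately tiny illustration, not how a driverless car or a Go-playing system is actually built: the scenario (a car approaching a red light), the three actions and the reward values are all assumptions made up for the example, and the learner simply keeps a running score for each action.

```python
def learn_by_trial_and_error(actions, reward_for, trials=20, step=0.5):
    """Try actions, receive feedback, and keep a running score per action."""
    # Start with optimistic scores so that every action gets tried at least once.
    value = {action: 2.0 for action in actions}
    for _ in range(trials):
        action = max(actions, key=value.get)  # pick whatever currently looks best
        reward = reward_for(action)           # the environment says how good it was
        value[action] += step * (reward - value[action])  # nudge score toward feedback
    return value

# Hypothetical feedback for a car approaching a red light.
rewards = {"accelerate": -1.0, "keep speed": -0.5, "brake": 1.0}

value = learn_by_trial_and_error(list(rewards), rewards.get)
best_action = max(value, key=value.get)
print(best_action, value)
```

The learner is never told the rule "brake at a red light". It tries each action, is penalised for two of them and rewarded for the third, and ends up preferring the rewarded one.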
Supervised Learning – Is it a Cat or a Dog?
For the purpose of this article, I am going to focus on supervised learning, in other words, ways of predicting or classifying something using data we have already labelled, as opposed to clustering.
An example of supervised Machine Learning classification is the cats and dogs scenario. We can train a model to tell us whether a picture shows a dog or a cat. We feed the machine lots of pictures of dogs and cats and convert those pictures into mathematical values.
The next step is that, after having been shown lots of pictures of dogs and cats, the machine should be able to tell whether a new picture is a cat or a dog. It will get better at recognising cats and dogs as you feed it more pictures, because the pictures of each animal share distinguishing features.
How would that apply in an insurance context? Well, imagine you have two insurance claims: Claim A and Claim B. How can you tell which of those claims is genuine and which is fraudulent? The answer is to feed the machine lots of claims we know to be genuine and lots of claims we know to be fraudulent. We train the system to recognise those so that when the machine is presented with a new claim it can work out whether it is likely to be fraudulent.
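As a rough sketch of the claims example, the Python below trains a very simple classifier on labelled claims and then asks it about two new ones. Everything specific here is an assumption for illustration: the claims are invented, the two features (claim amount and days since the policy began) are hypothetical, and the nearest-centroid method is just one of the simplest ways to classify. A real fraud model would use many more features, scale them sensibly and be tested on far more data.

```python
import math
from statistics import mean

# Invented toy data, for illustration only. Each claim is described by two
# hypothetical features: (claim amount in thousands, days since the policy began).
labelled_claims = {
    "genuine": [(1.2, 400), (0.8, 650), (2.0, 380), (1.5, 720)],
    "fraudulent": [(9.0, 12), (7.5, 30), (8.2, 5), (10.0, 21)],
}

def train(labelled):
    """Summarise each label by the average of its examples (its centroid)."""
    return {
        label: tuple(mean(column) for column in zip(*examples))
        for label, examples in labelled.items()
    }

def predict(model, claim):
    """Return the label whose centroid is closest to the new claim."""
    return min(model, key=lambda label: math.dist(claim, model[label]))

model = train(labelled_claims)
print(predict(model, (8.8, 15)))   # a large claim made soon after the policy began
print(predict(model, (1.0, 500)))  # a small claim on a long-standing policy
```

With this toy data the first new claim sits nearer the known fraudulent examples and the second nearer the genuine ones, which is the same "show it labelled examples, then ask about a new one" pattern as the cats and dogs.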
Why Machine Learning Now?
Machine Learning algorithms can be traced back to the 1950s. Until recently it was not practical to apply the algorithms in a useful way, but the extra capacity and falling cost of computer processing power are revolutionising society’s ability to crunch data.
In particular, the power of cloud computing means that businesses do not need to scale up their IT infrastructure internally, as they can go out to Amazon and other similar providers to rent that capacity when they need it. This is changing the face of Machine Learning massively.
The second point, the new abundance of data, is also crucial. Never before have we seen so much data, with increasingly enormous volumes of real-time and continuous data emerging from sensors and telematics of all kinds, including commercial and engineering platforms everywhere. At the same time, the online libraries of Machine Learning tools that provide the algorithms powering these models are widely open source and freely available.
The data on the Internet is estimated to be about 5 million TB, and of that Google amazingly has indexed only 200 TB, or 0.004 percent of the total. Just think about existing data: we are using almost none of that information to inform our view of the current state of the world around us. Indeed, within organisations such as insurers, a similarly large share of data goes unused, certainly from an analytics perspective.
A Change is Gonna Come
What is changing is the sum of human knowledge. It is said that back in 1900 the sum of human knowledge doubled every century. From 1945 it doubled every 25 years. Now it is doubling every 13 months, and IBM estimates that with the IoT, human knowledge is very soon set to double every 12 hours.
A good quote to end on, however, comes from The Economist: “For industries that embrace Machine Learning the future will depend on how well they marry its predictive power with old fashioned human wisdom.” Machine Learning technology is powerful but is reliant on human inputs and values, which are no less powerful. Machine Learning is no panacea for a badly run business.