John Winn


Ten years ago, when I first started thinking about this book, machine learning was mostly an academic study. It was primarily an intellectual exercise whose goal was to push ahead and see what a computer could be made to learn: to understand which tasks were within the capabilities of our current techniques and which ones lay outside. The focus was to try to move this boundary – to test the limits and break through, as a popular children’s song might have it.

My goal then was to write a book to help people understand that machine learning models and algorithms are not abstract mathematical concepts, but mathematical descriptions of the real world. By explaining the assumptions hidden away in machine learning algorithms, I hoped to make them easier to understand, both for beginners and for those with more experience. The idea was to show by example how each choice about the structure of the mathematical model has a real effect on the behaviour of the resulting machine learning system.

In the intervening decade, the role of machine learning in the world has fundamentally changed. It is no longer a purely intellectual discipline. Instead, use of machine learning has expanded ten-fold, a hundred-fold, in a myriad of applications across every aspect of our digital lives. Increasingly, machine learning is affecting everything we see online, what is drawn to our attention and what is hidden. Machine learning influences what we watch, what we listen to, the things that we buy, even the people that we date. More worryingly, machine learning is starting to influence who gets hired for a job, who gets access to medical treatments, even where police get deployed and who gets sent to jail.

Understanding how the assumptions in a machine learning model affect its behaviour is no longer just a useful skill for developing machine learning systems. It has become a critically important way of making sure that a machine learning system is transparent, interpretable and fair. As people’s lives are influenced by machine-made decisions to an ever greater extent, the call to understand the reasoning behind these systems is going to become deafeningly loud. The assumptions in these systems need to be clear, transparent and available for all to see – and made accessible through clear explanations of each decision or prediction. Model-based machine learning is a crucial tool in ensuring that transparency and fairness lie at the foundations of all machine learning systems.

So, to those who are designing the next generation of machine learning systems, think carefully about every assumption you make and about every data set you train on. Your modelling decisions are not abstract: they will have very real effects on the lives of real people. Machine learning will affect the lives of your family, of your friends, of you yourself. Remember that if you use gender as a variable in your model, it will likely make sexist predictions. If you use race as a variable, your model will likely be racist. Training on data sets that record the status quo will entrench past inequalities and propagate them into the future. So apply your skills with thought, with care and, above all, with empathy.

May all your assumptions be good ones.

John Winn

June 2020