AlphaGo wasn’t just a victory for artificial intelligence. When millions of people across the world tuned in to watch DeepMind’s machine beat the human Go world champion Lee Sedol, they also witnessed a historic victory for open-source.
DeepMind used a scientific computing framework called Torch extensively in the development and execution of AlphaGo’s neural networks. Torch was first [released back in 2002] under an open-source BSD license, with algorithms that are still commonly used by data scientists, such as multi-layer perceptrons, support vector machines and k-nearest neighbours. Torch also supported ensembles – a popular technique that combines the output of multiple algorithms, usually with a weighted average.
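To make the idea concrete, here is a minimal sketch of a weighted-average ensemble. The model predictions and weights are illustrative assumptions, not taken from Torch or AlphaGo:

```python
import numpy as np

def weighted_ensemble(predictions, weights):
    """Combine per-model prediction arrays with a weighted average.

    predictions: list of arrays, one per model, each scoring the same samples.
    weights: one weight per model; normalised so they sum to 1.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return np.average(np.stack(predictions), axis=0, weights=w)

# Three hypothetical models scoring the same two samples:
preds = [np.array([0.9, 0.2]),
         np.array([0.7, 0.4]),
         np.array([0.8, 0.3])]

# Give the strongest model the highest weight:
combined = weighted_ensemble(preds, weights=[0.5, 0.3, 0.2])
```

The appeal of the technique is that errors made by individual models tend to partially cancel out, so the combined prediction is often more robust than any single model’s.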
It’s not just open-source software that contributed to the growth of machine learning. Long before startups and big business became obsessed with artificial intelligence, the academic world was openly researching, sharing, and building on each other’s findings. Christopher Bishop published a book in 1995 called *[Neural Networks for Pattern Recognition]* that presented the corpus of techniques that took machine learning from a statistical science to one inspired by the biological networks in our brains. Geoffrey Hinton noted in the foreword that Bishop “has wisely avoided the temptation to try to cover everything and has omitted interesting topics such as reinforcement learning, Hopfield Networks and Boltzmann machines in order to focus on the types of neural networks that are most widely used in practical applications”. DeepMind famously employed these techniques almost 20 years later to create a generalised AI that can learn to play Atari games at a superhuman level. When Bishop published [his 2006 paper] he was at Microsoft Research.
So, why the history lesson? To make the point that if you look closely enough artificial intelligence has always been open-source, and open research and development is a core reason why AI is where it is today.
I have been building technology start-ups since 2003. Over the years I observed a trend towards the commoditisation of machine learning algorithms, along with the data wrangling tools needed to deploy these techniques in the real world. The team at Seldon had been hand-crafting recommendation algorithms for a number of years. We adopted Hadoop back in 2011 in order to scale our data processing capabilities beyond the limits of programmatic and relational databases. Hadoop had a sister project called Apache Mahout that bundled a variety of machine learning algorithms. A few years later Apache Spark revolutionised the computation of streaming data and came bundled with a powerful machine learning library called MLlib. By mid-2014, PredictionIO had released an open-source machine learning server – the first to provide a full-stack solution with tools to build, deploy and optimise machine learning models. The talented team at PredictionIO are now part of Salesforce, where they most likely played an important part in the development of Einstein — the new AI platform baked into the Salesforce cloud ecosystem.
Following Seldon’s first release in February 2015, our open-source project quickly built a community of thousands of data scientists and developers from around the world. Through the lens of the business world, open-sourcing something as advanced as a machine learning platform was new and exciting. It contradicted the start-up playbook of locking down your code and providing solutions from a black box. I had observed signals of a trend towards commoditisation, but 2015 became the year that open-source machine learning hit the prime time.
*How are you using open-source technologies in your project or business? And what do you think the future will hold? I’d love to hear from you – please leave your comments below.*
Alex Housley, CEO & Founder, [Seldon]