April 24, 2017
Successful Machine Learning Requires Three Things: Cloud, Big Data Technologies, and Skilled People
Attend an industry conference this year and you'll find that machine learning - a term sometimes used interchangeably with artificial intelligence and predictive analytics - is drawing the most enthusiastic crowds. It is the shiny new object in the data space.
The fascination with machine learning is understandable.
- Machine learning is what enables self-driving cars to turn massive amounts of data into left turns, right turns, and all the other responses necessary to negotiate traffic safely.
- Chatbots use machine learning to interpret and respond to such questions as: "What's the weather doing?" "What's the exchange rate for dollar versus yen?" "What is my checking account balance?"
- Some consumer-facing businesses are using machine learning to provide customers with timely and personally relevant information while they are onsite. Amusement parks, for example, provide guests with real-time information about wait times for rides and can use guests' past behavior to improve their experience while at the park.
Three Keys to Machine Learning
Despite all the excitement, few companies are in a position to put machine learning to work within the next year. Getting there requires extensive investment in three key areas: cloud computing, big data technologies, and skilled people.
Cloud Computing. Machine learning requires massive amounts of data. And more data requires more storage. It also requires more compute cycles to accommodate predictive analyses and identify patterns in the data. Few companies other than Google, Yahoo, or Facebook possess the infrastructure needed to handle the amounts of data involved in machine learning.
To respond to a simple request for a weather report, for example, a voice assistant such as Siri must recognize and understand many permutations of the request ("How's the weather today?", "What's the weather doing?", "Give me the weather!", etc.) spoken by many people. It must find and present the correct response. And it must do all of that quickly. Given the many requests and permutations thereof that a voice assistant might receive, the data requirements add up quickly.
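To make the permutation problem concrete, here is a minimal Python sketch that maps the three phrasings above to a single "weather" intent. It is a hand-written keyword rule, not a learned model and not how any real voice assistant works; the keyword list and the function names `normalize` and `detect_intent` are assumptions made up for illustration.

```python
import re

# Hypothetical keyword list; a real assistant would learn such cues from data.
WEATHER_WORDS = {"weather", "forecast", "rain", "temperature"}

def normalize(utterance: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", utterance.lower())

def detect_intent(utterance: str) -> str:
    """Return 'weather' if any weather keyword appears, else 'unknown'."""
    tokens = set(normalize(utterance))
    return "weather" if tokens & WEATHER_WORDS else "unknown"

requests = [
    "How's the weather today?",
    "What's the weather doing?",
    "Give me the weather!",
]
# Every phrasing should resolve to the same intent.
intents = [detect_intent(r) for r in requests]
```

A fixed rule like this breaks as soon as someone asks "Do I need an umbrella?", which is one reason a working assistant needs large volumes of example requests rather than a short keyword list.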
The cloud offers a practical solution to processing large amounts of data, although some organizations, notably government agencies and financial services companies, balk at putting data in the cloud because of security concerns. In fact, cloud providers generally do a far better job with security than many organizations are doing on their own. Regardless, absent the cloud, few organizations will be able to capture, store, and process the mountains of data involved in machine learning without substantial infrastructure investments.
Big Data Technologies. Use of the cloud is necessary but not sufficient to solve all the challenges presented by machine learning. In addition to massive amounts of data, data storage, and compute power, organizations need tools that will enable them to work with the data.
The Hadoop ecosystem is one of the best-known of these big data toolsets. Hadoop is an open-source software framework that can be used to store all kinds of data, including unstructured data such as digitized calls from customers. The ecosystem is highly complex and includes a wide variety of software applications such as Apache Mahout, Apache Hive, Apache HBase, Apache Spark, and more. Mastering the toolset takes time and training. Beyond Hadoop, there are cloud-vendor-specific implementations of big data tools such as Amazon's EMR, Kinesis, and Redshift; Google's Bigtable and BigQuery; and Microsoft's Azure Data Lake and Azure Machine Learning.
That brings me to the next point: skilled people.
Skilled People. The great promise of machine learning is that someday machines will program themselves. We aren't there yet.
To leverage machine learning, organizations need people: specifically, people with white-hot technical skills, who don't grow on trees and don't come cheap.
It takes multiple data science and data engineering skills to tackle the work involved in bringing together data from multiple disparate systems and using that data within machine-learning models. The models, in turn, require the development of sophisticated mathematical algorithms. To build them, data scientists must have both technical and business knowledge.
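As a rough illustration of that workflow, the Python sketch below joins records from two separate systems on a shared customer ID and then fits a straight line to the combined data. Everything in it is hypothetical: the `crm` and `billing` records, the field names, and the helpers `join_records` and `fit_line` are invented for this example, and a least-squares line stands in for the far more sophisticated models used in practice.

```python
# Hypothetical records from two separate systems, keyed by customer ID.
crm = {101: {"visits": 3}, 102: {"visits": 8}, 103: {"visits": 5}}
billing = {101: {"spend": 40.0}, 102: {"spend": 95.0}, 104: {"spend": 20.0}}

def join_records(left, right):
    """Inner-join two dicts of records, keeping only IDs found in both."""
    shared = sorted(left.keys() & right.keys())
    return {key: {**left[key], **right[key]} for key in shared}

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Customers 103 and 104 appear in only one system, so the join drops them.
joined = join_records(crm, billing)
visits = [row["visits"] for row in joined.values()]
spend = [row["spend"] for row in joined.values()]
slope, intercept = fit_line(visits, spend)
```

Even in this toy case, half of the customers fall out of the join because the two systems disagree about who exists. Reconciling gaps like that across real systems is a large part of the data engineering work.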
Many businesses and government agencies are aggressively recruiting data engineers and data scientists. Meanwhile, the technologies are evolving quickly, and so are the skills needed to leverage them.
All of this means that, despite the hype and excitement surrounding machine learning, few organizations are there yet. Some are dipping their toes in the cloud, exploring big data, developing new skill sets, and doing proofs of concept. They are moving in the right direction, albeit slowly, even if they aren't achieving results yet. These organizations certainly have the potential to exceed consumer expectations in the future.
Organizations that aren't investing in any of the three areas necessary for machine learning can expect to be leapfrogged by competitors that are putting in the effort.