In a recent blog post, “Deep learning performance breakthrough”, IBM’s Sumit Gupta suggests that much of the recent heightened interest in AI is fuelled by deep learning.
Deep learning is a machine learning method based on neural networks that becomes more accurate as the model is fed more data. This lets AI models learn in much the same way a human does: through experience and perception, on an ongoing basis.
For example, banks today mostly use rule-based systems for fraud detection, in which a rule specifies a set of conditions that triggers a fraud alert. Instead, they can use the past few years of credit card usage to train a deep learning model that keeps learning as it is given more data, even after deployment. The advantage of this approach is that the model automatically learns new situations that might be fraudulent from experience, rather than requiring a data scientist to write new rules for each type of situation.
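To make the contrast with rule-based systems concrete, here is a minimal sketch of the idea (not IBM's implementation, and far simpler than a real deep learning model): a logistic-regression fraud scorer trained by stochastic gradient descent. The point is the workflow, not the model: every new labelled transaction can refine the scorer after deployment, with no hand-written rules. The feature names and synthetic data are illustrative assumptions.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class FraudScorer:
    """Tiny learned fraud scorer: a single-layer stand-in for a deep model."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        # Probability that transaction x is fraudulent.
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return sigmoid(z)

    def update(self, x, label):
        # One SGD step on a single labelled transaction (0 = ok, 1 = fraud).
        # Calling this in production is what lets the model keep learning
        # after deployment, instead of waiting for new rules to be written.
        err = self.predict_proba(x) - label
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err

# Illustrative features: [amount / 1000, is_foreign, night_time]
random.seed(0)
model = FraudScorer(n_features=3)
for _ in range(2000):
    if random.random() < 0.5:
        # Synthetic fraud: large foreign night-time charge.
        x, y = [random.uniform(2, 5), 1.0, 1.0], 1
    else:
        # Synthetic normal purchase: small domestic daytime charge.
        x, y = [random.uniform(0, 0.5), 0.0, 0.0], 0
    model.update(x, y)

print(model.predict_proba([3.0, 1.0, 1.0]) > 0.5)  # → True (flagged)
print(model.predict_proba([0.1, 0.0, 0.0]) > 0.5)  # → False
```

A rule-based system would need a human to notice the "large foreign night-time charge" pattern and encode it; here the model picks it up from the labelled history, and `update` lets it absorb new patterns as they appear.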
Deep learning has dozens of uses across every industry, ranging from retail analytics to drone video analytics to medical imaging analysis to assist clinicians with diagnoses. Organisations can now use these kinds of advanced machine learning methods to extract insights from the data they have been collecting in their data lakes over the last few years. But to use these AI methods, you need to process massive amounts of data, and that approach requires IT infrastructure that is up to the task.
What does meeting the performance demands of deep learning require?
Gupta suggests that you need accelerators built specifically for this purpose, such as NVIDIA’s Tesla V100 GPUs. On IBM’s POWER9 systems with NVLink 2.0, data moves between CPU and GPU up to 5.6 times faster than the CUDA host-to-device bandwidth of tested x86 platforms (1).
According to Gupta, a lot of IBM’s “magic” behind deep learning also comes from its software framework, PowerAI, which is designed for deployment on IBM Power Systems. PowerAI is an enterprise software distribution of some of the most popular open-source machine learning and deep learning frameworks, tuned and tested to deliver the performance and speed of deployment that deep learning demands.
Getting rid of the single-node bottleneck
According to Patrick Moorhead, a Forbes contributor, one of the biggest problems holding back the further proliferation of deep learning is scalability. Most AI servers today are single systems, not multiple systems combined. The most popular open-source deep learning software frameworks simply don’t perform well across multiple servers, creating a time-consuming bottleneck. In other words, while many data scientists have access to servers with four to eight GPUs, they can’t take advantage of them to scale beyond a single node; at the end of the day, the software just wasn’t designed for it.
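The bottleneck Moorhead describes comes from how multi-device training works. In synchronous data-parallel training, each worker (a GPU or a node) computes a gradient on its shard of the batch, and an all-reduce step averages those gradients so every replica applies the same update; scaling past one server is hard precisely because that all-reduce must cross the network on every step. A conceptual sketch (not IBM's DDL code; a toy one-parameter model standing in for a neural network):

```python
def gradient(w, shard):
    # Least-squares gradient of a 1-parameter model y = w * x on one shard.
    return sum(2 * x * (w * x - y) for x, y in shard) / len(shard)

def allreduce_mean(grads):
    # Stand-in for the collective all-reduce that, in a real cluster,
    # moves gradients over NVLink or the network and averages them.
    return sum(grads) / len(grads)

def train_step(w, batch, n_workers, lr=0.05):
    # Shard the batch across workers, compute per-worker gradients
    # (in parallel on real hardware), then apply the identical averaged
    # update on every replica.
    shard_size = len(batch) // n_workers
    shards = [batch[i * shard_size:(i + 1) * shard_size]
              for i in range(n_workers)]
    grads = [gradient(w, shard) for shard in shards]
    return w - lr * allreduce_mean(grads)

# Data generated by y = 3x; training should drive w toward 3 regardless
# of how many workers the batch is sharded across.
batch = [(x, 3.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]
w = 0.0
for _ in range(100):
    w = train_step(w, batch, n_workers=4)
print(round(w, 2))  # → 3.0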
Breaking through the IT infrastructure wall
Moorhead suggests that “by getting rid of this bottleneck, DDL [IBM’s Distributed Deep Learning library] has the potential to really open the deep learning floodgates. It could be a real game-changer for IBM, PowerAI, and OpenPOWER”.
Contact us to get started and learn more about deep learning and IBM PowerAI, and register here for a free PowerAI trial.
(1) Results are based on IBM internal measurements running the CUDA host-to-device (H2D) bandwidth test.
Hardware: Power AC922; 32 cores (2 × 16-core chips), POWER9 with NVLink 2.0; 2.25 GHz, 1,024 GB memory, 4 × Tesla V100 GPUs; Ubuntu 16.04. S822LC for HPC; 20 cores (2 × 10-core chips), POWER8 with NVLink; 2.86 GHz, 512 GB memory, Tesla P100 GPU.
Competitive hardware: 2 × Intel Xeon E5-2640 v4; 20 cores (2 × 10-core chips) / 40 threads; 2.4 GHz; 1,024 GB memory, 4 × Tesla V100 GPUs; Ubuntu 16.04.