on December 04, 2018 Bridging the AI Performance Gap Storage Solutions for AI

Choosing the right storage for AI


In a recent blog post, Rick Janowski, Product Marketing Manager, IBM Software-Defined Storage, cited research suggesting that “by 2020, 30 percent of organisations that fail to apply AI will no longer be operationally and economically viable” [1]. In another survey, 91 percent of infrastructure and operations leaders identified “data” as a main inhibitor of AI initiatives [2].

Clearly, if you are seeking to unleash the power of Artificial Intelligence (AI) and deep learning, you will need to build a machine learning environment designed so that future requirements won’t impede the application as it moves from prototype to production.

Storage Matters

Janowski opined that “unlike traditional programming in which a programmer provides the computer with each step that it has to take to accomplish some task, deep learning requires the computer to learn for itself. In the case of visual object recognition, for example, there is no way to program a computer with the steps needed to recognise a given object which may present itself in different locations, at different angles, in different lighting conditions, perhaps partially obscured by some other object and so forth. Instead, the computer is trained by being given thousands of examples of images containing the object until it can consistently recognise it.

This kind of training requires lots of data, but once deployed, the system may need to respond in seconds.”

It is more than just holding the data

Legacy storage solutions are simply not designed for the low-latency, highly parallel, mixed workload requirements found in the production stages of the AI data lifecycle. Sooner or later you will hit a wall: either the storage is no longer manageable, or the GPU-based systems sit idle for lack of I/O performance, and the project then requires a re-architecture that is costly in both time and capital.
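To make the idle-GPU point concrete, here is a rough back-of-the-envelope sketch in Python. It estimates the sustained read rate a training cluster would need and how busy the GPUs could stay if storage delivered less than that. The function names and every figure in the example are hypothetical, and the model assumes each sample is read from storage once per pass with no caching, so treat it as an illustration rather than a sizing tool.

```python
def required_read_throughput(num_gpus, samples_per_sec_per_gpu, avg_sample_bytes):
    """Sustained read rate (bytes/s) needed so that no GPU waits on I/O.

    Assumes every sample is fetched from storage (no caching).
    """
    return num_gpus * samples_per_sec_per_gpu * avg_sample_bytes


def gpu_utilisation(storage_bytes_per_sec, required_bytes_per_sec):
    """Upper bound on the fraction of time GPUs can stay busy (0.0 to 1.0)."""
    if required_bytes_per_sec <= 0:
        return 1.0
    return min(1.0, storage_bytes_per_sec / required_bytes_per_sec)


# Hypothetical cluster: 8 GPUs, each consuming 1,000 images/s of 150 kB each.
needed = required_read_throughput(8, 1000, 150_000)

# Hypothetical legacy array that sustains 600 MB/s of reads.
utilisation = gpu_utilisation(600_000_000, needed)

print(f"needed: {needed / 1e9:.1f} GB/s, GPU utilisation at best: {utilisation:.0%}")
```

In this made-up scenario the cluster needs 1.2 GB/s, so an array delivering half of that leaves the GPUs idle about half the time.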

Ensuring storage can support AI systems

So what are the considerations to ensure the storage subsystem can support your AI project?

Availability:  If the storage is not available, then the AI software can’t do its job. High-availability storage with data replication to a secondary site is highly recommended, if not an imperative, for enterprises adopting AI.

Reliability: AI systems can’t make a decision if the data they use is corrupt. The reliability of the data depends on building capabilities such as error checking into the software that runs the storage array. Another aspect of reliability is the dependable operation of the storage hardware. Storage arrays that use flash memory are well suited to AI systems because the solid-state drives have no moving parts that can wear out.

Performance: Systems processing AI workloads have got to be fast because they are increasingly making real-time data decisions. A fast AI engine requires high-performance storage, and the fastest storage technology available today at the right price point is a storage system equipped with all-flash technology. As big data and cognitive analytics become more pervasive, all-flash technology is definitely the future for storage systems supporting these workloads.

Ease of use: Even while the volume of data is increasing, budgets for hiring IT storage administrators have dropped significantly across many industries. As more storage is added to support AI and other data-hungry applications, organisations need storage that is easy for an IT generalist, or the application owner who is configuring the AI system, to install, deploy and manage.

Automation: Because fewer people today are managing more IT infrastructure, automation is essential. Automating storage functions such as replication, tiering and backups can greatly reduce the storage management burden of AI. And the more automated the system, the lower its running costs typically are.
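The error checking mentioned under Reliability can be illustrated with a small Python sketch: store a checksum alongside each block when it is written, and verify it on every read so that corrupt data is rejected instead of being handed to the AI software. The in-memory dictionary, the function names and the choice of SHA-256 are assumptions made for illustration; real storage arrays implement integrity checks in firmware or array software, and nothing here describes a particular vendor's method.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return a SHA-256 digest of the block (illustrative choice of hash)."""
    return hashlib.sha256(data).hexdigest()


def write_block(store: dict, key: str, data: bytes) -> None:
    """Store the block together with the checksum computed at write time."""
    store[key] = (data, checksum(data))


def read_block(store: dict, key: str) -> bytes:
    """Return the block only if it still matches its stored checksum."""
    data, expected = store[key]
    if checksum(data) != expected:
        raise IOError(f"checksum mismatch for block {key!r}")
    return data


# Hypothetical usage: a dict stands in for the storage device.
store = {}
write_block(store, "img-0001", b"training image bytes")
assert read_block(store, "img-0001") == b"training image bytes"

# Simulate silent corruption: the data changes but the stored checksum does not.
_, original_sum = store["img-0001"]
store["img-0001"] = (b"training image bytez", original_sum)
try:
    read_block(store, "img-0001")
except IOError as err:
    print("rejected:", err)
```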

Building AI into storage

Today, providers such as IBM are also leveraging AI inside the storage array so that the system can make smart storage management decisions. Examples include sensing the IT environment and adjusting to run better within it, moving data closer to where it is being accessed, and recognising when information is becoming cold data that can be archived. An AI-assisted system can even help predict when an environment is running out of storage and generate a notification to increase capacity as circumstances change.
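As a simple illustration of the capacity-prediction idea, the Python sketch below fits a straight line to daily usage samples and estimates how many days remain before the array is full. This is a deliberately basic linear trend, not a description of how IBM or any other vendor implements prediction; the function name, the sample figures and the 30-day alert threshold are all hypothetical.

```python
from typing import List, Optional


def days_until_full(used_tb: List[float], capacity_tb: float) -> Optional[float]:
    """Estimate days from the latest sample until usage reaches capacity.

    used_tb holds one usage reading per day, oldest first. Returns None when
    there are too few samples or usage is not growing.
    """
    n = len(used_tb)
    if n < 2:
        return None

    # Ordinary least-squares fit of usage against day index 0..n-1.
    mean_x = (n - 1) / 2
    mean_y = sum(used_tb) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(used_tb))
    slope = sxy / sxx
    if slope <= 0:
        return None

    intercept = mean_y - slope * mean_x
    full_day = (capacity_tb - intercept) / slope  # day index where the line hits capacity
    return max(0.0, full_day - (n - 1))


# Hypothetical readings: usage grows by 1 TB per day on a 20 TB array.
remaining = days_until_full([10, 11, 12, 13, 14], capacity_tb=20)
if remaining is not None and remaining < 30:  # assumed alert threshold
    print(f"Capacity warning: about {remaining:.0f} days of headroom left")
```

A real system would likely account for seasonality and bursts, but even a trend line like this is enough to trigger an early notification.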

Storage engineered to meet the needs of AI

IBM Flash Storage systems are engineered to meet the storage requirements of AI, big data analytics, machine learning and other cognitive applications with mission-critical availability and reliability, high performance, ease of use and automation. Learn more about the value of storage with all-flash technology to support AI workloads.

Partner with a vendor who understands the whole environment, not just storage - Open Systems Specialists (OSS) 

[1] Gartner Predicts 2018: Compute Infrastructure 

[2] Gartner AI State of The Market – and Where HPC intersects


Mike Clancy

An advocate of innovative thinking and productive interaction to effect real change, I also bring exposure to local and international markets and understand what is required for successful modern business practices.