Why Software Defined Storage Matters

by Kiran Bhageshpur – November 3, 2015

Everywhere you turn there are articles and blog posts about Software Defined everything: networks, data centers, WANs and, more recently, storage. What is Software Defined Storage (SDS) and, more importantly, why should you care?

Wikipedia says SDS “is an evolving concept for computer data storage software to manage policy-based provisioning and management of data storage independent of hardware. Software-defined storage definitions typically include a form of storage virtualization to separate the storage hardware from the software that manages the storage infrastructure.”

That’s a mouthful! Another definition I found was “Software-defined storage is an approach to data storage in which programming that controls storage-related tasks is decoupled from the physical storage hardware.”

The best and most succinct definition I have found for “Software Defined *” comes from Google’s Amin Vahdat, in his ONS 2015 keynote. At 21:41 in this presentation he defines “Software Defined Networking” thus: split the control plane from the data plane to support the independent evolution of the data path from the control path. He goes on to say, “It’s not about hardware versus software”; rather, it is about a software control plane that abstracts and manages complexity.

I propose a similarly simple definition for Software Defined Storage: an approach to data storage in which the programming that controls all storage-related tasks (the control plane) is decoupled from, and independent of, the programming that handles user data and metadata (the data path).

The control plane handles things like configuration, provisioning, telemetry, security, data policies and so on. The control plane is pure software (no hardware). The data path has a strong coupling to the underlying hardware (merchant silicon and storage media, be it HDD or SSD or something else). Think of the data path as the engine and the control plane as all the ‘stuff’ that surrounds the engine and, well... controls the engine!

Why is this important? According to Amin, the distinction allows for the independent evolution of the two components. You want to be able to take advantage of Moore’s Law and Kryder’s Law to use the most effective hardware at any given point in time. At the same time, you want to take advantage of software optimizations and new functionality without regard to the underlying data path functionality and/or hardware platform.
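To make the split concrete, here is a minimal Python sketch of my own. It is not taken from any real product, and every name in it (DataPath, InMemoryDataPath, ControlPlane, store) is a hypothetical stand-in. It assumes the control plane only provisions volumes, enforces a simple quota policy and reports telemetry, while the data path only moves bytes.

```python
from dataclasses import dataclass, field
from typing import Dict, Protocol, Tuple


class DataPath(Protocol):
    """Hypothetical data-path interface: it moves user data and nothing else."""

    def write(self, volume: str, key: str, data: bytes) -> None: ...
    def read(self, volume: str, key: str) -> bytes: ...


class InMemoryDataPath:
    """Stand-in for a hardware-coupled engine (HDD, SSD or something else)."""

    def __init__(self, media: str) -> None:
        self.media = media
        self._blocks: Dict[Tuple[str, str], bytes] = {}

    def write(self, volume: str, key: str, data: bytes) -> None:
        self._blocks[(volume, key)] = data

    def read(self, volume: str, key: str) -> bytes:
        return self._blocks[(volume, key)]


@dataclass
class ControlPlane:
    """Pure software: provisioning, policy and telemetry. It never sees user data."""

    quotas: Dict[str, int] = field(default_factory=dict)
    used: Dict[str, int] = field(default_factory=dict)

    def provision(self, volume: str, quota_bytes: int) -> None:
        self.quotas[volume] = quota_bytes
        self.used[volume] = 0

    def authorize_write(self, volume: str, size: int) -> bool:
        # Simplified policy: unknown volumes and over-quota writes are refused.
        # Overwrites are counted as new bytes to keep the sketch short.
        if volume not in self.quotas:
            return False
        if self.used[volume] + size > self.quotas[volume]:
            return False
        self.used[volume] += size
        return True

    def telemetry(self) -> Dict[str, int]:
        return dict(self.used)


def store(control: ControlPlane, path: DataPath, volume: str, key: str, data: bytes) -> bool:
    """Ask the control plane for permission, then hand the bytes to the data path."""
    if not control.authorize_write(volume, len(data)):
        return False
    path.write(volume, key, data)
    return True


if __name__ == "__main__":
    control = ControlPlane()
    control.provision("vol1", quota_bytes=10)

    # The same control plane drives two different "engines" unchanged.
    hdd = InMemoryDataPath("hdd")
    ssd = InMemoryDataPath("ssd")
    store(control, hdd, "vol1", "a", b"hello")
    store(control, ssd, "vol1", "b", b"world")
    print(control.telemetry())
```

In this sketch, ControlPlane never imports or inspects the storage engine, so either side can be replaced without touching the other. That is the independent evolution the definition is after.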

These definitions lead to tangible, substantive advantages for those of us in the storage field. Contrast this with a trend I have seen recently to define SDS as storage software run on commodity hardware. To me, this definition misses the point entirely. Unless you separate the control plane from the data path, you won’t get the benefits you want from SDS.

What’s your view? Do you agree with the above take on SDS?