How Igneous Solves the Problem of Data Movement for Large Datasets

by Jeff Hughes – December 5, 2017

Getting data from where it is to where it needs to be sounds simple in concept, but it becomes a major problem when your datasets are very large. Moving data across geographies is the aspect that most often comes to mind, but differing data formats and the impact on primary systems are equally challenging. Yet moving data well is a key function required for backup, archive, and cloud tiering.

Why is Data Movement for Large Datasets So Hard?

Many data movement techniques were designed when enterprise data was measured in gigabytes. Now that it’s measured in petabytes, many of those techniques no longer work.

For example, one way to move data off legacy file systems was NDMP, a single-threaded protocol designed to move data linearly to tape. Those constraints don’t apply today, but the protocol is still often in use.

How Does Igneous Solve this Problem?

Igneous moves data from primary storage in highly parallel streams. Rather than using legacy protocols like NDMP, we come in via front-end protocols such as NFS and SMB and open many parallel streams, the way many users would. In addition, we scan and move data intelligently, in ways designed specifically around how the filers are built.
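
To make the idea of parallel streams concrete, here is a minimal sketch in Python. It is not Igneous's implementation; it only assumes the filer is reachable as an ordinary mounted path (for example an NFS or SMB mount), so each worker thread reads files the way a regular client would. The function name `copy_tree_parallel` and the `streams` parameter are hypothetical names chosen for illustration.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_tree_parallel(src_root, dst_root, streams=8):
    """Copy every file under src_root to dst_root using `streams` worker threads.

    Assumption: src_root is a normally mounted path (e.g. NFS or SMB), so each
    worker behaves like one more client reading from the filer.
    Returns the number of files copied.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    # Scan first, then move: collect the list of regular files to copy.
    files = [p for p in src_root.rglob("*") if p.is_file()]

    def copy_one(src):
        dst = dst_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copies data and basic metadata
        return 1

    # Each worker thread acts as one independent stream.
    with ThreadPoolExecutor(max_workers=streams) as pool:
        return sum(pool.map(copy_one, files))
```

A single-threaded copy of the same tree would process the files one after another; here up to `streams` files are in flight at once, which is the basic contrast with a linear, tape-style transfer.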

Impact on Filers

Igneous is latency aware. We move data as fast as we can when the filers quiesce, and as we detect load from users or applications, we back off intelligently.
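
One way to picture this kind of back-off is a small controller that adjusts the number of active streams based on observed filer latency. The sketch below uses an additive-increase, multiplicative-decrease rule, which is an assumption made for illustration and not a description of Igneous's actual algorithm. The class name, the `baseline_ms` value, and the `tolerance` factor are all hypothetical.

```python
class LatencyAwareThrottle:
    """Adjust the stream count from latency samples (illustrative AIMD rule)."""

    def __init__(self, baseline_ms, min_streams=1, max_streams=32, tolerance=1.5):
        self.baseline_ms = baseline_ms  # assumed latency of an idle filer
        self.min_streams = min_streams
        self.max_streams = max_streams
        self.tolerance = tolerance      # how far above baseline counts as "busy"
        self.streams = min_streams

    def observe(self, latency_ms):
        """Record one latency sample and return the new stream count."""
        if latency_ms > self.baseline_ms * self.tolerance:
            # The filer looks busy: halve the stream count to yield to users.
            self.streams = max(self.min_streams, self.streams // 2)
        else:
            # The filer looks quiet: add one stream, up to the ceiling.
            self.streams = min(self.max_streams, self.streams + 1)
        return self.streams


throttle = LatencyAwareThrottle(baseline_ms=2.0, max_streams=8)
for _ in range(10):
    throttle.observe(1.0)           # quiet filer: ramps up to the ceiling
ramped = throttle.streams
backed_off = throttle.observe(10.0)  # user load detected: backs off
```

With these settings the controller climbs one stream at a time while latency stays near the baseline and halves its stream count as soon as a sample exceeds 1.5 times the baseline.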

This enables backups to run continuously, without the “backup windows” in which backup administrators tell users and application owners that the data is unavailable, say from 11pm to 4am. In our case, backups run all the time.

Read Consistency

When read consistency is an issue, we integrate with the filers’ APIs to take a snapshot, move the data, and release the snapshot when we’re done. We’ve integrated with NetApp, Dell EMC Isilon, and Pure FlashBlade to date.
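
The snapshot, move, release sequence can be sketched as a Python context manager that guarantees the snapshot is released even if the move fails. The `FakeFiler` class below is a hypothetical in-memory stand-in; it does not reflect the actual NetApp, Isilon, or FlashBlade APIs, whose calls and parameters differ.

```python
from contextlib import contextmanager


class FakeFiler:
    """Hypothetical stand-in for a filer's snapshot API (not a real vendor SDK)."""

    def __init__(self):
        self.active = set()
        self._next_id = 0

    def create_snapshot(self, path):
        self._next_id += 1
        snap_id = f"snap-{self._next_id}"
        self.active.add(snap_id)
        return snap_id

    def delete_snapshot(self, snap_id):
        self.active.discard(snap_id)


@contextmanager
def snapshot(filer, path):
    """Take a snapshot of `path`, yield its id, and always release it."""
    snap_id = filer.create_snapshot(path)
    try:
        yield snap_id  # the caller moves data from the frozen view here
    finally:
        filer.delete_snapshot(snap_id)  # released on success or failure


filer = FakeFiler()
with snapshot(filer, "/vol/home") as snap_id:
    held_during_move = snap_id in filer.active
released_after_move = snap_id not in filer.active
```

Putting the release in a `finally` block is the design point: a failed transfer should not leave snapshots behind consuming space on the filer.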

Moving Data to Other Locations or Cloud

The key here is to understand where you need low latency. Between the filers and the data movement software, you want a low-latency connection, because the POSIX semantics involved in NFS and SMB transactions require it.

However, the communication between our data mover software and our storage layers uses RESTful protocols, designed to work over WAN and Internet connections just like the Web. In fact, the RESTful protocols all work over HTTPS.

As such, we can do data movement between Igneous systems or between Igneous systems and public clouds very efficiently and reliably, without the typical retries and timeouts associated with trying to run POSIX semantics over the network.

Learn more about our data movement engine on our newly launched Technology page.
