The Backup Window is an Outdated Concept

by Steve Pao – July 5, 2017

Backup windows for unstructured file data will go the way of the rotary phone. Why? Backup windows were designed when tape was the primary backup target, before the proliferation of unstructured data. Let’s explore how legacy backup software backed into the concept of backup windows and how a modern approach eliminates them.

Originally Designed for Tape

Prior to disk-based backups, the primary medium for backup was tape. The medium had some interesting constraints because data had to be serially written to – and read from – tape. Moreover, the job of backup systems was to keep the data flowing at a continuous rate as the tape’s physical reels rotated.

As a result, single-threaded streams and continuous rates evolved as fundamental concepts in legacy backup protocols (such as Network Data Management Protocol or NDMP) used in primary storage systems and in legacy backup software that supports those protocols. Even after disk became a viable backup target, these concepts persisted in legacy backup software and protocols.
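To make that legacy model concrete, here is a minimal sketch (in Python, purely illustrative; not how any particular backup product is written) of a single-threaded backup stream that reads files serially and paces its writes to hold a fixed, continuous rate:

```python
import time

CHUNK_SIZE = 1 << 20            # read 1 MiB at a time
STREAM_RATE = 50 * (1 << 20)    # hold the stream at ~50 MiB/s (illustrative value)

def legacy_single_stream_backup(paths, target):
    """One thread, one stream: read every file serially and write it to the
    backup target, pacing writes so the medium sees a steady data rate."""
    for path in paths:                       # files are processed one at a time
        with open(path, "rb") as src:
            while chunk := src.read(CHUNK_SIZE):
                start = time.monotonic()
                target.write(chunk)
                # Throttle to the fixed stream rate so the flow stays continuous.
                min_duration = len(chunk) / STREAM_RATE
                elapsed = time.monotonic() - start
                if elapsed < min_duration:
                    time.sleep(min_duration - elapsed)
```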

The Backup Window Emerges

All backup software reads data from the primary storage system, potentially impacting the performance of user and application access to that data. With single-threaded streams (or “jobs”) and continuous data rates, backup administrators had to balance the number of concurrent jobs they ran (and the load those jobs placed on the system) against how frequently they wanted to back up their data.

To meet daily backup requirements, IT administrators generally ran backups at night; to meet weekly requirements, they ran them over the weekend. This practice let administrators run as many backup jobs as their systems could handle during these discrete periods.
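As a concrete, hypothetical example, that nightly/weekend window often ends up expressed as a schedule like the crontab below, with incrementals on weeknights and a full backup kicked off Friday evening to run over the weekend. The command name and paths are illustrative only:

```
# Hypothetical crontab -- backup command and paths are illustrative only.
# Weeknight window: incremental backups start at 11 PM, Monday through Thursday.
0 23 * * 1-4  /usr/local/bin/nas-backup --incremental /export/home
# Weekend window: a full backup starts Friday at 11 PM and runs over the weekend.
0 23 * * 5    /usr/local/bin/nas-backup --full /export/home
```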

The Growth of Unstructured File Data

Historically, when data sizes were small and when users and applications accessed primary storage systems only during regular working hours, the practice of scheduling backup windows outside those hours largely worked.

However, humans are no longer the only ones generating unstructured file data in the form of Word documents or Excel spreadsheets in their home directories. Most of today's data is generated by machines (e.g., medical equipment, cameras, and now autonomous vehicles) and by software applications (e.g., design automation software, image rendering, and scientific computing).

The growth in both file count and total data volume strains the concept of jobs built on single-threaded streams. Often, there's simply too much data to move during a backup window!

When backup windows extend into working hours, user complaints often force backup administrators to shut off backups altogether, leaving no complete backup set for the day or week.

In some organizations, continuous processing and 24/7 operations challenge the notion of backup windows because data must be available to users and applications every minute of the day.

Rethinking Backup

It's time to rethink backup. A modern approach can eliminate backup windows altogether. Consider these two approaches:

  • Multi-streaming: Without the requirement to single stream data serially to tape, data can move faster in dynamic, parallel streams without administrators having to manually split backups into separate, discrete jobs.
  • Latency awareness: Without the requirement to stream data at a continuous rate, data can move faster when the primary system load from users and applications is low, and “back off” when users and applications are accessing the data.

With these approaches, it is possible to run backups all the time without impacting users or applications. In essence, backup jobs run at maximum speeds when usage of the primary systems is low, and automatically slow down when needed.
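A minimal sketch of how these two ideas can combine is shown below. The names, thresholds, and latency probe are assumptions for illustration, not Igneous's actual implementation: a pool of parallel copy streams, each of which backs off when observed latency on the primary system rises.

```python
import concurrent.futures
import random
import shutil
import time
from pathlib import Path

# Assumed tuning knobs -- illustrative values, not anyone's real defaults.
LATENCY_THRESHOLD_MS = 20.0          # above this, treat the primary system as "busy"
MAX_BACKOFF_SECONDS = 30.0
NUM_STREAMS = 16                     # dynamic in a real system; fixed here for brevity
BACKUP_TARGET = Path("/mnt/backup")  # hypothetical backup target path

def primary_latency_ms() -> float:
    """Placeholder for sampling primary-system response time (e.g., NAS op latency)."""
    return random.uniform(1.0, 40.0)

def latency_aware_copy(path: Path) -> None:
    """Copy one file, backing off exponentially while the primary system is loaded."""
    backoff = 1.0
    while primary_latency_ms() > LATENCY_THRESHOLD_MS:
        time.sleep(backoff)                          # "back off" while users are active
        backoff = min(backoff * 2, MAX_BACKOFF_SECONDS)
    shutil.copy2(path, BACKUP_TARGET / path.name)

def multi_stream_backup(paths: list[Path]) -> None:
    """Move files in parallel streams rather than one serial, continuous-rate job."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=NUM_STREAMS) as pool:
        list(pool.map(latency_aware_copy, paths))
```

Because each stream checks primary-system load before moving data, aggregate throughput rises when the system is idle and falls when users and applications are active, which is what lets backups run continuously rather than inside a window.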

The ideas here are pretty straightforward, and they work! Igneous customers use Igneous Hybrid Storage Cloud to back up primary storage file systems that previously couldn't be backed up within acceptable backup windows.

The trick here is making these concepts work together, and this is where our engineering comes in. Look for future posts about how we overcame challenges to implement our unique secondary storage approach, including:

  • Removing the reliance on NDMP to track changes
  • Integrating with NAS systems to enforce read consistency
  • Providing a horizontally scalable, performant backend target for multi-streaming

Download our "Secondary Storage for the Cloud Era" whitepaper for more insights on today's secondary storage challenges and solutions for overcoming them. 

