How to Create Your Active Archive with Hybrid Cloud

by David Cerf

The exponential growth of unstructured data, combined with limited storage budgets, is driving the need for more cost-effective storage architectures such as active archiving. As data ages or its performance profile changes, it makes sense to move it from primary storage to economy tiers such as lower-cost disk, tape or even cloud.

While high-performance storage keeps data rapidly available, it comes with significant costs. It makes little sense today to just keep buying expensive storage, filling it up and buying more. With today’s storage options, it is best to use the right mix of storage technologies so that the right data sits on the right media, balancing performance against cost. The challenge is to do this simply and to gain real savings without adding complexity.

To start, you must be able to identify what data can be managed in an active archive architecture. The majority of data stored in file systems and object storage is unstructured data. Examples include rich media (video), general purpose files, and machine generated data. According to research from the International Data Group, unstructured data is growing at more than 60 percent per year. In the next five years, as much as 93 percent of our digital universe will consist of unstructured data. Much of this data is a perfect candidate for active archive storage.

What if you don’t know how much data is in that archive category? Fortunately, free storage assessment tools are available to identify whether data in your storage is active or inactive. Running such a tool shows how much data could be managed more effectively to reduce costs, shrink your backups, and save money with an active archive approach.
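
As a rough illustration of what such an assessment measures, here is a minimal Python sketch that walks a directory tree and totals the bytes in recently used versus idle files. It is not any vendor's tool: the function name, the 90-day threshold, and the use of access and modification timestamps as the activity signal are all assumptions made for this example. Access times in particular can be unreliable on file systems mounted with options such as noatime.

```python
import os
import time
from pathlib import Path

SECONDS_PER_DAY = 86_400


def summarize_activity(root, inactive_days=90, now=None):
    """Total the bytes under `root` in active vs. inactive files.

    A file counts as inactive when neither its access time nor its
    modification time falls within the last `inactive_days` days.
    The 90-day default is an illustrative assumption, not a standard.
    """
    now = time.time() if now is None else now
    cutoff = now - inactive_days * SECONDS_PER_DAY
    totals = {"active_bytes": 0, "inactive_bytes": 0}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                continue  # skip files that vanish or cannot be read
            last_used = max(info.st_atime, info.st_mtime)
            key = "active_bytes" if last_used >= cutoff else "inactive_bytes"
            totals[key] += info.st_size
    return totals
```

Calling `summarize_activity` on a share you suspect is mostly idle gives a first estimate of how much of it is a candidate for the archive tier.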

Once this data is identified, the next question is how to build an active archive using the existing storage you have today.

“Active archiving” simply means an archiving strategy in which all data is protected and always available, with data tiered across high-performance and low-cost storage so that the total cost is lower than with traditional single-tier storage. As cloud storage becomes ubiquitous among IT users, we must pay close attention to where cloud fits in long-term archiving, and where it doesn’t.

Here are a few key criteria to evaluate when considering a cloud-based active archiving solution:

  • Know the true costs – watch for hidden expenses
  • Choose non-proprietary solutions to ensure your data is always accessible
  • Ensure your data is secure and protected to meet your on-site and off-site requirements
  • Read the SLA fine print – for example, how long does it take to retrieve your data
  • Simple is always better
  • Leverage what you already own

There are myriad options for active archiving. The strategy often includes hybrid storage architectures that combine flash, disk, tape and cloud storage. The secret sauce boils down to managing these storage technologies so they work best for your business. Some solutions offer intelligent storage management with built-in, automated policies to protect and manage files and objects. This can include self-healing and self-managing features that keep data resilient and protected at all times across different storage media and multiple sites.

Enter the Cloud
Companies use cloud storage for a variety of reasons, including archiving. Unfortunately, storing data in the “cloud” can force serious compromises such as hidden transfer and access fees and proprietary “lock-in.” Most importantly, as data grows, the costs can add up fast and push users out of their budgetary comfort zone. So do the math. Make sure the cloud will actually save you money in the long run; it might look attractive initially but prove costly in the second or third year. And not all cloud services are equal with regard to the SLA and data ownership, so read the fine print. A recently published TCO study by Brad Johns Consulting reported that one active archive, Fujifilm’s Dternity Cloud, is 34% less expensive than a comparable Amazon Glacier offering. Additionally, the report found that with Dternity, data always remained the property of the user, not the vendor.
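
To make “do the math” concrete, the following Python sketch projects cumulative spend year by year as stored data grows. Every number you feed it is your own assumption: the function name, the per-terabyte price, the retrieval fee and the fraction of data retrieved each year are hypothetical placeholders, not quotes from any provider or figures from the TCO study mentioned above.

```python
def cumulative_cost(start_tb, annual_growth, price_per_tb_month,
                    retrieval_fraction, retrieval_fee_per_tb, years):
    """Return the running total cost at the end of each year.

    Simplified model (an assumption for illustration): capacity is
    billed for twelve months at the size held at the start of the year,
    and a fixed fraction of that data is retrieved once per year.
    """
    totals = []
    running = 0.0
    stored_tb = start_tb
    for _ in range(years):
        storage = stored_tb * price_per_tb_month * 12
        retrieval = stored_tb * retrieval_fraction * retrieval_fee_per_tb
        running += storage + retrieval
        totals.append(round(running, 2))
        stored_tb *= 1 + annual_growth  # data grows before the next year
    return totals


# Hypothetical inputs: 100 TB growing 60 percent a year, $5 per TB-month,
# 10 percent of data retrieved annually at $20 per TB.
three_year_totals = cumulative_cost(100, 0.60, 5.0, 0.10, 20.0, years=3)
```

Running the same function with each candidate's pricing, including an on-premises option, shows whether an offer that looks cheap in year one still wins by year three.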

Private clouds can overcome these issues, and many active archive solutions can be used to build a private cloud. Ideally, a cloud-enabled active archive should use open standards so that data is never out of reach or “locked down” by the vendor software or storage. Since an archive is all about long-term storage, make sure the storage can be self-describing and accessible without dependencies on the vendor or applications so you retain ownership and access rights.

Hybrid Cloud
Companies often need to maintain some data onsite with protected copies in a second location. Employing a hybrid cloud also helps organizations meet the golden rule of data protection: 3 copies, on 2 different media, with 1 copy offsite. Innovative cloud storage solutions can now support this 3-2-1 strategy by serving as a secondary site for replicated disaster recovery copies.
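
The 3-2-1 rule is simple enough to check mechanically. The sketch below is one way to express it in Python; the `Copy` record and its field names are invented for this example and do not correspond to any product's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One stored copy of a data set (hypothetical record for illustration)."""
    location: str  # e.g. "datacenter-a"; labels are illustrative
    medium: str    # e.g. "disk", "tape", "cloud"
    offsite: bool  # True if the copy lives away from the primary site


def meets_3_2_1(copies):
    """Return True if there are at least 3 copies, on at least
    2 different media, with at least 1 copy offsite."""
    media = {c.medium for c in copies}
    return (
        len(copies) >= 3
        and len(media) >= 2
        and any(c.offsite for c in copies)
    )


# A hybrid layout: two onsite copies plus a replicated cloud copy.
hybrid = [
    Copy("datacenter-a", "disk", offsite=False),
    Copy("datacenter-a", "tape", offsite=False),
    Copy("cloud-region-1", "cloud", offsite=True),
]
```

Here `meets_3_2_1(hybrid)` returns True, while dropping the cloud copy would fail both the copy count and the offsite requirement.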

Active Archive as a Cloud Storage Gateway
Active archive solutions should be able to leverage hybrid architectures to support both on- and off-site storage, including the cloud. Ideally, the solution has the intelligence to blend local storage with the cloud, making the best use of storage resources and ensuring the right data resides on the right storage. This approach is often called a storage gateway and is ideal for delivering the most cost-effective multi-site storage, meeting DR requirements and providing remote accessibility.
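
The “right data on the right storage” decision is usually driven by policy. As a simplified sketch of what such a policy might look like, the Python below maps a file's idle time to a tier. The tier names and the 30- and 180-day thresholds are assumptions chosen for illustration; real gateways typically weigh more signals, such as file type, size and retention rules.

```python
# Ordered policy: (maximum idle days, tier). None means "no upper limit".
# Thresholds and tier names are hypothetical, not taken from any product.
TIER_POLICY = [
    (30, "flash"),
    (180, "disk"),
    (None, "tape-or-cloud"),
]


def choose_tier(days_since_access, policy=TIER_POLICY):
    """Return the first tier whose idle-time limit covers the file."""
    if days_since_access < 0:
        raise ValueError("days_since_access must be non-negative")
    for max_idle_days, tier in policy:
        if max_idle_days is None or days_since_access <= max_idle_days:
            return tier
    raise ValueError("policy has no catch-all tier")
```

Keeping the policy as data rather than code means thresholds can be tuned, or a new tier added, without touching the placement logic.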

Make sure the solution can work with your existing applications without changing application or user behavior. Look for solutions with both file and S3 interfaces to deliver “plug and play” support for both your cloud and file based applications.

With this approach, you can control data cost and maintain chain of custody, storing objects or files in a familiar way without modifying applications or having to write scripts. Also make sure the solution can support a “living archive” model. This means the solution is intelligent enough to auto-migrate data so you can easily add new technology to your storage environment without “fork-lift” upgrades or “rip and replace” migrations.

The Bottom Line
With data set to reach yottabyte status in the next few years, crafting an active archive is more important than ever before. As companies continue to embrace and deploy cloud solutions across their organizations, cloud storage is gaining popularity as a lower cost way to manage the deluge of data. Leveraging intelligent active archiving for onsite, offsite and cloud can help lower costs, improve data protection and solve the complexity of data management.

David Cerf is the EVP of Strategy and Business Development at Crossroads Systems and an Active Archive Alliance Member.

WWPI – Covering the best in IT since 1980