Shrink Your Data with Deduplication – But Make Sure You’re Doing It Right

Maximize your investment in data protection and recovery solutions with true global source-side deduplication

“Too much data, not enough storage space, not enough time and not enough budget” are familiar statements among small and medium-sized businesses. Protecting and ensuring recovery for an ever-increasing volume of data while managing costs is a serious concern for companies that may not have the IT resources, infrastructure or personnel of a large, enterprise-level organization. Backups may fail, take up too much space or cost too much, all of which can have a significantly negative impact on an SMB’s ability to recover and operate in the event of a data loss.

One way to address the issue of exponential data growth is to optimize the deduplication process. By identifying and eliminating redundant portions of a data set, deduplication technology – an essential part of any data backup solution – can dramatically reduce an organization’s need for storage space and network bandwidth.
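To make the idea concrete, here is a minimal sketch of chunk-level deduplication. It assumes fixed-size chunks and SHA-256 fingerprints, which are illustrative choices rather than a description of any particular product, and the function names `deduplicate` and `restore` are hypothetical.

```python
import hashlib

# Tiny chunk size for illustration only; real systems use chunks of several KB.
CHUNK_SIZE = 4


def deduplicate(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks and keep one copy of each unique chunk."""
    store = {}   # chunk fingerprint -> chunk bytes (the unique chunks)
    recipe = []  # ordered fingerprints needed to rebuild the original data
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # store the chunk only the first time
        recipe.append(digest)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data from the unique chunks and the recipe."""
    return b"".join(store[d] for d in recipe)


data = b"ABCDABCDABCDWXYZ"  # 16 bytes, with one 4-byte chunk repeated three times
store, recipe = deduplicate(data)
stored_bytes = sum(len(c) for c in store.values())
print(f"original: {len(data)} bytes, stored: {stored_bytes} bytes")
```

The repeated chunk is kept once, and the recipe of fingerprints is enough to reconstruct the data on restore.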

Deduplication comes in many variations, each with its own trade-offs, but source-side deduplication is becoming widely recognized as the premier, next-generation approach. Because it backs up only new and unique data at the source, source-side deduplication can yield dramatic savings in bandwidth, required storage and the corresponding costs.

Source-side deduplication relies on next-generation backup servers that work in conjunction with agents installed on the clients (the “data source”). The agents check for duplicate data before anything is transferred over the network. This approach is quickly replacing target deduplication, which removes duplicates only after the data has reached the backup storage, as the preferred method of deduplication for data protection.
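The exchange between agent and backup server can be sketched roughly as follows. This is a simplified, in-memory model under stated assumptions: the `BackupServer` and `Agent` classes, their method names and the 4-byte chunk size are all hypothetical, and a real product would add networking, persistence and error handling.

```python
import hashlib


def chunk_hashes(data: bytes, size: int):
    """Yield (fingerprint, chunk) pairs for fixed-size chunks of data."""
    for i in range(0, len(data), size):
        chunk = data[i:i + size]
        yield hashlib.sha256(chunk).hexdigest(), chunk


class BackupServer:
    """Hypothetical server that stores unique chunks keyed by fingerprint."""

    def __init__(self):
        self.chunks = {}

    def missing(self, hashes):
        """Return the fingerprints the server does not hold yet."""
        return {h for h in hashes if h not in self.chunks}

    def upload(self, new_chunks):
        self.chunks.update(new_chunks)


class Agent:
    """Hypothetical client-side agent that deduplicates before sending."""

    def __init__(self, server, chunk_size=4):
        self.server = server
        self.chunk_size = chunk_size

    def backup(self, data: bytes) -> int:
        """Back up data and return the number of payload bytes sent."""
        local = dict(chunk_hashes(data, self.chunk_size))
        needed = self.server.missing(local.keys())   # ask before transferring
        to_send = {h: local[h] for h in needed}      # only the unknown chunks
        self.server.upload(to_send)
        return sum(len(c) for c in to_send.values())


server = BackupServer()
agent = Agent(server)
first = agent.backup(b"AAAABBBBCCCC")       # initial backup: every chunk is new
second = agent.backup(b"AAAABBBBCCCCDDDD")  # later backup: only one new chunk
print(f"first backup sent {first} bytes, second sent {second} bytes")
```

The payload count ignores the fingerprints themselves, which also cross the network but are far smaller than the chunks they describe.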

Global source-side deduplication takes this process a step further. This method shares an organization’s deduplication intelligence across all source systems. Every computer, virtual machine or server that is backed up communicates with a backup server, which acts as the central data store and manages a global database index of the data on every machine. The backup server does the work of figuring out what needs to be backed up and pulls only new data as required, while eliminating duplicate copies. As you’d imagine, this can yield tremendous operational efficiencies and tangible cost savings by reducing the amount of data that is backed up (by up to 92% with some solution providers). Further, it optimizes storage requirements and required bandwidth, while accelerating data protection and recovery. With a much lower volume of data, the frequency of backups can then be increased to improve recovery point objectives.
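A small comparison shows why the scope of the index matters. The machine names and contents below are invented for illustration, and the helper `stored_bytes` is hypothetical; the sketch simply counts how many bytes must be kept when each machine is deduplicated on its own versus against one shared index.

```python
import hashlib


def stored_bytes(datasets, chunk_size=4):
    """Bytes kept when all given datasets share a single chunk index."""
    seen = {}  # chunk fingerprint -> chunk length
    for data in datasets:
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            seen.setdefault(hashlib.sha256(chunk).digest(), len(chunk))
    return sum(seen.values())


# Invented example: three machines sharing the same OS and application chunks.
machines = {
    "laptop-1": b"OSOSAPPSDOC1",
    "laptop-2": b"OSOSAPPSDOC2",
    "server-1": b"OSOSAPPSDB01",
}

raw = sum(len(d) for d in machines.values())
per_machine = sum(stored_bytes([d]) for d in machines.values())  # one index each
global_scope = stored_bytes(machines.values())                   # one shared index

print(f"raw: {raw}, per-machine dedup: {per_machine}, global dedup: {global_scope}")
print(f"global reduction: {1 - global_scope / raw:.0%}")
```

In this toy data set, per-machine deduplication saves nothing because no machine repeats its own chunks, while the shared index stores the common operating system and application chunks only once. Real-world savings depend entirely on how much data the machines actually have in common.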

The trick with global source-side deduplication is performance. Because a global index serves every source system rather than just one, each additional system puts more strain on the global database index. To be successful, your backup vendor must solve some very tricky performance and scalability challenges in managing the global hash database so that global source-side deduplication can function properly. Don’t be fooled by data protection solutions that limit deduplication to a single storage volume or a single backup job. Solutions that limit the scope of deduplication, such as the one in Windows Server 2012, do so to limit the size of the hash database index. True global deduplication works across the entire network for maximum results.
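A back-of-the-envelope estimate illustrates how the index grows with scope. The figures here are assumptions chosen for easy arithmetic, not measurements from any vendor: an average chunk of 16 KiB and 48 bytes per index entry (a 32-byte fingerprint plus 16 bytes of location metadata).

```python
def index_size_bytes(unique_data_bytes, avg_chunk_bytes=16 * 1024, entry_bytes=48):
    """Estimate hash index size: one entry per unique chunk (assumed figures)."""
    entries = -(-unique_data_bytes // avg_chunk_bytes)  # ceiling division
    return entries * entry_bytes


TIB = 1024 ** 4
GIB = 1024 ** 3

single_volume = index_size_bytes(1 * TIB) / GIB   # dedup limited to one 1 TiB volume
whole_network = index_size_bytes(10 * TIB) / GIB  # one index over 10 TiB of unique data

print(f"1 TiB scope:  ~{single_volume:.0f} GiB of index")
print(f"10 TiB scope: ~{whole_network:.0f} GiB of index")
```

Under these assumptions the index grows linearly with the amount of unique data it covers, which is why a global index has to be engineered for fast lookups at a size that a per-volume index never reaches.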

SMBs (and organizations of all sizes) can reap the full benefits of deduplication technology, and of their investment in it, by choosing a truly “global” solution. And when that solution is seamlessly interwoven with data protection and recovery solutions, simultaneously managing and protecting impossibly huge volumes of data seems a little less daunting.

About Christophe Bertrand
Christophe Bertrand is the vice president of Product Marketing at Arcserve. He joined the Data Management division at CA Technologies, now the independent Arcserve company, to lead Product Marketing and launch the award-winning Arcserve Unified Data Protection (UDP) solution. With more than 20 years in the data storage and protection space, Christophe enjoys sharing his knowledge and best practices worldwide.
