Compression and Deduplication

Picking up from our last post in this series, “Tales of Data Storage Evolution in Retail: Virtualization”:

“Data has become the new currency for retailers, their most critical asset; however, retailers have been more reactive than proactive in managing this asset. The results of which have led to blind additions of capacity, data trapped in storage silos, duplicate data, extended refresh cycles, and ad hoc cloud usage that leads to dead-ends and the need to buy more storage. And data is not getting cheaper at the same rate that more data is being collected – A vicious cycle.”

This means that data storage can no longer be just a commodity in future IT infrastructures. Data management is now driving storage, and the storage world (infrastructure vendors and users alike) must respond with the right products and tools to efficiently support and exploit that data, to both operational and financial advantage. The goal is technologies that depend less heavily on increasing hardware footprints and instead leverage automation and other sophisticated tools for greater efficiencies.

Your strategy does not end with virtualization; it begins there. With virtualization:

  • Disparate, inefficient storage environments can be a conundrum of the past.
  • Companies can take existing storage systems – regardless of brand, make, or model – and create a completely new virtualized, performance-boosted environment.
  • External virtualization software doesn’t just connect your storage. It truly virtualizes and improves existing infrastructure performance.
  • Add in a single-point-of-management and migration and your company gains the ability to manage, update, and protect all storage resources from one platform.

But now that storage virtualization has grown the capacity, how do you keep up in performance? This is where compression and de-duplication come in, together with automated tiering and automated migration. The next step in your storage enhancement process is to optimize those stored data pools. Imagine enhancing the gas in your car so that you can travel farther without degrading the performance and speed of your engine.

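Compression and de-duplication: the ability to shrink the space data occupies while keeping the data intact and viable. Compression re-encodes data in fewer bits, and de-duplication stores only one copy of identical pieces of data; in both cases the original data remains fully recoverable.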
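To make the idea concrete, here is a minimal sketch of block-level de-duplication combined with lossless compression. It is an illustration only, not how any particular storage product works: the fixed 4 KB block size, the SHA-256 fingerprints, and the function names `dedup_and_compress` and `restore` are all assumptions chosen for the example.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed fixed block size, for illustration only


def dedup_and_compress(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and keep one compressed copy of each unique block."""
    store = {}   # block fingerprint -> compressed block (each unique block stored once)
    recipe = []  # ordered fingerprints needed to rebuild the original data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        fingerprint = hashlib.sha256(block).hexdigest()
        if fingerprint not in store:
            # First time this block is seen: compress it and store it.
            store[fingerprint] = zlib.compress(block)
        # Duplicates cost only a reference in the recipe.
        recipe.append(fingerprint)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original bytes from the block store and the recipe."""
    return b"".join(zlib.decompress(store[fp]) for fp in recipe)


if __name__ == "__main__":
    # Ten logical blocks, but only two distinct ones.
    sample = (b"A" * BLOCK_SIZE) * 8 + (b"B" * BLOCK_SIZE) * 2
    store, recipe = dedup_and_compress(sample)
    stored_bytes = sum(len(c) for c in store.values())
    assert restore(store, recipe) == sample  # integrity is preserved
    print(f"logical bytes: {len(sample)}, stored bytes: {stored_bytes}, "
          f"unique blocks: {len(store)}")
```

The point of the sketch is the round trip: far fewer bytes are stored than were written, yet `restore` returns exactly the original data.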

When evaluating compression software, look for the ability to compress data that is already in your storage tank, not just the new data being siphoned in. This optimizes your entire storage infrastructure, creating more space and helping to significantly reduce data center expansion. Add the power to automatically compress new data as it is stored, and your company can increase storage capacity and use less power.

Compression and de-duplication can reduce:

  • Storage purchase costs
  • Rack space
  • Power and cooling
  • Software costs for additional functions

The right compression software lets users store more active, primary data (up to 5X is not atypical) in a given amount of physical disk space. This provides key benefits: reduced physical storage requirements, improved efficiency that leads to reduced storage costs, or a combination of increased capacity and decreased costs. But keep in mind that not all compression technologies are created equal. Some will appreciably affect processing speed and create performance bottlenecks. For example, compression on an all-hard-disk-drive (HDD) storage solution typically increases latency. (But that is fodder for our next topic in the series.)

Without some form of tiering, these technologies may even decrease usable storage space and raise storage refresh costs.
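The capacity claim above comes down to simple arithmetic, sketched here. The function names and the 100 TB figures are hypothetical, and the 5X ratio is only the "not atypical" upper figure quoted above; real ratios depend heavily on the data being stored.

```python
def effective_capacity_tb(physical_tb: float, reduction_ratio: float) -> float:
    """Logical data (TB) that fits in physical_tb at a given reduction ratio (5.0 means 5X)."""
    if physical_tb < 0 or reduction_ratio <= 0:
        raise ValueError("physical_tb must be >= 0 and reduction_ratio must be > 0")
    return physical_tb * reduction_ratio


def physical_needed_tb(logical_tb: float, reduction_ratio: float) -> float:
    """Physical capacity (TB) needed to hold logical_tb at a given reduction ratio."""
    if logical_tb < 0 or reduction_ratio <= 0:
        raise ValueError("logical_tb must be >= 0 and reduction_ratio must be > 0")
    return logical_tb / reduction_ratio


def space_saved_fraction(reduction_ratio: float) -> float:
    """Fraction of physical space saved compared with storing the data unreduced."""
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be > 0")
    return 1.0 - 1.0 / reduction_ratio


if __name__ == "__main__":
    ratio = 5.0  # the "up to 5X" figure; an assumption, not a guarantee
    print(effective_capacity_tb(100, ratio))  # logical TB that fit on 100 TB of disk
    print(physical_needed_tb(100, ratio))     # disk TB needed for 100 TB of data
    print(space_saved_fraction(ratio))        # share of disk space saved
```

At 5X, 100 TB of physical disk holds about 500 TB of logical data, or equivalently 100 TB of data needs about 20 TB of disk. At a more modest 2X, the same disk holds only 200 TB, which is why the achievable ratio matters when planning purchases.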

So, the next step is to migrate differently accessed data elements to the appropriate “tier” of storage: for example, “hot” workloads might be migrated to the most accessible and fastest media.
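A tiering policy of this kind can be sketched in a few lines. Everything specific here is an assumption made for illustration: the tier names (`fast`, `capacity`, `archive`), the thresholds, and the `DataSet` fields are hypothetical, and real automated-tiering products use their own metrics and rules.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str
    reads_per_day: float          # recent access frequency
    days_since_last_access: int   # recency of last access


# Hypothetical thresholds; a real policy would be tuned to the workload.
HOT_READS_PER_DAY = 1000
COLD_AFTER_DAYS = 90


def choose_tier(item: DataSet) -> str:
    """Pick a target tier from access recency and frequency."""
    if item.days_since_last_access >= COLD_AFTER_DAYS:
        return "archive"    # rarely touched: cheapest storage
    if item.reads_per_day >= HOT_READS_PER_DAY:
        return "fast"       # "hot" data: fastest, most accessible media
    return "capacity"       # everything in between


def plan_migrations(items, current_tiers):
    """Return {name: (current_tier, target_tier)} for items that should move."""
    moves = {}
    for item in items:
        target = choose_tier(item)
        current = current_tiers.get(item.name)
        if current != target:
            moves[item.name] = (current, target)
    return moves


if __name__ == "__main__":
    items = [
        DataSet("pos_transactions", reads_per_day=50000, days_since_last_access=0),
        DataSet("last_year_reports", reads_per_day=2, days_since_last_access=200),
        DataSet("product_catalog", reads_per_day=300, days_since_last_access=1),
    ]
    current = {"pos_transactions": "capacity",
               "last_year_reports": "fast",
               "product_catalog": "capacity"}
    for name, (src, dst) in plan_migrations(items, current).items():
        print(f"{name}: {src} -> {dst}")
```

Running such a plan on a schedule is what makes the tiering "automated": data is moved when its usage changes rather than when someone remembers to move it.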

It is very important to use the right kind of storage to avoid problems and wasted resources. For example, in a high-performance environment, well-meaning IT professionals may end up keeping low-priority data on high-performance devices simply because they don’t have the ability to automatically migrate it to appropriate systems as its usage becomes less urgent or frequent.

This might leave heavy transactional database applications with “hot” content suffering delays in I/O performance. In other words, the application demands data faster than the storage can deliver it, causing excessive CPU cycle demands on the storage controllers and diminishing the application’s responsiveness to users.

Data de-duplication and retention policies can help alleviate some of this storage demand. However, demand for more and diverse types of data, including streaming data from social media sites or customer service systems, machine data, archives, or financial data, shows no sign of abating. The new world order is more and faster data.

Uptime can be another major benefit of a good storage environment, and it is achieved in two main ways: through continuous operations, or via non-disruptive migrations. With continuous operations, as new applications are rolled out and new storage capacity or a new storage tier is brought online, there need not be any downtime. In the case of non-disruptive migrations, as data is migrated to new storage systems or technologies, downtime can again be avoided. With that said, automated migration solutions are becoming a must-have in the retail IT infrastructure.

The “right” infrastructure—that is, optimal from both an operational and financial perspective, with automated flexibility to respond to changing data needs—can contribute to successful business results by:

  • Increasing performance, enabling faster analytics, and speeding up “time to business insights.”
  • Storing more data in less, better-optimized space, thereby reducing energy costs, freeing up capital expense budgets, and saving on operational costs.
  • Reducing complexity, driving down operational expenses, and enabling workers to focus on strategic priorities.

Are we there yet? Not without the final piece to this IT infrastructure puzzle: Using the right media/technology, in the right place, to tie it all together.

More articles by Timothy Hill, PMP

  • Case for Flash

    Our journey to address the data conundrum and subsequent storage requirements facing retail today has brought us to the…

  • Virtualization

    Continuing from my post last week, The Data Conundrum: Today’s Retailers are awash in high-volume, high-velocity data…

  • The Data Conundrum

    Today’s Retailers are awash in high-volume, high-velocity data coming at them from all sides… eCommerce, for example…
