Ending a Nightmare: Backups

Backups have always been a nightmare for IT. A huge amount of data gets stored just in case… We copy everything from our server to tape just in case… We keep several copies of our data just in case… Let us get a copy stored far away just in case… Better to have one close just in case… Batch job about to start, let us back up before just in case… Batch job finished, let us back up now just in case… We copy our data to tapes or to a big virtual tape library just in case… We back up every day just in case… We keep a replica just in case…

Backup is a necessary evil. Tape vaulting, transportation or remote duplication. The complex restore operation. The special hardware for the 20-year-old backups. The new user interface that is no longer compatible with that special tape holding the CEO's data from 40 years ago. The backup slots for a follow-the-sun application. The backup schedule that needs 25 hours in a day. The two tapes that skipped the rotation plan. The new application that was never backed up. Tape 5 of the backup sequence that is missing for the restore. The new backup operator who panics the first time he is alone at night…

But why do we back up? In my experience I have found three patterns for just in case…

  1. … our systems are destroyed; we need to restore them back to life. Recovery.
  2. … our records are not accurate; we need to restore information from long ago. Repudiation. 
  3. … our data was corrupted; we need to restore back to before corruption occurred. Integrity.

As usual, I eat the elephant piece by piece. So what I probably need is a solution for each of the three patterns: Recovery, Integrity and Repudiation. A fire occurred: we use the Recovery pattern. A mechanical drive failure caused data loss: we use the Integrity pattern. A system was stolen: we use the Recovery pattern. Our customer contests our process: we use the Repudiation pattern. Our application developers deployed a program that changed the data incorrectly: we use the Integrity pattern.
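The incident examples above can be written down as a simple lookup. This is only a sketch: the incident labels and the `pattern_for` helper are hypothetical names chosen for illustration, not part of any tool.

```python
from enum import Enum


class Pattern(Enum):
    RECOVERY = "Recovery"        # systems destroyed or lost
    REPUDIATION = "Repudiation"  # old records contested
    INTEGRITY = "Integrity"      # data corrupted


# Hypothetical incident labels, one per example given in the text.
INCIDENT_TO_PATTERN = {
    "fire": Pattern.RECOVERY,
    "system_stolen": Pattern.RECOVERY,
    "drive_failure_data_loss": Pattern.INTEGRITY,
    "bad_deployment_changed_data": Pattern.INTEGRITY,
    "customer_contests_process": Pattern.REPUDIATION,
}


def pattern_for(incident: str) -> Pattern:
    """Return the pattern that answers a given incident label."""
    return INCIDENT_TO_PATTERN[incident]
```

Each incident maps to exactly one pattern, which is the point of splitting the problem into three pieces.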

The Recovery pattern

The typical backup solution for this is the bare-metal restore backup. We take a full backup that includes not only the data but the full system image: operating system, drivers, configuration and applications. The backup is shipped to an offsite location just in case something happens to the normal site.

Today this is solved very reliably using storage replication from one location to a second location. Not only is the shipping faster, because as soon as data changes it is sent over telecommunication lines instead of having a security crew transport the tapes, but the frequency also increases, because the copy is made the instant the data changes, with no more waiting for the typical 24-hour backup cycle.

Storage replication enables a Recovery Point Objective of less than 5 minutes. This is much better than backups, which typically give you 24 hours. The replica is ready to be used: it is a permanent, consistent replica of data, programs, configurations and system. The Recovery Time Objective can be a few hours or even minutes, far better than reading and restoring tapes, which can take days for huge databases.
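A rough back-of-the-envelope sketch of these two objectives follows. The 24-hour and 5-minute figures come from the text; the database size (50 TB) and tape throughput (300 MB/s) in the usage note are assumptions picked only to show the arithmetic, and both function names are hypothetical.

```python
def rpo_improvement(backup_interval_min: float, replication_lag_min: float) -> float:
    """How many times smaller the worst-case data loss window becomes."""
    return backup_interval_min / replication_lag_min


def tape_restore_hours(data_tb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to stream data_tb terabytes at a sustained throughput."""
    total_mb = data_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return total_mb / throughput_mb_per_s / 3600


# Daily backup (24 h = 1440 min) versus a 5-minute replication lag.
improvement = rpo_improvement(24 * 60, 5)

# Assumed example: a 50 TB database restored from tape at 300 MB/s.
restore_hours = tape_restore_hours(50, 300)
```

With these assumed numbers the worst-case loss window shrinks 288 times, and the tape restore alone streams for about 46 hours, close to two days, before any application-level recovery begins.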

The Repudiation pattern

The typical backup solution for this is to keep the backups for years upon years upon years just in case… We somehow expect to be able to restore the data of a contract signed on 23rd July 1967 by keeping the backup of our full systems from the end of that month, taken just after processing all payments. So we restore the tapes from 2nd August 1967, with a tape reader we hope still works, onto a computer compatible with the one we had back then, and print the contract just as if we were printing on that old, but then brand new, IBM 1403 printer.

This extreme example is somehow still practised today: we expect that the backups we make today will still be readable in 40 years' time to print that one contract or document. Backup is obviously not the best choice for this long-term retention of data. Today this function of backup has been replaced by generating human-readable documents that can be stored in a digital document archive or in an unstructured but searchable document data lake.

The contract, or any other document that may be required in the future, such as an accounting ledger, customer claim or payment receipt, is registered and indexed. When needed, we search the archive or data lake by the available indexes, without needing the original applications or systems that generated the document in the first place. At most, this archive is replicated once to an offsite location using the previous Recovery pattern.
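The register-and-search idea can be sketched in a few lines. This is a toy in-memory model, not a real archive product: the `Archive` class, its methods and the index names (`kind`, `date`) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Archive:
    """Toy archive: stores human-readable documents plus an index over them."""
    documents: list = field(default_factory=list)
    index: dict = field(default_factory=dict)

    def register(self, text: str, **keys: str) -> int:
        """Store a document and index it under each key=value pair."""
        doc_id = len(self.documents)
        self.documents.append(text)
        for name, value in keys.items():
            self.index.setdefault((name, value), set()).add(doc_id)
        return doc_id

    def search(self, **keys: str) -> list:
        """Return the documents that match every given key=value pair."""
        matches = None
        for name, value in keys.items():
            ids = self.index.get((name, value), set())
            matches = ids if matches is None else matches & ids
        return [self.documents[i] for i in sorted(matches or ())]


archive = Archive()
archive.register("Contract text ...", kind="contract", date="1967-07-23")
archive.register("Receipt text ...", kind="receipt", date="1967-07-23")
found = archive.search(kind="contract", date="1967-07-23")
```

The search needs only the stored text and the indexes, never the application that produced the document, which is what removes the dependency on 40-year-old systems.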

The Integrity pattern

The typical backup solution for this is the daily backup with as many copies as you think you may need. When we identify corrupted data, we search backwards within the backups for the most recent version of the data not yet corrupted. Once that data is found, different restore mechanisms may be used depending on the particular data to be recovered, be it a database, a register, a file, an email…
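The backwards search can be sketched as follows, assuming the backups are held oldest to newest and that we have some check that tells clean data from corrupted data. Both the `latest_clean_copy` helper and the example check are hypothetical.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def latest_clean_copy(backups: Sequence[T], is_clean: Callable[[T], bool]) -> Optional[T]:
    """Walk from the newest backup to the oldest; return the first clean one."""
    for copy in reversed(backups):
        if is_clean(copy):
            return copy
    return None  # every retained copy is already corrupted


# Assumed example: three daily copies, the last one holding a corrupted balance.
daily = [
    {"day": 1, "balance": 100},
    {"day": 2, "balance": 120},
    {"day": 3, "balance": -999},
]
restored = latest_clean_copy(daily, lambda copy: copy["balance"] >= 0)
```

Note that the result can only be as recent as the last copy taken before the corruption, and that `None` comes back when the retention window is shorter than the time it took to notice the problem.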

This is probably the last real need for backup, but even this one can be eliminated as finer technology becomes available. Today we have several storage mechanisms that keep the last processed data together with before-image journals or a log of the transactions made on the storage, which enable us to reconstitute the storage as it was at a given moment in time, be it using snapshot technology or continuous log registration.

We can even attach a virtual machine to that virtual time image and fetch the data from the system at that particular moment in time. With this ability, we can go back in time to a moment at which the data was not yet corrupted and use that data to reconstitute integrity.
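A minimal sketch of going back in time with a before-image journal is shown below. It is a toy key-value model under stated assumptions, not how any particular storage array works: the `JournaledStore` class is hypothetical, and writes are assumed to arrive with non-decreasing timestamps.

```python
_MISSING = object()  # marks "key did not exist before this write"


class JournaledStore:
    """Toy store that keeps current data plus a before-image journal."""

    def __init__(self):
        self.data = {}
        self.journal = []  # (timestamp, key, before_image), in time order

    def write(self, timestamp, key, value):
        """Record the old value in the journal, then apply the new one."""
        self.journal.append((timestamp, key, self.data.get(key, _MISSING)))
        self.data[key] = value

    def view_at(self, timestamp):
        """Return a copy of the store as it was after all writes up to timestamp."""
        view = dict(self.data)  # live data is left untouched
        for ts, key, before in reversed(self.journal):
            if ts <= timestamp:
                break  # everything older was already in place at that moment
            if before is _MISSING:
                view.pop(key, None)
            else:
                view[key] = before
        return view


store = JournaledStore()
store.write(1, "balance", 100)
store.write(2, "balance", 120)
store.write(3, "balance", -999)  # the corrupting write
clean_view = store.view_at(2)
```

The view is built by undoing journal entries newest first, so the live store keeps running while the earlier image is inspected, much like attaching a virtual machine to the point-in-time image.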

No more backups

With the three patterns and the alternative solutions identified for each, we are able to eliminate backup systems once and for all.

  1. Recovery pattern: Storage replication solution
  2. Repudiation pattern: Digital archive solution
  3. Integrity pattern: Storage transaction journal or snapshot solution

You can read more at the reason of one.

Disclaimer: The views, ideas and concepts here depicted and/or described are the sole responsibility of the author and are in no way related to or representing the companies for which the author works or has worked.


Hi Alexandre, Great summary. I fully agree on all three points: step by step, traditional backup can and will be replaced with newer technologies (and widely already is for the first two points: replication/point-in-time recovery and archiving) to cover the need of having the right information available when needed. These technologies may or may not be called “backup” in the future, but they should still fulfil the same requirement as today: being an “insurance” in case of. And since nobody really likes to spend money on insurance, we should at least make every effort to pay for a solution that 1) really can help us “in case of” and 2) covers the needs of the business with regard to fast access, granularity and visibility, and potentially even provides additional benefits. On the first point, in my eyes having a media break is really important and should be part of every future solution acting as a real “insurance” (everything that is based on the source, e.g. snapshots of the original data, is subject to failure if the source fails). On the second point, it should provide instant access to the required (and intact) information but also cover additional needs such as self-service and visibility into the data for reporting, governance, compliance, DevOps purposes and so on. I am not sure what a solution covering all three aspects (from your post: Recovery, Repudiation and Integrity) will look like in the future, or whether there will one day be an all-in-one solution that covers all the needs from the IT and business perspectives. In any case, there is still a lot of room for improvement across the solutions currently available in the market. It will remain interesting and keep us busy for a while, I guess. Best, Marcel

The paradigm is not new and, thankfully, not emergent either. Our awareness that technology will never be 'matured' to the point where we won't have to care about its evolution is what makes us take a snapshot OF THE TIME at which the information was captured and recorded, enabling us to recover it, completely, in the future. Ancient monks had a similar problem. Back then, when books started to rot, decompose or deteriorate, monks had to rewrite them. Back then, too, because of semantics (our changes in systems), knowledge (our data mapping that lets us know what an 'X' means in column B), or even language (different low-level coding), information might get lost, corrupted or misunderstood.

Hi Alex, Like the article. Recovery is an even more horrifying prospect: recovering data to a prior point in time is straightforward for an application in isolation, but doing so when it is integrated with other applications can destroy integrity between them. It is challenging enough when the applications are under your control, and more challenging still when they are integrated with 3rd party applications outside of your control. The problem of maintaining integrity also applies to copies of data held for other purposes, for example data marts and data lakes. There is also the prospect of losing “good data” that had business value. I suspect that such realities of recovery are not considered deeply enough in advance. That can result in protracted discussion before agreement is reached following a problem, which delays the start of any recovery, and during that time business cannot be conducted. Alan

The storage replication scenario with journaling is a reality with Actifio, and gives more data usage features. Digital archive to storage as one product is also a reality. Storage replication was a must-have and is now a feature.
