An unprecedented level of reliability

It's good to be reminded from time to time that information security addresses the availability of data in addition to its confidentiality and integrity. I was reminded of this when I read about Amazon's Glacier low-cost storage service:

Amazon Glacier is an extremely low-cost storage service that provides secure and durable storage for data archiving and backup. In order to keep costs low, Amazon Glacier is optimized for data that is infrequently accessed and for which retrieval times of several hours are suitable. With Amazon Glacier, customers can reliably store large or small amounts of data for as little as $0.01 per gigabyte per month, a significant savings compared to on-premises solutions.

Glacier is designed to provide a level of reliability that was essentially impossible just a few years ago. Here's how Amazon describes it:

Amazon Glacier is designed to provide average annual durability of 99.999999999% for an archive. The service redundantly stores data in multiple facilities and on multiple devices within each facility. To increase durability, Amazon Glacier synchronously stores your data across multiple facilities before returning SUCCESS on uploading archives. Unlike traditional systems which can require laborious data verification and manual repair, Glacier performs regular, systematic data integrity checks and is built to be automatically self-healing. 

Holy cow! That's 11 nines!

When I designed some highly-available dot-com-era e-commerce systems, that was an absolutely impossible goal. And even though I haven't seen how Amazon plans to get those 11 nines, I'm not inclined to doubt that they can actually do it.

So although there are definitely issues with respect to data confidentiality and integrity that still need to be resolved in cloud computing, it certainly looks like availability is one area where cloud computing can give you a benefit that would be hard to get otherwise.

  • Andrew Yeomans

    If you take 1 archive = 1 file (Amazon also allows multiple files per archive) then you can begin to see the need for 11-nines durability.
    My Windows PC currently has over 10^5 files on C: (not including my personal data). With a million customers, that is 10^11 archives, so the durability would mean one lost per year. And if the service is successful, there would be several orders of magnitude more files and customers, being archived for decades. It’s still impressive if it’s achieved.
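
The arithmetic in this comment can be checked with a short sketch. The file and customer counts come from the comment itself. Treating 11-nines durability as a per-archive annual loss probability of 10^-11, and treating losses as independent, are simplifying assumptions made here for illustration, not anything Amazon has stated. The function names are hypothetical.

```python
import math
from fractions import Fraction

# Figures from the comment above; the independence assumption is ours.
ANNUAL_LOSS_PROBABILITY = Fraction(1, 10**11)  # 1 - 99.999999999%
FILES_PER_CUSTOMER = 10**5
CUSTOMERS = 10**6


def expected_annual_losses(archives: int) -> Fraction:
    """Expected number of archives lost per year (exact arithmetic)."""
    return archives * ANNUAL_LOSS_PROBABILITY


def probability_of_any_loss(archives: int) -> float:
    """Chance of losing at least one archive in a year, assuming independent losses."""
    p = float(ANNUAL_LOSS_PROBABILITY)
    # 1 - (1 - p)**archives, computed with log1p/expm1 to avoid rounding error.
    return -math.expm1(archives * math.log1p(-p))


archives = FILES_PER_CUSTOMER * CUSTOMERS
print(archives)                                      # 100000000000
print(expected_annual_losses(archives))              # 1
print(round(probability_of_any_loss(archives), 3))   # 0.632
```

Under these assumptions the expected loss is one archive per year across the whole customer base, and the chance of losing at least one archive in a given year is roughly 63%.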


  • Tom Ritter

    You’re mixing terms. 😉 “When I designed some highly-available” vs “provide average annual durability of 99.999999999%”. Amazon isn’t saying your data is going to be _available_ with that percentage, it’s saying it’s 99.999999999% likely they’re not going to accidentally lose it.

