Best Practices: Backups

April 02, 2011

My last post was on the importance of password management. Another critical best practice when you deal with other people's data is backups.

Since the beginning of the digital age, retaining data for both the short and long term has been an open problem of sorts. Fortunately, fragile bits are easy to copy, but every type of media has a limited lifetime, especially hard disks. Coming up with the right balance to ensure retention is tricky. Much like the claim is the most important part of insurance, recovery is the most important aspect of any backup system, and testing it can be tricky.
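One simple way to test recovery is to restore a file somewhere else and check that it is byte-for-byte the same as the original. Here is a minimal sketch of that idea in Python. The function names are my own for illustration, and it assumes you can restore to a scratch location while the original is still available.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in fixed-size chunks so large backup files don't fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original: Path, restored: Path) -> bool:
    """True if the restored file has the same contents as the original."""
    return sha256_of(original) == sha256_of(restored)
```

A check like this only proves that one file came back intact. It is a starting point for a restore drill, not a replacement for actually practicing a full recovery.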

I will outline my approach to backing up data, but the real key is to have a strategy and to think about the problem. What works for me won't necessarily work for you. Don't rely on an old beige computer in an office somewhere which you hope won't die on you.

Time Machine and Time Capsule

If you are running Mac OS X, definitely take advantage of Time Machine. It can do automated hourly backups and has a well-tested recovery mechanism. A setting was added under System Preferences, Time Machine, Options... to enable or disable backups while on battery power. I find disabling it helps keep performance up. My backups may not be hourly, but when they happen they don't drag my system down.

Apple's Time Capsule sounds like the natural destination for your backups, but in my experience it is somewhat questionable. Apple has had some quality issues with Time Capsules. My first Time Capsule was newer than the generation which has officially been designated as flawed, but it still died after about 18 months. It died in such a way that the hard disk seemed functional, but I wasn't allowed to remove it and still get a warranty replacement of the Time Capsule unless I paid an Apple certified tech to do it. The removal wasn't covered under warranty. For a data backup device, this doesn't seem like a sensible policy.

There are likely better options which allow you to remove your disks if the device itself fails. I am still using my replacement Time Capsule, but when it dies in 18 months I will definitely look into replacing it with something else, likely some kind of network hard disk, since Gigabit Ethernet can move bytes faster than USB 2.0. That matters when you are moving around large 100 GB backup files while dealing with a failure.
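To put rough numbers on that, here is a back-of-the-envelope calculation in Python. It uses the nominal signaling rates (480 Mbit/s for USB 2.0 and 1000 Mbit/s for Gigabit Ethernet) and assumes 1 GB is 10^9 bytes. Real-world throughput will be lower on both, so treat this as a best case rather than a benchmark. The function name is just for illustration.

```python
def transfer_minutes(size_gb: float, link_mbit_per_s: float) -> float:
    """Best-case minutes to move size_gb gigabytes over a link.

    Assumes 1 GB = 10**9 bytes and that the link runs at its full
    nominal rate, which real hardware never quite manages.
    """
    size_bits = size_gb * 1e9 * 8
    seconds = size_bits / (link_mbit_per_s * 1e6)
    return seconds / 60


usb2 = transfer_minutes(100, 480)       # USB 2.0 nominal rate
gigabit = transfer_minutes(100, 1000)   # Gigabit Ethernet nominal rate

print(f"USB 2.0:          {usb2:.1f} minutes")     # 27.8 minutes
print(f"Gigabit Ethernet: {gigabit:.1f} minutes")  # 13.3 minutes
```

Even in this idealized case, the restore over USB 2.0 takes about twice as long, and that gap is felt most when you are waiting on a recovery.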

Two Separate Devices and Off-site Backups

The lesson I have learned is that data really isn't backed up unless it is on two separate devices, not just two separate drives. The more critical the data is, the more places it should be stored. Often the most critical things are also things which need to be kept secure. Having many copies and keeping them secure seem contradictory, but with a bit of encryption built on the password system I outlined in my last post, both can be achieved.
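As a small illustration of the "two separate devices" idea, here is a sketch in Python that copies one file to several destination directories and confirms each copy. It assumes each destination is a directory on a different physical device (for example, two different mounted volumes); the script has no way to check that for you. The function name is hypothetical, and encryption is left out here entirely.

```python
import filecmp
import shutil
from pathlib import Path


def copy_to_devices(source: Path, destinations: list[Path]) -> list[Path]:
    """Copy source into each destination directory.

    Returns the paths of the copies that were verified to match.
    Each destination is assumed to live on a separate device.
    """
    verified = []
    for directory in destinations:
        directory.mkdir(parents=True, exist_ok=True)
        target = directory / source.name
        shutil.copy2(source, target)  # copies contents and file metadata
        # shallow=False compares file contents, not just size and mtime.
        if filecmp.cmp(source, target, shallow=False):
            verified.append(target)
    return verified
```

If the returned list is shorter than the list of destinations, at least one copy did not verify and the data should not be considered backed up yet.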

The best backup strategies include off-site backups. The risks to your data go beyond just device failure. There are also external issues such as fire or other disasters. With data hosting services like Amazon's S3, Rackspace's Cloud Files and others, you can take advantage of off-site backups while hooking into the major players' redundant storage infrastructure at a great price.

Unfortunately, universal tools to make this easy and accessible to everyone aren't quite mature yet, but make no mistake, they are on their way.

In my next post I will outline how I back up my most critical information to Amazon S3 in a secure way using Python, Fabric and OpenSSL.


Tweet comments, corrections, or high fives to @amjoconn