Backup Blues; or, Murphy was a freakin’ optimist

by Lisabet Sarai

Ephemeral. That’s the nature of an author’s work, especially today when every stage of the writing and publishing process is digital. The products of our creativity and sweat exist primarily as files in computer storage, fragile collections of bits that are highly susceptible to corruption or loss. In the first column of the Naughty Bits series, I explained how strings of ones and zeroes can encode any sort of meaning, including the wondrous products of a writer’s imagination. At the time, I didn’t emphasize the dark side of this technological marvel. Let just a few ones flip over to zeroes, or vice versa, and a manuscript can become unreadable!

The media we use to store our precious stories are subject to a wide range of threats. Dust, electromagnetic radiation, power surges, manufacturing defects, and controller malfunctions can all cause hard disks to fail. Indeed, every hard disk will fail, eventually; the tiny, intricate mechanisms used to read and write data will simply wear out.

Thus, it’s not really a question of whether you’ll lose data, but when. Murphy won’t be ignored. When that dreaded day arrives and your work in progress disappears, what will happen? If you’ve been disciplined and diligent about backing up your work, you’ll face some inconvenience, but you’ll be able to recover most if not all of your efforts. If you’ve been lazy or disorganized – if you’ve closed your eyes to the grim realities of the computer world – you might well have to start from scratch.

In this column, I want to discuss general backup issues and strategies. I’m not going to recommend specific programs, services or processes, because there’s no single approach that will work for every author. If you always write on the same computer, in the same location, your backup needs are different from those of an author who travels constantly and writes on the train or in coffee shops. If you generate gigabytes of content weekly, you can’t use the same approach as someone who produces only a few megabytes. A Linux geek like me, comfortable typing command lines and writing scripts, is going to use different tools than someone who runs Windows and wants to do everything through a graphical user interface.

My goal is to outline the dimensions of the problem and sketch some possible solutions, with their advantages and disadvantages.

What’s the worst that can happen?

There are three categories of threat you should consider when choosing a backup strategy:

  • Media or computer failure. Your hard disk crashes. Your tablet gets run over by a truck. Your kid spills Coke all over your laptop keyboard. In these cases, you need a local copy of your data so you can quickly replace what you’ve lost.
  • Major disasters. Your house burns down, or is devastated by a tornado, or is washed away in a flood. Assuming that you and your family escape unscathed, what happens to your stories? This sort of threat suggests a need for storing copies of your data at some other location, away from your computer.
  • Errors or carelessness. You’re not exactly sure how, but in the process of your work this afternoon, you managed to clobber the three chapters you wrote this morning. I’ve done this more times than I care to admit, often by “cleaning up” what I thought was a redundant copy of a file or by mistyping some common command. In this situation, even the best backups will often not allow you to retrieve your most recent work. Some text processing software can create automatic backup versions while you write, or you can train yourself to do this manually. As frustrating as this situation may be – my poor husband has become accustomed to my swearing and melodramatic threats of suicide – the damage done is likely to be less extensive than in the other two scenarios. With decent backups, you should at least be able to return to where you were yesterday.

The basic idea behind every backup strategy is that you want to make copies of your important files, which can be retrieved if something bad happens to your original data. The decisions you face involve the medium used to store the copy, the frequency and mechanisms of making the copy, where the copies should be kept, and how long a particular backup copy should be retained.

Backup Media Options

Let’s consider medium first. For local backups, you can make copies on:

  • Another hard disk attached to the same computer (including an external hard drive);
  • A disk on a different, networked computer;
  • A flash drive or memory stick;
  • Writable optical media, like CD-Rs or recordable DVDs;
  • Paper.

The first option is probably the most convenient. However, if your entire computer is destroyed, you run the risk of losing both your original and your backup copies. The second option is more complicated, since you need to figure out how to get the data from the primary disk to the backup disk, but is likely to provide a more robust backup.

I know many authors who use flash drives as their primary backup media. This solution has attractive aspects. Flash drives are cheap and highly portable, so they’re great for authors writing on the go. They’re also really easy to use on most platforms, though copying the data is likely to be manual rather than automated. However, using flash memory as your main backup medium has two serious problems. First, flash memory supports only a limited number of write operations (in the tens of thousands), so a heavily used flash drive may become non-functional sooner than a hard drive, possibly without warning. Second, memory sticks are so small that it’s really easy to lose them. What would happen if somebody else got hold of copies of your stories? Personally, I use flash drives for backup when I’m traveling, but as soon as I get back to home base, I update the primary backup disk with the work I did on the road.

CDs and DVDs are less likely than hard drives or flash drives to be corrupted by environmental factors such as magnetism or power surges (though they are vulnerable to dust and scratches). The primary disadvantages of these media are their relatively small capacity (though they may well be large enough for some users), the physical space they require for storage (assuming you accumulate your backups over time), and the fact that standards change and formats become obsolete. (I have some backups of my early work on floppy disks. That means we need to retain at least one computer that has a floppy drive – something that is becoming increasingly rare!)

Using printed paper copies for backup is better than not having any backup at all. However, if you ever want to recover your work, you will need to scan it, subject it to OCR (Optical Character Recognition), and then correct the OCR errors – a time-consuming and labor-intensive process. In addition, you can’t use paper to back up non-text data such as book trailers or cover images. Finally, paper requires a lot of physical storage space.

Of course, you don’t have to settle on a single medium for your backups. For instance, you could do a nightly backup to a networked hard drive and a weekly or monthly backup to DVD. This kind of hybrid approach is more complex but generally more reliable than a single-medium solution.

Backup Sequence, Timing and Location

One serious error many people make is to use the same device or medium over and over, with each day’s backups replacing those from the previous day. If you’re doing this, and a file gets corrupted today, but you don’t discover it until the day after tomorrow, you are – pardon my crudeness – screwed. You probably had a good copy of the file in yesterday’s backup. But today’s backup will overwrite that good copy with the bad one – before you realize what you’ve lost.

For this reason, an alternating or tiered backup strategy is often a wise choice. In an alternating backup, you copy your work to one drive on Monday, Wednesday and Friday, and to another one on Tuesday, Thursday and Saturday. In a tiered backup, your work gets copied to one disk or computer every day, but then that computer is backed up periodically (maybe once or twice a week) to another computer/disk. This arrangement increases the likelihood that after your delayed realization of disaster, you’ll still have a copy of the uncorrupted data somewhere.
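For the geekishly inclined, the alternating scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two drive locations and the function name pick_backup_drive are hypothetical, and Sunday is treated as a day with no scheduled backup simply because the schedule described above doesn't mention it.

```python
import datetime
from typing import Optional

# Hypothetical locations of two backup drives; substitute your own.
DRIVE_A = "/mnt/backup_a"
DRIVE_B = "/mnt/backup_b"


def pick_backup_drive(day: datetime.date) -> Optional[str]:
    """Return the drive to use on the given day, or None on Sunday."""
    weekday = day.weekday()  # Monday is 0, Sunday is 6
    if weekday == 6:
        return None  # assumption: no backup is scheduled on Sunday
    # Monday, Wednesday, Friday go to drive A; Tuesday, Thursday, Saturday to drive B.
    return DRIVE_A if weekday % 2 == 0 else DRIVE_B
```

A nightly backup script could call pick_backup_drive(datetime.date.today()) and copy the day's work to whichever location comes back, so that a corrupted file never overwrites both copies on the same night.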

What about backing your data up to “the cloud”, storing your files on some server on the Internet? This has become a popular solution, and it does have some advantages. For one thing, it reduces the risk from major disasters, since your backups aren’t stored in your home or at your office. It’s also cheap (sometimes free) and convenient, especially with the automated backup clients provided by some storage services.

However, you should read the terms and conditions for such services pretty carefully. In the cases I checked:

  • The service has no liability if your data gets lost, corrupted or stolen;
  • The service makes no guarantee that it will remain available. Indeed, the service has the right to terminate your account at any time and will not necessarily allow you to retrieve the data you’ve stored with them if they do terminate you.

Beware, in particular, of free services, which can change or disappear at any time. When you’re not paying anything, you also have no leverage at all with a service provider.

And what if you lose Internet connectivity? Our Internet went down for a week in May. I shudder at the horrible recollection! If I’d been dependent on the ‘net for my backups, I would have risked losing an entire week’s work.

So, as you probably gather, I personally don’t trust cloud-based backup services. However, I do use the Internet on occasion for off-site storage of important files. I pay for hosting my website, and this includes a good-sized chunk of disk space. When I want to be sure a copy of some data exists, physically distant from my home, I’ll copy it up to the web server. I have to be careful, though, to put the data in a private directory. Otherwise any visitor to the site could (potentially) access that data.

How often should you back up your work? If you’re like me, you’re using your computer every day. Thus every day offers a new opportunity to mess things up! Daily backups are essential for me. Your situation might differ, of course.

And how should you make the backup copies? If you can find a way to automate the backup process, rather than relying on manual backups, it’s likely to be more sustainable in the long run. It’s all too easy to forget to copy your work when you’re tired, or excited, or suddenly interrupted. Many software solutions for automated or semi-automated backup are available for different platforms. If you have geekish tendencies, you can roll your own.

The question of where to keep backup copies pits convenience against safety. You probably want recent backups to be immediately available when Murphy strikes. On the other hand, it’s highly desirable to have some sort of off-site backup facility as well, in case your premises are destroyed. We keep a backup hard drive with our most important files in our safe deposit box at the bank.

Backup Longevity

How long should you keep backups? Clearly it’s not feasible, even with today’s cheap storage, to save full copies of all daily backups. However, I can tell you from experience that it’s worth making periodic snapshots of your work to keep indefinitely. More than once I’ve found myself scanning through CD-based backups from five or even ten years ago, trying to locate an old file. Sometimes I’m successful, sometimes not. On balance, though, it can be a lot cheaper and easier to keep permanent copies of files than to recreate those files de novo. The latter is often simply impossible.
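One way to picture the trade-off between daily backups and permanent snapshots is a simple retention rule. The Python sketch below is an assumption for illustration, not a recommendation from experience: it keeps every backup from the last seven days plus the earliest backup of each month, and everything else becomes a candidate for deletion. The function name backups_to_keep and the seven-day window are made up for the example.

```python
import datetime
from typing import Iterable, Set


def backups_to_keep(
    dates: Iterable[datetime.date],
    today: datetime.date,
    recent_days: int = 7,
) -> Set[datetime.date]:
    """Keep all backups from the last recent_days days, plus the earliest of each month."""
    keep = set()
    seen_months = set()
    for day in sorted(set(dates)):
        month = (day.year, day.month)
        if month not in seen_months:
            # The earliest backup in a month serves as that month's permanent snapshot.
            seen_months.add(month)
            keep.add(day)
        if 0 <= (today - day).days < recent_days:
            keep.add(day)
    return keep
```

Any backup date not in the returned set could be deleted to reclaim space. How wide to make the recent window, and how often to take a permanent snapshot, depends on how much you write and how much storage you have.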

Testing Your Backup Strategy

Suppose you choose a backup strategy and put it into practice. You might be lucky. You could go months or even years without losing a file. Then Murphy strikes, and you discover (for example) that none of your backup CDs can be read.

The lesson here is that you should test your backups periodically, to make sure that you actually have the information you need. I worked for a software company (back in the dark ages) where every night our systems were backed up using an expensive automated mechanism onto tape cartridges. Things went swimmingly for several months. Then our server had a disk crash. When the system administrator tried to restore from the backups, he found that every single tape was blank.
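If your backup is a plain copy of your working directories, one way to test it is to compare every original file with its backup copy. Here is a minimal sketch in Python, assuming the backup mirrors the original folder structure; the function names are invented for this example. It compares SHA-256 checksums, so even a single flipped bit in a backup copy will show up.

```python
import hashlib
from pathlib import Path
from typing import List


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_backup_problems(original_dir: Path, backup_dir: Path) -> List[str]:
    """Return relative paths of files that are missing from, or differ in, the backup."""
    problems = []
    for src in sorted(original_dir.rglob("*")):
        if not src.is_file():
            continue  # skip directories
        rel = src.relative_to(original_dir)
        copy = backup_dir / rel
        if not copy.is_file() or file_digest(copy) != file_digest(src):
            problems.append(rel.as_posix())
    return problems
```

An empty list means every file in the original directory has an identical twin in the backup. Note the limits of this check: it reads each file fully into memory, which is fine for manuscripts but not for huge video files, and it only makes sense right after a backup, before you've edited the originals again.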

What does the Erotogeek do?

You might be wondering about my personal backup strategy for my writing and marketing data. I have to admit that my husband, who’s even more of a geek than I am, was a major force in its design and implementation. Furthermore, we have refined our approach over the years. Here’s a summary of what I’m doing right now.

  • Every night when I’ve finished working, I run a script that creates a zip file of all the content in the directories where I do most of my work, then encrypts that archive to keep its contents private.
  • The script keeps three days’ worth of archives on my computer’s temporary drive. Each day it renames the one from the previous day. So if today’s archive is (for example) backup.zip, then yesterday’s will be backup-1.zip and the previous day’s, backup-2.zip. When I run the script tomorrow, backup-1.zip will be renamed to (and thus overwrite) backup-2.zip. Today’s backup.zip will become backup-1.zip. Thus I always have three days’ worth of backups available immediately.
  • Early each morning, an automatic process on one of our backup servers pulls the encrypted archives from my computer to a big central storage area. We have two different backup servers, which alternate.
  • Every month or two, my husband updates a big hard drive with our latest files, and takes that to his office so that it will provide off-site backup.
  • Every six months or so, my husband replaces the hard drive in our safe deposit box with an updated version of the drive from the office. The previous drive from the safe deposit box rotates back to our home.
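The nightly archive-and-rotate steps in the list above could be sketched roughly as follows in Python. This is not the actual script, just an illustration using the example file names from the list; the function names are hypothetical. The encryption step is left out, because Python's standard zipfile module cannot create encrypted archives and a separate tool would be needed. The sketch also assumes the backup folder is not inside the working folder.

```python
import zipfile
from pathlib import Path


def rotate_archives(backup_dir: Path, keep: int = 3) -> None:
    """Shift backup.zip -> backup-1.zip -> backup-2.zip, dropping the oldest."""
    # Work from oldest to newest so no archive is overwritten before it has moved.
    for n in range(keep - 1, 0, -1):
        newer_name = "backup.zip" if n == 1 else "backup-{}.zip".format(n - 1)
        newer = backup_dir / newer_name
        older = backup_dir / "backup-{}.zip".format(n)
        if newer.exists():
            newer.replace(older)  # replace() overwrites an existing target


def make_archive(work_dir: Path, backup_dir: Path) -> Path:
    """Rotate the old archives, then zip everything under work_dir into backup.zip."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    rotate_archives(backup_dir)
    target = backup_dir / "backup.zip"
    with zipfile.ZipFile(str(target), "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(work_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to work_dir so the archive unpacks cleanly.
                archive.write(str(path), path.relative_to(work_dir).as_posix())
    return target
```

Running make_archive once a day leaves at most three archives in the backup folder: today's, yesterday's, and the day before's. The oldest one is overwritten on each run, which is exactly the behavior described in the second bullet.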

In addition to this regular process, I sometimes create CD-based snapshots of important directories to save. We have several binders of CD backups. Also, when I’m actively working on a new book, I’ll manually send each day’s work to a different computer, just to have an extra copy.

A Minimal Strategy

You’re probably shaking your head at this point, thinking that there’s no way you could handle a process this complex. Very likely you can satisfy your needs with something simpler.

Here’s a very easy procedure that will provide the primary benefits:

  • Every week, on the same day, copy all your working directories to a CD or DVD.
  • Check that you can read the disc. Then store it somewhere outside your home.

If you do this regularly, you’re not likely to lose more than a week’s worth of work.

Don’t delude yourself. Sooner or later, your disk will crash – or something worse – and you will need your backups. Now is the time to consider the issues I’ve raised in this article and adopt a process that makes sense for you.

Don’t let Murphy get the better of you!

Lisabet Sarai
July 2012


“Naughty Bits: The Erotogeek’s Guide for the Technologically Challenged Author” © 2012 Lisabet Sarai. All rights reserved. Content may not be copied or used in whole or part without written permission from the author.
