
Hack 83. Make Network Backups

The cost of computing is so low that it is not uncommon to have more than one computer in a house. Increasingly, people are buying many computers, networking them together, and using them for different purposes. For example, in my home are two Linux boxes, one Linux server, one Linux firewall, two Windows machines, and a Mac. With a large number of computers, each with important data on it, backups of that data become a very real and important issue to consider.

The natural assumption when faced with a need to perform backups is to use a medium such as tape or a CD/DVD. But in this hack you are going to perform a series of network backups that simply copy files from one machine on the network to another.

9.15.1. Simple Single-Shot Backup

If you want to do a simple full backup of a directory, you can do it with a single command by using secure copy (scp). This little tool lets you copy a number of files from one computer to another in a secure encrypted form. One of the major benefits of scp is that you can copy files across the Internet; you are not limited just to computers on your local network. To use scp, you need to have the Secure SHell (SSH) daemon running on the machine you are copying to and have the scp program (which is part of the SSH package) on the computer you are copying from.

To get started, you can copy a directory full of work from your machine martin to a machine called simon (if your hostnames are not resolvable, use their IP addresses). You can do this with the following command:

foo@martin:~$ scp -r importantwork simon:/home/alan

This command uses scp to recursively (-r) copy the files within the importantwork directory to the host simon and into the /home/alan directory. Once you have entered the command, you see a status bar for each file as it is transferred. This gives you a visual indication of the copy's progress.
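If you make this kind of backup regularly, you might wrap the command in a small script that stamps each copy with the date, so older backups are not overwritten. This is only a sketch, not part of the scp toolset: the paths are illustrative, and the destination defaults to a local directory so you can try it without a remote machine (set DEST to something like simon:/home/alan to go over the network).

```shell
#!/bin/sh
# Sketch: a date-stamped scp backup. All paths here are illustrative;
# point DEST at a remote target (e.g. simon:/home/alan) for a real
# network backup.
set -e

SRC=${SRC:-/tmp/importantwork}
DEST=${DEST:-/tmp/backups}

# Create some example data so the sketch runs standalone.
mkdir -p "$SRC"
echo "example data" > "$SRC/notes.txt"

# A local destination directory must exist; a remote one is handled
# by the remote side.
mkdir -p "$DEST"

STAMP=$(date +%Y-%m-%d)

# Copy the directory recursively, naming the backup after today's date.
scp -r "$SRC" "$DEST/importantwork-$STAMP"
```

Run daily, this leaves a series of directories such as importantwork-2024-01-15, one per day, at the destination.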

9.15.2. Elaborate Backups Using rsync

Using scp for backups has a few inherent problems. The first is that every backup copies the entire importantwork directory over again, which can consume a lot of bandwidth and time. The other problem is that scp is rather blunt: it copies every file it is given and cannot tell which files have changed since the last backup.

A better solution for managing backups of large groups of files and directories is rsync. This tool is easy to set up (you only need the rsync program installed on both the remote and local computers), and it copies intelligently: the first run transfers everything, and later syncs transfer only the files and directories that have changed. To use rsync, you specify the machine and directory you are copying from and the location on the local computer you are copying to. For example, you can copy some files from a remote machine called martin to your current machine:

foo@bar:~$ rsync -avz martin:/home/martin/importantwork /home/foo

In this example, you use three command-line switches that adjust how rsync works. The -a (archive) switch recurses into the directory and preserves permissions, ownership, and timestamps; the -v switch puts rsync in verbose mode so it reports what it is doing; and the -z switch compresses the data to lower the bandwidth required for the transfer. Compression is less important when copying files between computers on a local network than it is when copying files over the Internet, but using it is not a bad habit to get into.

The rsync program is very flexible, and a few other options are worth exploring when making backups such as this. First, be aware that rsync's default behavior is only to add and update files. This means that if you've backed up a file and you delete the local copy, the backed-up copy remains on the remote machine through later syncs. In some cases this is unsuitable, such as when you want the backup to be an exact mirror of the main directory, with deleted files removed from the backup as well. To do this you can add --delete to the line:

foo@bar:~$ rsync -avz --delete martin:/home/martin/importantwork /home/foo

A particularly useful feature within rsync is the ability to exclude specific files from the backup. You can do this with the --exclude switch. For example, if you want to keep your importantwork/passwords/importantpasswords.txt file out of the backup, you can use this command:

foo@bar:~$ rsync -avz --delete --exclude=passwords/importantpasswords.txt \
martin:/home/martin/importantwork /home/foo

If you need to exclude a number of files, include a number of --exclude flags for different files or directories, one after the other.

One final point to note about rsync is that, as with many other network tools, its traffic is not necessarily encrypted, potentially leaving it open to malicious people sniffing your traffic and discovering sensitive information. If you are concerned about security, use rsync's -e switch to run the transfer over SSH, which encrypts everything on the wire. Simply add e to the collection of switches and specify ssh as the remote shell to use:

foo@bar:~$ rsync -avze ssh --delete --exclude=passwords/importantpasswords.txt \
martin:/home/martin/importantwork /home/foo

Although the most common use of rsync is between a local and a remote machine, rsync simply treats one end as a source and the other as a destination. Both ends can be directories on the local machine (making rsync a smart local copy), or one can be local and one remote, in either direction. The one restriction is that rsync cannot copy directly between two remote machines; at least one end of the transfer must be local.
