Hack 83. Make Network Backups 
The cost of computing is so low that it is
not uncommon to have more than one
computer in a house. Increasingly, people are buying many computers,
networking them together, and using them for different purposes. For
example, in my home are two Linux boxes, one Linux server, one Linux
firewall, two Windows machines, and a Mac. With that many
computers, each holding important data, backing up that data
becomes a real and pressing concern.
The natural assumption when faced with a need to perform backups is
to use a medium such as tape or a CD/DVD. But in this hack you are
going to perform a series of network backups that simply copy files
from one machine on the network to another.
9.15.1. Simple Single-Shot Backup
If you want to do a simple full backup
of a directory, you can do it with a single command by using secure
copy (scp). This little tool lets you copy a
number of files from one computer to another in a secure encrypted
form. One of the major benefits of scp is that
you can copy files across the Internet; you are not limited just to
computers on your local network. To use scp, you
need to have the Secure Shell (SSH) daemon running on the machine you
are copying to and have the scp program (which
is part of the SSH package) on the computer you are copying from.
To get started, you can copy a directory full of work from your
machine martin to a machine called
simon (if your hostnames are not resolvable, use
their IP addresses). You can do this with the following command:
foo@martin:~$ scp -r importantwork simon:/home/alan
This command uses scp to recursively
(-r) copy the files within the
importantwork directory to the host
simon and into the /home/alan
directory. Once you have entered the command, you see a status bar
for each file as it is transferred. This gives you a visual
indication of the copy's progress.
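Because scp treats any path without a host: prefix as local, you can try the same command shape on a single machine before pointing it at a real remote host. This is just a sketch; the /tmp paths below are invented for the demonstration, and in actual use the destination would be something like simon:/home/alan.

```shell
#!/bin/sh
# Sketch: the same "scp -r source destination" shape, run with two
# local paths so it can be tried without a second machine.
# All /tmp paths here are made up for the demo.
set -e
rm -rf /tmp/demo-scp-src /tmp/demo-scp-dst
mkdir -p /tmp/demo-scp-src /tmp/demo-scp-dst
echo report > /tmp/demo-scp-src/report.txt

# In the hack proper the destination would be simon:/home/alan;
# here both sides are local, which scp also accepts.
scp -r /tmp/demo-scp-src /tmp/demo-scp-dst
```

As with cp -r, copying a directory into an existing directory places it inside, so the file ends up at /tmp/demo-scp-dst/demo-scp-src/report.txt.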
9.15.2. Elaborate Backups Using rsync
Using scp for backing up
a
configuration has a few inherent problems. The first issue is that
each time you need to make a backup, the contents of your entire
directory, importantwork, are copied over again,
which can consume a lot of bandwidth and time. The other problem is
that scp blindly copies whatever you point it at; it cannot compare
the source files with the copies already on the backup machine, so it
has no way to transfer only what has changed.
A better solution for managing backups of large groups of files and
directories is rsync. This tool is easy to set
up (you only need to install the rsync program
on the remote and local computers). On the first run it copies the
files and directories over in full, and on later syncs it
intelligently transfers only those that have changed. To
use rsync you basically need to specify the
machine and directory you are copying from and where you are copying
the files to on the local computer. For example, you can copy some
files from a remote machine called martin to your
current machine:
foo@bar:~$ rsync -avz martin:/home/martin/importantwork /home/foo
In this example, you use three command-line switches that adjust how
rsync works. The -a switch puts rsync in archive
mode, which recurses into the directory and preserves permissions,
timestamps, and symbolic links; the -v switch puts
rsync in verbose mode so that it reports what it
is doing at all times; and the -z switch
compresses the data to lower the bandwidth required to make the
transfer. This compression is less important when copying files
between computers on a local network than it is when copying files
over the Internet, but using compression is not a bad habit to get
into. The rsync program is very flexible, and a
few other options are worth exploring when making backups such as
this. First, you should be aware that
rsync's default behavior is to
add files only when making a backup. This means that if
you've backed up a file and you delete the local
copy, the backed-up copy remains on the remote machine even during
later syncs. In some cases this might be unsuitable, such as when you
want to mirror a directory full of files and you want the backed-up
files removed when they are removed from the main directory. To do
this you can add --delete to the line:
foo@bar:~$ rsync -avz --delete martin:/home/martin/importantwork /home/foo
A particularly useful feature within rsync is
the ability to exclude specific files from the backup. You can do
this with the --exclude switch. For example, if
you want to keep your
importantwork/passwords/importantpasswords.txt
file out of the backup, you can use this command:
foo@bar:~$ rsync -avz --delete --exclude=passwords/importantpasswords.txt \
martin:/home/martin/importantwork /home/foo
If you need to exclude a number of files, include a number of
--exclude flags for different files or
directories, one after the other.
One final point to note about rsync is that, as
with many other network tools, its traffic is not encrypted by
default, so anyone able to sniff your traffic could potentially
discover sensitive information. If you are concerned about your
security, it is advisable to use rsync's
-e switch to run the transfer over SSH, which
encrypts all traffic. Simply add e to the collection of
switches and specify ssh as the remote shell to use:
foo@bar:~$ rsync -avze ssh --delete --exclude=passwords/importantpasswords.txt \
martin:/home/martin/importantwork /home/foo
Although the most common use of rsync is between
a local and a remote machine, rsync does not
really care where the source and destination are: one is just a
source and the other is just a destination. Both paths can be on the
local machine, or one can be local and one remote. The one
combination a single rsync command cannot handle
is copying directly between two remote machines; at least one end of
the transfer must be local.