Essential System Administration, Third Edition

Most systems offer a variety of utilities for performing backups, ranging from general-purpose archiving programs like tar and cpio to programs designed for implementing multilevel incremental backup schemes on a per-filesystem basis. When the largest tapes held only a couple hundred megabytes, choosing the right utility for system backups was easy. tar and cpio were used for small and ad hoc backups and other data transfer needs, and the more sophisticated utilities specifically designed for the task were used for system backups, because their specialized abilities (the ability to span tapes and to automatically perform incremental backups) were essential to getting the job done.

This distinction breaks down to a great extent when a single tape can hold gigabytes of data. For example, incrementals are less important when you can fit all the important data on a system onto one or two tapes and you have the time to do so. Large tapes also make it practical to back up a system in logically grouped chunks of files, which may be spread arbitrarily throughout the physical filesystem. A successful system backup process can be built around whatever utilities make sense for your system.

One dubious piece of advice about backups that is frequently given is that you should limit filesystem size to the maximum backup media capacity available on the system. In this view, multi-tape backup sets are simply too much trouble, and the backup process is simplified if all of the data from a filesystem will fit onto a single tape.

While being able to back up a filesystem with a single tape is certainly convenient, I think it is a mistake to let current media capacity dictate filesystem planning to such a degree. Breaking disks into more, smaller filesystems limits flexibility in allocating their resources, a concern that is almost always far more important than reducing the complexity of backing them up. Designing the filesystem needs to take all of the factors affecting the system and its efficient use into account. If tape-sized backup sets are what is desired, it's easy enough to write scripts to do so when overall circumstances dictate that some individual filesystems need to be bigger.

11.3.1 When tar or cpio Is Enough

In some cases, especially single-user systems, an elaborate backup process is not needed. Rather, since the administrator and the user are one and the same person, it will be obvious which files are important, how often they change, and so on. In cases like this, the simpler tape commands, tar and cpio, may be sufficient to periodically save important files to tape (or other media).

While the canonical model for this situation is Unix running on a workstation, these utilities may also be sufficient for systems with relatively small amounts of critical data. tar and cpio also have the advantage that they will back up both local and remote filesystems mounted via NFS.

11.3.1.1 The tar command

We'll begin with a simple example. The following tar command saves all files under /home to the default tape drive:

$ tar -c /home

-c says to create a backup archive.

tar's -C option (big C) is useful for gathering files from various parts of the filesystem into a single archive. This option causes the current directory to be set to the location specified as its argument before tar processes any subsequent pathname arguments. Multiple -C options may be used on the same command. For example, the following tar commands save all the files under the directories /home, /home2, and /chem/public:

$ tar -cf /dev/rmt1 /home /home2 /chem/public
$ tar -cf /dev/rmt1 -C /home . -C /home2 . -C /chem public

The two commands differ in this respect: the first command saves all of the files using absolute pathnames: /home/chavez/.login, for example. The second command saves files using relative pathnames: ./chavez/.login. The file from the first archive would always be restored to the same filesystem location, while the file from the second archive would be restored relative to the current directory (in other words, relative to the directory from which the restore command was given).

It is a good idea to use absolute pathnames in the arguments to -C. Relative pathnames specified to -C are interpreted with respect to the current directory at the time that option is processed rather than with respect to the initial current directory from which the tar command was issued. In other words, successive -C options accumulate, and tar commands using several of them as well as relative pathnames can become virtually uninterpretable.
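The effect of -C on stored pathnames is easy to check without a tape drive. The following is a minimal sketch, assuming a tar that accepts -C and a system that provides mktemp -d; the scratch directories and file names are hypothetical stand-ins for /home, /home2, and /chem/public, and the archive is written to an ordinary file:

```shell
#!/bin/sh
# Scratch tree standing in for /home, /home2 and /chem/public (hypothetical names).
work=$(mktemp -d)
mkdir -p "$work/home/chavez" "$work/home2/harvey" "$work/chem/public"
echo 'setenv TERM vt100' > "$work/home/chavez/.login"
echo 'notes'             > "$work/home2/harvey/notes.txt"
echo 'data'              > "$work/chem/public/c70.dat"

# Each -C is given an absolute pathname, so earlier -C options do not change
# its meaning; the names that follow it are stored relative to that directory.
tar -cf "$work/demo.tar" \
    -C "$work/home" . \
    -C "$work/home2" . \
    -C "$work/chem" public

# List the member names recorded in the archive.
listing=$(tar -tf "$work/demo.tar")
printf '%s\n' "$listing"
```

Every name in the listing is relative (./chavez/.login, public/c70.dat, and so on), so the files would be restored beneath whatever directory the restore command is run from.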

Traditionally, all tar options were placed in a single group immediately following the command verb, and a preceding hyphen was not needed. The POSIX standard specifies a more traditional Unix syntax, preferring the second form to the first one for this command:

$ tar xpfb /dev/rmt1 1024 ...
$ tar -x -p -f /dev/rmt1 -b 1024 ...

The versions of tar on current operating systems usually accept both formats, but an initial hyphen may become a requirement at some point in the future.

tar archives are often compressed, so it is very common to see compressed tar archives with names like file.tar.Z, file.tar.gz or file.tgz (the latter two files are compressed with the GNU gzip utility).

11.3.1.1.1 Solaris enhancements to the tar command

The Solaris version of tar offers enhancements that make the command more suitable for system-level backups. They allow all or part of the list of files and directories to be backed up to be placed in one or more text files (with one item per line). These files are included in the file list given to tar, preceded by -I, as in this example:

$ tar cvfX /dev/rst0 Dont_Save /home -I Other_User_Files -I Misc

This command backs up the files and directories in the two include files, as well as those in /home. The command also illustrates the use of the -X option, which specifies the name of an exclusion file listing the names of files and directories that should be skipped if encountered by tar. Note that wildcards are not permitted in either include or exclusion files. In case of conflicts, exclusion takes precedence over inclusion.

NOTE

The -I and -X options may also be used in restore operations performed with the tar command.

On Solaris and a variety of other System V systems, the file /etc/default/tar may be used to customize the mappings of the default archive destinations specified with tar's single-digit code characters (for example, the command tar 1c creates an archive on drive 1). Here is a version from a Solaris system:

#                       Block
#
#Archive=Device         Size    Blocks
#
archive0=/dev/rmt/0     20      0
archive1=/dev/rmt/0n    20      0
archive2=/dev/rmt/1     20      0
archive3=/dev/rmt/1n    20      0
archive4=/dev/rmt/0     126     0
archive5=/dev/rmt/0n    126     0
archive6=/dev/rmt/1     126     0
archive7=/dev/rmt/1n    126     0

The first entry specifies the device that will be used when tar 0 is specified. In this case, it is the first tape drive in its default modes. The second entry defines archive 1 as the first tape drive in non-rewinding mode. The remaining two fields are optional; they specify the block size for the device and its total capacity (which may be set to zero to have the command simply detect the end-of-media marker).

11.3.1.1.2 The GNU tar utility: Linux and FreeBSD

Linux distributions and FreeBSD provide the GNU version of the tar command. It supports tar's customary features and contains some enhancements to them, including the ability to optionally span media volumes (-M) and to use gzip compression (-z). For example, the following command will extract the contents of the specified compressed tar archive:

$ tar xfz funsoftware.tgz
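A complete round trip with -z might look like the following sketch. It assumes GNU tar (or another tar that accepts -z) and mktemp -d; the scratch directories and the README file are hypothetical, and only the archive name echoes the example above:

```shell
#!/bin/sh
# Assumes GNU tar or a compatible tar that supports -z; all paths are scratch names.
src=$(mktemp -d)
dest=$(mktemp -d)
mkdir -p "$src/funsoftware"
echo 'hello' > "$src/funsoftware/README"

# c = create, z = filter the archive through gzip, f = archive file name
tar -czf "$src/funsoftware.tgz" -C "$src" funsoftware

# x = extract; -C makes tar change to the scratch directory before extracting
tar -xzf "$src/funsoftware.tgz" -C "$dest"

restored=$(cat "$dest/funsoftware/README")
echo "$restored"
```

Extracting into a scratch directory first is a cheap way to confirm that a compressed archive is readable before relying on it.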

11.3.1.2 The cpio command

cpio can also be used to make backups. It has several advantages:

  • It is designed to easily back up completely arbitrary sets of files; tar is easiest to use with directory subtrees.

  • It packs data on tape much more efficiently than tar. If fitting all your data on one tape is an issue, cpio may be preferable.

  • On restores, it skips over bad spots on the tape, while tar just dies.

  • It can span tapes, while many versions of tar are limited to a single volume.

Using its -o option, cpio copies the files whose pathnames are passed to it via standard input (often by ls or find) to standard output; you redirect standard output to use cpio to write to floppy disk or tape. The following examples illustrate some typical backup uses of cpio:

$ find /home -print | cpio -o >/dev/rmt0
$ find /home -cpio /dev/rmt0

The first command copies all files in /home and its subdirectories to the tape in drive 0. The second command performs the identical backup via a version of find that offers a -cpio option.
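The same pipeline can be tried against an ordinary file instead of a tape. This is a minimal sketch: the scratch tree is hypothetical, and because cpio is not installed on every system, the script checks for it before using it:

```shell
#!/bin/sh
# Scratch tree standing in for /home (hypothetical names).
work=$(mktemp -d)
mkdir -p "$work/home/chavez"
echo 'setenv TERM vt100' > "$work/home/chavez/.login"

cpio_listing=''
if command -v cpio >/dev/null 2>&1; then
    # find supplies the pathnames on standard input; -o writes the archive
    # to standard output, which is redirected to a file rather than a tape.
    (cd "$work" && find home -print | cpio -o > "$work/home.cpio" 2>/dev/null)

    # -i reads an archive from standard input; -t lists it without extracting.
    cpio_listing=$(cpio -it < "$work/home.cpio" 2>/dev/null)
    printf '%s\n' "$cpio_listing"
fi
```

Because find was run with a relative starting point, the names stored in the archive are relative as well.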

11.3.1.3 Incremental backups with tar and cpio

Combining find with tar or cpio is one easy way to perform incremental backups, especially when only two or three distinct backup levels are needed. For example, the following commands both copy all files under /home which have been modified today into an archive on /dev/rmt1, excluding any object (.o) files:

$ find /home -mtime -1 ! -name \*.o -print | cpio -o >/dev/rmt1
$ tar c1 `find /home -mtime -1 ! -name '*.o' ! -type d -print`

The find command used with tar needs to exclude directories, because tar will automatically archive every file underneath any directory named in the file list, and all directories in which any file has changed will appear in the output from find.

You can also use find's -newer option to perform an incremental backup in this way:

$ touch /backup/home_full
$ find /home -print | cpio -o > /dev/rmt0
A day later . . .
$ touch /backup/home_incr_1
$ find /home -newer /backup/home_full -print | cpio -o > /dev/rmt0

The first command timestamps the file /backup/home_full using the touch command (/backup is a directory created for such backup time records), and the second command performs a full backup of /home. Some time later, the second two commands could be used to archive all files whose data has changed since the first backup and to record when it began. Timestamping the record files before this backup begins ensures that any files that are modified while it is being written will be backed up during a subsequent incremental, regardless of whether such files have been included in the current backup or not.
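The -newer technique works equally well with tar. The sketch below runs the whole cycle in a scratch directory, writing archives to files rather than tape. The directory and file names are hypothetical, the sleep merely simulates the passage of time, and -T - (read the file list from standard input) is an option of GNU tar and bsdtar rather than of every tar:

```shell
#!/bin/sh
# Scratch directories stand in for /home and /backup; archives go to files.
work=$(mktemp -d)
mkdir -p "$work/home/chavez" "$work/backup"
echo 'old' > "$work/home/chavez/report.txt"
echo 'old' > "$work/home/chavez/plan.txt"

# Full backup: record the start time first, then archive everything.
touch "$work/backup/home_full"
tar -cf "$work/backup/full.tar" -C "$work" home

# Simulate the passage of time, then change one file.
sleep 1
echo 'new' > "$work/home/chavez/plan.txt"

# Incremental backup: timestamp first, then archive only the non-directories
# that are newer than the full backup's timestamp file.
# -T - is a GNU tar/bsdtar option (an assumption about the local tar).
touch "$work/backup/home_incr_1"
(cd "$work" &&
    find home -newer backup/home_full ! -type d -print |
    tar -cf backup/incr.tar -T -)

incr_listing=$(tar -tf "$work/backup/incr.tar")
printf '%s\n' "$incr_listing"
```

Only the modified file appears in the incremental archive. Passing the list through a pipe also avoids the command-line length limits that backquote substitution can run into on large filesystems.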

11.3.1.4 pax: Detente between tar and cpio

The pax command attempts to bridge the gap between tar and cpio by providing a single general-purpose archiving utility.[16] It can read and write archives in either format (by default, it writes tar archives), and offers enhancements over both of them, making it an excellent utility for system backups in many environments. pax is available for all of the Unix versions we are considering. Like cpio, pax archives may span multiple media volumes.

[16] Indeed, on systems offering pax, cpio and tar are often just links to it. pax's syntax is an amalgamation of the two, which is not surprising for a peace imposed by POSIX (although the name purportedly stands for portable archive exchange).

pax's general syntax is:

pax [mode_option] other_options files_to_backup

The mode_option indicates whether files are being written to or extracted from an archive, where -w says to write to an archive, -r says to read and extract from an archive, and -rw indicates a pass-through mode in which files are copied to an alternate directory on disk (as with cpio -p); pax's default mode when no mode_option is given is to list the contents of an archive.

The following commands illustrate pax file archiving modes of operation:

$ pax -w -f /dev/rmt0 /home /chem
$ find /home /chem -mtime -1 -print | pax -w -f /dev/rmt0
$ pax -w -X -f /dev/rmt0 /

The first two commands perform a full and an incremental backup, respectively, of the files in /home and /chem, writing to tape drive 0 in each case. The third command saves all of the files in the disk partition corresponding to the root directory; the -X option tells pax not to cross filesystem boundaries.
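The write and list modes can be tried against an ordinary file. This is a minimal sketch with a hypothetical scratch tree; since pax is not installed by default on every system, the script checks for it first:

```shell
#!/bin/sh
# Scratch tree standing in for /home (hypothetical names).
work=$(mktemp -d)
mkdir -p "$work/home/chavez"
echo 'setenv TERM vt100' > "$work/home/chavez/.login"

pax_listing=''
if command -v pax >/dev/null 2>&1; then
    # -w writes an archive; -f names it (a file here rather than a tape).
    (cd "$work" && pax -w -f "$work/home.pax" home)

    # With neither -r nor -w, pax lists the contents of the archive.
    pax_listing=$(pax -f "$work/home.pax")
    printf '%s\n' "$pax_listing"
fi
```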

AIX prefers pax over vanilla tar and cpio. The command has been enhanced to support large files (over 2 GB).

Getting Users to Do Backups

At some sites, certain backup responsibilities are left to individual users: when a site has far too many workstations to make backing up all of their local disks practical, when important data resides on non-Unix systems like PCs (especially if they are not connected to the local area network), and so on.

However, even when you're not actually performing the backups yourself, you will probably still be responsible for providing technical support and, more often than not, reminders to the users who will be performing the backups. Here are some approaches I've tried to facilitate this:

  • Make a habit of encouraging users rather than threatening them (threats don't work anyway).

  • Use peer pressure to your advantage. Setting up a central backup storage location that you look after can make it obvious who is and isn't doing the backups they are supposed to. Note that this idea is inappropriate if data sensitivity is an issue.

  • Create tools that automate the backup process as much as possible for users. Everyone has time to drop in a tape and start a script before they leave for the day.

  • Provide a central repository for key files that get backed up as part of the system/site procedure. Users can copy key files and know they will be backed up when they're really in a jam and really don't have time to do a backup themselves.

11.3.2 Backing Up Individual Filesystems with dump

The BSD dump utility represents the next level of sophistication for backup systems under Unix. It selectively backs up all of the files within a filesystem (single disk partition), doing so by copying the data corresponding to each inode to the archive on the backup device. It also has the advantage of being able to back up any type of file, including device special files and sparse files. Although there are slight variations among different versions of this command, the discussion here applies to the following Unix implementations of this command:

AIX       backup
FreeBSD   dump
HP-UX     dump and vxdump
Linux     dump (but the package is not usually installed by default)
Solaris   ufsdump
Tru64     dump and vdump

On systems supporting multiple filesystem types, dump may be limited to UFS (BSD-type) filesystems; on Linux systems, it is currently limited to ext2/ext3 filesystems, although the XFS filesystem provides the similar xfsdump utility. Under HP-UX, vxdump and vxrestore support VxFS filesystems. Tru64 provides vdump for AdvFS filesystems.

dump keeps track of when it last backed up each filesystem and the level at which it was saved. This information is stored in the file /etc/dumpdates (except on HP-UX systems, which use /var/adm/dumpdates). A typical entry in this file looks like this:

/dev/disk2e 2 Sun Feb 5 13:14:56 1995

This entry indicates that the filesystem /dev/disk2e was last backed up on Sunday, February 5 during a level 2 backup. If dump does not find a filesystem in this list, it assumes that it has never been backed up.

If dumpdates doesn't exist, the following command will create it:

# touch /path/dumpdates

The dumpdates file must be owned by the user root. If it does not exist, dump will not create it and won't record when filesystem backups occur, so create the file before running dump for the first time.

The dump command takes two general forms:

$ dump options-with-arguments filesystem
$ dump option-letters corresponding-arguments filesystem

where filesystem is the block special file corresponding to the filesystem to be backed up or the corresponding mount point from the filesystem configuration file. In the first, newer form, the first item is the list of options to be used for this backup, with their arguments immediately following the option letters in the normal way (e.g., -f /dev/tape).

In the second, older form, option-letters is a list of argument letters corresponding to the desired options, and corresponding-arguments are the values associated with each argument, in the same order. This syntax is still the only one available under Solaris and HP-UX.

Although not all options require arguments, the list of arguments must correspond exactly, in order and in number, to the options requiring arguments. For example, consider the set of options 0sd. The s and d options require arguments; 0 does not. Thus, a dump command specifying these options must have the form:

$ dump 0sd s-argument d-argument filesystem

Failing to observe this rule can have painful consequences when you are running the command as root, including destroying the filesystem if you swap the argument to the f option and dump's final argument. You'll get no argument from me if you want to assert that this is a design defect that ought to have been fixed long before now. When you use dump, just make sure an argument is supplied for each option requiring one. To avoid operator errors, you may want to create shell scripts that automatically invoke dump with the proper options.
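One possible shape for such a script is sketched below. It is an illustration only: the function name build_dump_cmd and its checks are hypothetical and not part of dump, it uses the newer hyphenated syntax, and it prints the command line for review instead of running it:

```shell
#!/bin/sh
# build_dump_cmd LEVEL DEVICE FILESYSTEM
# Prints the dump command line instead of running it, so it can be reviewed
# first. The function name and its checks are illustrative, not part of dump.
build_dump_cmd() {
    level=$1 device=$2 fs=$3

    case $level in
        [0-9]) ;;
        *) echo "level must be a single digit 0-9" >&2; return 1 ;;
    esac

    if [ -z "$device" ] || [ -z "$fs" ]; then
        echo "usage: build_dump_cmd level device filesystem" >&2
        return 1
    fi

    # A directory or block device as the output target suggests that the
    # -f argument and the filesystem argument were swapped.
    if [ -d "$device" ] || [ -b "$device" ]; then
        echo "refusing to write a dump to $device" >&2
        return 1
    fi

    printf 'dump -%s -u -f %s %s\n' "$level" "$device" "$fs"
}

build_dump_cmd 1 /dev/tape /chem
```

Keeping the device and the filesystem as separate, named parameters means the operator never has to line up option letters and arguments by hand.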

dump's most important options are the following (we will use the newer form):

-0, . . . , -9

These options indicate the level of the dump this command will perform. Given any level n, dump will search dumpdates for an entry reporting the last time this filesystem was dumped at level n-1 or lower. dump then backs up all files that have been changed since this date. If n is zero, dump will back up the entire filesystem. If there is no record of a backup for this filesystem for level n-1 or lower, dump will also back up the entire filesystem. If no level option is specified, it defaults to -9. This option does not require any argument.

Older versions of dump not supporting hyphenated options require that the level option be the first option letter.

-u

If dump finishes successfully, this option updates its history file, dumpdates. It does not require an argument.

-f device

This option states that you want to send the dump to something other than the default tape drive (i.e., to a file or to another device). The defaults used by various Unix versions were listed previously. If you use this option, it must have an argument, and this argument must precede the filesystem being dumped. A value of "-" (a single hyphen) for its argument indicates standard output.

-W

Display only what will be backed up when the indicated command is invoked, but don't perform the actual backup operation.

-s feet -d dens

These options were needed on older versions of dump to determine the capacity of the backup media. Recent versions of dump generally don't need them as they keep writing until they detect an end-of-media mark.

If you do need to use them to lie to dump about the tape length because your version uses a default capacity limit suitable for ancient 9-track tapes, -s specifies the size of the backup tape, in feet; -d specifies the density of the backup tape, in bits per inch. Since dump will respect end-of-media marks that it encounters before it has reached this limit, the fix for such situations is to set the capacity to something far above the actual limit. For example, the options -d 50000 -s 90000 define a tape capacity somewhat over 4 GB.

-b factor

Specifies the block size to use on the tape, in units of 1024-byte (or sometimes 512-byte) blocks.
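The rule given above for the level options (look for the last dump at level n-1 or lower) can be tried against a sample history file. In this sketch only the level 2 entry comes from the text; the other entries, the temporary file, and the candidates function are invented for illustration, and the filter only approximates the first step of what dump does:

```shell
#!/bin/sh
# Sample history file; only the level 2 entry comes from the text, the other
# entries are invented for illustration.
dumpdates=$(mktemp)
cat > "$dumpdates" <<'EOF'
/dev/disk2e 0 Sun Jan 29 02:00:00 1995
/dev/disk2e 1 Wed Feb  1 02:00:00 1995
/dev/disk2e 2 Sun Feb  5 13:14:56 1995
/dev/disk1a 0 Sun Jan 29 03:00:00 1995
EOF

# candidates FILESYSTEM LEVEL: print the entries for FILESYSTEM recorded at a
# level lower than LEVEL. dump works from the most recent of these; if there
# are none, it backs up the entire filesystem.
candidates() {
    awk -v dev="$1" -v n="$2" '$1 == dev && $2 + 0 < n + 0' "$dumpdates"
}

level2_refs=$(candidates /dev/disk2e 2)
printf '%s\n' "$level2_refs"
```

For a level 2 dump of /dev/disk2e, the level 0 and level 1 entries qualify, so only files changed since the more recent of the two would be saved. A level 0 request has no candidates and therefore saves everything.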

Here is a typical use of the dump command:

$ dump -1 -u -f /dev/tape /chem

This command performs a level 1 incremental backup on the /chem filesystem using the tape drive linked to /dev/tape; dump will update the dumpdates file upon completion.

dump notifies the user whenever it requires some interaction. Most often, dump will have filled the tape currently in use and ask for another. It will also ask whether to take corrective actions if problems arise. In addition, dump prints many messages describing what it is doing, how many tapes it thinks it will need, and the like.

11.3.2.1 The HP-UX fbackup utility

HP-UX provides the fbackup and frecover utilities designed to perform system backups. One significant advantage that they have over the standard Unix utilities is that they can save and restore HP-UX access control lists along with other file metadata.

fbackup provides for nine levels of incremental backups, just like dump. fbackup stores backup records in the file /var/adm/fbackupfiles/dates, which the system administrator must create before using fbackup.

The following examples illustrate how fbackup might be used for system backup operations:

# fbackup -0u -f /dev/rmt/1m -i /chem
# fbackup -1u -i /chem -i /bio -e /bio/med
# fbackup -1u -f /dev/rmt/0m -f /dev/rmt/1m -i /chem
# fbackup -0u -g /backup/chemists.graph -I /backup/chemists.TOC

The first command performs a full backup of /chem to tape drive 1, updating the fbackup database. The second command does a level 1 backup of /chem and /bio, excluding the directory /bio/med (as many -i and -e options as you need can be included). The third command performs a level 1 backup of /chem using multiple tape drives in sequence.

The final command performs a full backup as specified by the graph file /backup/chemists.graph, writing an index of the backup to the file /backup/chemists.TOC. A graph file is a text file with the following format:

c path

where c is a code indicating whether path is to be included (i) or excluded (e) from the backup.
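As an illustration, a graph file expressing the same selections as the -i and -e options of the second fbackup command above might contain the following lines. The contents are an assumption made for this example; the text does not show /backup/chemists.graph itself:

```text
i /chem
i /bio
e /bio/med
```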

11.3.3 Related Tape Utilities

There are two other Unix tape utilities you should know about, which are also of use in performing backups from time to time.

11.3.3.1 Data copying and conversion with dd

The dd utility transfers raw data between devices. It is useful for converting data between systems and for reading and writing tapes from and to non-Unix systems. It takes a number of option=value pairs as its arguments. Some of the most useful options are:

if

Input file: source for data.

of

Output file: destination for data.

ibs

Input block size, in bytes (the default is 512).

obs

Output block size, in bytes (the default is 512).

fskip

Skip tape files before transferring data (not available in all implementations).

count

The amount of data (number of blocks) to transfer.

conv

Keyword(s) specifying desired conversion of input data before outputting: swab means swap bytes, and it is the most used conversion type. lcase and ucase mean convert to lower- and uppercase, respectively, and ascii and ebcdic mean convert to ASCII or EBCDIC.

For example, the following command processes the third file on the tape in drive 0, using an input block size of 1024 bytes and swapping bytes in all data; the command writes the converted output to the file /chem/data/c70o.dat:

$ dd if=/dev/rmt0 of=/chem/data/c70o.dat \
     ibs=1024 fskip=2 conv=swab

As always, be careful to specify the appropriate devices for if and of; transposing them can have disastrous consequences.
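The conversion and block options can be explored safely on an ordinary file before they are pointed at a tape device. In this minimal sketch the scratch file and its contents are hypothetical, and fskip is left out because it is not available in all implementations:

```shell
#!/bin/sh
# An ordinary file stands in for the tape device.
work=$(mktemp -d)
printf 'abcdefgh' > "$work/input.dat"

# conv=swab swaps each pair of bytes; dd's transfer statistics go to standard
# error, which is discarded here.
swapped=$(dd if="$work/input.dat" conv=swab 2>/dev/null)
echo "$swapped"

# ibs sets the input block size and count limits the number of input blocks:
# one 4-byte block, converted to uppercase.
first_block=$(dd if="$work/input.dat" ibs=4 count=1 conv=ucase 2>/dev/null)
echo "$first_block"
```

The first command prints badcfehg and the second prints ABCD, which confirms how the byte swapping and the block-count limit behave.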

11.3.3.2 Tape manipulation with mt

Unix provides the mt command for direct manipulation of tapes. It can be used to position tapes (to skip past backup save sets, for example), to rewind tapes, and to perform other basic tape operations. Its syntax is:

$ mt [-f tape-device] command

where tape-device specifies which tape drive to use, and command is a keyword indicating the desired action. Useful keywords include rewind (to rewind the tape), status (display device status; you can see whether it is in use, for example), fsf n (skip the next n files), and bsf n (skip back n files).

For example, to rewind the tape in the second tape drive, you might use a command like:

$ mt -f /dev/rmt1 rewind

The Solaris version of mt includes an asf subcommand, which moves the tape to the nth file on the tape (where n is given as asf's argument), regardless of the tape's current position.

Under FreeBSD, the mt command is used to set the tape drive density and compression:

$ mt -f /dev/nrsa0 comp on density 0x26

AIX also includes the tctl utility (to which mt is really a link). tctl has the same syntax as mt and offers a few additional seldom-wanted subcommands.
