Backup & Recovery: Inexpensive Backup Solutions for Open Systems
3.8. Backing Up and Restoring with the cpio Utility
cpio is a powerful utility. Unlike dump, it works on the file level. For this reason, it handles changing filesystems a little better than dump, but it changes the access time (atime) of files as it is backing them up. (It does have an option to reset atime, but this changes ctime.) Unless you're using GNU cpio, one of cpio's biggest challenges is compatibility between different operating systems. In addition, cpio requires you to specify files to include on standard input, which makes it a bit different from all other backup tools. cpio does make you do more work than dump does. This means you need to know a little bit more about how it works if you want to use it for regular system backups. You need to understand:
One good thing about cpio is that its name is usually cpio. (A great advantage over dump to be sure!)
Let's start with the basic syntax of cpio, followed by some example commands. cpio's backup syntax is as follows:

    cpio -o [aBcv]

cpio's restore syntax is as follows:

    cpio -i [Btv] [patterns]

The following example command creates a full backup of /home to a local tape drive:

    $ cd /home
    $ touch level.0.cpio.timestamp

The touch command is optional, but it makes incremental backups possible.

    $ find . -print | cpio -oacvB > device

Of course, the device in the preceding command also could be a local file if you are backing up to an optical or CD device. This command creates an incremental backup of /home to a local tape drive:

    $ cd /home
    $ touch level.1.cpio.timestamp
    $ find . -newer level.0.cpio.timestamp -print \
      | cpio -oacvB > device
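The full-then-incremental sequence can be tried safely against a scratch directory before you point it at a real filesystem. The following is only a sketch: the scratch paths, the file names, and the archive files level0.cpio and level1.cpio are hypothetical stand-ins for /home and a tape device, and the v option is dropped to keep the output quiet.

```shell
#!/bin/sh
# Sketch: level 0 and level 1 cpio backups of a scratch directory.
# All paths and file names here are hypothetical stand-ins for /home
# and for a real backup device.
command -v cpio >/dev/null 2>&1 || { echo "cpio not installed" >&2; exit 0; }

work=$(mktemp -d)                      # scratch area, left behind for inspection
mkdir -p "$work/home/alice"
echo "old data" > "$work/home/alice/old.txt"

cd "$work/home" || exit 1

# Level 0: create the timestamp first, then back up everything.
touch level.0.cpio.timestamp
find . -print | cpio -oacB > "$work/level0.cpio" 2>/dev/null

# Simulate later activity. The sleep keeps the new file's mtime strictly
# newer than the timestamp on filesystems with one-second resolution.
sleep 1
echo "new data" > alice/new.txt

# Level 1: back up only what changed since the level 0 timestamp.
touch level.1.cpio.timestamp
find . -newer level.0.cpio.timestamp -print \
  | cpio -oacB > "$work/level1.cpio" 2>/dev/null

# List the incremental archive; old.txt should be absent.
level1_list=$(cpio -it < "$work/level1.cpio" 2>/dev/null)
printf '%s\n' "$level1_list"
echo "scratch directory: $work"
```

Directories whose contents changed also show up in the level 1 listing, because creating a file updates the modification time of its parent directory.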
These commands create a full backup of /home to a remote tape drive:

    $ cd /home
    $ find . -print | cpio -oacvB \
      | (rsh remote_system dd of=device bs=5120)

Here's a more secure method that uses ssh:

    $ find . -print | cpio -oacvB \
      | (ssh remote_system dd of=device bs=5120)
3.8.1. The Syntax of cpio When Backing Up
The cpio command takes its list of files from standard input (stdin) and by default sends its data stream to standard output (stdout). To provide a list of files to back up, do anything that generates a list of files:
All the preceding references generate an include list with a path that is relative to the current working directory. This is done automatically with dump, but with cpio, you can use either relative paths (e.g., cd /home;find .) or absolute paths (e.g., find /home1). However, using absolute paths severely limits your restore flexibility. If a table of contents of your cpio file shows /home1/directory/somefile, you can restore it only to /home1/directory/somefile. (Sometimes it is possible to use chroot to fix this, but it is very tricky!) On the other hand, if the table of contents shows ./home1/directory/somefile or home1/directory/somefile, you can restore it to anywhere you want by changing to another directory and running the restore from there. Therefore, you should always use relative paths when creating include lists for cpio or tar. (GNU tar suppresses absolute paths during a restore, but it is probably better to develop a habit of using relative paths when creating include lists for either of these backup utilities.)

find is the usual method for making regular system backups because it can make cpio perform incremental backups. Before beginning a full backup of a filesystem or directory, create a timestamp file in the top-level directory. For example, in the native version of cpio, if you want to do incremental backups of /home1, create a file called /home1/level.0.cpio.timestamp. Then perform the full backup, using a find command that lists the entire contents of that directory or filesystem (e.g., find . -print). When it is time for a level 1 backup, you create the file /home1/level.1.cpio.timestamp and use a find command that looks for files newer than /home1/level.0.cpio.timestamp (e.g., find . -newer level.0.cpio.timestamp). The level.1.cpio.timestamp file can then serve as the reference point for a level 2 backup, using a find command that looks for files newer than that file. You can use this technique to generate as many levels of backups as you wish.

3.8.2. The Options to the cpio Command
There are six options that should be used when making regular cpio backups. The first five usually are listed all at once (e.g., -oacvB), and the last one usually is listed as a separate argument (e.g., -C 5120). (Note that the -B and -C options are mutually exclusive; they cannot be used together.)
In addition, you can specify a device or file to which cpio can send its output rather than sending it to stdout. All of these options and more are available in the GNU version of cpio, as is the ability to use remote devices.
3.8.2.1. Specifying the output mode (o)
The o option is one of the three modes of cpio (o, i, and p) and is used to create a backup. It is listed as the first of several arguments.

3.8.2.2. Restoring access times (a)
One of the differences between dump and cpio is that dump backs up directly using the disk device, whereas cpio must go through the filesystem. Therefore, when cpio reads a file to back it up, it changes its access time (atime). System administrators typically use this value to see when a user has last used a file by looking at it in some way. Files that have not been accessed in a long time are typically removed from the system as part of a cleanup process. If your backup program changes the access time of a file, it appears as if all files are used every night. This option to cpio can reset atime to its original value.
3.8.2.3. Specifying the ASCII format (c)
When cpio backs up, it can send the data to the backup device using a number of header formats. These formats can be very platform-dependent, and therefore not very exchangeable between systems. The most exchangeable format (although not completely exchangeable) is called the ASCII format. The c option tells cpio to use this format. As mentioned in the sidebar "Use GNU cpio if You Can!", this format may not be as interchangeable as you might think. If you are really concerned with portability, you should consider using GNU cpio. If you can't use it, you should try transferring cpio files between the different flavors of Unix that you have. At least you will know where you stand. Either way, using the c option can't hurt.

3.8.2.4. Requesting verbose output (v)
The v option causes cpio to print the list of files that it backs up to standard error (stderr). The actual data of the cpio backup goes to standard output (stdout). (The backup data always goes to stdout, unless your version of cpio supports the -O option, which can specify an output file or device.)

3.8.2.5. Specifying a blocking factor of 5,120 (B)
The B option simply tells cpio to send its data to stdout in blocks of 5,120 bytes, instead of the default block size of 512 bytes. This can help the backup to go faster. However, it is nowhere near the large blocking factors that many modern backup drives prefer. You should therefore use the C option listed next if it is available on your system. The two options are mutually exclusive.

3.8.2.6. Specifying an I/O block size (C)
The C option does require an argument and allows you to specify the actual block size. If you are on AIX, the value is a blocking factor, which is multiplied by the minimum block size of 512. Most other Unix versions allow you to specify the value in bytes. Either way, you can set this value to be quite large, allowing cpio to perform much better with modern backup drives. Once again, this option is mutually exclusive with the B option and usually is listed separately with its argument, as in the following example:

    $ find . -print | cpio -oacv -C 129024 > device
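To see the C and v options working together, you can write a small archive with an explicit block size, capture the verbose file list from stderr in a log, and read the archive back with the same block size. This is a sketch under stated assumptions: the scratch paths and the names backup.cpio and backup.log are hypothetical, and the value is treated as bytes (on AIX it would be read as a multiple of 512).

```shell
#!/bin/sh
# Sketch: write and read back an archive with an explicit I/O block size.
# Paths and file names are hypothetical; a regular file stands in for a tape.
command -v cpio >/dev/null 2>&1 || { echo "cpio not installed" >&2; exit 0; }

work=$(mktemp -d)                      # scratch area, left behind for inspection
mkdir -p "$work/src"
echo "payload" > "$work/src/file.txt"

blocksize=129024                       # bytes on most systems; a 512-byte multiplier on AIX

cd "$work/src" || exit 1

# Backup data goes to stdout; the verbose file list goes to stderr,
# so the two can be redirected to different places.
find . -print | cpio -ocv -C "$blocksize" > "$work/backup.cpio" 2> "$work/backup.log"

# Read the table of contents back using the same block size.
listing=$(cpio -it -C "$blocksize" < "$work/backup.cpio" 2>/dev/null)
printf '%s\n' "$listing"
```

Keeping the verbose list in a log file gives you a record of what each backup contained without having to read the volume.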
3.8.2.7. Specifying an output device or file (O)
Some versions of cpio allow you to specify a -O device argument, which causes the output to go to device. (This option is not always available.) All versions of cpio, however, default to sending the backup data to stdout. Once again, for simplicity, you don't have to use the -O option even if it is available. To specify a backup device, simply redirect stdout to a file or device. This method always works, no matter what version of Unix you are using.

3.8.2.8. Backing up to a remote device (piping to an rsh or ssh command)
The native version of cpio does not automatically support remote devices in the way that dump does. (The GNU cpio version does do this.) So, in order to back up to a remote backup drive, you need to replace the > device option with a pipe to an rsh or ssh command:

    $ find . -print | cpio -oacv \
      | rsh remote_system dd of=device bs=5k

Here's a more secure version:

    $ find . -print | cpio -oacv \
      | ssh remote_system dd of=device bs=5k
Notice that it is piped to a dd command on the remote host. Since the input file is stdin, you need only specify the output file (of=) and the block size. You need to specify the 5 K block size because that is readable by any version of cpio.

3.8.3. Restoring with cpio
The same rules apply to cpio as to any other restore command. I hope that you aren't sitting there with a cpio volume in your hand that contains your very critical system backup, and you've never restored with cpio before. Remember, test, test, test, and practice, practice, practice! OK, now that I'm off my soapbox, don't worry. Restoring from a cpio volume isn't that hard, although there are a number of possible challenges that you may face when trying to read a cpio volume.
3.8.3.1. Different versions of cpio
Just because you know that a backup volume was written in cpio format doesn't mean you can read it easily. This is because, although most versions of cpio are called cpio, they don't always produce the same format. Even the ASCII header that is intended to provide portability is not readable among all platforms. If you just want to see if you can read the volume, try a simple cpio -itv < device. If that works, then you're golden! If it doesn't work, you might get errors like:

    Not a cpio file, bad header

or:

    Impossible header type
3.8.3.2. Byte-order problems
If you are reading the volume on a type of platform that is different from the one on which the volume was written, you might have a byte-order problem, and you will probably get the first of the two preceding errors. The b, s, and S options to cpio are designed to help with byte-order problems:

    $ cpio -itbv < device    # Reverse the order of the bytes within each word
    $ cpio -itsv < device    # Reverse the order of the bytes within each half word
    $ cpio -itSv < device    # Swap the half words within each word
3.8.3.3. Wrong header type
If you don't have a byte-order problem, the cpio data might have been written with a different type of header. Some versions of cpio can automatically detect some of the headers, but they can't detect all of them, and some versions of cpio can detect only one type automatically. You may have to experiment with different headers to see which one it was written in. If this is your problem, you are probably getting the "Impossible header type" error. (Again, GNU cpio is able to detect any header type automatically.) Try some of the following commands:

    $ cpio -ictv < device              # Try reading the incoming data in ASCII format
    $ cpio -itv -H header < device     # Try reading with a header of value header

The value header could be crc, tar, ustar, odc, and so on. Consult your manpage. This option is not available everywhere.

    $ cpio -ictv -H header < device    # Combining ASCII and header options
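If you end up experimenting with several header values, a small loop saves typing. The following is a sketch only, and it rests on several assumptions: that your cpio supports -H, that it accepts the names newc, crc, and odc, and that it exits with a nonzero status when it cannot parse the volume. A versions that ignores -H on input will simply report the first value tried. The file unknown.cpio is a hypothetical stand-in for a volume of unknown format.

```shell
#!/bin/sh
# Sketch: try several header types until one yields a table of contents.
# Assumes cpio supports -H and exits nonzero when it cannot read the headers.
command -v cpio >/dev/null 2>&1 || { echo "cpio not installed" >&2; exit 0; }

work=$(mktemp -d)                      # scratch area, left behind for inspection
mkdir -p "$work/src"
echo "payload" > "$work/src/file.txt"

# Stand-in for a volume whose header type you do not know.
(cd "$work/src" && find . -print | cpio -o -H odc > "$work/unknown.cpio" 2>/dev/null)

found=""
for header in newc crc odc; do
    # Success here means cpio read the whole volume with this header type.
    if cpio -it -H "$header" < "$work/unknown.cpio" > "$work/toc.txt" 2>/dev/null; then
        found=$header
        break
    fi
done

if [ -n "$found" ]; then
    echo "readable with -H $found:"
    cat "$work/toc.txt"
else
    echo "no header type in the list worked" >&2
fi
```

Check your manpage for the header names your version accepts before extending the list.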
3.8.3.4. Strange block size
Finally, the cpio volume could have been written with a block size other than what cpio expects. If the block size of your cpio backup is 5 K, you can try telling cpio to use that block size by adding the B option to any of the preceding commands (cpio -itBv). If the block size is not 5 K, you can get cpio to use it by adding -C blocksize at the end of the cpio command (e.g., cpio -itv -C 129024 for a volume written with -C 129024).

3.8.3.5. Full or partial restore, or table of contents only?
Once you determine that you can read the cpio backup volume, you have several choices of what to do with it:
3.8.4. cpio's Restore Options
Before doing any of the things just described, you have several options available to read from a cpio volume. Many of these are the same options that you used to create a cpio volume, such as (B) for 5 K blocks, (c) to read an ASCII header, and (v) to give verbose output. In addition, you have the following:
3.8.5. Telling cpio Which Device to Use
Unlike tar or dump, cpio does not take the name of the backup device as an argument. (That is, unless you want to use the -I option supported by some versions of cpio. Once again, though, this book concentrates on those options that work almost everywhere.) You must feed cpio the data through stdin. You can do this the hard way by using dd or cat:

    $ dd if=device bs=blocksize | cpio -options
Alternatively, you can simply redirect stdin to read from the device:

    $ cpio -options < device

3.8.6. Examples of a cpio Restore
The only question now is what options are needed. The easiest way to explain this is to show you example commands for the things that you can do with a cpio volume. Several "optional" options are listed in these example commands. Many of these options, while not required, make the operation easier or more robust. Some of the options may not be applicable to your particular application, so feel free to not use them.

3.8.6.1. Listing the files on a cpio volume
The following command reads the cpio volume in (B) blocks of 5,120 bytes, uses the (c) ASCII format when reading the header, (k) skips bad spots on the volume when possible, and lists only the (t) table of contents with a (v) verbose (ls -l) style listing:

    $ cpio -iBcktv < device
3.8.6.2. Doing an entire filesystem restore
The following command reads the cpio volume in (B) blocks of 5,120 bytes, uses the (c) ASCII format when reading the header, and makes (d) directories where needed. It (k) skips bad spots on the volume when possible, retains the original file (m) modification times, (u) unconditionally overwrites files, and (v) lists the names of the files that it recovers as it reads them:

    $ cpio -iBcdkmuv < device

Of course, you can do the same thing, but without the (u) unconditional overwrite:

    $ cpio -iBcdkmv < device

3.8.6.3. Doing a pattern-match restore
To restore files that match a certain pattern, simply list the pattern(s) you are looking for after the command:

    $ cpio -iBcdkmuv "pattern1" "pattern2" "pattern3" < device

The pattern uses filename expansion wildcards, not regular expressions. (For learning more than you ever thought possible about regular expressions, I highly recommend Mastering Regular Expressions, by Jeffrey Friedl (O'Reilly). Understanding what they are and what they do is an eye-opening experience and will make your use of tools such as grep, sed, awk, and vi much more fruitful.) Filename expansion wildcards work like the ones on the command line (e.g., *ome* finds both home1 and rome). The cpio command is the only native restore utility that supports wildcard restores in this way. For example, if you want to restore all of the files that were in my home directory (/home1/curtis), you can type:

    $ cpio -iBcdkmuv "*curtis*" < device

To restore all files except those matching a certain pattern, use the f option, and list the excluded pattern(s):

    $ cpio -iBcfdkmuv "pattern1" "pattern2" "pattern3" < device
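Both kinds of pattern restore are easy to rehearse on a scratch archive before you try them on a real volume. In this sketch the directory names, file names, and backup.cpio are hypothetical, the option list is trimmed to d and m for brevity, and f is given as a separate option directly before a single pattern, which is the form most likely to behave the same way across cpio versions.

```shell
#!/bin/sh
# Sketch: pattern-match and inverse pattern-match restores from one archive.
# All paths and file names are hypothetical.
command -v cpio >/dev/null 2>&1 || { echo "cpio not installed" >&2; exit 0; }

work=$(mktemp -d)                      # scratch area, left behind for inspection
mkdir -p "$work/src/docs" "$work/src/logs" "$work/only_txt" "$work/no_logs"
echo "text" > "$work/src/docs/a.txt"
echo "log"  > "$work/src/logs/b.log"

# Build the archive with relative paths so it can be restored anywhere.
(cd "$work/src" && find . -print | cpio -oc > "$work/backup.cpio" 2>/dev/null)

# Restore only files matching the wildcard. The quotes stop the shell
# from expanding the pattern before cpio sees it.
(cd "$work/only_txt" && cpio -idm "*.txt" < "$work/backup.cpio" 2>/dev/null)

# Restore everything except files matching the wildcard.
(cd "$work/no_logs" && cpio -idm -f "*.log" < "$work/backup.cpio" 2>/dev/null)

# Show which regular files ended up in each restore directory.
find "$work/only_txt" "$work/no_logs" -type f | sort
```

Because the archive was made with relative paths, changing directory before each restore is all it takes to send the files somewhere new.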
3.8.6.4. Renaming files interactively
The following is the same command as that in the previous section "Doing an entire filesystem restore" but prompts the user to interactively (r) rename any files that are restored:

    $ cpio -iBcdkmruv < device

The following is the same command as that in the previous section "Doing a pattern-match restore" but prompts the user to interactively (r) rename any files that are restored:

    $ cpio -iBcdkmruv "pattern" < device

3.8.6.5. Other useful options
3.8.6.6. Restoring to a different directory
If you made your backup volumes using relative pathnames, this is not a problem. Simply cd to the directory where you want to restore, and issue your cpio restore commands from there. If you don't know whether the volume was written with relative pathnames, enter the command cpio -itv < device, and look at the filenames. If they start with a /, the volume was made with absolute paths. In that case, you can do one of two things:
3.8.7. Using cpio's Directory Copy Feature
If you need to copy a directory from one place to another (for example, as the first step in moving it), you can try this little-used feature of cpio. Issue the following command:

    $ cd old-directory ; find . -print | cpio -padlmuv new-directory

This copies the contents of old-directory to new-directory, resetting (a) access times, creating (d) directories when needed, (l) linking files when possible instead of copying them, retaining the original (m) modification times, and (u) unconditionally overwriting all files, while giving a (v) verbose output of the files that get copied. The original directory is left in place; remove it yourself once you have verified the copy.
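Here is a minimal sketch of the pass-through mode on a scratch tree. The paths are hypothetical, and the option list is reduced to d and m so that files are really copied rather than linked. Note that the source tree is still there afterward.

```shell
#!/bin/sh
# Sketch: copy a directory tree with cpio's pass-through (-p) mode.
# All paths and file names are hypothetical.
command -v cpio >/dev/null 2>&1 || { echo "cpio not installed" >&2; exit 0; }

work=$(mktemp -d)                      # scratch area, left behind for inspection
mkdir -p "$work/old-directory/sub" "$work/new-directory"
echo "hello" > "$work/old-directory/sub/file.txt"

cd "$work/old-directory" || exit 1

# find supplies relative names on stdin; cpio recreates them under the
# target, making (d) directories as needed and keeping (m) modification times.
find . -print | cpio -pdm "$work/new-directory" 2>/dev/null

# Both trees now exist; pass-through mode does not remove the source.
ls -R "$work/old-directory" "$work/new-directory"
```

The target directory must sit outside the source tree, or find will start listing the files that cpio has just created.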
If you were to compile a list of all the options that are available on all Unix platforms, it would be very long. Depending on your platform, there may be a lot of other neat options that can make cpio more useful for you. There are also a number of extra features in GNU's version of cpio. Make sure you read the manpage for your version of cpio. Please be aware that if you use any of the options that affect how the cpio backup is written, it may reduce its portability.