Archives and Compression
Overview
Working with a large number of related files can be cumbersome and time-consuming. Archive files help reduce the complexity of file manipulation. By grouping related files together into a single archive file, moving and transferring that one file has the effect of moving or transferring all of the individual files. The individual files are extracted when desired.
Users familiar with iSeries save files will find similarities in the way other archive files are used to back up, move, or transfer groups of files. The iSeries save file is an archive file.
The most common types of archive files are zip files, jar files, and tar files. The archive file type refers to the utility used to create the archive file. For example, a zip utility creates zip archive files.
A compressed archive file uses less space than the sum of all of the individual files that it contains. Some archive utilities, such as jar, use data compression when creating the archive file, while others, such as tar, do not. You can, however, use compression utilities to compress archive files that do not use data compression themselves. For example, archive files created by the tar utility are commonly compressed using the compress utility.
Archive files are binary files that have defined data formats. When transferring archive files (perhaps using FTP), always transfer in binary mode. Each Qshell archive utility converts data as its archive format requires when files are inserted into or extracted from an archive file.
Example Data
The examples in this chapter use an assortment of files, links, and directories as the source of the data. For demonstration purposes, the examples all use the v option to verbosely display the tar processing that occurs. The example data is listed in Figure 16.1.
    ls -l
    total: 36 kilobytes
    -rwxr-xr-x 1 JSMITH 0 107 Jul 13 15:45 HelloC.c
    -rwxr-xr-x 1 JSMITH 0 118 Jul 13 15:45 HelloCpp.C
    drwxrwsrwx 2 JSMITH 0 8192 Jul 13 15:46 data
    lrwxrwxrwx 1 JSMITH 0 37 Jul 13 15:44 goodoleboys.sql -> /home/jsmith/src/data/goodoleboys.sql
    lrwxrwxrwx 1 JSMITH 0 37 Jul 13 15:44 goodoleboys.txt -> /home/jsmith/src/data/goodoleboys.txt
    lrwxrwxrwx 1 JSMITH 0 31 Jul 13 15:43 qcsrc -> /qsys.lib/jsmith.lib/qcsrc.file
    ls -lL data qcsrc
    data:
    total: 40 kilobytes
    -rw-rw---- 1 JSMITH 0 3341 Jul 13 15:46 customers.txt
    -rw-rw---- 1 JSMITH 0 13400 Jul 13 15:46 notes.txt
    qcsrc:
    total: 228 kilobytes
    -rwx---rwx 1 JSMITH 0 73431 Jul 13 15:42 GOODBYE.MBR
    -rwx---rwx 1 JSMITH 0 19434 Jul 13 15:42 HELLO.MBR
    -rwx---rwx 1 JSMITH 0 73431 Jul 13 15:43 TEST.MBR
Figure 16.1: These files and directories are used by the archive examples in this chapter.
Tar
Tar, the tape archive utility, is named for its ability to read, write, and list files on a tape drive. A tar file is a binary file, but the tar file format specifies that file data is stored as ASCII. When files are inserted into the archive, the tar utility converts them to ASCII. When files are extracted, the tar utility converts them from ASCII to EBCDIC (the default CCSID of the job). Setting the QIBM_CCSID environment variable to a value other than zero causes tar to convert the ASCII data in the archived file to that CCSID instead.
Like tape media, a tar file contains a sequential list of file or directory entries. The same source file may be present in the tar file more than once. If you add or update file entries in a tar file, the entries are added to the end.
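The sequential, append-only behavior can be seen with a short experiment. This is a minimal sketch, assuming a POSIX-style shell and a tar that accepts the c, r, t, and f options described later in this chapter. The scratch directory and the names demo.tar and notes.txt are made up for illustration.

```shell
# Hypothetical scratch directory; any empty directory would do.
work=${TMPDIR:-/tmp}/tar-append-demo.$$
mkdir -p "$work" && cd "$work" || exit 1

echo "version 1" > notes.txt
tar cf demo.tar notes.txt      # create the archive with the first copy

echo "version 2" > notes.txt
tar rf demo.tar notes.txt      # append: the new entry is written at the end

entries=$(tar tf demo.tar)     # entry names, in archive order
count=$(echo "$entries" | grep -c '^notes\.txt$')
echo "$entries"
echo "notes.txt entries: $count"
```

Both copies of notes.txt stay in the archive; nothing is overwritten in place.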
When the tar utility encounters a directory, it processes the directory recursively unless the directory is a symbolic link. Symbolic-link processing is controlled by the H, L, and P options.
The syntax of tar is shown here:
tar operation[options] [tar-file] [block-size] [files]
Tar Options
The first option character passed to tar is the operation that tar should perform. Exactly one option for the operation is required from the list shown in Table 16.1. Specify the tar-file parameter if the f option is used; otherwise, the tar-file parameter defaults to the file archive.tar. Tar assumes that the first argument consists of one or more options. For this reason, the options do not have to be preceded by a hyphen.
Operation | Description |
---|---|
c | Create a new archive file. |
x | Extract files from an archive file. |
t | List files in an archive file. |
r | Append files to an existing archive file. |
u | Update files in the archive file. The u option is rather misleading, because updated files are actually added to the end of the archive, similar to what you might expect the r option to do. |
If no files are specified for an extract or list operation, all files in the archive are targeted. If no files are specified for a create, update, or append operation, the filenames are read from standard input.
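The create, list, and extract operations fit together as a simple round trip. The following is a sketch only, assuming a POSIX-style shell; the scratch directory, demo.tar, and the file contents are invented for the example. File names are passed explicitly, so nothing is read from standard input.

```shell
# Hypothetical scratch directories for the source and restored copies.
work=${TMPDIR:-/tmp}/tar-ops-demo.$$
mkdir -p "$work/src" "$work/restore" && cd "$work/src" || exit 1

echo "hello" > HelloC.c
echo "world" > HelloCpp.C

tar cf ../demo.tar HelloC.c HelloCpp.C   # c: create a new archive
listing=$(tar tf ../demo.tar)            # t: list the entries
echo "$listing"

cd ../restore || exit 1
tar xf ../demo.tar                       # x: extract every entry
restored=$(cat HelloC.c)
echo "restored HelloC.c contains: $restored"
```

Note that the operation letter comes first and no hyphen is needed, as described above.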
All other options for tar are optional. They are shown in Table 16.2, and affect the way that the tar operation is carried out.
Option | Description |
---|---|
b | This option indicates that the next argument is the blocksize argument. The blocksize argument is used to define the size of the block when creating an archive file. |
e | Exit immediately if an error is encountered. |
f | This option indicates that the next argument is the archive file argument. Use an archive file name of - (dash) to read or write the archive to standard input or standard output. If unspecified, tar uses a default archive filename of archive.tar. |
m | Modification times of extracted files are not restored when extracting files from the archive file. |
o | Extracted files are assigned the owner and group of the current user. Saved owner and group settings are not restored from the archive file. |
p | Preserve the file mode, access, and modification times, as well as owner and group settings, of files when extracting files from the archive file. |
v | Use verbose mode. Write additional information about files processed during any operation. |
w | Wait for a confirmation from the user before taking any action. |
H | Remove references to symbolic links for files specified on the command line. Archive the files referred to by the links instead of the symbolic links themselves. |
L | Remove references to all symbolic links encountered. Archive the files referred to by the links instead of the symbolic links themselves. |
P | Do not remove references to symbolic links. Archive the symbolic links themselves instead of the files referred to by the symbolic links. This is the default symbolic-link behavior. |
X | While processing directories recursively, do not process directories that have a different device ID (for example, symbolic links to directories in a different file system). |
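One common use of the dash archive name with the f option is to copy a directory tree through a pipe, with one tar writing to standard output and a second tar reading from standard input. This is a hedged sketch, assuming a POSIX-style shell; the directory names and file contents are hypothetical.

```shell
# Hypothetical source ("from") and target ("to") directories.
work=${TMPDIR:-/tmp}/tar-pipe-demo.$$
mkdir -p "$work/from/data" "$work/to" && cd "$work/from" || exit 1
echo "customer list" > data/customers.txt

# The first tar writes the archive to standard output (f -);
# the second tar reads it from standard input and extracts it.
tar cf - data | (cd "$work/to" && tar xf -)

copied=$(cat "$work/to/data/customers.txt")
echo "copied file contains: $copied"
```

No intermediate archive file is left on disk, which is why this form is often preferred for one-off copies.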
Tar Examples
This section contains several examples that demonstrate some basic tar operations. Figure 16.2 demonstrates how to create an archive file. The default symbolic-link behavior is as if the P option were used. Therefore, the qcsrc and goodoleboys symbolic links are copied directly to the archive. The t (list) operation shows that the symbolic links are present in the archive file, not the files to which the links point.
    tar vc *
    HelloC.c
    HelloCpp.C
    archive.tar
    data
    data/customers.txt
    data/notes.txt
    goodoleboys.sql
    goodoleboys.txt
    qcsrc
    tar: 001-2298 For archive file tar and volume 1, 9 files were processed with 0 bytes read and 0 bytes written.
    ls -l archive.tar
    -rw-rw-rw- 1 JSMITH 0 30720 Jul 13 16:02 archive.tar
    tar tv
    -rwxr-xr-x 1 JSMITH 0 0 Jul 13 15:45 HelloC.c
    -rwxr-xr-x 1 JSMITH 0 0 Jul 13 15:45 HelloCpp.C
    -rw-rw-rw- 1 JSMITH 0 0 Jul 13 16:02 archive.tar
    drwxrwsrwx 2 JSMITH 0 0 Jul 13 15:46 data
    -rw-rw---- 1 JSMITH 0 0 Jul 13 15:46 data/customers.txt
    -rw-rw---- 1 JSMITH 0 0 Jul 13 15:46 data/notes.txt
    lrwxrwxrwx 1 JSMITH 0 0 Jul 13 15:44 goodoleboys.sql => /home/jsmith/src/data/goodoleboys.sql
    lrwxrwxrwx 1 JSMITH 0 0 Jul 13 15:44 goodoleboys.txt => /home/jsmith/src/data/goodoleboys.txt
    lrwxrwxrwx 1 JSMITH 0 0 Jul 13 15:43 qcsrc=> /qsys.lib/jsmith.lib/qcsrc.file
Figure 16.2: The default archive file contains all files and directories in the current directory.
Figure 16.3 demonstrates the w (Wait for Confirmation) option, which prompts the user for an action on each file. Only the HelloC.c file is added to the archive file. A subsequent tar update operation (the u option) adds HelloC.c to the archive file again, but this time the user renames the entry to HelloC-version2.c. Used this way, the tar utility provides a rudimentary historical view of a changing data file.
    tar cw H*
    tar: Starting interactive file rename operation.
    tar: The current file is HelloC.c with mode -rwxr-xr-x and a modification time of Jul 13 17:01.
    tar: Enter a new name, or a period (".") to quit, or press Enter to skip this file: .
    tar: The file name is not changed.
    tar: Starting interactive file rename operation.
    tar: The current file is HelloCpp.C with mode -rwxr-xr-x and a modification time of Jul 13 17:01.
    tar: Enter a new name, or a period (".") to quit, or press Enter to skip this file:
    tar: The file is skipped.
    tar tvf archive.tar
    -rwxr-xr-x 1 JSMITH 0 0 Jul 13 17:01 HelloC.c
    tar: 001-2298 For archive file tar and volume 1, 1 files were processed with 0 bytes read and 102400 bytes written.
    tar wuf archive.tar HelloC.c
    tar: Starting interactive file rename operation.
    tar: The current file is HelloC.c with mode -rwxr-xr-x and a modification time of Jul 13 17:01.
    tar: Enter a new name, or a period (".") to quit, or press Enter to skip this file: HelloC-version2.c
    tar: The file name is changed to HelloC-version2.c.
    tar tvf archive.tar
    -rwxr-xr-x 1 JSMITH 0 0 Jul 13 17:01 HelloC.c
    -rwxr-xr-x 1 JSMITH 0 0 Jul 13 17:01 HelloC-version2.c
    tar: 001-2298 For archive file tar and volume 1, 2 files were processed with 0 bytes read and 199168 bytes written.
Figure 16.3: Options can be used to confirm file operations.
Figure 16.4 uses the L option to create an archive file containing the current directory (the dot). With this option, the files or directories referred to by symbolic links (here qcsrc, goodoleboys.sql, and goodoleboys.txt) are copied into the archive instead of the links. When those objects are extracted, they are restored as ordinary files and directories, not as links.
    tar cLf archive-file.tar .
    tar: 001-2298 For archive file tar and volume 1, 14 files were processed with 0 bytes read and 0 bytes written.
    tar tvf archive-file.tar
    drwxrwsrwx 2 JSMITH 0 0 Jul 13 16:26 .
    -rw-rw-rw- 1 JSMITH 0 0 Jul 12 16:39 ./goodoleboys.sql
    -rw-rw-rw- 1 JSMITH 0 0 Jun 29 13:03 ./goodoleboys.txt
    drwx---rwx 2 JSMITH 0 0 Jul 13 15:43 ./qcsrc
    -rwx---rwx 1 JSMITH 0 0 Jul 13 15:42 ./qcsrc/GOODBYE.MBR
    -rwx---rwx 1 JSMITH 0 0 Jul 13 15:42 ./qcsrc/HELLO.MBR
    -rwx---rwx 1 JSMITH 0 0 Jul 13 15:43 ./qcsrc/TEST.MBR
    -rwxr-xr-x 1 JSMITH 0 0 Jul 13 15:45 ./HelloCpp.C
    -rwxr-xr-x 1 JSMITH 0 0 Jul 13 15:45 ./HelloC.c
    drwxrwsrwx 2 JSMITH 0 0 Jul 13 15:46 ./data
    -rw-rw---- 1 JSMITH 0 0 Jul 13 15:46 ./data/customers.txt
    -rw-rw---- 1 JSMITH 0 0 Jul 13 15:46 ./data/notes.txt
    -rw-rw-rw- 1 JSMITH 0 0 Jul 13 16:28 ./archive.tar
    -rw-rw-rw- 1 JSMITH 0 0 Jul 13 16:26 ./test.tar
    tar: 001-2298 For archive file tar and volume 1, 14 files were processed with 0 bytes read and 266240 bytes written.
Figure 16.4: The L option follows symbolic links.
Figure 16.5 demonstrates a combination of the X and L options. While the L option causes tar to follow all symbolic links, the X option prevents tar from writing the contents of directory qcsrc, because qcsrc refers to a different file system.
    tar cvXL .
    .
    ./goodoleboys.sql
    ./goodoleboys.txt
    ./qcsrc
    ./HelloCpp.C
    ./HelloC.c
    ./data
    ./data/customers.txt
    ./data/notes.txt
    ./archive.tar
    ./test.tar
    tar: 001-2298 For archive file tar and volume 1, 11 files were processed with 0 bytes read and 0 bytes written.
    tar tvf archive.tar
    drwxrwsrwx 2 JSMITH 0 0 Jul 13 16:26 .
    -rw-rw-rw- 1 JSMITH 0 0 Jul 12 16:39 ./goodoleboys.sql
    -rw-rw-rw- 1 JSMITH 0 0 Jun 29 13:03 ./goodoleboys.txt
    drwx---rwx 2 JSMITH 0 0 Jul 13 15:43 ./qcsrc
    -rwxr-xr-x 1 JSMITH 0 0 Jul 13 15:45 ./HelloCpp.C
    -rwxr-xr-x 1 JSMITH 0 0 Jul 13 15:45 ./HelloC.c
    drwxrwsrwx 2 JSMITH 0 0 Jul 13 15:46 ./data
    -rw-rw---- 1 JSMITH 0 0 Jul 13 15:46 ./data/customers.txt
    -rw-rw---- 1 JSMITH 0 0 Jul 13 15:46 ./data/notes.txt
    -rw-rw-rw- 1 JSMITH 0 0 Jul 13 16:32 ./archive.tar
    -rw-rw-rw- 1 JSMITH 0 0 Jul 13 16:26 ./test.tar
    tar: 001-2298 For archive file tar and volume 1, 11 files were processed with 0 bytes read and 92160 bytes written.
Figure 16.5: Use the X option to avoid files in other file systems.
If no file names are passed to tar on the command line, tar reads the file names from standard input. In Figure 16.6, the output of the ls utility is piped to tar to supply the names of all files in the current directory. As each filename is read by tar, that file is added to the archive.
    ls -1 | tar cvf test.tar
    HelloC.c
    HelloCpp.C
    archive.tar
    data
    data/customers.txt
    data/notes.txt
    goodoleboys.sql
    goodoleboys.txt
    qcsrc
    tar: 001-2298 For archive file tar and volume 1, 9 files were processed with 0 bytes read and 0 bytes written.
    tar tvf test.tar
    -rwxr-xr-x 1 JSMITH 0 0 Jul 13 15:45 HelloC.c
    -rwxr-xr-x 1 JSMITH 0 0 Jul 13 15:45 HelloCpp.C
    -rw-rw-rw- 1 JSMITH 0 0 Jul 13 16:24 archive.tar
    drwxrwsrwx 2 JSMITH 0 0 Jul 13 15:46 data
    -rw-rw---- 1 JSMITH 0 0 Jul 13 15:46 data/customers.txt
    -rw-rw---- 1 JSMITH 0 0 Jul 13 15:46 data/notes.txt
    lrwxrwxrwx 1 JSMITH 0 0 Jul 13 15:44 goodoleboys.sql => /home/jsmith/src/data/goodoleboys.sql
    lrwxrwxrwx 1 JSMITH 0 0 Jul 13 15:44 goodoleboys.txt => /home/jsmith/src/data/goodoleboys.txt
    lrwxrwxrwx 1 JSMITH 0 0 Jul 13 15:43 qcsrc=> /qsys.lib/jsmith.lib/qcsrc.file
    tar: 001-2298 For archive file tar and volume 1, 9 files were processed with 0 bytes read and 61440 bytes written.
Figure 16.6: By default, tar reads file names from standard input.
Figure 16.7 demonstrates restoring individual files from an archive. The archive contents are first shown using the t (List) option. The x (Extract) option extracts files that match the patterns passed on the command line. In this example, a wildcard is used. The pattern with the wildcard must be quoted on the Qshell command line to prevent Qshell from expanding it and allow the tar utility to expand the wildcard. Restore all files by specifying no filename parameters on the command line.
    tar tvf archive.tar
    drwxrwsrwx 2 JSMITH 0 0 Jul 13 16:45 .
    lrwxrwxrwx 1 JSMITH 0 0 Jul 13 15:44 ./goodoleboys.sql => /home/jsmith/src/data/goodoleboys.sql
    lrwxrwxrwx 1 JSMITH 0 0 Jul 13 15:44 ./goodoleboys.txt => /home/jsmith/src/data/goodoleboys.txt
    lrwxrwxrwx 1 JSMITH 0 0 Jul 13 15:43 ./qcsrc=> /qsys.lib/jsmith.lib/qcsrc.file
    -rwxr-xr-x 1 JSMITH 0 0 Jul 13 15:45 ./HelloCpp.C
    -rwxr-xr-x 1 JSMITH 0 0 Jul 13 15:45 ./HelloC.c
    drwxrwsrwx 2 JSMITH 0 0 Jul 13 15:46 ./data
    -rw-rw---- 1 JSMITH 0 0 Jul 13 15:46 ./data/customers.txt
    -rw-rw---- 1 JSMITH 0 0 Jul 13 15:46 ./data/notes.txt
    -rw-rw-rw- 1 JSMITH 0 0 Jul 13 16:45 ./archive.tar
    -rw-rw-rw- 1 JSMITH 0 0 Jul 13 16:26 ./test.tar
    tar: 001-2298 For archive file tar and volume 1, 11 files were processed with 0 bytes read and 92160 bytes written.
    tar xvf archive.tar './Hello*' ./data
    ./HelloCpp.C
    ./HelloC.c
    ./data
    ./data/customers.txt
    ./data/notes.txt
    tar: 001-2298 For archive file tar and volume 1, 11 files were processed with 0 bytes read and 92160 bytes written.
    tar xvf archive.tar
    .
    ./goodoleboys.sql
    ./goodoleboys.txt
    ./qcsrc
    ./HelloCpp.C
    ./HelloC.c
    ./data
    ./data/customers.txt
    ./data/notes.txt
    ./archive.tar
    ./test.tar
    tar: 001-2298 For archive file tar and volume 1, 11 files were processed with 0 bytes read and 92160 bytes written.
Figure 16.7: The tar utility also restores files from an archive.
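The effect of quoting the wildcard pattern can be checked without tar at all, by counting the arguments the shell would pass. This is an illustrative sketch, assuming a POSIX-style shell; the scratch directory and variable names are invented for the example.

```shell
# Hypothetical scratch directory holding two files that match the pattern.
work=${TMPDIR:-/tmp}/quote-demo.$$
mkdir -p "$work" && cd "$work" || exit 1
touch HelloC.c HelloCpp.C

set -- ./Hello*          # unquoted: the shell expands the pattern itself
unquoted_count=$#
set -- './Hello*'        # quoted: the pattern is passed through unchanged
quoted_count=$#
quoted_arg=$1

echo "unquoted: $unquoted_count arguments"
echo "quoted:   $quoted_count argument ($quoted_arg)"
```

With the quoted form, the utility receives the pattern itself and can match it against the names stored in the archive rather than the names in the current directory.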
The u (Update) option updates files in an archive. Updated files are inserted into the archive again. The listings in Figure 16.8 (tar tvf) show the contents of the tar file before and after the update. After the update, note that the files HelloC.c and HelloCpp.C are present in the archive twice with different file timestamps. When the HelloC.c file is extracted, the tar utility processes the archive sequentially (like a tape drive). The HelloC.c file is encountered twice and extracted each time. The result is that the latest version of the HelloC.c file, with timestamp "Jul 13 16:58", is the one that remains after extraction.
    tar tvf archive.tar
    -rwxr-xr-x 1 JSMITH 0 0 Jul 13 15:45 HelloC.c
    -rwxr-xr-x 1 JSMITH 0 0 Jul 13 15:45 HelloCpp.C
    drwxrwsrwx 2 JSMITH 0 0 Jul 13 15:46 data
    -rw-rw---- 1 JSMITH 0 0 Jul 13 15:46 data/customers.txt
    -rw-rw---- 1 JSMITH 0 0 Jul 13 15:46 data/notes.txt
    lrwxrwxrwx 1 JSMITH 0 0 Jul 13 16:54 goodoleboys.sql => /home/jsmith/src/data/goodoleboys.sql
    lrwxrwxrwx 1 JSMITH 0 0 Jul 13 16:54 goodoleboys.txt => /home/jsmith/src/data/goodoleboys.txt
    lrwxrwxrwx 1 JSMITH 0 0 Jul 13 16:54 qcsrc=> /qsys.lib/jsmith.lib/qcsrc.file
    tar: 001-2298 For archive file tar and volume 1, 10 files were processed with 0 bytes read and 215040 bytes written.
    ls -l Hello*
    -rwxr-xr-x 1 JSMITH 0 92160 Jul 13 16:58 HelloC.c
    -rwxr-xr-x 1 JSMITH 0 118 Jul 13 16:58 HelloCpp.C
    tar uvf archive.tar Hello*
    tar: 001-2315 The archive is being read to position to the end of the archive. done.
    HelloC.c
    HelloCpp.C
    tar: 001-2298 For archive file tar and volume 1, 10 files were processed with 0 bytes read and 114176 bytes written.
    tar tvf archive.tar
    -rwxr-xr-x 1 JSMITH 0 0 Jul 13 15:45 HelloC.c
    -rwxr-xr-x 1 JSMITH 0 0 Jul 13 15:45 HelloCpp.C
    drwxrwsrwx 2 JSMITH 0 0 Jul 13 15:46 data
    -rw-rw---- 1 JSMITH 0 0 Jul 13 15:46 data/customers.txt
    -rw-rw---- 1 JSMITH 0 0 Jul 13 15:46 data/notes.txt
    lrwxrwxrwx 1 JSMITH 0 0 Jul 13 16:54 goodoleboys.sql => /home/jsmith/src/data/goodoleboys.sql
    lrwxrwxrwx 1 JSMITH 0 0 Jul 13 16:54 goodoleboys.txt => /home/jsmith/src/data/goodoleboys.txt
    lrwxrwxrwx 1 JSMITH 0 0 Jul 13 16:54 qcsrc=> /qsys.lib/jsmith.lib/qcsrc.file
    -rwxr-xr-x 1 JSMITH 0 0 Jul 13 16:58 HelloC.c
    -rwxr-xr-x 1 JSMITH 0 0 Jul 13 16:58 HelloCpp.C
    tar: 001-2298 For archive file tar and volume 1, 10 files were processed with 0 bytes read and 215040 bytes written.
    tar xvf archive.tar HelloC.c
    HelloC.c
    HelloC.c
    tar: 001-2298 For archive file tar and volume 1, 12 files were processed with 0 bytes read and 307200 bytes written.
Figure 16.8: The u option updates files in an archive.
The Jar Utility
The jar (Java archive) utility includes data compression in the file format. It creates and extracts zip file archives, which is convenient because zip files are so widely used. Jar's zip files are compatible with the PkZip and WinZip applications that are typically used on Windows workstations.
The jar utility has syntax similar to tar. See chapter 23 for a detailed description of the jar utility.
The Pax Utility
The pax (portable archive exchange) utility enables access to less commonly used archive file formats, so you can use it to exchange data with older systems. The pax utility supports six different file formats, listed in Table 16.3.
Format | Description |
---|---|
cpio | A data format defined by the Posix 1003.2 standard |
bcpio | An older version of the cpio format; seldom used and not very portable |
sv4cpio | The UNIX System V release 4 version of the cpio file format |
sv4crc | The UNIX System V release 4 version of the cpio file format that uses file CRC checksums |
tar | An old tar file format used by BSD UNIX 4.3 |
ustar | The data format defined for tar by the Posix 1003.2 standard |
There are four main forms of pax: list, read, write, and copy. Each uses different arguments. Here is their basic syntax:
    pax [list-options] [-f archive-file] [file-pattern]
    pax -r [read-options] [-f archive-file] [file-pattern]
    pax -w [write-options] [-f archive-file] [files]
    pax -r -w [copy-options] [-f archive-file] [files] directory
Although pax might come in handy in some instances, it is infrequently used. Other than the tar format, the file formats that pax supports are not common ones.
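The write and list forms can be tried together in a few lines. This is a hedged sketch, assuming a POSIX-style shell with the command builtin; because pax is not installed on every system, the example checks for it first. The scratch directory, demo.tar, and the pax_status variable are hypothetical names used only for illustration.

```shell
# Hypothetical scratch directory and sample file.
work=${TMPDIR:-/tmp}/pax-demo.$$
mkdir -p "$work" && cd "$work" || exit 1
echo "hello" > HelloC.c

if command -v pax >/dev/null 2>&1; then
    pax -w -x ustar -f demo.tar HelloC.c   # write form, ustar format
    pax_listing=$(pax -f demo.tar)         # list form: neither -r nor -w
    echo "$pax_listing"
    pax_status=done
else
    pax_status=skipped                     # pax is not available here
    echo "pax not found; skipping"
fi
```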
Pax Options
The presence or absence of the option characters r and w controls the operation that pax should perform. Other options control the behavior of the list, read, write, or copy operations. The options for pax are listed in Table 16.4.
Option | Description |
---|---|
a | Append files to an existing archive file. This is used with the write operation. |
A | Use the pax utility as if it were the old tar utility. |
b | Specify a block size for writing to the archive file. The block size is a multiple of 512, and the maximum size is 32,256. End the blocksize parameter with a b or k to specify the parameter as the number of blocks or number of kilobytes. Permissible blocksize values depend on the file format. |
B | The maximum size of the archive file written is set to the size parameter. End the size parameter with a b, k, or m to specify it as the number of 512-byte blocks, kilobytes, or megabytes. |
c | Complement (invert) the file-pattern or files parameters. All individual files except those matching the file-pattern or files parameters are targeted for the operation. |
C <ccsid> | Convert data from CCSID 819 (ASCII) to the CCSID specified when files are extracted. |
d | Do not recurse into directories. |
D | This is similar to the -u option, but the file-inode change time is used instead of the file-modification time. |
E | This holds the number of errors tolerated while trying to read a corrupted archive. Valid values are none, zero, or another integer value. Don't use a value of none, however, because the pax utility could loop forever on a corrupted archive. |
f archive-file | Specify the archive file used. If not specified, the pax utility uses standard input or standard output for the archive data. The pax utility prompts for subsequent archive-volume file names if required. |
G <groupname> | Match files using the group name after the file-pattern or files parameters. If the groupname string starts with a pound sign (#), it is the group number instead of the name. |
i | Interactively prompt the user to rename files. Respond with a blank line to skip a file, a period (dot) to use the current name of the file, or the new file name that pax should use. |
H | Remove references to symbolic links for files specified on the command line. Archive the files referred to by the links instead of the symbolic links themselves. |
k | Keep (do not replace) existing files. |
l | Link files when doing a copy operation. If possible, hard links are used between the source files and the target files. The data is only present in the file system once. |
L | Remove references to all symbolic links encountered. Archive the files referred to by the links instead of the symbolic links themselves. |
n | Stop selecting files that match the file-pattern parameter after the first file matches. |
o | This is optional information to modify the behavior of the algorithm used to extract or write the particular archive file format. |
p | Indicate file privileges and settings. Use multiple privileges or multiple instances of the -p option to specify more than one privilege. a: Do not preserve file access times. e: Preserve all file privileges and settings of the files. m: Do not preserve modification times. o: Preserve the owner information (user ID and group ID). p: Preserve file mode (authority). |
P | Do not remove references to symbolic links. Archive the symbolic links themselves instead of the files referred to by the symbolic links. This is the default symbolic-link behavior. |
s | Substitute file-name text for files matched by the file-pattern or files parameter. The value of the subst string is specified as /match/replace/[gp], where match is a regular expression. (See chapter 18 for more information about regular expressions.) |
t | Retain the file-access times of any file that the pax utility accesses. File-access times are reset to the values that they had prior to pax accessing them. |
T [from][,to] | Match files using the time and date range after the file-pattern or files parameter. Either the from parameter or the to parameter may be omitted. Files older than the to parameter and newer than the from parameter are selected. |
u | Update files based on the file-modification time. Files extracted that are older than an existing file are skipped. Files written to an archive are skipped if they are older than existing files in the archive. |
U | Match files using the user name after the file-pattern or files parameters. If the user name starts with a pound sign (#), it is the user number instead of the name. |
v | Process the operation verbosely. |
x | Specify the archive file format. The file format is detected automatically for existing archive files. |
X | While processing directories recursively, do not process directories that have a different device ID (for example, symbolic links to directories in a different file system). |
Y | Similar to the -D option, but the file-inode change time is checked after any file-rename operations have finished. |
Z | Similar to the -u option, but the file-modification time is checked after any file-rename operations have finished. |
Pax Examples
Figure 16.9 shows how to create a tar archive with pax.
    pax -wvx ustar -f archive.tar *
    HelloC.c
    HelloCpp.C
    archive.tar
    data
    data/customers.txt
    data/notes.txt
    goodoleboys.sql
    goodoleboys.txt
    qcsrc.file
    pax: 001-2298 For archive file ustar and volume 1, 9 files were processed with 0 bytes read and 0 bytes written.
    ls -l archive.tar
    -rw-rw-rw- 1 JSMITH 0 122880 Jul 25 20:47 archive.tar
Figure 16.9: You can create a tar-format file archive using the pax utility.
You can also use pax to list the contents of a tar file archive, as shown in Figure 16.10. The x option is not required because pax automatically detects the file format of an existing archive.
    pax -vf archive.tar
    -rwxr-xr-x 1 JSMITH 0 0 Jul 13 17:01 HelloC.c
    -rwxr-xr-x 1 JSMITH 0 0 Jul 13 17:01 HelloCpp.C
    drwxrwsrwx 2 JSMITH 0 0 Jul 13 15:46 data
    -rw-rw---- 1 JSMITH 0 0 Jul 13 15:46 data/customers.txt
    -rw-rw---- 1 JSMITH 0 0 Jul 13 15:46 data/notes.txt
    lrwxrwxrwx 1 JSMITH 0 0 Jul 13 16:54 goodoleboys.sql => /home/jsmith/src/data/goodoleboys.sql
    lrwxrwxrwx 1 JSMITH 0 0 Jul 13 16:54 goodoleboys.txt => /home/jsmith/src/data/goodoleboys.txt
    lrwxrwxrwx 1 JSMITH 0 0 Jul 25 20:44 qcsrc.file => /qsys.lib/jsmith.lib/qcsrc.file
    pax: 001-2298 For archive file ustar and volume 1, 8 files were processed with 0 bytes read and 122880 bytes written.
Figure 16.10: The pax utility can also be used to list the contents of a tar archive.
Compress and Uncompress
Data compression is typically used in conjunction with archive files to reduce the size of archive files. The compress and uncompress utilities replace a file with a compressed or uncompressed version of the file. Any IFS file can be compressed, but the best compression with the compress utility is achieved for text files. The compress and uncompress utilities are virtually identical in syntax, and they operate as a pair:
    compress [options] [files]
    uncompress [options] [files]
Compressed files have a file extension of .Z. For example, the result of compressing a file named "readme.txt" is a smaller file named "readme.txt.Z." The uncompress utility is used on the .Z file to expand it to its original content.
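The replace-in-place behavior can be checked with a round trip. This is a minimal sketch, assuming a POSIX-style shell and that both compress and uncompress are installed (the example skips itself otherwise). The scratch directory, readme.orig, and the roundtrip variable are invented for illustration.

```shell
# Hypothetical scratch directory.
work=${TMPDIR:-/tmp}/compress-demo.$$
mkdir -p "$work" && cd "$work" || exit 1

# Build a repetitive text file so that compression actually shrinks it.
i=0
while [ "$i" -lt 200 ]; do
    echo "Larry    234-5678 234-6789 234-1111"
    i=$((i + 1))
done > readme.txt
cp readme.txt readme.orig      # keep a copy to compare against later

if command -v compress >/dev/null 2>&1 &&
   command -v uncompress >/dev/null 2>&1; then
    compress readme.txt        # replaces readme.txt with readme.txt.Z
    ls -l readme.txt.Z
    uncompress readme.txt.Z    # replaces readme.txt.Z with readme.txt
    if cmp -s readme.txt readme.orig; then
        roundtrip=ok
    else
        roundtrip=failed
    fi
else
    roundtrip=skipped          # utilities not available on this system
fi
echo "round trip: $roundtrip"
```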
You might be better off compressing data with the jar utility instead of these compression utilities. The jar utility creates and extracts zip files and may give you better data compression and better cross-platform portability.
The Compress and Uncompress Options
The options for compress and uncompress are listed in Table 16.5.
Option | Description |
---|---|
c | Write the output to standard output. The files given are not modified. |
f | Used only for the compress utility, to force compression even if no reduction in size occurs. |
v | Print compression information for each file. |
b | The compress and uncompress tools use a modified Lempel-Ziv compression algorithm. Use this option to specify the maximum number of bits used for the replacement codes in the algorithm. The bits parameter must be from 9 to 16. In all but the most advanced cases, you won't need to specify the number of bits to use because compress and uncompress choose appropriately. |
The compress and uncompress utilities are very simple. Enter compress or uncompress with the file names you want to compress listed as arguments. In Figure 16.11, for example, the archive.tar file contains only text data. The compress utility reduces the size of the archive.tar file significantly.
    ls -l archive*
    -rw-rw-rw- 1 JSMITH 0 10240 Jul 25 22:32 archive.tar
    compress -v archive.tar
    archive.tar.Z: 93.03% compression
    ls -l archive*
    -rw-rw-rw- 1 JSMITH 0 714 Jul 25 22:32 archive.tar.Z
    uncompress -v archive.tar.Z
    ls -l archive*
    -rw-rw-rw- 1 JSMITH 0 10240 Jul 25 22:32 archive.tar
Figure 16.11: Compression has a significant effect on text files.
Use the zcat utility to copy the contents of a compressed file to stdout. The first ls command in Figure 16.12 shows that the file fone.txt is the only file whose name begins with the string fone. After compression, fone.txt has been replaced with fone.txt.Z. The zcat utility writes the uncompressed contents of the compressed file to stdout.
    ls -l fone*
    -rw-rw---- 1 JSMITH 0 251 Dec 25 13:41 fone.txt
    /home/JSMITH $ cat fone.txt
    Name     Phone    Fax      Cell
    ======== ======== ======== ========
    Larry    234-5678 234-6789 234-1111
    Moe      345-6789 345-7890 345-2022
    Abbott   456-7890 456-8901 none
    Costello 987-6543 none     987-3323
    Curly    876-5432 876-4321 876-4441
    /home/JSMITH $ compress -v fone.txt
    fone.txt.Z: 29.48% compression
    /home/JSMITH $ ls -l fone*
    -rw-rw---- 1 JSMITH 0 177 Dec 25 13:41 fone.txt.Z
    /home/JSMITH $ zcat fone.txt.Z
    Name     Phone    Fax      Cell
    ======== ======== ======== ========
    Larry    234-5678 234-6789 234-1111
    Moe      345-6789 345-7890 345-2022
    Abbott   456-7890 456-8901 none
    Costello 987-6543 none     987-3323
    Curly    876-5432 876-4321 876-4441
Figure 16.12: Use zcat to view the contents of a compressed file.
Summary
Archive files reduce the complexity of manipulating large numbers of files in Qshell. The data-compression utilities available in Qshell can also reduce transfer time and storage requirements when dealing with large archives. The Java jar utility can be used as a very portable data-compression tool. The pax utility is dated and seldom used.