A Practical Guide to UNIX for Mac OS X Users

This section describes utilities that copy, move, print, search through, display, sort, and compare files.

Caution: Resource forks: some files have additional data

All files, except for empty files, contain data. Unlike traditional UNIX files, Mac OS X files can contain additional information about a file. This information is kept in forks. The data is kept in the data fork. Additional information is kept in the resource fork. The resource fork can be empty or it can hold important information. For more information refer to "Resource forks" on page 93.

Under Mac OS X 10.4 and later, all of the utilities in this section, and almost all other utilities that come with Mac OS X, recognize and work with both data and resource forks. An exception occurs when you redirect the output of a utility. Refer to "| (Pipe): Communicates Between Processes" on page 51 for more information.

Under Mac OS X 10.3 and earlier, many utilities recognize only the data fork of a file. If you are not sure if a utility respects resource forks, make a copy of the file using the Finder, which always respects resource forks, before modifying a file with the suspect utility.

cp: Copies a File

The cp (copy) utility (Figure 3-2, next page) makes a copy of a file. This utility can copy any file, including text and executable program (binary) files. You can use cp to make a backup copy of a file or a copy to experiment with. Under Mac OS X 10.4 and later, cp copies resource forks; under version 10.3 and earlier, it does not. Refer to the preceding tip for information on resource forks. See ditto (next) if you need to copy resource forks under earlier versions of OS X.

Figure 3-2. cp copies a file

$ ls
memo
$ cp memo memo.copy
$ ls
memo memo.copy

A cp command line uses the following syntax to specify source and destination files:

cp source-file destination-file

The source-file is the name of the file that cp will copy. The destination-file is the name that cp assigns to the resulting (new) copy of the file.

Tip: cp can destroy a file

If the destination-file exists before you give a cp command, cp overwrites it. Because cp overwrites (and destroys the contents of) an existing destination-file without warning, you must take care not to cause cp to overwrite a file that you need. The cp -i (interactive) option (see page 29 for a tip on options) prompts you before it overwrites a file.

The following example assumes that the file named orange.2 exists before you give the cp command. The user answers y to overwrite the file.

$ cp -i orange orange.2
overwrite orange.2? (y/n [n]) y
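To watch the prompt protect a file without typing at the keyboard, the following sketch sends the answer n to cp -i on standard input. The names orange and orange.2 come from the example above; the scratch directory and the file contents are assumptions made for illustration only.

```shell
#!/bin/sh
# Sketch: decline the cp -i prompt and confirm the destination is untouched.
workdir=$(mktemp -d)               # scratch directory (illustrative)
cd "$workdir" || exit 1

echo "new text" > orange
echo "old text" > orange.2

# cp -i reads the reply from standard input; "n" declines the overwrite.
# Some versions of cp report a nonzero status when a copy is skipped,
# so the status is ignored here.
printf 'n\n' | cp -i orange orange.2 || true

kept=$(cat orange.2)
echo "orange.2 still contains: $kept"

cd / && rm -rf "$workdir"          # remove the scratch directory
```

Answering y instead would replace the contents of orange.2 with those of orange.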

The command line shown in Figure 3-2 copies the file named memo to memo.copy. The period is part of the filename, just another character. The initial ls command shows that memo is the only file in the directory. After the cp command, the second ls shows both files, memo and memo.copy, in the directory.

Sometimes it is useful to incorporate the date into the name of a copy of a file. The following example includes the date January 30 (0130):

$ cp memo memo.0130

Although it has no significance to the system, the date can help you find a version of a file that you created on a certain date. The date can also help you avoid overwriting existing files by providing a unique filename each day. Refer to "Filenames" on page 74.
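One way to combine the dated filename with protection against overwriting is sketched below. It builds the suffix with the date utility rather than typing it by hand and copies only when no file of that name exists. The scratch directory, the variable names stamp and backup, and the contents of memo are illustrative assumptions, not part of the original example.

```shell
#!/bin/sh
# Sketch: make a dated copy of memo without overwriting an existing copy.
workdir=$(mktemp -d)               # scratch directory (illustrative)
cd "$workdir" || exit 1
echo "draft text" > memo

stamp=$(date +%m%d)                # month and day, for example 0130
backup="memo.$stamp"

if [ -e "$backup" ]; then
    echo "$backup already exists; not overwriting it"
else
    cp memo "$backup"
fi

listing=$(ls)                      # memo plus the dated copy
echo "$listing"

cd / && rm -rf "$workdir"          # remove the scratch directory
```

Running the same commands a second time on the same day would leave the first dated copy in place, because the test finds that the name is already taken.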

Use scp (page 832) or ftp (page 738) when you need to copy a file from one system to another on a network.

ditto: Copies Files and Directories

The ditto utility is similar to cp but is not available on traditional UNIX systems. Without any options, ditto copies files as well as directories and their contents. It can also create archives (page 921) of files or extract files from archives. For example, the following command makes a dated copy of the Documents directory:

$ ditto Documents Documents.0130

This command copies the Documents directory hierarchy, preserving the resource forks of the files it copies. If you are running Mac OS X 10.3 or earlier, you must specify the -rsrc option if you want to copy resource forks. This option is not required under 10.4 and later. For more information on options, refer to page 115.

Unlike cp under Mac OS X 10.3 and earlier, the ditto utility not only copies the data a file contains (as does cp), but also copies resource forks (see the tip on page 43). Refer to page 715 for more information on ditto.

mv: Changes the Name of a File

The mv (move) utility renames a file without making a copy of it. The mv command line specifies an existing file and a new filename using the same syntax as cp:

mv existing-filename new-filename

The command line in Figure 3-3 changes the name of the file memo to memo.0130. The initial ls command shows that memo is the only file in the directory. After you give the mv command, memo.0130 is still the only file in the directory. Compare this result to the earlier cp example.

Figure 3-3. mv renames a file

$ ls
memo
$ mv memo memo.0130
$ ls
memo.0130

The mv utility can be used for more than changing the name of a file. Refer to "mv, cp: Moves or Copies a File" on page 84.

Using the mv utility is analogous to using the Finder to rename a file or move a file from one folder to another. This utility also supports resource forks. Refer to page 792 for more information on mv.
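The difference between copying and renaming can be seen side by side in the short sketch below, which repeats the cp and mv examples in a scratch directory. The directory and the contents of memo are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: cp leaves two files behind; mv leaves one file under a new name.
workdir=$(mktemp -d)               # scratch directory (illustrative)
cd "$workdir" || exit 1
echo "text" > memo

cp memo memo.copy
after_cp=$(ls)                     # memo and memo.copy

rm memo.copy                       # start again with memo alone
mv memo memo.0130
after_mv=$(ls)                     # memo.0130 only

echo "after cp:"; echo "$after_cp"
echo "after mv:"; echo "$after_mv"

cd / && rm -rf "$workdir"          # remove the scratch directory
```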

lpr: Prints a File

The lpr (line printer) utility places one or more files in a print queue for printing. Mac OS X provides print queues so that only one job gets printed on a given printer at a time. Such a queue allows several people or jobs to send output simultaneously to a single printer with the expected results. On systems with access to more than one printer, use lpstat -p to display a list of available printers. Use the -P option to instruct lpr to place the file in the queue for a specific printer, including one that is connected to another system on the network. The following command prints the file named report:

$ lpr report

Because this command does not specify a printer, the output goes to the default printer, which is the only printer if the system has just one.

The next command line prints the same file on the printer named mailroom:

$ lpr -Pmailroom report

You can see which jobs are in the print queue by using the lpq utility:

$ lpq
lp is ready and printing
Rank    Owner   Job     Files                 Total Size
active  alex    86      (standard input)      954061 bytes

In this example, Alex has one job that is being printed; no other jobs are in the queue. You can use the job number, 86 in this case, with the lprm utility to remove the job from the print queue and stop it from printing:

$ lprm 86

You can send more than one file to the printer with a single command. The following command line prints three files on the printer named laser1:

$ lpr -Plaser1 05.txt 108.txt 12.txt

Refer to page 774 for more information on lpr.

grep: Searches for a String

The grep[1] utility searches through one or more files to see whether any contain a specified string of characters. It does not change the file it searches but simply displays each line that contains the string.

[1] Originally the name grep was a play on an ed (an original UNIX editor, available on Mac OS X) command: g/re/p. In this command the g stands for global, re is a regular expression delimited by slashes, and p means print.

The grep command in Figure 3-4 searches through the file memo for lines that contain the string credit and displays a single line that meets this criterion. If memo contained such words as discredit, creditor, or accreditation, grep would have displayed those lines as well because they contain the string it was searching for. The -w option causes grep to match only whole words. You do not need to enclose the search string in single quotation marks, but doing so allows you to put SPACEs and special characters in it.

Figure 3-4. grep searches for a string

$ cat memo
Helen:
In our meeting on June 6 we
discussed the issue of credit.
Have you had any further thoughts about it?
Alex
$ grep 'credit' memo
discussed the issue of credit.

The grep utility can do much more than search for a simple string in a single file. Refer to grep on page 751 and in Appendix A, "Regular Expressions," for more information.
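The effect of the -w option is easiest to see on a file that contains the search string both as a whole word and inside longer words. The sketch below uses a hypothetical file named notes with three made-up lines; only the grep options themselves come from the text above.

```shell
#!/bin/sh
# Sketch: compare a plain string search with a whole-word search.
workdir=$(mktemp -d)               # scratch directory (illustrative)
cd "$workdir" || exit 1
printf '%s\n' 'the issue of credit.' \
              'a creditor called' \
              'do not discredit it' > notes

plain=$(grep 'credit' notes)       # every line containing the string
whole=$(grep -w 'credit' notes)    # only lines where it is a whole word

echo "without -w:"; echo "$plain"
echo "with -w:";    echo "$whole"

cd / && rm -rf "$workdir"          # remove the scratch directory
```

Without -w all three lines are displayed; with -w only the first line is, because the period that follows credit is not part of a word.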

head: Displays the Beginning of a File

By default the head utility displays the first ten lines of a file. You can use head to help you remember what a particular file contains. If you have a file named months that lists the 12 months of the year in calendar order, one to a line, head displays Jan through Oct (Figure 3-5).

Figure 3-5. head displays the first lines of a file

$ cat months
Jan
Feb
Mar
Apr
May
Jun
Jul
Aug
Sep
Oct
Nov
Dec
$ head months
Jan
Feb
Mar
Apr
May
Jun
Jul
Aug
Sep
Oct

This utility can display any number of lines, so you can use it to look at only the first line of a file, at a full screen, or even more. To specify the number of lines head displays, include a hyphen followed by the number of lines in the head command. For example, the following command displays only the first line of months:

$ head -1 months
Jan

Refer to page 759 for more information on head.
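The line count can also be given with the -n option, which is the spelling defined by the POSIX standard and is a reasonable choice in scripts that may run on other systems. The sketch below rebuilds the months file in a scratch directory (an assumption for illustration) and asks for the first line and the first three lines.

```shell
#!/bin/sh
# Sketch: select the number of lines head displays with -n.
workdir=$(mktemp -d)               # scratch directory (illustrative)
cd "$workdir" || exit 1
printf '%s\n' Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec > months

first=$(head -n 1 months)          # Jan
first_three=$(head -n 3 months)    # Jan, Feb, Mar

echo "$first"
echo "$first_three"

cd / && rm -rf "$workdir"          # remove the scratch directory
```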

tail: Displays the End of a File

The tail utility is similar to head but by default displays the last ten lines of a file. Depending on how you invoke it, the tail utility can display fewer or more than ten lines, use a count of characters rather than lines to display parts of a file, and display lines being added to a file that is changing. The following command causes tail to display the last five lines, Aug through Dec, of the months file shown in Figure 3-5:

$ tail -5 months
Aug
Sep
Oct
Nov
Dec

You can monitor lines as they are added to the end of the file named logfile with the following command:

$ tail -f logfile

Press the interrupt key (usually CONTROL-C) to stop tail and display the shell prompt. Refer to page 859 for more information on tail.
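The following sketch illustrates counting by lines and counting by characters with tail. It assumes the same months file, rebuilt here in a scratch directory, and uses the -n (lines) and -c (bytes) options; each month abbreviation occupies four bytes, three letters plus the NEWLINE that ends the line.

```shell
#!/bin/sh
# Sketch: display the end of a file by line count and by byte count.
workdir=$(mktemp -d)               # scratch directory (illustrative)
cd "$workdir" || exit 1
printf '%s\n' Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec > months

last_two=$(tail -n 2 months)       # last two lines: Nov, Dec
last_bytes=$(tail -c 4 months)     # last four bytes: "Dec" plus NEWLINE

echo "$last_two"
echo "$last_bytes"

cd / && rm -rf "$workdir"          # remove the scratch directory
```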

sort: Displays a File in Order

The sort utility displays the contents of a file in order by lines but does not change the original file. If a file named days contains the name of each day of the week in calendar order, each on a separate line, sort displays the file in alphabetical order (Figure 3-6).

Figure 3-6. sort displays a file in order

$ cat days
Monday
Tuesday
Wednesday
Thursday
Friday
Saturday
Sunday
$ sort days
Friday
Monday
Saturday
Sunday
Thursday
Tuesday
Wednesday

The sort utility is useful for putting lists in order. The -u option generates a sorted list in which each line is unique (no duplicates). The -n option puts a list of numbers in order. Refer to page 837 for more information on sort.
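The need for the -n option shows up as soon as a list holds numbers of different lengths. In the sketch below, a hypothetical file named numbers holds four values, one of them repeated; LC_ALL=C is set only so that the character-by-character ordering is the same on every system.

```shell
#!/bin/sh
# Sketch: character order versus numeric order, with and without duplicates.
workdir=$(mktemp -d)               # scratch directory (illustrative)
cd "$workdir" || exit 1
LC_ALL=C; export LC_ALL            # fixed collation for repeatable output
printf '%s\n' 10 9 100 9 > numbers

by_char=$(sort numbers)            # 10 100 9 9  (compares characters)
by_value=$(sort -n numbers)        # 9 9 10 100  (compares values)
distinct=$(sort -n -u numbers)     # 9 10 100    (duplicates dropped)

echo "sort:      " $by_char
echo "sort -n:   " $by_value
echo "sort -n -u:" $distinct

cd / && rm -rf "$workdir"          # remove the scratch directory
```

Without -n, sort places 100 before 9 because the character 1 sorts before the character 9.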

uniq: Removes Duplicate Lines from a File

The uniq (unique) utility displays a file, skipping adjacent duplicate lines, but does not change the original file. If a file contains a list of names and has two successive entries for the same person, uniq skips the extra line (Figure 3-7).

Figure 3-7. uniq removes duplicate lines

$ cat dups
Cathy
Fred
Joe
John
Mary
Mary
Paula
$ uniq dups
Cathy
Fred
Joe
John
Mary
Paula

If a file is sorted before it is processed by uniq, this utility ensures that no two lines in the file are the same. (Of course, sort can do that all by itself with the -u option.) Refer to page 885 for more information on uniq.
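Because uniq compares only adjacent lines, a duplicate that is separated from its twin survives. The sketch below uses a hypothetical file named names in which Mary appears three times, once apart from the others, and compares uniq on the unsorted file, uniq on a sorted copy, and sort -u.

```shell
#!/bin/sh
# Sketch: uniq removes only adjacent duplicates; sorting first removes all.
workdir=$(mktemp -d)               # scratch directory (illustrative)
cd "$workdir" || exit 1
LC_ALL=C; export LC_ALL            # fixed collation for repeatable output
printf '%s\n' Mary Joe Mary Mary > names

unsorted=$(uniq names)             # Mary Joe Mary (first Mary is not adjacent)
sort names > names.sorted
sorted=$(uniq names.sorted)        # Joe Mary
direct=$(sort -u names)            # Joe Mary, in one step

echo "uniq names:       " $unsorted
echo "uniq names.sorted:" $sorted
echo "sort -u names:    " $direct

cd / && rm -rf "$workdir"          # remove the scratch directory
```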

diff: Compares Two Files

The diff (difference) utility compares two files and displays a list of the differences between them. This utility does not change either file and is useful when you want to compare two versions of a letter or a report or two versions of the source code for a program.

The diff utility with the -u (unified output format) option first displays two lines indicating which of the files you are comparing will be denoted by a plus sign (+) and which by a minus sign (-). In Figure 3-8, a minus sign indicates the colors.1 file; a plus sign, the colors.2 file.

Figure 3-8. diff displaying the unified output format

$ diff -u colors.1 colors.2
--- colors.1 Fri Nov 25 15:45:32 2005
+++ colors.2 Fri Nov 25 15:24:46 2005
@@ -1,6 +1,5 @@
 red
+blue
 green
 yellow
-pink
-purple
 orange

The diff -u command breaks long, multiline text into hunks. Each hunk is preceded by a line that starts and ends with two at signs (@@). This hunk identifier indicates the starting line number and the number of lines from each file for this hunk. In Figure 3-8, -1,6 indicates that the hunk covers the section of the colors.1 file (denoted by a minus sign) that starts at the first line and spans six lines. Similarly, +1,5 indicates that the hunk covers the section of colors.2 that starts at the first line and spans five lines.

Following these header lines, diff -u displays each line of text with a leading minus sign, plus sign, or nothing. A leading minus sign indicates that the line occurs only in the file denoted by the minus sign. A leading plus sign indicates that the line comes from the file denoted by the plus sign. A line that begins with neither a plus sign nor a minus sign occurs in both files at the same location. Refer to page 707 for more information on diff.
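The hunk identifier can be checked against the files themselves. The sketch below rebuilds colors.1 and colors.2 from the lines shown in Figure 3-8 (the contents are inferred from that figure), runs diff -u, and compares the counts in the identifier with the number of lines in each file. The file name colors.diff and the variable names are illustrative.

```shell
#!/bin/sh
# Sketch: rebuild the files from Figure 3-8 and read the hunk identifier.
workdir=$(mktemp -d)               # scratch directory (illustrative)
cd "$workdir" || exit 1
printf '%s\n' red green yellow pink purple orange > colors.1
printf '%s\n' red blue green yellow orange > colors.2

# diff exits with status 1 when the files differ, so record the status
# instead of treating it as a failure.
status=0
diff -u colors.1 colors.2 > colors.diff || status=$?

hunk=$(grep '^@@' colors.diff)             # the hunk identifier line
old_lines=$(wc -l < colors.1 | tr -d ' ')  # lines in colors.1
new_lines=$(wc -l < colors.2 | tr -d ' ')  # lines in colors.2

echo "$hunk"
echo "colors.1 has $old_lines lines; colors.2 has $new_lines lines"

cd / && rm -rf "$workdir"          # remove the scratch directory
```

The second number in each pair of the identifier matches the line count of the corresponding file because the single hunk covers both files in full.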

file: Tests the Contents of a File

You can use the file utility to learn about the contents of any file on a Mac OS X system without having to open and examine the file yourself. In the following example, file reports that letter_e.bz2 contains data that was compressed by the bzip2 utility (page 54):

$ file letter_e.bz2
letter_e.bz2: bzip2 compressed data, block size = 900k

Next file reports on two more files:

$ file memo picture.jpg
memo: ASCII text
picture.jpg: JPEG image data, JFIF standard 1.01

The file utility ignores creator codes and Macintosh file types (both on page 96); it just looks at the contents of a file (the data fork; see the tip on page 43). Refer to page 726 for more information on file.

| (Pipe): Communicates Between Processes

Because pipes are integral to the functioning of a Mac OS X system, they are introduced here for use in examples. Pipes are covered in detail on page 128.

A process is the execution of a command by Mac OS X (page 119). Communication between processes is one of the hallmarks of UNIX and UNIX-like systems. A pipe (written as a vertical bar, |, on the command line and appearing as a solid or broken vertical line on a keyboard) provides the simplest form of this kind of communication. Simply put, a pipe takes the output of one utility and sends that output as input to another utility. Using UNIX terminology, a pipe takes standard output of one process and redirects it to become standard input of another process. (For more information refer to "Standard Input and Standard Output" on page 120.) Most of what a process displays on the screen is sent to standard output. If you do not redirect it, this output appears on the screen. Using a pipe, you can redirect the output so that it becomes instead standard input of another utility. A utility such as head can take its input from a file whose name you specify on the command line following the word head, or it can take its input from standard input. For example, you can give the command shown in Figure 3-5 on page 47 as follows:

$ cat months | head
Jan
Feb
Mar
Apr
May
Jun
Jul
Aug
Sep
Oct

The next command displays the number of files in a directory. The wc (word count) utility with the -w option displays the number of words in its standard input or in a file you specify on the command line:

$ ls | wc -w
14
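Counting words gives the number of files only when no filename contains a SPACE, which is a common situation on Mac OS X. The sketch below, using two made-up filenames in a scratch directory, compares wc -w with wc -l, which counts the lines that ls writes when its output goes to a pipe (one filename per line).

```shell
#!/bin/sh
# Sketch: count files with wc -w and wc -l when a name contains a SPACE.
workdir=$(mktemp -d)               # scratch directory (illustrative)
cd "$workdir" || exit 1
touch memo "my notes"              # two files, one with a SPACE in its name

words=$(ls | wc -w | tr -d ' ')    # counts words: memo, my, notes
lines=$(ls | wc -l | tr -d ' ')    # counts lines: one per filename

echo "wc -w reports $words; wc -l reports $lines"

cd / && rm -rf "$workdir"          # remove the scratch directory
```

The tr command only strips the padding that some versions of wc put in front of the number.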

You can use a pipe to send output of a program to the printer:

$ tail months | lpr

Caution: Pipes do not work with resource forks

Pipes work with the data fork of a file only; they do not work with resource forks (page 43). Although this limitation is probably not an issue now, it may become more important as you read on. For more information see the "Redirection does not support resource forks" tip on page 94.
