A Practical Guide to UNIX for Mac OS X Users
As discussed on page 262, before a process can read from or write to a file it must open that file. When a process opens a file, Mac OS X associates a number (called a file descriptor) with the file. Each process has its own set of open files and its own file descriptors. After opening a file, a process reads from and writes to that file by referring to its file descriptor. When it no longer needs the file, the process closes the file, freeing the file descriptor. A typical Mac OS X process starts with three open files: standard input (file descriptor 0), standard output (file descriptor 1), and standard error (file descriptor 2). Often those are the only files the process needs. Recall that you redirect standard output with the symbol > or the symbol 1> and that you redirect standard error with the symbol 2>. Although you can redirect other file descriptors, because file descriptors other than 0, 1, and 2 do not have any special conventional meaning, it is rarely useful to do so. The exception is in programs that you write yourself, in which case you control the meaning of the file descriptors and can take advantage of redirection. Opening a file descriptor The Bourne Again Shell opens files using the exec builtin as follows: exec n> outfile exec m< infile The first line opens outfile for output and holds it open, associating it with file descriptor n. The second line opens infile for input and holds it open, associating it with file descriptor m. Duplicating a file descriptor The <& token duplicates an input file descriptor; use >& to duplicate an output file descriptor. You can duplicate a file descriptor by making it refer to the same file as another open file descriptor, such as standard input or output. Use the following format to open or redirect file descriptor n as a duplicate of file descriptor m: exec n<&m Once you have opened a file, you can use it for input and output in two different ways. 
First, you can use I/O redirection on any command line, redirecting standard output to a file descriptor with >&n or redirecting standard input from a file descriptor with <&n. Second, you can use the read (page 571) and echo builtins. If you invoke other commands, including functions (page 314), they inherit these open files and file descriptors. When you have finished using a file, you can close it with exec n<&-
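The open, use, and close steps described above can be sketched as follows. This is a minimal illustration, not an example from the text: the scratch filename and the variable names (tmpfile, line1, line2, line3) are assumptions made for the sketch.

```bash
# Hypothetical scratch file used only for this sketch.
tmpfile="${TMPDIR:-/tmp}/fd_demo.$$"
printf 'first line\nsecond line\n' > "$tmpfile"

exec 3< "$tmpfile"     # open the file for input on file descriptor 3
read line1 <&3         # each read consumes the next line from descriptor 3
read line2 <&3
exec 3<&-              # close file descriptor 3

exec 4> "$tmpfile"     # reopen the same file for output on file descriptor 4
echo "written through descriptor 4" >&4
exec 4<&-              # close file descriptor 4
line3=$(cat "$tmpfile")

rm -f "$tmpfile"
echo "$line1 / $line2 / $line3"
```

Because the file stays open between the two read commands, the second read continues where the first one stopped instead of starting again at the top of the file.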
When you invoke the shell function in the next example, named mycp, with two arguments, it copies the file named by the first argument to the file named by the second argument. If you supply only one argument, the function copies the file named by the argument to standard output. If you invoke mycp with no arguments, it copies standard input to standard output.

Tip: A function is not a shell script
The mycp example is a shell function; it will not work as you expect if you execute it as a shell script. (It will work: The function will be created in a very short-lived subshell, which is probably of little use.) You can enter this function from the keyboard. If you put the function in a file, you can run it as an argument to the . (dot) builtin (page 261). You can also put the function in a startup file if you want it to be always available (page 315).

    function mycp () {
        case $# in
        0)
            # zero arguments
            # file descriptor 3 duplicates standard input
            # file descriptor 4 duplicates standard output
            exec 3<&0 4<&1
            ;;
        1)
            # one argument
            # open the file named by the argument for input
            # and associate it with file descriptor 3
            # file descriptor 4 duplicates standard output
            exec 3< $1 4<&1
            ;;
        2)
            # two arguments
            # open the file named by the first argument for input
            # and associate it with file descriptor 3
            # open the file named by the second argument for output
            # and associate it with file descriptor 4
            exec 3< $1 4> $2
            ;;
        *)
            echo "Usage: mycp [source [dest]]"
            return 1
            ;;
        esac

        # call cat with input coming from file descriptor 3
        # and output going to file descriptor 4
        cat <&3 >&4

        # close file descriptors 3 and 4
        exec 3<&- 4<&-
    }

The real work of this function is done in the line that begins with cat. The rest of the function arranges for file descriptors 3 and 4, which are the input and output of the cat command, to be associated with the appropriate files. With its output redirected, the cat utility supports only data forks.
Files that have resource forks or other metadata will not be copied properly by this script. For more information see the "Redirection does not support resource forks" tip on page 94.
Parameters and Variables
Shell parameters and variables were introduced on page 278. This section adds to the previous coverage with a discussion of array variables, global versus local variables, special and positional parameters, and expanding null and unset variables. Array Variables
The Bourne Again Shell supports one-dimensional array variables. The subscripts are integers with zero-based indexing (i.e., the first element of the array has the subscript 0). The following format declares and assigns values to an array: name=(element1 element2 ...)
The following example assigns four values to the array NAMES: $ NAMES=(max helen sam zach)
You reference a single element of an array as follows: $ echo ${NAMES[2]} sam
The subscripts [*] and [@] both extract the entire array but work differently when used within double quotation marks. An @ produces an array that is a duplicate of the original array; an * produces a single element of an array (or a plain variable) that holds all the elements of the array separated by the first character in IFS (normally a SPACE). In the following example, the array A is filled with the elements of the NAMES variable using an *, and B is filled using an @. The declare builtin with the -a option displays the values of the arrays (and reminds you that bash uses zero-based indexing for arrays):

    $ A=("${NAMES[*]}")
    $ B=("${NAMES[@]}")
    $ declare -a
    declare -a A='([0]="max helen sam zach")'
    declare -a B='([0]="max" [1]="helen" [2]="sam" [3]="zach")'
    ...
    declare -a NAMES='([0]="max" [1]="helen" [2]="sam" [3]="zach")'

From the output of declare, you can see that NAMES and B have multiple elements. In contrast, A, which was assigned its value with an * within double quotation marks, has only one element: A has all its elements enclosed between double quotation marks. In the next example, echo attempts to display element 1 of array A. Nothing is displayed because A has only one element and that element has an index of 0. Element 0 of array A holds all four names. Element 1 of B holds the second item in the array and element 0 holds the first item.

    $ echo ${A[1]}

    $ echo ${A[0]}
    max helen sam zach
    $ echo ${B[1]}
    helen
    $ echo ${B[0]}
    max
You can apply the ${#name[*]} operator to array variables, returning the number of elements in the array: $ echo ${#NAMES[*]} 4
The same operator, when given the index of an element of an array in place of *, returns the length of the element: $ echo ${#NAMES[1]} 5 You can use subscripts on the left side of an assignment statement to replace selected elements of the array: $ NAMES[1]=alex $ echo ${NAMES[*]} max alex sam zach Locality of Variables
By default variables are local to the process in which they are declared. Thus a shell script does not have access to variables declared in your login shell unless you explicitly make the variables available (global). Under bash, export makes a variable available to child processes. Under tcsh, setenv (page 353) assigns a value to a variable and makes it available to child processes. The examples in this section use the bash syntax but the theory applies to both shells. Once you use the export builtin with a variable name as an argument, the shell places the value of the variable in the calling environment of child processes. This call by value gives each child process a copy of the variable for its own use. The following extest1 shell script assigns a value of american to the variable named cheese and then displays its filename (extest1) and the value of cheese. The extest1 script then calls subtest, which attempts to display the same information. Next subtest declares a cheese variable and displays its value. When subtest finishes, it returns control to the parent process, which is executing extest1. At this point extest1 again displays the value of the original cheese variable. $ cat extest1 cheese=american echo "extest1 1: $cheese" subtest echo "extest1 2: $cheese" $ cat subtest echo "subtest 1: $cheese" cheese=swiss echo "subtest 2: $cheese" $ extest1 extest1 1: american subtest 1: subtest 2: swiss extest1 2: american The subtest script never receives the value of cheese from extest1, and extest1 never loses the value. Unlike in the real world, a child can never affect its parent's attributes. When a process attempts to display the value of a variable that has not been declared, as is the case with subtest, the process displays nothing; the value of an undeclared variable is that of a null string. 
The following extest2 script is the same as extest1 except that it uses export to make cheese available to the subtest script: $ cat extest2 export cheese=american echo "extest2 1: $cheese" subtest echo "extest2 2: $cheese" $ extest2 extest2 1: american subtest 1: american subtest 2: swiss extest2 2: american Here the child process inherits the value of cheese as american and, after displaying this value, changes its copy to swiss. When control is returned to the parent, the parent's copy of cheese retains its original value: american. An export builtin can optionally include an assignment: export cheese=american The preceding statement is equivalent to the following two statements: cheese=american export cheese
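The call-by-value behavior of export can also be seen without separate script files. The sketch below is an illustration under assumptions, not an example from the text: it uses sh -c to start a child process in place of the subtest script, and the variable names before and after are hypothetical.

```bash
unset cheese                         # start from a clean slate for this sketch
cheese=american                      # not exported: child processes do not see it
before=$(sh -c 'echo "child sees: [$cheese]"')

export cheese                        # now placed in the environment of child processes
after=$(sh -c 'echo "child sees: [$cheese]"; cheese=swiss')

echo "$before"
echo "$after"
echo "parent still has: $cheese"     # the child changed only its own copy
```

The second child assigns swiss to its copy of cheese, yet the parent's value is unchanged when control returns, just as in extest2.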
Although it is rarely done, you can export a variable before you assign a value to it. You do not need to export an already-exported variable a second time after you change its value. For example, you do not usually need to export PATH when you assign a value to it in ~/.bash_profile because it is typically exported in the /etc/profile global startup file. Functions
Because functions run in the same environment as the shell that calls them, variables are implicitly shared by a shell and a function it calls. $ function nam () { > echo $myname > myname=zach > } $ myname=sam $ nam sam $ echo $myname zach
In the preceding example, the myname variable is set to sam in the interactive shell. Then the nam function is called. It displays the value of myname it has (sam) and sets myname to zach. The final echo shows that, in the interactive shell, the value of myname has been changed to zach.

Function local variables
Local variables are helpful in a function written for general use. Because the function is called by many scripts that may be written by different programmers, you need to make sure that the names of the variables used within the function do not interact with variables of the same name in the programs that call the function. Local variables eliminate this problem. When used within a function, the typeset builtin declares a variable to be local to the function it is defined in. The next example shows the use of a local variable in a function. It uses two variables named count. The first is declared and assigned a value of 10 in the interactive shell. Its value never changes, as echo verifies after count_down is run. The other count is declared, using typeset, to be local to the function. Its value, which is unknown outside the function, ranges from 4 to 1, as the echo command within the function confirms. The example shows the function being entered from the keyboard; it is not a shell script. (See the tip "A function is not a shell script" on page 556.)

    $ function count_down () {
    > typeset count
    > count=$1
    > while [ $count -gt 0 ]
    > do
    > echo "$count..."
    > ((count=count-1))
    > sleep 1
    > done
    > echo "Blast Off."
    > }
    $ count=10
    $ count_down 4
    4...
    3...
    2...
    1...
    Blast Off.
    $ echo $count
    10
The ((count=count-1)) assignment is enclosed between double parentheses, which cause the shell to perform an arithmetic evaluation (page 585). Within the double parentheses you can reference shell variables without the leading dollar sign ($).
Special Parameters
Special parameters enable you to access useful values pertaining to command line arguments and the execution of shell commands. You reference a shell special parameter by preceding a special character with a dollar sign ($). As with positional parameters, it is not possible to modify the value of a special parameter by assignment. $$: PID Number
The shell stores in the $$ parameter the PID number of the process that is executing it. In the following interaction, echo displays the value of this variable and the ps utility confirms its value. Both commands show that the shell has a PID number of 5209: $ echo $$ 5209 $ ps -a PID TT STAT TIME COMMAND 1709 p1 Ss 0:00.32 -bash 4168 p1 R+ 0:00.00 ps -a
Because echo is built into the shell, the shell does not have to create another process when you give an echo command. However, the results are the same whether echo is a builtin or not, because the shell substitutes the value of $$ before it forks a new process to run a command. Try using the echo utility (/bin/echo), which is run by another process, and see what happens. In the following example, the shell substitutes the value of $$ and passes that value to cp as a prefix for a filename: $ echo $$ 8232 $ cp memo $$.memo $ ls 8232.memo memo
Incorporating a PID number in a filename is useful for creating unique filenames when the meanings of the names do not matter; it is often used in shell scripts for creating names of temporary files. When two people are running the same shell script, these unique filenames keep them from inadvertently sharing the same temporary file. The following example demonstrates that the shell creates a new shell process when it runs a shell script. The id2 script displays the PID number of the process running it (not the process that called it; the substitution for $$ is performed by the shell that is forked to run id2):

    $ cat id2
    echo "$0 PID= $$"
    $ echo $$
    8232
    $ id2
    ./id2 PID= 8362
    $ echo $$
    8232
The first echo displays the PID number of the interactive shell. Then id2 displays its name ($0) and the PID of the subshell that it is running in. The last echo shows that the PID number of the interactive shell has not changed. $! The value of the PID number of the last process that you ran in the background is stored in $! (not available in tcsh). The following example executes sleep as a background task and uses echo to display the value of $!: $ sleep 60 & [1] 8376 $ echo $! 8376
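One common use of $! is to save the PID of a background job so that the script can refer to that job later. The sketch below is an illustration, not an example from the text; it relies on the wait builtin, which this section does not cover, and the variable names bgpid and status are hypothetical.

```bash
sleep 1 &            # start a background job
bgpid=$!             # save its PID before starting another background job
wait "$bgpid"        # pause until that particular job finishes
status=$?            # wait returns the exit status of the job it waited for
echo "background job $bgpid finished with exit status $status"
```

Saving $! immediately matters because the shell overwrites it each time you start another background process.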
$?: Exit Status
When a process stops executing for any reason, it returns an exit status to the parent process. The exit status is also referred to as a condition code or a return code. The $? ($status under tcsh) variable stores the exit status of the last command. By convention a nonzero exit status represents a false value and means that the command failed. A zero is true and indicates that the command was successful. In the following example, the first ls command succeeds and the second fails: $ ls es es $ echo $? 0 $ ls xxx ls: xxx: No such file or directory $ echo $? 1
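A script typically captures $? right after the command it cares about, because every subsequent command replaces the value. The following is a minimal sketch under assumptions: the directory /no/such/dir.$$ is a hypothetical name presumed not to exist, and ok_status, bad_status, and result are illustrative variable names.

```bash
ls / > /dev/null 2>&1                 # a directory that exists: ls succeeds
ok_status=$?                          # capture the status before running anything else
ls /no/such/dir.$$ > /dev/null 2>&1   # hypothetical missing directory: ls fails
bad_status=$?

if [ "$ok_status" -eq 0 ] && [ "$bad_status" -ne 0 ]
then
    result="first ls succeeded, second ls failed"
else
    result="unexpected"
fi
echo "$result"
```

The sketch tests only for zero versus nonzero; the specific nonzero value that ls returns on failure varies between systems.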
You can specify the exit status that a shell script returns by using the exit builtin, followed by a number, to terminate the script. If you do not use exit with a number to terminate a script, the exit status of the script is that of the last command the script ran. $ cat es echo This program returns an exit status of 7. exit 7 $ es This program returns an exit status of 7. $ echo $? 7 $ echo $? 0
The es shell script displays a message and terminates execution with an exit command that returns an exit status of 7, the user-defined exit status in this script. The first echo then displays the value of the exit status of es. The second echo displays the value of the exit status of the first echo. The value is 0 because the first echo was successful. Positional Parameters
The positional parameters comprise the command name and command line arguments. They are called positional because within a shell script, you refer to them by their position on the command line. Only the set builtin (page 568) allows you to change the values of positional parameters with one exception: You cannot change the value of the command name from within a script. The tcsh set builtin does not change the values of positional parameters. $#: Number of Command Line Arguments
The $# parameter holds the number of arguments on the command line (positional parameters), not counting the command itself: $ cat num_args echo "This script was called with $# arguments." $ num_args sam max zach This script was called with 3 arguments.
$0: Name of the Calling Program
The shell stores the name of the command you used to call a program in parameter $0. This parameter is numbered zero because it appears before the first argument on the command line: $ cat abc echo "The command used to run this script is $0" $ abc The command used to run this script is ./abc $ /Users/sam/abc The command used to run this script is /Users/sam/abc
The preceding shell script uses echo to verify the name of the script you are executing. You can use the basename utility and command substitution to extract and display the simple filename of the command: $ cat abc2 echo "The command used to run this script is $(basename $0)" $ /Users/sam/abc2 The command used to run this script is abc2
$1-$n: Command Line Arguments
The first argument on the command line is represented by parameter $1, the second argument by $2, and so on up to $n. For values of n over 9, the number must be enclosed within braces. For example, the twelfth command line argument is represented by ${12}. The following script displays positional parameters that hold command line arguments: $ cat display_5args echo First 5 arguments are $1 $2 $3 $4 $5 $ display_5args jenny alex helen First 5 arguments are jenny alex helen
The display_5args script displays the first five command line arguments. The shell assigns a null value to each parameter that represents an argument that is not present on the command line. Thus the $4 and $5 variables have null values in this example. $* The $* variable represents all the command line arguments, as the display_all program demonstrates: $ cat display_all echo All arguments are $* $ display_all a b c d e f g h i j k l m n o p All arguments are a b c d e f g h i j k l m n o p
Enclose references to positional parameters between double quotation marks. The quotation marks are particularly important when you are using positional parameters as arguments to commands. Without double quotation marks, a positional parameter that is not set or that has a null value disappears: $ cat showargs echo "$0 was called with $# arguments, the first is :$1:." $ showargs a b c ./showargs was called with 3 arguments, the first is :a:. $ echo $xx $ showargs $xx a b c ./showargs was called with 3 arguments, the first is :a:. $ showargs "$xx" a b c ./showargs was called with 4 arguments, the first is ::.
The showargs script displays the number of arguments ($#) followed by the value of the first argument enclosed between colons. The preceding example first calls showargs with three simple arguments. Next the echo command demonstrates that the $xx variable, which is not set, has a null value. In the final two calls to showargs, the first argument is $xx. In the first case the command line becomes showargs a b c; the shell passes showargs three arguments. In the second case the command line becomes showargs "" a b c, which results in calling showargs with four arguments. The difference in the two calls to showargs illustrates a subtle potential problem that you should keep in mind when using positional parameters that may not be set or that may have a null value.

"$*" versus "$@"
The $* and $@ parameters work the same way except when they are enclosed within double quotation marks. Using "$*" yields a single argument (with SPACEs or the value of IFS [page 288] between the positional parameters), whereas "$@" produces a list wherein each positional parameter is a separate argument. This difference typically makes "$@" more useful than "$*" in shell scripts. The following scripts help to explain the difference between these two special parameters. In the second line of both scripts, the single quotation marks keep the shell from interpreting the enclosed special characters so they can be displayed as themselves. The bb1 script shows that set "$*" assigns multiple arguments to the first command line parameter:

    $ cat bb1
    set "$*"
    echo $# parameters with '"$*"'
    echo 1: $1
    echo 2: $2
    echo 3: $3
    $ bb1 a b c
    1 parameters with "$*"
    1: a b c
    2:
    3:
The bb2 script shows that set "$@" assigns each argument to a different command line parameter: $ cat bb2 set "$@" echo $# parameters with '"$@"' echo 1: $1 echo 2: $2 echo 3: $3 $ bb2 a b c 3 parameters with "$@" 1: a 2: b 3: c
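The difference becomes more visible when an argument itself contains a SPACE. The sketch below is an illustration rather than an example from the text; count_args is a hypothetical helper, and the default value of IFS is assumed.

```bash
count_args() { echo $#; }     # hypothetical helper: reports how many arguments it received

set -- "a b" c d              # three positional parameters; the first contains a SPACE
star=$(count_args "$*")       # one argument: "a b c d"
at=$(count_args "$@")         # three arguments: "a b", "c", and "d"
bare=$(count_args $*)         # unquoted: word splitting yields four arguments
echo "\"\$*\": $star   \"\$@\": $at   unquoted: $bare"
```

Only "$@" passes the arguments along exactly as they were received, which is why it is usually the right choice when a script hands its arguments to another command.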
shift: Promotes Command Line Arguments
The shift builtin promotes each command line argument. The first argument (which was $1) is discarded. The second argument (which was $2) becomes the first argument (now $1), the third becomes the second, and so on. Because no "unshift" command exists, you cannot bring back arguments that have been discarded. An optional argument to shift specifies the number of positions to shift (and the number of arguments to discard); the default is 1. The following demo_shift script is called with three arguments. Double quotation marks around the arguments to echo preserve the spacing of the output. The program displays the arguments and shifts them repeatedly until there are no more arguments left to shift: $ cat demo_shift echo "arg1= $1 arg2= $2 arg3= $3" shift echo "arg1= $1 arg2= $2 arg3= $3" shift echo "arg1= $1 arg2= $2 arg3= $3" shift echo "arg1= $1 arg2= $2 arg3= $3" shift $ demo_shift alice helen jenny arg1= alice arg2= helen arg3= jenny arg1= helen arg2= jenny arg3= arg1= jenny arg2= arg3= arg1= arg2= arg3=
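When the number of arguments is not known in advance, the repeated shift commands in demo_shift can be replaced by a loop that runs until $# reaches zero. This is a minimal sketch, not the script the text refers to; show_all and the counter n are hypothetical names.

```bash
show_all() {                  # hypothetical helper for this sketch
    n=1
    while [ $# -gt 0 ]        # loop until every argument has been shifted away
    do
        echo "arg$n= $1"
        shift                 # discard $1; the old $2 becomes the new $1
        n=$((n + 1))
    done
}

out=$(show_all alice helen jenny)
echo "$out"
```

Each pass through the loop handles whatever is currently in $1, so the body never needs to refer to $2, $3, and so on.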
Repeatedly using shift is a convenient way to loop over all the command line arguments in shell scripts that expect an arbitrary number of arguments. See page 529 for a shell script that uses shift. set: Initializes Command Line Arguments
When you call the set builtin with one or more arguments, it assigns the values of the arguments to the positional parameters, starting with $1 (not available in tcsh). The following script uses set to assign values to the positional parameters $1, $2, and $3:

    $ cat set_it
    set this is it
    echo $3 $2 $1
    $ set_it
    it is this

Combining command substitution (page 327) with the set builtin is a convenient way to get standard output of a command in a form that can be easily manipulated in a shell script. The following script shows how to use date and set to provide the date in a useful format. The first command shows the output of date. Then cat displays the contents of the dateset script. The first command in this script uses command substitution to set the positional parameters to the output of the date utility. The next command, echo $*, displays all positional parameters resulting from the previous set. Subsequent commands display the values of parameters $1, $2, $3, and $6. The final command displays the date in a format you can use in a letter or report:

    $ date
    Wed Jan 5 23:39:18 PST 2005
    $ cat dateset
    set $(date)
    echo $*
    echo
    echo "Argument 1: $1"
    echo "Argument 2: $2"
    echo "Argument 3: $3"
    echo "Argument 6: $6"
    echo
    echo "$2 $3, $6"
    $ dateset
    Wed Jan 5 23:39:25 PST 2005

    Argument 1: Wed
    Argument 2: Jan
    Argument 3: 5
    Argument 6: 2005

    Jan 5, 2005

You can also use the +format argument to date (page 701) to modify the format of its output. When used without any arguments, set displays a list of the shell variables that are set, including user-created variables and keyword variables. Under bash, this list is the same as that displayed by declare and typeset when they are called without any arguments. The set builtin also accepts options that let you customize the behavior of the shell (not available in tcsh). For more information refer to "set ±o: Turns Shell Features On and Off" on page 318.
Expanding Null and Unset Variables
The expression ${name} (or just $name if it is not ambiguous) expands to the value of the name variable. If name is null or not set, bash expands ${name} to a null string. The Bourne Again Shell provides the following alternatives to accepting the expanded null string as the value of the variable: use a default value for the variable (:-), use a default value and assign that value to the variable (:=), or display an error message (:?).
You can choose one of these alternatives by using a modifier with the variable name. In addition, you can use set -o nounset (page 320) to cause bash to display an error and exit from a script whenever an unset variable is referenced.
:- Uses a Default Value
The :- modifier uses a default value in place of a null or unset variable while allowing a nonnull variable to represent itself: ${name:-default}
The shell interprets :- as "If name is null or unset, expand default and use the expanded value in place of name; else use name." The following command lists the contents of the directory named by the LIT variable. If LIT is null or unset, it lists the contents of /Users/alex/literature: $ ls ${LIT:-/Users/alex/literature}
The default can itself have variable references that are expanded: $ ls ${LIT:-$HOME/literature}
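The three cases the :- modifier distinguishes (unset, null, and nonnull) can be checked side by side. The following is a small sketch for illustration; the variable names first, second, and third are hypothetical, and /tmp stands in for any nonnull value.

```bash
unset LIT
first=${LIT:-$HOME/literature}     # LIT is unset, so the default is used
LIT=
second=${LIT:-$HOME/literature}    # LIT is null, so the default is still used
LIT=/tmp
third=${LIT:-$HOME/literature}     # LIT is nonnull, so it represents itself
echo "$first"
echo "$second"
echo "$third"
```

In the first two cases LIT itself is left untouched; only the result of the expansion changes.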
:= Assigns a Default Value
The :- modifier does not change the value of a variable. You may want to change the value of a null or unset variable to its default in a script, however. You can do so with the := modifier: ${name:=default}
The shell expands the expression ${name:=default} in the same manner as it expands ${name:-default} but also sets the value of name to the expanded value of default. If a script contains a line such as the following and LIT is unset or null at the time this line is executed, LIT is assigned the value /Users/alex/literature: $ ls ${LIT:=/Users/alex/literature}
: builtin Shell scripts frequently start with the : (colon) builtin followed on the same line by the := expansion modifier to set any variables that may be null or unset. The : builtin evaluates each token in the remainder of the command line but does not execute any commands. Without the leading colon (:), the shell evaluates and attempts to execute the "command" that results from the evaluation. Use the following syntax to set a default for a null or unset variable in a shell script (there is a SPACE following the first colon): : ${name:=default}
When a script needs a directory for temporary files and uses the value of TEMPDIR for the name of this directory, the following line makes TEMPDIR default to /tmp: : ${TEMPDIR:=/tmp}
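A short sketch shows that the : builtin with := assigns the default only when the variable is null or unset. The variable names first and second and the value /var/tmp are assumptions chosen for the illustration.

```bash
unset TEMPDIR
: ${TEMPDIR:=/tmp}                 # TEMPDIR was unset, so it is assigned /tmp
first=$TEMPDIR

TEMPDIR=/var/tmp
: ${TEMPDIR:=/tmp}                 # TEMPDIR already has a value, so nothing changes
second=$TEMPDIR

echo "$first $second"
```

This is what makes the idiom safe at the top of a script: a value the user has already set takes precedence over the default.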
:? Displays an Error Message
Sometimes a script needs the value of a variable but you cannot supply a reasonable default at the time you write the script. If the variable is null or unset, the :? modifier causes the script to display an error message and terminate with an exit status of 1: ${name:?message} You must quote message if it contains SPACEs. If you omit message, the shell displays the default error message (parameter null or not set). Interactive shells do not exit when you use :?. In the following command, TESTDIR is not set so the shell displays on standard error the expanded value of the string following :?. In this case the string includes command substitution for date, with the %T format being followed by the string error, variable not set. cd ${TESTDIR:?$(date +%T) error, variable not set.} bash: TESTDIR: 16:16:14 error, variable not set. Builtin Commands
Builtin commands were introduced in Chapter 5. Commands that are built into a shell do not fork a new process when you execute them. This section discusses the type, read, exec, trap, kill, and getopts builtins and concludes with Table 13-6 on page 583, which lists many bash builtins. See Table 9-10 on page 373 for a list of tcsh builtins. type: Displays Information About a Command
The type builtin (use which under tcsh) provides information about a command:

    $ type cat echo who if lt
    cat is hashed (/bin/cat)
    echo is a shell builtin
    who is /usr/bin/who
    if is a shell keyword
    lt is aliased to 'ls -ltrh | tail'
The preceding output shows the files that would be executed if you gave cat or who as a command. Because cat has already been called from the current shell, it is in the hash table (page 935) and type reports that cat is hashed. The output also shows that a call to echo runs the echo builtin, if is a keyword, and lt is an alias. read: Accepts User Input
When you begin writing shell scripts, you soon realize that one of the most common tasks for user-created variables is storing information a user enters in response to a prompt. Using read, scripts can accept input from the user and store that input in variables. See page 358 for information about reading user input under tcsh. The read builtin reads one line from standard input and assigns the words on the line to one or more variables: $ cat read1 echo -n "Go ahead: " read firstline echo "You entered: $firstline" $ read1 Go ahead: This is a line. You entered: This is a line.
The first line of the read1 script uses echo to prompt you to enter a line of text. The -n option suppresses the following NEWLINE, allowing you to enter a line of text on the same line as the prompt. The second line reads the text into the variable firstline. The third line verifies the action of read by displaying the value of firstline. The variable is quoted (along with the text string) in this example because you, as the script writer, cannot anticipate which characters the user might enter in response to the prompt. Consider what would happen if the variable were not quoted and the user entered * in response to the prompt:

    $ cat read1_no_quote
    echo -n "Go ahead: "
    read firstline
    echo You entered: $firstline
    $ read1_no_quote
    Go ahead: *
    You entered: read1 read1_no_quote script.1
    $ ls
    read1 read1_no_quote script.1
The ls command lists the same words as the script, demonstrating that the shell expands the asterisk into a list of files in the working directory. When the variable $firstline is surrounded by double quotation marks, the shell does not expand the asterisk. Thus the read1 script behaves correctly:

    $ read1
    Go ahead: *
    You entered: *

If you want the shell to interpret the special meanings of special characters, do not use quotation marks.

REPLY
The read builtin has features that can make it easier to use. When you do not specify a variable to receive read's input, bash puts the input into the variable named REPLY. You can use the -p option to prompt the user instead of using a separate echo command. The following read1a script performs exactly the same task as read1:

    $ cat read1a
    read -p "Go ahead: "
    echo "You entered: $REPLY"
The read2 script prompts for a command line and reads the user's response into the variable cmd. The script then attempts to execute the command line that results from the expansion of the cmd variable: $ cat read2 read -p "Enter a command: " cmd $cmd echo "Thanks"
In the following example, read2 reads a command line that calls the echo builtin. The shell executes the command and then displays Thanks. Next read2 reads a command line that executes the who utility: $ read2 Enter a command: echo Please display this message. Please display this message. Thanks $ read2 Enter a command: who alex console Jun 10 15:20 alex ttyp1 Jun 16 15:16 (bravo.example.com) Thanks If cmd does not expand into a valid command line, the shell issues an error message: $ read2 Enter a command: xxx ./read2: line 2: xxx: command not found Thanks
The read3 script reads values into three variables. The read builtin assigns one word (a sequence of nonblank characters) to each variable: $ cat read3 read -p "Enter something: " word1 word2 word3 echo "Word 1 is: $word1" echo "Word 2 is: $word2" echo "Word 3 is: $word3" $ read3 Enter something: this is something Word 1 is: this Word 2 is: is Word 3 is: something
When you enter more words than read has variables, read assigns one word to each variable, with all leftover words going to the last variable. Both read1 and read2 assigned the first word and all leftover words to the one variable they each had to work with. In the following example, read accepts five words into three variables, assigning the first word to the first variable, the second word to the second variable, and the third through fifth words to the third variable: $ read3 Enter something: this is something else, really. Word 1 is: this Word 2 is: is Word 3 is: something else, really.
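The assignment rule can be checked with a short sketch. The helper name split3 is hypothetical, and the here-strings stand in for typed input. The second call also shows the opposite case, which is an addition to the text above: when there are fewer words than variables, the remaining variables are left empty.

```shell
#!/bin/bash
# split3 is a hypothetical helper that mimics read3 without a prompt.
split3() {
    local word1 word2 word3
    read -r word1 word2 word3
    echo "1=[$word1] 2=[$word2] 3=[$word3]"
}

split3 <<< "this is something else, really."   # leftover words go to word3
split3 <<< "only two"                          # word3 receives nothing
```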
Table 13-4 lists some of the options supported by the read builtin.
The read builtin returns an exit status of 0 if it successfully reads any data. It has a nonzero exit status when it reaches the EOF (end of file). The following example runs a while loop from the command line. It takes its input from the names file and terminates after reading the last line from names. $ cat names Alice Jones Robert Smith Alice Paulson John Q. Public $ while read first rest > do > echo $rest, $first > done < names Jones, Alice Smith, Robert Paulson, Alice Q. Public, John $
The placement of the redirection symbol (<) for the while structure is critical. It is important that you place the redirection symbol at the done statement and not at the call to read.
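The reason is that a redirection attached to read would reopen the file on every pass, so read would see the first line each time and the loop would never end. The sketch below keeps the redirection on done; the function name swap_names and the temporary file name are assumptions made for the illustration.

```shell
#!/bin/bash
# swap_names is a hypothetical helper; $1 names a file of "first rest" lines.
swap_names() {
    local first rest
    while read -r first rest
    do
        echo "$rest, $first"
    done < "$1"      # the file is opened once, for the whole loop
}

# Build a small sample file, run the loop over it, and clean up.
names=$(mktemp /tmp/names.XXXXXX)
printf '%s\n' "Alice Jones" "John Q. Public" > "$names"
swap_names "$names"
rm -f "$names"
```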
exec: Executes a Command
The exec builtin (not available in tcsh) has two primary purposes: to run a command without creating a new process and to redirect a file descriptor (including standard input, output, or error) of a shell script from within the script (page 555). When the shell executes a command that is not built into the shell, it typically creates a new process. The new process inherits environment (global or exported) variables from its parent but does not inherit variables that are not exported by the parent. (For more information refer to "Locality of Variables" on page 560.) In contrast, exec executes a command in place of (overlays) the current process.
exec versus .(dot) Insofar as exec runs a command in the environment of the original process, it is similar to the .(dot) command (page 261). However, unlike the . command, which can run only shell scripts, exec can run both scripts and compiled programs. Also, whereas the . command returns control to the original script when it finishes running, exec does not. Finally, the . command gives the new program access to local variables, whereas exec does not.
exec runs a command The exec builtin used for running a command has the following syntax:
exec command arguments
exec does not return control Because the shell does not create a new process when you use exec, the command runs more quickly. However, because exec does not return control to the original program, it can be used only as the last command that you want to run in a script. The following script shows that control is not returned to the script: $ cat exec_demo who exec date echo "This line is never displayed." $ exec_demo jenny ttyp1 May 30 7:05 (bravo.example.com) hls ttyp2 May 30 6:59 Mon May 30 11:42:56 PDT 2005 The next example, a modified version of the out script (page 529), uses exec to execute the final command the script runs. Because out runs either cat or less and then terminates, the new version, named out2, uses exec with both cat and less: $ cat out2 if [ $# -eq 0 ] then echo "Usage: out2 [-v] filenames" 1>&2 exit 1 fi if [ "$1" = "-v" ] then shift exec less "$@" else exec cat -- "$@" fi
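To try this behavior without ending your own shell, you can run exec inside a subshell. The following is a sketch under that assumption; the function here is a hypothetical stand-in for the exec_demo script, with its body in parentheses so that exec replaces only the subshell.

```shell
#!/bin/bash
# exec_demo is a hypothetical stand-in for the exec_demo script above.
# Its body runs in a subshell ( ... ), so exec replaces the subshell
# and not the shell that is running this example.
exec_demo() (
    echo "before exec"
    exec echo "exec replaced the subshell with echo"
    echo "This line is never displayed."
)

exec_demo
echo "exit status of the replaced subshell: $?"
```

The status reported on the last line is the exit status of the echo utility that took over the subshell, because nothing of the subshell remains to return a status of its own.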
exec redirects input and output The second major use of exec is to redirect a file descriptor (including standard input, output, or error) from within a script. The next command causes all subsequent input to a script that would have come from standard input to come from the file named infile:
exec < infile
Similarly the following command redirects standard output and standard error to outfile and errfile, respectively:
exec > outfile 2> errfile
When you use exec in this manner, the current process is not replaced with a new process, and exec can be followed by other commands in the script.
/dev/tty When you redirect the output from a script to a file, you must make sure that the user sees any prompts the script displays. The /dev/tty device is a pseudonym for the screen the user is working on; you can use this device to refer to the user's screen without knowing which device it is. (The tty utility displays the name of the device you are using.) By redirecting the output from a script to /dev/tty, you ensure that prompts and messages go to the user's terminal, regardless of which terminal the user is logged in on. Messages sent to /dev/tty are also not diverted if standard output and standard error from the script are redirected. The to_screen1 script sends output to three places: standard output, standard error, and the user's screen. When it is run with standard output and standard error redirected, to_screen1 still displays the message sent to /dev/tty on the user's screen. The out and err files hold the output sent to standard output and standard error.
$ cat to_screen1
echo "message to standard output"
echo "message to standard error" 1>&2
echo "message to the user" > /dev/tty
$ to_screen1 > out 2> err
message to the user
$ cat out
message to standard output
$ cat err
message to standard error
The following command redirects the output from a script to the user's screen: exec > /dev/tty
Putting this command at the beginning of the previous script changes where the output goes. In to_screen2, exec redirects standard output to the user's screen so the >/dev/tty is superfluous. Following the exec command, all output sent to standard output goes to /dev/tty (the screen). Output to standard error is not affected. $ cat to_screen2 exec > /dev/tty echo "message to standard output" echo "message to standard error" 1>&2 echo "message to the user" > /dev/tty $ to_screen2 > out 2> err message to standard output message to the user
One disadvantage of using exec to redirect the output to /dev/tty is that all subsequent output is redirected unless you use exec again in the script. You can also redirect the input to read (standard input) so that it comes from /dev/tty (the keyboard): read name < /dev/tty
or exec < /dev/tty
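One way to limit how long an exec redirection stays in effect is to save the original descriptor first and restore it afterward, using the descriptor duplication syntax described earlier in the chapter. The following is a sketch only: the function name log_to_file and the choice of descriptor 3 are assumptions, and an ordinary file stands in for /dev/tty so the example does not depend on a terminal.

```shell
#!/bin/bash
# log_to_file is a hypothetical helper; $1 names the output file.
log_to_file() {
    exec 3>&1          # save the current standard output as descriptor 3
    exec > "$1"        # from here on, standard output goes to the file
    echo "written to the file"
    exec 1>&3 3>&-     # restore standard output, then close descriptor 3
    echo "back on the original standard output"
}

log_to_file /tmp/exec_demo.out
cat /tmp/exec_demo.out
rm -f /tmp/exec_demo.out
```

Descriptor 3 is used here only because a typical script does not have it open; any descriptor that is otherwise unused would serve.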
trap: Catches a Signal
A signal is a report to a process about a condition. Mac OS X uses signals to report interrupts generated by the user (for example, pressing the interrupt key) as well as bad system calls, broken pipes, illegal instructions, and other conditions. The trap builtin (tcsh uses onintr) catches, or traps, one or more signals, allowing you to direct the actions a script takes when it receives a specified signal. This discussion covers six signals that are significant when you work with shell scripts. Table 13-5 lists these signals, the signal numbers that systems often ascribe to them, and the conditions that usually generate each signal. Give the command kill -l, trap -l, or man 7 signal for a list of signal names.
When it traps a signal, a script takes whatever action you specify: It can remove files or finish any other processing as needed, display a message, terminate execution immediately, or ignore the signal. If you do not use trap in a script, any of the six actual signals listed in Table 13-5 (not EXIT, DEBUG, or ERR) terminates the script. Because a process cannot trap a KILL signal, you can use kill -KILL (or kill -9) as a last resort to terminate a script or any other process. (See page 580 for more information on kill.) The trap command has the following syntax:
trap ['commands'] [signal]
The optional commands part specifies the commands that the shell executes when it catches one of the signals specified by signal. The signal can be a signal name or number (for example, INT or 2). If commands is not present, trap resets the trap to its initial condition, which is usually to exit from the script. The trap builtin does not require single quotation marks around commands as shown in the preceding syntax, but it is a good practice to use them. The single quotation marks cause shell variables within the commands to be expanded when the signal occurs, not when the shell evaluates the arguments to trap. Even if you do not use any shell variables in the commands, you need to enclose any command that takes arguments within either single or double quotation marks. Quoting the commands causes the shell to pass to trap the entire command as a single argument. After executing the commands, the shell resumes executing the script where it left off. If you want trap to prevent a script from exiting when it receives a signal but not to run any commands explicitly, you can specify a null (empty) commands string, as shown in the locktty script (page 543). The following command traps signal number 15, after which the script continues.
trap '' 15
The following script demonstrates how the trap builtin can catch the terminal interrupt signal (2). You can use SIGINT, INT, or 2 to specify this signal. The script returns an exit status of 1:
$ cat inter
#!/bin/bash
trap 'echo PROGRAM INTERRUPTED; exit 1' INT
while true
do
    echo "Program running."
    sleep 1
done
$ inter
Program running.
Program running.
Program running.
CONTROL-C
^CPROGRAM INTERRUPTED
$
:(null) builtin The second line of inter sets up a trap for the terminal interrupt signal using INT. When trap catches the signal, the shell executes the two commands between the single quotation marks in the trap command. The echo builtin displays the message PROGRAM INTERRUPTED, exit terminates the shell running the script, and the parent shell displays a prompt. If exit were not there, the shell would return control to the while loop after displaying the message. The while loop repeats continuously until the script receives a signal because the true utility always returns a true exit status. In place of true you can use the : (null) builtin, which is written as a colon and always returns a 0 (true) status. The trap builtin frequently removes temporary files when a script is terminated prematurely so that the files are not left to clutter the filesystem. The following shell script, named addbanner, uses two traps to remove a temporary file when the script terminates normally or owing to a hangup, software interrupt, quit, or software termination signal:
$ cat addbanner
#!/bin/bash
script=$(basename $0)
if [ ! -r "$HOME/banner" ]
then
    echo "$script: need readable $HOME/banner file" 1>&2
    exit 1
fi
trap 'exit 1' 1 2 3 15
trap 'rm /tmp/$$.$script 2> /dev/null' 0
for file
do
    if [ -r "$file" -a -w "$file" ]
    then
        cat $HOME/banner $file > /tmp/$$.$script
        cp /tmp/$$.$script $file
        echo "$script: banner added to $file" 1>&2
    else
        echo "$script: need read and write permission for $file" 1>&2
    fi
done
When called with one or more filename arguments, addbanner loops through the files, adding a header to the top of each. This script is useful when you use a standard format at the top of your documents, such as a standard layout for memos, or when you want to add a standard header to shell scripts. The header is kept in a file named ~/banner.
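The two-trap cleanup arrangement in addbanner can be tried in isolation. The following is a sketch, not the author's code: the function name with_tempfile and the temporary file template are hypothetical, signal names replace the numbers 1, 2, 3, 15, and 0, and the body runs in a subshell so that its traps do not affect the calling shell.

```shell
#!/bin/bash
# with_tempfile is a hypothetical helper. The parentheses make its body a
# subshell, so the traps and the exit apply to the subshell only.
with_tempfile() (
    tmp=$(mktemp /tmp/trapdemo.XXXXXX) || exit 1
    trap 'exit 1' HUP INT QUIT TERM   # turn these signals into an exit
    trap 'rm -f "$tmp"' EXIT          # runs whenever the subshell exits
    echo "scratch data" > "$tmp"
    echo "$tmp"                       # report the name so it can be checked
)

name=$(with_tempfile)
if [ -e "$name" ]
then
    echo "temporary file was left behind"
else
    echo "temporary file was removed"
fi
```

Because the single quotation marks delay expansion, $tmp is looked up when the EXIT trap runs and not when trap is called.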
Because addbanner uses the HOME variable, which contains the pathname of the user's home directory, the script can be used by several users without modification. If Alex had written the script with /Users/alex in place of $HOME and then given the script to Jenny, either she would have had to change it or addbanner would have used Alex's banner file when Jenny ran it (assuming Jenny had read permission for the file). The first trap in addbanner causes it to exit with a status of 1 when it receives a hangup, software interrupt (terminal interrupt or quit signal), or software termination signal. The second trap uses a 0 in place of signal-number, which causes trap to execute its command argument whenever the script exits because it receives an exit command or reaches its end. Together these traps remove a temporary file whether the script terminates normally or prematurely. Standard error of the second trap is sent to /dev/null for cases in which trap attempts to remove a nonexistent temporary file. In those cases rm sends an error message to standard error; because standard error is redirected, the user does not see this message. See page 543 for another example that uses trap.
kill: Aborts a Process
The kill builtin sends a signal to a process or job. The kill command has the following syntax:
kill [-signal] PID
where signal is the signal name or number (for example, INT or 2) and PID is the process identification number of the process that is to receive the signal. You can specify a job number (page 131) as %n in place of PID. If you omit signal, kill sends a TERM (software termination, number 15) signal. For more information on signal names and numbers see Table 13-5 on page 577. The following command sends the TERM signal to job number 1:
$ kill -TERM %1
Because TERM is the default signal for kill, you can also give this command as kill %1. Give the command kill -l (lowercase "l") to display a list of signal names. A program that is interrupted often leaves matters in an unpredictable state: Temporary files may be left behind (when they are normally removed), and permissions may be changed. A well-written application traps, or detects, signals and cleans up before exiting. Most carefully written applications trap the INT, QUIT, and TERM signals. To terminate a program, first try INT (press CONTROL-C, if the job is in the foreground). Because an application can be written to ignore these signals, you may need to use the KILL signal, which cannot be trapped or ignored; it is a "sure kill." Refer to page 761 for more information on kill. See also the related utility killall (page 763).
getopts: Parses Options
The getopts builtin (not available in tcsh) parses command line arguments, thereby making it easier to write programs that follow the Mac OS X argument conventions. The syntax for getopts is getopts optstring varname [arg...]
where optstring is a list of the valid option letters, varname is the variable that receives the options one at a time, and arg is the optional list of parameters to be processed. If arg is not present, getopts processes the command line arguments. If optstring starts with a colon (:), the script takes care of generating error messages; otherwise, getopts generates error messages. The getopts builtin uses the OPTIND (option index) and OPTARG (option argument) variables to store option-related values. When a shell script starts, the value of OPTIND is 1. Each time getopts locates an argument, it increments OPTIND to the index of the next option to be processed. If the option takes an argument, bash assigns the value of the argument to OPTARG. To indicate that an option takes an argument, follow the corresponding letter in optstring with a colon (:). The option string dxo:lt:r indicates that getopts should search for d, x, o, l, t, and r options and that the o and t options take arguments. Using getopts as the test-command in a while control structure allows you to loop over the options one at a time. The getopts builtin checks the option list for options that are in optstring. Each time through the loop, getopts stores the option letter it finds in varname. Suppose that you want to write a program that can take three options:
-b, which tells the program to skip blanks (the code that follows records this choice in SKIPBLANKS).
-t followed by the name of a directory, which the program uses in place of the default /tmp (recorded in TMPDIR).
-u, which selects uppercase in place of the default lowercase (recorded in CASE).
In addition, the program should ignore all other options and end option processing when it encounters two hyphens (--). The problem is to write the portion of the program that determines which options the user has supplied. The following solution does not use getopts.
SKIPBLANKS=
TMPDIR=/tmp
CASE=lower
while [[ "$1" = -* ]]    # [[ = ]] does pattern match
do
    case $1 in
        -b) SKIPBLANKS=TRUE ;;
        -t) if [ -d "$2" ]
            then
                TMPDIR=$2
                shift
            else
                echo "$0: -t takes a directory argument." >&2
                exit 1
            fi ;;
        -u) CASE=upper ;;
        --) break ;;    # Stop processing options
        *)  echo "$0: Invalid option $1 ignored." >&2 ;;
    esac
    shift
done
This program fragment uses a loop to check and shift arguments while the argument is not --. As long as the argument is not two hyphens, the program continues to loop through a case statement that checks for possible options. The -- case label breaks out of the while loop. The * case label recognizes any option; it appears as the last case label to catch any unknown options, displays an error message, and allows processing to continue. On each pass through the loop, the program does a shift to get to the next argument. If an option takes an argument, the program does an extra shift to get past that argument. The following program fragment processes the same options, but uses getopts:
SKIPBLANKS=
TMPDIR=/tmp
CASE=lower
while getopts :bt:u arg
do
    case $arg in
        b) SKIPBLANKS=TRUE ;;
        t) if [ -d "$OPTARG" ]
           then
               TMPDIR=$OPTARG
           else
               echo "$0: $OPTARG is not a directory." >&2
               exit 1
           fi ;;
        u) CASE=upper ;;
        :) echo "$0: Must supply an argument to -$OPTARG." >&2
           exit 1 ;;
        \?) echo "Invalid option -$OPTARG ignored." >&2 ;;
    esac
done
In this version of the code, the while structure evaluates the getopts builtin each time it comes to the top of the loop. The getopts builtin uses the OPTIND variable to keep track of the index of the argument it is to process the next time it is called. There is no need to call shift in this example. In the getopts version of the script the case patterns do not start with a hyphen because the value of arg is just the option letter (getopts strips off the hyphen). Also, getopts recognizes -- as the end of the options, so you do not have to specify it explicitly as in the case statement in the first example. Because you tell getopts which options are valid and which require arguments, it can detect errors in the command line and handle them in two ways. This example uses a leading colon in optstring to specify that you check for and handle errors in your code; when getopts finds an invalid option, it sets varname to ? and OPTARG to the option letter. When it finds an option that is missing an argument, getopts sets varname to : and OPTARG to the option lacking an argument. The \? case pattern specifies the action to take when getopts detects an invalid option. The : case pattern specifies the action to take when getopts detects a missing option argument. In both cases getopts does not write any error message; it leaves that task to you. If you omit the leading colon from optstring, both an invalid option and a missing option argument cause varname to be assigned the string ?. OPTARG is not set and getopts writes its own diagnostic message to standard error. Generally this method is less desirable because you have less control over what the user sees when an error is made. Using getopts will not necessarily make your programs shorter. Its principal advantages are that it provides a uniform programming interface and it enforces standard option handling.
A Partial List of Builtins
Table 13-6 lists some of the bash builtins. See "Listing bash builtins" on page 138 for instructions on how to display complete lists of builtins.
Expressions
An expression is composed of constants, variables, and operators that can be processed to return a value. This section covers arithmetic, logical, and conditional expressions as well as operators. Table 13-8 on page 588 lists the bash operators.
Arithmetic Evaluation
The Bourne Again Shell can perform arithmetic assignments and evaluate many different types of arithmetic expressions, all using integers. The shell performs arithmetic assignments in a number of ways. One is with arguments to the let builtin: $ let "VALUE=VALUE * 10 + NEW"
In the preceding example, the variables VALUE and NEW contain integer values. Within a let statement you do not need to use dollar signs ($) in front of variable names. Double quotation marks must enclose a single argument, or expression, that contains SPACEs. Because most expressions contain SPACEs and need to be quoted, bash accepts ((expression)) as a synonym for let "expression", obviating the need for both quotation marks and dollar signs: $ ((VALUE=VALUE * 10 + NEW)) You can use either form wherever a command is allowed and can remove the SPACEs if you like. In the following example, the asterisk (*) does not need to be quoted because the shell does not perform pathname expansion on the right side of an assignment (page 280): $ let VALUE=VALUE*10+NEW
Because each argument to let is evaluated as a separate expression, you can assign values to more than one variable on a single line: $ let "COUNT = COUNT + 1" VALUE=VALUE*10+NEW
You need to use commas to separate multiple assignments within a set of double parentheses: $ ((COUNT = COUNT + 1, VALUE=VALUE*10+NEW)) Tip: Arithmetic evaluation versus arithmetic expansion Arithmetic evaluation differs from arithmetic expansion. As explained on page 325, arithmetic expansion uses the syntax $((expression)), evaluates expression, and replaces $((expression)) with the result. You can use arithmetic expansion to display the value of an expression or to assign that value to a variable. Arithmetic evaluation uses the let expression or ((expression)) syntax, evaluates expression, and returns a status code. You can use arithmetic evaluation to perform a logical comparison or an assignment.
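The difference between the two forms is easy to see side by side. In the following sketch the function name arith_demo and the value of COUNT are assumptions; it relies on the bash rule that ((expression)) returns a status of 0 (true) when the expression is nonzero and 1 (false) when the expression is zero.

```shell
#!/bin/bash
# arith_demo is a hypothetical helper that contrasts the two forms.
arith_demo() {
    local COUNT=4
    # Arithmetic expansion: $(( )) is replaced by the value of the expression.
    echo "expansion gives $((COUNT * 2 + 1))"
    # Arithmetic evaluation: (( )) is a command that returns a status.
    ((COUNT > 3)) && echo "COUNT > 3 returned true"
    ((COUNT - 4)) || echo "COUNT - 4 is zero, so it returned false"
}

arith_demo
```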
Logical expressions You can use the ((expression)) syntax for logical expressions, although that task is frequently left to [[ expression ]]. The next example expands the age_check script (page 326) to include logical arithmetic evaluation in addition to arithmetic expansion.
$ cat age2
#!/bin/bash
echo -n "How old are you? "
read age
if ((30 < age && age < 60)); then
    echo "Wow, in $((60-age)) years, you'll be 60!"
else
    echo "You are too young or too old to play."
fi
$ age2
How old are you? 25
You are too young or too old to play.
The test-statement for the if structure evaluates two logical comparisons joined by a Boolean AND and returns 0 (true) if they are both true or 1 (false) otherwise.
Logical Evaluation (Conditional Expressions)
The syntax of a conditional expression is [[ expression ]]
where expression is a Boolean (logical) expression. You must precede a variable name with a dollar sign ($) within expression. The result of executing this builtin, like the test builtin, is a return status. The conditions allowed within the brackets are almost a superset of those accepted by test (page 871). Where the test builtin uses -a as a Boolean AND operator, [[ expression ]] uses &&. Similarly, where test uses -o as a Boolean OR operator, [[ expression ]] uses ||. You can replace the line that tests age in the age2 script (preceding) with the following conditional expression. You must surround the [[ and ]] tokens with whitespace or a command terminator, and place dollar signs before the variables. Be aware that within [[ ]] the < operator compares strings for order, so this form gives the intended result reliably only when age has two digits:
if [[ 30 < $age && $age < 60 ]]; then
You can also use test's relational operators -gt, -ge, -lt, -le, -eq, and -ne, which compare integers:
if [[ 30 -lt $age && $age -lt 60 ]]; then
String comparisons The test builtin tests whether strings are equal or unequal. The [[ expression ]] syntax adds comparison tests for string operators. The > and < operators compare strings for order (for example, "aa" < "bbb"). The = operator tests for pattern match, not just equality: [[ string = pattern ]] is true if string matches pattern. This operator is not symmetrical; the pattern must appear on the right side of the equal sign. For example, [[ artist = a* ]] is true (= 0), whereas [[ a* = artist ]] is false (= 1): $ [[ artist = a* ]] $ echo $? 0 $ [[ a* = artist ]] $ echo $? 1
The next example uses a command list that starts with a compound condition. The condition tests that the directory bin and the file src/myscript.bash exist. If this is true, cp copies src/myscript.bash to bin/myscript. If the copy succeeds, chmod makes myscript executable. If any of these steps fails, echo displays a message. $ [[ -d bin && -f src/myscript.bash ]] && cp src/myscript.bash \ bin/myscript && chmod +x bin/myscript || echo "Cannot make \ executable version of myscript"
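The distinctions drawn in this section (numeric versus string comparison, and pattern matching with =) can be collected in a short sketch. The function names in_range and is_c_source are hypothetical.

```shell
#!/bin/bash
# in_range is a hypothetical helper: true when 30 < $1 < 60 as integers.
in_range() {
    [[ $1 -gt 30 && $1 -lt 60 ]]
}

# is_c_source is a hypothetical helper: true when $1 matches the pattern *.c.
is_c_source() {
    [[ $1 = *.c ]]      # the pattern goes on the right and is not quoted
}

in_range 45 && echo "45 is in range"
in_range 7 || echo "7 is out of range"
is_c_source prog.c && echo "prog.c matches *.c"

# Within [[ ]] the < operator compares strings, not numbers.
[[ 4 < 30 ]] || echo "as strings, 4 does not sort before 30"
```

Each test is followed by && or || so that the line as a whole succeeds; the same tests could equally appear as the test-command of an if structure.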
String Pattern Matching
The Bourne Again Shell provides string pattern-matching operators that can manipulate pathnames and other strings. These operators can delete from strings prefixes or suffixes that match patterns. The four operators are listed in Table 13-7.
The syntax for these operators is ${varname op pattern}
where op is one of the operators listed in Table 13-7 and pattern is a match pattern similar to that used for filename generation. These operators are commonly used to manipulate pathnames so as to extract or remove components or to change suffixes:
$ SOURCEFILE=/usr/local/src/prog.c
$ echo ${SOURCEFILE#/*/}
local/src/prog.c
$ echo ${SOURCEFILE##/*/}
prog.c
$ echo ${SOURCEFILE%/*}
/usr/local/src
$ echo ${SOURCEFILE%%/*}

$ echo ${SOURCEFILE%.c}
/usr/local/src/prog
$ CHOPFIRST=${SOURCEFILE#/*/}
$ echo $CHOPFIRST
local/src/prog.c
$ NEXT=${CHOPFIRST%%/*}
$ echo $NEXT
local
Here the string-length operator, ${#name}, is replaced by the number of characters in the value of name: $ echo $SOURCEFILE /usr/local/src/prog.c $ echo ${#SOURCEFILE} 21
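As an illustration of how these operators combine, the following sketch wraps three of them in small functions. The names base_name, dir_name, and swap_suffix are hypothetical, and the functions are simplified: unlike the basename and dirname utilities, they do not handle trailing slashes or pathnames that contain no slash.

```shell
#!/bin/bash
# Hypothetical helpers built only from the pattern-matching operators.
base_name()   { echo "${1##*/}"; }       # remove the longest prefix ending in /
dir_name()    { echo "${1%/*}"; }        # remove the shortest suffix starting with /
swap_suffix() { echo "${1%.$2}.$3"; }    # replace the suffix $2 with $3

SOURCEFILE=/usr/local/src/prog.c
base_name "$SOURCEFILE"
dir_name "$SOURCEFILE"
swap_suffix "$SOURCEFILE" c o
```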
Operators
Arithmetic expansion and arithmetic evaluation use the same syntax, precedence, and associativity of expressions as the C language. Table 13-8 lists operators in order of decreasing precedence (priority of evaluation); each group of operators has equal precedence. Within an expression you can use parentheses to change the order of evaluation.
Pipe The pipe token has higher precedence than operators. You can use pipes anywhere in a command that you can use simple commands. For example, the command line $ cmd1 | cmd2 || cmd3 | cmd4 && cmd5 | cmd6 is interpreted as if you had typed $ ((cmd1 | cmd2) || (cmd3 | cmd4)) && (cmd5 | cmd6)
Tip: Do not rely on rules of precedence: use parentheses Do not rely on the precedence rules when you use compound commands. Instead, use parentheses to explicitly state the order in which you want the shell to interpret the commands.
Increment and decrement operators The postincrement, postdecrement, preincrement, and predecrement operators work with variables. The pre- operators, which appear in front of the variable name as in ++COUNT and --VALUE, first change the value of the variable (++ adds 1; -- subtracts 1) and then provide the result for use in the expression. The post- operators appear after the variable name as in COUNT++ and VALUE--; they first provide the unchanged value of the variable for use in the expression and then change the value of the variable.
$ N=10
$ echo $N
10
$ echo $((--N+3))
12
$ echo $N
9
$ echo $((N++ - 3))
6
$ echo $N
10
Remainder The remainder operator (%) gives the remainder when its first operand is divided by its second. For example, the expression $((15%7)) has the value 1. Boolean The result of a Boolean operation is either 0 (false) or 1 (true). The && (AND) and || (OR) Boolean operators are called short-circuiting operators. If the result of using one of these operators can be decided by looking only at the left operand, the right operand is not evaluated. The && operator causes the shell to test the exit status of the command preceding it. If the command succeeded, bash executes the next command; otherwise, it skips the remaining commands on the command line. You can use this construct to execute commands conditionally: $ mkdir bkup && cp -r src bkup
This compound command creates the directory bkup. If mkdir succeeds, the contents of directory src are copied recursively to bkup. The || separator also causes bash to test the exit status of the first command but has the opposite effect: The remaining command(s) are executed only if the first one failed (that is, exited with nonzero status): $ mkdir bkup || echo "mkdir of bkup failed" >> /tmp/log
The exit status of a command list is the exit status of the last command in the list. You can group lists with parentheses. For example, you could combine the previous two examples as $ (mkdir bkup && cp -r src bkup) || echo "mkdir failed" >> /tmp/log In the absence of parentheses, && and || have equal precedence and are grouped from left to right. The following examples use the true and false utilities. These utilities do nothing and return true (0) and false (1) exit statuses, respectively: $ false; echo $? 1
The $? variable holds the exit status of the preceding command (page 564). The next two commands yield an exit status of 1 (false): $ true || false && false $ echo $? 1 $ (true || false) && false $ echo $? 1
Similarly the next two commands yield an exit status of 0 (true): $ false && false || true $ echo $? 0 $ (false && false) || true $ echo $? 0 Because || and && have equal precedence, the parentheses in the two preceding pairs of examples do nothing to change the order of operations. Because the expression on the right side of a short-circuiting operator may never get executed, you must be careful with assignment statements in that location. The following example demonstrates what can happen: $ ((N=10,Z=0)) $ echo $((N || ((Z+=1)) )) 1 $ echo $Z 0
Because the value of N is nonzero, the result of the || (OR) operation is 1 (true), no matter what the value of the right side is. As a consequence ((Z+=1)) is never evaluated and Z is not incremented. Ternary The ternary operator, ? :, decides which of two expressions should be evaluated, based on the value returned from a third expression: expression1 ? expression2 : expression3
If expression1 produces a false (0) value, expression3 is evaluated; otherwise, expression2 is evaluated. The value of the entire expression is the value of expression2 or expression3, depending on which one is evaluated. If expression1 is true, expression3 is not evaluated. If expression1 is false expression2 is not evaluated: $ ((N=10,Z=0,COUNT=1)) $ ((T=N>COUNT?++Z:--Z)) $ echo $T 1 $ echo $Z 1
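A common use of the ternary operator is to select the larger of two values. The following sketch assumes a hypothetical function named max_of; the second part repeats the point that only the selected expression is evaluated, using two counters whose names are also assumptions.

```shell
#!/bin/bash
# max_of is a hypothetical helper: it prints the larger of two integers.
max_of() {
    echo $(( $1 > $2 ? $1 : $2 ))
}

max_of 3 9
max_of -2 -5

# Only the selected branch is evaluated, so only one counter changes.
hits=0 misses=0
(( 5 > 3 ? ++hits : ++misses ))
echo "hits=$hits misses=$misses"
```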
Assignment The assignment operators, such as +=, are shorthand notations. For example, N+=3 is the same as ((N=N+3)). Other bases The following commands use the syntax base#n to assign base 2 (binary) values. First v1 is assigned a value of 0101 (5 decimal) and v2 is assigned a value of 0110 (6 decimal). The echo utility verifies the decimal values. $ ((v1=2#0101)) $ ((v2=2#0110)) $ echo "$v1 and $v2" 5 and 6 Next the bitwise AND operator (&) selects the bits that are on in both 5 (0101 binary) and 6 (0110 binary). The result is binary 0100, which is 4 decimal. $ echo $(( v1 & v2 )) 4
The Boolean AND operator (&&) produces a result of 1 if both of its operands are nonzero and a result of 0 otherwise. The bitwise inclusive OR operator (|) selects the bits that are on in either 0101 or 0110, resulting in 0111, which is 7 decimal. The Boolean OR operator (||) produces a result of 1 if either of its operands is nonzero and a result of 0 otherwise. $ echo $(( v1 && v2 )) 1 $ echo $(( v1 | v2 )) 7 $ echo $(( v1 || v2 )) 1
Next the bitwise exclusive OR operator (^) selects the bits that are on in either, but not both, of the operands 0101 and 0110, yielding 0011, which is 3 decimal. The Boolean NOT operator (!) produces a result of 1 if its operand is 0 and a result of 0 otherwise. Because the exclamation point in $(( ! v1 )) is enclosed within double parentheses, it does not need to be escaped to prevent the shell from interpreting the exclamation point as a history event. The comparison operators produce a result of 1 if the comparison is true and a result of 0 otherwise.
$ echo $(( v1 ^ v2 ))
3
$ echo $(( ! v1 ))
0
$ echo $(( v1 < v2 ))
1
$ echo $(( v1 > v2 ))
0
Shell Programs
The Bourne Again Shell has many features that make it a good programming language. The structures that bash provides are not a random assortment. Rather, they have been chosen to provide most of the structural features that are in other procedural languages, such as C or Pascal. A procedural language provides the ability to
Programming languages implement these capabilities in different ways but with the same ideas in mind. When you want to solve a problem by writing a program, you must first figure out a procedure that leads you to a solutionthat is, an algorithm. Typically you can implement the same algorithm in roughly the same way in different programming languages, using the same kinds of constructs in each language. Chapter 8 and this chapter have introduced numerous bash features, many of which are useful for interactive use as well as for shell programming. This section develops two complete shell programs, demonstrating how to combine some of these features effectively. The programs are presented as problems for you to solve along with sample solutions. A Recursive Shell Script
A recursive construct is one that is defined in terms of itself. Alternatively, you might say that a recursive program is one that can call itself. This may seem circular, but it need not be. To avoid circularity a recursive definition must have a special case that is not self-referential. Recursive ideas occur in everyday life. For example, you can define an ancestor as your mother, your father, or one of their ancestors. This definition is not circular; it specifies unambiguously who your ancestors are: your mother or your father, or your mother's mother or father or your father's mother or father, and so on. A number of Mac OS X system utilities can operate recursively. See the -R option to the chmod (page 676), chown (page 682), and cp (page 690) utilities for examples. Solve the following problem by using a recursive shell function:
Write a shell function named makepath that, given a pathname, creates all components in that pathname as directories. For example, the command makepath a/b/c/d should create directories a, a/b, a/b/c, and a/b/c/d. (The mkdir utility supports a -p option that does exactly this. Solve the problem without using mkdir -p.)
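One algorithm for a recursive solution follows:

1. Examine the path argument. If it is a null string or if it names an existing directory, do nothing and return.
2. If the argument is a simple path component (it contains no slash), create it with mkdir and return.
3. Otherwise, call makepath with the path prefix of the argument (everything before the last slash). This call creates all the directories up to the last component, which can then be created with mkdir.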
In general, a recursive function must invoke itself with a simpler version of the problem than it was given until it is finally called with a simple case that does not need to call itself. Following is one possible solution based on this algorithm:

makepath
# this is a function
# enter it at the keyboard, do not run it as a shell script
#
function makepath()
{
    if [[ ${#1} -eq 0 || -d "$1" ]]
        then
            return 0        # Do nothing
    fi
    if [[ "${1%/*}" = "$1" ]]
        then
            mkdir $1
            return $?
    fi
    makepath ${1%/*} || return 1
    mkdir $1
    return $?
}
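The ${1%/*} expansion that makepath relies on can be tried on its own before reading the function closely. The following is a minimal sketch, not part of the original solution; strip_last is a hypothetical helper name introduced only for this illustration.

```shell
#!/bin/bash
# strip_last is a hypothetical helper used only for this illustration;
# it is not part of makepath.
function strip_last () {
    # ${1%/*} removes the shortest suffix of $1 that matches "/*".
    echo "${1%/*}"
}

strip_last a/b/c    # a slash is present: displays a/b
strip_last alex     # no slash: nothing is stripped, displays alex
strip_last /usr     # only a leading slash: displays an empty line
```

The third call is the borderline case: the result is a null string, which is why makepath tests ${#1} before doing anything else.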
In the test for a simple component (the if statement in the middle of the function), the left expression is the argument after the shortest suffix that starts with a / character has been stripped away (page 587). If there is no such character (for example, if $1 is alex), nothing is stripped off and the two sides are equal. If the argument is a simple filename preceded by a slash, such as /usr, the expression ${1%/*} evaluates to a null string. To make the function work in this case, you must take two precautions: Put the left expression within quotation marks and ensure that the recursive function behaves sensibly when it is passed a null string as an argument. In general, good programs are robust: They should be prepared for borderline, invalid, or meaningless input and behave appropriately in such cases. By giving the following command from the shell you are working in, you turn on debugging tracing so that you can watch the recursion work:

$ set -o xtrace
(Give the same command, but replace the hyphen with a plus sign (+) to turn debugging off.) With debugging turned on, the shell displays each line in its expanded form as it executes the line. A + precedes each line of debugging output. In the following example, the first line that starts with + shows the shell calling makepath. The makepath function is called from the command line with arguments of a/b/c. Subsequently it calls itself with arguments of a/b and finally a. All the work is done (using mkdir) as each call to makepath returns.

$ makepath a/b/c
+ makepath a/b/c
+ [[ 5 -eq 0 ]]
+ [[ -d a/b/c ]]
+ [[ a/b = \a\/\b\/\c ]]
+ makepath a/b
+ [[ 3 -eq 0 ]]
+ [[ -d a/b ]]
+ [[ a = \a\/\b ]]
+ makepath a
+ [[ 1 -eq 0 ]]
+ [[ -d a ]]
+ [[ a = \a ]]
+ mkdir a
+ return 0
+ mkdir a/b
+ return 0
+ mkdir a/b/c
+ return 0
The function works its way down the recursive path and back up again. It is instructive to invoke makepath with an invalid path and see what happens. The following example, run with debugging turned on, tries to create the path /a/b, which requires that you create directory a in the root directory. Unless you have permission to write to the root directory, you are not permitted to create this directory.

$ makepath /a/b
+ makepath /a/b
+ [[ 4 -eq 0 ]]
+ [[ -d /a/b ]]
+ [[ /a = \/\a\/\b ]]
+ makepath /a
+ [[ 2 -eq 0 ]]
+ [[ -d /a ]]
+ [[ '' = \/\a ]]
+ makepath
+ [[ 0 -eq 0 ]]
+ return 0
+ mkdir /a
mkdir: cannot create directory '/a': Permission denied
+ return 1
+ return 1
The recursion stops when makepath is denied permission to create the /a directory. The error return is passed all the way back, so the original makepath exits with nonzero status.

Tip: Use local variables with recursive functions

The preceding example glossed over a potential problem that you may encounter when you use a recursive function. During the execution of a recursive function, many separate instances of that function may be active simultaneously. All but one of them are waiting for their child invocation to complete. Because functions run in the same environment as the shell that calls them, variables are implicitly shared by a shell and a function it calls so that all instances of the function share a single copy of each variable. Sharing variables can give rise to side effects that are rarely what you want. As a rule, you should use typeset to make all variables of a recursive function be local variables. See page 561 for more information.
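The side effect described in the tip can be seen with a small experiment. The following sketch is an illustration only; walk_shared and walk_local are hypothetical function names, and each one simply displays its argument after the recursive call returns.

```shell
#!/bin/bash
# walk_shared and walk_local are hypothetical names for this illustration.

# Version 1: part is a global variable, shared by every invocation.
function walk_shared () {
    part=$1
    if [[ "${part%/*}" != "$part" ]]; then
        walk_shared "${part%/*}"    # the child call overwrites part
    fi
    echo "shared: $part"
}

# Version 2: typeset makes part local to each invocation.
function walk_local () {
    typeset part=$1
    if [[ "${part%/*}" != "$part" ]]; then
        walk_local "${part%/*}"
    fi
    echo "local: $part"
}

walk_shared a/b/c
walk_local a/b/c
```

With the shared variable, all three invocations display shared: a, because the innermost call has overwritten part by the time the outer calls resume. With typeset, the invocations display local: a, local: a/b, and local: a/b/c, each seeing its own copy.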
The quiz Shell Script
Solve the following problem using a bash script:
Write a generic multiple-choice quiz program. The program should get its questions from data files, present them to the user, and keep track of the number of correct and incorrect answers. The user must be able to exit from the program at any time with a summary of results to that point.
The detailed design of this program and even the detailed description of the problem depend on a number of choices: How will the program know which subjects are available for quizzes? How will the user choose a subject? How will the program know when the quiz is over? Should the program present the same questions (for a given subject) in the same order each time, or should it scramble them? Of course, you can make many perfectly good choices that implement the specification of the problem. The following details narrow the problem specification:
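Following is a top-level design for this program:

1. Initialize. This step sets the counts of questions asked and questions answered correctly to zero and arranges to trap a user interrupt.
2. Present the user with a choice of subjects and get the user's response.
3. Change to the corresponding subject directory.
4. Determine the questions to be asked (that is, the filenames in that directory) and arrange them in random order.
5. Repeatedly present a question and ask for an answer until the quiz is over or the user interrupts it.
6. Present the results and exit.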
Clearly some of these steps (such as step 3) are simple, whereas others (such as step 4) are complex and worthy of analysis on their own. Use shell functions for any complex step, and use the trap builtin to handle a user interrupt. Here is a skeleton version of the program with empty shell functions:

function initialize {
    # Initializes variables.
}

function choose_subj {
    # Writes choice to standard output.
}

function scramble {
    # Stores names of question files, scrambled,
    # in an array variable named questions.
}

function ask {
    # Reads a question file, asks the question, and checks the
    # answer. Returns 1 if the answer was correct, 0 otherwise. If it
    # encounters an invalid question file, exit with status 2.
}

function summarize {
    # Presents the user's score.
}

# Main program
initialize                          # Step 1 in top-level design

subject=$(choose_subj)              # Step 2
[[ $? -eq 0 ]] || exit 2            # If no valid choice, exit

cd $subject || exit 2               # Step 3
echo                                # Skip a line
scramble                            # Step 4

for ques in ${questions[*]}; do     # Step 5
    ask $ques
    result=$?
    (( num_ques=num_ques+1 ))
    if [[ $result == 1 ]]; then
        (( num_correct += 1 ))
    fi
    echo                            # Skip a line between questions
    sleep ${QUIZDELAY:=1}
done

summarize                           # Step 6
exit 0
To make reading the results a bit easier for the user, a sleep call appears inside the question loop. It delays $QUIZDELAY seconds (default = 1) between questions. Now the task is to fill in the missing pieces of the program. In a sense this program is being written backward. The details (the shell functions) come first in the file but come last in the development process. This common programming practice is called top-down design. In top-down design you fill in the broad outline of the program first and supply the details later. In this way you break the problem up into smaller problems, each of which you can work on independently. Shell functions are a great help in using the top-down approach. One way to write the initialize function follows. The cd command causes QUIZDIR to be the working directory for the rest of the script and defaults to ~/quiz if QUIZDIR is not set.

function initialize () {
    trap 'summarize ; exit 0' INT    # Handle user interrupts
    num_ques=0                       # Number of questions asked so far
    num_correct=0                    # Number answered correctly so far
    first_time=true                  # true until first question is asked
    cd ${QUIZDIR:=~/quiz} || exit 2
}
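Both QUIZDIR and QUIZDELAY get their defaults from the := form of parameter expansion. The following short sketch shows how that form behaves; DEMO_DELAY is a hypothetical variable used only here, standing in for either of the quiz variables.

```shell
#!/bin/bash
# DEMO_DELAY is a hypothetical variable used only for this illustration.

unset DEMO_DELAY
first=${DEMO_DELAY:=1}      # unset: 1 is assigned to DEMO_DELAY and substituted
after_first=$DEMO_DELAY     # the assignment persists after the expansion

DEMO_DELAY=5
second=${DEMO_DELAY:=1}     # already set: the existing value is kept

echo "first=$first after_first=$after_first second=$second"
```

The script displays first=1 after_first=1 second=5. A user who sets QUIZDELAY or QUIZDIR before starting the quiz therefore overrides the default, and a user who sets neither gets the values written into the script.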
Be prepared for the cd command to fail. The directory may be unsearchable or conceivably another user may have removed it. The preceding function exits with a status code of 2 if cd fails. The next function, choose_subj, is a bit more complicated. It displays a menu using a select statement:

function choose_subj () {
    subjects=($(ls))
    PS3="Choose a subject for the quiz from the preceding list: "
    select Subject in ${subjects[*]}; do
        if [[ -z "$Subject" ]]; then
            echo "No subject chosen. Bye." >&2
            exit 1
        fi
        echo $Subject
        return 0
    done
}
The function first uses an ls command and command substitution to put a list of subject directories in the subjects array. Next the select structure (page 551) presents the user with a list of subjects (the directories found by ls) and assigns the chosen directory name to the Subject variable. Finally the function writes the name of the subject directory to standard output. The main program uses command substitution to assign this value to the subject variable [subject=$(choose_subj)].

The scramble function presents a number of difficulties. In this solution it uses an array variable (questions) to hold the names of the questions. It scrambles the entries in an array using the RANDOM variable (each time you reference RANDOM it has the value of a [random] integer between 0 and 32767):

function scramble () {
    typeset -i index quescount
    questions=($(ls))
    quescount=${#questions[*]}      # Number of elements
    ((index=quescount-1))
    while [[ $index > 0 ]]; do
        ((target=RANDOM % index))
        exchange $target $index
        ((index -= 1))
    done
}
This function initializes the array variable questions to the list of filenames (questions) in the working directory. The variable quescount is set to the number of such files. Then the following algorithm is used: Let the variable index count down from quescount - 1 (the index of the last entry in the array variable). For each value of index, the function chooses a random value target between 0 and index - 1, inclusive. The command
((target=RANDOM % index))
produces a random value between 0 and index - 1 by taking the remainder (the % operator) when $RANDOM is divided by index. The function then exchanges the elements of questions at positions target and index. It is convenient to do this in another function named exchange:

function exchange () {
    temp_value=${questions[$1]}
    questions[$1]=${questions[$2]}
    questions[$2]=$temp_value
}
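The scramble and exchange pair can be tried outside the quiz program. The following is a self-contained sketch under a few assumptions: the array is filled with hypothetical names (q1 through q5) instead of the output of ls, the loop test uses an arithmetic comparison, and the helper variables are declared with typeset so that they stay local.

```shell
#!/bin/bash
# A sketch of the scramble/exchange pair applied to a hypothetical list
# of names instead of the filenames returned by ls.

function exchange () {
    typeset temp_value=${questions[$1]}
    questions[$1]=${questions[$2]}
    questions[$2]=$temp_value
}

function scramble () {
    typeset -i index target
    ((index=${#questions[*]}-1))        # Index of the last element
    while (( index > 0 )); do
        ((target=RANDOM % index))       # 0 through index-1
        exchange $target $index
        ((index -= 1))
    done
}

questions=(q1 q2 q3 q4 q5)
scramble
echo "${questions[*]}"
```

Each run displays the five names in a different order. Because target is always smaller than index, every exchange moves the element at position index somewhere else, so in this sketch q5 never stays in the last position. If you want an element to be able to keep its place, one common variation is to take the remainder modulo index + 1 instead.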
The ask function also uses the select structure. It reads the question file named in its argument and uses the contents of that file to present the question, accept the answer, and determine whether the answer is correct. (See the code that follows.) The ask function uses file descriptor 3 to read successive lines from the question file, whose name was passed as an argument and is represented by $1 in the function. It reads the question into the ques variable and the number of choices into num_opts. The function constructs the variable choices by initializing it to a null string and successively appending the next choice. Then it sets PS3 to the value of ques and uses a select structure to prompt the user with ques. The select structure places the user's answer in answer, and the function then checks it against the correct answer from the file. The construction of the choices variable is done with an eye toward avoiding a potential problem. Suppose that one answer has some whitespace in it. Then it might appear as two or more arguments in choices. To avoid this problem, make sure that choices is an array variable. The select statement does the rest of the work:

quiz
$ cat quiz
#!/bin/bash

# remove the # on the following line to turn on debugging
# set -o xtrace

#==================
function initialize () {
    trap 'summarize ; exit 0' INT    # Handle user interrupts
    num_ques=0                       # Number of questions asked so far
    num_correct=0                    # Number answered correctly so far
    first_time=true                  # true until first question is asked
    cd ${QUIZDIR:=~/quiz} || exit 2
}

#==================
function choose_subj () {
    subjects=($(ls))
    PS3="Choose a subject for the quiz from the preceding list: "
    select Subject in ${subjects[*]}; do
        if [[ -z "$Subject" ]]; then
            echo "No subject chosen. Bye." >&2
            exit 1
        fi
        echo $Subject
        return 0
    done
}

#==================
function exchange () {
    temp_value=${questions[$1]}
    questions[$1]=${questions[$2]}
    questions[$2]=$temp_value
}

#==================
function scramble () {
    typeset -i index quescount
    questions=($(ls))
    quescount=${#questions[*]}      # Number of elements
    ((index=quescount-1))
    while [[ $index > 0 ]]; do
        ((target=RANDOM % index))
        exchange $target $index
        ((index -= 1))
    done
}

#==================
function ask () {
    exec 3<$1
    read -u3 ques || exit 2
    read -u3 num_opts || exit 2

    index=0
    choices=()
    while (( index < num_opts )) ; do
        read -u3 next_choice || exit 2
        choices=("${choices[@]}" "$next_choice")
        ((index += 1))
    done
    read -u3 correct_answer || exit 2
    exec 3<&-

    if [[ $first_time = true ]]; then
        first_time=false
        echo -e "You may press the interrupt key at any time to quit.\n"
    fi

    PS3=$ques" "        # Make $ques the prompt for select
                        # and add some spaces for legibility.

    select answer in "${choices[@]}"; do
        if [[ -z "$answer" ]]; then
            echo Not a valid choice. Please choose again.
        elif [[ "$answer" = "$correct_answer" ]]; then
            echo "Correct!"
            return 1
        else
            echo "No, the answer is $correct_answer."
            return 0
        fi
    done
}

#==================
function summarize () {
    echo                            # Skip a line
    if (( num_ques == 0 )); then
        echo "You did not answer any questions"
        exit 0
    fi

    (( percent=num_correct*100/num_ques ))
    echo "You answered $num_correct questions correctly, out of \
$num_ques total questions."
    echo "Your score is $percent percent."
}

#==================
# Main program
initialize                          # Step 1 in top-level design

subject=$(choose_subj)              # Step 2
[[ $? -eq 0 ]] || exit 2            # If no valid choice, exit

cd $subject || exit 2               # Step 3
echo                                # Skip a line
scramble                            # Step 4

for ques in ${questions[*]}; do     # Step 5
    ask $ques
    result=$?
    (( num_ques=num_ques+1 ))
    if [[ $result == 1 ]]; then
        (( num_correct += 1 ))
    fi
    echo                            # Skip a line between questions
    sleep ${QUIZDELAY:=1}
done

summarize                           # Step 6
exit 0
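The order in which ask reads lines implies a layout for a question file: the question, the number of choices, the choices themselves, and then the correct answer. The following sketch builds one such file and reads it back through file descriptor 3 the same way ask does. The question text, the temporary filename, and the variable qfile are assumptions made for this illustration; they do not come from the quiz data files themselves.

```shell
#!/bin/bash
# Hypothetical question file, laid out in the order the ask function reads
# it: question, number of choices, the choices, then the correct answer.
qfile=$(mktemp /tmp/quizdemo.XXXXXX)
cat > "$qfile" <<'EOF'
Which file descriptor is standard error?
3
file descriptor 0
file descriptor 1
file descriptor 2
file descriptor 2
EOF

exec 3< "$qfile"                # Open the file on descriptor 3
read -u3 ques
read -u3 num_opts
choices=()
index=0
while (( index < num_opts )); do
    read -u3 next_choice
    choices=("${choices[@]}" "$next_choice")    # One element per line
    ((index += 1))
done
read -u3 correct_answer
exec 3<&-                       # Close descriptor 3
rm -f "$qfile"

echo "$ques"
echo "${#choices[@]} choices; answer: $correct_answer"
```

Each choice here contains whitespace, yet the array holds exactly three elements because every line read becomes a single element. That is the problem the array form of choices is meant to avoid.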
Chapter Summary
The shell is a programming language. Programs written in this language are called shell scripts, or simply scripts. Shell scripts provide the decision and looping control structures present in high-level programming languages while allowing easy access to system utilities and user programs. Shell scripts can use functions to modularize and simplify complex tasks.

Control structures
The control structures that use decisions to select alternatives are if...then, if...then...else, and if...then...elif. The case control structure provides a multiway branch and can be used when you want to express alternatives using a simple pattern-matching syntax. The looping control structures are for...in, for, until, and while. These structures perform one or more tasks repetitively. The break and continue control structures alter control within loops: break transfers control out of a loop, and continue transfers control immediately to the top of a loop. The Here document allows input to a command in a shell script to come from within the script itself.

File descriptors
The Bourne Again Shell provides the ability to manipulate file descriptors. Coupled with the read and echo builtins, file descriptors allow shell scripts to have as much control over input and output as programs written in lower-level languages.

Variables
You assign attributes, such as readonly, to bash variables using the typeset builtin. The Bourne Again Shell provides operators to perform pattern matching on variables, provide default values for variables, and evaluate the length of variables. This shell also supports array variables and local variables for functions and provides built-in integer arithmetic capability, using the let builtin and an expression syntax similar to the C programming language.

Builtins
Bourne Again Shell builtins include type, read, exec, trap, kill, and getopts. The type builtin displays information about a command, including its location; read allows a script to accept user input. The exec builtin executes a command without creating a new process. The new command overlays the current process, assuming the same environment and PID number of that process. This builtin executes user programs and other Mac OS X commands when it is not necessary to return control to the calling process. The trap builtin catches a signal sent by Mac OS X to the process running the script and allows you to specify actions to be taken upon receipt of one or more signals. You can use this builtin to cause a script to ignore the signal that is sent when the user presses the interrupt key. The kill builtin allows you to terminate a running program. The getopts builtin parses command line arguments, making it easier to write programs that follow standard conventions for command line arguments and options.

Utilities in scripts
In addition to using control structures, builtins, and functions, shell scripts generally call utilities. The find utility, for instance, is commonplace in shell scripts that search for files in the system hierarchy and can perform a vast range of tasks, from simple to complex. A well-written shell script adheres to standard programming practices, such as specifying the shell to execute the script on the first line of the script, verifying the number and type of arguments that the script is called with, displaying a standard usage message to report command line errors, and redirecting all informational messages to standard error.

Expressions
There are two basic types of expressions: arithmetic and logical. Arithmetic expressions allow you to do arithmetic on constants and variables, yielding a numeric result. Logical (Boolean) expressions compare expressions or strings, or test conditions to yield a true or false result. As with all decisions within UNIX shell scripts, a true status is represented by the value zero; false, by any nonzero value.

Exercises
Advanced Exercises