Sed
Overview
Programmers are accustomed to source editors. The iSeries has Source Entry Utility (SEU), the Edit File command (EDTF), and Code/400. In addition to many free and commercial text editors, all Windows-based PCs have the Notepad editor.
The commands and methods vary, but all of these editors are interactive programs. Sed, on the other hand, is a noninteractive editor. It applies one or more commands to a stream of text and writes the edited text to an output stream in the process. Since sed does not directly change the input text, it is a nondestructive editor. Otherwise, sed does the same sorts of things that interactive editors do. You can replace text strings with other strings, insert new lines, delete lines, and so forth. The difference is that you can do it in a batch-type mode.
There are some good reasons to use sed :
- "You don't have to be present to win." If you need to modify an FTP script at 1 A.M. daily to include put commands for all the files in a certain subdirectory, sed will do the job for you while you sleep.
- Sed is a filter. It can be used when editing data in any Qshell command pipeline or command-substitution statement. No scripts or data files are needed.
- Sed doesn't mind modifying extremely large files that interactive editors complain about. Only the current input record and holding buffer are in memory at any time.
- Sed will happily apply a long sequence of edits that you might not want to type on a recurring basis.
- Sed can be the basis for source-code generation utilities.
- Sed's pattern-matching ability is more powerful than the search functions of the iSeries' interactive editors.
- After you get comfortable with the sed utility, you can make changes more quickly than with interactive editors.
Sed uses two sources of input. Data to be edited is read from the standard input stream or from one or more stream files. A list of commands is read from the command line or from disk files. A list of sed commands stored in a disk file is called a sed script.
Sed reads a line of input into a buffer and applies all appropriate editing commands. Once all edits have been made, it writes the modified line to stdout and reads the next line of input. Figure 18.1 shows a pseudocode summary of this logic.
while not end of input
    copy the next input record into the pattern space
    for each editing command
        if the editing command applies to this input record
            apply the editing command to the pattern space
        end if
    end for
    if write to stdout is not prohibited by command or option
        write the pattern space to stdout
    end if
end while
Figure 18.1: This is the basic sed processing logic.
Sed uses two buffers: the pattern space and the holding buffer. The pattern space is the buffer into which an input record is read. Normally, the pattern space contains one record, but you can use the N function to append additional records to it. All edits take place on the contents of the pattern space.
The holding buffer is an additional buffer into which the pattern space may be placed for later retrieval.
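As a minimal sketch of the two buffers working together, the following command swaps the first two lines of its input. The three-line sample input is made up for illustration, and the sketch assumes a POSIX-style shell with printf available.

```shell
# Line 1: h copies the pattern space to the holding buffer, then d suppresses output.
# Line 2: G appends the holding buffer (the saved line 1) to the pattern space.
swapped=$(printf 'one\ntwo\nthree\n' | sed -e '1{h; d; }' -e '2G')
echo "$swapped"
```

Line 3 passes through untouched because neither address selects it.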
Forms of the Sed Command
The syntax of the sed command comes in two forms, as follows:
sed [-an] command [file ...]
sed [-an] [-e command] [-f command_file] [file ...]
You will use one form or the other, but not both, in one command. The first form is the simpler one: one sed command is used to edit the input file(s). If no input files are used, sed edits data read from stdin. Here is a command of this type:
sed '/A/d' goodoleboys.txt
The sed command is /A/d. The input file is goodoleboys.txt, and the modified data is written to stdout.
You need to use the second form when you want to apply more than one editing command to the input. This type of command is shown in the following example:
sed -e 's/Daisy/Ethel/' -f seddata1.txt goodoleboys.txt
The e option allows you to include editing commands within the command string itself, while the f option tells sed to read editing commands from a file. In this example, sed reads commands from two places. First, it applies the command following the e option. Then, it applies the commands in the seddata1.txt file. If the e option followed the f option, sed would apply the commands in seddata1.txt file first.
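The effect of option order can be sketched with a one-line input file and a one-command script file. The scratch directory and the file names input.txt and cmds.sed are hypothetical, and the sketch assumes mktemp is available in the shell.

```shell
# Hypothetical scratch files, used only for this sketch.
dir=$(mktemp -d)
printf 'Daisy\n' > "$dir/input.txt"
printf 's/Ethel/Lucy/\n' > "$dir/cmds.sed"

# -e first: Daisy becomes Ethel, then the script file turns Ethel into Lucy.
e_first=$(sed -e 's/Daisy/Ethel/' -f "$dir/cmds.sed" "$dir/input.txt")

# -f first: the script file finds no Ethel yet, so only Daisy -> Ethel takes effect.
f_first=$(sed -f "$dir/cmds.sed" -e 's/Daisy/Ethel/' "$dir/input.txt")

echo "$e_first"
echo "$f_first"
rm -r "$dir"
```

The same two commands give different results purely because of the order in which sed applies them.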
You can use the filter form of sed when you're doing interactive or scripted Qshell work. Sometimes, a sed solution is easier to write correctly than other forms of Qshell variable expansion and substitution.
Here is an example of sed as a filter:
for i in *.txt ; do j=$(echo $i | sed -e 's/\.txt$/.new/'); echo mv $i $j; done
This example sends all text-file names in the current directory to sed in a command substitution. Sed changes each name so that it has an extension of .new , and assigns the result to the j variable. The loop then generates a command using the mv utility to rename the original text file to the new file. The mv command isn't executed directly; instead, echo is used to display the mv command to standard output. Displaying generated Qshell commands is always a prudent debugging step before executing them.
Sed Options
The four options for sed are listed in Table 18.1.
Option | Description |
---|---|
a | Delay opening of files to which output is directed with the w command. |
e | Read a sed command from the following argument. |
f | Read sed commands from the file named in the following argument. |
n | Do not automatically write to stdout. |
The a option delays the opening of files that are to be overwritten until the last possible moment. Normally, Qshell clears files that are to be overwritten before sed begins to run. This means that files will be cleared that might not be written to. The a option ensures that a file is not cleared unless it is written to.
The e option, which is repeatable, precedes a sed command. The f option, which is also repeatable, precedes the name of a file in which sed commands are stored. The e and f options are not mutually exclusive. As you saw in the previous example, you may use both of them in the same command.
The n option tells sed not to automatically write the contents of the pattern space to stdout after applying all editing commands.
Sed Commands
A sed command consists of three parts: the address, a function, and arguments. You may precede the address and function parts of the command with white space. As the following syntax shows, the only required part is the function:
[address[,address]] function [arguments]
Let's look at each of the three parts in more detail.
Address
The address identifies the lines to be selected. Depending on the function, you may specify no address, a single address, or two addresses separated from one another by a comma.
If you do not specify an address, all lines of the input file are selected for editing. If you specify one address, only the lines matching the address are edited. If you specify two addresses, sed edits one or more ranges of lines.
Each address can be
- A line number, from all input files numbered consecutively
- A dollar sign, to indicate the last line of the last input file
- A regular expression delimited by the forward-slash character, /
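A small sketch of the first two address types, assuming two hypothetical two-line files, a.txt and b.txt, in a scratch directory created with mktemp:

```shell
# Hypothetical scratch files, used only for this sketch.
dir=$(mktemp -d)
printf 'a1\na2\n' > "$dir/a.txt"
printf 'b1\nb2\n' > "$dir/b.txt"

# Lines are numbered consecutively across both files, so line 3 is b1.
third=$(sed -n '3p' "$dir/a.txt" "$dir/b.txt")

# The dollar sign selects the last line of the last file only.
last=$(sed -n '$p' "$dir/a.txt" "$dir/b.txt")

echo "$third $last"
rm -r "$dir"
```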
The regular expressions are similar to the basic regular expressions that grep and other utilities recognize, but sed adds two features of its own:
- The escape sequence \n matches the newline character.
- Any character other than a backslash or newline may be used as a delimiter in regular expressions. Any delimiter may be escaped with a backslash.
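Alternate delimiters are handy when the text being matched contains slashes. The sketch below uses a percent sign; the path names are made up for illustration. In standard sed, an address that uses an alternate delimiter is introduced by a backslash, while the s function simply takes whatever character follows it.

```shell
# In an address, a leading backslash introduces the alternate delimiter (% here).
matched=$(printf '/home/smith\n/tmp\n' | sed -n '\%^/home/%p')

# In the s function, the character after s is the delimiter; no backslash is needed.
renamed=$(printf '/home/smith\n' | sed 's%/home/%/users/%')

echo "$matched"
echo "$renamed"
```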
Table 18.2 lists the regular-expression metacharacters for sed.
Metacharacter | Description |
---|---|
. (period) | Match any character except end-of-line. |
* | Match zero or more occurrences of the preceding character. |
^ | Match from the beginning of the line. |
$ | Match from the end of the line. |
[ ] | Match any character within the brackets. Ranges may be specified with a hyphen. |
[^ ] | Negate the groups or ranges of characters in the brackets. The caret must be the first character within the brackets. |
\{m\} | Match exactly m occurrences of the preceding character. |
\{m,\} | Match m or more occurrences of the preceding character. |
\{m,n\} | Match m to n occurrences of the preceding character. |
\ | Turn off the special meaning of the following character. |
\( \) | Define a back reference to save matched characters as a pattern. The saved pattern can be referenced with a backslash followed by a number. |
// | Match the last-used regular expression. |
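Two of the less familiar metacharacters can be tried on a few lines of made-up sample data. This is only a sketch; the three input records are assumptions chosen to resemble the phone numbers used later in the chapter.

```shell
data=$(printf 'Bubba 444-1111\nAmos 333-1119\nBill 333-4444\n')

# \{4\} asks for four consecutive occurrences of the preceding character.
four_ones=$(echo "$data" | sed -n '/1\{4\}/p')

# An empty regular expression (//) reuses the last one used, here 333.
reused=$(echo "$data" | sed -n '/333/s//***/p')

echo "$four_ones"
echo "$reused"
```

The asterisks in the second command need no escaping because they appear in the replacement, not in the regular expression.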
Function and Arguments
The function is the command itself. It tells sed what to do with the input record. All functions are one character long. They are listed in Table 18.3.
Function | Arguments | Description | Maximum Addresses |
---|---|---|---|
a | text | Write text to stdout after writing the pattern space. | 1 |
b | label (optional) | Branch to a label. If a label is not specified, branch to the end of the list of functions. | 2 |
c | text | Replace line(s) with new text. | 2 |
d | | Do not write the pattern space to stdout. | 2 |
D | | Delete the pattern space up to and including the first newline character. | 2 |
g | | Copy the holding buffer to the pattern space. | 2 |
G | | Append the holding buffer to the pattern space. | 2 |
h | | Copy the pattern space to the holding buffer. | 2 |
H | | Append the pattern space to the holding buffer. | 2 |
i | text | Write text to stdout before writing the pattern space. | 1 |
l (ell) | | Replace nonprintable characters with visual representations. | 2 |
n | | Write the pattern space to stdout (unless the -n option was specified), and read the next line of input into the pattern space. | 2 |
N | | Append the next input line to the pattern space. | 2 |
p | | Print the pattern space to stdout immediately. | 2 |
P | | Print the pattern space, up to and including the first newline character, immediately. | 2 |
q | | Terminate the editing session after processing the current input record. | 1 |
r | file name | Read a file into stdout. | 1 |
s | search string, replacement string, flags | Substitute one string for another. | 2 |
t | | Branch if substitutions have been made. | 2 |
w | file name | Write the pattern space to a file. | 2 |
x | | Exchange the contents of the holding buffer and the pattern space. | 2 |
y | | Replace each character in a set with the corresponding character of another set. | 2 |
= | | Write the line number to stdout. | 1 |
: (colon) | label | Define a label as a target for a branch. | |
You may negate a function by preceding it with an exclamation point. The following example deletes all lines except line 2:
sed '2!d' goodoleboys.txt
You may also include more than one function for an address. Enclose the group of functions in braces, and follow each function with a semicolon and at least one space, as shown here:
sed '2{h; d; }' goodoleboys.txt
When sed reads line 2, it executes two functions, h and d .
Instead of a semicolon and space, you may also separate functions with newline characters, like this:
sed '2{
> h
> d
> }' goodoleboys.txt
Here each function is listed on its own line, so the functions are separated by newline characters rather than by semicolons. The greater-than signs are the Qshell secondary prompt character.
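The grouping and negation rules can be checked against a three-line input. The sample input and the p and s functions used inside the braces are chosen only for illustration.

```shell
input=$(printf 'one\ntwo\nthree\n')

# Semicolon form: on line 2, print the line and then change it.
one_line=$(echo "$input" | sed '2{p; s/two/TWO/; }')

# Newline form of the same group of functions.
multi_line=$(echo "$input" | sed '2{
p
s/two/TWO/
}')

# Negation: delete every line except line 2.
only_two=$(echo "$input" | sed '2!d')

echo "$one_line"
echo "$only_two"
```

Both grouped forms produce the same output; the choice between them is a matter of readability.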
Examples
The examples on the following pages are provided to illustrate how sed works. The input data is taken from file goodoleboys.txt, shown in Figure 18.2.
cat goodoleboys.txt
Name      Born     Phone    Dog      Wife      Shotgun Paid
========= ======== ======== ======== ========= ======= =====
Chuck     Dec 25   444-2345 Blue     Mary Sue  12      .50
Bubba     Oct 13   444-1111 Buck     Mary Jean 12
Billy Bob June 11  444-4340 Leotis   Lisa Sue  12
Amos      Jan 4    333-1119 Amos     Abigail   20
Otis      Sept 17  444-8000 Ol' Sal  Sally     12
Claude    May 31   333-4340 Blue     Etheline  12
Roscoe    Feb 2    444-2234 Rover    Alice Jean 410
Arlis     June 19  444-1314 Redeye   Suzy Beth 12      .75
Junior    April 30 BR-549   Percival Lilly Faye 12
Bill      Feb 29   333-4444 Daisy    Daisy     20
Ernest T. ??       none     none     none      none
Figure 18.2: The goodoleboys.txt file is the basis for the examples in this chapter.
Delete
The Delete command, d , does not actually delete records from the input files. Instead, it omits them from the output stream. When sed encounters a d command, it ignores the remainder of the editing commands and continues immediately with the next input record. For example, the command in Figure 18.3 deletes the first two lines. Since sed prints lines by default, all other lines are written to stdout .
sed '1,2d' goodoleboys.txt
Chuck     Dec 25   444-2345 Blue     Mary Sue  12      .50
Bubba     Oct 13   444-1111 Buck     Mary Jean 12
Billy Bob June 11  444-4340 Leotis   Lisa Sue  12
Amos      Jan 4    333-1119 Amos     Abigail   20
Otis      Sept 17  444-8000 Ol' Sal  Sally     12
Claude    May 31   333-4340 Blue     Etheline  12
Roscoe    Feb 2    444-2234 Rover    Alice Jean 410
Arlis     June 19  444-1314 Redeye   Suzy Beth 12      .75
Junior    April 30 BR-549   Percival Lilly Faye 12
Bill      Feb 29   333-4444 Daisy    Daisy     20
Ernest T. ??       none     none     none      none
Figure 18.3: Delete the first two lines.
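The way d abandons the remaining editing commands for a record can be seen in a small sketch. The three-line input and the marker text are made up for illustration; the i function is used only because it writes to stdout the moment it runs.

```shell
input=$(printf 'one\ntwo\nthree\n')

# d comes first, so the i function is never reached for line 2.
d_first=$(echo "$input" | sed -e '2d' -e '2i\
marker')

# i comes first, so the marker is written before line 2 is deleted.
i_first=$(echo "$input" | sed -e '2i\
marker' -e '2d')

echo "$d_first"
echo "$i_first"
```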
In Figure 18.4, the exclamation point is used for negation, so all lines that do not contain the string 444- are deleted.
sed '/444-/!d' goodoleboys.txt
Chuck     Dec 25   444-2345 Blue     Mary Sue  12      .50
Bubba     Oct 13   444-1111 Buck     Mary Jean 12
Billy Bob June 11  444-4340 Leotis   Lisa Sue  12
Otis      Sept 17  444-8000 Ol' Sal  Sally     12
Roscoe    Feb 2    444-2234 Rover    Alice Jean 410
Arlis     June 19  444-1314 Redeye   Suzy Beth 12      .75
Figure 18.4: Delete all lines that do not contain the string 444- .
The dollar sign by itself, specified as an address as shown in Figure 18.5, symbolizes the last record of the file. The dollar sign preceded by a backslash, as shown in Figure 18.6, is interpreted literally instead of as the symbol for end-of-line.
sed '$d' goodoleboys.txt
Name      Born     Phone    Dog      Wife      Shotgun Paid
========= ======== ======== ======== ========= ======= =====
Chuck     Dec 25   444-2345 Blue     Mary Sue  12      .50
Bubba     Oct 13   444-1111 Buck     Mary Jean 12
Billy Bob June 11  444-4340 Leotis   Lisa Sue  12
Amos      Jan 4    333-1119 Amos     Abigail   20
Otis      Sept 17  444-8000 Ol' Sal  Sally     12
Claude    May 31   333-4340 Blue     Etheline  12
Roscoe    Feb 2    444-2234 Rover    Alice Jean 410
Arlis     June 19  444-1314 Redeye   Suzy Beth 12      .75
Junior    April 30 BR-549   Percival Lilly Faye 12
Bill      Feb 29   333-4444 Daisy    Daisy     20
Figure 18.5: Delete the last record of the file.
sed '/\$/d' goodoleboys.txt
Name      Born     Phone    Dog      Wife      Shotgun Paid
========= ======== ======== ======== ========= ======= =====
Bubba     Oct 13   444-1111 Buck     Mary Jean 12
Billy Bob June 11  444-4340 Leotis   Lisa Sue  12
Amos      Jan 4    333-1119 Amos     Abigail   20
Claude    May 31   333-4340 Blue     Etheline  12
Roscoe    Feb 2    444-2234 Rover    Alice Jean 410
Junior    April 30 BR-549   Percival Lilly Faye 12
Bill      Feb 29   333-4444 Daisy    Daisy     20
Ernest T. ??       none     none     none      none
Figure 18.6: Delete the lines that contain a dollar sign.
In Figure 18.7, sed deletes the range between a record containing Jun and one containing Sal . Sed finds Jun in Billy Bob's record and Sal in Otis's record. It resumes the search and finds Jun in Arlis' record. Since no following records contain Sal , sed deletes everything through the end of the file.
sed '/Jun/,/Sal/d' goodoleboys.txt
Name      Born     Phone    Dog      Wife      Shotgun Paid
========= ======== ======== ======== ========= ======= =====
Chuck     Dec 25   444-2345 Blue     Mary Sue  12      .50
Bubba     Oct 13   444-1111 Buck     Mary Jean 12
Claude    May 31   333-4340 Blue     Etheline  12
Roscoe    Feb 2    444-2234 Rover    Alice Jean 410
Figure 18.7: Delete each range between a record containing Jun and one containing Sal .
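The restart behavior of a two-address range is easier to see on a tiny input. In this sketch the words start and end are made-up markers; the second start has no matching end, so the range runs to the end of the input.

```shell
# Input: a, start, b, end, c, start, d (one per line).
out=$(printf 'a\nstart\nb\nend\nc\nstart\nd\n' | sed '/start/,/end/d')
echo "$out"
```

Only a and c survive: the first range removes start through end, and the unterminated second range removes everything from the second start onward.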
The command in Figure 18.8 starts deleting at the third record, and continues through the first record containing the string BR.
sed '3,/BR/d' goodoleboys.txt
Name      Born     Phone    Dog      Wife      Shotgun Paid
========= ======== ======== ======== ========= ======= =====
Bill      Feb 29   333-4444 Daisy    Daisy     20
Ernest T. ??       none     none     none      none
Figure 18.8: Delete from the third record through the first record containing the string BR (Junior's record).
In the final example of deletion, the command in Figure 18.9 simply deletes all records that begin with B .
sed '/^B/d' goodoleboys.txt
Name      Born     Phone    Dog      Wife      Shotgun Paid
========= ======== ======== ======== ========= ======= =====
Chuck     Dec 25   444-2345 Blue     Mary Sue  12      .50
Amos      Jan 4    333-1119 Amos     Abigail   20
Otis      Sept 17  444-8000 Ol' Sal  Sally     12
Claude    May 31   333-4340 Blue     Etheline  12
Roscoe    Feb 2    444-2234 Rover    Alice Jean 410
Arlis     June 19  444-1314 Redeye   Suzy Beth 12      .75
Junior    April 30 BR-549   Percival Lilly Faye 12
Ernest T. ??       none     none     none      none
Figure 18.9: Delete each record that begins with a B .
Substitute
The Substitute command, s , replaces data in the pattern space. Here is the syntax of this command:
s/regular-expression/replacement/flags
The s command is followed by a delimiter of the programmer's choosing, which is often the forward slash. (The backslash and newline characters are not allowed to serve as delimiters.) The value to be replaced follows the first delimiter. This is a regular expression, similar to the regular expressions many Unix utilities use.
The replacement string follows another delimiter. The replacement string can include two special substitution values:
- An ampersand (&) stands for the matched characters.
- A backslash followed by a single digit indicates a back reference to a matching pattern that was saved with the \( \) metacharacter pair.
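Both substitution values can be sketched on a single made-up record. The record and the bracket and swap transformations are illustrative assumptions only.

```shell
record='Chuck 444-2345'

# & stands for the whole match: wrap the phone number in brackets.
bracketed=$(echo "$record" | sed 's/[0-9]\{3\}-[0-9]\{4\}/[&]/')

# \1 and \2 recall the two saved groups: swap the halves of the number.
reordered=$(echo "$record" | sed 's/\([0-9]\{3\}\)-\([0-9]\{4\}\)/\2-\1/')

echo "$bracketed"
echo "$reordered"
```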
The permitted flags are listed in Table 18.4.
Flag | Description |
---|---|
g | Global substitution; make all possible substitutions on each line. |
p | Write the pattern space to stdout. |
w file | Write the pattern space to a file. |
0 to 9 | Substitute the nth occurrence of the search string only. |
If sed replaces one string with another, the substitution flags take effect. If the matched pattern and replacement string are equivalent, a replacement is still considered to have been made.
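A short sketch of the flags in action, using made-up one-line inputs. The last command illustrates the point about equivalent strings: replacing a with a still counts as a substitution, so the p flag fires.

```shell
line='Chuck 444-2345'

# A numeric flag replaces only that occurrence: here, the second 4.
second=$(echo "$line" | sed 's/4/*/2')

# g replaces every occurrence.
all=$(echo "$line" | sed 's/4/*/g')

# With -n, the p flag prints only lines on which a substitution was made.
changed=$(printf 'Chuck 444\nAmos 333\n' | sed -n 's/4/*/p')

# Replacing a string with itself is still a substitution, so p prints the line.
same=$(printf 'abc\nxyz\n' | sed -n 's/a/a/p')

echo "$second"
echo "$all"
echo "$changed"
echo "$same"
```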
The command in Figure 18.10 changes the first B in each record to an asterisk. It is not necessary to escape the asterisk, since it is in the replacement area and therefore not interpreted as a metacharacter. Notice that only the first B in the records for Bubba and Billy Bob was changed to an asterisk.
sed 's/B/*/' goodoleboys.txt
Name      *orn     Phone    Dog      Wife      Shotgun Paid
========= ======== ======== ======== ========= ======= =====
Chuck     Dec 25   444-2345 *lue     Mary Sue  12      .50
*ubba     Oct 13   444-1111 Buck     Mary Jean 12
*illy Bob June 11  444-4340 Leotis   Lisa Sue  12
Amos      Jan 4    333-1119 Amos     Abigail   20
Otis      Sept 17  444-8000 Ol' Sal  Sally     12
Claude    May 31   333-4340 *lue     Etheline  12
Roscoe    Feb 2    444-2234 Rover    Alice Jean 410
Arlis     June 19  444-1314 Redeye   Suzy *eth 12      .75
Junior    April 30 *R-549   Percival Lilly Faye 12
*ill      Feb 29   333-4444 Daisy    Daisy     20
Ernest T. ??       none     none     none      none
Figure 18.10: Change the first B in each record to an asterisk.
Figures 18.11 and 18.12 produce the same result: every B in each record is changed to an asterisk. However, Figure 18.12 uses a pound sign as the delimiter instead of a forward slash.
sed 's/B/*/g' goodoleboys.txt
Name      *orn     Phone    Dog      Wife      Shotgun Paid
========= ======== ======== ======== ========= ======= =====
Chuck     Dec 25   444-2345 *lue     Mary Sue  12      .50
*ubba     Oct 13   444-1111 *uck     Mary Jean 12
*illy *ob June 11  444-4340 Leotis   Lisa Sue  12
Amos      Jan 4    333-1119 Amos     Abigail   20
Otis      Sept 17  444-8000 Ol' Sal  Sally     12
Claude    May 31   333-4340 *lue     Etheline  12
Roscoe    Feb 2    444-2234 Rover    Alice Jean 410
Arlis     June 19  444-1314 Redeye   Suzy *eth 12      .75
Junior    April 30 *R-549   Percival Lilly Faye 12
*ill      Feb 29   333-4444 Daisy    Daisy     20
Ernest T. ??       none     none     none      none
Figure 18.11: The g flag tells sed to substitute globally in the pattern space.
sed 's#B#*#g' goodoleboys.txt
Name      *orn     Phone    Dog      Wife      Shotgun Paid
========= ======== ======== ======== ========= ======= =====
Chuck     Dec 25   444-2345 *lue     Mary Sue  12      .50
*ubba     Oct 13   444-1111 *uck     Mary Jean 12
*illy *ob June 11  444-4340 Leotis   Lisa Sue  12
Amos      Jan 4    333-1119 Amos     Abigail   20
Otis      Sept 17  444-8000 Ol' Sal  Sally     12
Claude    May 31   333-4340 *lue     Etheline  12
Roscoe    Feb 2    444-2234 Rover    Alice Jean 410
Arlis     June 19  444-1314 Redeye   Suzy *eth 12      .75
Junior    April 30 *R-549   Percival Lilly Faye 12
*ill      Feb 29   333-4444 Daisy    Daisy     20
Ernest T. ??       none     none     none      none
Figure 18.12: Use the pound sign (#) as a delimiter.
The command in Figure 18.13 changes Chuck's dog's name from Blue to Petey. The string Chuck is the address, s is the command, and Blue and Petey are the arguments to the command.
sed '/Chuck/s/Blue /Petey/' goodoleboys.txt
Name      Born     Phone    Dog      Wife      Shotgun Paid
========= ======== ======== ======== ========= ======= =====
Chuck     Dec 25   444-2345 Petey    Mary Sue  12      .50
Bubba     Oct 13   444-1111 Buck     Mary Jean 12
Billy Bob June 11  444-4340 Leotis   Lisa Sue  12
Amos      Jan 4    333-1119 Amos     Abigail   20
Otis      Sept 17  444-8000 Ol' Sal  Sally     12
Claude    May 31   333-4340 Blue     Etheline  12
Roscoe    Feb 2    444-2234 Rover    Alice Jean 410
Arlis     June 19  444-1314 Redeye   Suzy Beth 12      .75
Junior    April 30 BR-549   Percival Lilly Faye 12
Bill      Feb 29   333-4444 Daisy    Daisy     20
Ernest T. ??       none     none     none      none
Figure 18.13: Change Chuck's dog's name from Blue to Petey .
In Figure 18.14, the string (12/02) is appended to every record that ends with a period followed by two digits. The period character is escaped with a backslash in the substitution expression because it is a metacharacter. If the forward slash had been used as the substitution delimiter, as in the previous example, it would also have to be escaped. However, the pound sign (#) is used here as the substitution delimiter, so it is not necessary to escape the slash.
sed 's#\.[0-9][0-9]$#& (12/02)#' goodoleboys.txt
Name      Born     Phone    Dog      Wife      Shotgun Paid
========= ======== ======== ======== ========= ======= =====
Chuck     Dec 25   444-2345 Blue     Mary Sue  12      .50 (12/02)
Bubba     Oct 13   444-1111 Buck     Mary Jean 12
Billy Bob June 11  444-4340 Leotis   Lisa Sue  12
Amos      Jan 4    333-1119 Amos     Abigail   20
Otis      Sept 17  444-8000 Ol' Sal  Sally     12
Claude    May 31   333-4340 Blue     Etheline  12
Roscoe    Feb 2    444-2234 Rover    Alice Jean 410
Arlis     June 19  444-1314 Redeye   Suzy Beth 12      .75 (12/02)
Junior    April 30 BR-549   Percival Lilly Faye 12
Bill      Feb 29   333-4444 Daisy    Daisy     20
Ernest T. ??       none     none     none      none
Figure 18.14: Add the string (12/02) to every record that ends with a period followed by two digits.
The command in Figure 18.15 appends the string ** Committee ** to the end of each line in a range that starts with the first record containing Blue and ends with the next record containing Blue.
sed '/Blue/,/Blue/s/$/ ** Committee **/' goodoleboys.txt
Name      Born     Phone    Dog      Wife      Shotgun Paid
========= ======== ======== ======== ========= ======= =====
Chuck     Dec 25   444-2345 Blue     Mary Sue  12      .50 ** Committee **
Bubba     Oct 13   444-1111 Buck     Mary Jean 12 ** Committee **
Billy Bob June 11  444-4340 Leotis   Lisa Sue  12 ** Committee **
Amos      Jan 4    333-1119 Amos     Abigail   20 ** Committee **
Otis      Sept 17  444-8000 Ol' Sal  Sally     12 ** Committee **
Claude    May 31   333-4340 Blue     Etheline  12 ** Committee **
Roscoe    Feb 2    444-2234 Rover    Alice Jean 410
Arlis     June 19  444-1314 Redeye   Suzy Beth 12      .75
Junior    April 30 BR-549   Percival Lilly Faye 12
Bill      Feb 29   333-4444 Daisy    Daisy     20
Ernest T. ??       none     none     none      none
Figure 18.15: From the first record containing Blue through the next record containing Blue, add the string ** Committee ** to the end of the line.
The command in Figure 18.16 has been entered on two lines. Qshell displays the secondary prompt, >, to show that the command is not complete. Because the Enter key was pressed between the two lines, there is a newline character in the replacement pattern. The backslash character that ends the first command line is necessary to "escape" the Enter key.
sed 's/ \([A-Z]\)/\
> \1/' goodoleboys.txt
Name
Born     Phone    Dog      Wife      Shotgun Paid
========= ======== ======== ======== ========= ======= =====
Chuck
Dec 25   444-2345 Blue     Mary Sue  12      .50
Bubba
Oct 13   444-1111 Buck     Mary Jean 12
Billy
Bob June 11  444-4340 Leotis   Lisa Sue  12
Amos
Jan 4    333-1119 Amos     Abigail   20
Otis
Sept 17  444-8000 Ol' Sal  Sally     12
Claude
May 31   333-4340 Blue     Etheline  12
Roscoe
Feb 2    444-2234 Rover    Alice Jean 410
Arlis
June 19  444-1314 Redeye   Suzy Beth 12      .75
Junior
April 30 BR-549   Percival Lilly Faye 12
Bill
Feb 29   333-4444 Daisy    Daisy     20
Ernest
T. ??       none     none     none      none
Figure 18.16: For each input record, sed looks for the first blank that is followed by an uppercase letter. The \( \) combination tells sed to save the uppercase letter as pattern 1. Sed substitutes a newline followed by the letter saved in pattern 1.
In Figure 18.17, sed looks for capital letters preceded by spaces. When it finds such a combination, it replaces the space with a newline character.
sed 's/ \([A-Z]\)/\
> \1/g' goodoleboys.txt
Name
Born
Phone
Dog
Wife
Shotgun
Paid
========= ======== ======== ======== ========= ======= =====
Chuck
Dec 25   444-2345
Blue
Mary
Sue  12      .50
Bubba
Oct 13   444-1111
Buck
Mary
Jean 12
Billy
Bob
June 11  444-4340
Leotis
Lisa
Sue  12
Amos
Jan 4    333-1119
Amos
Abigail   20
Otis
Sept 17  444-8000
Ol'
Sal
Sally     12
Claude
May 31   333-4340
Blue
Etheline  12
Roscoe
Feb 2    444-2234
Rover
Alice
Jean 410
Arlis
June 19  444-1314
Redeye
Suzy
Beth 12      .75
Junior
April 30
BR-549
Percival
Lilly
Faye 12
Bill
Feb 29   333-4444
Daisy
Daisy     20
Ernest
T. ??       none     none     none      none
Figure 18.17: Replace the space before an uppercase letter with a newline character.
Print
By default, sed prints the pattern space to stdout after processing all applicable edit commands. You can disable the automatic write to stdout by specifying the -n option in the sed command.
Sed provides the Print command, p , to let you control printing if you prefer. When sed encounters the print command, it prints the line to stdout immediately. The pattern space remains as-is, and further edits may be made if desired.
There are two primary reasons to use the p command:
- You wish to write additional output to stdout.
- You wish to control when output is sent to stdout.
For example, in Figure 18.18, when sed finds a line that contains Bi , it forces the line to print. The result is that matched lines are printed twice.
sed '/Bi/p' goodoleboys.txt
Name      Born     Phone    Dog      Wife      Shotgun Paid
========= ======== ======== ======== ========= ======= =====
Chuck     Dec 25   444-2345 Blue     Mary Sue  12      .50
Bubba     Oct 13   444-1111 Buck     Mary Jean 12
Billy Bob June 11  444-4340 Leotis   Lisa Sue  12
Billy Bob June 11  444-4340 Leotis   Lisa Sue  12
Amos      Jan 4    333-1119 Amos     Abigail   20
Otis      Sept 17  444-8000 Ol' Sal  Sally     12
Claude    May 31   333-4340 Blue     Etheline  12
Roscoe    Feb 2    444-2234 Rover    Alice Jean 410
Arlis     June 19  444-1314 Redeye   Suzy Beth 12      .75
Junior    April 30 BR-549   Percival Lilly Faye 12
Bill      Feb 29   333-4444 Daisy    Daisy     20
Bill      Feb 29   333-4444 Daisy    Daisy     20
Ernest T. ??       none     none     none      none
Figure 18.18: Print lines containing Bi twice.
The following command, on the other hand, prints just lines containing the string Bi :
sed -n '/Bi/p' goodoleboys.txt
Billy Bob June 11  444-4340 Leotis   Lisa Sue  12
Bill      Feb 29   333-4444 Daisy    Daisy     20
Sed does not print lines to stdout automatically because of the n option. Another example of this option is shown in Figure 18.19.
sed -n '3,5{p; s/4/*/gp; }' goodoleboys.txt
Chuck     Dec 25   444-2345 Blue     Mary Sue  12      .50
Chuck     Dec 25   ***-23*5 Blue     Mary Sue  12      .50
Bubba     Oct 13   444-1111 Buck     Mary Jean 12
Bubba     Oct 13   ***-1111 Buck     Mary Jean 12
Billy Bob June 11  444-4340 Leotis   Lisa Sue  12
Billy Bob June 11  ***-*3*0 Leotis   Lisa Sue  12
Figure 18.19: Print lines 3 through 5 twice, before and after substituting asterisks for fours.
In Figure 18.20, when sed finds a record that contains either two or three adjacent fours, it prints the record as it was read, then substitutes an "at" sign (@) for the fours.
sed '/4\{2,3\}/{p; s/4\{2,3\}/@/; }' goodoleboys.txt
Name      Born     Phone    Dog      Wife      Shotgun Paid
========= ======== ======== ======== ========= ======= =====
Chuck     Dec 25   444-2345 Blue     Mary Sue  12      .50
Chuck     Dec 25   @-2345 Blue     Mary Sue  12      .50
Bubba     Oct 13   444-1111 Buck     Mary Jean 12
Bubba     Oct 13   @-1111 Buck     Mary Jean 12
Billy Bob June 11  444-4340 Leotis   Lisa Sue  12
Billy Bob June 11  @-4340 Leotis   Lisa Sue  12
Amos      Jan 4    333-1119 Amos     Abigail   20
Otis      Sept 17  444-8000 Ol' Sal  Sally     12
Otis      Sept 17  @-8000 Ol' Sal  Sally     12
Claude    May 31   333-4340 Blue     Etheline  12
Roscoe    Feb 2    444-2234 Rover    Alice Jean 410
Roscoe    Feb 2    @-2234 Rover    Alice Jean 410
Arlis     June 19  444-1314 Redeye   Suzy Beth 12      .75
Arlis     June 19  @-1314 Redeye   Suzy Beth 12      .75
Junior    April 30 BR-549   Percival Lilly Faye 12
Bill      Feb 29   333-4444 Daisy    Daisy     20
Bill      Feb 29   333-@4 Daisy    Daisy     20
Ernest T. ??       none     none     none      none
Figure 18.20: Substitute an "at" sign for double or triple fours.
Append and Insert
You can have sed add lines to the output stream, either before or after lines that match the address. Use the a command to append (write after the selected line) and the i command to insert (write before the selected line). Follow the command with a backslash and continue on the following line. End each line of the added text, except the last one, with a backslash, as shown in Figure 18.21.
sed '/ 410/a\
> ===> Needs a bigger gun
> ' goodoleboys.txt
Name      Born     Phone    Dog      Wife      Shotgun Paid
========= ======== ======== ======== ========= ======= =====
Chuck     Dec 25   444-2345 Blue     Mary Sue  12      .50
Bubba     Oct 13   444-1111 Buck     Mary Jean 12
Billy Bob June 11  444-4340 Leotis   Lisa Sue  12
Amos      Jan 4    333-1119 Amos     Abigail   20
Otis      Sept 17  444-8000 Ol' Sal  Sally     12
Claude    May 31   333-4340 Blue     Etheline  12
Roscoe    Feb 2    444-2234 Rover    Alice Jean 410
===> Needs a bigger gun
Arlis     June 19  444-1314 Redeye   Suzy Beth 12      .75
Junior    April 30 BR-549   Percival Lilly Faye 12
Bill      Feb 29   333-4444 Daisy    Daisy     20
Ernest T. ??       none     none     none      none
Figure 18.21: Append a line after every record containing a blank followed by 410 . Notice the backslash after the a command.
In Figure 18.22, two lines of text are inserted before each line containing the string June , to indicate that dues should be paid. Again, there is a backslash after the i command, as well as after the first of two lines of the added text.
sed '/ June/i\
> Needs to pay Dues\
> V V V V V V
> ' goodoleboys.txt
Name      Born     Phone    Dog      Wife      Shotgun Paid
========= ======== ======== ======== ========= ======= =====
Chuck     Dec 25   444-2345 Blue     Mary Sue  12      .50
Bubba     Oct 13   444-1111 Buck     Mary Jean 12
Needs to pay Dues
V V V V V V
Billy Bob June 11  444-4340 Leotis   Lisa Sue  12
Amos      Jan 4    333-1119 Amos     Abigail   20
Otis      Sept 17  444-8000 Ol' Sal  Sally     12
Claude    May 31   333-4340 Blue     Etheline  12
Roscoe    Feb 2    444-2234 Rover    Alice Jean 410
Needs to pay Dues
V V V V V V
Arlis     June 19  444-1314 Redeye   Suzy Beth 12      .75
Junior    April 30 BR-549   Percival Lilly Faye 12
Bill      Feb 29   333-4444 Daisy    Daisy     20
Ernest T. ??       none     none     none      none
Figure 18.22: Before each line containing the string June , insert two lines of text that indicate that dues should be paid.
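A common use of these two functions is to wrap output in a header and a trailer. The following is a minimal sketch with made-up text; it relies on line number 1 and the dollar-sign address rather than on a regular expression.

```shell
# i writes its text before line 1; a writes its text after the last line ($).
framed=$(printf 'one\ntwo\n' | sed -e '1i\
--- start ---' -e '$a\
--- end ---')
echo "$framed"
```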
Quit
The Quit command, q , terminates sed after the current record is processed . No more input lines are read. In Figure 18.23, sed quits the edit when it finds a record containing the string 410 . Notice that the record that matched the address pattern is the last one written to stdout.
sed '/410/q' goodoleboys.txt
Name      Born     Phone    Dog      Wife      Shotgun Paid
========= ======== ======== ======== ========= ======= =====
Chuck     Dec 25   444-2345 Blue     Mary Sue  12      .50
Bubba     Oct 13   444-1111 Buck     Mary Jean 12
Billy Bob June 11  444-4340 Leotis   Lisa Sue  12
Amos      Jan 4    333-1119 Amos     Abigail   20
Otis      Sept 17  444-8000 Ol' Sal  Sally     12
Claude    May 31   333-4340 Blue     Etheline  12
Roscoe    Feb 2    444-2234 Rover    Alice Jean 410
Figure 18.23: Quit the edit when the string 410 is found.
Figure 18.24 shows two editing commands. The second one replaces all fours with asterisks in records that contain the string Feb . Note that Roscoe's record is changed, but Bill's is not because the q command precedes the s command.
sed -e '/Dai/q' -e '/Feb/s/4/*/g' goodoleboys.txt
Name      Born     Phone    Dog      Wife      Shotgun Paid
========= ======== ======== ======== ========= ======= =====
Chuck     Dec 25   444-2345 Blue     Mary Sue  12      .50
Bubba     Oct 13   444-1111 Buck     Mary Jean 12
Billy Bob June 11  444-4340 Leotis   Lisa Sue  12
Amos      Jan 4    333-1119 Amos     Abigail   20
Otis      Sept 17  444-8000 Ol' Sal  Sally     12
Claude    May 31   333-4340 Blue     Etheline  12
Roscoe    Feb 2    ***-223* Rover    Alice Jean *10
Arlis     June 19  444-1314 Redeye   Suzy Beth 12      .75
Junior    April 30 BR-549   Percival Lilly Faye 12
Bill      Feb 29   333-4444 Daisy    Daisy     20
Figure 18.24: Replace all fours with asterisks in records that contain the string Feb .
In Figure 18.25, sed also applies two editing commands to each line of the goodoleboys.txt file. The first command looks for two adjacent 1 characters. If it finds a record containing 11, it quits the edit. The second command substitutes J for Per. If it makes such a substitution, sed writes the record to file temp.txt. Since the -a switch is not present, file temp.txt is cleared before editing begins. Because sed finds a record with two ones (Bubba's) before it finds a record with the value Per in it, nothing is written to temp.txt.
sed -e '/11/q' -e 's/Per/J/w temp.txt' goodoleboys.txt Name Born Phone Dog Wife Shotgun Paid ========= ======== ======== ======== ======== ======= ===== Chuck Dec 25 444-2345 Blue Mary Sue 12 .50 Bubba Oct 13 444-1111 Buck Mary Jean 12 /home/smith $ cat temp.txt /home/smith $
Figure 18.25: Apply two editing commands to each line of goodoleboys.txt.
Transform
The Transform command, y, replaces each character in a set with the corresponding character in another set. For example, in Figure 18.26, when sed finds a line that contains the string June, it replaces all lowercase letters with their uppercase equivalents.
sed \
> '/June/y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/' \
> goodoleboys.txt
Name      Born     Phone    Dog      Wife      Shotgun Paid
========= ======== ======== ======== ========= ======= =====
Chuck     Dec 25   444-2345 Blue     Mary Sue       12 .50
Bubba     Oct 13   444-1111 Buck     Mary Jean      12
BILLY BOB JUNE 11  444-4340 LEOTIS   LISA SUE       12
Amos      Jan 4    333-1119 Amos     Abigail        20
Otis      Sept 17  444-8000 Ol' Sal  Sally          12
Claude    May 31   333-4340 Blue     Etheline       12
Roscoe    Feb 2    444-2234 Rover    Alice Jean    410
ARLIS     JUNE 19  444-1314 REDEYE   SUZY BETH      12 .75
Junior    April 30 BR-549   Percival Lilly Faye     12
Bill      Feb 29   333-4444 Daisy    Daisy          20
Ernest T. ??       none     none     none         none
Figure 18.26: Replace lowercase letters with uppercase ones for records containing the string June .
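The mapping that y performs is strictly positional: the first character of the first set becomes the first character of the second set, and so on, so both sets must be the same length. The short sketch below uses made-up input to show the correspondence; it is an illustration only, not an example from the book.

```shell
# Each character in the first set maps to the character in the same
# position of the second set: a->x, b->y, c->z.
swapped=$(printf '%s\n' 'aabbcc cab' | sed 'y/abc/xyz/')

# With an address, only matching lines are transformed.
dotted=$(printf '%s\n' 'Chuck 444-2345' 'Junior BR-549' | sed '/^Chuck/y/-/./')

printf '%s\n' "$swapped" "$dotted"
```

Unlike the s command, y does not use regular expressions, so the period in the second set is an ordinary character.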
Change
The Change command, c, replaces the pattern space with a different string of characters, as shown in Figure 18.27. When sed finds a line containing the string Amos, it replaces the contents of the pattern space with the text following the c command. The effect is that the entire record is replaced.
sed '/Amos/c\
> Bilford   Nov 22   333-2244 Phideaux Polly Ann      20
> ' goodoleboys.txt
Name      Born     Phone    Dog      Wife      Shotgun Paid
========= ======== ======== ======== ========= ======= =====
Chuck     Dec 25   444-2345 Blue     Mary Sue       12 .50
Bubba     Oct 13   444-1111 Buck     Mary Jean      12
Billy Bob June 11  444-4340 Leotis   Lisa Sue       12
Bilford   Nov 22   333-2244 Phideaux Polly Ann      20
Otis      Sept 17  444-8000 Ol' Sal  Sally          12
Claude    May 31   333-4340 Blue     Etheline       12
Roscoe    Feb 2    444-2234 Rover    Alice Jean    410
Arlis     June 19  444-1314 Redeye   Suzy Beth      12 .75
Junior    April 30 BR-549   Percival Lilly Faye     12
Bill      Feb 29   333-4444 Daisy    Daisy          20
Ernest T. ??       none     none     none         none
Figure 18.27: Replace each record that contains the string Amos .
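When the c command is typed inside a script or a quoted string rather than at an interactive prompt, the backslash after c and the line break that follows it are part of the command. The sketch below shows that form with three made-up records; the data is hypothetical.

```shell
roster=$(printf '%s\n' 'Chuck Dec 25' 'Amos Jan 4' 'Otis Sept 17')

# The backslash after c and the line break are part of the sed command;
# the replacement text is everything on the following line.
changed=$(printf '%s\n' "$roster" | sed '/Amos/c\
Bilford Nov 22')

printf '%s\n' "$changed"
```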
List Nonprinting Characters
The List Nonprinting Characters command is l, a lowercase letter "ell." You can use it when you need to see the control characters that are embedded in a file, as shown in Figure 18.28.
sed -n 'l' goodoleboys.txt
Name      Born     Phone    Dog      Wife      Shotgun Paid$
========= ======== ======== ======== ========= ======= =====$
Chuck     Dec 25   444-2345 Blue     Mary Sue       12 .50$
Bubba     Oct 13   444-1111 Buck     Mary Jean      12$
Billy Bob June 11  444-4340 Leotis   Lisa Sue       12$
Amos      Jan 4    333-1119 Amos     Abigail        20$
Otis      Sept 17  444-8000 Ol' Sal  Sally          12 $
Claude    May 31   333-4340 Blue     Etheline       12$
Roscoe    Feb 2    444-2234 Rover    Alice Jean    410$
Arlis     June 19  444-1314 Redeye   Suzy Beth      12 .75$
Junior    April 30 BR-549   Percival Lilly Faye     12$
Bill      Feb 29   333-4444 Daisy    Daisy          20$
Ernest T. ??       none     none     none         none$
Figure 18.28: The only nonprintable characters in this file are the end-of-line characters, which sed represents with a dollar sign.
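The l command is most useful when a file does contain something invisible. As a small illustration, the sketch below feeds sed one made-up record that has an embedded tab and a trailing blank; the tab is listed as an escape sequence and the dollar sign makes the trailing blank visible.

```shell
# One hypothetical record with an embedded tab and a trailing blank.
visible=$(printf 'Otis\tSept 17 \n' | sed -n 'l')

# Print the listing exactly as sed produced it.
printf '%s\n' "$visible"
```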
Read
The Read command, r, reads in an entire file after processing the current input record, as shown in Figure 18.29.
cat notice.txt
=========================================================
=           suspended for non-payment of dues           =
=========================================================
sed '/Bubba/r notice.txt' goodoleboys.txt
Name      Born     Phone    Dog      Wife      Shotgun Paid
========= ======== ======== ======== ========= ======= =====
Chuck     Dec 25   444-2345 Blue     Mary Sue       12 .50
Bubba     Oct 13   444-1111 Buck     Mary Jean      12
=========================================================
=           suspended for non-payment of dues           =
=========================================================
Billy Bob June 11  444-4340 Leotis   Lisa Sue       12
Amos      Jan 4    333-1119 Amos     Abigail        20
Otis      Sept 17  444-8000 Ol' Sal  Sally          12
Claude    May 31   333-4340 Blue     Etheline       12
Roscoe    Feb 2    444-2234 Rover    Alice Jean    410
Arlis     June 19  444-1314 Redeye   Suzy Beth      12 .75
Junior    April 30 BR-549   Percival Lilly Faye     12
Bill      Feb 29   333-4444 Daisy    Daisy          20
Ernest T. ??       none     none     none         none
Figure 18.29: When sed finds a record containing Bubba, it reads file notice.txt into the output stream.
In Figure 18.30, when sed finds a record that contains the string 444-, it reads the file 444.txt into the output stream.
cat 444.txt ** long distance phone call ** /home/JSMITH $ sed '/444-/r 444.txt' goodoleboys.txt Name Born Phone Dog Wife Shotgun Paid ========= ======== ======== ======== ========= ======= ===== Chuck Dec 25 444-2345 Blue Mary Sue 12 .50 ** long distance phone call ** Bubba Oct 13 444-1111 Buck Mary Jean 12 ** long distance phone call ** Billy Bob June 11 444-4340 Leotis Lisa Sue 12 ** long distance phone call ** Amos Jan 4 333-1119 Amos Abigail 20 Otis Sept 17 444-8000 Ol' Sal Sally 12 ** long distance phone call ** Claude May 31 333-4340 Blue Etheline 12 Roscoe Feb 2 444-2234 Rover Alice Jean 410 ** long distance phone call ** Arlis June 19 444-1314 Redeye Suzy Beth 12 .75 ** long distance phone call ** Junior April 30 BR-549 Percival Lilly Faye 12 Bill Feb 29 333-4444 Daisy Daisy 20 Ernest T. ?? none none none none
Figure 18.30: Read 444.txt into the output stream when the string 444- is found.
Figure 18.31 illustrates the use of the read command in an FTP application. The ftpshell file contains the commands needed to send files using FTP. The *reqs record is a placeholder that tells sed where to insert put commands.
cat ftpshell
myid mypass
namefmt 1
*reqs
quit
/home/JSMITH/ftp $ ls puts
pfile1.txt pfile2.txt pfile3.txt
/home/JSMITH/ftp $ cat ftprun.qsh
#! /bin/qsh
# nightly ftp run
# get list of files to put
ls puts >ftptemp
# insert put in front of each file name
sed 's/^/put /' ftptemp >ftprequests
# put requests are now in ftprequests
# merge ftp instructions and ftprequests
sed '/*r/{
r ftprequests
d
}
' ftpshell >ftpscript
/home/JSMITH/ftp $ ftprun.qsh
/home/JSMITH/ftp $ cat ftpscript
myid mypass
namefmt 1
put pfile1.txt
put pfile2.txt
put pfile3.txt
quit
/home/JSMITH/ftp $
Figure 18.31: Two sed commands are used to generate an FTP script.
The puts subdirectory has the files that are to be sent to the remote system. The ls command loads the file names into ftptemp. Sed places the word put and a space at the beginning of each record to generate FTP put commands.
Sed runs a second time to replace the *reqs record in ftpshell with the generated put statements. When sed finds a record that contains an asterisk followed by an r, it reads the ftprequests file into the output stream and deletes the *reqs record. The result is the FTP script in the ftpscript file, which can be used with the FTP CL command to send all the files in the puts subdirectory.
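The same placeholder technique can be tried in a scratch directory without touching real FTP files. The sketch below is a variation, not the book's script: the two file names and the ftpshell contents are stand-ins, the ls output is piped straight into sed so that no ftptemp file is needed, and the address is anchored to the whole placeholder record. It assumes a POSIX-style shell that provides mktemp.

```shell
# Hypothetical stand-ins for the files in the figure, built in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir puts
touch puts/pfile1.txt puts/pfile2.txt
printf '%s\n' 'myid mypass' 'namefmt 1' '*reqs' 'quit' > ftpshell

# Generate the put commands straight from ls, without a temporary file.
ls puts | sed 's/^/put /' > ftprequests

# On the placeholder record: queue ftprequests for output, then delete the
# placeholder itself. \* matches a literal asterisk.
sed '/^\*reqs$/{
r ftprequests
d
}' ftpshell > ftpscript

script=$(cat ftpscript)
printf '%s\n' "$script"
```

The r and d commands sit on separate lines because the file name given to r runs to the end of its line.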
Write
The Write command, w, writes the pattern space to a file, as shown in Figure 18.32. The file is cleared before the first input record is processed, unless the -a option is used. If the -a option is used, the file is cleared just before the first record is written to it.
sed '/444-/w fours.txt' goodoleboys.txt Name Born Phone Dog Wife Shotgun Paid ========= ======== ======== ======== ========= ======= ===== Chuck Dec 25 444-2345 Blue Mary Sue 12 .50 Bubba Oct 13 444-1111 Buck Mary Jean 12 Billy Bob June 11 444-4340 Leotis Lisa Sue 12 Amos Jan 4 333-1119 Amos Abigail 20 Otis Sept 17 444-8000 Ol' Sal Sally 12 Claude May 31 333-4340 Blue Etheline 12 Roscoe Feb 2 444-2234 Rover Alice Jean 410 Arlis June 19 444-1314 Redeye Suzy Beth 12 .75 Junior April 30 BR-549 Percival Lilly Faye 12 Bill Feb 29 333-4444 Daisy Daisy 20 Ernest T. ?? none none none none cat fours.txt Chuck Dec 25 444-2345 Blue Mary Sue 12 .50 Bubba Oct 13 444-1111 Buck Mary Jean 12 Billy Bob June 11 444-4340 Leotis Lisa Sue 12 Otis Sept 17 444-8000 Ol' Sal Sally 12 Roscoe Feb 2 444-2234 Rover Alice Jean 410 Arlis June 19 444-1314 Redeye Suzy Beth 12 .75
Figure 18.32: Sed writes all records that contain the string 444- to file fours.txt. If fours.txt already contains data, the data is replaced.
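Several w commands can be combined to split one input file into several output files in a single pass. The sketch below is illustrative only: members.txt, local.txt, and other.txt are made-up names, and it assumes a POSIX-style shell that provides mktemp.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'Chuck 444-2345' 'Amos 333-1119' 'Otis 444-8000' > members.txt

# -n suppresses stdout, so the only output is what w writes.
# Each w command gets its own -e because the file name runs to the end
# of the command line it appears on.
sed -n -e '/444-/w local.txt' -e '/444-/!w other.txt' members.txt

local_recs=$(cat local.txt)
other_recs=$(cat other.txt)
printf '%s\n' "$local_recs" '--' "$other_recs"
```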
The Line-Number Function
The Line-number function, =, writes the number of a matching line to stdout. If there is no address before the function, all input lines match, as shown in Figure 18.33.
sed = goodoleboys.txt 1 Name Born Phone Dog Wife Shotgun Paid 2 ========= ======== ======== ======== ========= ======= ===== 3 Chuck Dec 25 444-2345 Blue Mary Sue 12 .50 4 Bubba Oct 13 444-1111 Buck Mary Jean 12 5 Billy Bob June 11 444-4340 Leotis Lisa Sue 12 6 Amos Jan 4 333-1119 Amos Abigail 20 7 Otis Sept 17 444-8000 Ol' Sal Sally 12 8 Claude May 31 333-4340 Blue Etheline 12 9 Roscoe Feb 2 444-2234 Rover Alice Jean 410 10 Arlis June 19 444-1314 Redeye Suzy Beth 12 .75 11 Junior April 30 BR-549 Percival Lilly Faye 12 12 Bill Feb 29 333-4444 Daisy Daisy 20 13 Ernest T. ?? none none none none
Figure 18.33: Since there is no address before the function, each line number is written to stdout.
Figure 18.34 shows the command with an address. In this case, sed prints the numbers of lines that contain Feb.
sed '/Feb/=' goodoleboys.txt Name Born Phone Dog Wife Shotgun Paid ========= ======== ======== ======== ========= ======= ===== Chuck Dec 25 444-2345 Blue Mary Sue 12 .50 Bubba Oct 13 444-1111 Buck Mary Jean 12 Billy Bob June 11 444-4340 Leotis Lisa Sue 12 Amos Jan 4 333-1119 Amos Abigail 20 Otis Sept 17 444-8000 Ol' Sal Sally 12 Claude May 31 333-4340 Blue Etheline 12 9 Roscoe Feb 2 444-2234 Rover Alice Jean 410 Arlis June 19 444-1314 Redeye Suzy Beth 12 .75 Junior April 30 BR-549 Percival Lilly Faye 12 12 Bill Feb 29 333-4444 Daisy Daisy 20 Ernest T. ?? none none none none
Figure 18.34: Print the numbers of lines that contain Feb .
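When only the numbers are wanted, the = function can be combined with the -n option so that the records themselves are not written. The following sketch uses four made-up records to show two common uses: listing the line numbers of matches and counting records.

```shell
roster=$(printf '%s\n' 'Chuck Dec 25' 'Roscoe Feb 2' 'Arlis June 19' 'Bill Feb 29')

# -n suppresses the records, leaving only the line numbers of matches.
feb_lines=$(printf '%s\n' "$roster" | sed -n '/Feb/=')

# With the $ address, = reports the number of the last line: a record count.
record_count=$(printf '%s\n' "$roster" | sed -n '$=')

printf '%s\n' "$feb_lines" '--' "$record_count"
```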
Next
The Next commands, n and N, force sed to read the next input record.
The n command tells sed to stop applying commands to the current record, write the pattern space to stdout (unless output is suppressed with the -n option), and read the next input record into the pattern space. Processing then continues with the command that follows n, not at the top of the command list. The effect is that the remaining editing commands are skipped for the record that was written and applied to the newly read record instead, while the commands that precede n are never applied to that new record.
In Figure 18.35, when sed finds a line containing Feb, it executes the n command, which causes that line to skip the substitute command. The line is written to stdout, and the next line is brought into the pattern space. The effect is that sed replaces all zeros and fours on every line that does not contain the string Feb.
sed -e '/Feb/n' -e 's/[04]/*/g' goodoleboys.txt Name Born Phone Dog Wife Shotgun Paid ========= ======== ======== ======== ========= ======= ===== Chuck Dec 25 ***-23*5 Blue Mary Sue 12 .5* Bubba Oct 13 ***-1111 Buck Mary Jean 12 Billy Bob June 11 ***-*3** Leotis Lisa Sue 12 Amos Jan * 333-1119 Amos Abigail 2* Otis Sept 17 ***-8*** Ol' Sal Sally 12 Claude May 31 333-*3** Blue Etheline 12 Roscoe Feb 2 444-2234 Rover Alice Jean 410 Arlis June 19 ***-131* Redeye Suzy Beth 12 *.75 Junior April 3* BR-5*9 Percival Lilly Faye 12 Bill Feb 29 333-4444 Daisy Daisy 20 Ernest T. ?? none none none none
Figure 18.35: Replace all zeros and fours on every line that does not contain Feb .
In Figure 18.36, sed reads the file, replacing all fours with asterisks and all threes with Xs until it gets to Claude's record. At that point, it replaces the fours in Claude's record with asterisks, writes to stdout, brings Roscoe's record into the pattern space, replaces the threes with Xs, and continues as usual. The result is that all fours and threes are replaced in all records except Claude's and Roscoe's. In Claude's record, only the fours are replaced; in Roscoe's record, only the threes.
sed -e 's/4/*/g' -e '/Claude/n' -e 's/3/X/g' goodoleboys.txt Name Born Phone Dog Wife Shotgun Paid ========= ======== ======== ======== ========= ======= ===== Chuck Dec 25 ***-2X*5 Blue Mary Sue 12 .50 Bubba Oct 1X ***-1111 Buck Mary Jean 12 Billy Bob June 11 ***-*X*0 Leotis Lisa Sue 12 Amos Jan * XXX-1119 Amos Abigail 20 Otis Sept 17 ***-8000 Ol' Sal Sally 12 Claude May 31 333-*3*0 Blue Etheline 12 Roscoe Feb 2 444-22X4 Rover Alice Jean 410 Arlis June 19 ***-1X1* Redeye Suzy Beth 12 .75 Junior April X0 BR-5*9 Percival Lilly Faye 12 Bill Feb 29 XXX-**** Daisy Daisy 20 Ernest T. ?? none none none none
Figure 18.36: Replace fours and threes in all records except Claude's and Roscoe's.
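One consequence of n is worth seeing in isolation: the record it reads is not tested against the address that triggered n. The sketch below uses three made-up records, two of which contain Feb on consecutive lines, to show that the second Feb record is edited even though the first is skipped.

```shell
# Hypothetical data with two consecutive Feb records.
input=$(printf '%s\n' 'Feb 1' 'Feb 2' 'Mar 4')

# 'Feb 1' matches, is written unchanged, and 'Feb 2' is read by n.
# The s command then runs against 'Feb 2' without /Feb/ being tested again.
result=$(printf '%s\n' "$input" | sed -e '/Feb/n' -e 's/[0-9]/*/')

printf '%s\n' "$result"
```

The figures above do not run into this case because no two Feb records are adjacent in goodoleboys.txt.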
The N command tells sed to append the next record to the pattern space and separate the two records with a newline character, before continuing with the next sed command.
The first sed command in Figure 18.37 numbers the lines. Its output consists of alternating records of line numbers and data. This is piped into another sed command, which combines every pair of records into one record. The second sed reads a record that contains a line number, then begins to process the functions. The N function reads the next input record and appends it to the pattern space. At this point, the pattern space contains a line number, a newline character, and a data record. The s command, which uses a dollar sign as its delimiter, replaces the newline character that separates the two records with a blank. The records are numbered on the left. This is similar to using the cat utility with the -n option.
sed = goodoleboys.txt | sed 'N; s$\n$ $'
1 Name      Born     Phone    Dog      Wife      Shotgun Paid
2 ========= ======== ======== ======== ========= ======= =====
3 Chuck     Dec 25   444-2345 Blue     Mary Sue       12 .50
4 Bubba     Oct 13   444-1111 Buck     Mary Jean      12
5 Billy Bob June 11  444-4340 Leotis   Lisa Sue       12
6 Amos      Jan 4    333-1119 Amos     Abigail        20
7 Otis      Sept 17  444-8000 Ol' Sal  Sally          12
8 Claude    May 31   333-4340 Blue     Etheline       12
9 Roscoe    Feb 2    444-2234 Rover    Alice Jean    410
10 Arlis     June 19  444-1314 Redeye   Suzy Beth      12 .75
11 Junior    April 30 BR-549   Percival Lilly Faye     12
12 Bill      Feb 29   333-4444 Daisy    Daisy          20
13 Ernest T. ??       none     none     none         none
Figure 18.37: The effect of these two commands is similar to using the cat utility with the -n option.
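The pairing behavior of N can be checked on a few records. The sketch below repeats the numbering pipeline on made-up data, using the more common slash delimiter for the s command and separate -e options for the two functions.

```shell
roster=$(printf '%s\n' 'Name Born' 'Chuck Dec 25' 'Bubba Oct 13')

# sed = emits a line number before each record; the second sed joins each
# number with the record that follows it by replacing the embedded newline.
numbered=$(printf '%s\n' "$roster" | sed = | sed -e 'N' -e 's/\n/ /')

printf '%s\n' "$numbered"
```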
Using the Holding Buffer
The h and H commands copy the pattern space to the holding buffer. The h command tells sed to replace the contents of the holding buffer, while the H tells sed to append to the contents of the holding buffer.
The holding buffer can be retrieved later with the g and G functions. To replace the contents of the pattern space, use g . To append the holding buffer to the pattern space, use G . You can swap the contents of the two buffers with either the x function or the substitute function's x flag.
In Figure 18.38, Otis's record is stored in the holding buffer and deleted from the output stream. When the last input record is read, the holding buffer is appended to the pattern space. The effect is that sed moves Otis's record to the end of the file.
sed -e '/Otis/{h; d; }' -e '$G' goodoleboys.txt Name Born Phone Dog Wife Shotgun Paid ========= ======== ======== ======== ========= ======= ===== Chuck Dec 25 444-2345 Blue Mary Sue 12 .50 Bubba Oct 13 444-1111 Buck Mary Jean 12 Billy Bob June 11 444-4340 Leotis Lisa Sue 12 Amos Jan 4 333-1119 Amos Abigail 20 Claude May 31 333-4340 Blue Etheline 12 Roscoe Feb 2 444-2234 Rover Alice Jean 410 Arlis June 19 444-1314 Redeye Suzy Beth 12 .75 Junior April 30 BR-549 Percival Lilly Faye 12 Bill Feb 29 333-4444 Daisy Daisy 20 Ernest T. ?? none none none none Otis Sept 17 444-8000 Ol' Sal Sally 12
Figure 18.38: Move the record containing Otis to the end of the file.
As each record in Figure 18.39 is read, the x function swaps the pattern space with the holding buffer, so the new record lands in the holding buffer and the previously accumulated records land in the pattern space. The H function then appends the pattern space to the holding buffer. This places the new record at the beginning of the holding buffer, with a newline character separating it from the previous contents. When the last input record has been processed, the first command will have placed all records in the holding buffer in reverse order. The second command, which is executed only for the last record, swaps the holding buffer back into the pattern space and prints the pattern space. The result is that the file is listed in reverse order.
sed -n -e '{x; H; }' -e '${x; p; }' goodoleboys.txt Ernest T. ?? none none none none Bill Feb 29 333-4444 Daisy Daisy 20 Junior April 30 BR-549 Percival Lilly Faye 12 Arlis June 19 444-1314 Redeye Suzy Beth 12 .75 Roscoe Feb 2 444-2234 Rover Alice Jean 410 Claude May 31 333-4340 Blue Etheline 12 Otis Sept 17 444-8000 Ol' Sal Sally 12 Amos Jan 4 333-1119 Amos Abigail 20 Billy Bob June 11 444-4340 Leotis Lisa Sue 12 Bubba Oct 13 444-1111 Buck Mary Jean 12 Chuck Dec 25 444-2345 Blue Mary Sue 12 .50 ========= ======== ======== ======== ========= ======= ===== Name Born Phone Dog Wife Shotgun Paid
Figure 18.39: List the file in reverse order.
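Both holding-buffer techniques are easier to follow on three records. In the sketch below, the data is made up, the move command is the one from Figure 18.38 written with one function per -e option, and the reversal uses a different, widely used idiom built on G and h rather than the x and H functions of Figure 18.39.

```shell
input=$(printf '%s\n' 'Amos' 'Otis' 'Claude')

# Move Otis to the end: save the record in the holding buffer, delete it,
# and append the holding buffer to the last record.
moved=$(printf '%s\n' "$input" | sed -e '/Otis/{' -e 'h' -e 'd' -e '}' -e '$G')

# Reverse: on every record but the first, append what is held so far (G),
# then store the result (h); print it once, at the last record.
reversed=$(printf '%s\n' "$input" | sed -n -e '1!G' -e 'h' -e '$p')

printf '%s\n' "$moved" '--' "$reversed"
```

Note that the move technique relies on the matching record not being the last one in the file, since d ends processing of that record before the G function is reached.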
Sed Scripts
A sed script is a file containing sed commands. Instead of keying all the commands on the Qshell command line, you can tell sed to read the commands from the file. Use the -f option followed by the file name to run a script file. In the file, place one editing command on each line.
To place comments in a sed script, begin them with a pound sign (#). Do not place comments on lines with other executable code. You can also leave blank lines for readability. A special comment of #n on the first record of the sed script is equivalent to the -n option. That is, it suppresses the automatic write to stdout.
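The rules above can be tried with a small script file. The sketch below is hypothetical: names.sed and the four input records are invented for the example, and it assumes a POSIX-style shell that provides mktemp. The script starts with the special #n comment, so only the p flag of the s command produces output.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Write the script file; comments are on lines of their own.
cat > names.sed <<'EOF'
#n
# The #n above suppresses automatic output; only p writes to stdout.
# Skip the two heading records.
1,2d
# Keep only the first word of each data record and print it.
s/ .*//p
EOF

names=$(printf '%s\n' 'Name Born' '==== ====' 'Chuck Dec 25' 'Bubba Oct 13' |
  sed -f names.sed)

printf '%s\n' "$names"
```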
The sedcmd1.txt file in Figure 18.40 is a sed script containing five records: one comment and four editing commands. The first three editing commands in this example delete undesired records. The last replaces the blank in Billy Bob's name with a hyphen. Sed reads the goodoleboys.txt file, applying the commands, and pipes the output to a while loop. The loop pipes the formatted birth month and name to sort, which prints the data in alphabetical order by month of birth.
cat sedcmd1.txt
# Sort members by birthmonth name
/=/d
/??/d
/Born/d
s/Billy /Billy-/
/home/JSMITH $ sed -f sedcmd1.txt goodoleboys.txt |
> while read name born rest ;
> do printf '%-10s %-10s\n' $born $name ; done | sort
April      Junior
Dec        Chuck
Feb        Bill
Feb        Roscoe
Jan        Amos
June       Arlis
June       Billy-Bob
May        Claude
Oct        Bubba
Sept       Otis
Figure 18.40: The sed script in sedcmd1.txt prints the lines in goodoleboys.txt in alphabetical order by month of birth.
You can use the t and b functions to branch within lists of sed commands. While you can use branching when entering sed commands on the command line, you are more likely to use it in sed scripts. The b command causes an unconditional branch to a label, which is a line that begins with a colon followed by a label name. The t command causes a branch only if a substitution has been made since the current record was read. If a label is not specified, sed branches to the end of the command list, skipping the remaining commands.
The sedcmd2.txt sed script in Figure 18.41 involves branching. The first branch, 1,2b, tells sed to skip the rest of the script if it is processing either of the first two lines, because none of the remaining commands applies to the heading lines. The two other branches depend on whether or not the string 444- is found in a record.
cat sedcmd2.txt # remove Paid column from headings 1s/ Paid// 2s/ =====$// 1,2s/ ^/ / 1,2b # data lines # remove Paid figures s/$[0-9]*.*[0-9]*$// # if 444 prefix /444-/!b not444 # show "local number" s/444-[0-9]{4}/local / s/ ^/ / b # if not 444 prefix :not444 y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/ s/ ^/===> / /home/JSMITH $ sed -f sedcmd2.txt goodoleboys.txt Name Born Phone Dog Wife Shotgun ========= ======== ======== ======== ========= ======= Chuck Dec 25 local Blue Mary Sue 12 Bubba Oct 13 local Buck Mary Jean 12 Billy Bob June 11 local Leotis Lisa Sue 12 ===> AMOS JAN 4 333-1119 AMOS ABIGAIL 20 Otis Sept 17 local Ol' Sal Sally 12 ===> CLAUDE MAY 31 333-4340 BLUE ETHELINE 12 Roscoe Feb 2 local Rover Alice Jean 410 Arlis June 19 local Redeye Suzy Beth 12 ===> JUNIOR APRIL 30 BR-549 PERCIVAL LILLY FAYE 12 ===> BILL FEB 29 333-4444 DAISY DAISY 20 ===> ERNEST T. ?? NONE NONE NONE NONE
Figure 18.41: Sed scripts may include branching operations.
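The script in the figure uses only b. As a complement, the sketch below shows t used to build a loop; it is a made-up example, with squeeze.sed and the two input records invented for illustration, and it assumes a POSIX-style shell that provides mktemp. Records that begin with a pound sign are passed through untouched, and runs of blanks in every other record are collapsed to a single blank.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > squeeze.sed <<'EOF'
# Leave comment records untouched: b without a label skips to the end.
/^#/b
# Replace two blanks with one; repeat for as long as a substitution is made.
:again
s/  / /
t again
EOF

squeezed=$(printf '%s\n' '# keep   this' 'Billy   Bob    June' | sed -f squeeze.sed)

printf '%s\n' "$squeezed"
```

The loop ends on its own because t branches only when the preceding s command changed something.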
Summary
Sed is a batch editor that applies editing commands to a stream file or standard input, and produces modified output. It is useful for editing such files when human interaction is not needed. Sed is often used to build scripts for other utilities, such as FTP.