General Programming Concepts:
Writing and Debugging Programs
Manipulating Strings with sed
The sed program performs its editing
without interacting with the person requesting the editing. This method of
operation allows sed to do the following:
- Edit very large files
- Perform complex editing operations many times without
requiring extensive retyping and cursor positioning (as interactive editors
do)
- Perform global changes in one pass through the
input.
The editor keeps only a few lines of the file being
edited in memory at one time, and does not use temporary files. Therefore,
the file to be edited can be any size as long as there is room for both the
input file and the output file in the file system.
Starting the Editor
To use the editor, create a command file containing
the editing commands to perform on the input file. The editing commands perform
complex operations and require a small amount of typing in the command file.
Each command in the command file must be on a separate line. Once the command
file is created, enter the following command on the command line:
sed -f CommandFile >Output <Input
In this command the parameters mean the following:
CommandFile |
The name of the file containing editing commands. |
Output |
The name of the file to contain the edited output. |
Input |
The name of the file, or files, to be edited. |
The sed program then makes the
changes and writes the changed information to the output file. The contents
of the input file are not changed.
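The steps above can be sketched as a short shell session. The file names Input, Output and CommandFile are only the placeholders used in this section, and the scratch directory created with mktemp is an assumption made so the example does not touch existing files:

```shell
# Work in a scratch directory (an assumption for this sketch).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The input file to be edited.
printf 'one apple\ntwo apples\n' > Input

# The command file: one editing command per line.
cat > CommandFile <<'EOF'
s/apple/pear/
s/one/1/
EOF

# Run the editor: commands from CommandFile, text from Input,
# edited text to Output.
sed -f CommandFile < Input > Output

cat Output   # the edited text
cat Input    # the original text, unchanged
```

Because the edited text goes to a separate output file, the run can be repeated with a corrected command file without restoring the input first.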
How sed Works
The sed program is a stream
editor that receives its input from standard input, changes that input as
directed by commands in a command file, and writes the resulting stream to
standard output. If you do not provide a command file and do not use any flags
with the sed command, the sed program copies standard input to standard output without
change. Input to the program comes from two sources:
Input stream |
A stream of ASCII characters either from one or more files or entered
directly from the keyboard. This stream is the data to be edited. |
Commands |
A set of addresses and associated commands to be performed, in the
following general form:
[Line1 [,Line2] ] command [argument]
The parameters Line1 and Line2 are called addresses. Addresses can
be either patterns to match in the input stream, or line numbers in the input
stream. |
You can also enter editing commands along with the sed command by using the -e flag.
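As a minimal sketch of the two kinds of address, the following passes each command with the -e flag. The sample text and the variable names are made up for illustration:

```shell
input='alpha
beta
gamma
delta'

# Line-number addresses: delete lines 2 through 3.
by_number=$(printf '%s\n' "$input" | sed -e '2,3d')

# Pattern address: substitute only on lines that match /gamma/.
by_pattern=$(printf '%s\n' "$input" | sed -e '/gamma/s/a/A/')

printf '%s\n' "$by_number"    # alpha, delta
printf '%s\n' "$by_pattern"   # alpha, beta, gAmma, delta
```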
When sed edits, it reads the
input stream one line at a time into an area in memory called the pattern
space. When a line of data is in the pattern space, sed
reads the command file and tries to match the addresses in the command file
with characters in the pattern space. If it finds an address that matches
something in the pattern space, sed then performs the
command associated with that address on the part of the pattern space that
matched the address. The result of that command changes the contents of the
pattern space, and thus becomes the input for all following commands.
When sed has tried to match
all addresses in the command file with the contents of the pattern space,
it writes the final contents of the pattern space to standard output. Then
it reads a new input line from standard input and starts the process over
at the start of the command file.
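The effect of this cycle can be seen by running the same two commands in different orders. This is only an illustrative sketch, and the words used are arbitrary:

```shell
# The first command changes the pattern space from "cat" to "dog";
# the second command then sees "dog" and changes it to "bird".
chained=$(printf 'cat\n' | sed -e 's/cat/dog/' -e 's/dog/bird/')

# In the opposite order, the first command finds nothing to change,
# so only the second command has an effect.
reordered=$(printf 'cat\n' | sed -e 's/dog/bird/' -e 's/cat/dog/')

printf '%s\n' "$chained"     # bird
printf '%s\n' "$reordered"   # dog
```

Each command works on whatever the previous commands left in the pattern space, so the order of commands in the command file matters.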
Some editing commands change the way the process operates.
Flags used with the sed command can also change the operation of the command.
See Using the sed Command Summary for more information.
Using Regular Expressions
A regular expression is a string that contains literal
characters, pattern-matching characters and/or operators that define a set
of one or more possible strings. The stream editor uses a set of pattern-matching
characters that is different from the shell pattern-matching characters, but
the same as the line editor, ed.
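The following sketch contrasts a shell pattern with sed regular expressions on a few made-up file names. In the shell, * alone matches any run of characters. In a regular expression, . matches any single character, * repeats the item before it, and \. matches a literal period:

```shell
# Shell pattern matching: *.txt matches any name ending in ".txt".
case notes.txt in
  *.txt) shell_match=yes ;;
  *)     shell_match=no ;;
esac

names='notes.txt
notes_txt
notesXtxt'

# Regular expression with an escaped period: only the first name matches.
strict=$(printf '%s\n' "$names" | sed 's/^.*\.txt$/MATCH/')

# Regular expression with an unescaped period: any character is accepted
# in that position, so all three names match.
loose=$(printf '%s\n' "$names" | sed 's/^notes.txt$/MATCH/')

printf '%s\n' "$shell_match" "$strict" "$loose"
```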
Using the sed Command Summary
All sed commands are single
letters plus some parameters, such as line numbers or text strings. The commands
summarized below make changes to the lines in the pattern space.
The following symbols are used in the syntax diagrams:
Symbol |
Meaning |
[ ] |
Square brackets enclose optional parts of the commands |
italics |
Parameters in italics are placeholders for a value that you
enter. For example, FileName represents a parameter
that you replace with the name of an actual file. |
Line1 |
This symbol is a line number or regular expression to match that
defines the starting point for applying the editing command. |
Line2 |
This symbol is a line number or regular expression to match that
defines the ending point to stop applying the editing command. |
Line Manipulation
Function |
Syntax/Description |
append lines |
[Line1]a\\nText
Writes the lines contained in Text to the output stream after Line1. The a command must appear at the end of a line. |
change lines |
[Line1 [,Line2] ]c\\nText
Deletes the lines specified by Line1 and Line2 as the delete lines command does. Then
it writes Text to the output stream in place of the
deleted lines. |
delete lines |
[Line1 [,Line2] ]d
Removes lines from the input stream
and does not copy them to the output stream. The lines not copied begin at
line number Line1. The next line copied to the output
stream is line number Line2 + 1. If you specify only
one line number, then only that line is not copied. If you do not specify
a line number, the next line is not copied. You cannot perform any other functions
on lines that are not copied to the output. |
insert lines |
[Line1]i\\nText
Writes the lines contained in Text to the output stream before Line1. The i command must appear at the end of a line. |
next line |
[Line1 [,Line2] ]n
Reads the next line, or group
of lines from Line1 to Line2
into the pattern space. The current contents of the pattern space are written
to the output if they have not been deleted. |
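The line manipulation commands can be tried on a small sample, as in the sketch below. The four-line input and the variable names are assumptions made for illustration. In the a, i and c commands, the backslash is the last character on the command line and Text starts on the next line:

```shell
input='line 1
line 2
line 3
line 4'

# a: write text after line 2.
appended=$(printf '%s\n' "$input" | sed '2a\
added after 2')

# i: write text before line 2.
inserted=$(printf '%s\n' "$input" | sed '2i\
added before 2')

# c: replace lines 2 through 3 with the text.
changed=$(printf '%s\n' "$input" | sed '2,3c\
replacement')

# d: do not copy line 2 to the output.
deleted=$(printf '%s\n' "$input" | sed '2d')

# n followed by d: write the current line, read the next line,
# then delete it, so every second line is dropped.
alternate=$(printf '%s\n' "$input" | sed 'n;d')

printf '%s\n' "$appended"
```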
Substitution
Function |
Syntax/Description |
substitution for pattern |
[Line1 [,Line2] ] s/Pattern/String/Flags
Searches the indicated line(s) for a set of characters that matches the regular
expression defined in Pattern. When it finds a match,
the command replaces that set of characters with the set of characters specified
by String. |
Input and Output
Function |
Syntax/Description |
print lines |
[Line1 [,Line2] ] p
Writes the indicated lines to
STDOUT at the point in the editing process that the p
command occurs. |
write lines |
[Line1 [,Line2] ]w FileName
Writes the indicated lines to FileName at the point
in the editing process that the w command occurs.
If FileName exists, it is overwritten;
otherwise, it is created. A maximum of 10 different files can be mentioned
as input or output files in the entire editing process. Include exactly one
space between w and FileName. |
read file |
[Line1]r FileName
Reads FileName
and appends the contents after the line indicated by Line1.
Include exactly one space between r and FileName. If FileName cannot be opened, the command reads it as a null file without giving
any indication of an error. |
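A minimal sketch of the input and output commands follows. The file names colors, extra, matched and no_such_file are hypothetical. The -n flag, which this section does not describe, is a standard sed flag that suppresses the normal copy of each line to standard output:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'red\ngreen\nblue\n' > colors
printf '(inserted)\n' > extra

# p: line 2 is written once by p and once by the normal output.
printed_twice=$(sed '2p' colors)

# With -n, only the lines selected by p are written.
printed_once=$(sed -n '2p' colors)

# w: write every line that contains "r" to the file named matched.
sed -n '/r/w matched' colors

# r: append the contents of extra after line 1.
with_file=$(sed '1r extra' colors)

# r with a file that cannot be opened: treated as an empty file.
with_missing=$(sed '1r no_such_file' colors)

printf '%s\n' "$with_file"
```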
Matching Across Lines
Function |
Syntax/Description |
join next line |
[Line1 [,Line2] ]N
Joins the indicated input lines
together, separating them by an embedded new-line character. Pattern matches
can extend across the embedded new-line(s). |
delete first line of pattern space |
[Line1 [,Line2] ]D
Deletes all text in the pattern
space up to and including the first new-line character. If only one line is
in the pattern space, it reads another line. Starts the list of editing commands
again from the beginning. |
print first line of pattern space |
[Line1 [,Line2] ]P
Prints all text in the pattern
space up to and including the first new-line character to STDOUT. |
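The multi-line commands can be sketched as follows. The sample input is made up, \n in a pattern matches the embedded new-line character, and the $ address (the last input line) and the -n flag are standard sed features that this section does not describe. The last command is a well-known idiom for dropping repeated adjacent lines, shown here only as an illustration of P and D working together:

```shell
# N: join each pair of lines, then replace the embedded new-line
# with a space.
pairs=$(printf 'a\nb\nc\nd\n' | sed 'N;s/\n/ /')

# P: after joining two lines, print only the first of them.
firsts=$(printf 'a\nb\nc\nd\n' | sed -n 'N;P')

# P and D together: keep two lines in the pattern space, print the
# first one unless it is identical to the second, then delete it and
# rerun the commands on what remains.
unique=$(printf 'a\na\nb\nb\nb\nc\n' | sed '$!N
/^\(.*\)\n\1$/!P
D')

printf '%s\n' "$pairs" "$firsts" "$unique"
```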
Pick up and Put down
Function |
Syntax/Description |
pick up copy |
[Line1 [,Line2] ]h
Copies the contents of the pattern
space indicated by Line1 and Line2 if present, to the holding area. |
pick up copy, appended |
[Line1 [,Line2] ]H
Copies the contents of the pattern
space indicated by Line1 and Line2 if present, to the holding area, and appends it to the end of the previous
contents of the holding area. |
put down copy |
[Line1 [,Line2] ]g
Copies the contents of the holding
area to the pattern space indicated by Line1 and Line2 if present. The previous contents of the pattern
space are destroyed. |
put down copy, appended |
[Line1 [,Line2] ]G
Copies the contents of the holding
area to the end of the pattern space indicated by Line1 and Line2 if present. The previous contents
of the pattern space are not changed. A new-line character separates the previous
contents from the appended text. |
exchange copies |
[Line1 [,Line2] ]x
Exchanges the contents of the
holding area with the contents of the pattern space indicated by Line1 and Line2 if present. |
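Two common uses of the holding area are sketched below with made-up input. Both rely on the -n flag and the $ address (the last input line), which are standard sed features not described in this section:

```shell
# Reverse the order of the lines. On every line except the first, G
# appends the holding area (the lines seen so far, already reversed)
# to the current line; h then saves the result. The last line prints it.
reversed=$(printf 'a\nb\nc\n' | sed -n '1!G
h
$p')

# Collect all lines in the holding area, then on the last line exchange
# it with the pattern space and turn the new-lines into commas.
joined=$(printf 'a\nb\nc\n' | sed -n '1h
1!H
${
x
s/\n/,/g
p
}')

printf '%s\n' "$reversed"   # c, b, a on separate lines
printf '%s\n' "$joined"     # a,b,c
```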
Control
Function |
Syntax/Description |
negation |
[Line1 [,Line2] ]!
The ! (exclamation
point) applies the command that follows it on the same line to the parts of
the input file that are not selected by Line1 and Line2. |
command groups |
[Line1 [,Line2] ]{
grouped commands
}
The { (left brace) and the } (right brace) enclose
a set of commands to be applied as a set to the input lines selected by Line1 and Line2. The first command
in the set can be on the same line or on the line following the left brace.
The right brace must be on a line by itself. You can nest groups within groups. |
labels |
:Label
Marks a place
in the stream of editing commands to be used as the destination of a branch.
The symbol Label is a string of up to 8 bytes. Each Label in the editing stream must be different from any
other Label. |
branch to label, unconditional |
[Line1 [,Line2] ]bLabel
Branches to the point in the editing stream indicated by Label and continues processing the current input line with the commands
following Label. If Label
is null, branches to the end of the editing stream, which results in reading
a new input line and starting the editing stream over. The string Label must appear as a Label in the editing
stream. |
test and branch |
[Line1 [,Line2] ]tLabel
If any successful substitutions were made on the current input line, branches
to Label. If no substitutions were made, does nothing.
Clears the flag that indicates a substitution was made. This flag is cleared
at the start of each new input line. |
quit |
[Line1 ]q
Stops editing in an orderly fashion by writing the current line to the output,
writing any appended or read text to the output, and stopping the editor. |
find line number |
[Line1 ]=
Writes to standard output the line number of the line that matches Line1. |
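The control commands can be tried on small samples, as in this sketch. The input text, the label name again and the variable names are illustrative assumptions, and -n is the standard flag that suppresses normal output:

```shell
input='line 1
line 2
line 3
line 4'

# !: delete every line that is NOT line 2.
only_second=$(printf '%s\n' "$input" | sed '2!d')

# { }: apply two commands as a set to lines 2 through 3.
grouped=$(printf '%s\n' "$input" | sed '2,3{
s/line/LINE/
s/$/!/
}')

# q: write line 2, then stop.
stopped=$(printf '%s\n' "$input" | sed '2q')

# =: write the number of the line that matches.
number=$(printf '%s\n' "$input" | sed -n '/line 3/=')

# Label and t: repeat the substitution until it no longer succeeds,
# squeezing a run of spaces down to one.
squeezed=$(printf 'a    b\n' | sed ':again
s/  / /
t again')

# b with no label: skip the remaining commands for matching lines.
skipped=$(printf 'keep\nskip me\n' | sed '/skip/b
s/$/ (edited)/')

printf '%s\n' "$grouped"
```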
Using Text in Commands
The append, insert and change lines commands all use a supplied
text string to add to the output stream. This text string conforms to the
following rules:
- Can be one or more lines long.
- Each \n (new-line character) inside Text must have an additional \ character before it (\\n).
- The Text string ends with
a new-line that does not have an additional \ character before it (\n).
- Once the command inserts the Text string, the string:
- Is always written to the output stream, regardless
of what other commands do to the line that caused it to be inserted.
- Is not scanned for address matches.
- Is not affected by other editing commands.
- Does not affect the line number counter.
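A short sketch of these rules, using made-up text: the Text for the a command spans two lines, so the first line ends with a backslash and the second does not. The s command that follows would change the word "added", but the appended text passes through untouched:

```shell
result=$(printf 'first\nsecond\n' | sed '1a\
added line A\
added line B
s/added/CHANGED/')

printf '%s\n' "$result"
# first
# added line A
# added line B
# second
```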
Using String Replacement
The s command performs string
replacement in the indicated lines in the input file. If the command finds
a set of characters in the input file that satisfies the regular expression Pattern, it replaces the set of characters with the set
of characters specified in String.
The String parameter is a
literal set of characters (digits, letters and symbols). Two special symbols
can be used in String:
Symbol |
Use |
& |
This symbol in String is replaced by the
set of characters in the input lines that matched Pattern. For example, the command: |
s/boy/&s/
tells sed to find a pattern
boy in the input line, and copy that pattern to the output with an appended s. Therefore, it changes the input line:
From: |
The boy look at the game. |
To: |
The boys look at the game. |
Symbol |
Use |
\d |
d is a single digit. This symbol in String is replaced by the set of characters in the input lines that matches
the dth substring in Pattern.
Substrings begin with the characters \( and end with the characters \). For
example, the command:
s/\(stu\)\(dy\)/\1r\2/
changes:
- From:
- The study chair
- To:
- The sturdy chair
|
The letters that appear as flags change the replacement
as follows:
Symbol |
Use |
g |
Substitutes String for all instances of Pattern in the indicated line(s). Characters in String are not scanned for a match of Pattern
after they are inserted. For example, the command:
s/r/R/g
changes:
- From:
- the red round rock
- To:
- the Red Round Rock
|
p |
Prints (to STDOUT) the line that contains a successfully matched Pattern. |
w FileName |
Writes to FileName the line that contains
a successfully matched Pattern. If FileName exists, it is overwritten; otherwise, it is created. A maximum
of 10 different files can be mentioned as input or output files in the entire
editing process. Include exactly one space between w
and FileName. |
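The substitutions described in this section can be run as written. The sketch below reuses the example commands and adds the p and w flags. The file names input, output and changed.log are hypothetical, and -n is the standard flag that suppresses normal output so that only lines written by the p flag appear:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'the red round rock\nthe blue sky\n' > input

# &: reuse the matched text in the replacement.
plural=$(printf 'The boy look at the game.\n' | sed 's/boy/&s/')

# \1 and \2: reuse the first and second substrings.
sturdy=$(printf 'The study chair\n' | sed 's/\(stu\)\(dy\)/\1r\2/')

# g: replace every match in the line, not only the first.
global=$(sed 's/r/R/g' input)

# p: print only the lines in which a substitution was made.
printed=$(sed -n 's/r/R/gp' input)

# w: also write the changed lines to changed.log.
sed 's/blue/green/w changed.log' input > output

printf '%s\n' "$plural" "$sturdy" "$printed"
```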