Sorts files, merges files that are already sorted, and checks files to determine if they have been sorted.
sort [ -A ] [ -b ] [ -c ] [ -d ] [ -f ] [ -i ] [ -m ] [ -n ] [ -r ] [ -u ] [ -o OutFile ] [ -t Character ] [ -T Directory ] [ -y [ Kilobytes ] ] [ -z RecordSize ] [ [ + [ FSkip ] [ .CSkip ] [ b ] [ d ] [ f ] [ i ] [ n ] [ r ] ] [ - [ FSkip ] [ .CSkip ] [ b ] [ d ] [ f ] [ i ] [ n ] [ r ] ] ] ... [ -k KeyDefinition ] ... [ File ... ]
The sort command sorts lines in the files specified by the File parameter and writes the result to standard output. If the File parameter specifies more than one file, the sort command concatenates the files and sorts them as one file. A - (minus sign) in place of a file name specifies standard input. If you do not specify any file names, the command sorts standard input. An output file can be specified with the -o flag.
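As a minimal sketch of the behavior described above, the following concatenates two files with standard input, sorts them as one file, and writes the result with the -o flag. The file names part1, part2, and sorted.out are hypothetical, and LC_ALL=C is set only to make the ordering predictable.

```shell
# Work in a throwaway directory; all file names here are illustrative.
workdir=$(mktemp -d)
printf '%s\n' pear apple > "$workdir/part1"
printf '%s\n' fig > "$workdir/part2"

# part1, standard input (-), and part2 are concatenated and sorted as one file.
# -o names the output file instead of writing to standard output.
printf '%s\n' cherry |
    LC_ALL=C sort -o "$workdir/sorted.out" "$workdir/part1" - "$workdir/part2"

merged=$(cat "$workdir/sorted.out")
printf '%s\n' "$merged"
rm -r "$workdir"
```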
If no flags are specified, the sort command sorts entire lines of the input file based upon the collation order of the current locale.
A sort key is a portion of an input line that is specified by a field number and a column number. Fields are parts of input lines that are separated by field separators. The default field separator is a sequence of one or more consecutive blank characters. A different field separator can be specified using the -t flag. The tab and the space characters are the blank characters in the C and English Language locales.
When using sort keys, the sort command first sorts all lines on the contents of the first sort key. Next, all the lines whose first sort keys are equal are sorted upon the contents of the second sort key, and so on. Sort keys are numbered according to the order they appear on the command line. If two lines sort equally on all sort keys, the entire lines are then compared based upon the collation order in the current locale.
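The key precedence described above can be sketched with two -k definitions. The three-column input is made up for illustration, and LC_ALL=C is assumed so that the comparison of the first field is plain byte order.

```shell
# Hypothetical three-column data: name, number, tag.
data=$(printf '%s\n' 'b 2 x' 'a 2 y' 'a 1 z')

# Key 1: field 1 only. Key 2: field 2 only, compared numerically.
# The two lines that tie on key 1 ("a ...") are ordered by key 2.
by_keys=$(printf '%s\n' "$data" | LC_ALL=C sort -k 1,1 -k 2,2n)
printf '%s\n' "$by_keys"
```

The line "a 1 z" is placed ahead of "a 2 y" only because of the second key; the first key alone cannot separate them.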
When numbering columns within fields, the blank characters in a default field separator are counted as part of the following field. Leading blanks are not counted as part of the first field, and field separator characters specified by the -t flag are not counted as parts of fields. Leading blank characters can be ignored using the -b flag.
Sort keys can be defined using the following two methods:
The -k KeyDefinition flag uses the following form:
-k [ FStart [ .CStart ] ] [ Modifier ] [ , [ FEnd [ .CEnd ] ][ Modifier ] ]
The sort key includes all characters beginning with the field specified by the FStart variable and the column specified by the CStart variable and ending with the field specified by the FEnd variable and the column specified by the CEnd variable. If FEnd is not specified, the last character of the line is assumed. If CEnd is not specified, the last character in the FEnd field is assumed. Any field or column number in the KeyDefinition variable may be omitted. The default values are:
FStart | Beginning of the line |
CStart | First column in the field |
FEnd | End of the line |
CEnd | Last column of the field |
If there are any spaces between the fields, sort considers them as separate fields.
The value of the Modifier variable can be one or more of the letters b, d, f, i, n, or r. The modifiers apply only to the field definition they are attached to and have the same effect as the flag of the same letter. The modifier letter b applies only to the end of the field definition to which it is attached. For example:
-k 3.2b,3r
specifies a sort key beginning in the second nonblank column of the third field and extending to the end of the third field, with the sort on this key to be done in reverse collation order. If the FStart variable and the CStart variable fall beyond the end of the line or after the FEnd variable and the CEnd variable, then the sort key is ignored.
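A small sketch of the -k 3.2b,3r definition follows. The input lines are invented, LC_ALL=C is assumed, and the behavior shown is that of a sort implementation that follows the -k rules described above.

```shell
# Hypothetical input; the first line has two blanks before its third field.
fields=$(printf '%s\n' 'x y  apple' 'x y berry' 'x y cherry')

# 3.2b : start at the second nonblank column of field 3 ("pple", "erry", "herry").
# ,3r  : end at the last character of field 3 and reverse the comparison.
rev=$(printf '%s\n' "$fields" | LC_ALL=C sort -k 3.2b,3r)
printf '%s\n' "$rev"
```

Because the keys are "pple", "erry", and "herry", reverse order places the apple line first, then cherry, then berry.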
A sort key can also be specified in the following manner:
[+[FSkip1] [.CSkip1] [Modifier] ] [-[FSkip2] [.CSkip2] [Modifier]]
The +FSkip1 variable specifies the number of fields skipped to reach the first field of the sort key, and the .CSkip1 variable specifies the number of columns skipped within that field to reach the first character in the sort key. The -FSkip2 variable specifies the number of fields skipped to reach the first character after the sort key, and the .CSkip2 variable specifies the number of columns to skip within that field. Any of the field and column skip counts may be omitted. The defaults are:
FSkip1 | Beginning of the line |
CSkip1 | Zero |
FSkip2 | End of the line |
CSkip2 | Zero |
The modifiers specified by the Modifier variable are the same as in the -k flag key sort definition.
The field and column numbers specified by +FSkip1.CSkip1 variables are generally one less than the field and column number of the sort key itself because these variables specify how many fields and columns to skip before reaching the sort key. For example:
+2.1b -3r
specifies a sort key beginning in the second nonblank column of the third field and extending to the end of the third field, with the sort on this key to be done in reverse collation order. The statement +2.1b specifies that two fields are skipped and then the leading blanks and one more column are skipped. If the +FSkip1.CSkip1 variables fall beyond the end of the line or after the -FSkip2.CSkip2 variables, then the sort key is ignored.
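The "one less" relationship between skip counts and -k positions can be sketched as simple arithmetic. The plus_to_k function is a hypothetical helper written for this illustration, not part of the sort command, and it handles only the numeric start position (no modifiers).

```shell
# Convert a start position of the form +FSkip1.CSkip1 into the
# FStart.CStart pair used by -k, by adding one to each skip count.
# plus_to_k is a hypothetical helper, not part of sort.
plus_to_k() {
    pos=${1#+}                      # drop the leading plus sign
    case $pos in
        *.*) fskip=${pos%%.*}; cskip=${pos#*.} ;;
        *)   fskip=$pos;       cskip=0 ;;
    esac
    fskip=${fskip:-0}               # an omitted field skip count is treated as zero
    cskip=${cskip:-0}               # an omitted column skip count is treated as zero
    printf '%s.%s\n' "$((fskip + 1))" "$((cskip + 1))"
}

start=$(plus_to_k +2.1)
printf '%s\n' "$start"
```

For +2.1 the helper yields 3.2, which agrees with the pairing of +2.1b and -k 3.2b in the two examples in this section.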
Note: The maximum number of fields on a line is 10.
Note: A -b, -d, -f, -i, -n, or -r flag that appears before any sort key definition applies to all sort keys. None of the -b, -d, -f, -i, -n, or -r flags may appear alone after a -k KeyDefinition; if they are attached to a KeyDefinition variable as a modifier, they apply only to the attached sort key. If one of these flags follows a +FSkip.CSkip or -FSkip.CSkip sort key definition, the flag applies only to that sort key.
This command returns the following exit values:
LANG=En_US sort fruits
This command sequence displays the contents of the fruits file sorted in ascending lexicographic order. The characters in each column are compared one by one, including spaces, digits, and special characters. For instance, if the fruits file contains the text:
banana
orange
Persimmon
apple
%%banana
apple
ORANGE
then the sort command displays:
%%banana
ORANGE
Persimmon
apple
apple
banana
orange
In the ASCII collating sequence, the % (percent sign) precedes uppercase letters, which precede lowercase letters. If your current locale specifies a character set other than ASCII, your results may be different.
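The ASCII ordering described above can be reproduced with a short sketch. It assumes LC_ALL=C to force the ASCII collating sequence, and it recreates the fruits file in a temporary directory.

```shell
# Recreate the fruits file from the example in a throwaway directory.
workdir=$(mktemp -d)
printf '%s\n' banana orange Persimmon apple '%%banana' apple ORANGE > "$workdir/fruits"

# LC_ALL=C selects byte-value (ASCII) collation: % < uppercase < lowercase.
ascii_order=$(LC_ALL=C sort "$workdir/fruits")
printf '%s\n' "$ascii_order"
rm -r "$workdir"
```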
With the -d flag, the sort command displays:
ORANGE
Persimmon
apple
apple
%%banana
banana
orange
The -d flag ignores the % (percent sign) character because it is not a letter, digit, or space, placing %%banana with banana.
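A minimal sketch of the dictionary-order sort, assuming LC_ALL=C and the same fruits file as in the first example:

```shell
# Recreate the fruits file from the first example.
workdir=$(mktemp -d)
printf '%s\n' banana orange Persimmon apple '%%banana' apple ORANGE > "$workdir/fruits"

# -d compares only letters, digits, and blanks, so %%banana is keyed as banana.
# Lines whose keys are equal are then ordered by comparing the whole lines.
dict_order=$(LC_ALL=C sort -d "$workdir/fruits")
printf '%s\n' "$dict_order"
rm -r "$workdir"
```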
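When the -d and -f flags are used together, the sort command displays:
apple
apple
%%banana
banana
ORANGE
orange
Persimmon
When duplicate lines are also removed with the -u flag, the sort command displays:
apple
%%banana
orange
Persimmon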
Not only is the duplicate apple removed, but banana and ORANGE as well. These are removed because the -d flag ignores the %% special characters and the -f flag ignores differences in case.
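The effect of combining -d, -f, and -u can be sketched as follows, assuming LC_ALL=C and the fruits file from the first example. Which member of a group of equal lines is kept can differ between sort implementations, so this sketch only counts the surviving lines rather than asserting which spelling remains.

```shell
# Recreate the fruits file from the first example.
workdir=$(mktemp -d)
printf '%s\n' banana orange Persimmon apple '%%banana' apple ORANGE > "$workdir/fruits"

# -d ignores the %% characters, -f folds case, and -u keeps one line per
# group of lines that compare equal under those rules.
unique=$(LC_ALL=C sort -d -f -u "$workdir/fruits")
count=$(printf '%s\n' "$unique" | wc -l | tr -d ' ')
printf '%s\n' "$unique"
printf 'lines kept: %s\n' "$count"
rm -r "$workdir"
```

Seven input lines collapse into four groups: apple, banana, orange, and Persimmon.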
Given the fruits file shown in example 1, the added +0 distinguishes %%banana from banana and ORANGE from orange. However, the two instances of apple are identical, so one of them is deleted.
apple
%%banana
banana
ORANGE
orange
Persimmon
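The same two-key idea can be sketched with -k definitions. This is an assumed rough equivalent, not the command from the example: -k 1df stands in for the first key with the d and f modifiers, and -k 1 stands in for the second, unmodified key. LC_ALL=C and the fruits file from the first example are assumed.

```shell
# Recreate the fruits file from the first example.
workdir=$(mktemp -d)
printf '%s\n' banana orange Persimmon apple '%%banana' apple ORANGE > "$workdir/fruits"

# Key 1 (-k 1df): whole line, dictionary order, case folded.
# Key 2 (-k 1):   whole line compared exactly, which separates %%banana
#                 from banana and ORANGE from orange.
# -u then removes only lines that are equal on both keys (the second apple).
two_keys=$(LC_ALL=C sort -u -k 1df -k 1 "$workdir/fruits")
printf '%s\n' "$two_keys"
rm -r "$workdir"
```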
If the input file contains:
yams:104
turnips:8
potatoes:15
carrots:104
green beans:32
radishes:5
lettuce:15
Then, with the LC_ALL, LC_COLLATE, or LANG environment variable set to C, the sort command displays:
carrots:104
yams:104
lettuce:15
potatoes:15
green beans:32
radishes:5
turnips:8
Note that the numbers are not in numeric order. This happens because a lexicographic sort compares each character from left to right. In other words, 3 comes before 5, so 32 comes before 5.
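The difference between a lexicographic and a numeric comparison of the second field can be sketched with the -k form of the key definition. LC_ALL=C is assumed, and the file name vegetables is chosen here for illustration.

```shell
# Recreate the colon-separated input in a throwaway directory.
workdir=$(mktemp -d)
printf '%s\n' yams:104 turnips:8 potatoes:15 carrots:104 \
    'green beans:32' radishes:5 lettuce:15 > "$workdir/vegetables"

# -t: makes the colon the field separator; -k 2 sorts on the second field.
# Without n, "104" < "15" < "32" < "5" < "8" as character strings.
lexical=$(LC_ALL=C sort -t: -k 2 "$workdir/vegetables")

# The n modifier compares the same field by numeric value instead.
numeric=$(LC_ALL=C sort -t: -k 2n "$workdir/vegetables")

printf '%s\n' "$lexical"
printf '%s\n' "$numeric"
rm -r "$workdir"
```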
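When the second field is sorted numerically, the sort command displays:
radishes:5
turnips:8
lettuce:15
potatoes:15
green beans:32
carrots:104
yams:104
When the second field is sorted numerically and the first field is then sorted in reverse order, the sort command displays:
radishes:5
turnips:8
potatoes:15
lettuce:15
green beans:32
yams:104
carrots:104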
The command sorts the lines in numeric order. When two lines have the same number, they appear in reverse alphabetic order.
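A sketch of a two-key sort that produces this ordering, written with -k definitions. LC_ALL=C is assumed, and the file name vegetables is chosen here for illustration.

```shell
# Recreate the colon-separated input in a throwaway directory.
workdir=$(mktemp -d)
printf '%s\n' yams:104 turnips:8 potatoes:15 carrots:104 \
    'green beans:32' radishes:5 lettuce:15 > "$workdir/vegetables"

# Key 1 (-k 2,2n): second field only, compared numerically.
# Key 2 (-k 1,1r): first field only, in reverse order, used to break ties.
multi=$(LC_ALL=C sort -t: -k 2,2n -k 1,1r "$workdir/vegetables")
printf '%s\n' "$multi"
rm -r "$workdir"
```

Limiting each key to a single field (2,2 and 1,1) keeps the numeric key from running on to the end of the line and keeps the reverse modifier confined to the name field.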
/usr/bin/sort | Contains the sort command. |
The comm command, join command, and uniq command.
Files Overview, Input and Output Redirection Overview in AIX 5L Version 5.1 System User's Guide: Operating System and Devices.
National Language Support Overview for Programming in AIX 5L Version 5.1 General Programming Concepts: Writing and Debugging Programs.