[ Previous | Next | Table of Contents | Index | Library Home | Legal | Search ]

General Programming Concepts: Writing and Debugging Programs


Chapter 15. m4 Macro Processor Overview

The m4 macro processor is a front-end processor for any programming language being used in the operating system environment.

The m4 macro processor is useful in many ways. At the beginning of a program, you can define a symbolic name or symbolic constant as a particular string of characters. You can then use the m4 program to replace unquoted occurrences of the symbolic name with the corresponding string. Besides replacing one string of text with another, the m4 macro processor provides the following features:

The m4 macro processor processes strings of letters and digits called tokens. The m4 program reads each alphanumeric token and determines if it is the name of a macro. The program then replaces the name of the macro with its defining text, and pushes the resulting string back onto the input to be rescanned. You can call macros with arguments, in which case the arguments are collected and substituted into the right places in the defining text before the defining text is rescanned.

The m4 program provides built-in macros such as define. You can also create new macros. Built-in and user-defined macros work the same way.


Using the m4 Macro Processor

To use the m4 macro processor, enter the following command:

m4 [file]

The m4 program processes each argument in order. If there are no arguments or if an argument is - (dash), m4 reads standard input as its input file. The m4 program writes its results to standard output. Therefore, to redirect the output to a file for later use, use a command such as:

m4 [file] >outputfile

 


Creating a User-Defined Macro


define (MacroName, Replacement) Defines new macro MacroName with a value of Replacement.

For example, if the following statement is in a program:

define(name, stuff)

The m4 program defines the string name as stuff. When the string name occurs in a program file, the m4 program replaces it with the string stuff. The string name must be ASCII alphanumeric and must begin with a letter or underscore. The string stuff is any text, but if the text contains parentheses the number of open, or left, parentheses must equal the number of closed, or right, parentheses. Use the / (slash) character to spread the text for stuff over multiple lines.

The open (left) parenthesis must immediately follow the word define. For example:

define(N, 100)
 . . . 
if (i > N)

defines N to be 100 and uses the symbolic constant N in a later if statement.

Macro calls in a program have the following form:

name(arg1,arg2, . . . argn)

A macro name is recognized only if it is surrounded by nonalphanumerics. Using the following example:

define(N, 100)
 . . . 
if (NNN > 100)

the variable NNN is not related to the defined macro N.

You can define macros in terms of other names. For example:

define(N, 100)
define(M, N)

defines both M and N to be 100. If you later change the definition of N and assign it a new value, M retains the value of 100, not N.

The m4 macro processor expands macro names into their defining text as soon as possible. The string N is replaced by 100. Then the string M is also replaced by 100. The overall result is the same as using the following input in the first place:

define(M, 100)

The order of the definitions can be interchanged as follows:

define(M, N)
define(N, 100)

Now M is defined to be the string N, so when the value of M is requested later, the result is the value of N at that time (because the M is replaced by N, which is replaced by 100).

Using the Quote Characters

To delay the expansion of the arguments of define, enclose them in quote characters. If you do not change them, quote characters are ` ' (left and right single quotes). Any text surrounded by quote characters is not expanded immediately, but quote characters are removed. The value of a quoted string is the string with the quote characters removed. If the input is:

define(N, 100)
define(M, 'N')

the quote characters around the N are removed as the argument is being collected. The result of using quote characters is to define M as the string N, not 100. The general rule is that the m4 program always strips off one level of quote characters whenever it evaluates something. This is true even outside of macros. To make the word define appear in the output, enter the word in quote characters, as follows:

'define' = 1;

Another example of using quote characters is redefining N. To redefine N, delay the evaluation by putting N in quote characters. For example:

define(N, 100)
. . . 
define('N', 200)

To prevent problems from occurring, quote the first argument of a macro. For example, the following fragment does not redefine N:

define(N, 100)
. . . 
define(N, 200)

The N in the second definition is replaced by 100. The result is the same as the following statement:

define(100, 200)

The m4 program ignores this statement because it can only define names, not numbers.

Changing the Quote Characters

Quote characters are normally ` ' (left or right single quotes). If those characters are not convenient, change the quote characters with the following built-in macro:

 

changequote (l, r ) Changes the left and right quote characters to the characters represented by the l and r variables.

To restore the original quote characters, use changequote without arguments as follows:

changequote

Arguments

The simplest form of macro processing is replacing one string by another (fixed) string. However, macros can also have arguments, so that you can use the macro in different places with different results. To indicate where an argument is to be used within the replacement text for a macro (the second argument of its definition), use the symbol $n to indicate the nth argument. When the macro is used, the m4 macro processor replaces the symbol with the value of the indicated argument. For example, the symbol:

$2

refers to the second argument of a macro. Therefore, if you define a macro called bump as:

define(bump, $1 = $1 + 1)

the m4 program generates code to increment the first argument by 1. The bump(x) statement is equivalent to x = x + 1.

A macro can have as many arguments as needed. However, you can access only nine arguments using the $n symbol ($1 through $9). To access arguments past the ninth argument, use the shift macro.

shift (ParameterList) Returns all but the first element of ParameterList to perform a destructive left shift of the list.

This macro drops the first argument and reassigns the remaining arguments to the $n symbols (second argument to $1, third argument to $2. . . tenth argument to $9). Using the shift macro more than once allows access to all arguments used with the macro.

The $0 macro returns the name of the macro. Arguments that are not supplied are replaced by null strings, so that you can define a macro that concatenates its arguments like this:

define(cat, $1$2$3$4$5$6$7$8$9)

Thus:

cat(x, y, z)

is the same as:

xyz

Arguments $4 through $9 in this example are null since corresponding arguments were not provided.

The m4 program discards leading unquoted blanks, tabs, or new-line characters in arguments, but keeps all other white space. Thus:

define(a, b c)

defines a to be b c.

Arguments are separated by commas. Use parentheses to enclose arguments containing commas, so that the comma does not end the argument. For example:

define(a, (b,c))

has only two arguments. The first argument is a, and the second is (b,c). To use a comma or single parenthesis, enclose it in quote characters.


Using a Built-In m4 Macro

The m4 program provides a set of predefined macros. The subsequent sections explain many of the macros and their uses.

Removing a Macro Definition


undefine (`MacroName') Removes the definition of a user-defined or built-in macro (`MacroName')

For example:

undefine(`N')

removes the definition of N. Once you remove a built-in macro with the undefine macro, as follows:

undefine(`define')

then you cannot use its definition of the built-in macro again.

Single quotes are required in this case to prevent substitution.

Checking for a Defined Macro


ifdef (`MacroName', Argument1, Argument2)
  If macro MacroName is defined and is not defined to zero, returns the value of Argument1. Otherwise, it returns Argument2.

The ifdef macro permits three arguments. If the first argument is defined, the value of ifdef is the second argument. If the first argument is not defined, the value of ifdef is the third argument. If there is no third argument, the value of ifdef is null.

Using Integer Arithmetic

The m4 program provides the following built-in functions for doing arithmetic on integers only:

incr (Number) Returns the value of Number + 1.
decr (Number ) Returns the value of Number - 1.
eval Evaluates an arithmetic expression.

Thus, to define a variable as one more than the Number value, use the following:

define(Number, 100)
define(Number1, `incr(Number)')

This defines Number1 as one more than the current value of Number.

The eval function can evaluate expressions containing the following operators (listed in decreasing order of precedence):

unary + and -

** or ^ (exponentiation)

* / % (modulus)

+ -

== != < <= > >=

!(not)

& or && (logical AND)

| or || (logical OR)

Use parentheses to group operations where needed. All operands of an expression must be numeric. The numeric value of a true relation (for example, 1 > 0) is 1, and false is 0. The precision of the eval function is 32 bits.

For example, define M to be 2==N+1 using the eval function as follows:

define(N, 3)
define(M, `eval(2==N+1)')

Use quote characters around the text that defines a macro unless the text is very simple.

Manipulating Files

To merge a new file in the input, use the built-in include function.

include (File) Returns the contents of the file File.

For example:

include(FileName)

inserts the contents of FileName in place of the include command.

A fatal error occurs if the file named in the include macro cannot be accessed. To avoid a fatal error, use the alternate form sinclude.

sinclude (File ) Returns the contents of the file File, but does not report an error if it cannot access File.

The sinclude (silent include) macro does not write a message, but continues if the file named cannot be accessed.

Redirecting Output

The output of the m4 program can be redirected again to temporary files during processing, and the collected material can be output upon command. The m4 program maintains nine possible temporary files, numbered 1 through 9. If you use the built-in divert macro.

divert (Number) Changes output stream to the temporary file Number.

The m4 program writes all output from the program after the divert function at the end of temporary file, Number. To return the output to the display screen, use either the divert or divert(0) function, which resumes the normal output process.

The m4 program writes all redirected output to the temporary files in numerical order at the end of processing. The m4 program discards the output if you redirect the output to a temporary file other than 0 through 9.

To bring back the data from all temporary files in numerical order, use the built-in undivert macro.

undivert (Number1, Number2... ) Appends the contents of the indicated temporary files to the current temporary file.

To bring back selected temporary files in a specified order, use the built-in undivert macro with arguments. When using the undivert macro, the m4 program discards the temporary files that are recovered and does not search the recovered data for macros.

The value of the undivert macro is not the diverted text.

divnum Returns the value of the currently active temporary file.

If you do not change the output file with the divert macro, the m4 program puts all output in a temporary file named 0.

Using System Programs in a Program

You can run any program in the operating system from a program by using the built-in syscmd macro. For example, the following statement runs the date program:

syscmd(date)

Using Unique File Names

Use the built-in maketemp macro to make a unique file name from a program.

maketemp (String...nnnnn...String) Creates a unique file name by replacing the characters nnnnn in the argument string with the current process ID.

For example, for the statement:

maketemp(myfilennnnn)

the m4 program returns a string that is myfile concatenated with the process ID. Use this string to name a temporary file.

Using Conditional Expressions


ifelse (String1, String2, Argument1, Argument2)
  If String1 matches String2, returns the value of Argument1. Otherwise it returns Argument2.

The built-in ifelse macro performs conditional testing. In the simplest form:

ifelse(a, b, c, d)

compares the two strings a and b.

If a and b are identical, the built-in ifelse macro returns the string c. If they are not identical, it returns string d. For example, you can define a macro called compare to compare two strings and return yes if they are the same, or no if they are different, as follows:

define(compare, `ifelse($1, $2, yes, no)`)

The quote characters prevent the evaluation of the ifelse macro from occurring too early. If the fourth argument is missing, it is treated as empty.

The ifelse macro can have any number of arguments, and therefore, provides a limited form of multiple-path decision capability. For example:

ifelse(a, b, c, d, e, f, g)

This statement is logically the same as the following fragment:

if(a == b) x = c;
else if(d == e) x = f;
else x = g;
return(x);

If the final argument is omitted, the result is null, so:

ifelse(a, b, c)

is c if a matches b, and null otherwise.

Manipulating Strings


len Returns the byte length of the string that makes up its argument

Thus:

len(abcdef)

is 6, and:

len((a,b))

is 5.

dlen Returns the length of the displayable characters in a string

Characters made up from 2-byte codes are displayed as one character. Thus, if the string contains any 2-byte, international character-support characters, the results of dlen will differ from the results of len.

substr (String, Position, Length) Returns a substring of String that begins at character number Position and is Length characters long.

Using input, substr (s, i, n) returns the substring of s that starts at the ith position (origin zero) and is n characters long. If n is omitted, the rest of the string is returned. For example, the function:

substr('now is the time',1)

returns the following string:

now is the time


index (String1, String2) Returns the character position in String1 where String2 starts (starting with character number 0), or -1 if String1 does not contain String2.

As with the built-in substr macro, the origin for strings is 0.

translit (String, Set1, Set2) Searches String for characters that are in Set1. If it finds any, changes (transliterates) those characters to corresponding characters in Set2.

It has the general form:

translit(s, f, t)

which modifies s by replacing any character found in f by the corresponding character of t. For example, the function:

translit(s, aeiou, 12345)

replaces the vowels by the corresponding digits. If t is shorter than f, characters that do not have an entry in t are deleted. If t is not present at all, characters from f are deleted from s. So:

translit(s, aeiou)

deletes vowels from string s.

dnl Deletes all characters that follow it, up to and including the new-line character.

Use this macro to get rid of empty lines. For example, the function:

define(N, 100)
define(M, 200)
define(L, 300)

results in a new-line at the end of each line that is not part of the definition. These new-line characters are passed to the output. To get rid of the new lines, add the built-in dnl macro to each of the lines.

define(N, 100) dnl
define(M, 200) dnl
define(L, 300) dnl

Printing


errprint (String) Writes its argument (String) to the standard error file

For example:

errprint ('error')


dumpdef (`MacroName'... ) Dumps the current names and definitions of items named as arguments (`MacroName'...)

If you do not supply arguments, the dumpdef macro prints all current names and definitions. Remember to quote the names.


List of Additional m4 Macros

A list of additional m4 macros, with a brief explanation of each, follows:

changecom (l, r ) Changes the left and right comment characters to the characters represented by the l and r variables.
defn (MacroName) Returns the quoted definition of MacroName
en (String) Returns the number of characters in String.
eval (Expression) Evaluates Expression as a 32-bit arithmetic expression.
m4exit (Code) Exits m4 with a return code of Code.
m4wrap (MacroName) Runs macro MacroName at the end of m4.
popdef (MacroName) Replaces the current definition of MacroName with the previous definition saved with the pushdef macro.
pushdef (MacroName, Replacement) Saves the current definition of MacroName and then defines MacroName to be Replacement.
syscmd (Command) Executes the system command Command with no return value.
sysval Gets the return code from the last use of the syscmd macro.
traceoff (MacroList) Turns off trace for any macro in MacroList. If MacroList is null, turns off all tracing.
traceon (MacroName) Turns on trace for macro MacroName. If MacroName is null, turns trace on for all macros.


[ Previous | Next | Table of Contents | Index | Library Home | Legal | Search ]