AIX Version 4.3 Assembler Language Reference

Subroutine Linkage Convention

This article discusses the following:

Linkage Convention Overview

The subroutine linkage convention describes the machine state at subroutine entry and exit. When followed, this scheme allows routines compiled separately in the same or different languages to be linked and executed when called.

The linkage convention allows for parameter passing and return values to be in floating-point registers (FPRs), general-purpose registers (GPRs), or both.

Object Mode Considerations

For AIX Version 4.3, the following discussion applies to both 32-bit mode and 64-bit mode with the following notes:

Register Usage and Conventions

The PowerPC 32-bit architecture has 32 GPRs and 32 FPRs. Each GPR is 32 bits wide, and each FPR is 64 bits wide. There are also special registers for branching, exception handling, and other purposes. The General-Purpose Register Conventions table shows how GPRs are used.

General-Purpose Register Conventions
Register       Status        Use
GPR0           volatile      In function prologs.
GPR1           dedicated     Stack pointer.
GPR2           dedicated     Table of Contents (TOC) pointer.
GPR3           volatile      First word of a function's argument list; first word of a scalar function return.
GPR4           volatile      Second word of a function's argument list; second word of a scalar function return.
GPR5           volatile      Third word of a function's argument list.
GPR6           volatile      Fourth word of a function's argument list.
GPR7           volatile      Fifth word of a function's argument list.
GPR8           volatile      Sixth word of a function's argument list.
GPR9           volatile      Seventh word of a function's argument list.
GPR10          volatile      Eighth word of a function's argument list.
GPR11          volatile      In calls by pointer and as an environment pointer for languages that require it (for example, PASCAL).
GPR12          volatile      For special exception handling required by certain languages and in glink code.
GPR13:GPR31    nonvolatile   These registers must be preserved across a function call.

The preferred method of using GPRs is to use the volatile registers first. Next, use the nonvolatile registers in descending order, starting with GPR31 and proceeding down to GPR13. GPR1 and GPR2 must be dedicated as stack and Table of Contents (TOC) area pointers, respectively. GPR1 and GPR2 must appear to be saved across a call, and must have the same values at return as when the call was made.

Volatile registers are scratch registers presumed to be destroyed across a call and are, therefore, not saved by the callee. Volatile registers are also used for specific purposes as shown in the previous table. Nonvolatile and dedicated registers are required to be saved and restored if altered and, thus, are guaranteed to retain their values across a function call.
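The GPR conventions above can be summarized in a few lines of code. The following is a minimal Python sketch, not part of the AIX toolchain; the function names gpr_status, callee_must_preserve, and argument_register are hypothetical and chosen only for this illustration. It encodes the table exactly as given: GPR1 and GPR2 are dedicated, GPR0 and GPR3 through GPR12 are volatile, and GPR13 through GPR31 are nonvolatile.

```python
def gpr_status(n):
    """Return the linkage-convention status of GPRn, per the table above."""
    if not 0 <= n <= 31:
        raise ValueError("GPR number must be in 0..31")
    if n in (1, 2):
        return "dedicated"      # stack pointer and TOC pointer
    if n <= 12:
        return "volatile"       # GPR0 and GPR3..GPR12
    return "nonvolatile"        # GPR13..GPR31


def callee_must_preserve(n):
    """True if a callee that alters GPRn must save and restore it."""
    return gpr_status(n) != "volatile"


def argument_register(word_index):
    """GPR that carries the given 0-based argument word, or None if the
    word is beyond the eight words passed in GPR3..GPR10."""
    if 0 <= word_index < 8:
        return 3 + word_index
    return None
```

For example, argument_register(0) is 3 (GPR3 carries the first argument word), and callee_must_preserve(12) is False because GPR12 is a scratch register.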

The Floating-Point Register Conventions table shows how the FPRs are used.

Floating-Point Register Conventions
Register       Status        Use
FPR0           volatile      As a scratch register.
FPR1           volatile      First floating-point parameter; first 8 bytes of a floating-point scalar return.
FPR2           volatile      Second floating-point parameter; second 8 bytes of a floating-point scalar return.
FPR3           volatile      Third floating-point parameter; third 8 bytes of a floating-point scalar return.
FPR4           volatile      Fourth floating-point parameter; fourth 8 bytes of a floating-point scalar return.
FPR5           volatile      Fifth floating-point parameter.
FPR6           volatile      Sixth floating-point parameter.
FPR7           volatile      Seventh floating-point parameter.
FPR8           volatile      Eighth floating-point parameter.
FPR9           volatile      Ninth floating-point parameter.
FPR10          volatile      Tenth floating-point parameter.
FPR11          volatile      Eleventh floating-point parameter.
FPR12          volatile      Twelfth floating-point parameter.
FPR13          volatile      Thirteenth floating-point parameter.
FPR14:FPR31    nonvolatile   If modified, must be preserved across a call.

The preferred method of using FPRs is to use the volatile registers first. Next, the nonvolatile registers are used in descending order, starting with FPR31 and proceeding down to FPR14.

Only scalars are returned in multiple registers. The number of registers required depends on the size and type of the scalar. For floating-point values, the following results occur:

Special Registers in the PowerPC

The Special-Purpose Register Conventions table shows the PowerPC special purpose registers (SPRs). These are the only SPRs for which there is a register convention.

Special-Purpose Register Conventions
Register or Register Field   Status        Use
LR                           volatile      Used as a branch target address or holds a return address.
CTR                          volatile      Used for loop count decrement and branching.
XER                          volatile      Fixed-point exception register.
FPSCR                        volatile      Floating-point exception register.
CR0, CR1                     volatile      Condition-register bits.
CR2, CR3, CR4                nonvolatile   Condition-register bits.
CR5, CR6, CR7                volatile      Condition-register bits.

Routines that alter CR2, CR3, and CR4 must save and restore at least these fields of the CR. Use of other CR fields does not require saving or restoring.
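When only CR2, CR3, and CR4 are restored, the mtcrf instruction needs a field mask that selects those three fields. The following Python sketch (illustrative only; the names mtcrf_mask and cr_bit_range are hypothetical) computes that mask, assuming the standard PowerPC layout in which the condition register holds eight 4-bit fields and the most significant bit of the 8-bit mtcrf mask selects CR0.

```python
def mtcrf_mask(fields):
    """Build the 8-bit mtcrf field mask that selects the given CR fields.

    Field 0 (CR0) corresponds to mask bit 0x80 and field 7 (CR7) to 0x01.
    """
    mask = 0
    for field in fields:
        if not 0 <= field <= 7:
            raise ValueError("CR field number must be in 0..7")
        mask |= 0x80 >> field
    return mask


def cr_bit_range(field):
    """Return (first_bit, last_bit) of a CR field, numbering bits from 0."""
    if not 0 <= field <= 7:
        raise ValueError("CR field number must be in 0..7")
    return (4 * field, 4 * field + 3)


# Mask for the nonvolatile fields CR2, CR3, and CR4.
NONVOLATILE_CR_MASK = mtcrf_mask([2, 3, 4])
```

The resulting mask is 0x38, and the nonvolatile fields together span CR bits 8 through 19.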

Run-Time Process Stack

The stack format convention is designed to enhance the efficiency of the following:

The Run-Time Stack figure illustrates the run-time stack. It shows the stack after the sender function calls the catcher function, but before the catcher function calls another function. This figure is based on the assumption that the catcher function will call another function. Therefore, the catcher function requires another link area (as described in the stack layout). PWn refers to the nth word of parameters that are passed.

Stack Layout

Only one register, referred to as the stack pointer (SP), is used for addressing the stack, and GPR1 is the dedicated stack pointer register. The stack grows from numerically higher storage addresses to numerically lower addresses.

The Run-Time Stack figure illustrates what happens when the sender function calls the catcher function, and how the catcher function requires a stack frame of its own. When a function makes no calls and requires no local storage of its own, no stack frame is required and the SP is not altered.

Notes:
  1. To reduce confusion, data being passed from the sender function (the caller) is referred to as arguments, and the same data being received by the catcher function (the callee) is referred to as parameters. The output argument area of sender is the same as the input parameter area of catcher.
  2. The address value in the stack pointer must be quadword-aligned. (The address value must be a multiple of 16.)
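The quadword-alignment rule in note 2 means that a frame size is rounded up to the next multiple of 16 before the stack pointer is decremented. A minimal Python sketch of that arithmetic follows; the names pad_frame_size and is_quadword_aligned are hypothetical and used only for illustration.

```python
STACK_ALIGNMENT = 16  # bytes in a quadword


def pad_frame_size(nbytes):
    """Round a frame size up to the next multiple of 16 bytes."""
    if nbytes < 0:
        raise ValueError("frame size must be non-negative")
    return (nbytes + STACK_ALIGNMENT - 1) & ~(STACK_ALIGNMENT - 1)


def is_quadword_aligned(address):
    """True if the address is a multiple of 16."""
    return address % STACK_ALIGNMENT == 0
```

Because the caller's SP is already aligned, subtracting a padded frame size keeps the new SP aligned as well.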
Stack Areas

For convenience, the stack layout has been divided into eight areas numbered 1 to 8, starting from the bottom of the diagram (high address) to the top of the diagram (low address). The sender's stack pointer is pointing to the top of area 3 when the call to the catcher function is made, which is also the same SP value that is used by the catcher function on entry to its prolog. The following is a description of the stack areas, starting from the bottom of the diagram (area 1) and moving up to the top (area 8):

Stack-Related System Standard

All language processors and assemblers must maintain the stack-related system standard that the SP must be atomically updated by a single instruction. This ensures that there is no timing window in which an interrupt could find the stack pointer only partially updated.

Note: The examples of program prologs and epilogs show the most efficient way to update the stack pointer.

Prologs and Epilogs

Prologs and epilogs may be used for functions, including setting the registers on function entry and restoring the registers on function exit.

No predetermined code sequences representing function prologs and epilogs are dictated. However, certain operations must be performed under certain conditions. The Stack Frame Layout figure shows the stack frame layout.

A typical function's execution stack is:

The Prolog Actions and Epilog Actions tables show the conditions and actions required for prologs and epilogs.

Prolog Actions
If:                                                               Then:
Any nonvolatile FPRs (FPR14:FPR31) are used                       Save them in the FPR save area (area 4 in the previous figure).
Any nonvolatile GPRs (GPR13:GPR31) are used                       Save them in the GPR save area (area 5 in the previous figure).
LR is used for a nonleaf procedure                                Save the LR at offset eight from the caller function SP.
Any of the nonvolatile condition register (CR) fields are used    Save the CR at offset four from the caller function SP.
A new stack frame is required                                     Get a stack frame and decrement the SP by the size of the frame padded (if necessary) to a multiple of 16 to acquire a new SP and save caller's SP at offset 0 from the new SP.
Note: A leaf function that does not require stack space for local variables and temporaries can save its caller registers at a negative offset from the caller SP without actually acquiring a stack frame.

Epilog Actions
If:                                                               Then:
Any nonvolatile FPRs were saved                                   Restore the FPRs that were used.
Any nonvolatile GPRs were saved                                   Restore the GPRs that were saved.
The LR was altered because a nonleaf procedure was invoked        Restore LR.
The CR was altered                                                Restore CR.
A new stack frame was acquired                                    Restore the old SP to the value it had on entry (the caller's SP). Return to caller.

While the PowerPC architecture provides both load and store multiple instructions for GPRs, it discourages their use because their implementation on some machines may not be optimal. In fact, use of the load and store multiple instructions on some future implementations may be significantly slower than the equivalent series of single word loads or stores. However, saving many FPRs or GPRs with single load or store instructions in a function prolog or epilog leads to increased code size. For this reason, the system environment must provide routines that can be called from a function prolog and epilog that will do the saving and restoring of the FPRs and GPRs. The interface to these routines, their source code, and some prolog and epilog code sequences are provided.

As shown in the stack frame layout, the GPR save area is not at a fixed position from either the caller SP or the callee SP. The FPR save area starts at a fixed position, directly above the SP (lower address) on entry to that callee, but the position of the GPR save area depends on the number of FPRs saved. Thus, it is difficult to write a general-purpose GPR-saving function that uses fixed displacements from SP.

If the routine needs to save both GPRs and FPRs, use GPR12 as the pointer for saving and restoring GPRs. (GPR12 is a volatile register, but does not contain input parameters.) This results in the definition of multiple-register save and restore routines, each of which saves or restores m FPRs and n GPRs. This is achieved by executing a bla (Branch and Link Absolute) instruction to specially provided routines containing multiple entry points (one for each register number), starting from the lowest nonvolatile register.

Notes:
  1. There are no entry points for saving and restoring GPR and FPR numbers greater than 29. It is more efficient to save a small number of registers in the prolog than it is to call the save and restore functions.
  2. If the LR is not saved or restored in the following code segments, the language processor must perform the saving and restoring as appropriate.

Language processors must use a proprietary method to conserve the values of nonvolatile registers across a function call.

Three sets of save and restore routines must be made available by the system environment. These routines are:

Saving GPRs Only

For a function that saves and restores n GPRs and no FPRs, the saving can be done using individual store and load instructions or by calling system-provided routines as shown in the following example:

Note: The number of registers being saved is n. Sequences such as <32-n> in the following examples indicate the first register number to be saved and restored. All registers from <32-n> to 31, inclusive, are saved and restored.
#Following are the prolog/epilog of a function that saves n GPRs
#(n>2):
mflr    r0                       #move LR into GPR0
bla     _savegpr0_<32-n>         #branch and link to save GPRs
stwu    r1,<-frame_size>(r1)     #update SP and save caller's SP
...                              #frame_size is the size of the
                                 #required stack frame
<save CR if necessary>
...
...                              #body of function
...
<reload saved CR if necessary>
...
<reload caller's SP into r1>     #see note below
ba      _restgpr0_<32-n>         #restore GPRs and return
Note: The restoring of the calling function SP can be done by either adding the frame_size value to the current SP whenever frame_size is known, or by reloading it from offset 0 from the current SP. The first approach is more efficient, but not possible for functions that use the alloca subroutine to dynamically allocate stack space.

The following example shows a GPR save routine when FPRs are not saved:

_savegpr0_13    stw     r13,-76(r1)              #save r13
_savegpr0_14    stw     r14,-72(r1)              #save r14
_savegpr0_15    stw     r15,-68(r1)              #save r15
_savegpr0_16    stw     r16,-64(r1)              #save r16
_savegpr0_17    stw     r17,-60(r1)              #save r17
_savegpr0_18    stw     r18,-56(r1)              #save r18
_savegpr0_19    stw     r19,-52(r1)              #save r19
_savegpr0_20    stw     r20,-48(r1)              #save r20
_savegpr0_21    stw     r21,-44(r1)              #save r21
_savegpr0_22    stw     r22,-40(r1)              #save r22
_savegpr0_23    stw     r23,-36(r1)              #save r23
_savegpr0_24    stw     r24,-32(r1)              #save r24
_savegpr0_25    stw     r25,-28(r1)              #save r25
_savegpr0_26    stw     r26,-24(r1)              #save r26
_savegpr0_27    stw     r27,-20(r1)              #save r27
_savegpr0_28    stw     r28,-16(r1)              #save r28
_savegpr0_29    stw     r29,-12(r1)              #save r29
                stw     r30,-8(r1)               #save r30
                stw     r31,-4(r1)               #save r31
                stw     r0 , 8(r1)               #save LR in 
                                                 #caller's frame
                blr                              #return
Note: This save routine must not be called when GPR30 or GPR31, or both, are the only registers being saved. In these cases, the saving and restoring must be done inline.
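The displacements in the save routine follow a simple pattern: each GPR occupies 4 bytes, and GPR31 sits immediately below the caller's SP. The Python sketch below models that pattern and the <32-n> naming of the entry points. It is an illustration only; the helper names gpr_save_offset and savegpr0_entry are hypothetical, and the limits (first register between 13 and 29) are taken from the listing and notes above.

```python
def gpr_save_offset(reg):
    """Displacement from the caller's SP (r1) at which GPR `reg` is saved
    when no FPRs are saved."""
    if not 13 <= reg <= 31:
        raise ValueError("only GPR13..GPR31 are nonvolatile")
    return -4 * (32 - reg)


def savegpr0_entry(n):
    """Entry-point name used to save the n highest-numbered GPRs.

    The first register saved is <32-n>; entry points exist only for
    registers 13 through 29, so n must be between 3 and 19.
    """
    first = 32 - n
    if not 13 <= first <= 29:
        raise ValueError("no entry point for this register count")
    return f"_savegpr0_{first}"
```

For example, saving all 19 nonvolatile GPRs enters at _savegpr0_13, where r13 is stored at displacement -76.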

The following example shows a GPR restore routine when FPRs are not saved:

_restgpr0_13    lwz     r13,-76(r1)              #restore r13
_restgpr0_14    lwz     r14,-72(r1)              #restore r14
_restgpr0_15    lwz     r15,-68(r1)              #restore r15
_restgpr0_16    lwz     r16,-64(r1)              #restore r16
_restgpr0_17    lwz     r17,-60(r1)              #restore r17
_restgpr0_18    lwz     r18,-56(r1)              #restore r18
_restgpr0_19    lwz     r19,-52(r1)              #restore r19
_restgpr0_20    lwz     r20,-48(r1)              #restore r20
_restgpr0_21    lwz     r21,-44(r1)              #restore r21
_restgpr0_22    lwz     r22,-40(r1)              #restore r22
_restgpr0_23    lwz     r23,-36(r1)              #restore r23
_restgpr0_24    lwz     r24,-32(r1)              #restore r24
_restgpr0_25    lwz     r25,-28(r1)              #restore r25
_restgpr0_26    lwz     r26,-24(r1)              #restore r26
_restgpr0_27    lwz     r27,-20(r1)              #restore r27
_restgpr0_28    lwz     r28,-16(r1)              #restore r28
_restgpr0_29    lwz     r0,8(r1)                 #get return
                                                 #address from
                                                 #frame
                lwz     r29,-12(r1)              #restore r29
                mtlr    r0                       #move return
                                                 #address to LR
                lwz     r30,-8(r1)               #restore r30
                lwz     r31,-4(r1)               #restore r31
                blr                              #return
Note: This restore routine must not be called when GPR30 or GPR31, or both, are the only registers being saved. In these cases, the saving and restoring must be done inline.

Saving GPRs and FPRs

For a function that saves and restores n GPRs and m FPRs (n>2 and m>2), the saving can be done using individual store and load instructions or by calling system-provided routines as shown in the following example:

#The following example shows the prolog/epilog of a function
#which saves n GPRs and m FPRs:
mflr    r0                       #move LR into GPR 0
subi    r12,r1,8*m               #compute GPR save pointer
bla     _savegpr1_<32-n>         #branch and link to save GPRs
bla     _savefpr_<32-m>          #branch and link to save FPRs
stwu    r1,<-frame_size>(r1)     #update SP and save caller's SP
...
<save CR if necessary>
...
...                              #body of function
...
<reload saved CR if necessary>
...
<reload caller's SP into r1>     #see note below
subi    r12,r1,8*m               #compute GPR restore pointer
bla     _restgpr1_<32-n>         #restore GPRs
ba      _restfpr_<32-m>          #restore FPRs and return
Note: The calling function SP can be restored by either adding the frame_size value to the current SP whenever the frame_size is known or by reloading it from offset 0 from the current SP. The first approach is more efficient, but not possible for functions that use the alloca subroutine to dynamically allocate stack space.

The following example shows a GPR save routine when FPRs are saved:

_savegpr1_13    stw     r13,-76(r12)             #save r13
_savegpr1_14    stw     r14,-72(r12)             #save r14
_savegpr1_15    stw     r15,-68(r12)             #save r15
_savegpr1_16    stw     r16,-64(r12)             #save r16
_savegpr1_17    stw     r17,-60(r12)             #save r17
_savegpr1_18    stw     r18,-56(r12)             #save r18
_savegpr1_19    stw     r19,-52(r12)             #save r19
_savegpr1_20    stw     r20,-48(r12)             #save r20
_savegpr1_21    stw     r21,-44(r12)             #save r21
_savegpr1_22    stw     r22,-40(r12)             #save r22
_savegpr1_23    stw     r23,-36(r12)             #save r23
_savegpr1_24    stw     r24,-32(r12)             #save r24
_savegpr1_25    stw     r25,-28(r12)             #save r25
_savegpr1_26    stw     r26,-24(r12)             #save r26
_savegpr1_27    stw     r27,-20(r12)             #save r27
_savegpr1_28    stw     r28,-16(r12)             #save r28
_savegpr1_29    stw     r29,-12(r12)             #save r29
                stw     r30,-8(r12)              #save r30
                stw     r31,-4(r12)              #save r31
                blr                              #return

The following example shows an FPR save routine:

_savefpr_14    stfd     f14,-144(r1)             #save f14
_savefpr_15    stfd     f15,-136(r1)             #save f15
_savefpr_16    stfd     f16,-128(r1)             #save f16
_savefpr_17    stfd     f17,-120(r1)             #save f17
_savefpr_18    stfd     f18,-112(r1)             #save f18
_savefpr_19    stfd     f19,-104(r1)             #save f19
_savefpr_20    stfd     f20,-96(r1)              #save f20
_savefpr_21    stfd     f21,-88(r1)              #save f21
_savefpr_22    stfd     f22,-80(r1)              #save f22
_savefpr_23    stfd     f23,-72(r1)              #save f23
_savefpr_24    stfd     f24,-64(r1)              #save f24
_savefpr_25    stfd     f25,-56(r1)              #save f25
_savefpr_26    stfd     f26,-48(r1)              #save f26
_savefpr_27    stfd     f27,-40(r1)              #save f27
_savefpr_28    stfd     f28,-32(r1)              #save f28
_savefpr_29    stfd     f29,-24(r1)              #save f29
               stfd     f30,-16(r1)              #save f30
               stfd     f31,-8(r1)               #save f31
               stw      r0 , 8(r1)               #save LR in 
                                                 #caller's frame
               blr                               #return
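When both kinds of registers are saved, the FPR save area sits directly below the caller's SP and the GPR save area sits below it, which is why r12 is set to r1 minus 8*m before the GPR routine is called. The following Python sketch models the resulting displacements. It is illustrative only, and the names fpr_save_offset and gpr_offset_from_sp are hypothetical.

```python
def fpr_save_offset(reg):
    """Displacement from the caller's SP (r1) at which FPR `reg` is saved."""
    if not 14 <= reg <= 31:
        raise ValueError("only FPR14..FPR31 are nonvolatile")
    return -8 * (32 - reg)


def gpr_offset_from_sp(reg, m):
    """Displacement from the caller's SP at which GPR `reg` is saved when
    m FPRs are also saved.

    The save routine addresses GPRs relative to r12 = r1 - 8*m, so the
    FPR save area size is added to the GPR-only displacement.
    """
    if not 13 <= reg <= 31:
        raise ValueError("only GPR13..GPR31 are nonvolatile")
    if not 0 <= m <= 18:
        raise ValueError("at most 18 nonvolatile FPRs can be saved")
    return -8 * m - 4 * (32 - reg)
```

With all 18 nonvolatile FPRs saved, f14 lands at displacement -144 and r13 at -220 from the caller's SP.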

The following example shows a GPR restore routine when FPRs are saved:

_restgpr1_13    lwz     r13,-76(r12)             #restore r13
_restgpr1_14    lwz     r14,-72(r12)             #restore r14
_restgpr1_15    lwz     r15,-68(r12)             #restore r15
_restgpr1_16    lwz     r16,-64(r12)             #restore r16
_restgpr1_17    lwz     r17,-60(r12)             #restore r17
_restgpr1_18    lwz     r18,-56(r12)             #restore r18
_restgpr1_19    lwz     r19,-52(r12)             #restore r19
_restgpr1_20    lwz     r20,-48(r12)             #restore r20
_restgpr1_21    lwz     r21,-44(r12)             #restore r21
_restgpr1_22    lwz     r22,-40(r12)             #restore r22
_restgpr1_23    lwz     r23,-36(r12)             #restore r23
_restgpr1_24    lwz     r24,-32(r12)             #restore r24
_restgpr1_25    lwz     r25,-28(r12)             #restore r25
_restgpr1_26    lwz     r26,-24(r12)             #restore r26
_restgpr1_27    lwz     r27,-20(r12)             #restore r27
_restgpr1_28    lwz     r28,-16(r12)             #restore r28
_restgpr1_29    lwz     r29,-12(r12)             #restore r29
                lwz     r30,-8(r12)              #restore r30
                lwz     r31,-4(r12)              #restore r31
                blr                              #return

The following example shows an FPR restore routine:

_restfpr_14    lfd     f14,-144(r1)              #restore f14
_restfpr_15    lfd     f15,-136(r1)              #restore f15
_restfpr_16    lfd     f16,-128(r1)              #restore f16
_restfpr_17    lfd     f17,-120(r1)              #restore f17
_restfpr_18    lfd     f18,-112(r1)              #restore f18
_restfpr_19    lfd     f19,-104(r1)              #restore f19
_restfpr_20    lfd     f20,-96(r1)               #restore f20
_restfpr_21    lfd     f21,-88(r1)               #restore f21
_restfpr_22    lfd     f22,-80(r1)               #restore f22
_restfpr_23    lfd     f23,-72(r1)               #restore f23
_restfpr_24    lfd     f24,-64(r1)               #restore f24
_restfpr_25    lfd     f25,-56(r1)               #restore f25
_restfpr_26    lfd     f26,-48(r1)               #restore f26
_restfpr_27    lfd     f27,-40(r1)               #restore f27
_restfpr_28    lfd     f28,-32(r1)               #restore f28
_restfpr_29    lwz     r0,8(r1)                  #get return
                                                 #address from
                                                 #frame
               lfd     f29,-24(r1)               #restore f29
               mtlr    r0                        #move return
                                                 #address to LR
               lfd     f30,-16(r1)               #restore f30
               lfd     f31,-8(r1)                #restore f31
               blr                               #return

Saving FPRs Only

For a function that saves and restores m FPRs (m>2), the saving can be done using individual store and load instructions or by calling system-provided routines as shown in the following example:

#The following example shows the prolog/epilog of a function
#which saves m FPRs and no GPRs:
mflr    r0                       #move LR into GPR 0
bla     _savefpr_<32-m>
stwu    r1,<-frame_size>(r1)     #update SP and save caller's SP
...
<save CR if necessary>
...
...                              #body of function
...
<reload saved CR if necessary>
...
<reload caller's SP into r1>     #see note below
ba      _restfpr_<32-m>          #restore FPRs and return
Notes:
  1. There are no entry points for saving and restoring GPR and FPR numbers higher than 29. It is more efficient to save a small number of registers in the prolog than to call the save and restore functions.
  2. The restoring of the calling function SP can be done by either adding the frame_size value to the current SP whenever frame_size is known, or by reloading it from offset 0 from the current SP. The first approach is more efficient, but not possible for functions that use the alloca subroutine to dynamically allocate stack space.
Updating the Stack Pointer

The PowerPC stwu (Store Word with Update) instruction is used for computing the new SP and saving the back chain. This instruction has a signed 16-bit displacement field, which can represent values from -32,768 to 32,767. A stack frame larger than 32K bytes therefore cannot be allocated with stwu alone: the negated frame size must first be built in a register, and the SP itself must still be updated atomically by a single instruction.

The two assembly code examples illustrate how to update the SP in a prolog.

To compute a new SP and save the old SP for stack frames larger than or equal to 32K bytes:

addis   r12, r0, (<-frame_size> >> 16) & 0XFFFF
                        # set r12 to left half of frame size
ori     r12, r12, <-frame_size> & 0XFFFF
                        # Add right halfword of frame size
stwux   r1, r1, r12     # save old SP and compute new SP

To compute a new SP and save the old SP for stack frames smaller than 32K bytes:

stwu   r1, <-frame_size>(r1)   #update SP and save caller's SP
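The choice between the two sequences, and the halfword split used by the longer one, can be checked with a short calculation. The Python sketch below models 32-bit mode only and is not assembler input; the names fits_stwu, split_displacement, and addis_ori are hypothetical. It assumes that addis with a zero base places its 16-bit immediate in the upper halfword and that ori ORs a zero-extended 16-bit immediate into the lower halfword.

```python
def fits_stwu(frame_size):
    """True if -frame_size fits the signed 16-bit displacement of stwu."""
    return -32768 <= -frame_size <= 32767


def split_displacement(frame_size):
    """Split the 32-bit two's-complement value of -frame_size into the
    upper and lower halfwords used by the addis/ori pair."""
    value = (-frame_size) & 0xFFFFFFFF
    return (value >> 16) & 0xFFFF, value & 0xFFFF


def addis_ori(hi, lo):
    """Model the 32-bit register value after addis r12,0,hi ; ori r12,r12,lo."""
    return ((hi << 16) | lo) & 0xFFFFFFFF


def to_signed32(value):
    """Interpret a 32-bit pattern as a signed integer."""
    return value - 0x100000000 if value & 0x80000000 else value
```

For a hypothetical frame of 0x12345 bytes, split_displacement returns the halfwords 0xFFFE and 0xDCBB, and recombining them yields -0x12345, the value that stwux adds to r1.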

Calling Routine's Responsibilities

When an assembler language program calls another program, the caller should not use the names of the called program's commands, functions, or procedures as global assembler language symbols. To avoid confusion, follow the naming conventions for the language of the called program when you create symbol names. For example, if you are calling a C language program, be certain you use the naming conventions for that language.

A called routine has two symbols associated with it: a function descriptor (Name) and an entry point (.Name). When a call is made to a routine, the compiled code branches directly to the entry point (.Name).

Apart from loading parameters into the proper registers, compilers expand calls to functions to include a NOP instruction after each branch and link instruction. The linkage editor modifies this extra instruction to restore the contents of the TOC register (register 2) on return from an out-of-module call.

The instruction sequence produced by compilers is:

bl .foo            #Branch to foo
cror 31,31,31     #Special NOP 0x4ffffb82
Note: Some compilers produce a cror 15,15,15 (0x4def7b82) instruction. To avoid having to restore condition register bit 15 after a call, the linkage editor transforms cror 15,15,15 into cror 31,31,31. Condition register bit 31 is not preserved across a call and does not have to be restored.
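The two hexadecimal values quoted for the special NOP can be reproduced from the instruction format. The Python sketch below assembles a cror instruction by hand, assuming the PowerPC XL-form layout (primary opcode 19, extended opcode 449, three 5-bit bit-number fields). The function name encode_cror is hypothetical and exists only for this illustration.

```python
def encode_cror(bt, ba, bb):
    """Return the 32-bit encoding of `cror BT,BA,BB`.

    Layout (most significant bits first): primary opcode 19 in 6 bits,
    then BT, BA, and BB in 5 bits each, then extended opcode 449 in
    10 bits, then one zero bit.
    """
    for bit in (bt, ba, bb):
        if not 0 <= bit <= 31:
            raise ValueError("CR bit numbers must be in 0..31")
    return (19 << 26) | (bt << 21) | (ba << 16) | (bb << 11) | (449 << 1)
```

Formatting encode_cror(31, 31, 31) and encode_cror(15, 15, 15) as hexadecimal gives 4ffffb82 and 4def7b82, matching the values in the text.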

The linkage editor will do one of two things when it sees the bl instruction (in the previous instruction sequence, on a call to the foo function):

The bl .glink_of_foo instruction sequence is changed to:

bl .foo         #Branch to foo
cror 31,31,31   #Special NOP instruction 0x4ffffb82
Note: For any export, the linkage editor inserts the procedure's descriptor into the module.

Called Routine's Responsibilities

Prologs and epilogs are used in the called routines. On entry to a routine, the following steps should be performed:

  1. Use some or all of the prolog actions described in the Prolog Actions table.
  2. Store the back chain and decrement the stack pointer (SP) by the size of the stack frame.
    Note: If a stack overflow occurs, it will be known immediately when the store of the back chain is completed.

On exit from a procedure, perform the following step:

Traceback Tags

Every assembly (compiled) program needs traceback information for the debugger to examine if the program traps or crashes during execution. This information is in a traceback table at the end of the last machine instruction in the program and before the program's constant data.

The traceback table starts with a full word of zeros, X'00000000', which is not a valid system instruction. The zeros are followed by 2 words (64 bits) of mandatory information and several words of optional information, as defined in the /usr/include/sys/debug.h file. Using this traceback information, the debugger can unwind the CALL chain and search forward from the point where the failure occurred until it reaches the end of the program (the word of zeros).
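The forward search for the word of zeros can be sketched as follows. This Python example only illustrates the scan itself and does not decode the fields defined in /usr/include/sys/debug.h. The function name find_traceback_table and the sample instruction words are assumptions made for the illustration, and the code image is treated as big-endian 4-byte words.

```python
import struct


def find_traceback_table(code, start):
    """Return the byte offset of the first all-zero word at or after
    `start`, or None if the image contains none.

    `code` is a bytes object holding big-endian 32-bit words; `start` is
    rounded down to a word boundary before scanning.
    """
    offset = start - (start % 4)
    while offset + 4 <= len(code):
        (word,) = struct.unpack_from(">I", code, offset)
        if word == 0:
            return offset
        offset += 4
    return None


# Hypothetical image: mflr r0, the special NOP, blr, then the zero word
# that begins the traceback table, followed by a stand-in table word.
SAMPLE = struct.pack(">5I", 0x7C0802A6, 0x4FFFFB82, 0x4E800020,
                     0x00000000, 0x000C0000)
```

Scanning SAMPLE from any offset inside the three instructions stops at byte offset 12, the start of the zero word.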

In general, the traceback information includes the name of the source language and information about registers used by the program, such as which general-purpose and floating-point registers were saved.

Example

The following is an example of assembler code called by a C routine:

#      Call this assembly routine from C routine:
#       callfile.c:
#       main()
#       {
#       examlinkage();
#       }
#       Compile as follows:
#       cc -o callfile callfile.c examlinkage.s
#
#################################################################
#       On entry to a procedure(callee), all or some of the
#       following steps should be done:
#       1.  Save the link register at offset 8 from the
#           stack pointer for non-leaf procedures.
#       2.  If any of the CR bits 8-19(CR2,CR3,CR4) is used
#           then save the CR at displacement 4 of the current
#            stack pointer.
#       3.  Save all non-volatile FPRs used by this routine.
#           If more than three non-volatile FPRs are saved,
#            a call to ._savefn can be used to
#           save them (n is the number of the first FPR to be
#            saved).
#       4.  Save all non-volatile GPRs used by this routine
#           in the caller's GPR SAVE area (negative displacement
#           from the current stack pointer r1).
#       5.  Store back chain and decrement stack pointer by the 
#           size of the stack frame.
#
#       On exit from a procedure (callee), all or some of the 
#       following steps should be done:
#       1.  Restore all GPRs saved.
#       2.  Restore stack pointer to value it had on entry.
#       3.  Restore Link Register if this is a non-leaf
#           procedure.
#       4.  Restore bits 8-19 of the CR if it was saved.
#       5.  Restore all FPRs saved.  If any FPRs were saved then
#           a call to ._restfn can be used to restore them 
#           (n is the first FPR to be restored).
#       6.  Return to caller.
#################################################################
#       The following routine calls printf() to print a string.
#       The routine performs entry steps 1-5 and exit steps 1-6.
#       The prolog/epilog code is for small stack frame size.
#       DSA + 8 < 32k
#################################################################
        .file   "examlinkage.s"
#Static data entry in T(able)O(f)C(ontents)
        .toc
T.examlinkage.c:        .tc     examlinkage.c[tc],examlinkage.c[rw]
        .globl  examlinkage[ds]
#examlinkage[ds] contains definitions needed for
#runtime linkage of function examlinkage
        .csect  examlinkage[ds]
        .long   .examlinkage[PR]     
        .long   TOC[tc0]
        .long   0
#Function entry in T(able)O(f)C(ontents)
        .toc
T.examlinkage:  .tc     .examlinkage[tc],examlinkage[ds]
#Main routine
        .globl  .examlinkage[PR]
        .csect  .examlinkage[PR]
#       Set current routine stack variables
#       These values are specific to the current routine and
#       can vary from routine to routine
        .set    argarea,     32
        .set    linkarea,    24
        .set    locstckarea,   0
        .set    nfprs,       18
        .set    ngprs,       19
        .set    szdsa,       8*nfprs+4*ngprs+linkarea+argarea+locstckarea
#PROLOG:  Called Routines Responsibilities
        #    Get link reg.
        mflr    0
        #    Get CR if current routine alters it.
        mfcr    12
        #    Save FPRs 14-31.
        bl      ._savef14
        cror 31, 31, 31
        #    Save GPRs 13-31.
        stm     13, -8*nfprs-4*ngprs(1)
        #    Save LR if non-leaf routine.
        st      0, 8(1)
        #    Save CR if current routine alters it.
        st      12, 4(1)
        #    Decrement stack ptr and save back chain.
        stu     1, -szdsa(1)
################################
#load static data address
#################################
        l       14,T.examlinkage.c(2)
        #  Load string address which is an argument to printf.
        cal 3, printing(14)
        #  Call to printf routine
        bl      .printf[PR]
        cror 31, 31, 31
#EPILOG:  Return Sequence
        #    Restore stack ptr
        ai      1, 1, szdsa
        #    Restore GPRs 13-31.
        lm      13, -8*nfprs-4*ngprs(1)
        #    Restore FPRs 14-31.
        bl      ._restf14
        cror 31, 31, 31
        #    Get saved LR.
        l       0, 8(1)
        #    Get saved CR if this routine saved it.
        l       12, 4(1)
        #    Move return address to link register.
        mtlr    0
        #    Restore CR2, CR3, & CR4 of the CR.
        mtcrf   0x38,12
        #    Return to address held in Link Register.
        br
        .tbtag 0x0,0xc,0x0,0x0,0x0,0x0,0x0,0x0
#       External variables
        .extern ._savef14
        .extern ._restf14
        .extern .printf[PR]
#################################
#       Data 
#################################
        .csect  examlinkage.c[rw]
        .align  2
printing:       .byte  'E,'x,'a,'m,'p,'l,'e,' ,'f,'o,'r,' 
                .byte   'P,'R,'I,'N,'T,'I,'N,'G
        .byte   0xa,0x0
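
The szdsa value set in the listing can be checked by hand. The following Python sketch (not AIX code; the helper names frame_size and gpr_save_offset are hypothetical) repeats the arithmetic of the .set expression with the values the example uses, assuming 8 bytes per saved FPR and 4 bytes per saved GPR as that expression does.

```python
# Sketch only: mirrors the .set symbols of the listing above.
# frame_size() and gpr_save_offset() are illustrative names, not AIX APIs.

def frame_size(nfprs, ngprs, linkarea=24, argarea=32, locstckarea=0):
    """Return szdsa, the size in bytes of the routine's stack frame."""
    fpr_save = 8 * nfprs    # each FPR is 64 bits wide
    gpr_save = 4 * ngprs    # each GPR is 32 bits wide in 32-bit mode
    return fpr_save + gpr_save + linkarea + argarea + locstckarea

def gpr_save_offset(nfprs, ngprs):
    """Displacement from the caller's stack pointer used by stm and lm."""
    return -8 * nfprs - 4 * ngprs

szdsa = frame_size(nfprs=18, ngprs=19)

print(szdsa)                     # 276
print(gpr_save_offset(18, 19))   # -220
print(szdsa + 8 < 32 * 1024)     # True: the small-frame condition holds
```

With 18 FPRs and 19 GPRs saved, the frame is 276 bytes, so the condition in the listing's header comment (DSA + 8 < 32k) is satisfied and the single stu instruction with a 16-bit displacement is sufficient.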

Using Milicode Routines

All of the fixed-point divide instructions, and some of the multiply instructions, are different for POWER and PowerPC. To allow programs to run on systems based on either architecture, a set of special routines is provided by the operating system. These are called milicode routines and contain machine-dependent and performance-critical functions. Milicode routines are located at fixed addresses in the kernel segment. These routines can be reached by a bla instruction. All milicode routines use the link register.

Notes:
  1. No unnecessary registers are destroyed. Refer to the definition of each milicode routine for register usage information.
  2. Milicode routines do not alter any floating-point register, count register, or general-purpose registers (GPRs) 10-12. The link register can be saved in a GPR (for example, GPR 10) if the call appears in a leaf procedure that does not use nonvolatile GPRs.
  3. Milicode routines do not make use of a TOC.

The following milicode routines are available:

__mulh
    Calculates the high-order 32 bits of the integer product arg1 * arg2.
    Input                   R3 = arg1 (signed integer)
                            R4 = arg2 (signed integer)
    Output                  R3 = high-order 32 bits of arg1*arg2
    POWER Register Usage    GPR3, GPR4, MQ
    PowerPC Register Usage  GPR3, GPR4

__mull
    Calculates the 64-bit integer product arg1 * arg2, returned in two 32-bit registers.
    Input                   R3 = arg1 (signed integer)
                            R4 = arg2 (signed integer)
    Output                  R3 = high-order 32 bits of arg1*arg2
                            R4 = low-order 32 bits of arg1*arg2
    POWER Register Usage    GPR3, GPR4, MQ
    PowerPC Register Usage  GPR0, GPR3, GPR4

__divss
    Calculates the 32-bit quotient and 32-bit remainder of signed integers arg1/arg2. For division by zero and overflow, the quotient and remainder are undefined and may vary by implementation.
    Input                   R3 = arg1 (dividend) (signed integer)
                            R4 = arg2 (divisor) (signed integer)
    Output                  R3 = quotient of arg1/arg2 (signed integer)
                            R4 = remainder of arg1/arg2 (signed integer)
    POWER Register Usage    GPR3, GPR4, MQ
    PowerPC Register Usage  GPR0, GPR3, GPR4

__divus
    Calculates the 32-bit quotient and 32-bit remainder of unsigned integers arg1/arg2. For division by zero and overflow, the quotient and remainder are undefined and may vary by implementation.
    Input                   R3 = arg1 (dividend) (unsigned integer)
                            R4 = arg2 (divisor) (unsigned integer)
    Output                  R3 = quotient of arg1/arg2 (unsigned integer)
                            R4 = remainder of arg1/arg2 (unsigned integer)
    POWER Register Usage    GPR0, GPR3, GPR4, MQ, CR0 and CR1 of CR
    PowerPC Register Usage  GPR0, GPR3, GPR4

__quoss
    Calculates the 32-bit quotient of signed integers arg1/arg2. For division by zero and overflow, the quotient is undefined and may vary by implementation.
    Input                   R3 = arg1 (dividend) (signed integer)
                            R4 = arg2 (divisor) (signed integer)
    Output                  R3 = quotient of arg1/arg2 (signed integer)
    POWER Register Usage    GPR3, GPR4, MQ
    PowerPC Register Usage  GPR3, GPR4

__quous
    Calculates the 32-bit quotient of unsigned integers arg1/arg2. For division by zero and overflow, the quotient is undefined and may vary by implementation.
    Input                   R3 = arg1 (dividend) (unsigned integer)
                            R4 = arg2 (divisor) (unsigned integer)
    Output                  R3 = quotient of arg1/arg2 (unsigned integer)
    POWER Register Usage    GPR0, GPR3, GPR4, MQ, CR0 and CR1 of CR
    PowerPC Register Usage  GPR3, GPR4
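
The results these routines return can be modeled outside the assembler. The following Python sketch is a reference model only, not the milicode itself: the function names are hypothetical, and the signed divisions are assumed to truncate toward zero, as the PowerPC divw instruction does. Division by zero and overflow, which the routines leave undefined, are not modeled.

```python
# Reference model (an assumption-laden sketch) of the milicode results.
# Each function returns what the routine leaves in R3, or in (R3, R4).

MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret the low 32 bits of x as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def trunc_div(a, b):
    """Signed quotient truncated toward zero (assumed rounding rule)."""
    q = abs(a) // abs(b)
    return -q if (a < 0) != (b < 0) else q

def mulh(arg1, arg2):
    """Model of __mulh: high-order 32 bits of the signed product."""
    return to_signed32((arg1 * arg2) >> 32)

def mull(arg1, arg2):
    """Model of __mull: (high-order word, low-order word bit pattern)."""
    product = arg1 * arg2
    return to_signed32(product >> 32), product & MASK32

def divss(arg1, arg2):
    """Model of __divss: (signed quotient, signed remainder)."""
    q = trunc_div(arg1, arg2)
    return q, arg1 - q * arg2

def divus(arg1, arg2):
    """Model of __divus: (unsigned quotient, unsigned remainder)."""
    return arg1 // arg2, arg1 % arg2

def quoss(arg1, arg2):
    """Model of __quoss: signed quotient only."""
    return trunc_div(arg1, arg2)

def quous(arg1, arg2):
    """Model of __quous: unsigned quotient only."""
    return arg1 // arg2

print(divss(-7, 2))            # (-3, -1)
print(divus(0xFFFFFFFF, 2))    # (2147483647, 1)
print(mulh(1 << 30, 4))        # 1
```

The divss and divus calls show why the signed and unsigned routines are separate: the bit pattern 0xFFFFFFFF is -1 to __divss but 4294967295 to __divus, and the two interpretations give different quotients.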

The following example uses the __mulh milicode routine in an assembler program:

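li R3, -900
li R4, 0             # 50000 does not fit the signed 16-bit immediate of li,
ori R4, R4, 50000    # so it is loaded with ori (unsigned 16-bit immediate)
bla .__mulh          # on return, R3 = high-order 32 bits of arg1*arg2
...
.extern .__mulh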
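
For the operands in this example, the value expected in R3 can be worked out from the 64-bit product. The Python sketch below is a worked calculation only, not AIX code: it assumes the routine behaves as described above, returning the high-order 32 bits of the signed product.

```python
# Worked calculation for arg1 = -900, arg2 = 50000 (illustrative only).

MASK64 = 0xFFFFFFFFFFFFFFFF

arg1, arg2 = -900, 50000
product = arg1 * arg2            # -45000000
bits64 = product & MASK64        # 64-bit two's-complement bit pattern
r3 = bits64 >> 32                # high-order 32 bits
low_word = bits64 & 0xFFFFFFFF   # low-order 32 bits (what __mull would put in R4)

print(f"{bits64:016X}")     # FFFFFFFFFD515AC0
print(f"{r3:08X}")          # FFFFFFFF
print(f"{low_word:08X}")    # FD515AC0
```

The product fits in 32 bits, so the high-order word holds only sign bits: R3 is 0xFFFFFFFF, which is -1 as a signed integer.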

Related Information

Assembling and Linking a Program.

Understanding Assembler Passes.

Interpreting an Assembler Listing.

Interpreting a Symbol Cross-Reference.

Understanding and Programming the TOC.

Running a Program.

The b (Branch) instruction, cror (Condition Register OR) instruction.

