IBM Books

Diagnosis Guide


Trace information

NIM debug SPOT

A NIM debug SPOT (Shared Product Object Tree) contains trace output from NIM invocation on the node. To diagnose a hang during installation (LED 611), it may be necessary to create and use a NIM debug SPOT. This section describes how to create and use one.

Note:
To run diagnostics on a node, the node supervisor card must be at microcode version 1294 or later. To determine the microcode level of the card, issue this command on the control workstation, replacing each # with the frame number and the node number, respectively:
/usr/lpp/ssp/bin/spmon -G -q -l frame#/node#/codeVersion/value
If the card is not at microcode version 1294 or later, the debug installation may loop, issuing this message: 032-001 You entered a command command_name that is not valid.
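
The version check in the note can be scripted. The following is a minimal sketch, assuming the spmon query prints the microcode version as a plain decimal number. The function name microcode_ok is hypothetical and is not part of PSSP.

```shell
#!/bin/sh
# Hypothetical helper (not part of PSSP): succeed if the given supervisor
# microcode version is at least the minimum level named in the note (1294).
# Assumption: the version is a plain decimal integer.
MIN_MICROCODE=1294

microcode_ok() {
    version=$1
    # Reject empty or non-numeric input rather than guessing (status 2).
    case $version in
        ''|*[!0-9]*) return 2 ;;
    esac
    [ "$version" -ge "$MIN_MICROCODE" ]
}

# Example use on the control workstation for frame 1, node 15.
# Left as comments because spmon exists only on an SP system:
#   level=$(/usr/lpp/ssp/bin/spmon -G -q -l frame1/node15/codeVersion/value)
#   microcode_ok "$level" || echo "Supervisor microcode is older than $MIN_MICROCODE"
```

If the real spmon output carries extra text around the number, the value would need to be trimmed before it is passed to the function.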
  1. Obtain the lppsource name and boot/install server for the failing node by issuing the command:
    splstdata -b -n node_number
    
  2. On the control workstation, issue the spbootins command to set the boot response to disk. For example, for frame 1 node 15, issue the command:
    spbootins -r disk 1 15 1
    

    This will issue the necessary NIM commands to prepare for reallocation of the debug SPOT for frame 1 node 15.

  3. From the boot/install server issue:
    nim -Fo check -a debug=yes spot_lppsource_name
    
  4. When the previous command completes, issue the command:
    lsnim -l spot_lppsource_name
    
  5. Look for lines that start with: enter_dbg =.

    Choose a line as follows:

    The line chosen contains an address, such as 0x0013afa0. Omit the 0x part and record the remainder of the address.

  6. Issue the spbootins command to set the node's boot response to install.

    For example, for frame 1 node 15 issue the command:

    spbootins -r install 1 15 1
    
  7. Condition the node. For example, for frame 1 node 15 issue the command:
    nodecond 1 15 &
    
  8. Open a read-only tty. For frame 1 node 15, issue the command:
    s1term 1 15
    

    This may take a few minutes to complete. Do not enter anything until it finishes with:

    Trap instruction interrupt 
    

    and a 0> prompt is displayed. Type <Ctrl-c> to stop your s1term.

  9. Now start a console log to capture debug output:
    script filename
    
    where filename is a name you choose for the file.
  10. Open a write console to the node. For frame 1 node 15, issue the command:
    s1term -w 1 15
    
  11. You should see the 0> prompt again. Issue:
    st hex_number 2
    
    (where hex_number is the address recorded in Step 5, omitting the 0x).
  12. You should see the 0> prompt again. Issue:
    g
    

    The netboot output is now displayed live. See "Network installation progress" and "SP-specific LED/LCD values" for the meanings of the LED/LCD codes; they help determine approximately where in the boot process the node is.

    As the node boots, it may hang with LED/LCD c46. This does not indicate a problem, but the debug netboot needs to be restarted by issuing <Ctrl-q>. If there is a hang at any other LED/LCD value, stop logging by going to Step 13.

  13. When the node hangs, exit the tty by typing <Ctrl-x>. Then stop the logging by sending a kill signal to the script process started in Step 9, by issuing:
    kill pid 
    
    To get the pid, issue:
    ps -ef | grep script 
    
    If there are two scripts, killing the child process will stop both of them, or the kill command may be used on both processes.

    Now view the log file to determine what went wrong with the installation. If you contact the IBM Support Center, make sure that you have the log file available.

  14. Finally, you will need to re-create a regular version of the SPOT. From the control workstation, issue this command:
    nim -Fo check spot_lppsource_name
    
    where spot_lppsource_name is the same name used in Step 3.
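
The address handling in Steps 5 and 11 is easy to get wrong by hand. The following is a minimal sketch of a helper that removes the 0x prefix and prints the debugger command to type at the 0> prompt. The function names are hypothetical and not part of PSSP or NIM, and the sample line is illustrative: only the enter_dbg = prefix and the address 0x0013afa0 come from the steps above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of PSSP or NIM) for Steps 5 and 11.

# Print the 0x-prefixed hexadecimal token found on a line (the last one,
# if there are several), with the 0x prefix removed.
# Assumption: the address appears on the enter_dbg line as 0x followed by
# hex digits, as in the Step 5 example.
debug_address() {
    printf '%s\n' "$1" |
        sed -n 's/.*0[xX]\([0-9a-fA-F][0-9a-fA-F]*\).*/\1/p'
}

# Print the command for Step 11: st hex_number 2
# Returns 1 if the line contains no 0x-prefixed address.
debugger_store_command() {
    addr=$(debug_address "$1")
    [ -n "$addr" ] || return 1
    printf 'st %s 2\n' "$addr"
}

# Example with the address shown in Step 5.
debugger_store_command 'enter_dbg = 0x0013afa0'
# prints: st 0013afa0 2
```

The printed command still has to be typed at the 0> prompt in the write console; the sketch only prepares the text.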

NIM SPOT logs

NIM SPOT logs contain trace output from SPOT creation or update. The trace is activated automatically whenever a SPOT is created or updated. The logs are located on the boot/install server in /tmp/spot.out.pid and /tmp/spot.updated.out.pid, where pid is a process ID. Error messages give the exact file name.

NIM commands and their responses are recorded. Look for errors associated with command invocation, errors associated with file set installation, and errors associated with other NIM-related activities.
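
A first pass over these logs can be automated. The following is a minimal sketch, assuming the file naming given above and assuming that failures contain the word "error" in some letter case; real logs may word failures differently, so a clean result does not prove the SPOT operation succeeded. The function name scan_spot_logs is hypothetical and is not part of NIM.

```shell
#!/bin/sh
# Hypothetical helper (not part of NIM): search the SPOT trace logs in a
# directory (default /tmp) and print every line that mentions an error,
# prefixed with the file name and line number.
scan_spot_logs() {
    dir=${1:-/tmp}
    for log in "$dir"/spot.out.* "$dir"/spot.updated.out.*; do
        # Skip patterns that matched no file.
        [ -f "$log" ] || continue
        # The extra /dev/null argument makes grep print the file name
        # even though only one real file is searched at a time.
        grep -in 'error' "$log" /dev/null
    done
    return 0
}

# Example use on the boot/install server:
#   scan_spot_logs
#   scan_spot_logs /tmp | more
```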

