Use the following table to diagnose problems with the node installation component of PSSP. Locate the symptom and perform the action described for it.
If you have a symptom that is not in the table, or the recovery action does
not correct the problem, see Information to collect before contacting the IBM Support Center and contact the IBM Support Center.
Table 10. Node Installation Symptoms
This is a problem configuring the NIM environment. Refer to PSSP: Messages Reference for the specific errors that are issued by the setup_server command. Follow the repair action described.
Once the repair is complete, run setup_server and the command should now complete.
This is a node bootp failure. Perform the following steps:
splstdata -b -l node_number
If the node's bootp response is not set properly, reset it using the spbootins command and restart the installation.
Typical problems that may be encountered include:
The bootp daemon reports that it is receiving a request from a hardware address that it does not recognize. This indicates a mismatch between the hardware address in the SDR and the adapter on the node.
Delete the NIM client using the delnimclient command, reacquire the hardware address of the node using the sphrdwrad command, run setup_server on the boot/install server node, and attempt the installation again.
If the console output from the s1term command indicates that the node is sending bootp packets, but there is no output from the bootp daemon indicating that it is receiving them, you may have a network problem or an adapter failure. Use hardware diagnostics to isolate and repair the problem and perform the installation again.
Once the problem is corrected, reissue the nodecond command and the install should proceed normally.
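When checking for a hardware address mismatch, it can help to compare the address recorded in the SDR with the one the bootp daemon reports, ignoring differences in separators and letter case. The following is a minimal sketch of such a comparison. The function names are hypothetical, and the two address formats shown (bare hex digits and colon-separated pairs) are assumptions about what you might be comparing, not a statement of what either tool prints.

```shell
#!/bin/sh
# Normalize a hardware address: strip ':', '.', and '-' separators and
# upper-case the hex digits, so that differently formatted copies of the
# same address compare equal.
normalize_hwaddr() {
    printf '%s\n' "$1" | tr -d ':.-' | tr 'abcdef' 'ABCDEF'
}

# hwaddr_matches ADDR1 ADDR2
# Succeeds (exit status 0) when the two addresses are equal after
# normalization.
hwaddr_matches() {
    [ "$(normalize_hwaddr "$1")" = "$(normalize_hwaddr "$2")" ]
}

# Hypothetical usage, with both values copied by hand from your own output:
#   hwaddr_matches 02608C2D5E7F 02:60:8c:2d:5e:7f && echo "addresses match"
```

If the two values do not match, the recovery steps above (delnimclient, sphrdwrad, setup_server) still apply; the sketch only helps confirm the mismatch.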
The problem is that a tftp of the boot image to the node failed. Perform these steps:
allow:/tftpboot
Once the problem is corrected, reissue the nodecond command. The install should proceed normally.
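Several entries in this table depend on an allow: line such as allow:/tftpboot being present in the tftp access control file. The following is a minimal sketch of a check for such a line. The function name is hypothetical, and the file name /etc/tftpaccess.ctl in the usage comment is an assumption, since this table does not name the file; substitute the file used on your boot/install server.

```shell
#!/bin/sh
# tftp_allows FILE DIR
# Succeeds (exit status 0) when FILE contains a line that is exactly
# "allow:DIR". -F treats the pattern as a fixed string, -x requires a
# whole-line match, and -q suppresses output.
tftp_allows() {
    grep -Fqx "allow:$2" "$1"
}

# Hypothetical usage (the file name is an assumption):
#   tftp_allows /etc/tftpaccess.ctl /tftpboot || echo "allow:/tftpboot is missing"
```

The same function can be used for the other allow: entries in this table by passing the path shown in the relevant entry.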
The problem is that the node failed to tftp the install_info file. Perform these steps:
If it is not present, verify that the node's bootp response is set to install, and run setup_server.
If there are errors, refer to the entries for the message numbers given in PSSP: Messages Reference and follow the instructions there. If the file still does not exist, see Information to collect before contacting the IBM Support Center and contact the IBM Support Center.
allow:/tftpboot
Once the problem is corrected, the node should move past the failing LED/LCD and continue the installation.
The problem is that the node failed to tftp the config_info file. Perform these steps:
If it is not present, verify that the node's bootp response is set to install, and run setup_server.
If there are errors, refer to the entries for the message numbers given in PSSP: Messages Reference and follow the instructions there. If the file still does not exist, see Information to collect before contacting the IBM Support Center and contact the IBM Support Center.
allow:/tftpboot
Once the problem is corrected, the node should move past the failing LED/LCD and continue the installation.
The problem is that the node failed to tftp the script.cust file. Perform these steps:
If it is not present, verify that the node's bootp response is set to install, and run setup_server.
If there are errors, refer to the entries for the message numbers given in PSSP: Messages Reference and follow the instructions there. If the file still does not exist, see Information to collect before contacting the IBM Support Center and contact the IBM Support Center.
allow:/tftpboot
Once the problem is corrected, the node should move past the failing LED/LCD and continue the installation.
The problem is that the node failed to tftp the tuning.cust file. Perform these steps:
If it is not present, verify that the node's bootp response is set to install, and run setup_server.
If there are errors, refer to the entries for the message numbers given in PSSP: Messages Reference and follow the instructions there. If the file still does not exist, see Information to collect before contacting the IBM Support Center and contact the IBM Support Center.
allow:/tftpboot
Once the problem is corrected, the node should move past the failing LED/LCD and continue the installation.
The problem is that the node failed to tftp the spfbcheck file. Perform these steps:
If it is not present, verify that the node's bootp response is set to install, and run setup_server.
If there are errors, refer to the entries for the message numbers given in PSSP: Messages Reference and follow the instructions there. If the file still does not exist, see Information to collect before contacting the IBM Support Center and contact the IBM Support Center.
allow:/tftpboot
Once the problem is corrected, the node should move past the failing LED/LCD and continue the installation.
The problem is that the node failed to tftp the psspfb_script file. Perform these steps:
If it is not present, verify that the node's bootp response is set to install, and run setup_server.
If there are errors, refer to the entries for the message numbers given in PSSP: Messages Reference and follow the instructions there. If the file still does not exist, see Information to collect before contacting the IBM Support Center and contact the IBM Support Center.
allow:/tftpboot
Once the problem is corrected, the node should move past the failing LED/LCD and continue the installation.
The problem is that the node failed to tftp the psspfb_script file. Perform these steps:
If it is not present, verify that the node's bootp response is set to install, and run setup_server.
If there are errors, refer to the entries for the message numbers given in PSSP: Messages Reference and follow the instructions there. If the file still does not exist, see Information to collect before contacting the IBM Support Center and contact the IBM Support Center.
allow:/tftpboot
Once the problem is corrected, the node should move past the failing LED/LCD and continue the installation.
The problem is that the node failed to tftp the spsec_overrides file. Perform these steps:
If it is not present, verify that the node's bootp response is set to install, and run setup_server.
If there are errors, refer to the entries for the message numbers given in PSSP: Messages Reference and follow the instructions there. If the file still does not exist, see Information to collect before contacting the IBM Support Center and contact the IBM Support Center.
allow:/spdata/sys1/spsec/
Once the problem is corrected, the node should move past the failing LED/LCD and continue the installation.
The problem is that the node failed to tftp the krb.conf file. Perform these steps:
If it is not present, verify that the node's bootp response is set to install, and run setup_server.
If there are errors, refer to the entries for the message numbers given in PSSP: Messages Reference and follow the instructions there. If the file still does not exist, see Information to collect before contacting the IBM Support Center and contact the IBM Support Center.
allow:/etc/krb.conf
Once the problem is corrected, the node should move past the failing LED/LCD and continue the installation.
The problem is that the node failed to tftp the krb.realms file. Perform these steps:
If it is not present, verify that the node's bootp response is set to install, and run setup_server.
If there are errors, refer to the entries for the message numbers given in PSSP: Messages Reference and follow the instructions there. If the file still does not exist, see Information to collect before contacting the IBM Support Center and contact the IBM Support Center.
allow:/etc/krb.realms
Once the problem is corrected, the node should move past the failing LED/LCD and continue the installation.
The problem is that the node failed to copy the srvtab file. Perform these steps:
If it is not present, verify that the node's bootp response is set to install, and run setup_server.
If there are errors, refer to the entries for the message numbers given in PSSP: Messages Reference and follow the instructions there. If the file still does not exist, see Information to collect before contacting the IBM Support Center and contact the IBM Support Center.
Once the problem is corrected, the node should move past the failing LED/LCD and continue the installation.
The problem is that running script.cust is causing a hang condition. Investigate /tftpboot/script.cust. This is a user-supplied script. Look for any problems that might cause the node to hang.
Once the problem is corrected, the node can be reinstalled, and the installation should succeed.
The problem is that a PSSP directory failed to mount. Perform these steps to verify that the directory is exported on the node's boot/install server:
splstdata -b -l node_number
If it is not present or is empty, create the directory and place the appropriate PSSP install images in it. Then run setup_server and reinstall the node.
If the line is not present, run setup_server on the node's boot/install server.
If there are errors, refer to the entries for the message numbers given in PSSP: Messages Reference and follow the instructions there. If the export still does not succeed, see Information to collect before contacting the IBM Support Center and contact the IBM Support Center.
Once the problem is corrected, reissue the nodecond command, and the install should proceed normally.
The problem is that nodecond was unable to complete the network boot. Consult the log file listed in the error message to determine the exact cause of the error. Typical errors include:
Close the write mode s1term and reissue the nodecond command. If nodecond is unable to obtain a serial port, and no other write mode s1term is open, refer to Diagnosing System Monitor problems.
After correcting the problem, reissue the nodecond command. The install should proceed normally.
This is a noprompt installation failure. There is a mismatch in the data provided in the noprompt bosinst_data file being used by the node and the actual configuration of the node. The most common mismatch is the specification of a physical disk that does not exist on the node. To determine the nature of the failure, perform the following steps:
Once the problem is corrected, the node should proceed with the install.
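Because the most common mismatch is a physical disk that is named in the bosinst_data file but does not exist on the node, one way to look for it is to list the disk names in the file and compare them with the disks the node actually has. The following is a rough sketch under stated assumptions: it assumes the file names its target disks on lines of the form HDISKNAME = hdiskN, the function names are hypothetical, and the list of disks present on the node is supplied by you as arguments.

```shell
#!/bin/sh
# list_target_disks FILE
# Print every non-empty HDISKNAME value found in FILE, one per line.
list_target_disks() {
    sed -n 's/^[[:space:]]*HDISKNAME[[:space:]]*=[[:space:]]*\([^[:space:]]\{1,\}\).*$/\1/p' "$1"
}

# missing_disks FILE DISK...
# Print each target disk named in FILE that is not among the DISK
# arguments (the disks known to exist on the node).
missing_disks() {
    file=$1
    shift
    for want in $(list_target_disks "$file"); do
        found=no
        for have in "$@"; do
            [ "$want" = "$have" ] && found=yes
        done
        [ "$found" = yes ] || printf '%s\n' "$want"
    done
}

# Hypothetical usage; the file path is a placeholder:
#   missing_disks /path/to/bosinst_data hdisk0 hdisk1
```

Any name printed by missing_disks is a candidate for the mismatch described above.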
This generally indicates a network problem. It may be caused by a bad Ethernet card or by an Ethernet adapter being set to an incorrect duplex setting. You can verify the duplex setting on your nodes using the lsattr command. For example:
busio                        Bus I/O address              False
busintr                      Bus interrupt level          False
intr_priority 3              Interrupt priority           False
tx_que_size   64             TRANSMIT queue size          True
rx_que_size   32             RECEIVE queue size           True
full_duplex   no             Full duplex                  True
use_alt_addr  no             Enable ALTERNATE ETHERNET address True
alt_addr      0x000000000000 ALTERNATE ETHERNET address   True
Verify that the full_duplex setting is correct on all nodes for your particular network environment.
If all the adapters are correctly set, perform node diagnostics to determine if there is a bad adapter card in your system. If so, contact IBM Hardware Support to have it replaced. If none of these measures resolve the problem, record all relevant information, see Information to collect before contacting the IBM Support Center, and contact the IBM Support Center for further assistance.
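When the duplex setting has to be checked on many adapters, it can be convenient to pull just the full_duplex value out of the lsattr output. The following is a minimal sketch that reads attribute listings in the layout shown above from standard input. The function name is hypothetical, and the adapter name ent0 in the usage comment is an assumption; use the adapter name that applies on your node.

```shell
#!/bin/sh
# duplex_setting
# Read lsattr-style attribute lines on standard input and print the value
# column (second field) of the full_duplex attribute.
duplex_setting() {
    awk '$1 == "full_duplex" { print $2 }'
}

# Hypothetical usage (ent0 is an assumed adapter name):
#   lsattr -E -l ent0 | duplex_setting
```

The printed value (for example, no) can then be compared with the setting your network environment requires.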
This may be caused when a node is replaced with a different type of node without following the procedures documented in PSSP: Installation and Migration Guide. This causes an incorrect setting for the platform type in the NIM client object. If this is the case, follow the procedure for replacing a node with a different type of node in "Reconfiguring the RS/6000 SP System" of PSSP: Installation and Migration Guide. This will cause the NIM client object to be recreated properly.
PSSP 3.4 provides the ability to replace rsh and rcp calls in the PSSP code with secure remote commands and secure remote copy calls.
The root user must be able to run secure remote commands from the control workstation to the nodes without password or passphrase prompts. This normally means that a root public key generated at the control workstation must have been installed on the nodes and the control workstation. In addition, if the boot/install server node for the node is not the control workstation, the boot/install server node's public key must have also been installed on the node.
If secrshell is enabled and the dsh command fails during installation, check that root can issue the secure remote command and secure copy command from the control workstation to the nodes without being prompted for a password or passphrase.
Check that the rcmd_pgm, dsh_remote_cmd, and remote_copy_cmd attributes of the SDR SP_Restricted class are correct and consistent. The command splstdata -e displays the current values of these attributes.
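One way to test for password or passphrase prompts without hanging at a prompt is to run a trivial remote command in a non-interactive mode. The following is a sketch that assumes the secure remote command is an OpenSSH-compatible client, where the BatchMode=yes option makes the client fail instead of prompting. This document does not say which secure remote command product is installed, so treat that option, the function name, and the host name in the usage comment as assumptions.

```shell
#!/bin/sh
# can_run_noprompt HOST [REMOTE_CMD]
# Succeeds (exit status 0) when REMOTE_CMD (default: ssh) can run "true"
# on HOST without asking for a password or passphrase.
# Assumption: REMOTE_CMD accepts the OpenSSH option "-o BatchMode=yes".
can_run_noprompt() {
    host=$1
    remote_cmd=${2:-ssh}
    "$remote_cmd" -o BatchMode=yes "$host" true </dev/null >/dev/null 2>&1
}

# Hypothetical usage, run as root on the control workstation:
#   can_run_noprompt node01 || echo "node01: prompt or failure"
```

A failure here does not by itself say why the command failed; it only shows that root could not complete the remote command without interaction. The secure copy command needs a separate check.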
If the system administrator is setting the $RCMD_PGM, $DSH_REMOTE_CMD and $REMOTE_COPY_CMD environment variables to override the setting in the SDR, check these variables to make sure that they are consistent. Use these commands:
echo $RCMD_PGM
echo $DSH_REMOTE_CMD
echo $REMOTE_COPY_CMD
to check that the remote shell command choice is accurate and consistent with the executable defined by the $RCMD_PGM, $DSH_REMOTE_CMD and $REMOTE_COPY_CMD environment variables.
If the dsh_remote_cmd or remote_copy_cmd attributes of the SDR SP_Restricted class are null, the remote command and remote copy methods used must be in or linked to the bin directory. For example, if rcmd_pgm=rsh and dsh_remote_cmd and remote_copy_cmd are null, then executables /bin/rsh and /bin/rcp must exist.
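The rsh example above can be turned into a quick check. The following is a minimal sketch that covers only the case stated here (rcmd_pgm=rsh with the other two attributes null, which requires rsh and rcp in the bin directory). The function name is hypothetical, and the sketch deliberately reports other rcmd_pgm values as unknown rather than guessing which executables they need.

```shell
#!/bin/sh
# default_cmds_present RCMD_PGM [BIN_DIR]
# For rcmd_pgm=rsh, succeed only when BIN_DIR (default: /bin) contains
# executable rsh and rcp. Returns 1 when an executable is missing and
# 2 when RCMD_PGM is a value this sketch does not cover.
default_cmds_present() {
    bin_dir=${2:-/bin}
    case $1 in
        rsh) set -- rsh rcp ;;
        *)   echo "no default pair known for rcmd_pgm=$1" >&2
             return 2 ;;
    esac
    status=0
    for exe in "$@"; do
        if [ ! -x "$bin_dir/$exe" ]; then
            echo "missing or not executable: $bin_dir/$exe" >&2
            status=1
        fi
    done
    return $status
}

# Usage for the case described above:
#   default_cmds_present rsh || echo "check /bin/rsh and /bin/rcp"
```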
If root cannot issue a secure command to the node without being prompted for a password or passphrase, this will cause a secure remote command install of the nodes to fail. Check the following: