SP Switch2 operation and recovery are initialized from the switch primary node. When the primary node fails, the primary backup node takes over and becomes the new switch primary node. For SP Switch2 related problems, the error messages that have been logged are found on the current primary node.
SP Switch2 Time Of Day (TOD) is maintained by the Master Switch Sequencer (MSS) node. The MSS node is selected and monitored from the control workstation by the emasterd daemon. For TOD related problems, the logged error messages are found on the control workstation.
If the Switch Admin daemon (cssadm2) is running on the control workstation, the logged error messages are found on the control workstation.
All the SP Switch2 log and temporary files are organized in a directory hierarchy. Next to each directory the specified file level is given.
Directory | File level |
---|---|
/var/adm/SPlogs/css | node |
/var/adm/SPlogs/css0 | adapter |
/var/adm/SPlogs/css1 | adapter |
/var/adm/SPlogs/css0/p0 | port |
/var/adm/SPlogs/css1/p0 | port |
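The hierarchy above maps each file level to one directory. The following is a minimal sketch of that mapping, not part of PSSP: the function name css_log_dir and its arguments are illustrative, and it assumes the single port directory p0 shown in the hierarchy.

```python
from typing import Optional

BASE = "/var/adm/SPlogs"
LEVELS = ("node", "adapter", "port")


def css_log_dir(level: str, adapter: Optional[int] = None) -> str:
    """Return the log directory for a file level: 'node', 'adapter', or 'port'."""
    if level not in LEVELS:
        raise ValueError(f"unknown file level: {level!r}")
    if level == "node":
        # Node-level files are not tied to a particular adapter.
        return f"{BASE}/css"
    if adapter not in (0, 1):
        raise ValueError("adapter must be 0 (css0) or 1 (css1)")
    if level == "adapter":
        return f"{BASE}/css{adapter}"
    # Port level: the hierarchy lists one port directory, p0, per adapter.
    return f"{BASE}/css{adapter}/p0"
```

For example, css_log_dir("port", 1) yields the directory that holds the port-level files for adapter css1.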
Relevant files are found in the directories:
In this chapter, whenever a temporary file is mentioned, the file level is given using this terminology:
This file has a full path name of /etc/plane.info. It is optional, and is created by a user who wishes to override the switch-to-plane number calculations made by the SDR_config command. The file consists of one line for each switch, in the following format:
Frame#:Slot# Plane# Sequence#
The Sequence# is the switch number within the plane. For example, the first switch in plane 1 is sequence number 1, the second switch in plane 1 is sequence number 2, and the first switch in plane 2 is sequence number 1.
A sample file for a two-plane SP Switch2 system would be:
1:17 0 1
2:17 1 1
3:17 0 2
4:17 1 2
If something in this file is incorrect, the switch_plane and switch_plane_seq numbers in the Switch class of the SDR will reflect the errors. If one of your switches goes down or you have to disconnect it, the SDR_config command may try to renumber your switches because it does not see that switch. In this case, you can reserve the spot for that switch by creating an /etc/plane.info file that describes how your system should look when the broken switch is up and running again.
The /etc/plane.info file should be deleted when no longer needed, or the SDR_config command will always use it to override its own calculations.
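Because errors in /etc/plane.info are carried into the SDR, it can help to check the file before running SDR_config. The following is a minimal sketch of such a check, not an IBM-supplied tool. The names PlaneEntry, parse_plane_info, and planes_with_bad_sequences are illustrative. The check assumes that sequence numbers within a plane run from 1 to n without gaps, which is inferred from the description of Sequence# above rather than stated as a rule.

```python
from typing import Dict, List, NamedTuple


class PlaneEntry(NamedTuple):
    frame: int
    slot: int
    plane: int
    sequence: int


def parse_plane_info(text: str) -> List[PlaneEntry]:
    """Parse lines of the form 'Frame#:Slot# Plane# Sequence#'."""
    entries = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # Assumption: blank lines are ignored.
        fields = line.split()
        if len(fields) != 3 or ":" not in fields[0]:
            raise ValueError(f"line {lineno}: expected 'Frame#:Slot# Plane# Sequence#'")
        frame, slot = fields[0].split(":", 1)
        entries.append(PlaneEntry(int(frame), int(slot), int(fields[1]), int(fields[2])))
    return entries


def planes_with_bad_sequences(entries: List[PlaneEntry]) -> List[int]:
    """Return planes whose sequence numbers are not exactly 1..n (assumed rule)."""
    by_plane: Dict[int, List[int]] = {}
    for entry in entries:
        by_plane.setdefault(entry.plane, []).append(entry.sequence)
    return sorted(
        plane
        for plane, seqs in by_plane.items()
        if sorted(seqs) != list(range(1, len(seqs) + 1))
    )
```

Applied to the two-plane sample above, the parser returns four entries and the sequence check reports no planes.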
In order to isolate an adapter or SP Switch2 error, first view the AIX error log.
The second and third strings enable error tracking. See First Failure Data Capture Programming Guide and Reference. The fourth and fifth strings give the level of this error (adapter or port), which can be used to point to more detailed log files.
errpt | more
Output is similar to the following:
ERROR_ID TIMESTAMP T CL Res Name ERROR_Description
34FFBE83 0604140393 T H Worm Switch Fault-detected by switch chip
C3189234 0604135793 T H Worm Switch Fault-not isolated
The Resource Name (Res Name) in the error log gives you an indication of what resource detected the failure.
Table 31. Resource Name failure indications - SP Switch2
For a more detailed description, issue the AIX command:
errpt -a [-N resource_name] | more
where the optional resource_name is one of the entries in Table 31.
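When the error log is long, the summary output can be filtered by resource name after it has been captured to a file. The following is a minimal sketch, not an IBM-supplied tool. It assumes the six-column summary layout shown in the sample output, a resource name that is a single token, and the AIX timestamp convention MMDDhhmmYY. The names ErrptEntry, parse_errpt_summary, and by_resource are illustrative.

```python
import re
from datetime import datetime
from typing import Iterator, List, NamedTuple


class ErrptEntry(NamedTuple):
    error_id: str
    timestamp: datetime
    err_type: str
    err_class: str
    resource: str
    description: str


_ERROR_ID = re.compile(r"[0-9A-Fa-f]{8}")


def parse_errpt_summary(text: str) -> Iterator[ErrptEntry]:
    """Yield one entry per summary line; skip the header and unrelated lines."""
    for line in text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6 or not _ERROR_ID.fullmatch(fields[0]):
            continue
        error_id, stamp, err_type, err_class, resource, description = fields
        yield ErrptEntry(
            error_id.upper(),
            datetime.strptime(stamp, "%m%d%H%M%y"),  # MMDDhhmmYY
            err_type,
            err_class,
            resource,
            description.strip(),
        )


def by_resource(entries: List[ErrptEntry], name: str) -> List[ErrptEntry]:
    """Keep only the entries logged against the given resource name."""
    return [entry for entry in entries if entry.resource == name]
```

The resource names found this way are the same values that can be passed to errpt -a -N for the detailed report.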
There are several subcomponents that write entries in the AIX error log:
Table 32. Possible causes of adapter failures - SP Switch2
Table 33. Possible causes of fault service daemon failures - SP Switch2
Label and Error ID | Error description and analysis |
---|---|
CS_SW_ADPT_TYPE_ER
CC1DCEED | Explanation: The connected adapter type is not supported
on SP Switch2.
Cause: The user plugged an unsupported adapter or node into the SP Switch2 port. Action: Call the IBM Support Center. |
CS_SW_SEND_HANG_RE
69AB5AEC | Explanation: A Sender Hang was detected.
Cause: SP Switch2 chip failure. Action: If the problem persists, call IBM Hardware Service. |
CS_SW_TKNCOUNTER_RE
3EF9DDC7 | Explanation: A Token Counter Error occurred.
Cause: SP Switch2 chip failure. Action: If the problem persists, call IBM Hardware Service. |
CS_SW_INIT_STATE_RE
EB8CFA87 | Explanation: Initialization State Machine error
occurred.
Cause: SP Switch2 chip failure. Action: If the problem persists, call IBM Hardware Service. |
CS_SW_TOD_ECC_RE
35D40633 | Explanation: Receiver TOD ECC Error occurred.
Cause: SP Switch2 chip failure. Action: If the problem persists, call IBM Hardware Service. |
CS_SW_CQ_PE_NCL_RE
066BD301 | Explanation: Parity Error on Next Chunk Linked List
occurred.
Cause: SP Switch2 chip saw an error on a received packet. Action: If the problem persists, call IBM Hardware Service. |
CS_SW_CQ_PE_FSL_RE
E273ABC6 | Explanation: Parity Error on Free Space Linked List
occurred.
Cause: SP Switch2 chip failure. Action: If the problem persists, call IBM Hardware Service. |
CS_SW_CQ_SRM_EC_RE
5A19E266 | Explanation: Source Routed Multicast ECC Error
occurred.
Cause: SP Switch2 chip failure. Action: If the problem persists, call IBM Hardware Service. |
CS_SW_CQ_MCSRDT_RE
89316FCB | Explanation: Multicast Source Routed Decode Table Parity
error occurred.
Cause: A switch chip saw an error on a received packet. Action: If the problem persists, call IBM Hardware Service. |
CS_SW_CQ_MCLRTD_RE
A974CB87 | Explanation: Multicast Lookup Table Route Decoder Parity
Error occurred.
Cause: SP Switch2 chip failure. Action: If the problem persists, call IBM Hardware Service. |
CS_SW_CQ_RCA_PE_RE
27305E8F | Explanation: Repeat Count Array Parity Error
occurred.
Cause: SP Switch2 chip failure. Action: If the problem persists, call IBM Hardware Service. |
CS_SW_MULTICASTR_RE
FADF4398 | Explanation: Multicast Route Error occurred.
Cause: SP Switch2 chip failure. Action: If the problem persists, call IBM Hardware Service. |
CS_SW_CHIP_ID_ER_RE
43D748CF | Explanation: Chip ID Error occurred.
Cause: A switch chip configuration error. The hardware monitor daemon or the control workstation is down. Action: Check to see if the hardware monitor daemon is up. Cause: SP Switch2 chip failure. Action: If the problem persists, call IBM Hardware Service. |
CS_SW_SVC_ARROVR_RE
D32FD026 | Explanation: Service Array Overflow Latch
occurred.
Cause: SP Switch2 chip failure. Action: If the problem persists, call IBM Hardware Service. |
CS_SW_PE_SVCARRI_RE
BAEF6722 | Explanation: Parity Error on input to Service Array
occurred.
Cause: SP Switch2 chip saw an error on a received packet. Action: If the problem persists, call IBM Hardware Service. |
CS_SW_PE_SVCARRO_RE
59D3D44A | Explanation: Parity Error on output to Service Array
occurred.
Cause: SP Switch2 chip failure. Action: If the problem persists, call IBM Hardware Service. |
CS_INV_SVCCMD_RE
D2833C50 | Explanation: Invalid Service Command error
occurred.
Cause: SP Switch2 chip saw an error on a received packet. Action: If the problem persists, call IBM Hardware Service. |
CS_TOD_ERROR_RE
A11A52E1 | Explanation: Error occurred in TOD logic.
Cause: SP Switch2 chip failure. Action: If the problem persists, call IBM Hardware Service. |
CS_CSS_IF_FAIL_ER
B454D630 | Explanation: SP Switch2 adapter service interface system
call failed.
Cause: Unable to communicate with the SP Switch2 adapter. Action:
|
CS_SW_TKN_CNT_O_RE
E5895205 | Explanation: A Sender Token Count Overflow
occurred.
Cause: SP Switch2 chip failure. Action: If the problem persists, call IBM Hardware Service. |
CS_SW_ACK_FAILED_RE
D4E9B237 | Explanation: SP Switch2 daemon failed to acknowledge a
service command.
Cause: SP Switch2 communication failure. Cause: A traffic backlog on SP Switch2 adapter. Action: If the problem persists, call the IBM Support Center. |
CS_SW_EDC_ERROR_RE
E6E27F0C | Explanation: An EDC-class error was detected.
Cause: A transient error in data occurred during transmission over switch links. The EDC error may be one of the following:
Cause: A loose, disconnected, or faulty cable. Action:
Cause: A node was shut down, reset, powered off, or disconnected. Action: See Verify SP Switch2 node operation. Cause: SP Switch2 adapter hardware failure. Action: For more information, see SP Switch2 device and link error information. |
CS_SW_RCVLNKSYNC_RE
37D7841C | Explanation: A Receiver Port Link Synch Failure
occurred.
Cause: A loose, disconnected, or faulty cable. Action:
Cause: A node was shut down, reset, powered off, or disconnected. Action:
Cause: SP Switch2 adapter hardware failure. Action:
Cause: Remote SP Switch2 adapter hardware failure. Action:
|
CS_SW_FIFOOVRFLW_RE
821465C1 | Explanation: A Receiver FIFO Overflow error was
detected.
Cause: A loose, disconnected, or faulty cable. Action:
Cause: A node was shut down, reset, powered off, or disconnected. Action:
Cause: SP Switch2 adapter hardware failure. Action:
|
CS_SW_EDCTHRSHLD_RE
39FCD5B9 | Explanation: EDC Error Threshold condition
occurred.
Cause: A loose, disconnected, or faulty cable. Action:
|
CS_SW_RECV_STATE_RE
255F1AA2 | Explanation: SP Switch2 receiver state machine
error.
Cause: SP Switch2 adapter or switch failure. Action: If the problem persists, call IBM Hardware Service. |
CS_SW_PE_ON_DATA_RE
1F59782A | Explanation: SP Switch2 sender parity error on data was
detected.
Cause: SP Switch2 board failure. Action: Call IBM Hardware Service. |
CS_SW_INVALD_RTE_RE
02A63E85 | Explanation: SP Switch2 sender invalid route error
occurred.
Cause: SP Switch2 adapter microcode or a switch daemon software error. Action: Call the IBM Support Center. |
CS_SW_SNDLOSTEOP_RE
F835CDED | Explanation: Sender Lost EOP (end-of-packet) condition
occurred.
Cause: A loose, disconnected, or faulty cable. Action:
Cause: A node was shut down, reset, powered off, or disconnected. Action: See Verify SP Switch2 node operation. Cause: SP Switch2 adapter hardware failure. Action:
|
CS_SW_SNDTKNTHRS_RE
80CF3B5A | Explanation: A Token Error Threshold error
occurred.
Cause: A loose, disconnected, or faulty cable. Action:
|
CS_SW_SND_STATE_RE
74CEAB0F | Explanation: A Sender State Machine Error
occurred.
Cause: SP Switch2 adapter or SP Switch2 failure. Action: If the problem persists, call IBM Hardware Service. |
CS_SW_PE_ON_NMLL_RE
7F704673 | Explanation: A Parity Error on the NMLL was
detected.
Cause: SP Switch2 board failure. Action: Call IBM Hardware Service. |
CS_SW_CRC_SVCPKT_RE
8B091668 | Explanation: SP Switch2 service logic detected an
incorrect CRC on a Service Packet.
Cause: A transient error in data occurred during transmission over SP Switch2 links. Action: See /var/adm/SPlogs/css[0 | 1]/p0/flt for more information. Cause: A loose, disconnected, or faulty cable. Action:
Cause: A node was shut down, reset, powered off, or disconnected. Action:
Cause: SP Switch2 adapter hardware failure. Action:
|
CS_SW_PE_RTE_TBL_RE
E8F741CD | Explanation: SP Switch2 service logic detected an
incorrect Parity Error in the Route Table.
Cause: SP Switch2 board failure. Action: Call IBM Hardware Service. |
CS_SW_SVC_STATE_RE
CF66D3CC | Explanation: SP Switch2 service logic state machine
error.
Cause: SP Switch2 board failure. Action: If the problem persists, call IBM Hardware Service. |
CS_SW_OFFLINE_RE
57959ED9 | Explanation: Node received a fence (Offline)
request.
Cause: The operator ran the Efence command. Action: Run the Eunfence command to bring the node onto the SP Switch2. |
CS_SW_PRI_TAKOVR_RE
A8978621 | Explanation: SP Switch2 primary node takeover.
Cause: The SP Switch2 primary node became inaccessible. Action: See the AIX error log on the previous SP Switch2 primary node. |
CS_SW_BCKUP_TOVR_RE
FD2D84AD | Explanation: SP Switch2 primary-backup node
takeover.
Cause: The SP Switch2 primary-backup node became inaccessible. Action: See the AIX error log on the previous switch primary-backup node. |
CS_SW_LST_BUP_CT_RE
2196D5B4 | Explanation: SP Switch2 primary-backup node not
responding.
Cause: The SP Switch2 primary-backup node became inaccessible. Action: See the AIX error log on the current SP Switch2 primary-backup node. |
CS_SW_UNINI_NODE_RE
96DD24B7 | Explanation: SP Switch2 nodes not initialized during
Estart command processing.
Cause: The listed nodes were shut down, reset, powered off, or disconnected. Action:
Cause: SP Switch2 adapter problem. Action: Run adapter diagnostics on listed nodes. |
CS_SW_UNINI_LINK_RE
362E5B7 | Explanation: SP Switch2 links were not initialized during
Estart command processing.
Cause: The node was fenced. Action: Run the Eunfence command to unfence the node. Cause: The switch cable is not wired correctly. Action: See /var/adm/SPlogs/css[0 | 1]/p0/cable_miswire to determine if cables were not wired correctly. Cause: A loose, disconnected, or faulty cable. Action: See Cable diagnostics. |
CS_PROCESS_KILLD_RE
D250F9DB | Explanation: User Process was killed due to link
outage.
Cause: SP Switch2 adapter failure or SP Switch2 failure. Cause: The operator fenced this node. Action: See neighboring error log entries to determine the cause of the outage. |
CS_SW_MISWIRE_ER
933B622E | Explanation: SP Switch2 cable miswired (not connected to
the correct switch jack).
Cause: The SP Switch2 cable was not wired correctly. Action: See /var/adm/SPlogs/css[0 | 1]/p0/cable_miswire to determine if cables were not wired correctly. |
CS_SW_HARDWARE_ER
F96576C4 | Explanation: Defective SP Switch2 board.
Cause: SP Switch2 board configuration problem. Action:
Cause: Faulty SP Switch2 board. Action: If the problem persists, call IBM Hardware Service. |
CS_SW_LOGFAILURE_RE
5ABE7E20 | Explanation: Error writing SP Switch2 log files.
Cause: The /var file system is full. Action: Obtain free space in the file system or expand the file system. Cause: There are too many files open in the system. Action: Reduce the number of open files in the system. |
CS_SW_INIT_FAIL_ER
957E82AA | Explanation: Switch fault-service daemon initialization
failed.
Cause: The operating environment could not be established. Action:
|
CS_SW_SIGTERM_ER
A98EF5D8 | Explanation: SP Switch2 fault service daemon received
SIGTERM.
Cause: Another process sent a SIGTERM. Action: Run the rc.switch command to restart the daemon. |
CS_SW_SVC_Q_FULL_RE
172826EF | Explanation: SP Switch2 service send queue is
full.
Cause: There is a traffic backlog on the SP Switch2 adapter. Action: If the problem persists, call the IBM Support Center. |
CS_SW_GET_SVCREQ_ER
4DFEC48 | Explanation: SP Switch2 daemon could not get a service
request.
Cause: SP Switch2 device driver failure. Action: Call the IBM Support Center. |
CS_SW_RSGN_PRIM_RE
585D90B2 | Explanation: The SP Switch2 Primary node resigned from
the job as primary node.
Cause: Could not communicate over the SP Switch2. Action: See neighboring AIX error log entries to determine the cause of the outage. Cause: Another node was selected as the primary node. Action: None. |
CS_SW_RSGN_BKUP_RE
C32FD9D3 | Explanation: Resigning as the SP Switch2 primary-backup
node.
Cause: Could not communicate over the SP Switch2. Action: See neighboring error log entries to determine the cause of the outage. Cause: Another node was selected as the primary-backup node. Action: None. |
CS_SW_SDR_FAIL_RE
3E6F3E2E | Explanation: Switch fault service daemon failed to
communicate with SDR.
Cause: An Ethernet overload. Action: If the problem persists, call the IBM Support Center. Cause: Excessive traffic to the SDR. Action: If the problem persists, call the IBM Support Center. Cause: The SDR daemon on the control workstation is down. Action: Check to see if the SDR daemon is up. Cause: A software error. Action: If the problem persists, call the IBM Support Center. |
CS_SW_SCAN_FAIL_ER
63589548 | Explanation: SP Switch2 scan failed.
Cause: Could not communicate over the SP Switch2. Cause: SP Switch2 adapter failure or an SP Switch2 failure. Action: Issue the Estart command if primary takeover does not occur. |
CS_SW_PLANEMISW_ER
94E99A66 | Explanation: SP Switch2 plane miswire.
Cause: SP Switch2 cable is connected on one side to a switch-port or node-port belonging to a different SP Switch2 plane than the one that the other side of the cable is connected to. Action: See /var/adm/SPlogs/css[0 | 1]/p0/cable_miswire to determine which cables were not wired correctly. |
CS_SW_NODEMISW_RE
A19DCA76 | Explanation: SP Switch2 node miswired.
Cause: SP Switch2 cable was not plugged into the correct node. Action: See /var/adm/SPlogs/css[0 | 1]/p0/cable_miswire to determine if cables were not wired correctly. |
CS_SW_NODECONF_RE
CEB4B5AF | Explanation: SP Switch2 node configuration error.
Cause: A switch node was not configured properly to the system. Cause: An unknown node was plugged into the system - probable miswire. Action:
|
CS_SW_RTE_GEN_RE
44D2A1B5 | Explanation: SP Switch2 daemon failed to generate
routes.
Cause: A software error. Action: Call the IBM Support Center. |
CS_SW_FENCE_FAIL_RE
A6E635F9 | Explanation: Fence of node off SP Switch2 failed.
Cause: Could not communicate over the SP Switch2. Action:
|
CS_SW_REOP_WIN_ER
0C17D5C7 | Explanation: Switch fault service daemon reopen adapter
windows failed.
Cause: SP Switch2 adapter or daemon recovered. Action: If the problem persists, call the IBM Support Center. |
CS_SW_ESTRT_FAIL_RE
4EE9669F | Explanation: Estart command failed - switch
network could not be initialized.
Cause: Could not initialize SP Switch2 chips or nodes. Action:
|
CS_SW_IP_RESET_ER
A6BCABA3 | Explanation: Switch fault service daemon could not reset
IP.
Cause: SP Switch2 device driver error. Action: If the problem persists, call the IBM Support Center. |
CS_SW_CBCST_FAIL_RE
31C01480 | Explanation: Switch fault service daemon command
broadcast failed.
Cause: Could not communicate over the SP Switch2. Cause: A traffic backlog on the SP Switch2 adapter. Action: If the problem persists, call the IBM Support Center. |
CS_SW_UBCST_FAIL_RE
F7704403 | Explanation: Switch fault service daemon database updates
broadcast failed.
Cause: SP Switch2 communication failure. Cause: A traffic backlog on the SP Switch2 adapter. Action: If the problem persists, call the IBM Support Center. |
CS_SW_DNODE_FAIL_RE
19337D09 | Explanation: Switch daemon failed to communicate with
dependent nodes.
Cause: Failed to communicate over the SP Switch2. Cause: A traffic backlog on the SP Switch2 adapter. Action: If the problem persists, call the IBM Support Center. |
CS_SW_PORT_STUCK_RE
889BE7C3 | Explanation: SP Switch2 port cannot be disabled.
Eunfence command failed.
Cause:
Action:
|
CS_SW_FSD_TERM_ER
1C27CFCD | Explanation: Switch fault service daemon process was
terminated.
Action: See preceding error log entries to determine the cause of the failure. Cause: Faulty system planar. Action: Run complete diagnostics on the node. Cause: Not enough free space left in the node's /var/adm/SPlogs file system. Action: Obtain more space. |
Table 34. Possible causes of adapter diagnostic failures - SP Switch2
Label and Error ID | Error description and analysis |
---|---|
SWT_DIAG_ERROR1_ER
8998B96D | Explanation: SP Switch2 adapter failed post-diagnostics.
See the man page for the diag command.
Cause: Faulty switch adapter. Action: See SP Switch2 adapter diagnostics. |
SWT_DIAG_ERROR2_ER
2FFF253A | Explanation: SP Switch2 adapter failed
diagnostics.
Cause: Faulty switch adapter. Action: See SP Switch2 adapter diagnostics. |
Table 35. Possible causes of SP Switch2 TOD management (emasterd) failures
Table 36. Possible causes of SP Switch2 PCI Adapter failures
The following table is based on the possible values of the adapter_config_status attribute of the Adapter object of the SDR. Use the following command to determine its value:
SDRGetObjects Adapter adapter_type==[css0 | css1] node_number adapter_config_status
Use the value of the adapter_config_status attribute for the node in question to look up the explanation and recovery action in Table 37. The value for a correctly configured CSS adapter is css_ready.
/usr/lpp/ssp/css/cfgcol -v -l [css0 | css1] > output_file_name
Table 37. adapter_config_status values - SP Switch2
adapter_config_status | Explanation and recovery |
---|---|
css_ready | Correctly configured CSS adapter. |
odm_fail
genmajor_fail genminor_fail getslot_fail build_dds_fail | Explanation: An ODM failure has occurred while
configuring the CSS adapter.
Action: Rerun the adapter configuration command. If the problem persists, contact the IBM Support Center and supply the command output. |
getslot_fail | Verify that the CSS adapter is properly seated, then rerun the adapter configuration command. If the problem persists, contact the IBM Support Center and supply them with the command output. |
busresolve_fail | Explanation: There are insufficient bus resources to
configure the CSS adapter.
Action: Contact the IBM Support Center. |
dd_load_fail | See Verify software installation. If software installation verification is successful and the problem persists, contact the IBM Support Center. |
make_special_fail | Explanation: The CSS device special file could not be
created during adapter configuration.
Action: Rerun the adapter configuration command. If the problem persists, contact the IBM Support Center and supply them with the command output. |
dd_config_fail | Explanation: An internal device driver error occurred
during CSS adapter configuration.
Action: See Information to collect before contacting the IBM Support Center. |
diag_fail | Explanation: SP Switch2 diagnostics failed.
Action: See SP Switch2 adapter diagnostics. |
not_configured | Explanation: The CSS adapter is missing or not configured. |
pdd_init_fail
load_khal_fail | See Verify software installation. If software installation verification is successful and the problem persists, contact the IBM Support Center. |
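The lookup in Table 37 can be expressed as a small mapping. The following is a minimal sketch that condenses the table rows; the names RECOVERY and recovery_for are illustrative and not part of PSSP. Table 37 lists getslot_fail twice, and this sketch uses the more specific row for it, which is an editorial choice.

```python
_RERUN = (
    "Rerun the adapter configuration command. If the problem persists, "
    "contact the IBM Support Center and supply the command output."
)
_VERIFY_INSTALL = (
    "See Verify software installation. If verification is successful and "
    "the problem persists, contact the IBM Support Center."
)

# Condensed from Table 37; keys are adapter_config_status values.
RECOVERY = {
    "css_ready": "Correctly configured CSS adapter. No action is needed.",
    "odm_fail": "ODM failure. " + _RERUN,
    "genmajor_fail": "ODM failure. " + _RERUN,
    "genminor_fail": "ODM failure. " + _RERUN,
    "build_dds_fail": "ODM failure. " + _RERUN,
    # Listed twice in Table 37; the more specific row is used here.
    "getslot_fail": "Verify that the CSS adapter is properly seated. " + _RERUN,
    "busresolve_fail": "Insufficient bus resources. Contact the IBM Support Center.",
    "dd_load_fail": _VERIFY_INSTALL,
    "make_special_fail": "Device special file could not be created. " + _RERUN,
    "dd_config_fail": (
        "Internal device driver error. See Information to collect before "
        "contacting the IBM Support Center."
    ),
    "diag_fail": "Diagnostics failed. See SP Switch2 adapter diagnostics.",
    "not_configured": "The CSS adapter is missing or not configured.",
    "pdd_init_fail": _VERIFY_INSTALL,
    "load_khal_fail": _VERIFY_INSTALL,
}


def recovery_for(status: str) -> str:
    """Return the condensed Table 37 guidance for an adapter_config_status value."""
    return RECOVERY.get(status.strip(), "Status not listed in Table 37.")
```

The value passed to recovery_for would be the adapter_config_status string reported by SDRGetObjects for the node in question.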
The current status of each device and link is gathered in the annotated switch topology file, out.top, that is created on each plane of each node that has its corresponding switch_responds set to 1. For plane 0, switch_responds0 must be 1. For plane 1, switch_responds1 must be 1. The file looks like the switch topology file, except that an additional comment is made for each device or link whose status differs from the operational default. For the directory that contains the out.top file, see SP Switch2 log and temporary file hierarchy.
These additional comments are appended to the file by the fault service daemon and reflect the current connectivity status of the link or device. No comment on a link or device line means that the link or device exists and is operational. The comment format is:
ideal-topology-line device-status-no which-device:device-status-string (link-status-string)
where:
Not all the comments reflect an error. Some may be a result of the system configuration or current system administration.
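The comment format can be split into its fields with a regular expression. The following is a minimal sketch under stated assumptions, not the parser used by the fault service daemon. The names TopComment and parse_top_comment are illustrative, the sample line in the usage note is hypothetical, and the which-device token (shown as R) is an assumption, because the field descriptions are not reproduced here. Lines without a comment return None, which corresponds to a link or device that exists and is operational.

```python
import re
from typing import NamedTuple, Optional


class TopComment(NamedTuple):
    ideal_line: str
    device_status: int
    which_device: str
    device_text: str
    link_text: Optional[str]


# ideal-topology-line device-status-no which-device:device-status-string (link-status-string)
_COMMENT = re.compile(
    r"^(?P<ideal>.*?)\s+(?P<status>-?\d+)\s+(?P<which>\S+):\s*"
    r"(?P<device>[^()]*?)\s*(?:\((?P<link>.*)\))?\s*$"
)


def parse_top_comment(line: str) -> Optional[TopComment]:
    """Split an annotated out.top line; return None if it carries no comment."""
    match = _COMMENT.match(line.rstrip("\n"))
    if match is None:
        return None
    return TopComment(
        match.group("ideal"),
        int(match.group("status")),
        match.group("which"),
        match.group("device"),
        match.group("link"),  # None when no link status is given
    )
```

For a hypothetical line that ends in "-4 R: Device has been removed from network - faulty. (Link has been removed from network or miswired - faulty.)", the sketch reports device status -4 together with the device and link status strings, which can then be looked up in Table 38 and Table 39.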
An example of a failing entry and description is in out.top. If the listed recovery actions fail to resolve your problem, contact the IBM Support Center.
The possible device status values for SP Switch2 systems, with their recovery actions, are listed in Table 38. The possible link status values for SP Switch2 systems, with their recovery actions, are listed in Table 39. Additional miswire information can be found in cable_miswire.
Table 38. SP Switch2 device status and recovery actions
Device status number | Device status text | Explanation and recovery actions |
---|---|---|
2 | Initialized | Explanation: Both devices are initialized. The
port's link status is not operational.
Cause: The link is faulty. Action: See Table 39 for link status. |
0 | Uninitialized | Explanation: No device is connected to this port.
Cause: There is no cable connected to this port. Action: If this is intentional, no action is needed. If not, connect a cable to the port. |
-3 | The device has been removed from network because of a bad signature | Explanation: The device was removed from the switch
network - device configuration failure.
Cause: A fault on the device. Action: Contact IBM Hardware Service. |
-4 | Device has been removed from network - faulty. | Explanation: The device has been removed from the switch
network.
Cause: A fault on the device. Action: If the device in question is a node, see Verify SP Switch2 node operation. Otherwise, contact IBM Hardware Service. |
-5 | Device has been removed from the network by the system administrator. | Explanation: The device was placed offline by the systems
administrator (Efence).
Cause: The switch administrator ran the Efence command. Action: Eunfence the device. |
-6 | Device has been removed from network - no AUTOJOIN. | Explanation: The device was removed and isolated from the
switch network.
Cause: The node was fenced (Efence) without AUTOJOIN, the node was rebooted or powered off, or the node faulted. Action: First attempt to Eunfence the device. If the node fails to rejoin the switch network, see AIX Error Log information. If the problem persists, contact the IBM Support Center. |
-7 | Device has been removed from the network for not responding. | Explanation: The device was removed from the switch
network.
Cause: An attempt was made to contact the device, but the device did not respond. Action: If the device in question is a node, see Verify SP Switch2 node operation. Otherwise, contact the IBM Support Center. |
-8 | Device has been removed from the network because of a miswire. | Explanation: The device is not cabled properly.
Cause: Either the switch network is miswired, or the frame supervisor tty is not cabled properly. Action: First view the /var/adm/SPlogs/css[0 | 1]/p0/cable_miswire file. Verify and correct all links listed in the file. Then issue the Estart command. If the problem persists, contact IBM Hardware Service. |
-9 | Destination not reachable. | Explanation: The device was not reachable through the
switch network.
Cause: This is generally due to other errors in the switch network fabric. Action: Investigate and correct the other problems, then run the Estart command. |
Table 39. SP Switch2 link status and recovery actions
Link Status Number | Link Status Text | Explanation and Recovery Actions |
---|---|---|
0 | Uninitialized | Explanation: The link is uninitialized.
Cause: Switch Initialization was not complete. Action: Try to Estart the switch network again. If the problem persists, contact the IBM Support Center. |
-1 | The link is not operational - link re-timing | Explanation: The link is in the initialization
stage.
Cause: If the problem persists, the link may be faulty (faulty cable or interposer card). Action: First attempt to Estart the switch again. If the link does not come up, try switching the cable or connecting a wrap plug to test the interposer card. |
-2 | Wrap plug is installed. | Explanation: This link is connected to a wrap
plug.
Cause: The wrap plug is connected to the port in order to test the port. This is not normally a problem. Action: None. |
-3 | The link is not operational - link failed to time. | Explanation: The link failed to initialize.
Cause: If the problem persists, the link may be faulty (faulty cable or interposer card). Action: First attempt to Estart again. If the link does not come up, try switching the cable or connecting a wrap plug to test the interposer card. |
-4 | Link has been removed from network or miswired - faulty. | Explanation: The link is not operational and was removed
from the switch network.
Cause: Either the link is miswired or the link has failed. Action: First check the /var/adm/SPlogs/css[0 | 1]/p0 directory for the existence of a cable_miswire file. If the file exists, verify and correct all links listed in the file. Then issue the Estart command. If the cable_miswire file does not exist, examine the /var/adm/SPlogs/css[0 | 1]/p0/flt file for entries relating to this link. If entries are found, verify that the cable is seated at both ends, then run the Estart command. If the problem persists, contact the IBM Support Center. |
-5 | The link has been removed from network by the system administrator | Explanation: The link was removed (commented out) from the switch network by the switch administrator. This is not a problem. |
-6 | The link has been removed from network - no AUTOJOIN | Explanation: The device was removed and isolated from the
switch network.
Cause: The node was fenced (Efence) without AUTOJOIN, the node was rebooted or powered off, or the node faulted. Action:
|
-7 | Link has been removed from network - fenced. | Explanation: The device was placed offline by the Systems
Administrator (Efence).
Action: Eunfence the associated node. |
-8 | Link has been removed from network - probable miswire. | Explanation: The link is not cabled properly.
Action: View the /var/adm/SPlogs/css[0 | 1]/p0/cable_miswire file. Verify and correct all links listed in the file, then run the Estart command. |
-9 | Link has been removed from network - not connected. | Explanation: The link cannot be reached by the primary
node, therefore initialization of the link is not possible.
Cause: This is generally caused by other problems in the switch network, such as a switch chip being disabled. Action: Investigate and correct the underlying problem, then run the Estart command. |
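The status numbers in Table 38 and Table 39 can be kept in two mappings so that a number read from out.top is translated to its status text. The following is a minimal sketch; the names DEVICE_STATUS, LINK_STATUS, LINK_NO_ACTION, and describe are illustrative. The set of link values that need no action reflects the two rows of Table 39 that say so explicitly (wrap plug installed, and link removed by the administrator).

```python
# Status text copied from Table 38 (devices).
DEVICE_STATUS = {
    2: "Initialized",
    0: "Uninitialized",
    -3: "The device has been removed from network because of a bad signature",
    -4: "Device has been removed from network - faulty.",
    -5: "Device has been removed from the network by the system administrator.",
    -6: "Device has been removed from network - no AUTOJOIN.",
    -7: "Device has been removed from the network for not responding.",
    -8: "Device has been removed from the network because of a miswire.",
    -9: "Destination not reachable.",
}

# Status text copied from Table 39 (links).
LINK_STATUS = {
    0: "Uninitialized",
    -1: "The link is not operational - link re-timing",
    -2: "Wrap plug is installed.",
    -3: "The link is not operational - link failed to time.",
    -4: "Link has been removed from network or miswired - faulty.",
    -5: "The link has been removed from network by the system administrator",
    -6: "The link has been removed from network - no AUTOJOIN",
    -7: "Link has been removed from network - fenced.",
    -8: "Link has been removed from network - probable miswire.",
    -9: "Link has been removed from network - not connected.",
}

# Link values for which Table 39 states that no action is required.
LINK_NO_ACTION = {-2, -5}


def describe(kind: str, number: int) -> str:
    """Translate a status number; kind is 'device' (Table 38) or 'link' (Table 39)."""
    if kind not in ("device", "link"):
        raise ValueError("kind must be 'device' or 'link'")
    table = DEVICE_STATUS if kind == "device" else LINK_STATUS
    return table.get(number, f"Unknown {kind} status {number}")
```

Note that the same number can mean different things in the two tables: -5 and -7 both relate to fencing, but -7 is the fenced value for a link while -5 is the fenced value for a device.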