The following items are used to isolate problems in the SP Switch2 component of PSSP. More detailed information about each item appears in Error information. Before collecting any other information, check for the existence of the /etc/plane.info file, and make a copy of the file if it exists. See plane.info file.
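The check-and-copy step can be sketched as a small shell function. This is a minimal sketch only: the function name backup_plane_info, the /tmp default destination, and the timestamp suffix are illustrative assumptions, not PSSP conventions.

```shell
# Hypothetical helper: copy plane.info aside before gathering other data.
# The destination directory and the file suffix are assumptions.
backup_plane_info() {
    src=${1:-/etc/plane.info}
    dest_dir=${2:-/tmp}
    if [ -f "$src" ]; then
        dest="$dest_dir/plane.info.$(date +%y%m%d%H%M%S)"
        # -p preserves the modification time and permissions of the original
        cp -p "$src" "$dest" && echo "$dest"
    else
        echo "no plane.info file at $src" >&2
        return 1
    fi
}
```

Calling backup_plane_info with no arguments checks /etc/plane.info and prints the path of the copy, or returns a nonzero status if the file does not exist.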
The /usr/lpp/ssp/css/css.snap script collects log, trace, and dump information created by SP Switch2 support code (device driver, worm, fault-service daemon, diagnostics) into a single compressed package.
The complete package output file is written to the /var/adm/SPlogs/css directory. The file name varies according to the options:
hostname.yymmddhhmmss.adapter[0 | 1].css.snap.tar.Z
hostname.yymmddhhmmss.adapter[0 | 1].port0.css.snap.tar.Z
where hostname is the hostname of the node where the css.snap command was issued, and yymmddhhmmss is the date and time that the css.snap information was collected.
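When packages from several nodes are gathered in one place, the fields of the file name can be split apart again. The following sketch assumes the name follows the two patterns above literally (an adapter field, then an optional port0 field) and parses from the right so that a host name containing dots is kept whole. The function name parse_snap_name and the sample host name are hypothetical.

```shell
# Hypothetical helper: split a css.snap package name into its fields.
parse_snap_name() {
    name=$(basename "$1")
    case $name in
        *.css.snap.tar.Z) ;;
        *) echo "not a css.snap package: $name" >&2; return 1 ;;
    esac
    rest=${name%.css.snap.tar.Z}      # drop the fixed suffix
    port=none
    case $rest in
        *.port0) port=port0; rest=${rest%.port0} ;;
    esac
    adapter=${rest##*.}               # last remaining field
    rest=${rest%.*}
    stamp=${rest##*.}                 # yymmddhhmmss
    host=${rest%.*}                   # whatever is left is the host name
    echo "host=$host stamp=$stamp adapter=$adapter port=$port"
}
```

For example, parse_snap_name node05.020314153000.adapter0.css.snap.tar.Z prints host=node05 stamp=020314153000 adapter=adapter0 port=none.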
The css.snap script creates a log file, /var/adm/SPlogs/css/css.snap.log, which lists all the files gathered in the package.
This script is called whenever a serious error is detected by the switch support code. To directly cause the system to create such a snapshot, log in to the desired node and manually issue the command:
/usr/lpp/ssp/css/css.snap [-c | -n] [-s] -a [css0 | css1] [-p p0]
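A manual invocation can be wrapped so that only the adapter names shown in the syntax above are accepted. This is a sketch under stated assumptions: the function name run_css_snap and the CSS_SNAP override variable are hypothetical, and only the -a flag is passed because this section does not describe the other flags.

```shell
# Hypothetical wrapper around css.snap; validates the adapter name first.
# CSS_SNAP is an assumed override, useful for a dry run with a stub command.
run_css_snap() {
    adapter=$1
    case $adapter in
        css0|css1) ;;
        *) echo "usage: run_css_snap css0|css1" >&2; return 2 ;;
    esac
    "${CSS_SNAP:-/usr/lpp/ssp/css/css.snap}" -a "$adapter"
}
```

Running run_css_snap css0 on the node issues /usr/lpp/ssp/css/css.snap -a css0. The resulting package and css.snap.log are then expected under /var/adm/SPlogs/css.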
Table 40 shows the error log entries that automatically take a
snapshot, as well as the type of snap performed. The soft type
enables a continuation of work with the switch. The full snap
might corrupt the adapter, forcing an adapter reset and the node to
be fenced off of the switch.
Table 40. AIX Error Log entries that invoke css.snap - SP Switch2
|Error Log entry||Snap type (full/soft)|
Collect the css.snap information from both the primary node and all nodes that are experiencing SP Switch2 problems. Do not reboot the nodes before running css.snap, because rebooting causes the loss of valuable diagnostic information.
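Collecting from several nodes can be scripted as a loop. The sketch below assumes that rsh access to the nodes is already configured. The function name collect_snaps, the REMOTE_SH override variable, and the node names in the usage note are hypothetical.

```shell
# Hypothetical helper: run css.snap on each listed node in turn.
# REMOTE_SH is an assumed override; rsh is used when it is not set.
collect_snaps() {
    adapter=$1
    shift
    for node in "$@"; do
        echo "collecting css.snap on $node"
        ${REMOTE_SH:-rsh} "$node" /usr/lpp/ssp/css/css.snap -a "$adapter" ||
            echo "css.snap failed on $node" >&2
    done
}
```

For example, collect_snaps css0 primary01 node07 node12 runs the snapshot on the primary node and two affected nodes. Each package stays in /var/adm/SPlogs/css on the node where it was created, so it must still be copied off afterward.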
The css.snap script collects all the files which reside in the /var/adm/SPlogs/css, /var/adm/SPlogs/css0, /var/adm/SPlogs/css1, /var/adm/SPlogs/css1/p0, and /var/adm/SPlogs/css0/p0 directories, and some additional files from the /tmp directory. Some of the files reside on each node, while others reside only on the primary node or on the control workstation.
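Before taking a snapshot, it can be useful to see which of these log directories exist on the node and how much they hold. The following is a minimal sketch; the function name list_snap_dirs and the optional base directory argument are assumptions for illustration.

```shell
# Hypothetical helper: report the log directories that css.snap gathers from.
list_snap_dirs() {
    base=${1:-/var/adm/SPlogs}
    for d in css css0 css0/p0 css1 css1/p0; do
        if [ -d "$base/$d" ]; then
            # count directory entries, including hidden ones
            count=$(ls -A "$base/$d" | wc -l | tr -d ' ')
            echo "$base/$d: $count entries"
        else
            echo "$base/$d: missing"
        fi
    done
}
```

A directory reported as missing is not necessarily an error. For example, a node with a single adapter may have no css1 directory.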
Table 41 lists important files gathered by css.snap and their location at css.snap time. Some of the files are created by css.snap in order to gather concurrent information on the switch status. For the SP Switch2 PCI Adapter, additional files are collected by css.snap.corsair and they are noted after the table.
Table 41 contains the list of files that are collected by css.snap and their location in the log directory.
Table 41. SP Switch2 log files
|Number||Log File name||Hierarchy||Contents||Location|
|1||adapter.log||adapter||Fault service daemon adapter status information. For more information, see adapter.log.||nodes|
|2||cable_miswire||port||Node-to-switch or switch-to-switch plane miswired connection information. For more information, see cable_miswire.||primary node|
|3||cadd_dump.out||adapter||Most recent css.snap's cadd_dump command dump file. SP Switch2 adapter device driver trace buffer dump file. For more information, see cadd_dump.out.||nodes|
|4||chgcss.log||node||Log file of chgcss, which changes the adapter device driver's attributes. For more information, see chgcss.log.||nodes|
|5||col_dump.out||adapter||The most recent css.snap's col_dump command dump file. Microcode dump information. For more information, see col_dump.out.||nodes|
|6||colad.trace||adapter||SP Switch2 adapter diagnostics messages. For more information, see colad.trace.||nodes|
|7||core||node||Fault service daemon core dump file.||nodes|
|8||cssadm2.debug||node||Trace of cssadm2 daemon. For more information, see cssadm2.debug.||control workstation|
|9||cssadm2.stderr||node||Unexpected error messages received by the cssadm2 daemon. For more information, see cssadm2.stderr.||control workstation|
|10||cssadm2.stdout||node||Unexpected informational messages received by the cssadm2 daemon. For more information, see cssadm2.stdout.||control workstation|
|11||css.snap.log||node||css.snap snapshot command log information - list of all files gathered in the last snapshot. For more information, see css.snap.log.||nodes|
|12||CSS_test.log||node||Present if the CSS_test command was run on the node.||nodes|
|13||daemon.log||node||Fault service daemon output file. For more information, see daemon.log.||nodes|
|14||DeviceDB.dump||port||Latest dump of the device data base from the fault service daemon. See DeviceDB.dump.||nodes|
|15||Ecommands.log||node||Log entries of all Ecommands. For more information, see Ecommands.log.||control workstation|
|16||emasterd.log||node||TOD Management emasterd daemon - errors and notifications. For more information, see emasterd.log.||control workstation|
|17||emasterd.stdout||node||TOD Management emasterd daemon - more detailed trace file. For more information, see emasterd.stdout.||control workstation|
|18||errpt.out||node||Most recent errpt -a and errpt results. For more information, see errpt.out and the errpt command entry in AIX Command and Technical Reference.|| |
|19||flt||port||Hardware error conditions found on the SP Switch2, recovery action taken by the fault-service daemon, and general operations that alter the SP Switch2 configuration. For more information, see flt.||nodes|
|20||fs_daemon_print.file||port||Fault service daemon port status information. For more information, see fs_daemon_print.file.||nodes|
|21||ifcl_dump.out||adapter||Most recent css.snap's ifcl_dump command dump file. IP dump information. For more information, see ifcl_dump.out.||nodes|
|22||logevnt.out||node||Log error log events monitored by ha. For more information, see logevnt.out.||nodes|
|23||netstat.out||adapter||Most recent css.snap's netstat command dump file. Network status information. For more information, see netstat.out, and the entry for the netstat command in AIX Command and Technical Reference.||nodes|
|24||odm.out||adapter||The node's adapter_status configuration as saved in the ODM. For more information, see odm.out.||nodes|
|25||out.top||port||SP Switch2 plane link information. For more information, see out.top.||nodes|
|26||rc.switch.log||node||Fault service daemon initialization information. For more information, see rc.switch.log.||nodes|
|27||rc.switch.log.previous||node||Node's previous fault service daemon initialization information. For more information, see rc.switch.log.||nodes|
|28||regs.out||adapter||Most recent css.snap's read_regs command dump file. SP Switch2 adapter's registers dump file. For more information, see regs.out.||nodes|
|29||router.log||port||SP Switch2 routing information. For more information, see router.log.||nodes|
|30||scan_out.log||adapter||TBIC scan ring binary information. For more information, see scan_out.log and scan_save.log.||nodes|
|31||scan_save.log||adapter||Previous TBIC scan ring binary information. For more information, see scan_out.log and scan_save.log.||nodes|
|32||spd.trace||port||Tracing of advanced switch diagnostics. See spd.trace.||control workstation|
|33||spdata.out||port||Most recent css.snap's splstdata command dump file. SP Switch2 data requests. For more information, see spdata.out.||primary node|
|34||summlog.out||node||Error information from the css.summlog daemon. For more information, see summlog.out.||control workstation|
|35||topology.data||port||System error messages from the distribution of the topology file to the secondary nodes. For more information, see topology.data.||primary node|
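The Location column of Table 41 can be encoded as a lookup, which helps when deciding where to log in to retrieve a particular file. This is a sketch only: the function name log_location is hypothetical, the mapping is copied from the table rows, and any file whose location is not given in the table is reported as unknown.

```shell
# Hypothetical helper: where does a Table 41 log file reside?
log_location() {
    case $1 in
        cable_miswire|spdata.out|topology.data)
            echo "primary node" ;;
        cssadm2.debug|cssadm2.stderr|cssadm2.stdout|Ecommands.log|\
        emasterd.log|emasterd.stdout|spd.trace|summlog.out)
            echo "control workstation" ;;
        adapter.log|cadd_dump.out|chgcss.log|col_dump.out|colad.trace|core|\
        css.snap.log|CSS_test.log|daemon.log|DeviceDB.dump|flt|\
        fs_daemon_print.file|ifcl_dump.out|logevnt.out|netstat.out|odm.out|\
        out.top|rc.switch.log|rc.switch.log.previous|regs.out|router.log|\
        scan_out.log|scan_save.log)
            echo "nodes" ;;
        *)
            # not listed, or no location given in the table
            echo "unknown"
            return 1 ;;
    esac
}
```

For example, log_location out.top prints nodes, and log_location Ecommands.log prints control workstation.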
If css.snap is invoked for an SP Switch2 PCI Adapter error, the css.snap.corsair script is invoked. The following log files are also included:
The css.snap command avoids filling up the /var directory by following these rules:
The css.snap command is called automatically from the fault service daemon when certain serious errors are detected. The css.snap command can also be issued from the command line when a switch or adapter related problem is suspected. See css.snap package.