AIX Versions 3.2 and 4 Asynchronous Communications Guide

Problem Determination

This section helps users diagnose problems encountered while operating the Network Terminal Accelerator adapter and determine what action to take.

Firmware Self-Test

Whenever power is first applied to the Network Terminal Accelerator adapter, it initializes and tests its circuitry. The i80960CA processor executes an on-chip self-test, and then loads its initial memory image (IMI) from ROM. After successfully loading the IMI, the i80960CA begins executing a set of self-test routines from ROM-resident firmware. These tests verify that the adapter hardware components listed below are operating properly:

The Network Terminal Accelerator adapter diagnostics can be run from the System Management Interface Tool (SMIT) and from standalone initial program load (IPL) of the host. The diagnostics verify correct hardware operation of the adapter components and the Ethernet interface.

The following information describes the test units (TUs) used during the different diagnostic operations.

TU 10 (Internal POST) Invokes the adapter power on/reset self-test by issuing a reset command to the board. The self-test program tests most of the functional blocks on the adapter. The Ethernet transceiver and connector are not checked.
TU 20 (POS/VPD) Includes two parts: the POS register test and the VPD test. The POS register test reads the adapter POS registers and validates their contents. It exercises portions of the Micro Channel interface and the host's ability to access the board. The VPD test checks portions of the Micro Channel interface, microprocessor, and main memory.
TU 30 (DMA Loopback) Checks the data path between the Micro Channel and the RAM on the adapter. This test does not send data out the Ethernet port.
TU 50 (Adapter/Controller Wrap Test) Checks the entire data path of the adapter, including the Ethernet controller, transceiver, and connector. This test requires an Ethernet wrap plug (IBM PN 70F9625).
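
The coverage of the four test units can be summarized as a small lookup table. The following Python sketch is illustrative only: the TestUnit class, its field names, and the helper functions are hypothetical and are not part of the diagnostics. The False entry for TU 20 is an inference, because the description above does not mention the Ethernet connector for that test unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TestUnit:
    number: int
    name: str
    covers_ethernet_connector: bool  # hypothetical field name
    needs_wrap_plug: bool            # hypothetical field name


# Values taken from the TU descriptions above; TU 20 coverage is inferred.
TEST_UNITS = (
    TestUnit(10, "Internal POST", False, False),
    TestUnit(20, "POS/VPD", False, False),
    TestUnit(30, "DMA Loopback", False, False),
    TestUnit(50, "Adapter/Controller Wrap Test", True, True),
)


def units_without_wrap_plug():
    """Test units that can run with no wrap plug installed."""
    return [tu.number for tu in TEST_UNITS if not tu.needs_wrap_plug]


def units_covering_connector():
    """Test units whose data path reaches the Ethernet connector."""
    return [tu.number for tu in TEST_UNITS if tu.covers_ethernet_connector]
```

In this sketch, only TU 50 reaches the transceiver and connector, which is why it is the only test unit that needs the wrap plug.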

SMIT Options

The following list indicates which test units are used for specific problems:

Standalone Options

The following list indicates which test units should be used for specific problems on standalone machines:

Error Logging and Analysis

The Network Terminal Accelerator adapter presents error information to the host. During system startup, or under control of the adapter diagnostics, the host can monitor the progress and results of the adapter on-board self-test. The driver should log any self-test failure as fatal, together with the self-test checkpoint or error code reported by the adapter. The self-test error codes and checkpoints are listed in "Error Codes" and "Checkpoints", respectively.

During the host handshake portion of the self-test, the driver is responsible for detection of some errors, as well as for logging them. In these cases, the driver should log additional data related to the error (for example, expected versus read data values).
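
The logging rules above can be sketched as a small record type. This is a minimal illustration, not the driver's actual log format: the SelfTestLogEntry class, its fields, and the output text are all hypothetical. The sketch assumes that expected and read values are attached only when the driver itself detected the error.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SelfTestLogEntry:
    severity: str                   # self-test failures are logged as fatal
    code: int                       # checkpoint or error code from the adapter
    expected: Optional[int] = None  # extra data for driver-detected errors
    actual: Optional[int] = None


def make_self_test_entry(code, expected=None, actual=None):
    """Build a log entry for a self-test failure (always fatal)."""
    return SelfTestLogEntry("fatal", code, expected, actual)


def format_entry(entry):
    """Render an entry as text; the layout is an assumption of this sketch."""
    text = f"{entry.severity}: self-test code 0x{entry.code:02X}"
    if entry.expected is not None and entry.actual is not None:
        text += f" (expected 0x{entry.expected:02X}, read 0x{entry.actual:02X})"
    return text
```

For example, format_entry(make_self_test_entry(0x1F, 0x55, 0xAA)) yields a fatal entry that carries both the checkpoint and the mismatched data values.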

During normal operation, the adapter reports error conditions through the DPRAM interface along with normal control and status information. These error codes are detailed in "Error Codes". The driver should log any such errors as fatal.

The following hardware failure error messages are logged by the adapter driver modules upon detection of any of the errors described above. "Error Codes" describes in detail the various status and error codes logged with each error. In all cases, replacement of the failing adapter is the recommended corrective action.

rhpdd Error Messages

The following list details error messages from the rhpdd device driver indicating power-on self-test failures:

Error Message                                  Comment
Self-test error code 0xHH                      HH is a self-test error code.
Self-test timed out, code 0xHH                 HH is a self-test checkpoint.
Self-test: handshake timed out, got 0xHH       HH is the last data read during the handshake test.
Self-test: unexpected interrupt                Driver received a spurious interrupt from the board during the handshake test.
Self-test: ISR ready error, flag 0xHH          Adapter did not indicate ready during the interrupt portion of the host handshake test.
Self-test: ISR test error. exp 0xHH, got 0xHH  Data compare error during interrupt-driven handshake test.
Self-test: interrupt not received              Driver received no interrupt during handshake test.
Self-test: completion timed out, code HH       HH is a self-test checkpoint.
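
When scanning logs for these messages, a pattern table can classify each one and extract its hexadecimal fields. The sketch below is illustrative: the kind labels and the classify function are hypothetical, and it assumes that each message appears on its own with no prefix or suffix added by the logging facility.

```python
import re

HEX = r"([0-9A-Fa-f]{2})"

# One pattern per message in the list above; the labels are hypothetical.
PATTERNS = [
    ("error_code", re.compile(rf"Self-test error code 0x{HEX}$")),
    ("timeout_checkpoint", re.compile(rf"Self-test timed out, code 0x{HEX}$")),
    ("handshake_timeout",
     re.compile(rf"Self-test: handshake timed out, got 0x{HEX}$")),
    ("unexpected_interrupt", re.compile(r"Self-test: unexpected interrupt$")),
    ("isr_ready_error",
     re.compile(rf"Self-test: ISR ready error, flag 0x{HEX}$")),
    ("isr_compare_error",
     re.compile(rf"Self-test: ISR test error\. exp 0x{HEX}, got 0x{HEX}$")),
    ("interrupt_not_received",
     re.compile(r"Self-test: interrupt not received$")),
    # The last message is listed without a 0x prefix, so accept either form.
    ("completion_timeout",
     re.compile(rf"Self-test: completion timed out, code (?:0x)?{HEX}$")),
]


def classify(message):
    """Return (kind, [field values]) for a known message, else None."""
    text = message.strip()
    for kind, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return kind, [int(group, 16) for group in match.groups()]
    return None
```

The extracted values can then be looked up in the error code or checkpoint lists, depending on the kind of message.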

The following list details error messages from the rhpdd device driver indicating hardware failures detected during operation:

Error Message                  Comment
invalid dp_obuf count=N        Byte count presented in the adapter's dp_ocnt field is out of range.
code/message= 0xHH/0xHHHHHHHH  Indicates that the adapter has crashed, and gives the error code and message present at the time of the crash.
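
The two operational messages can be parsed in the same way. In this sketch the function name and the returned dictionary keys are hypothetical, and N is assumed to be printed as a decimal integer, which the list above does not state.

```python
import re

# Assumption: N is a decimal integer (possibly negative).
COUNT_RE = re.compile(r"invalid dp_obuf count=(-?\d+)")
CRASH_RE = re.compile(
    r"code/message=\s*0x([0-9A-Fa-f]{2})/0x([0-9A-Fa-f]{8})")


def parse_runtime_error(message):
    """Return a dict describing a known operational error, else None."""
    match = COUNT_RE.search(message)
    if match:
        return {"kind": "bad_output_count", "count": int(match.group(1))}
    match = CRASH_RE.search(message)
    if match:
        return {
            "kind": "adapter_crash",
            "code": int(match.group(1), 16),
            "message": int(match.group(2), 16),
        }
    return None
```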

FRU List

There are two models of the adapter. They differ only in the amount of main memory installed on the board. The following list gives the field-replaceable unit (FRU) part number corresponding to each model.

Model             FRU Part Number
2MB (256 users)   51G8538
8MB (2048 users)  51G8539
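
A service script could select the FRU part number from the installed memory size. The following sketch simply encodes the list above; the names MODELS and fru_part_number are hypothetical.

```python
# Memory size in MB -> (maximum users, FRU part number), from the list above.
MODELS = {
    2: (256, "51G8538"),
    8: (2048, "51G8539"),
}


def fru_part_number(memory_mb):
    """Return the FRU part number for an adapter with the given memory."""
    try:
        return MODELS[memory_mb][1]
    except KeyError:
        raise ValueError(f"no adapter model with {memory_mb}MB") from None
```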

Error Codes

The following list details the error codes that the adapter board might log if the POST fails (at power-up or during the execution of TU 10, the test unit that invokes the self-test). The codes are grouped according to type.

These error codes indicate a hardware malfunction that generally requires adapter repair or replacement.

Internal Data RAM Test
80 Internal data RAM locations 0004H to 03FFH were written with each location's 32-bit address; when read back, the written addresses did not match the true addresses.
81 Internal data RAM locations 0004H to 03FFH were written with either the data pattern 55555555H or AAAAAAAAH; when checked, the pattern was not correct.
82 Internal data RAM locations 0004H to 03FFH were written with 00; when read back, the written data did not equal 00.
Register Cache Test
84 During the testing of the internal register cache RAM, one of the values read back from the cache RAM did not match the value written.
UART Test
89 The universal asynchronous receiver/transmitter (UART) status register contains an incorrect value.
8A Data is looped through the UART in polled mode; the byte read does not match the byte written.
8B A parity, overrun, framing, or break error occurred during the polled loop.
8C One character was sent to the transmitter, but more than one was received in the polled loop.
Interrupt UART Test
8D After waiting 500 ms, not all the characters have looped through.
8E The received data does not match the transmitted data.
8F An unexpected UART interrupt occurred.
Clear and Check DRAM Parity
90 A DRAM parity error was detected.
PROM Checksum Test
94 The calculated programmable read-only memory (PROM) checksum did not match the precalculated checksum.
RAM Size Test
98 The calculated DRAM size does not match the valid sizes allowed for this product.
Real-Time Clock Period Test
9C A real-time clock tick did not occur within 500 ms.
9D The 52 ms real-time clock period is not within a 5% tolerance (49.4 to 54.6 ms).
Memory Region Byte Order Test
A1 The value read back from memory region 1 did not match the value written.
A2 The value read back from memory region 2 did not match the value written.
A3 The value read back from memory region 3 did not match the value written.
NINDY RAM Test
A4 One of the following patterns read back from the upper 32KB of DRAM did not match the value written: 55555555H, AAAAAAAAH, 5A5A5A5AH, or A5A5A5A5H.
A5 The 32-bit value read from a location in the upper 32KB of DRAM did not match the value written into it. The value written was the address of the 32-bit location.
A6 The 32-bit value read from a location in the upper 32KB of DRAM did not match the value written into it. The value should have been 0.
Destructive DRAM Test
B0 After writing one of the following patterns, the read value did not match the written value:
  • 8-bit values 55H, AAH, A5H, and 5AH
  • 16-bit values 5555H, AAAAH, A5A5H, and 5A5AH
  • 32-bit values 55555555H, AAAAAAAAH, A5A5A5A5H, and 5A5A5A5AH
  • Quad-word values 08040210H, 80402010H, F7FDFBFEH, and 7FDFBFEFH.
B1 The 32-bit value read from a DRAM location did not match the value written. The value written was the address of the 32-bit location.
B2 The value read from a DRAM location did not match the value written. The value written was 0.
Modified Algorithm Test Sequence DRAM Test
B3 The value read from a DRAM location did not match the value written.
Ethernet Controller Internal Self-Test
B4 The Ethernet controller self-test failed.
Ethernet Controller Reset and Initialization
B5 The Ethernet controller failed to reset properly, or the status of the channel attention line is incorrect.
Ethernet Controller Interrupt Test
B6 The Ethernet controller failed to generate an interrupt.
Set Ethernet Controller Address
B7 An attempt to set the Ethernet address in the Ethernet controller failed.
Ethernet Controller Internal Loopback
B8 The Ethernet controller internal data loopback test failed.
Ethernet Controller OnBoard Loopback
B9 The Ethernet controller onboard loopback failed.
Ethernet Controller External Loopback
BA The Ethernet controller external loopback failed.
Ethernet Controller Internal Register Dump Test
BB The Ethernet controller failed to dump its internal registers.
Memory-to-Memory DMA Test
BC Testing of the 80960CA's direct memory access (DMA) channel 1 failed.
BD An unexpected DMA interrupt occurred during DMA channel 1 testing.
Host Handshake (Polled)
C0 The host did not echo data back to the adapter within 250 ms.
Host Handshake (Interrupt)
C1 The adapter did not receive an interrupt from the host within 12 seconds to indicate that data had been received.
C2 The adapter received two interrupts from the host when only one was expected.
C4 Data received by the adapter did not match what was sent to the host.
C5 The host did not respond to the adapter's interrupt within 12 seconds.
C6 The processor received an interrupt from a source other than the host during the host handshake test.
C8 Unexpected DMAP status register value.
C9 The adapter timed out waiting for the DMAP to the host to complete.
CA The DMAP data read back from the i82325 data register did not match that written.
CB Unexpected direct memory access controller (DMAC) status register value.
CC The adapter timed out waiting for the DMAC operation to complete.
E0 Electrically erasable programmable read-only memory (EEPROM) internal CRC mismatch.
E1 Could not successfully initialize EEPROM.
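
Two parts of the list above lend themselves to a short sketch: mapping an error code back to its test group, and the arithmetic behind code 9D. The names below are hypothetical. Codes C8 through E1 are left out of the table because the list gives them no heading of their own, and treating the tolerance limits as inclusive is an assumption.

```python
# Test group -> error codes, copied from the headings in the list above.
ERROR_GROUPS = [
    ("Internal Data RAM Test", (0x80, 0x81, 0x82)),
    ("Register Cache Test", (0x84,)),
    ("UART Test", (0x89, 0x8A, 0x8B, 0x8C)),
    ("Interrupt UART Test", (0x8D, 0x8E, 0x8F)),
    ("Clear and Check DRAM Parity", (0x90,)),
    ("PROM Checksum Test", (0x94,)),
    ("RAM Size Test", (0x98,)),
    ("Real-Time Clock Period Test", (0x9C, 0x9D)),
    ("Memory Region Byte Order Test", (0xA1, 0xA2, 0xA3)),
    ("NINDY RAM Test", (0xA4, 0xA5, 0xA6)),
    ("Destructive DRAM Test", (0xB0, 0xB1, 0xB2)),
    ("Modified Algorithm Test Sequence DRAM Test", (0xB3,)),
    ("Ethernet Controller Internal Self-Test", (0xB4,)),
    ("Ethernet Controller Reset and Initialization", (0xB5,)),
    ("Ethernet Controller Interrupt Test", (0xB6,)),
    ("Set Ethernet Controller Address", (0xB7,)),
    ("Ethernet Controller Internal Loopback", (0xB8,)),
    ("Ethernet Controller OnBoard Loopback", (0xB9,)),
    ("Ethernet Controller External Loopback", (0xBA,)),
    ("Ethernet Controller Internal Register Dump Test", (0xBB,)),
    ("Memory-to-Memory DMA Test", (0xBC, 0xBD)),
    ("Host Handshake (Polled)", (0xC0,)),
    ("Host Handshake (Interrupt)", (0xC1, 0xC2, 0xC4, 0xC5, 0xC6)),
]

CODE_TO_GROUP = {code: name for name, codes in ERROR_GROUPS for code in codes}


def describe(code):
    """Return a one-line description of a self-test error code."""
    group = CODE_TO_GROUP.get(code)
    if group is None:
        return f"0x{code:02X}: not in this sketch's table"
    return f"0x{code:02X}: {group}"


# Code 9D: 5% of the 52 ms period is 2.6 ms, giving 49.4 to 54.6 ms.
# Integer microseconds avoid floating-point rounding at the limits.
RTC_NOMINAL_US = 52_000
RTC_TOLERANCE_US = RTC_NOMINAL_US * 5 // 100


def rtc_period_ok(period_us):
    """True if the measured period is within 5% of 52 ms (inclusive)."""
    return abs(period_us - RTC_NOMINAL_US) <= RTC_TOLERANCE_US
```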

Checkpoints

The following list gives the checkpoint codes reported by the adapter board and the self-test that each one identifies:

01 Internal data RAM test
02 Register cache test
03 i82325 initialization and test
04 Memory region byte-order test
06 Clear and check DRAM parity
07 PROM checksum test
05 DRAM size test
08 NINDY RAM test
09 Real-time clock period test
0A Polled UART test
0B Interrupt UART test
0C Dynamic RAM test (destructive)
0D Ethernet controller internal self-test
0E Ethernet controller reset and initialization
0F Ethernet controller interrupt test
10 Ethernet controller address
11 Ethernet controller internal loopback
12 Ethernet controller onboard loopback
13 Ethernet controller external loopback
14 Ethernet controller internal register dump
15 Memory-to-memory DMA test
40 Adapter waiting for the host to execute its portion of the host handshake test.
1E Polled handshake test
1F Interrupt host handshake test
20 Programmed I/O to host (i82325 DMAP operation)
21 DMA block of data to or from host (i82325 DMAC operation)
22 EEPROM test
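
The checkpoint list can be encoded as a lookup table, for example to name the self-test associated with the checkpoint in a "Self-test timed out" message. This is a sketch only: the names are hypothetical, and reading the logged checkpoint as the test that was in progress when the timeout occurred is an assumption.

```python
# Checkpoint code -> self-test, copied from the list above.
CHECKPOINTS = {
    0x01: "Internal data RAM test",
    0x02: "Register cache test",
    0x03: "i82325 initialization and test",
    0x04: "Memory region byte-order test",
    0x05: "DRAM size test",
    0x06: "Clear and check DRAM parity",
    0x07: "PROM checksum test",
    0x08: "NINDY RAM test",
    0x09: "Real-time clock period test",
    0x0A: "Polled UART test",
    0x0B: "Interrupt UART test",
    0x0C: "Dynamic RAM test (destructive)",
    0x0D: "Ethernet controller internal self-test",
    0x0E: "Ethernet controller reset and initialization",
    0x0F: "Ethernet controller interrupt test",
    0x10: "Ethernet controller address",
    0x11: "Ethernet controller internal loopback",
    0x12: "Ethernet controller onboard loopback",
    0x13: "Ethernet controller external loopback",
    0x14: "Ethernet controller internal register dump",
    0x15: "Memory-to-memory DMA test",
    0x1E: "Polled handshake test",
    0x1F: "Interrupt host handshake test",
    0x20: "Programmed I/O to host (i82325 DMAP operation)",
    0x21: "DMA block of data to or from host (i82325 DMAC operation)",
    0x22: "EEPROM test",
    0x40: "Adapter waiting for the host to execute its portion "
          "of the host handshake test",
}


def checkpoint_name(code):
    """Return the self-test name for a checkpoint code."""
    return CHECKPOINTS.get(code, "unknown checkpoint")
```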
