AIX Version 4.3 Kernel Extensions and Device Support Programming Concepts

IDE Error Recovery

If an error such as a check condition or hardware failure occurs, the transaction that was active when the error occurred is returned with the ataide_buf.bufstruct.b_error field set to EIO. The IDE device driver should process or recover from the condition, rerunning any mode selects needed to recover properly. After this recovery, it should reschedule the transaction that had the error. In many cases, the IDE device driver only needs to retry the unsuccessful operation.

The IDE adapter device driver should never retry an IDE command on error after the command has been successfully given to the adapter. The consequences of retrying an IDE command at this point range from minimal to catastrophic, depending on the type of device. Commands for certain devices (for example, tapes and other sequential access devices) cannot be retried immediately after a failure. If such an error occurs, the failed command is returned to the IDE device driver through an iodone call, with an appropriate error status, so that the device driver can perform error recovery. Only the IDE device driver that originally issued the command knows whether the command can be retried on the device. The IDE adapter device driver must retry only those commands that were never successfully transferred to the adapter. In this case, if the retries are successful, the ataide_buf status should not reflect an error. However, the IDE adapter device driver should still log the error for the retried condition.
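
The retry rule for the adapter device driver can be sketched as a small decision function. This is only an illustration under stated assumptions: struct cmd_state, its fields, the retry budget, and the adapter_action values are hypothetical names invented for this sketch and are not part of the AIX interfaces. Only the EIO value and the rule itself come from the text above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-command bookkeeping; not an AIX structure. */
struct cmd_state {
    bool handed_to_adapter; /* true once the adapter accepted the command */
    int  retries_left;      /* assumed adapter-level retry budget         */
    int  b_error;           /* stands in for ataide_buf.bufstruct.b_error */
    bool logged;            /* stands in for writing an error log entry   */
};

enum adapter_action { ADAPTER_RETRY, ADAPTER_RETURN_ERROR };

/*
 * Decide what the adapter driver does when a command fails.
 * A retry is allowed only if the command never reached the adapter.
 * Otherwise the error goes back to the device driver, which alone
 * knows whether the command is safe to reissue.
 */
enum adapter_action adapter_on_failure(struct cmd_state *cmd)
{
    assert(cmd != NULL);

    if (!cmd->handed_to_adapter && cmd->retries_left > 0) {
        cmd->retries_left--;
        cmd->logged = true; /* the retried condition is still logged      */
        cmd->b_error = 0;   /* a successful retry must not show an error  */
        return ADAPTER_RETRY;
    }

    cmd->b_error = EIO;     /* reported to the device driver via iodone   */
    return ADAPTER_RETURN_ERROR;
}
```

In this sketch, a command that has already been handed to the adapter is never retried, whatever retry budget remains. Treating an exhausted budget as an error return is an assumption of the sketch, not something the text specifies.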

Analyzing Returned Status

The following order of precedence should be followed by IDE device drivers when analyzing the returned status:

  1. If the ataide_buf.bufstruct.b_flags field has the B_ERROR flag set, then an error has occurred and the ataide_buf.bufstruct.b_error field contains a valid errno value.

    If the b_error field contains the ENXIO value, either the command needs to be restarted or it was canceled at the request of the IDE device driver.

    If the b_error field contains the EIO value, then either one flag or no flag is set in the ataide_buf.status_validity field. If a flag is set, the cause is an error reported in either the ata.status or ata.errval field.

    If the status_validity field is 0, then the ataide_buf.bufstruct.b_resid field should be examined to see if the IDE command issued was in error. The b_resid field can have a value without an error having occurred. To decide whether an error has occurred, the IDE device driver must evaluate this field with regard to the IDE command being sent and the IDE device being driven.

  2. If the ataide_buf.bufstruct.b_flags field does not have the B_ERROR flag set, then no error is being reported. However, the IDE device driver should examine the b_resid field to check for cases where less data was transferred than expected. For some IDE commands, this occurrence may not represent an error. The IDE device driver must determine if an error has occurred.

    There is a special case in which b_resid is nonzero without an error. The DMA service routine may not be able to map all virtual memory pages to real memory pages for a single DMA transfer. This can occur when sending close to the maximum amount of data that the adapter driver supports. In this case, the adapter driver transfers as much of the data as can be mapped by the DMA service. The unmapped size is returned in the b_resid field, and the status_validity field has the ATA_IDE_DMA_NORES bit set. The IDE device driver is expected to send the data represented by the b_resid field in a separate request.

    If a nonzero b_resid field does represent an error condition, then the device queue is not halted by the IDE adapter device driver. It is possible for one or more succeeding queued commands to be sent to the adapter (and possibly the device). Recovering from this situation is the responsibility of the IDE device driver.
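
The order of precedence above can be sketched as a classification routine. This is a minimal sketch rather than driver code: struct mock_ataide_buf flattens the fields named in the text into one hypothetical structure, the MOCK_ flag values are placeholders (the real B_ERROR and ATA_IDE_DMA_NORES definitions come from the system headers), and the verdict names and the remainder helper are invented for illustration. The helper assumes byte units and a contiguous buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Placeholder values; the real flags are defined by the system headers. */
#define MOCK_B_ERROR           0x0001u
#define MOCK_ATA_IDE_DMA_NORES 0x0004u

/* Hypothetical flattened view of the fields discussed in the text. */
struct mock_ataide_buf {
    unsigned      b_flags;         /* bufstruct.b_flags          */
    int           b_error;         /* bufstruct.b_error          */
    unsigned long b_resid;         /* bufstruct.b_resid          */
    unsigned      status_validity; /* ataide_buf.status_validity */
};

enum verdict {
    V_OK,                  /* no error, all data transferred            */
    V_RESTART_OR_CANCELED, /* ENXIO: restart needed, or was canceled    */
    V_CHECK_ATA_STATUS,    /* EIO with a flag: see ata.status/ata.errval */
    V_CHECK_RESID,         /* driver must judge b_resid for this command */
    V_SEND_REMAINDER,      /* DMA mapping shortfall: send the rest later */
    V_OTHER_ERRNO          /* some other errno value in b_error         */
};

enum verdict analyze_status(const struct mock_ataide_buf *bp)
{
    assert(bp != NULL);

    if (bp->b_flags & MOCK_B_ERROR) {              /* step 1 */
        if (bp->b_error == ENXIO)
            return V_RESTART_OR_CANCELED;
        if (bp->b_error == EIO)
            return bp->status_validity != 0 ? V_CHECK_ATA_STATUS
                                            : V_CHECK_RESID;
        return V_OTHER_ERRNO;
    }

    if (bp->b_resid != 0) {                        /* step 2 */
        if (bp->status_validity & MOCK_ATA_IDE_DMA_NORES)
            return V_SEND_REMAINDER;
        return V_CHECK_RESID;
    }
    return V_OK;
}

/* Where the follow-up request starts and how long it is. */
struct remainder { unsigned long offset; unsigned long length; };

struct remainder remainder_after_partial(unsigned long offset,
                                         unsigned long count,
                                         unsigned long resid)
{
    struct remainder r;

    assert(resid <= count);
    r.offset = offset + (count - resid); /* skip what was transferred */
    r.length = resid;                    /* the unmapped portion      */
    return r;
}
```

For example, if a 65536-byte request starting at offset 0 returns with b_resid equal to 4096 and the DMA shortfall bit set, the follow-up request in this sketch covers 4096 bytes starting at offset 61440. Whether a V_CHECK_RESID result is an error still depends on the command and the device, as described above.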

