Frequently Asked Questions regarding the Device Event Table

In the Device Event Table, what are hard events?

The hard event count entry in the Device Event Table is a count of events detected by the SCSI I/O processor since the Device Event Table was last cleared. The adapter processor can detect many types of events. These events are usually related to SCSI cabling, back planes, or internal problems in the ServeRAID adapter, and are usually not caused by the hard drives or other SCSI devices on the bus.

How should hard events be handled?

If you find a hard event entered into the Device Event Table, first check whether there is a discernible pattern to the events. For example, a large number of events on a particular drive or channel may indicate a problem with the cabling or back plane for that drive or channel. Always check for cables being properly seated, bent pins, pushed pins, damaged cables, and proper termination. Before replacing the ServeRAID adapter, replace the SCSI cables and then the back plane. Replace the ServeRAID adapter only after you have exhausted all other possibilities. Remember that the ServeRAID card is the least likely item in the subsystem to cause hard events and the most expensive to replace.
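
The pattern check described above can be sketched in Python. This is only an illustration: the record layout (channel, SCSI ID, hard event count), the function name, and the sample numbers are assumptions, not a format that the ServeRAID utilities are known to export.

```python
from collections import defaultdict

def summarize_hard_events(entries):
    """Total hard events per channel and per (channel, scsi_id) device.

    `entries` is a hypothetical list of (channel, scsi_id, hard_count)
    tuples copied by hand from the Device Event Table.
    """
    per_channel = defaultdict(int)
    per_device = defaultdict(int)
    for channel, scsi_id, hard_count in entries:
        per_channel[channel] += hard_count
        per_device[(channel, scsi_id)] += hard_count
    return dict(per_channel), dict(per_device)

# Made-up sample data, for illustration only.
entries = [(1, 0, 0), (1, 1, 14), (1, 2, 0), (2, 0, 1)]
per_channel, per_device = summarize_hard_events(entries)

# The device with the highest total is the first place to inspect.
worst_device = max(per_device, key=per_device.get)
```

If one drive or channel dominates the totals, start with the cabling and back plane for that drive or channel.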

In the Device Event Table, what is the meaning of soft events?

The soft event entry in the Device Event Table is a count of the SCSI check conditions (other than unit attention) received from the target device (hard disk drive, CD-ROM, tape drive, etc.) since the Device Event Table was last cleared. There are many types of SCSI check conditions that can be received by the ServeRAID adapter. Some check conditions indicate errors, while others indicate unexpected (but not error) conditions, such as the command queue on a drive being temporarily full.
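
As a rough sketch of that counting rule, the following Python fragment counts check conditions while skipping unit attention. It assumes each check condition is represented by its SCSI sense key, where 0x6 is the standard sense key for UNIT ATTENTION. The function name and the sample list are hypothetical, and the adapter's real bookkeeping is internal to its firmware.

```python
UNIT_ATTENTION = 0x6  # standard SCSI sense key for UNIT ATTENTION

def count_soft_events(sense_keys):
    """Count check conditions, excluding unit attention.

    `sense_keys` is a hypothetical list of sense keys, one per
    check condition returned by a target device.
    """
    return sum(1 for key in sense_keys if key != UNIT_ATTENTION)

# Made-up sample: two unit attentions and three other check conditions.
sample = [0x6, 0x1, 0x3, 0x6, 0x4]
soft_events = count_soft_events(sample)
```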

How should soft events be handled?

An occasional soft event in the absence of PFA, parity, and hard event entries is usually not a problem. Check for cables being properly seated, bent pins, pushed pins, damaged cables, and proper termination of the SCSI bus. No further action is necessary if there are only a small number of random events.

Is there a threshold guideline to follow before replacing the drive?

As with hard events, a pattern of entries for a particular drive or channel may indicate a problem. There is no absolute soft event threshold at which a drive should be replaced. The ServeRAID adapter internally filters the types of soft events and will mark a disk drive defunct (DDD) when appropriate.

In the Device Event Table, what is the meaning of parity events?

The parity event entry in the Device Event Table is the number of single-bit errors on the SCSI bus found by the ServeRAID adapter since the Device Event Table was last cleared. Parity errors found by the targets (hard disk drive, tape drive, etc.) are reported as soft events.

How should parity events be handled?

Check for cables being properly seated, bent pins, pushed pins, damaged cables, and proper termination of the SCSI bus. If some or all of the devices are operating at Fast or Ultra speeds, ensure that the maximum cable lengths for the SCSI interface are not exceeded. No further action is necessary if there are only a small number of random events. A large number of events on a particular channel or to a particular target may require replacement of the back plane or SCSI cabling. It is possible but unlikely that the ServeRAID adapter has caused the parity errors.

In the Device Event Table, what is the meaning of miscellaneous events?

Miscellaneous events are all entries that are not parity, soft, hard or PFA entries. Miscellaneous events are very often target (hard drive, tape drive, etc.) problems.

How should miscellaneous events be handled?

Common causes of miscellaneous events include a selection time-out when accessing the drive, an unexpected SCSI bus free detected by the SCSI I/O processor, and a SCSI phase error. Check for cables being properly seated, bent pins, pushed pins, damaged cables, and proper termination of the SCSI bus. If all the preceding items are correct, then suspect the target indicated in the log. The least likely cause would be a problem in the ServeRAID adapter.

What is PFA, and do all drives have this capability?

PFA stands for Predictive Failure Analysis. Most server-class hard disk drives can monitor internal parameters that could predict a future failure in the drive. The algorithms and the data monitored are very complex and in most cases proprietary.

If a drive determines that a failure is likely, then it notifies the ServeRAID adapter of a possible future failure. This notification is included as a Device Event Table entry. Any drive with a PFA entry in the Device Event Table should be replaced as soon as possible.

How should PFA events be handled?

For RAID-1 or RAID-5, replace the drive immediately and rebuild the array. For RAID-0, back up all data to tape immediately, replace the drive, and then restore the data to the RAID-0 array.
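
The decision above can be written as a small lookup. This is a minimal Python sketch that covers only the RAID levels named in this FAQ. The function name and the step strings are illustrative and do not correspond to any ServeRAID command.

```python
def pfa_action(raid_level):
    """Return the ordered steps for a drive that has a PFA entry.

    Only RAID-0, RAID-1, and RAID-5 are covered, because those are the
    levels this FAQ addresses; any other level raises ValueError
    rather than guessing.
    """
    if raid_level in (1, 5):
        # Redundant arrays: the data survives the drive swap.
        return ["replace drive", "rebuild array"]
    if raid_level == 0:
        # No redundancy: save the data before touching the drive.
        return ["back up all data to tape", "replace drive",
                "restore data to array"]
    raise ValueError(f"RAID level {raid_level} is not covered by this FAQ")

steps_raid5 = pfa_action(5)
steps_raid0 = pfa_action(0)
```

The difference in ordering matters: RAID-0 has no redundancy, so the backup must come before the drive is removed.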
