PROBLEMS WITH HOT PLUGGING EXTERNAL SCSI DEVICES
ITEM: RTA000035301
QUESTION:
Customer is having many problems with SCSI on a 570. If they
1) boot without a terminator, 2) try to 'hot plug' a SCSI device
such as an external 8mm tape drive, or 3) accidentally unplug the
terminator, the machine will no longer boot and they have to
reload the whole machine.
I thought that SCSI was 'hot-pluggable' -- am I wrong? I was also told
by the CE who covers this account that he thought there was a RETAIN
item relating specifically to this problem.
Can you help?
---------- ---------- ---------- --------- ---------- ----------
A: When removing or adding a SCSI device to the SCSI bus, all power must
be removed from the system. Do not turn on, turn off, or disconnect
any SCSI device while power is present at the system unit. Such "hot
plugging" is forbidden because it might blow the controller fuse,
trip the PTC resistor, corrupt data, or permanently damage SCSI
controller chips in adapters or devices.
---------- ---------- ---------- --------- ---------- ----------
QUESTION:
Would it be normal, then, if the terminator was accidentally unplugged,
that the machine would need to be reloaded?
---------- ---------- ---------- --------- ---------- ----------
A: Yes. If a terminator is removed from the SCSI bus, data can be
corrupted and reinstalling the operating system may be necessary.
The following question, which was taken from another ASKQ item,
illustrates this point:
************************************************************************
(1) With AIX 3.1.5, I must first shut down and power off the system
before connecting / disconnecting any external SCSI devices. Is this a
true statement? I would be very very surprised if the answer was no.
At least twice after I added an external device to the system without
powering off the system, I had to reload the system because the system
hung with a "552" on LED. I also heard from many of my co-workers
the same thing happened to them.
************************************************************************
---------- ---------- ---------- --------- ---------- ----------
QUESTION:
The error that originally showed up on the LED readout was 555, and
it did require a reload. A SCSI device was again accidentally bumped
off, and now the error is a 557 -- and in service mode the machine
cannot be reloaded -- so we are now calling a CE. We can't find
documentation on these error codes.
Is there any known workaround, or something that we might try, to
'unlock' the SCSI channel and allow it to boot again from the hard
disk? We understand that we need to address the root of the problem
(the SCSI cables being too easy to bump or unplug), but is there any
way around this problem? Thanks in advance.
---------- ---------- ---------- --------- ---------- ----------
A: As long as no vital data was corrupted when your SCSI bus hung,
it may be possible to recover your system. Following is a document
that explains your error, and gives some options you can try to
get your system running again.
RECOVERY FROM AN LED 551, 555, OR 557 IN AIX 3.1 OR 3.2
SPECIAL NOTICES
The problem for which you received this document is not
considered a code warranty issue. This document is provided
as an aid by the Austin AIX Support Center. If you need
further assistance, contact your local branch office or
point of sale, or call 1-800-CALL-AIX for information about
support offerings. All of the above services may be
billable. Faxes on a variety of subjects may be ordered
free of charge from 1-800-IBM-4FAX.
Comments about this document may be sent by fax to "Info
Feedback" at (512) 823-7634. IBM representatives can send
comments internally to ROUSHC at AUSVM8.
The information contained in this document is distributed
"AS IS" without any warranties of any kind either expressed
or implied. IBM will not be responsible for any direct,
incidental, consequential, special or indirect damages. IBM
EXPRESSLY DISCLAIMS ANY IMPLIED WARRANTY OF MERCHANTABILITY
AND ANY IMPLIED WARRANTY OF FITNESS FOR A PARTICULAR
PURPOSE.
The use of this information or the implementation of any of
these techniques is the sole responsibility of the customer
and depends on the customer's ability to evaluate and
integrate this information or implementation into the customer's
operational environment.
CAUSES OF AN LED 551, 555, OR 557
The known causes of an LED 551, 555, or 557 during IPL on a
RISC System/6000 are:
o A corrupted file system.
o A corrupted journaled-file-system (JFS) log device.
o A failing fsck (file-system check) caused by a bad
file-system helper.
o A bad disk in the machine that is a member of the
rootvg.
SUMMARY OF THE RECOVERY PROCEDURE
To diagnose and fix the problem, you will need to boot from
bootable media and run logform on /dev/hd8. Then run fsck
to fix any file systems that may be corrupted.
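As a rough illustration of the fsck pass summarized above, the
following shell sketch loops over the rootvg logical volumes that
step 7 of this document names for each AIX level. The function names
rootvg_filesystems and check_rootvg are hypothetical helpers, not AIX
commands, and the sketch assumes a shell that supports functions. It
treats any nonzero fsck exit status as something to review; this
document does not describe fsck exit codes, so that is an assumption.

```shell
# Hypothetical helper: print the logical volumes that step 7 checks
# for the given AIX level (3.1 or 3.2).
rootvg_filesystems() {
    case "$1" in
        3.1) echo "/dev/hd1 /dev/hd2 /dev/hd3 /dev/hd4" ;;
        3.2) echo "/dev/hd1 /dev/hd2 /dev/hd3 /dev/hd4 /dev/hd9var" ;;
        *)   echo "unknown AIX level: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: run "fsck -y" on each logical volume in turn
# and remember any that exit nonzero (assumed to need a second look).
check_rootvg() {
    fs_list=`rootvg_filesystems "$1"` || return 1
    failed=""
    for fs in $fs_list; do
        fsck -y "$fs" || failed="$failed $fs"
    done
    if [ -n "$failed" ]; then
        echo "fsck exited nonzero on:$failed" >&2
        return 1
    fi
    return 0
}
```

After logform has been run, this could be invoked as "check_rootvg 3.2"
(or "check_rootvg 3.1"). Typing the individual fsck commands from
step 7 by hand is equivalent.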
STEPS
1. Turn the key to the Service position.
2. With bosboot diskettes or tape OF THE SAME VERSION AND
LEVEL AS THE SYSTEM, boot the system. (If booting from
diskettes, insert the Display diskette when you see LED
c07.)
------------------------------------------------------------
WARNING: If you boot a 3.2 system with 3.1 media, or boot a
3.1 system with 3.2 media, then you will not be able to use
the standard scripts (getrootfs or /etc/continue) to bring
your workstation into full maintenance mode.
Moreover, performing the scripts on a 3.1 system with 3.2
boot media may actually remove some files and prevent your
system from booting successfully in normal mode until
missing files (/etc/mount and /etc/umount) are replaced on
the disk.
------------------------------------------------------------
NOTE: If you get a 551, 555, or 557 on this step, the
diskette or tape is bad, and the machine is trying to
boot off the fixed disk. Try it again with new bosboot
diskettes or tape.
Follow the prompts to the installation/maintenance menu.
3. Choose the maintenance shell (option 5 for AIX 3.1,
option 4 for AIX 3.2).
4. Determine the hdisk# to use with the getrootfs or
/etc/continue command. If you have only one disk, then
"hdisk0" is the proper hdisk# to use. If you have more
than one disk, run
lqueryvg -Atp hdisk# | grep hd5
for each hdisk# (hdisk0, hdisk1, etc.) until you get
output that looks like:
00005264feb3631c.2 hd5 1
You may find more than one disk has this output. These
will all be disks which belong to the rootvg volume
group. You may use any of the disks identified to be in
rootvg in the following step.
5. Now access the rootvg volume group by running
/etc/continue (for AIX 3.1) or getrootfs (for AIX 3.2).
('#' is the number of the fixed disk, determined in step
4.)
For AIX 3.1 only, run
/etc/continue hdisk# sh
For AIX 3.2 only, run
getrootfs hdisk# sh
If you get errors indicating that a physical volume is
missing from the rootvg, run diagnostics on the physical
volumes to find out if you have a bad disk. Do not
continue with the rest of the steps in this document.
If you get other errors from getrootfs or /etc/continue,
do not continue with the rest of the steps in this
document. Correct the problem causing the error. If you
need assistance correcting the problem causing the
error, contact one of the following:
o local branch office
o your point of sale
o call 1-800-CALL-AIX (to register for fee-based
services)
All of the above avenues for assistance may be billable.
6. Format the default jfslog for the rootvg jfs file
systems.
For AIX 3.1 only, run
/etc/aix/logform /dev/hd8
For AIX 3.2 only, run
logform /dev/hd8
Answer YES when asked if you want to destroy the log.
7. Next, run the following commands to check and repair
file systems. (The "-y" option gives fsck permission to
repair file systems when necessary.)
fsck -y /dev/hd1
fsck -y /dev/hd2
fsck -y /dev/hd3
fsck -y /dev/hd4
For AIX 3.2 only, also run
fsck -y /dev/hd9var
8. Type "exit". The file systems will automatically mount
after you type "exit".
9. If you are running the Andrew File System (AFS), use the
following commands to save the AFS file-system helper
and replace it with the original file-system helper.
In AIX 3.1,
cd /etc/helpers
In AIX 3.2,
cd /sbin/helpers
Then in both AIX 3.1 and 3.2,
copy v3fshelper v3fshelper.afs
copy v3fshelper.orig v3fshelper
10. Determine which disk is the boot disk with the lslv
command. The boot disk will be shown in the PV1 column
of the lslv output.
lslv -m hd5
11. Recreate the boot image. (hdisk# is the boot disk
determined in step 10.)
bosboot -a -d /dev/hdisk#
12. If you are running the Andrew File System (AFS), copy
the AFS file-system helper back:
copy v3fshelper.afs v3fshelper
13. With the key in Normal position, run
shutdown -Fr
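Steps 4 and 10 both involve reading command output to pick a disk.
The shell sketch below shows one way that could be scripted. The
function names find_rootvg_disks and boot_disk_from_lslv are
hypothetical helpers, not AIX commands. The second function assumes
that "lslv -m hd5" prints a title line, then a header line containing
the column name PV1, then one row per logical partition; that layout
is an assumption, since this document only says the boot disk appears
in the PV1 column. It also assumes awk is available, which may not be
true in a minimal maintenance shell.

```shell
# Hypothetical helper for step 4: print every candidate disk whose
# lqueryvg output mentions hd5 (that is, a disk belonging to rootvg).
find_rootvg_disks() {
    for disk in "$@"; do
        if lqueryvg -Atp "$disk" 2>/dev/null | grep hd5 >/dev/null; then
            echo "$disk"
        fi
    done
}

# Hypothetical helper for step 10: read "lslv -m hd5" output on stdin
# and print the PV1 entry of the first data row.
boot_disk_from_lslv() {
    awk '
        # Until a header containing PV1 is seen, keep looking for it
        # and remember which column it occupies.
        col == 0 { for (i = 1; i <= NF; i++) if ($i == "PV1") col = i; next }
        # First row after the header: print the PV1 field and stop.
        { print $col; exit }
    '
}
```

Possible usage: "find_rootvg_disks hdisk0 hdisk1 hdisk2" in step 4, and
"lslv -m hd5 | boot_disk_from_lslv" in step 10. Check the result by eye
before passing it to getrootfs, /etc/continue, or bosboot.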
If you followed all of the above steps and the system still
stops at an LED 551, 555, or 557 during a reboot in Normal
mode, you may want to pursue further system recovery
assistance from one of the following:
o local branch office
o your point of sale
o 1-800-CALL-AIX (to register for fee-based services)
All of the above avenues for assistance may be billable.
For reasons of time and the integrity of your AIX operating
system, the best alternative at this point may be to
reinstall AIX.
END OF DOCUMENT
If the above does not help, then you will probably need to reinstall
AIX.
---------- ---------- ---------- --------- ---------- ----------
This item was created from library item Q644324 CLZQH
Additional search words:
CLZQH DEVICE DEVICES EXTERN EXTERNAL HARDWARE HOT IX JAN94 OZNOTPID
WWQA: ITEM: RTA000035301 ITEM: RTA000035301
Dated: 04/1996 Category: RISCOHW
This HTML file was generated 99/06/24~12:43:13