ITEM: T5851L

system hang at c06 after attempted install



Question:

Customer accepted offhour charge.
.ENV:  AIX 3.2.4, model 550
.DESC:  Customer installed the ptf U428287 but he ran out of space.
He then rebooted the system but it hang at c06.
.ACT:  I had customer boot from the mksysb tape.  We ran fsck on 
the system.  I then had him ran /usr/sbin/logform /dev/hd8,
next type "exit" to mount the filesystems.
.Had him ran "df" and the results.

hd4  1104 free 91% used
hd2  1232 free 99% used
hd3  11348 free 7% used
hd9  14884 free 9% used
.I had him ran the bosboot -a -d /dev/hdisk1, and it returned these 
messages:
. cannot load program /usr/sbin/bootinfo
 symbol _system _cofiguration in ksh is undefined.
 error was: exec format error
 /usr/bosboot  991!: test: argument expected
  missing the proto type /etc/ascsidd
.I advised him to edit the /sbin/rc.boot file in line
          'BOOTYPE=`bootinfo -t`     to      BOOTYPE=1
Had vi didn't work, terminal settings seems to be good.
      \# lppchk -v       -> no output
.Had him run bosboot command but same error messages.
.I had customer remove files in the /usr/lib/drivers

                ascsidd    ---- 777 0 0 /etc/drivers/ascsidd
                ascsiddpin ---- 777 0 0 /etc/drivers/ascsiddpin
                vscsidd    ---- 777 0 0 /etc/drivers/vscsidd
                vscsiddpin ---- 777 0 0 /etc/drivers/vscsiddpin
               pscsi720dd    ---- 777 0 0 /etc/drivers/pscsi720dd
                pscsi720ddpin    ---- 777 0 0 /etc/drivers/pscsi720ddpin
                cfgascsi   ---- 777 0 0 /etc/methods/cfgascsi
                cfgvscsi   ---- 777 0 0 /etc/methods/cfgvscsi
                cfgpscsi   ---- 777 0 0 /etc/methods/cfgpscsi 
.We tried to run "bosboot -a -d /dev/hdisk0" but got the same 
error:
. cannot load program /usr/sbin/bootinfo
 symbol _system _cofiguration in ksh is undefined.
 error was: exec format error
 /usr/bosboot  991!: test: argument expected
 bosboot: boot image is 11049 512 byte blocks
.When we tried to boot from the new boot image, the system hung at
led 554.
.Told customer that he will have to reload the system from the mksysb
Customer is going to talk to his software support and he will call
back if he needs further information.

Response:

Act:    Conferenced in Jim and his customer Tom.  The system is
        in maintenance mode and they are having problems accessing
        a non-root volume group.  They want to just backup the datavg
        and reinstall a mksysb since the system seems unrecoverable.

        'lspv' shows 3 disks instead of 6.

                hdisk0  rootvg
                hdisk1  none
                hdisk2  none

        hdisk3,4,and5 are external disks off a different scsi
        adapter and are listed as defined by 'lsdev -Cc disk'.

        The disk are also mirrored as follows:

                hdisk0---hdisk5 rootvg
                hdisk1---hdisk3 root2vg
                hdisk2---hdisk4 root2vg

        Running 'exportvg root2vg' 'importvg -y root2vg hdisk1'
        fails with a quorum error like it cannot access the external
        drives at all.

        'mkdev -l hdisk3' failed and 'cfgmgr' caused the machine to
        crash after several 'unable to configure device into kernel'
        errors.

        We booted back into maintenance and ran diagnostics.  The external
        drives were not listed, although they did blink during the boot.
        Running diagnostics on the  scsi adapter for the external disks
        returns:

                "A problem was detected, a device failed to configure.
                 869-271 50% fuse, 45% scs1, 5% software"

        'lslpp -h | grep BROKEN' finds no problems, but it could be
        software related because it looks like the original installation
        of the ptf U287287 did not complete.

        'lppchk -v' shows several missing ptfs,
        including the device driver update U427864.

        hdisk0 was 100% full, so we removed hd1 (/home=2GB) and increased
        /usr by 50MB since it was 99% full.  Also increased /(root) to
        8MB to make room to complete the installation.

        Trying to install the 3.2.5 maintenance following the 'led554 fax'
        failed becaue it is not on the tape.  Instead we used:

        ifreqs | installp -BNXacgq -d /dev/rmt0.1 -f- 2>&1 | tee /tmp/ifreqs.log

        to install the missing ptfs which failed during the initial install.
        Hopefully, this will solve the 'bosboot' problems and install the
        drivers necessary to access the external drives.

NExt:   waiting for install to complete

Response:

Act:    The installp command completed successfully installing
        several ptfs.  After completion, the bosboot command
        works.  We ran:

                syncvg -v rootvg
                synclvodm -v rootvg
                bosboot -a -d /dev/hdisk0
                sync; sync; sync
                shutdown -Fr

        and booted in normal. This time we got a login.  All disk
        are listed by 'lspv' but the nonroot vg was not imported.

        Ran 'importvg -y root2vg hdisk1' and 'varyonvg root2vg'
        This completed, but mounts failed with:

                506-324 mount failed due to format of the media

        Running fsck on the root2vg logical volumes solved the
        problem and they mounted successfully after that.

Next:   Ran 'synclvodm -v root2vg'. Jim and Tom will recreate   
        a /home logical volume and restore /home from tape.

        closing call


Support Line: system hang at c06 after attempted install ITEM: T5851L
Dated: November 1995 Category: N/A
This HTML file was generated 99/06/24~13:30:35
Comments or suggestions? Contact us