ITEM: CF4468L

System hangs at LED 299 on boot



Env: Model 580 running AIX 4.1.4

Desc: When the customer tries to boot the machine, it hangs at LED 299.

Action: I had the customer verify the current problem. They boot the
machine in normal mode with no boot media other than the hard disks.
It boots to LED 299 and goes no further. They waited for over 10
minutes with no change.

LED 299 means that the system ROS has passed control over to the
loaded kernel image.

If it hangs at LED 299, it indicates a bad boot image OR a hardware
problem.

In AIX 4.1, the boot image is compressed by default. I decided to make
sure that an uncompressed image would fail the same way.

The customer only has 2 disks and they are both in "rootvg". The bootlv
(hd5) is on hdisk0.

I had her boot in service mode from her CD-ROM and make a new boot LV
on the other disk in rootvg. The uncompressed boot LV will require at
least 8 MB of space, so I made the LV with 2 PPs.

 mklv -t boot rootvg 2 hdisk1
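
The choice of 2 PPs is a round-up division of the space needed by the
partition size. A minimal sketch, assuming a 4 MB physical partition
size (the call record does not state the PP size; 4 MB is inferred from
8 MB fitting in 2 PPs). The function name "pps_needed" is hypothetical.

```shell
# pps_needed SIZE_MB PP_SIZE_MB
# Print the number of physical partitions needed to hold SIZE_MB,
# rounding up to a whole partition.
pps_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Assumed 4 MB partitions: an 8 MB boot image needs 2 PPs.
pps_needed 8 4
```

On a volume group with a different PP size, the second argument
changes and so may the count passed to "mklv".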

I had her run "lslv -m bootlv00" on the LV returned from "mklv" to
make sure it was on consecutive numbered PP's. I then had her run

 bosboot -a -U -l /dev/bootlv00 -d /dev/hdisk1
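
The "lslv -m" check for consecutive PPs can also be done by a small
filter instead of by eye. This is only a sketch: it assumes the map
output has one data row per logical partition, with the LP number in
the first column and the first-copy PP number in the second. The
function name "pps_consecutive" is hypothetical.

```shell
# pps_consecutive: read an LP-to-PP map on stdin and succeed only if
# the PP number in column 2 goes up by exactly one on each data row.
# Rows whose first two fields are not numeric (titles, headers) are
# ignored. Fails if no data rows are seen at all.
pps_consecutive() {
    awk '
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ {
            pp = $2 + 0
            if (seen && pp != prev + 1) bad = 1
            prev = pp
            seen = 1
        }
        END { if (bad || !seen) exit 1 }
    '
}

# Intended use on the AIX box (not run here):
#   lslv -m bootlv00 | pps_consecutive && echo "PPs are consecutive"
```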

I then had her reboot, and this time it booted successfully. This means
there is a problem with the compressed side of the boot code.

In a compressed image, the kernel is compressed with the "compress"
command and a command called "/usr/sbin/bootexpand" is prefixed to
that image along with a compressed RAM filesystem and "savebase"
information.

If either "compress" or "bootexpand" is corrupted or is the wrong
version, then a problem like this may occur.

"/usr/sbin/bootexpand" was fine, but "/usr/bin/compress" was not.

A "what /usr/bin/compress" returned nothing when it should have
returned a string like this:

 /usr/bin/compress:
        85  1.38  src/bos/usr/bin/compress/compress.c,
                  cmdfiles, bos412, 9446C 11/14/94 16:47:47
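
"what" works by scanning a file for the SCCS identification marker
"@(#)" and printing the text that follows it, so a binary that was not
built with such a marker prints nothing. A rough equivalent of the
yes/no part of that test, for a system without "what", might look like
this. The function name "has_sccs_id" is hypothetical.

```shell
# has_sccs_id FILE
# Succeed if FILE contains the "@(#)" marker that what(1) searches for.
# This only reports presence; it does not print the identification text.
has_sccs_id() {
    LC_ALL=C grep -q -F '@(#)' "$1"
}

# Intended use (not run here):
#   has_sccs_id /usr/bin/compress || echo "no SCCS id string found"
```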

I had her run "/usr/bin/compress -?" to try and get more information.
This command returned a 1.2.4 version number in addition to the usage
information. This is not an AIX command, and may be a 3rd party
version like "GNU".

She replaced this command with a copy of the AIX "compress" from
another AIX box at the same level.

I had her remove the boot image I had made and remake the boot images
with these commands:

 mkboot -c -d /dev/hdisk0
 mkboot -c -d /dev/hdisk1
 rmlv bootlv00
 bosboot -a -d /dev/hdisk0

I then had her reboot, and the machine came up fine this time. AIX
commands should NOT be replaced with 3rd party versions. The effects
are unpredictable, and could cause the machine to fail on a reboot.

The proper place for commands like this would be in a user's own "bin"
directory or in a separate bin directory like /usr/local/bin.
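
With a 3rd party copy kept in a separate directory, which copy a user
gets depends on the order of directories in PATH. A minimal sketch for
listing every copy of a command in search order, so an unexpected
replacement stands out. The function name "list_in_path" is
hypothetical.

```shell
# list_in_path NAME
# Print every executable called NAME found in the directories of $PATH,
# in the order the shell would search them. An empty PATH entry is
# treated as the current directory.
list_in_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        [ -x "${dir:-.}/$name" ] && echo "${dir:-.}/$name"
    done
    IFS=$old_ifs
}

# Intended use (not run here):
#   list_in_path compress
```

The first line printed is the copy that runs when the command is typed
without a path.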

She will pursue this with her users.

NextAction: Close call with customer agreement

Testcase: none


Dated: September 1996 Category: N/A