ITEM: DD1556L

Network boot leds 511 622 then blank



Question:

Env:
  AIX 4.2
  RISC E30 (server, control workstation); SP High Node (client)
  sysback 4.1.1.5

Problem:
  When the customer tries to network boot the High Node from the
  E30, the boot hangs after showing LEDs 511 and 622, and then the
  LED display goes blank.

Action Taken:
  The boot appears to be failing on the cfgcon command.
  I will make sure the customer has installed all the
  filesets needed on the E30 to support the serial adapter
  on the High Node:
  
  devices.mca.dee6     Standard I/O (dee6) Adapter Software
  devices.mca.df5f     Standard I/O (df5f) Adapter Software
  devices.mca.fed9     Standard I/O (fed9) Adapter Software
  devices.mca.f6fe     Standard I/O (f6fe) Adapter Software
  devices.sio.sa.rte
  bos.rte.tty

These were all installed on the E30, the E30 was rebooted, and the
boot image was recreated. This did not fix the problem.

I made sure the customer did not have a symbolic link
from /etc/objrepos/Config_Rules to /usr/lib/Config_Rules.

I had the customer put the network boot into debug mode:

How to debug a network boot...
-------------------------------

REQUIREMENTS
  Sysback 3.3.2+
  Any ASCII terminal

STEPS TO BE DONE ON THE SERVER SYSTEM
-------------------------------------

1.  cd /usr/sbin
2.  cp mksbnetboot mksbnetboot.org
3.  vi mksbnetboot
      Search for bosboot (the line to change should be the second match).

    Change:

     bosboot -a -d $DEVICE -p $tmp_proto -b $BOOTIMAGE \
             -k $bootdir/unix_$kernel -T $platform

    To:

     bosboot -Ia -d $DEVICE -p $tmp_proto -b $BOOTIMAGE \
             -k $bootdir/unix_$kernel -T $platform

    (This adds the -I flag, an uppercase letter i, to the command.)
    Save the file.

4.  Create your boot image:
    /usr/sbin/mksbnetboot -B -d'ethernet' -T'rs6k' -k mp
    (if your client is on an Ethernet network)

5.  nm -x /usr/lib/boot/unix_mp | grep enter_dbg

    Example output:

    enter_dbg            D 0x001801f4  0x0004

    Write down the address, which is the first hexadecimal value on
    the line, without the 0x prefix and leading zeros.

    From the example above, the number to write down is:
    1801f4
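
The address in step 5 can also be pulled out of the nm output with a
small filter instead of being copied by hand. This is only a sketch:
the function name extract_dbg_addr is made up for illustration, and it
assumes the nm -x output has the same column layout as the example
above (symbol name, type, address, size).

```shell
# Hypothetical helper (not part of Sysback or AIX): print the enter_dbg
# address with the 0x prefix and leading zeros removed.
# Reads nm -x style output on standard input.
extract_dbg_addr() {
    awk '$1 == "enter_dbg" {
        addr = $3                 # third column holds the address
        sub(/^0[xX]/, "", addr)   # drop the 0x prefix
        sub(/^0+/, "", addr)      # drop leading zeros
        print addr
        exit
    }'
}

# Fed with the example line from step 5; prints 1801f4
printf 'enter_dbg            D 0x001801f4  0x0004\n' | extract_dbg_addr
```

On the server the same filter would be fed from the real command:
nm -x /usr/lib/boot/unix_mp | extract_dbg_addr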

STEPS TO BE DONE ON THE CLIENT SYSTEM
-------------------------------------

6.  Start the network boot on the High Node

7.  After bootp and tftp have finished, the ASCII terminal will display
    the kernel debugger information.
    Type these commands at the debugger prompt:
    >  st 1801f4 2
    >  g

The debug boot shows that the console is not getting configured
(cfgcon command). It is failing with this error message:

cfgcon: console is not defined and no candidate terminals were 
        detected

The debug output also shows these errors repeated many times:

  0514-001 System error:
sh: cannot fork: no swap space

 cfgmgr  cfgmgr Method error (/etc/methods/ptynode \^H):
        0514-010 Error returned from odm_run_method.
 cfgmgr  cfgmgr Method error (/etc/methods/startlft \^H):
        0514-010 Error returned from odm_run_method.
 cfgmgr  cfgmgr Method error (/etc/methods/startrcm \^H):
        0514-010 Error returned from odm_run_method.
 cfgmgr  cfgmgr Method error (/etc/methods/starttty \^H):
        0514-010 Error returned from odm_run_method.
 cfgmgr  cfgmgr Method error (/etc/methods/startsmt \^H):
        0514-010 Error returned from odm_run_method.
 cfgmgr  cfgmgr Method error (/etc/methods/startsgio \^H):
        0514-010 Error returned from odm_run_method.
 cfgmgr  cfgmgr Method error (/usr/lib/methods/defops \^H):
        0514-010 Error returned from odm_run_method.
 cfgmgr  cfgmgr Method error (/usr/lib/methods/defssar \^H):
        0514-010 Error returned from odm_run_method.
 cfgmgr  cfgmgr Method error (/etc/methods/darcfgrule \^H):
        0514-010 Error returned from odm_run_method.
 cfgmgr  cfgmgr Method error (/usr/lib/methods/cfgfan \^H):
        0514-010 Error returned from odm_run_method.
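
If the debug output is captured to a file from the terminal session, a
filter like the one below can condense it. This is a sketch only: the
function name summarize_cfgmgr_errors is made up for illustration, and
it assumes the messages have exactly the form shown above.

```shell
# Hypothetical helper: list each failing configuration method once and
# count the "cannot fork: no swap space" messages.
# Reads captured debug output on standard input.
summarize_cfgmgr_errors() {
    awk '
        /cannot fork: no swap space/ { nofork++ }
        /Method error \(/ {
            path = $0
            sub(/.*Method error \(/, "", path)   # keep text after "("
            sub(/[ )].*$/, "", path)             # cut at first space or ")"
            if (!(path in seen)) {
                seen[path] = 1
                print "failed method: " path
            }
        }
        END { printf "fork failures: %d\n", nofork + 0 }
    '
}

# Example with three lines taken from the output above
summarize_cfgmgr_errors <<'EOF'
sh: cannot fork: no swap space
 cfgmgr  cfgmgr Method error (/etc/methods/ptynode \^H):
        0514-010 Error returned from odm_run_method.
EOF
# prints:
#   failed method: /etc/methods/ptynode
#   fork failures: 1
```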

The customer is, however, able to do a NIM boot from
this source: /source/aix/420

I had the customer send me this information:
--------------------------------------------
lslpp -L > CW.lpp.list  (list of filesets installed on the E30)

ls /source/aix/420 > lpp.source  (list of filesets in the lpp
                                  source used by NIM booting)

I noticed that these filesets are at different levels on the control
  workstation than in the user SPOT.

What is on the control workstation (E30):
-----------------------------------------
devices.mca.8f97.com       4.2.1.0     Common SSA Adapter (8f97)
devices.ssa.IBM_raid.rte   4.2.1.0     SSA Raid Manager Software
devices.ssa.disk.rte       4.2.1.0     SSA DASD Software

The matching filesets are at the 4.2.0.0 level in the NIM SPOT.
This could be our problem. The customer has 8 SSA adapters
installed on the system.

I had the customer remove the filesets listed above, reboot, and
rebuild the network boot image. This fixed the problem.
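
The level comparison that found the mismatch can be sketched as a small
filter. This is an illustration, not the procedure used on the call:
the function name compare_levels and the file names in the usage line
are made up, and both input files are assumed to hold the fileset name
in the first column and its level in the second.

```shell
# Hypothetical helper: compare_levels SERVER_LIST SPOT_LIST
# Prints every fileset that appears in both lists at different levels.
compare_levels() {
    awk '
        NR == FNR { spot[$1] = $2; next }    # first file read: SPOT levels
        ($1 in spot) && spot[$1] != $2 {     # second file: server levels
            printf "%s server=%s spot=%s\n", $1, $2, spot[$1]
        }
    ' "$2" "$1"
}

# Usage (file names are placeholders):
#   compare_levels cw.levels spot.levels
```

For the filesets in this item, a line such as
devices.ssa.disk.rte server=4.2.1.0 spot=4.2.0.0
would be the kind of output that points at the mismatch.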

Action Plan:
  Closing with customer approval


Support Line: Network boot leds 511 622 then blank ITEM: DD1556L
Dated: June 1997 Category: N/A
This HTML file was generated 99/06/24~13:30:16