RS/6000 SP 9076 SMP System and Service Processor Firmware Update

Applies to:
IBM RS/6000 SP 9076 Power3 SMP High Nodes and 375 MHz Power3 SMP High Nodes

If you experience problems or have questions regarding any component of this document or the installation of the code please call 1-800-IBM-SERV for assistance.

Please read this document in its entirety, and in particular understand the Cautions and Important Notes section, before applying this F/W to any node.
 

Contents:

1.0 Machines/Nodes Affected

2.0 Revision History

3.0 Related Publications

4.0 Cautions and Important Notes

5.0 Downloading and Unpacking the Firmware Update Package

6.0 Putting the Firmware on the System

7.0 Validating the Checksum

8.0 Setting up the Distributed Shell

9.0 Determining Your Current Firmware Levels

10.0 Updating the Firmware on the Target Nodes
 
 


1.0 Machines/Nodes Affected

This update provides new System and Service Processor (SvP) firmware (FW) on RS/6000 SP 9076 Power3 SMP High Nodes (FC 2054) and RS/6000 SP 9076 375 MHz Power3 SMP High Nodes (FC 2058).  This firmware can be applied to a Power3 SMP High node operating on any model of 9076 that has an operating system of AIX at 4.3.3 or greater and PSSP 3.1.1 or higher.

NOTE:  Both the System Firmware and the SvP Firmware have been combined into a single file to reduce the number of reboots.

NOTE:  This procedure will require two formatted diskettes.

The latest level image is:  nk060315.img

This image is made up of the following components:

  • System Firmware NI06074 (Previously NI05255)
  • Service Processor Firmware nh020426  (Remains The Same)

 


2.0 Revision History
 

Level

Description

nk060315

  • Corrects Nighthawk NIM Install Fails

nk050912

  • Corrects link errors on 4port Ethernet
  • Corrects 20A80001 ping fails during NIM boots
  • Evaluates restart-cmd on forced-reboot
  • Fix for reboot-command to be set correctly
  • Fix for E443 hang when installed with AIX 5.3

nk030505

  • Corrected auto configuration, ping and NIM failures on Ethernet adapters attached to a switch or router with the spanning tree algorithm enabled.
  • Corrected boot failure with checkpoint E1F6 displayed after bootlist is set via AIX and more than 5 devices are specified.
  • Corrected missing devices in SMS bootlist when bootlist set via AIX.
  • Parameter change to limit bootlist entries set via AIX to 5.
  • Corrected boot failure with 'Default Catch' message displayed on console when booting from tape media containing large boot image.
  • Corrected error 20EE000B: unable to find boot record after restore on NIM install on 36GB or larger disks.
  • Corrected 'Default Catch' message displayed on console after SMS 'Change SCSI ID' utility screen is accessed.

nk020426

  • Enhancements to memory diagnostics
  • Adds support for 4-Port 10/100 Ethernet (FC4961, A-E)
  • Fixes boot problems from Dual-Channel Ultra3 SCSI (FC6203, 4-Y)
  • Corrects Local/Remote server NIM / MAC address discrepancies 

nk020110

  • Increased support for 10/100 Ethernet adapter (FC4962, A-F)
  • Added UDF driver to support DVD media type
  • NIM install/boot enhancements
  • Improved FRU isolations 

nk010813

  • Added support for Fiber Channel / Boot RAS enhancements
  • Fix for tty overrun and all reduce MPIC
  • Correction to keep memory bank1 from deconfiguring when adding memory bank3
  • Removed checking for PCI vendor ID and PCI device ID

nk010605

  • Enhancements on FRU isolations for RAS
  • Correction to Ethernet adapter (FC2987 and 2985) error code

nk010206

  • Corrects false Redundant CPU Power Supply warnings (LDC 40110012, SRN A15-120)
  • ENT NIM install problem for auto speed resolved
  • Improved FRU isolation
  • Fixes incorrect FRU location codes
  • NCA5.1 performance tuning changes

nk001020

  • Improved checkstop analysis of 630 Processor errors
  • Improved checkstop analysis for Colony Adapter errors
  • Corrects False Power Supply warnings
  • Addresses CPU redundant Power Supply failures
  • Improved FRU isolation
  • Fixes incorrect FRU location codes
  • Fixes NIM install and miscellaneous errors with Ethernet UTP Adapter (FC2975)
  • Changed L2 Single Bit Error algorithm to include stuck bit detection
  • Resolved EPOW related problems
  • Fix for node not powering on after firmware update
  • Memory update improvements
  • 630+ processor performance improvements
  • Added support for NCA5.1
  • FRU update made possible when upgrading NH1 to NH2

nk000725

  • Improved checkstop analysis for SABER errors on I/O Planar
  • Improved checkstop analysis for 630 Processor errors
  • Improved checkstop analysis for Colony Adapter errors
  • Corrects False Power Supply warnings

nk000622

  • Enhanced L2 Cache single bit error algorithm for improved single bit error analysis and reporting. This fix will reduce CPU card replacements and performance degradation due to L2 ECC single bit errors. This fix is critical to all Power3 SMP High Nodes. Please install at your earliest convenience as this fix is classified as "Strongly Recommended".
  • Enables ICache recovery
  • Enhanced scheme for LIVELOCK breaker
  • Fixes 32 GB memory limit
  • RTAS error enhancement
  • Addresses CPU redundant Power Supply failures
  • Improved FRU isolation
  • Fixes incorrect FRU location codes
  • Special error codes created for NCA port table time-outs
  • Adds SMS menu support for Ethernet UTP adapter (FC2975)

3.0 Related Publications
 

You may have to refer to the following publications during this install:

  • GA22-7448 Power3 SMP High Node Service Guide.
  • GC23-3897 IBM RISC System/6000 Scalable POWERparallel Systems Administration Guide.
  • GC23-3900 IBM RISC System/6000 Scalable POWERparallel Systems Command and Technical Reference.

 


4.0 Cautions and Important Notes
 

  • INSTALLATION OF THIS CODE WILL CAUSE AN UNCONDITIONAL REBOOT OF AIX AND IS NOT CONCURRENT WITH CUSTOMER OPERATIONS
  • Never power off the node during the update process.
  • *ATTENTION* IF SYSTEM FW IS AT NI01019 OR LATER, A SECOND REBOOT IS NOT REQUIRED. IF BELOW THIS LEVEL, PLEASE READ ON.

Full implementation of this level of firmware requires an additional step that is not typically required when upgrading firmware. The installation process will automatically initiate a reboot. Once this reboot has completed, the node must be shut down a second time and must NOT be allowed to reboot automatically (use the shutdown or shutdown -F command).  Once the node powers off, it can be powered back on using Perspectives or command line methods.

Failure to follow the above procedure may result in the occurrence of the following false error symptoms at each reboot:

  • Solid or blinking yellow environment light on the node supervisor card.
  • Error code 40110012 displayed in the LCD panel.
  • SRN A15-120 reported by diagnostics.

If any of the above symptoms appeared after the initial reboot but before the above procedure is executed, they can be ignored.   Do not request service on your machine unless these symptoms return AFTER the above procedure has been completed.

NOTE:  In order to prevent confusion, it is suggested that the AIX errorlog be cleared of any hardware errors.  This will prevent the automatic errorlog analysis feature from reposting old false failure messages.  This can be accomplished by running the following command: "errclear -dH 0"
 

How To Determine Release Date From The File Name

Date and level identifiers for the SvP FW use the 8-digit Gregorian date code method in terms of year, month, and day (such as 20020426 for the nh020426 level). The System FW uses the 5-digit Julian date code method, in which a two-digit year is followed by the day number within that year (such as 03125 for the NI03125 level: the 125th day of 2003, or May 05, 2003).
The Combo FW uses the Gregorian date code.
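
The Julian date code described above can be decoded by hand, but a small shell function makes the arithmetic explicit. The following is only a sketch: the function name julian_level_to_date is hypothetical, and it assumes every System FW level has the form of two letters followed by yyddd and dates from 2000 or later.

```shell
# Hypothetical helper: convert a System FW level (two letters + yyddd)
# to a calendar date.  Assumes the two-digit year means 20yy.
julian_level_to_date() {
    level=$1
    digits=${level#??}      # drop the two-letter prefix, e.g. "06074"
    yy=${digits%???}        # first two digits: year, e.g. "06"
    ddd=${digits#??}        # last three digits: day of year, e.g. "074"

    # Strip leading zeros so the shell does not read the values as octal.
    yy=${yy#0}
    ddd=${ddd#0}
    ddd=${ddd#0}

    year=$((2000 + yy))

    # February has 29 days in a leap year.
    feb=28
    if [ $((year % 4)) -eq 0 ] && { [ $((year % 100)) -ne 0 ] || [ $((year % 400)) -eq 0 ]; }; then
        feb=29
    fi

    # Walk through the months, subtracting each month's length until the
    # remaining day count fits inside the current month.
    month=1
    for len in 31 $feb 31 30 31 30 31 31 30 31 30 31; do
        if [ "$ddd" -le "$len" ]; then
            break
        fi
        ddd=$((ddd - len))
        month=$((month + 1))
    done

    printf '%04d-%02d-%02d\n' "$year" "$month" "$ddd"
}

julian_level_to_date NI03125    # prints 2003-05-05
julian_level_to_date NI06074    # prints 2006-03-15
```

The second call shows that System Firmware NI06074 decodes to the same date as the combination level nk060315.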
 

The SvP and System Firmware need to be at companion levels

When an I/O planar or a Service Processor (SvP) card is replaced, the System firmware on the new I/O planar and the SvP firmware on the new SvP card must be checked to ensure that they are at the latest level, or that they are at companion levels. If they are not, it is recommended that both the System Firmware and SvP Firmware be "re-flashed" to the latest level or at the very least to the correct companion level. Companion levels are those files that have the same release date as indicated in the table below.

Installation Time

The typical time to install this microcode on one node is 1.0 hour. This firmware can be installed on more than one node at a time, but the total install time will vary depending on how many nodes and frames are involved in the upgrade. The microcode does not become active when installed; a system reboot is required to activate it. The reboot time will vary depending on the system and the number of features installed.


5.0 Downloading and Unpacking the Firmware Update Package
Instructions for downloading and unpacking the firmware update package follow.

5.1 Internet Package
The firmware, in AIX and DOS formats, is located at the web site

          download.html

Follow the instructions on this web page. You must read and agree to the license agreement to obtain the password (case sensitive) for unpacking the firmware packages.

In the table for System Microcode, scroll down to the entry for 9076 POWER3 SMP High Node.

The download choices at that entry are:

  •  Description (Instructions document)
  •  AIX Format (For downloading to an AIX server or system)
  •  DOS Format (For downloading to a DOS or Windows workstation)

You will want a copy of the description (instructions document) and one of the format choices. You may transfer files to the target system in one of several ways.

  • By downloading files directly to the target system.
  • By downloading files to an intermediate AIX system and then using either ftp or diskettes for transferring to the target server.
  • By downloading files to a DOS or Windows workstation and then using diskettes for transferring to the target server.

Detailed download/unpacking instructions follow for each of the downloading preferences.

  •  If using an AIX system for downloading, continue to paragraph 5.1.1.
  •  If using a DOS or Windows workstation for downloading, skip to paragraph 5.1.2.

5.1.1 Downloading the AIX Format File
Use this method to download to an AIX system.
 

Note: The instructions that follow contain specific AIX commands.
           AIX commands are CASE (lower and upper) SENSITIVE, and must be entered exactly as
           shown, including the filenames.
 
 
 

  a) Provide a directory on an AIX system to receive the AIX format file.

      Enter:
        mkdir /tmp/fwupdate

         Note: If the directory /tmp/fwupdate already exists,
           make sure it is empty before proceeding.

  b) Transfer the AIX format file to the /tmp/fwupdate directory (using "Save as ...").
       You'll see that the filename is 90768XF.BIN

  c) Unpack the file by executing the instructions below.
       You will need the password from the license agreement.

Enter the commands:

        cd /tmp/fwupdate
        chmod +x 90768XF.BIN
        ./90768XF.BIN

     [Don't overlook the periods (.) in the above command.]

These files will be added to /tmp/fwupdate:

      nk060315.img
      ReadMe.TXT

If you used the above procedure to transfer the AIX format file directly to the target server, proceed to Section 7.0, Validating the Checksum.

Otherwise, from the intermediate AIX system, choose one of the following methods for transferring files to the target server.

To transfer files to the target server via the ftp method, continue to paragraph 5.1.1.1.
To transfer files to the target server via the diskettes method, skip to paragraph 5.1.1.2.

5.1.1.1 The FTP Transfer Method
This method presumes you have ftp access to the target server.

On the intermediate AIX system,

   Enter the commands:
      ftp {name of target server}
      {Login with a valid userid and password}

      bin
      lcd /tmp/fwupdate
      mkdir /tmp/fwupdate
      cd /tmp/fwupdate
      put nk060315.img
      quit

Proceed to Section 7.0, Validating the Checksum.

5.1.1.2 The Diskette Transfer Method

This method can be used for cases in which electronic connections between the intermediate AIX system and the target server are inconvenient.

 Two 2MB (HD) new or freshly formatted diskettes are required.

 With a diskette loaded in the drive,
  Enter the commands (this process will request additional diskettes as each is filled):

       cd /tmp/fwupdate
       ls *.img | backup -i -v -f/dev/rfd0

This will produce AIX backup diskettes.  Label these diskettes, respectively,

     "Volume 1:  FW (sxd1060315) for 9076-N8X"
     "Volume 2:  FW (sxd2060315) for 9076-N8X"

Proceed to Section 6.0, Putting the Firmware on the System.

5.1.2 Downloading the DOS Format File
Use this method to download to a DOS or Windows workstation.

  a) Prepare a directory for receiving the DOS format file.
      This directory can be in any partition with 12MB available space.
      Executing in such a partition, called [path] in these instructions
      (ex. c:\download),

      Enter:
        md [path]\fwupdate

      Note:  If the directory [path]\fwupdate already exists,
             make sure it is empty before proceeding.

  b) Transfer the DOS format file to the [path]\fwupdate directory (using "Save as ...").
      You'll see the filename is 90768XF.EXE

  c) Unpack the file by executing the instructions below.
      You will need the password from the license agreement.

      Enter the commands:
        cd [path]\fwupdate
        90768XF

These files will be added to the fwupdate subdirectory:

       NKD120~1.EXE
       NKD220~1.EXE
       readme.txt

5.1.2.1 Diskettes for Firmware Updates

Two 2MB (HD) new or freshly formatted DOS diskettes are required.

  Note: The diskettes produced below will be in a format that can be used
             directly with a computer running AIX as its operating system. These
             diskettes, once made on a PC platform, cannot be read using normal PC
             platform tools or command line operations.

  a) With a diskette loaded in the drive,

      Enter the commands:
        cd [path]\fwupdate
        NKD120~1

      Label this diskette,
        "Volume 1: FW (sxd1060315) for 9076-N8X"

  b) With a second diskette loaded in the drive,

      Enter:
        NKD220~1

      Label this diskette,
        "Volume 2: FW (sxd2060315) for 9076-N8X"
 
 



 

6.0 Putting the Firmware on the System

Log in as root on the 9076 Control Workstation (CWS) that you are going to use to apply this firmware, and insert the first diskette into the diskette drive (rfd0) of the CWS.

Enter the commands:

mkdir /tmp/fwupdate
cd /tmp/fwupdate
restore

When prompted, insert the second diskette into the diskette reader.

This will put the firmware file into the /tmp/fwupdate directory.
 



7.0 Validating the Checksum

Checksums can be used to verify the FW files you received have not been corrupted or altered during transmission. To calculate the checksum, enter:

sum [FW filename]

The checksum is the first number of the output.  Compare the number you get with the table.  If the numbers match then you can be sure the firmware files are not corrupted.
 
 

 

COMBINATION FIRMWARE

RELEASE DATE    Filename        Chksum
--------------------------------------
03/15/06        nk060315.img    15829
09/12/05        nk050912.img    37888
09/09/03        nk030505.img    34850
07/17/02        nk020426.img    00151
03/18/02        nk020110.img    03203
11/05/01        nk010813.img    32847
08/08/01        nk010605.img    59849
03/27/01        nk010206.img    21907
11/06/00        nk001020.img    31542
08/30/00        nk000725.img    07688
07/19/00        nk000622.img    44892
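
The comparison against the table can also be scripted. The following is a minimal sketch, assuming that the first field of the "sum" output is the checksum as described above; the function name verify_fw_checksum is hypothetical. The table values were presumably produced with the AIX "sum" command, and "sum" on other platforms may default to a different algorithm, so run the check on an AIX system.

```shell
# Hypothetical helper: compare the checksum of a firmware file with the
# value listed in the table.  Usage: verify_fw_checksum FILE EXPECTED
verify_fw_checksum() {
    file=$1
    expected=$2

    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 2
    fi

    # The checksum is the first number printed by sum.
    actual=$(sum "$file" | awk '{print $1}')

    # Compare numerically so that leading zeros (e.g. 00151) do not matter.
    if [ "$actual" -eq "$expected" ]; then
        echo "OK: $file checksum $actual"
    else
        echo "MISMATCH: $file has checksum $actual, expected $expected" >&2
        return 1
    fi
}
```

For example, "verify_fw_checksum /tmp/fwupdate/nk060315.img 15829" would report OK only if the file matches the table entry for the 03/15/06 release.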


8.0 Setting up the Distributed Shell

On an SP, you need to perform on each node what would normally be done on a standalone RS/6000 server.  To install the firmware you must distribute the code to all of the applicable nodes, then install it there.  You must be logged on as root to properly do the install.

All the following commands shown will be run from a window on the CWS. It should not be necessary to telnet or rlogin to the individual SP nodes. You should be familiar with the "dsh" and "pcp" commands. If not, refer to the (GC23-3897) IBM Parallel System Support Program for AIX: Administration Guide or (GC23-3900) IBM Parallel System Support Program for AIX: Command and Technical Reference.

  1. Verify all the target nodes are powered ON and quiesced (i.e., no user applications or activity should be allowed).
  2. Create a list of the nodes on which to install the code.  To do this, type the following command:

  splstdata -G -n

You should see something similar to the following:

                 List Node Configuration Information

node#   frame#   slot#   slots    initial_hostname    reliable_hostname    dcehostname
      default_route     processor_type   processors_installed   description
------------------------------------------------------------------------------
    1        1       1       2    vion01.pok.ibm     vion01.pok.ibm    ""
        9.114.201.190               MP                      1   Power3_SMP_High
    3        1       3       2    vion03.pok.ibm     vion03.pok.ibm    ""
        9.114.201.190               MP                      2   Power3_SMP_High

The output will include every node in your system.  For each node that has Power3_SMP in its description, record the first part of the initial_hostname field.  In this case, since the initial_hostname is vion01.pok.ibm, you would record vion01.  This will become your list of nodes to be installed with the code.

  3. Next, prepare a Working Collective file to permit you to use the "dsh" and "pcp" commands to install all of the applicable nodes in parallel.  Put the list of nodenames you just recorded into a file, one nodename per line, and export it by entering the following commands:

            cat > /tmp/fwupdate/group1
            [nodename1]
            [nodename2]
            [nodename3]
            .
            .
            press Ctrl D
            export WCOLL=/tmp/fwupdate/group1

  4. Test this file to ensure it is working the way you expect by typing:

  dsh date

You should get back a list that looks something like:

            nodename1: Wed Apr 10 10:37:46 EDT 1996
            nodename2: Wed Apr 10 10:37:46 EDT 1996
            nodename3: Wed Apr 10 10:37:47 EDT 1996
            nodename4: Wed Apr 10 10:37:48 EDT 1996

If this output does not contain all the nodes that you expected, examine your node list file /tmp/fwupdate/group1 and also ensure that the Kerberos ticket is current to permit "dsh" to be performed. You may need to refresh the Kerberos ticket. Refer to the SP Administration Guide (GC23-3897) for further help. The System Administrator should also be able to help you with Kerberos initialization. Otherwise, consult your support center.
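
On systems with many nodes, the node list can be extracted from the "splstdata -G -n" output instead of being typed by hand. The following is a sketch only: the function name power3_nodes is hypothetical, and it assumes the output has exactly the two-line-per-node layout shown in the sample in this section (the hostname line first, the description line second).

```shell
# Hypothetical helper: read "splstdata -G -n" style output on standard
# input and print the short hostname of every Power3 SMP node.
power3_nodes() {
    awk '
        # First line of a node record starts with the numeric node#;
        # the fifth field is the initial_hostname.
        $1 ~ /^[0-9]+$/ && NF >= 6 { host = $5; next }

        # Second line of the record ends with the description field.
        host != "" && $NF ~ /Power3_SMP/ {
            split(host, parts, ".")
            print parts[1]          # first part of initial_hostname
        }

        # Any other line (headers, rules, the second line of a record)
        # ends the current record.
        { host = "" }
    '
}
```

A possible use, after checking the result by eye, is "splstdata -G -n | power3_nodes > /tmp/fwupdate/group1".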
 
 


9.0 Determining Your Current Firmware Levels

To check the System Firmware level enter:

          dsh "lscfg -vp | grep -p 'System Firmware'"

This command will produce a system configuration report containing sections similar to the following for each node:

            System Firmware:
            ROM Level (alterable).......NIxxxxx<==System FW level
            Version.....................RS6K
            System Info Specific.(YL)...U2.1-P2

The ROM Level (alterable) line lists the level of your currently installed System Firmware.  If the five rightmost characters of the System Firmware level indicate that you are not at the latest level, you MUST install the update.
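
Because the five rightmost characters are a two-digit year followed by a three-digit day number, two System Firmware levels can be compared as plain numbers. The following is a sketch under that assumption; the function name system_fw_is_older is hypothetical, and the comparison is only valid while all levels date from 2000 or later.

```shell
# Hypothetical helper: exit 0 when the installed System FW level is older
# than the latest level.  Both arguments must look like NIyyddd.
system_fw_is_older() {
    installed=${1#??}   # drop the two-letter prefix, e.g. "05255"
    latest=${2#??}      # e.g. "06074"
    # expr compares the two values as decimal integers and exits 0
    # when the comparison is true.
    expr "$installed" \< "$latest" >/dev/null
}

if system_fw_is_older NI05255 NI06074; then
    echo "update required"
fi
```

Here NI05255 is the previous System Firmware level and NI06074 is the level contained in nk060315.img, so the example prints "update required".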
 

To check the Service Processor Firmware level enter:

          dsh "lscfg -vp | grep -p 'Service Processor'"

This command will produce a system configuration report containing sections similar to the following for each node:

            Service Processor:
            ROM Level.(non-alterable)...nhxxxxxx
            ROM Level.(alterable).......nh020426<==Service Processor FW level
            System Info Specific.(YL)...U2.1-P2

The ROM Level (alterable) line lists the level of your currently installed Service Processor Firmware.  If the six rightmost characters of the Service Processor Firmware level indicate that you are not at the latest level, you MUST install the update.

 



10.0 Updating the Firmware on the Target Nodes

This section describes the method for transferring the new firmware into the target nodes. Each flash update should complete within one minute.

WARNING:  DO NOT POWER OFF THE TARGET SERVER AT ANY TIME BEFORE THE FLASH PROCESS COMPLETES.

WARNING:  BE SURE THE SYSTEM IS QUIESCED AND NOT RUNNING ANY USER APPLICATIONS.
 

  1. Use "pcp" (parallel copy) to copy the firmware file to each of the target nodes (defined by the Working Collective from above).

            hostlist | pcp -w - /tmp/fwupdate/[firmware filename]  /var

         NOTE:  You must copy the firmware image to the /var directory. No other directory will work.

  2. You may wish to verify that the firmware copied over correctly. Run the following command and verify that the firmware file is found on each target node.

            dsh "ls /var/*.img"

  3. You must have root authority on the CWS to update its nodes. This method uses the AIX "shutdown" command, and it assumes the firmware update image is located in the /var directory according to the instructions in this document.  To help ensure file system integrity enter:

            dsh "sync;sync;sync;sync"

  4. Update the firmware:

            dsh "/usr/sbin/shutdown -Fu /var/[firmware filename]"

The node will power down and reboot to the AIX login prompt.

If one of the following errors occurs, go to section 4.0 Cautions and Important Notes.

    • Solid or blinking yellow environment light on the node supervisor card.
    • Error code 40110012 displayed in the LCD panel.
    • SRN A15-120 reported by diagnostics.

  5. Remove all old .img files from the /var directory.

            dsh "rm /var/*.img"

  6. Remove all temp files.

            rm -r /tmp/fwupdate

Verify that everything was installed properly.  Use section 9.0 Determining Your Current Firmware Levels to do this.
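
Before starting, it can help to print the exact command sequence for your firmware filename and review it. The following is a sketch of such a dry run: the function name fw_update_plan is hypothetical, it only prints the commands from this section in order, and it does not execute any of them.

```shell
# Hypothetical helper: print (do NOT run) the section 10.0 command
# sequence for one firmware image filename.
fw_update_plan() {
    img=$1

    # Only accept a filename that ends in .img, as used in this document.
    case $img in
        *.img) ;;
        *) echo "expected a .img filename, got: $img" >&2; return 1 ;;
    esac

    # The last two commands are cleanup steps to run only after the
    # nodes have rebooted to the AIX login prompt.
    cat <<EOF
hostlist | pcp -w - /tmp/fwupdate/$img /var
dsh "ls /var/*.img"
dsh "sync;sync;sync;sync"
dsh "/usr/sbin/shutdown -Fu /var/$img"
dsh "rm /var/*.img"
rm -r /tmp/fwupdate
EOF
}

fw_update_plan nk060315.img
```

Each printed line can then be entered by hand from the CWS, in order, once the Working Collective from section 8.0 has been verified.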