pSeries 655 Firmware Update

Applies to:  pSeries 655 Model 651 (7039-651)

This document describes the installation of Licensed Machine Code, which is sometimes referred to generically as microcode or firmware.



 1.0  Systems Affected

This update provides firmware (FW) for pSeries 655 Model 651 (7039-651) Servers only.  Do not use on any other systems.

The firmware level contained in this update is 3J050715.

Before installing this level of firmware, see Section 3.0 Cautions and Important Notes.

ATTENTION:  Due to significant fixes in the last few firmware releases to
                             improve system availability, firmware level 3J040326 or higher
                             is being made mandatory for all 7039 systems.


2.0  Firmware Description and Revision History

Table 2.1 lists the levels and descriptions for System firmware.
 
Table 2.1:  System Firmware Update Descriptions and History
3J050715 IMPACT:   Serviceability     SEVERITY:   Special Attention
  • Fixes a problem that causes the system to crash when AIX tries a reset on a slot which is already in the reset state.
  • Added support for Address Resolution Protocol (ARP).
  • A problem was fixed that was causing the firmware to report an error on an I/O slot when no I/O adapter card was present.
  • A problem was fixed that was causing the PCI bus to report an error when a PCI adapter with a bridge on it was hot-plugged.
  • An invalid error code callout for clock errors was fixed.
  • Various fixes and updates to the high performance switch interface code.
3J050405 IMPACT:   Serviceability     SEVERITY:   Special Attention
  • Improved isolation for Remote I/O (RIO) loop failures.
  • Corrects system Time of Day gains observed after reboot.
3J050215 IMPACT:   Function      SEVERITY:   Special Attention
  • Change made to collect hypervisor dump as part of snap command.
  • Change made to mask interrupts for a specific slot if a problem is encountered with an I/O adapter.
  • Resolves an MCM logic condition such that an unwarranted checkstop is prevented.
  • Resolves a problem that prevented the hard reset of a partition from working every time.
  • Corrects a problem that caused incorrect tracking of CUoD activation time.
  • Corrects a problem that resulted in ON/OFF CUoD processors suddenly being deactivated.
  • Resolves the problem that caused informational firmware events (B1008FF0) to be posted after an ON/OFF event.
  • Resolves an issue with the firmware reset handler in order to prevent an LPAR hang condition.
  • Includes additional High Performance Switch enhancements.
3J041029 IMPACT:   Function      SEVERITY:   Special Attention
  • Added support for network boot when target server provides preferred gateway information in bootp reply.
  • Added support for PCI adapters that require additional memory I/O space (Future OEM vendor requirement).
  • Corrects network boot problems with IBM Gigabit Ethernet adapters (FC 2969).
  • Resolves issue where On/Off CUoD reclaims more processors than it should if system is allowed to go out of compliance.
  • Resolves issue where On/Off CUoD posts change event messages when ON/OFF resources are no longer in use.
3J041021 IMPACT:  Function       SEVERITY:  Special Attention
  • Resolves issue of excessive length of time required to collect AIX dump for partitions with Service Authority assigned.
  • Corrects potential cause for system hang with error 40670EA1 on systems with Linux Operating System.
  • Corrects loss of one hour of resource time when activating On/Off CUoD resources.
  • Includes additional High Performance Switch enhancements.
3J040901 IMPACT:  Performance     SEVERITY:  Attention
  • Corrects a condition where, after replacing an I/O planar (PCI-X), the attached node could hang at an E556 progression code.
  • Includes additional High Performance Switch enhancements.
3J040602 IMPACT:  Performance     SEVERITY:  Attention
  • Performance enhancements for High Performance Switch.
  • Prevents informational checkpoints E900 and E901 from being displayed on the operator panel during runtime for extended periods.
  • Increased HMC surveillance timeout value to prevent false 'loss of communication' errors.
  • Prevents false mp_fatal errors from being reported.
  • Corrects a condition where a system will pause at B00E during an IPL for an extended period on systems with greater than 32GB of memory.
  • Corrects the VPD FRU number formatting reported to AIX for L3 memory modules.
  • Corrects a condition where a bit may be erroneously set causing a system crash due to a GX Data Timeout, while being reported to diags as SRN 651-900 and posting a Firmware error code of 40670EA4 (Bus test detected fault in service processor code).
3J040528 IMPACT: NA   SEVERITY:  NA
  • Added support for AIX 5.3.
  • Changes made for manufacturing enhancements.
3J040503  IMPACT:  Function   SEVERITY: Attention
  • Adds support for using previously released memory books with the higher speed MCMs.
  • Corrects a condition where the failure of a component in an I/O (7040-61D) drawer DCA-BC posts a critical overcurrent fault causing  the I/O drawer to power off and the associated partitions to crash with an SRN of A03-041 and a firmware error code of 20D10011.
  • Corrects a condition that caused a High Performance Switch (7045-SW4) adapter to be uninitialized in some cases.  This led to erroneous mp_fatal errors and could crash the system.
3J040326  IMPACT:  Function   SEVERITY: Attention
  • Corrects a problem where 20D009xx errors were not being reported to the operating system during a reboot.
  • Corrects an initialization setting in the service processor firmware that caused invalid FAST(43) bit errors to be reported.
  • Prevents false B1314633 (with a word 13 value of A716xxxx), "Power Subsystem code update has been interrupted" errors from being reported.
  • Corrects a condition where a soft reset of a logical partition (LPAR) could cause all logical partitions to crash.
  • Added a FRU callout for SRN A03-150,  "I/O Expansion Bus Connection Failure".
  • Corrects a condition that results in the AIX location code being incorrect after resources have been re-assigned during a dynamic LPAR operation.
  • Added support for system boot from tape devices containing boot images larger than 12MB.  Must be used in conjunction with AIX 5.1 APAR IY57522 or AIX 5.2 APAR IY56839.  To create a boot image larger than 12MB  on tape media, you must first apply the appropriate APAR.  The firmware must be at 3J040326 (or later) before the image can be restored.
  • Adjustments made to PLL bit for 1.5 GHz, 1.7 GHz and 1.9 GHz MCMs to improve reliability.
3J040214
  • Provides RAS enhancements.
3J040130
  • Added support for 1.7GHz  (8-way) MCMs.
  • Corrects location code designation for the Service Processor battery FRU.
3J031118
  • Corrects power error after a PLD event indicated by 10118731, 10118711 and potential B00E hangs.
  • Corrects a condition where a soft reset of a LPAR could cause a system crash.
  • Addresses miscellaneous High Performance Switch tuning issues.
3J031024
  • Added support for pSeries High Performance Switch (7045-SW4) including the applicable adapters FC 6432 and FC 6434.
  • Corrects AIX location code mismatch when PCI-X I/O backplane (FC 6571) is being installed.
3J031007
  • Manufacturing only.
3J030725
  • Added support for 1.5 GHz (8-way) and 1.7 GHz  (4-way)  MCMs.
  • Updated the Service Processor Main Menu to include the "Start SPCN Flash Update" function.
  • Corrects a threshold setting for L3 uncorrectable errors so as to ensure the correct FRU is called out.
3J030718
  • Prevents an erroneous error code (25B00004) from being reported for a repeat gard error.
3J030521 NOTE:  This release contains a significant number of improvements and changes which are too numerous to document individually.  Only those items of common interest or high impact are documented.  Customers are strongly encouraged to install this release.
  • Reliability enhancements for adapter recovery during an EEH event.
  • Corrects configuration problems with IBM 3581 Ultrium Tape Autoloader. 
  • Resolved incorrect PCI adapter AIX location codes when adapters are added or reassigned via a DLPAR operation. 
  • Added details to error description for exceeding plug count errors.
  • Corrects graphical representation of MCM/L3 interposer plug count menu. 
  • Corrects problem in NVRAM which reported a 20EE000B when rebooting after upgrading firmware.
  • Resolved incorrect identify and power LED behavior during PCI adapter hot plug operations.
  • Corrects a microcode download failure on Fibre Channel Adapter F/C 6228.
  • Corrects 4-way affinity partitions (ALPAR) failure to boot.
RJ030206
  • Corrects system hang during boot with bad date/time stamp in AIX banner and 'default catch' message displayed on the console.
  • Provides potential performance enhancement for SP Switch2 PCI Attachment Adapters, Feature Code 8397.
  • Changed initialization routines for high performance PCI adapters to allow best use of available bandwidth.
  • Corrects false L3 Cache error and deconfiguration during system initialization.
  • Corrects failure to recognize PCI adapters equipped with PCI-X to PCI-X bridge chips during system boot.
  • Corrects potential cause for system hang with error B1114699 during firmware flash update.
  • Corrects problem with tape devices not appearing in the SMS 'Select Boot Options' menu.
  • Corrects behavior where the system will automatically boot from tape device when SMS 'List All Devices' option is chosen.
  • Contains potential minor performance enhancement for PCI adapters internal to the system unit.
  • Prevents logging of false error condition B1xx8FF0 in Service Processor and AIX error logs during firmware flash update.
  • Corrects false error condition reported by RIO hub card (error LED).
  • Resolves incorrect drawer location codes posted in Service Processor and AIX error logs.
RJ021209
  • Original (GA) level.


3.0  Cautions and Important Notes

The System, Service Processor (SvP) and System Power Control Network (SPCN) firmware are combined into a single file. This allows all the firmware to be updated together and assures they are compatible.

Before Beginning the Update

Go to mcodetools.html to determine if the tools can assist you with this update.

Before installing this level of firmware, ensure the Hardware Management Console (HMC) code on all HMC systems is Release 3, Version 1.2 or later.  If a High Performance Switch (7045-SW4) is installed, the HMC code on all HMC systems must be Release 3, Version 2.6 or later.  If you will be updating hardware, you will need to refer to the specific MES document to identify the required level of HMC code.

Updating the firmware may result in the HMC going into 'Recovery' state.  Before updating the firmware, make sure the backup of Profile Data is complete (if running LPAR).

Linux

To update firmware on a Linux system, you must first download and install the following service tools on your server: Platform Enablement Library, Service Aids, and Hardware Inventory. To obtain these service tools, go to https://techsupport.services.ibm.com/server/lopdiags and follow the instructions on this web site for downloading and installing the service tools. To update firmware from a Linux partition, you must download the Linux commands found on the same web site.
 

Firmware Update Installation Is Not Concurrent

Installation of the firmware will cause an unconditional reboot of the system.  Therefore, all user operations should be gracefully terminated before firmware updates are to be applied.

Never Power Off the System during the Firmware Update Process

The update will fail, and the process must be repeated.

AIX and Linux Instructions are CASE SENSITIVE

The instructions that follow contain specific AIX, Linux and DOS commands.  AIX and Linux commands are CASE (lower and upper) SENSITIVE, and must be entered exactly as shown, including the filenames.  DOS commands are not case sensitive, and may be entered without regard to the cases shown.

Replacement Parts May Require Updating

When the service processor card is replaced, the firmware must be checked to ensure it is at the latest level. Table 3.1 lists the released levels of firmware.
 
Table 3.1: Firmware Levels, File Sizes and Checksums

Firmware Distribution Date         Filename      Size     Checksum
---------------------------------  ------------  -------  --------
August 2005                        3J050715.img  6478269  52967
April 2005                         3J050405.img  6478365  48131
February 2005                      3J050215.img  6405130  30482
November 2004                      3J041029.img  6473901  23732
October 2004                       3J041021.img  6462633  02642
September 2004                     3J040901.img  6463733  07427
June 2004                          3J040602.img  6446313  49881
June 2004                          3J040528.img  6434389  47659
May 2004                           3J040503.img  6434397  20758
April 2004                         3J040326.img  6432065  10213
March 2004                         3J040214.img  6429537  37431
February 2004                      3J040130.img  6427913  28251
January 2004                       3J031118.img  6382477  59197
October 2003                       3J031024.img  6377945  62068
October 2003 (manufacturing only)  3J031007.img  n/a      n/a
August 2003                        3J030725.img  6290133  04872
July 2003 (manufacturing only)     3J030718.img  n/a      n/a
May 2003                           3J030521.img  6241777  29473
February 2003                      RJ030206.img  5876645  62948
December 2002                      RJ021209.img  5847125  03614


4.0  How to Determine Currently Installed Firmware Levels

The firmware level can be checked in AIX, Linux or in the Service Processor Main Menu.

4.1  Using AIX to Read Currently Installed Firmware Levels

Use the following AIX command for checking the firmware level.

    Enter:
        lscfg  -vp |  grep -p Platform

  This command will produce a system configuration report similar to the following.

     Platform Firmware:
          ROM Level.(alterable).......3J040326
          Version.....................RS6K
          System Info Specific.(YL)...U1.18-P1-H2/Y2
        Physical Location: U1.18-P1-H2/Y2

The ROM Level line lists the level of the currently installed firmware.  In the above example, the current firmware level is 3J040326.

If the right-most six characters (date) of the current firmware level are earlier than 050715, you should consider installing the update.

If you find the firmware must be updated, proceed to Section 5.0.   If the firmware level is correct and no update is needed, installation is complete.
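
As a rough sketch only, the ROM Level value can be pulled out of the report with a small shell function. The function name rom_level is hypothetical and not part of AIX; it assumes the report has the same layout as the example above and that it is supplied on standard input.

```shell
# rom_level: read lscfg-style text on standard input and print the value
# found on the first "ROM Level" line. Hypothetical helper, not an AIX command.
rom_level() {
    # Fields are separated by runs of dots, so the level is the last field.
    awk -F'[.]+' '!found && /ROM Level/ { print $NF; found = 1 }'
}
```

On AIX this could be used as: lscfg -vp | grep -p Platform | rom_level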

4.2  Using the Service Processor Main Menu

The second line of the title, Version: 3J040326, shows the currently installed firmware level.

If the right-most six characters (date) of the current firmware level are earlier than 050715, you should consider installing the update.

If you find the firmware must be updated, proceed to Section 5.0.  If the firmware level is correct and no update is needed, installation is complete.
 

4.3 Using Linux to Read Currently Installed Firmware Levels

Use the following Linux command for checking the firmware level.

         Enter:
            /sbin/lscfg -vp | grep -A 1  Platform

     This command will produce a system configuration report similar to the following.

     Platform Firmware:
          ROM Level.(alterable).......3J040326

The ROM Level line lists the level of the currently installed firmware. In the above example, the current firmware level is 3J040326.

If the right-most six characters (date) of the current firmware level are earlier than 050715, you should consider installing the update.

If you find the firmware must be updated, proceed to Section 5.0.  If the firmware level is correct and no update is needed, installation is complete.
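
The date comparison described above can be sketched in shell as follows. This is an illustration under stated assumptions, not an IBM-supplied tool: the function name fw_is_older is hypothetical, and the right-most six characters are assumed to be a YYMMDD date from the same century, which holds for every level listed in this document.

```shell
# fw_is_older INSTALLED [TARGET]
# Succeeds (status 0) when the date part of INSTALLED is earlier than the
# date part of TARGET. TARGET defaults to 3J050715, the level in this update.
# Returns 2 when either level does not end in six digits.
fw_is_older() {
    installed=$1
    target=${2:-3J050715}
    # Keep only the right-most six characters of each level.
    inst_date=${installed#"${installed%??????}"}
    tgt_date=${target#"${target%??????}"}
    case "$inst_date$tgt_date" in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) return 2 ;;
    esac
    # Prefix a 1 so that leading zeros are never read as octal.
    [ "1$inst_date" -lt "1$tgt_date" ]
}
```

For example, fw_is_older 3J040326 succeeds, which indicates that the update should be considered.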


5.0  Downloading and Unpacking the Firmware Update Package

 Instructions for downloading and unpacking the firmware update package follow.

5.1  Downloading from the Microcode Update Files and Discovery Tool CD

Follow the instructions that come with the Microcode Update Files and Discovery Tool CD.  Information about severity and impact is also available.  The file you download from the CD, 3J050715.rpm, is placed in the /tmp/microcode/RPM directory.  You will need to move the 3J050715.rpm file to the /tmp/fwupdate directory.

      Enter:
        mkdir /tmp/fwupdate

        Note:  If the directory /tmp/fwupdate already exists,
               make sure it is empty before proceeding.

        cd /tmp/microcode/RPM
        mv 3J050715.rpm /tmp/fwupdate
        cd /tmp/fwupdate
        rpm -ihv --ignoreos 3J050715.rpm

The 3J050715.img file will be added to /tmp/fwupdate.

The file size and checksum will be verified.

If you are installing manually:

      Enter:
        mkdir /tmp/fwupdate

        Note:  If the directory /tmp/fwupdate already exists,
                    make sure it is empty before proceeding.

Copy the 3J050715.img file to the /tmp/fwupdate directory.
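
The note about an existing /tmp/fwupdate directory can be checked before any file is copied. The following is a minimal sketch, assuming a POSIX shell; the function name prepare_stage_dir is hypothetical.

```shell
# prepare_stage_dir [DIR]
# Creates DIR (default /tmp/fwupdate) when it does not exist.
# When DIR already exists, succeeds only if it is empty.
prepare_stage_dir() {
    dir=${1:-/tmp/fwupdate}
    if [ -d "$dir" ]; then
        # ls -A lists every entry except . and ..
        if [ -n "$(ls -A "$dir")" ]; then
            echo "$dir exists and is not empty" >&2
            return 1
        fi
    else
        mkdir "$dir" || return 1
    fi
}
```

The function deliberately does not delete anything; emptying an existing directory is left to the administrator.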

 Proceed to Section 6.0 Updating the Firmware.

5.2 Remote Installation of Firmware

 To install firmware on a remote system, log in to the remote system as root.  Copy the file 3J050715.img (in binary format) to the /tmp/fwupdate directory on the remote system.  Proceed to Section 6.0 Updating the Firmware.

6.0  Updating the Firmware

The System, Service Processor (SvP) and System Power Control Network (SPCN) firmware are combined into a single file. This allows all the firmware to be updated together and assures they are compatible.

Once the System and Service Processor firmware has been updated, the server will reboot.  The System Power Control Network (SPCN) update will continue to run in the background.

***
WARNING:

Do not power off the target server at any time before the update process completes.  Be sure the system is NOT running any user applications when you begin the update process.
***

Note:  Checksums can be used to verify files have not been corrupted or altered during transmission.

           At the AIX command line, enter:
           sum  3J050715.img

           The output will look like this ------->  52967  6327    3J050715.img
            The checksum is ---------------> 52967
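
The checksum comparison can also be scripted. The sketch below is an illustration only: the function name verify_image is hypothetical, and it assumes that the Size column of Table 3.1 is a byte count and that sum prints the checksum as its first field, as in the example output above.

```shell
# verify_image FILE CHECKSUM SIZE
# Succeeds when FILE has the given sum checksum and the given size in bytes.
# Values are compared as text, so enter them exactly as listed in Table 3.1.
# Returns 2 when FILE cannot be read.
verify_image() {
    file=$1
    want_sum=$2
    want_size=$3
    [ -r "$file" ] || return 2
    have_sum=$(sum "$file" | awk '{ print $1 }')
    have_size=$(wc -c < "$file" | tr -d ' ')
    [ "$have_sum" = "$want_sum" ] && [ "$have_size" = "$want_size" ]
}
```

For this update the call would be: verify_image /tmp/fwupdate/3J050715.img 52967 6478269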

6.1 Full System Partition

Updating firmware must be initiated either from the AIX command line, or from the Update Flash Diagnostic Service Aids.

6.1.1  Using the AIX Command-Line Method

Before installing this level of firmware, see Section 3.0 Cautions and Important Notes.

You must have root authority on the target server to update its firmware.  Because the update process will cause an automatic reboot, be sure the system is not running any user applications.

With the files located in the /tmp/fwupdate subdirectory,

  Enter the commands:

        cd  /usr/lpp/diagnostics/bin
        ./update_flash  -f  /tmp/fwupdate/3J050715.img

      [Don't overlook the periods (.) in the above command.]
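
Because the update cannot be interrupted once it starts, it may help to confirm the image path before invoking update_flash. The following dry-run sketch only prints the command and never runs it. The function name flash_command is hypothetical; the two update_flash paths are the ones given in this document for AIX and Linux.

```shell
# flash_command OS IMAGE
# Prints the update_flash command line for OS (aix or linux) without running it.
# Fails when IMAGE is missing or empty, or when OS is not recognized.
flash_command() {
    os=$1
    image=$2
    if [ ! -s "$image" ]; then
        echo "image not found or empty: $image" >&2
        return 1
    fi
    case "$os" in
        aix)   echo "/usr/lpp/diagnostics/bin/update_flash -f $image" ;;
        linux) echo "/usr/sbin/update_flash -f $image" ;;
        *)     echo "unknown operating system: $os" >&2
               return 2 ;;
    esac
}
```

Running the printed command as root starts the update and the automatic reboot described in this section.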

You will be asked for confirmation to proceed with the firmware update and the required reboot.  If you confirm, the server automatically performs the update and reboots.  The checkpoints 99FF and 99FD alternately appear while the update is in progress. This may take up to thirty minutes, depending on the configuration of the target server.  Since the update occurs during this shutdown/reboot sequence, it is important to protect the server from interruptions.

NOTE:  On the HMC terminal you will need to close the existing terminal window and open a new terminal window when the State shows INITIALIZING.

NOTE:  You may see a frame icon 'Unknown*Frame' that was created during the update.  If so, click on 'Unknown*Frame', then choose Refresh.  This will delete the icon.

If you are using an HMC, once you have powered on the system:

Don't forget to retrieve and file any firmware update diskette that may still be in the intermediate system's diskette drive.  A good time to do this is after the reboot has completed.

6.1.2 Using the Service Aids Method

Before installing this level of firmware, see Section 3.0 Cautions and Important Notes.

You must have root authority on the target server to update its firmware.

Note: Review "Update System or Service Processor Flash" in the Service Aids
section of the eserver pSeries 655 User's Guide for more information about using this utility.

  a) Invoke the Service Aids from on-line diagnostics.

  b) Choose Update System or Service Processor Flash.

  c) Select 'File System' as the source of the flash update image file.

      The fully qualified path name of the update file is /tmp/fwupdate/3J050715.img

  d) If using an HMC terminal, press the 'ESC' + '7' keys to 'Commit' the update.

  e) Choose Yes to continue.

If you confirm, the server automatically performs the update and reboots. The checkpoints 99FF and 99FD alternately appear while the update is in progress. This may take up to thirty minutes, depending on the configuration of the server. Since the update occurs during this shutdown/reboot sequence, it is important to protect the server from interruptions.

NOTE:  On the HMC terminal you will need to close the existing terminal window and open a new terminal window when the State shows INITIALIZING.

NOTE:  You may see a frame icon 'Unknown*Frame' that was created during the update.  If so, click on 'Unknown*Frame', then choose Refresh.  This will delete the icon.

If you are using an HMC,  once you have powered on the system:

6.2  Partitioned System

 Updating firmware must be initiated either from the AIX command line, from the Update Flash Diagnostic Service Aids or from the Linux command line.

6.2.1 Using the AIX Command-Line Method

Before installing this level of firmware, see Section 3.0 Cautions and Important Notes.

You must have root authority on the target server to update its firmware.

 ATTENTION:  This method requires the device resources to be allocated properly.  This requires:

                              - One partition running AIX must have service authority.
                              - All other partitions except the one with service authority must be shut down.
                              - The partition with service authority must own the device from which the firmware
                                   update image will be read.
                              - It is also recommended that the partition with service authority have a hard disk.

If the required devices are not in the partition with service authority, the customer or system administrator must reassign the appropriate resources to it. This requires rebooting the partition with service authority.

With the file located in the /tmp/fwupdate subdirectory,

       Enter the commands:

           cd /usr/lpp/diagnostics/bin
          ./update_flash -f /tmp/fwupdate/3J050715.img

      [Don't overlook the periods (.) in the above command.]

You will be asked for confirmation to proceed with the firmware update and the required reboot.  If you confirm, the server automatically performs the update and reboots.   The checkpoints 99FF and 99FD alternately appear while the update is in progress. This may take up to thirty minutes, depending on the configuration of the server.  Since the update occurs during this shutdown/reboot sequence, it is important to protect the server from interruptions.

NOTE:  On the HMC terminal you will need to close the existing terminal window and open a new terminal window when the Operator Panel Value shows LPAR.

NOTE:  You may see a frame icon 'Unknown*Frame' that was created during the update.  If so, click on 'Unknown*Frame', then choose Refresh.  This will delete the icon.

If the Managed System state on the HMC is either RECOVERY or INCOMPLETE, skip to paragraph 6.3 HMC Restore Functions to complete the procedure.

If the Managed System state on the HMC is READY, the update of the firmware is complete. You will want to verify this update as shown in paragraph 6.4.
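
The choice of the next step depends only on the Managed System state shown on the HMC. As a small illustration, the mapping can be written as a shell function; next_step is a hypothetical name, and the state is typed in by the operator rather than read from the HMC.

```shell
# next_step STATE
# Prints the paragraph of this document to continue with for a given
# Managed System state. State names are case sensitive.
next_step() {
    case "$1" in
        RECOVERY|INCOMPLETE) echo "6.3" ;;
        READY)               echo "6.4" ;;
        *)                   echo "unknown"
                             return 1 ;;
    esac
}
```

For example, next_step RECOVERY prints 6.3 (HMC Restore Functions).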

6.2.2 Updating with the Diagnostics Service Aids

Before installing this level of firmware, see Section 3.0 Cautions and Important Notes.

  a) Invoke the Service Aids from on-line diagnostics.

  b) Choose Update System or Service Processor Flash.

  c) Select 'File System' as the source of the flash update image file.

      The fully qualified path name of the update file is /tmp/fwupdate/3J050715.img

  d) If using an HMC terminal, press the 'ESC' + '7' keys to 'Commit' the update.

  e) Choose Yes to continue.

If you confirm, the server automatically performs the update and reboots. The checkpoints 99FF and 99FD alternately appear while the update is in progress. This may take up to thirty minutes, depending on the configuration of the server.  Since the update occurs during this shutdown/reboot sequence, it is important to protect the server from interruptions.

NOTE:  On the HMC terminal you will need to close the existing terminal window and open a new terminal window when the Operator Panel Value shows LPAR.

NOTE:  You may see a frame icon 'Unknown*Frame' that was created during the update.  If so, click on 'Unknown*Frame', then choose Refresh.  This will delete the icon.

If the Managed System state on the HMC is either RECOVERY or INCOMPLETE, skip to paragraph 6.3 HMC Restore Functions to complete the procedure.

If the Managed System state on the HMC is READY, the update of the firmware is complete. You will want to verify this update as shown in paragraph 6.4.

6.2.3 Using the Linux Command-Line Method

You must have root authority on the target server to update its firmware.

 ATTENTION:  This method requires the device resources to be allocated properly.  This requires:

                              - One partition running Linux must have service authority.
                              - All other partitions except the one with service authority must be shut down.
                              - The partition with service authority must own the device from which the firmware
                                   update image will be read.
                              - It is also recommended that the partition with service authority have a hard disk.

If the required devices are not in the partition with service authority, the customer or system administrator must reassign the appropriate resources to it. This requires rebooting the partition with service authority.

This method allows updating from files already loaded onto the target server.

With the files located in the /tmp/fwupdate subdirectory,

     Enter the commands:

         cd /tmp/fwupdate
        /usr/sbin/update_flash -f  3J050715.img

The server automatically performs the update and reboots.  The checkpoints 99FF and 99FD alternately appear while the update is in progress. This may take up to thirty minutes, depending on the configuration of the server.  Since the update occurs during this shutdown/reboot sequence, it is important to protect the server from interruptions.

NOTE:  If using an HMC terminal you will need to close the existing terminal window and open a new terminal window when the Operator Panel Value shows LPAR.

If the Managed System state on the HMC  is either RECOVERY or INCOMPLETE,  skip to paragraph 6.3 HMC Restore Functions to complete the procedure.

If the Managed System state on the HMC is  READY, the update of the firmware is complete. You will want to verify this update as shown in paragraph 6.4.

6.3 HMC Restore Functions

The operator panel on the server displays OK, and the Managed System state on the HMC goes to RECOVERY.

    a. Click on the managed system name.

    b. Select the "Recover Partition Data" task.

    c. Select "Restore profile data from HMC backup" option.

 This can take up to 10 minutes.

When the restore is finished, the state changes to READY, and the system Status and operator panel show LPAR.

You will want to verify this update as shown in paragraph 6.4.

6.4  Verifying the Update

To verify the update was successful, the firmware level can be checked in AIX, Linux or in the Service Processor Main Menu.

6.4.1 Using AIX

Use the following AIX command for checking the firmware level.

         Enter:
            lscfg -vp | grep -p  Platform

This command will produce a system configuration report containing sections similar to the following.

   Platform Firmware:
        ROM Level.(alterable).......3J050715
        Version.....................RS6K
        System Info Specific.(YL)...U1.18-P1-H2/Y2
      Physical Location: U1.18-P1-H2/Y2

The ROM Level line should match the level you just installed, namely, 3J050715.

6.4.2 Using the Service Processor Main Menu

The second line of the title, Version: 3J050715, shows the currently installed firmware level.
 

6.4.3 Using Linux

Use the following Linux command for checking the firmware level.

         Enter:
            /sbin/lscfg -vp | grep -A 1  Platform

     This command will produce a system configuration report similar to the following.

     Platform Firmware:
          ROM Level.(alterable).......3J050715

The ROM Level line should match the level you just installed, namely, 3J050715.

6.5  Archiving the Update Files

In the event it becomes necessary to restore the server to a certain firmware level, it is suggested you identify and archive the materials for each update you install.

If the download process produced diskettes, label and store them in a safe place.

If the download process produced files, archive and identify the files for convenient retrieval.


End of Installation Instructions