Power6 Mid-Range Firmware

Applies to:  9117-MMA and 9406-MMA

This document provides information about the installation of Licensed Machine or Licensed Internal Code, which is sometimes referred to generically as microcode or firmware.


1.0 Systems Affected

This package provides firmware for System p 570 (9117-MMA) and System i 570 (9406-MMA) servers only.  Do not use it on any other systems.
 

The firmware level in this package is EM320_040_031.


2.0 Important Information

 Do not attempt to backlevel firmware from the EM320_031 level to the EM310 release level.  This will corrupt the service processor(s) code and will require the service processor(s) to be replaced.

Updating firmware from EM320_031 to EM320_040

Prior to updating server firmware from EM320_031 to EM320_040, ensure a backup of partition profile(s) is current and HEA settings are collected (when applicable).

The following steps may be required if the HMC shows Recovery state after updating server firmware.

Restore partition data from the backup
Check/Reset any of the following settings which may have been lost during the server firmware update:
    Promiscuous partition flag
    HEA
    Boot device
    Other

Firmware update or upgrade fails with SRC E302F842

This problem will occur when the following conditions apply:
HMC is at V7.3.2 with fix MH01081 installed and the managed system being updated or upgraded is at firmware level EM310_048.

To determine if MH01081 is installed:
Enter the following command on an HMC command line:
           lshmc -V

This command will produce a report similar to the following:
  MH01081: Pegasus security fix, code update fix, and new DST updates (01-09-2008)

To prevent this failure from occurring, install fix MH01084.

If you have experienced this problem, install fix MH01084, and then reinstall the system firmware.  For information about the recovery procedure, call your next level of support.
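
The fix check described above can be sketched in Python as follows. This is only an illustration: it assumes the text of the lshmc -V report has already been captured, and that every installed fix appears at the start of a line in the same "MHnnnnn: description" form as the sample line shown above. The function names are hypothetical, and the sketch does not check the other two conditions (HMC at V7.3.2 and managed system at EM310_048).

```python
import re


def installed_fixes(lshmc_output):
    """Collect MHnnnnn identifiers that begin a line of the report text.

    Assumption: each fix is listed as 'MHnnnnn: description', as in the
    sample line quoted in this document.
    """
    return set(re.findall(r"^\s*(MH\d{5}):", lshmc_output, flags=re.MULTILINE))


def needs_mh01084(lshmc_output):
    """True when MH01081 is listed and MH01084 is not."""
    fixes = installed_fixes(lshmc_output)
    return "MH01081" in fixes and "MH01084" not in fixes


# Sample line taken from this document.
sample = (
    "  MH01081: Pegasus security fix, code update fix, "
    "and new DST updates (01-09-2008)\n"
)
print(needs_mh01084(sample))
```

A result of True only indicates that the HMC-side condition is met; the firmware level of the managed system still has to be checked separately.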

Signal Cable in an InfiniBand loop, and InfiniBand I/O drawer power on/off

Recently, internal IBM testing has discovered a reliability issue when concurrently plugging signal cables in an InfiniBand loop, or when an InfiniBand I/O drawer in the loop is powered off, during system operation.  Until a new level of system firmware is released that fixes this issue, IBM recommends that you do not follow any procedures, either from service publications, or from the HMC, that indicate this can be done concurrently.  To avoid potential problems, ensure that all partitions that utilize the resources in the affected InfiniBand loop are shut down prior to taking any actions with the cabling, or powering down the InfiniBand I/O drawer.  In the rare event that an InfiniBand cable is accidentally unplugged, do not attempt to replug the cable without quiescing all partitions that are utilizing resources in any of the InfiniBand I/O drawers in the loop.

The exposure exists on a 9117-MMA system if you have external InfiniBand I/O drawers connected to a Feature Code/CCIN 1802, 12X Channel Interface card assembly.
        In an AIX partition on a 9117-MMA system, check for feature code/CCIN 1802 by running both of the following  AIX commands:
           lscfg -vp | grep -p P1-C8
           lscfg -vp | grep -p P1-C9

If you have difficulty identifying the presence of these adapters in any system, contact support for assistance.

If an InfiniBand/GX+ adapter is found in any of the locations above, you should not concurrently plug InfiniBand signal cables or power off an I/O drawer on that InfiniBand loop.

When system firmware is released that includes a fix for this problem, it will be clearly documented in the firmware description section of the description document.

ECA702 Released for 9117-MMA Systems

ECA702 was released on 12/07/2007 to update 9117-MMA systems to firmware level  EM310_063_048.  In addition to system firmware, the ECA also provides corresponding HMC updates. Product Engineering strongly recommends the installation of the ECA.  Customers wishing to have IBM service perform the installation of this firmware, free of charge, should call 1-800-IBM-SERV or their country's service organization to request mandatory ECA702.

Memory Considerations for Firmware Upgrades

Later firmware releases use more memory than earlier ones because of the additional functionality they provide.

HMC-Managed Systems

NOTE:  You must upgrade your HMC code to Version 7, Release 3.2.0  before attempting to load this system firmware on your server.

For information concerning HMC releases and to access the HMC code packages, go to the following URL:  http://www14.software.ibm.com/webapp/set2/sas/f/hmc/home.html

NOTE:   You must be logged in as hscroot in order for the firmware installation to complete correctly.


3.0 Firmware Information and Description

Use the following example as a reference to determine whether your installation will be concurrent or disruptive.

Note:  The file names and service pack levels used in the examples are for clarification only, and are not necessarily levels that have been, or will be, released.

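An installation is disruptive if:

  • The release levels (XXX) are different.
              Example:  Currently installed release is EM310, new release is EM320
  • The service pack level (YYY) and the last disruptive service pack level (ZZZ) are equal.
              Example:  EM310_120_120 is disruptive, no matter what level of EM310 is currently installed on the system
  • The service pack level (YYY) currently installed on the system is lower than the last disruptive service pack level (ZZZ) of the service pack to be installed.
              Example:  Currently installed service pack is EM310_120_120 and new service pack is EM310_152_130

An installation is concurrent if:

  • The release level (XXX) is the same, and the service pack level (YYY) currently installed on the system is the same as or higher than the last disruptive service pack level (ZZZ) of the service pack to be installed.
              Example:  Currently installed service pack is EM310_126_120, new service pack is EM310_143_120.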
 

System firmware file naming convention:

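     01EMXXX_YYY_ZZZ

     where XXX is the release level, YYY is the service pack level, and ZZZ is the last disruptive service pack level.

NOTE:  Values of service pack and last disruptive service pack level (YYY and ZZZ) are only unique within a release level (XXX).  For example, 01EM310_067_045 and 01EM320_067_053 are different service packs.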
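
As a rough illustration of the naming convention, the Python sketch below parses a level name and classifies an installation as disruptive or concurrent. The rules it applies are inferred from the examples in this section (different release levels, YYY equal to ZZZ, or an installed YYY lower than the new ZZZ); they are an assumption, not an official algorithm. The names Level, parse_level and is_disruptive are hypothetical.

```python
import re
from typing import NamedTuple


class Level(NamedTuple):
    release: str          # XXX with its prefix, for example "EM310"
    service_pack: int     # YYY
    last_disruptive: int  # ZZZ


# Accepts "EM310_126_120", "01EM310_126_120" and "01EM310_126_120.rpm".
_LEVEL = re.compile(r"(?:01)?(EM\d{3})_(\d{3})_(\d{3})(?:\.(?:rpm|xml))?")


def parse_level(name):
    match = _LEVEL.fullmatch(name.strip())
    if match is None:
        raise ValueError(f"not a recognised firmware level: {name!r}")
    return Level(match.group(1), int(match.group(2)), int(match.group(3)))


def is_disruptive(installed, new):
    """Apply the three conditions inferred from the examples in this section."""
    cur, nxt = parse_level(installed), parse_level(new)
    if cur.release != nxt.release:
        return True   # release level changes
    if nxt.service_pack == nxt.last_disruptive:
        return True   # new level is itself a disruptive service pack
    return cur.service_pack < nxt.last_disruptive


# The examples from this section.
print(is_disruptive("EM310_120_120", "EM310_152_130"))
print(is_disruptive("EM310_126_120", "EM310_143_120"))
```

Because YYY and ZZZ are only unique within a release level, the sketch compares release levels first and only then compares the numeric fields.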
 

Firmware Information and Update Description

 
Filename              Size       Checksum
01EM320_040_031.rpm   21462115   62551
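
A downloaded file can be compared against the size and checksum listed above with a sketch like the following. This document does not say how the checksum is computed; the code assumes a 16-bit BSD-style rotating checksum of the kind produced by the classic sum command, and that assumption should be confirmed before relying on the result. The function names are hypothetical.

```python
import os


def bsd_sum(data):
    """16-bit rotating checksum (BSD 'sum' algorithm).

    Assumption: the Checksum column uses this algorithm; the document
    does not state which one it is.
    """
    checksum = 0
    for byte in data:
        # Rotate right by one bit within 16 bits, then add the byte.
        checksum = (checksum >> 1) | ((checksum & 1) << 15)
        checksum = (checksum + byte) & 0xFFFF
    return checksum


def matches(data, expected_size, expected_sum):
    """Compare in-memory file contents with the published values."""
    return len(data) == expected_size and bsd_sum(data) == expected_sum


def check_file(path, expected_size, expected_sum):
    """Check the size first (cheap), then the checksum of the whole file."""
    if os.path.getsize(path) != expected_size:
        return False
    with open(path, "rb") as handle:
        return bsd_sum(handle.read()) == expected_sum


# Usage, with the values from the table above:
# check_file("01EM320_040_031.rpm", 21462115, 62551)
```

A size mismatch alone is enough to show that the download is incomplete, so that check does not depend on the checksum assumption.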

 
EM320 
EM320_040_031

03/03/08

 

Impact:  Serviceability                 Severity: Special Attention

Fixes that affect all model MMA systems:

  • DEFERRED:  A problem was fixed that caused a system crash (with SRC B131E504) by changing the initialization settings of the I/O control hardware.
  • A problem was fixed that could cause the hypervisor to hang after a reset/reload of the service processor.
  • A problem was fixed that, under certain circumstances, caused the InfiniBand adapter to stop responding to InfiniBand requests.
  • A problem was fixed that caused SRC B1813014 to be logged after a successful system firmware installation.  This SRC will be logged when this level of firmware is installed and will generate a call home; it should be ignored.  It will not be logged during subsequent installations.
  • The FRU list was changed so that clock card failures in a multi-drawer system will be easier to debug and require fewer parts to fix.
  • A problem was fixed that caused the service processor to get stuck in a reset/reload loop, which prevented the system from booting to standby.


System firmware changes that affect certain model MMA systems:

  • On systems with redundant service processors enabled, a problem was fixed that could cause a significant increase in system boot time.
  • On systems with two service processors installed and with redundancy disabled, a problem was fixed that caused the secondary service processor to go into the dump state, and remain in the dump state, after a platform dump.
  • On systems with redundant service processors, SRCs B1813833 and B1813834, which were being logged intermittently after a side-switch IPL, were changed to informational.
  • On systems with a 1519-100 tower attached, a problem was fixed that caused the location code of a connector on the integrated virtual IOP to be displayed as Un-SE1-SE1-T1 instead of Un-SE1-T1.
  • On systems with 7314-G30 I/O drawers attached in certain cabling configurations, a problem was fixed that prevented the I/O port labels from being displayed for the port location codes on the hardware topology screens.
EM320_031_031

12/03/07

Impact:  Function                       Severity: Attention

New Features and Functions:

  • Support for redundant service processors with failover on model MMA systems.
  • Support for the concurrent addition of a RIO/HSL adapter on model MMA systems.
  • Support for the concurrent replacement of a RIO/HSL adapter on model MMA systems.
  • Support for the "hyperboot" boot speed option in the power on/off menu on the Advanced System Management interface (ASMI).
  • Support for the creation of multiple virtual shared processor pools (VSPPs) within a single physical pool. (In order for AIX performance tools to report the correct information on systems configured with multiple shared processor pools, a minimum of AIX 5.3 TL07 or AIX 6.1 must be running.)
  • Support for the capability to move a running AIX or Linux partition from one system to another compatible system with a minimum of disruption. 
  • Support for the collection of extended I/O device information (independent of the presence of an operating system) when a system is first connected to an HMC and is still in the manufacturing default state. 
  • Improved VPD collection time on model MMA systems.
  • Support for the migration of DDR2 memory DIMMs during the MES upgrade from a 9117-570 server to a 9117-MMA server when processor card F/C 5621 is ordered when the initial system upgrade MES order is placed.

  • Support for EnergyScale™ and Active Energy Manager™.  For more information on the energy management features now available, please see the EnergyScale™ white paper.
EM310 
EM310_069_048

02/11/2008

Impact: Availability                 Severity:  HIPER

Fixes that affect all model MMA systems:

  • HIPER:  A problem was fixed that caused some functions that perform hardware operations during runtime to generate temporary extended error handling (EEH) errors.
  • DEFERRED:  A problem was fixed that caused a system crash (with SRC B131E504) by changing the initialization settings of the I/O control hardware.  Note: This fix is not in the EM320_031_031 level listed above;  it is included in the EM320_040_031 level.
  • A problem was fixed that prevented a system from recovering after SRC B1xxB9xx was logged.
  • A problem was fixed that caused a firmware installation to fail with SRC B1813028.
  • A problem was fixed that caused SRC B1818A10 to be erroneously logged during a disruptive firmware installation.
  • A problem was fixed that, under certain circumstances, caused the buttons on the control (operator) panel to be inoperative.
  • A problem was fixed that prevented the system planning tool from deploying a sysplan with certain HEA MCS values.
  • A problem was fixed that caused SRC B1813108 to be erroneously logged during system boot. 
  • A problem was fixed that, under certain circumstances, caused the InfiniBand adapter to stop responding to InfiniBand requests.
  • A problem was fixed that caused the error "MSGVIOSE0300E002-0154 There is insufficient memory available for firmware" to be logged on the HMC.
System firmware changes that affect certain model MMA systems:
  • On model MMA systems with multiple drawers, a problem was fixed that prevented the pin-hole reset switch on the control (operator) panel from resetting the system.
  • On model MMA systems with an uninterruptible power supply (UPS) attached, a problem was fixed that prevented the UPS from notifying the operating system that a utility failure or low battery condition had occurred.
  • On systems with 3 or more licensed processors and 2 or more unlicensed processors, a problem was fixed that caused the system boot to be slower than normal, or to hang with SRC C700406E.
  • On model MMA systems with 7314-G30 I/O expansion drawers attached, problems were fixed that caused the wrong FRUs to be called out with SRC B70069ED, and caused the hypervisor to loop if certain invalid cabling configurations were encountered.
  • On model MMA systems with a large number of I/O towers attached, a problem was fixed that caused the HMC to go to the incomplete state when an additional tower was added to a loop.
EM310_063_048

11/19/07

Impact:  Availability                 Severity:  HIPER
  • HIPER:  A problem was fixed that caused a time-out in a hardware device driver.  This time-out must include both SRCs B181B920 and B181D147.  Other SRCs may be present including, but not limited to, B1xxB9xx, B1xxE504, and B150D141.  Occasionally the system crashes.  If B181B920 and B181D147 SRCs are logged, check for any resources that were deconfigured at the time of these errors and reconfigure them using the ASMI menus.  No hardware should be replaced.  To recover from this error condition, the service processor must be reset by removing,  then reapplying, the managed system's power.
  • DEFERRED:  On multi-drawer model MMA systems, a problem found in testing was fixed that, when the L3 cache was disabled, could under rare circumstances cause data in the cache to be overwritten and the system to crash.  Although the exposure to this issue is very low, and there have been no reported problems from the field, the system impact if this occurred would be high.  Product Engineering recommends that you schedule time to install this deferred fix at your earliest convenience.
EM310_057_048

9/14/07

Impact:  Availability                 Severity:  HIPER 

Additional features and functions:

  • Added support for 9406-MMA.
System firmware changes that affect all 9117-MMA systems:
  • HIPER:  A problem was fixed that caused the system to crash with SRC B170E450.
  • HIPER:  A problem was fixed that, in rare circumstances, could cause the system to hang due to the improper handling of certain exceptions.
  • HIPER:  A problem was fixed that prevented the operating system from being notified of certain EPOW conditions that could lead to the system or partition being shut down, with the possible loss of data.  These EPOW conditions included the ambient temperature being too high, the loss of utility power (with or without UPS backup), and a user-initiated power off using the white power button or the HMC.
  • A problem was fixed that could cause a firmware installation from the HMC to fail with SRC E302F85C on the HMC, and SRC B1813088, B1818A0F, or B1813011 logged in the service processor error log.
  • A change was made so that if a failure occurs during a memory-preserving reboot, the system continues to reboot rather than remaining in the termination (powered off) state.
  • A problem was fixed that caused EEH (enhanced error handling) errors to be erroneously logged against certain I/O adapters.
  • A problem was fixed that prevented "linked" resources that had been guarded out from being reconfigured during the next reboot after a service action on one of the guarded parts.
  • A problem was fixed that, after the backplane was replaced in a 7314-G30 I/O drawer, prevented the partition that owned the drawer from seeing those resources. 
  • A problem was fixed that caused the serial connection to a partition to be lost.  When this occurred, SRCs B181D307, B200E0AA, and/or B200813A were generated by the service processor and the hypervisor.
  • A problem was fixed in partition firmware that, in some circumstances, prevented a CD-ROM or tape device from being in the default service mode boot list, even if one was present in the system.
  • A problem was fixed that caused the HMC to go to the incomplete state, and SRC B182953C to be logged in the service processor error log every five minutes or so, when the managed system was booted.
  • A problem was fixed that caused the system to intermittently fail to configure devices attached to the integrated USB port when booting.
  • A problem was fixed that might have caused erroneous callouts if a problem was found with certain levels of memory controller chips.
  • A problem was fixed that caused the system to call home and reboot instead of allowing the failing part (a memory controller or DIMM) to be deconfigured by PRD (processor runtime diagnostics). 
Additional information concerning this service pack:

In addition to the fixes described above, this service pack also contains a fix for a low-probability problem and content intended for newly manufactured systems, or enhancements to system internal interfaces, which is not required for systems already in production use.  This content will not be activated on systems that install this service pack concurrently.  Even though this content is not required for systems which are already installed and in use, a disruptive installation of this service pack, or a re-IPL after installing it, will cause this content to become active.  It is not necessary to plan a window to re-IPL the system to activate this content.

EM310_048_048

6/22/07

Impact:  New        Severity:  New
  • Original (GA)  level.


4.0 How to Determine Currently Installed Firmware Levels

You can view the server's current firmware level on the Advanced System Management Interface (ASMI) Welcome pane.  It appears in the top right corner.   Example:  EM320_031.


5.0 Downloading the Firmware Package

The firmware is located at the web site:

       http://www14.software.ibm.com/webapp/set2/firmware/gjsn

Follow the instructions on this web page. You must read and agree to the license agreement to obtain the firmware packages.

In the drop-down box, choose the entry for your specific machine type and model.

You may download the rpm file for system firmware from this location to your server, an ftp server, or a CD-ROM. If your system is HMC-managed, you will also need to download the xml file located on the final download page.  Make sure the file names have the format 01EM3xx_yyy_zzz, with extensions of .rpm and .xml, before copying them to your server, an ftp server, or a CD-ROM.  If using a CD-ROM, copy the .rpm and .xml files to the CD-ROM using a local CD-ROM burner utility.
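
The file name check described above can be sketched in Python as follows. This is a minimal illustration under the assumption that xx, yyy and zzz are always decimal digits; the function name check_download and the wording of the messages are hypothetical.

```python
import re

# 01EM3xx_yyy_zzz followed by .rpm or .xml (digits assumed for xx, yyy, zzz).
_PACKAGE = re.compile(r"(01EM3\d{2}_\d{3}_\d{3})\.(rpm|xml)")


def check_download(names, hmc_managed):
    """Return a list of problems found in a set of downloaded file names.

    The .rpm file is always required; the .xml file is required only
    for HMC-managed systems.
    """
    problems = []
    found = {}
    for name in names:
        match = _PACKAGE.fullmatch(name)
        if match is None:
            problems.append(f"unexpected file name: {name}")
            continue
        found.setdefault(match.group(1), set()).add(match.group(2))

    required = {"rpm", "xml"} if hmc_managed else {"rpm"}
    for stem, extensions in sorted(found.items()):
        for extension in sorted(required - extensions):
            problems.append(f"missing {stem}.{extension}")
    if not found:
        problems.append("no firmware package files found")
    return problems


print(check_download(["01EM320_040_031.rpm"], hmc_managed=True))
```

An empty list means the names look right; it says nothing about whether the file contents are intact.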

Another method is to download the ISO image and create a CD-ROM to use with your HMC.

Note: If your HMC is not internet-connected you will need to download the new firmware level to a CD-ROM or ftp server.


6.0 Installing the Firmware

The method used to install new firmware will depend on the release level of firmware which is currently installed on your server. The release level can be determined by the prefix of the new firmware's filename.

Example: EMXXX_YYY_ZZZ

Where XXX =  release level


Instructions for installing firmware updates and upgrades can be found in the Operations Guide for the Hardware Management Console and Managed Systems; see Chapter 9, Updates, section Managed Systems Updates.


7.0 Change History

Date           Description
Jul 16, 2008   Added information about updating from EM320_031 to EM320_040 in Section 2.0.
May 06, 2008   Changed information in Section 2.0 in "blue" to refer to EM320_031 firmware level specifically, instead of EM320 in general.
Apr 18, 2008   Changed link to the Operations Guide for the Hardware Management Console and Managed Systems.
Mar 21, 2008   Added 9406-MMA to the list of supported systems.