
Understanding the Diagnostic Subsystem for AIX

Missing Options Resolution

This section describes the Missing Options Resolution Procedure performed by Diagnostics when a change in the system configuration has been detected. This procedure can be run to clean up the system configuration database, or to determine why previously detected resources are no longer found by the operating system.

Each time the system boots from an installed hardfile, the device configuration database (CuDv) that is stored on the hardfile from the previous IPL is compared against the resources detected on the current IPL. Detectable resources that were found on the previous IPL but not on the current IPL are marked as MISSING. Devices that were found on the current IPL but were not present on the previous IPL are marked as NEW.

The chgstatus field of each Customized Device (CuDv) entry is set to the change status of that resource. The change status values are defined in the /usr/include/sys/cfgdb.h file.
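
The comparison described above can be pictured as a set difference between the two IPLs. The following is a minimal sketch, not the configuration manager's actual code: the classify function and the device names are hypothetical, and only MISSING = 3 is confirmed by this document (it matches the state value used in the TMInput examples). The other numeric values are assumptions and should be checked against /usr/include/sys/cfgdb.h.

```python
from enum import IntEnum


class ChgStatus(IntEnum):
    # MISSING = 3 matches the state value used in this document's TMInput
    # examples; the other values are assumptions, so verify them in cfgdb.h.
    NEW = 0
    DONT_CARE = 1
    SAME = 2
    MISSING = 3


def classify(previous, current):
    """Return {device name: ChgStatus} for detectable devices.

    previous: names recorded in CuDv on the previous IPL
    current:  names detected on the current IPL
    """
    previous, current = set(previous), set(current)
    status = {}
    for name in previous | current:
        if name not in current:
            status[name] = ChgStatus.MISSING   # found before, not found now
        elif name not in previous:
            status[name] = ChgStatus.NEW       # found now, not found before
        else:
            status[name] = ChgStatus.SAME      # found on both IPLs
    return status


if __name__ == "__main__":
    result = classify(["scsi0", "hdisk0", "rmt0"], ["scsi0", "hdisk0", "cd0"])
    for name in sorted(result):
        print(name, result[name].name)
```

In this hypothetical example, rmt0 would be tagged MISSING and cd0 would be tagged NEW, while the two devices seen on both IPLs keep an unchanged status.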

When booting a system in normal mode, a message is written to the console if any devices have been detected as MISSING. This message states:

        A device that was previously detected could not be found.

        Run diag -a to update the system configuration.

The diag -a command can then be run to perform the Missing Options Resolution procedure.

When booting a system in online service mode, the missing options resolution procedure is run automatically if any missing devices were detected.

The following sections describe the input that the Diagnostic Controller passes to the Diagnostic Applications that are invoked during Missing Options Resolution.

Online Concurrent Diagnostics

The Missing Options Resolution procedure can be run in online concurrent mode by using one of the following commands:

% diag -a // Runs in Customer Mode
OR
% diag -a -A // Runs in Advanced Mode

The first screen seen by the user is the MISSING RESOURCE Menu, 801020.

The following TMInput is an example of the input given to the Diagnostic Application when running the diag -a command.

TMInput:
        exenv = 4               // Concurrent Environment
        advanced = 0            // Customer Mode
        system = 0              // Option Checkout
        dmode = 4               // System Verification
        date = "-s START -e NOW" // START = NOW - 24 hours.
        loopmode = 1            // Not in Loop Mode
        lcount = 0
        lerrors = 0
        console = 1             // Console Available
        parent = "parent0"      // Parent of resource to test
        parentloc = "AB-CD"     // Parent's Location Code
        dname = "resource0"     // Name of resource to test
        dnameloc = "AB-CD"      // Resource's Location Code
        child1 = "child0"       // Missing Child of Resource
        state1 = 3              // State of Child is MISSING
        childloc1 = "AB-CD"     // Child's Location Code
        child2 = ""
        state2 = 0
        childloc2 = ""
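
The date field above describes a window that ends now and starts 24 hours earlier. As a rough illustration of that arithmetic only, the sketch below builds such a string. The date_window helper is hypothetical, and the mmddhhmmyy timestamp layout is an assumption; this document does not state how the Diagnostic Controller formats START and NOW.

```python
from datetime import datetime, timedelta


def date_window(now, hours=24, fmt="%m%d%H%M%y"):
    """Build a '-s START -e NOW' string where START = NOW - hours.

    fmt defaults to an mmddhhmmyy layout; that layout is an assumption.
    """
    start = now - timedelta(hours=hours)
    return "-s {} -e {}".format(start.strftime(fmt), now.strftime(fmt))


if __name__ == "__main__":
    # 15 March 2024, 09:30 -> window starts 14 March 2024, 09:30
    print(date_window(datetime(2024, 3, 15, 9, 30)))
```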

The following TMInput is an example of the input given to the Diagnostic Application when running the diag -a -A command.

TMInput:
        exenv = 4               // Concurrent Environment
        advanced = 1            // Advanced Mode
        system = 0              // Option Checkout
        dmode = 4               // System Verification
        date = "-s START -e NOW" // START = NOW - 24 hours.
        loopmode = 1            // Not in Loop Mode
        lcount = 0
        lerrors = 0
        console = 1             // Console Available
        parent = "parent0"      // Parent of resource to test
        parentloc = "AB-CD"     // Parent's Location Code
        dname = "resource0"     // Name of resource to test
        dnameloc = "AB-CD"      // Resource's Location Code
        child1 = "child0"       // Missing Child of Resource
        state1 = 3              // State of Child is MISSING
        childloc1 = "AB-CD"     // Child's Location Code
        child2 = ""
        state2 = 0
        childloc2 = ""

Online Service Diagnostics

The Missing Options Resolution procedure is run automatically in online service mode when the Diagnostic Routines or Advanced Diagnostic Routines selection is made from the FUNCTION SELECTION Menu.

When a system is booted in online service mode, the service mode boot script displays the OPERATING INSTRUCTIONS Menu and the FUNCTION SELECTION Menu in phase 1. Once a selection is made, it is stored in the /etc/lpp/diagnostics/data/fastdiag file, and phase 2 of the boot process commences.

When Diagnostic Routines is selected from the FUNCTION SELECTION menu, the Diagnostic Application that is called for a missing child resource receives the following TMInput:

TMInput:
        exenv = 2               // Standalone Environment
        advanced = 0            // Customer Mode
        system = 0              // Option Checkout
        dmode = 4               // System Verification
        date = "-s START -e NOW" // START = NOW - 24 hours.
        loopmode = 1            // Not in Loop Mode
        lcount = 0
        lerrors = 0
        console = 1             // Console Available
        parent = "parent0"      // Parent of resource to test
        parentloc = "AB-CD"     // Parent's Location Code
        dname = "resource0"     // Name of resource to test
        dnameloc = "AB-CD"      // Resource's Location Code
        child1 = "child0"       // Missing Child of Resource
        state1 = 3              // State of Child is MISSING
        childloc1 = "AB-CD"     // Child's Location Code
        child2 = ""
        state2 = 0
        childloc2 = ""

When Advanced Diagnostic Routines is selected from the FUNCTION SELECTION menu, the Diagnostic Application that is called for a missing child resource receives the following TMInput:

TMInput:
        exenv = 2               // Standalone Environment
        advanced = 1            // Advanced Mode
        system = 0              // Option Checkout
        dmode = 4               // System Verification
        date = "-s START -e NOW" // START = NOW - 24 hours.
        loopmode = 1            // Not in Loop Mode
        lcount = 0
        lerrors = 0
        console = 1             // Console Available
        parent = "parent0"      // Parent of resource to test
        parentloc = "AB-CD"     // Parent's Location Code
        dname = "resource0"     // Name of resource to test
        dnameloc = "AB-CD"      // Resource's Location Code
        child1 = "child0"       // Missing Child of Resource
        state1 = 3              // State of Child is MISSING
        childloc1 = "AB-CD"     // Child's Location Code
        child2 = ""
        state2 = 0
        childloc2 = ""
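
The four TMInput examples in this document differ only in the exenv and advanced fields. The following sketch makes that explicit by modeling the listings as a Python dataclass. It is an illustration only: the real TMInput is not a Python object, and the constant names EXENV_STD and EXENV_CONC, the differing_fields helper, and the dictionary labels are assumptions introduced here. The field names and values are copied from the listings.

```python
from dataclasses import dataclass, fields

EXENV_STD = 2      # "Standalone Environment" in the listings (name assumed)
EXENV_CONC = 4     # "Concurrent Environment" in the listings (name assumed)
STATE_MISSING = 3  # "State of Child is MISSING" in the listings


@dataclass
class TMInput:
    exenv: int
    advanced: int
    system: int = 0                 # Option Checkout
    dmode: int = 4                  # System Verification
    date: str = "-s START -e NOW"
    loopmode: int = 1               # Not in Loop Mode
    lcount: int = 0
    lerrors: int = 0
    console: int = 1                # Console Available
    parent: str = "parent0"
    parentloc: str = "AB-CD"
    dname: str = "resource0"
    dnameloc: str = "AB-CD"
    child1: str = "child0"
    state1: int = STATE_MISSING
    childloc1: str = "AB-CD"
    child2: str = ""
    state2: int = 0
    childloc2: str = ""


def differing_fields(a, b):
    """List the field names whose values differ between two TMInputs."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]


EXAMPLES = {
    "diag -a": TMInput(exenv=EXENV_CONC, advanced=0),
    "diag -a -A": TMInput(exenv=EXENV_CONC, advanced=1),
    "service: Diagnostic Routines": TMInput(exenv=EXENV_STD, advanced=0),
    "service: Advanced Diagnostic Routines": TMInput(exenv=EXENV_STD, advanced=1),
}

if __name__ == "__main__":
    base = EXAMPLES["diag -a"]
    for label, tm in EXAMPLES.items():
        print(label, differing_fields(base, tm))
```

A Diagnostic Application that needs to behave differently per mode would therefore only have to inspect these two fields; everything else in the examples is identical.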

Standalone Diagnostics (POWER-based only)

The Missing Options Resolution procedure is not run during Standalone Diagnostics, because there is no previous configuration database against which the Diagnostic Controller can compare the devices detected at boot time.

Therefore, only the NEW RESOURCES menu is seen during Standalone Diagnostics. This menu presents a list of all the resources found in the system at the time the Standalone Diagnostics were booted.

The user is given a list of choices to make at this time. If the system contains ISA adapters, these adapters do not appear in the list. Because ISA adapters are not detectable, an option is presented to help the user configure them.

Missing Options Procedure Steps

The following describes the steps performed by the Diagnostic Controller when running the Missing Options Procedure.

  1. The Diagnostic Controller keeps a sorted list of all resources found in the system, as represented by the Customized Device object class. The controller walks this list to find all resources that are tagged as MISSING.
  2. Present the Missing Device menu for all MISSING devices. This menu lists each missing device, with any child devices indented a few spaces beneath it. The Missing Options Resolution Procedure can be performed only on missing devices whose parent is not also missing. See MISSING RESOURCE Menu for an example of this menu.
  3. After selection of a device, present the Missing Device Resolution menu. The menu asks the user if the device was moved, removed, or turned off. The following selections may be chosen:
    1. The resource has NOT been removed from the system, moved to another location or address, or turned off.
      This selection will determine why the resource was not detected.
      1. Test the path to the missing device.
      2. If a device in the path is defective, then skip to the next "missing" device in the list that is not dependent on the one just named. Note that the defective device in the path has been added to the FRU Bucket object class by the Diagnostic Application (DA).
      3. Return to the step where the missing device menu was presented.
      4. If an EnclDAName DA is named, call it.
      5. If a problem was detected, skip to the next missing device in the list that has a different parent, and return to the step where the Missing Device menu was presented.
      6. If a missing device procedure was specified (suptests & SUPTESTS_MS1), then call it. Note that the DA should conclude that there is a problem.
      7. Skip to the next missing device in the list that is not dependent on the current missing device.
      8. Return to the step where the Missing Device menu was presented.
      9. If a missing device procedure was not specified, then add the device to the FRU Bucket object class by using the addfrub subroutine. The default information is obtained from the Predefined Device object class.
    2. The resource has been removed from the system and should be removed from the system configuration.
      1. If the DA for the missing device supports the Missing Device Procedure 2 (suptests==SUPTESTS_MS2), then call the DA. The Diagnostic Controller does not automatically delete the device from the system configuration.
      2. Otherwise, flag the device to be deleted.
    3. The resource has been moved to another location and should be removed from the system configuration.
      1. Display a list of the new devices that are of the same type so that the user can identify where the missing device was moved. This list should contain a default selection for "Not Listed" in the event that the device was not detected in its new location, in which case a default service request number (SRN) should be generated.
      2. Assuming the user identified a new location:
        1. If the missing device has children which are non-detectable:
          • Present a menu to the user asking if the children should be reconfigured to the new device. The menu should contain a single selection for all of the devices and additional selections for the individual devices.
          • When a device is chosen, the parent field needs to be changed and the device configured. The mkdev command is used to configure the device.
        2. Delete the missing device and any children that have not been reconfigured.
    4. The resource has been turned off and should be removed from the system configuration.
      1. Flag the device to be removed from the configuration database.
    5. The resource has been turned off but should remain in the system configuration.
      1. Do nothing.
  4. Once all the missing devices have been processed through one of the selections above, then perform the following:
    1. Report any problems found.
    2. Delete the devices that were previously flagged to be deleted.
    3. If a new resource has been added, then display a list of the new devices. Ask the user if the list is correct.
      1. If Yes, then exit.
      2. If No, display predefined SRN indicating some new devices were not detected. Exit.
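
Steps 1 and 2 of the procedure above can be sketched as follows. This is a simplified illustration under stated assumptions, not the Diagnostic Controller's implementation: the Device class, the missing_menu function, the PRESENT placeholder value, the two-space indent, and the device names are all hypothetical. Only the value 3 for MISSING comes from the TMInput examples in this document.

```python
from dataclasses import dataclass
from typing import Optional

MISSING = 3  # chgstatus value shown as MISSING in the TMInput examples
PRESENT = 2  # placeholder for any non-MISSING status (assumed value)


@dataclass
class Device:
    name: str
    parent: Optional[str]
    chgstatus: int


def missing_menu(devices):
    """Return (lines, selectable) for a Missing Device menu.

    lines:      menu text, with devices under a missing parent indented
    selectable: names of missing devices whose parent is not also missing
    """
    missing = {d.name: d for d in devices if d.chgstatus == MISSING}
    children = {}
    for d in missing.values():
        children.setdefault(d.parent, []).append(d.name)

    lines, selectable = [], []

    def emit(name, depth):
        # Indent each level by two spaces (an arbitrary choice).
        lines.append("  " * depth + name)
        for child in sorted(children.get(name, [])):
            emit(child, depth + 1)

    for name in sorted(missing):
        if missing[name].parent not in missing:  # top of a missing subtree
            selectable.append(name)
            emit(name, 0)
    return lines, selectable


if __name__ == "__main__":
    devices = [
        Device("pci0", None, PRESENT),
        Device("scsi0", "pci0", MISSING),
        Device("hdisk1", "scsi0", MISSING),
        Device("scsi1", "pci0", PRESENT),
        Device("rmt0", "scsi1", MISSING),
    ]
    lines, selectable = missing_menu(devices)
    print("\n".join(lines))
    print("selectable:", selectable)
```

In this example, hdisk1 is listed indented under scsi0 and is not selectable on its own, because its parent is also missing. Resolving scsi0 first reflects the rule that the procedure applies only to missing devices that do not have a missing parent.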
