IBM Books

Diagnosis Guide


Error symptoms, responses, and recoveries

Use the following table to diagnose problems with the SDR component of PSSP. Locate the symptom and perform the action described in the following table.

Table 15. System Data Repository (SDR) symptoms

Symptom                                              Recovery
Nonzero return code                                  See Action 1 - Get the return code.
Cannot connect to server                             See Action 2 - Analyze system or network changes.
SDR class corrupted or missing                       See Action 3 - Analyze class situation.
Cannot write to the SDR                              See Action 4 - Check authorization.
Error code 005 - Write authority required.           See Action 4 - Check authorization.
Error code 006 - Administrator authority required.   See Action 4 - Check authorization.

Actions

Action 1 - Get the return code

If you cannot run SDR commands, or a program that uses the SDR fails when it runs SDR commands, get the return code or the message number from the failing SDR routine. The return codes from SDR routines are embedded in the message numbers. The first four digits in an SDR cataloged message are always 0025, followed by a hyphen and a three-digit number. The three-digit number is the return code. For example, the following SDR message is issued with a return code of 080 from any SDR routine that cannot connect to the SDR server:

0025-080 The SDR routine could not connect to server.

Some programs report the return code from an SDR routine, but not the message. Use 0025 and the return code to find the appropriate message in PSSP: Messages Reference. Follow the action for the particular error message to correct the error.
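
The message-number convention described above can be sketched with two small shell helpers. This is only an illustration and is not part of PSSP: the function names sdr_rc_from_msg and sdr_msgnum_from_rc are hypothetical, and the sketch assumes the message text starts with the message number exactly as in the example.

```shell
# Hypothetical helpers (not PSSP commands) illustrating the 0025-NNN convention.

# Print the three-digit return code embedded in an SDR message line.
# Returns 1 if the line does not start with an SDR message number.
sdr_rc_from_msg() {
    case "$1" in
        0025-[0-9][0-9][0-9]*)
            rc=${1#0025-}               # drop the "0025-" prefix
            rc=${rc%"${rc#???}"}        # keep only the first three characters
            printf '%s\n' "$rc"
            ;;
        *)
            return 1
            ;;
    esac
}

# Build the message number to look up from a return code.
# awk is used so that a code such as 080 is read as decimal, not octal.
sdr_msgnum_from_rc() {
    awk -v n="$1" 'BEGIN { printf "0025-%03d\n", n }'
}
```

For example, passing the message line shown above to sdr_rc_from_msg prints 080, and sdr_msgnum_from_rc 80 prints 0025-080, which is the number to look up in PSSP: Messages Reference.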

Once you have corrected the problem, rerun the command that produced the error to verify that it is corrected.

Action 2 - Analyze system or network changes

System or network changes could affect the SDR. If an SDR command fails to connect to the server, do the following:

  1. Issue the spget_syspar command on the node where SDR commands are failing.
  2. If the spget_syspar command fails, check the /etc/SDR_dest_info file on the same node. It should contain four records, which give the IP addresses and hostnames of the default and primary system partitions. They should be similar to:
     default:default_syspar_ip_address 
     primary:syspar_ip_address 
     nameofdefault:default_hostname
     nameofprimary:syspar_hostname
    

    where

    Note:
    The default system partition may be the same as the primary system partition.

    If this file is missing or does not have these four records, the node may not be properly installed, or the file may have been altered or corrupted. You can edit this file to correct it, or copy the file from a working node in the same system partition.

    The spget_syspar command may also fail if:

  3. If the spget_syspar command is successful, check to make sure that the address is also the address of a valid system partition. If it is, try to ping that address. Issue this command:
    ping -c 1 IP_address
    

    If the ping is successful, the output is similar to:

                PING 9.114.61.129: (9.114.61.129): 56 data bytes
                64 bytes from 9.114.61.129: icmp_seq=0 ttl=255 time=0 ms
                
                ----9.114.61.129 PING Statistics----
                1 packets transmitted, 1 packets received, 0% packet loss
                round-trip min/avg/max = 0/0/0 ms
    

    If the ping fails, output is similar to:

                PING 9.114.61.129: (9.114.61.129): 56 data bytes
     
                ----9.114.61.129 PING Statistics----
                1 packets transmitted, 0 packets received, 100% packet loss
    

    In this case, contact your system administrator to investigate a network problem.

  4. If the value returned by the spget_syspar command is not the same as the address in the primary record of the /etc/SDR_dest_info file, the SP_NAME environment variable is directing SDR requests to a different address. Make sure that this address (the value of the SP_NAME environment variable) is a valid system partition.
  5. If the value of the SP_NAME environment variable is a hostname, try setting it to the equivalent dotted decimal IP address. If SDR commands now work, the system nameserver is not functioning.
  6. If the address returned by spget_syspar is a valid system partition address and pings to that address are successful, check for the existence of the SDR server process (sdrd) on the control workstation with:
    ps -ae | grep sdrd
    

    If the process (sdrd) is not running, do the following:

    1. Check the /var/adm/SPlogs/sdr directory for a core dump. If one exists, see "Information to collect before contacting the IBM Support Center" and then contact the IBM Support Center.
    2. Check the SDR server logs in /var/adm/SPlogs/sdr/sdrdlog.ipaddr.pid, where ipaddr is the IP address of the system partition and pid is a process ID.
    3. Issue the command:
      /usr/bin/startsrc -g sdr
      

      to start the SDR daemon. Start checks again at Step 5. If the SDR daemon is now running and continues to run, check the sdrd entry in the file /etc/inittab on the control workstation. It should read:

      sdrd:2:once:/usr/bin/startsrc -g sdr
      

Issue an SDR command again to see if it now connects to the server.
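
The checks on the /etc/SDR_dest_info file in Steps 2 and 4 can be sketched as shell helpers. This is a minimal sketch and not part of PSSP: the function names are hypothetical, and it assumes that each record has the name:value form shown in Step 2 and that values contain no colons.

```shell
# Hypothetical helpers (not PSSP commands) for a file in the
# name:value format shown for /etc/SDR_dest_info.

# Print the value of one record, for example "primary".
# $1 = file, $2 = record name
dest_info_value() {
    awk -F: -v key="$2" '
        $1 == key { sub(/[ \t]+$/, "", $2); print $2; exit }
    ' "$1"
}

# Return 0 if the file is readable and all four records are present
# and non-empty; otherwise report the first problem and return 1.
dest_info_ok() {
    [ -r "$1" ] || { echo "missing or unreadable: $1" >&2; return 1; }
    for key in default primary nameofdefault nameofprimary; do
        if [ -z "$(dest_info_value "$1" "$key")" ]; then
            echo "record '$key' missing in $1" >&2
            return 1
        fi
    done
    return 0
}

# Return 0 if the given address matches the primary record.
# A mismatch suggests that SP_NAME is redirecting SDR requests (Step 4).
# $1 = file, $2 = address reported by spget_syspar
addr_matches_primary() {
    [ "$(dest_info_value "$1" primary)" = "$2" ]
}

# Possible use on a node:
#   dest_info_ok /etc/SDR_dest_info &&
#       addr_matches_primary /etc/SDR_dest_info "$(spget_syspar)"
```

The helpers take the file name as an argument so that they can be tried against a copy of the file before being pointed at /etc/SDR_dest_info.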

Action 3 - Analyze class situation

If an SDR command ends with RC=102 (internal data format inconsistency) or 026 (class does not exist), first make sure that the class name is spelled correctly and that the case is correct. See the table of classes and attributes in "The System Data Repository" appendix in PSSP: Administration Guide. Then, follow the steps in "SDR Shadow Files" in the System Data Repository appendix in the PSSP: Administration Guide.

This condition could be caused by the /var file system filling up. If this is the case, either define more space for /var or remove unnecessary files.
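
One way to check whether /var is filling up is to read the capacity column of df output. The helper below is a sketch under assumptions: the name df_pct_used is hypothetical, it expects the POSIX output layout produced by df -P (capacity in the fifth column), and the 90 percent threshold in the usage comment is arbitrary.

```shell
# Hypothetical helper: read `df -P` output on stdin and print the
# percentage used for the first file system listed, without the % sign.
df_pct_used() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Possible use:
#   pct=$(df -P /var | df_pct_used)
#   [ "$pct" -ge 90 ] && echo "/var is ${pct}% full"
```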

If the problem persists, contact the IBM Support Center.

Once you have corrected the problem, rerun the command that produced the error to verify that it is corrected.

Action 4 - Check authorization

The trusted services authentication methods for a system partition determine the rules used by that system partition's sdrd to permit write and administrator access to the SDR. SDR administrator access is required for commands that change class definitions or create and delete files from the SDR. Write access is required for commands that add objects, change attributes and replace files in the SDR.

If the trusted services authentication methods are set to DCE only, appropriate credentials are needed to be able to issue write or administrator commands to the SDR. If the trusted services authentication methods are set to dce:compat, compat, or anything else, only the root user on the control workstation or an SP node in the sdrd's system partition can issue write or administrator commands to that system partition's SDR. For more information on authentication, see "The System Data Repository" appendix in PSSP: Administration Guide.
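
The rule in the preceding paragraph can be expressed as a small decision helper. This is an illustration only: the function name sdr_access_rule and its output labels are hypothetical, and because the exact form of the methods string is not specified here, the comparison ignores case.

```shell
# Hypothetical helper: given the trusted services authentication methods
# for a system partition as one string (for example "dce", "dce:compat",
# or "compat"), print which rule governs SDR write and administrator access.
sdr_access_rule() {
    methods=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    if [ "$methods" = "dce" ]; then
        echo "dce-credentials"      # DCE only: credentials and group membership
    else
        echo "root-in-partition"    # anything else: root on the CWS or a node
    fi
}
```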

If an SDR command fails to write to the SDR, perform these steps:

  1. Find out what trusted services authentication methods are in your system partition by issuing the command:
    lsauthpts
    
  2. Make sure that you are in the system partition you expect, by issuing the command:
    spget_syspar -n
    

    If not, check the SP_NAME environment variable to see if it is set to connect to an unexpected system partition. If SP_NAME is not set, check the /etc/SDR_dest_info file for correctness. To correct the SDR_dest_info file, see Action 2 - Analyze system or network changes, Step 2.

  3. If you are connecting to the expected system partition's sdrd, perform the following actions based on the value of the trusted services authentication methods for the system partition:

    1. If the trusted services authentication methods are set to DCE only:

      Issue the klist command to see if you have DCE credentials. If so, you can see which SDR groups you belong to.

      If you do not belong to a group with sdr and write in the name, you cannot write to the SDR. If you do not belong to a group with sdr and admin in the name, you cannot issue SDR administrator access commands.

      If this is the case, ask your security administrator to add you to the appropriate sdr groups. As an alternative, dce_login to a principal that is in the appropriate sdr groups. If you have no credentials, dce_login to a principal in the appropriate sdr groups.

      Note:
      The group names may be overridden in the /spdata/sys1/spsec/spsec_overrides file.

      Note that there are separate groups for access to system classes which are global to all system partitions, and for access to partition-sensitive classes. The objects of partition-sensitive classes may be written only from that system partition's sdrd. Groups for system classes have system-class in their name, unless the name was overridden in the /spdata/sys1/spsec/spsec_overrides file. Groups for partition-sensitive classes do not have system-class in their name.

      Partition-sensitive groups can also be partitioned. If the group has a :p appended to it in the /spdata/sys1/spsec/spsec_overrides file, there will be a separate group for each partition, with its own access list.

      Note:
      The SDR administrator access authority includes write authority.

      To determine if there is a problem with DCE, see Diagnosing SP Security Services problems.

    2. If the trusted services authentication methods are set to anything other than DCE only (the lsauthpts command returns anything other than DCE):

      The root user on the control workstation or root on a node in the SDR's system partition is allowed to perform write or administrator commands to the SDR.

      Issue the whoami command to make sure that you are running as root. If you are not root, and DCE is an option for your system partition, follow the actions in Step 3a.

      If you are root, perform these steps:

      1. To find out the hostname of the node you are running on, issue the command:
        hostname
        
      2. To find out the IP address for your hostname, issue the command:
        host hostname
        

        where hostname is the hostname found in the previous step.

      3. Issue the command:
        SDRGetObjects Adapter netaddr==ipaddress 
        

        where ipaddress is the address found in the previous step.

      If there is no object for the IP address you entered, the SDR does not recognize your node as being in its system partition, and it does not allow root on the node to perform write or administrator SDR commands. Possible causes are that the adapters were not set up correctly during installation, or that the network is not set up correctly on the node.

      To see how routing is set up on a node, see Diagnosing IP routing problems. For information on how to add adapters, see PSSP: Installation and Migration Guide.

      Also, if there is an adapter on the node that cannot be defined in the Adapter class of the SDR, but commands from the node are routed across that adapter, the sdrd will not recognize the command as coming from one of its nodes.

      In this case, a static route may be added from the node to the control workstation by using the smit fastpath mkroute. The destination address should be the IP address that represents the system partition of the node. The gateway IP address should be for an adapter that is defined in the SDR. Another possible workaround for this situation is to define the IP address of the unsupported adapter as a supported adapter type.

Once you have corrected the problem, check that the SDR can be written to by running the SDR_test command.
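
The group-name rule from Step 3, substep 1 can be sketched as a filter over a list of DCE group names. This is a minimal sketch, not a PSSP tool: the name sdr_group_access is hypothetical, the group names must be supplied one per line (extracting them from klist output is not shown), and names changed in the spsec_overrides file are not handled.

```shell
# Hypothetical helper: read DCE group names on stdin, one per line, and
# print the SDR access level they imply: "admin", "write", or "none".
# $1 = "system" to test access to system classes; any other value tests
#      access to partition-sensitive classes.
sdr_group_access() {
    awk -v scope="$1" '
        {
            # Groups for system classes have system-class in their name.
            is_sys = ($0 ~ /system-class/)
            if ((scope == "system") != is_sys) next
        }
        /sdr/ && /admin/ { can_admin = 1 }
        /sdr/ && /write/ { can_write = 1 }
        END {
            # Administrator authority includes write authority.
            if (can_admin)      print "admin"
            else if (can_write) print "write"
            else                print "none"
        }'
}
```

A result of none for the scope you need corresponds to the case where you must ask your security administrator to add you to the appropriate sdr groups, or dce_login to a principal that is already in them.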

