IBM Books

Diagnosis Guide


SNMP configuration diagnosis

This section helps you diagnose communication problems that can occur between the SNMP Agent administering the dependent node (residing on the router node) and the SP Manager residing on the control workstation. Run with tracing enabled for the SPMGR subsystem during a dependent node configuration.

When you configure a dependent node within an extension node class, you create attribute values in the SDR DependentNode class which are used by the SP SNMP Manager to communicate with the SNMP Agent on the router node. These attributes are:

node_number
The node number for the dependent node

extension_node_identifier
The identifier assigned to the dependent node (this is the two-digit slot number of the dependent node adapter on the router node)

management_agent_hostname
The fully qualified hostname of the node on which the SNMP Agent administering the dependent node resides. This is used to communicate with the router node. It must resolve to an IP address.

snmp_community_name
The SNMP community name placed within SNMP messages passed between the SNMP Agent and the SNMP SP Manager for authentication. This value must match the community name value configured on the SNMP Agent host for communicating with the SP Manager on the control workstation.
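The requirement that management_agent_hostname resolve to an IP address can be checked before any configuration data is sent. The following is a minimal sketch, not part of the SP software: the function name resolve_agent_hostname and its resolver parameter are illustrative assumptions, and the check uses only the Python standard library.

```python
import socket


def resolve_agent_hostname(hostname, resolver=socket.gethostbyname):
    """Return the IPv4 address for hostname, or None if it does not resolve.

    hostname is the value you intend to store in management_agent_hostname.
    resolver is injectable so the check can be exercised without a name server.
    """
    try:
        return resolver(hostname)
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        return None
```

Call resolve_agent_hostname with the fully qualified hostname of the router node on the control workstation. A result of None suggests that the name should be corrected, or added to name resolution, before the dependent node is defined.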

If the node_number is specified in error, the configuration data may still be sent successfully to the SNMP Agent administering the dependent node. However, problems will occur when attaching the switch adapter to the switch network.

When you have completed the definition of the dependent node on the control workstation, and have installed the SP Switch Router Adapter on the router node, check whether the SDR adapter_config_status attribute value for the dependent node in the switch_responds class remains configured. If so, then trap messages from the router node are not being processed successfully by the SNMP Manager on the control workstation. This can be caused by one of several problems:

  1. If the spmgr subsystem trace file in the directory /var/adm/SPlogs/spmgr contains an entry indicating init_io failed: udp port in use, then the UDP port specified for service name spmgrd-trap in the /etc/services file on the control workstation is already in use. This error will also appear in an AIX error log entry written by the spmgrd daemon.

    Solution: Change the UDP port number for the spmgrd-trap service to an unused port number. The router node snmpd daemon configuration file, /etc/snmpd.conf on the router node, must also be updated to specify this same port number when sending trap messages to the control workstation. Both the snmpd daemon on the router node and the spmgr subsystem on the control workstation must be restarted after this change is made.

  2. If the lssrc -ls spmgr command response contains zeros for both the number of switchInfoNeeded traps processed successfully, and the number processed unsuccessfully, then trap messages sent by the SNMP Agent on the router node are not being received by the SNMP Manager on the control workstation.

    Either the control workstation IP address or the UDP port number may have been specified in error in the /etc/snmpd.conf file on the router node. The UDP port number associated with the control workstation in file /etc/snmpd.conf on the router node must match the UDP port number specified for the spmgrd-trap service in the /etc/services file on the control workstation.

    Solution: Correct the erroneous value and restart the snmpd daemon on the router node and the spmgr subsystem on the control workstation.

  3. If the lssrc -ls spmgr command response contains zeros for the number of switchInfoNeeded traps processed successfully, and the number processed unsuccessfully is greater than zero, then trap messages sent by the SNMP Agent on the router node are being received by the SNMP Manager on the control workstation. However, they are not being successfully processed. This may be the result of one of the following errors:

    1. If the spmgr subsystem trace file in directory /var/adm/SPlogs/spmgr contains an entry indicating: 'Dependent node <ext_id> managed by the SNMP Agent on <router_node_hostname> is not configured in the SDR - switchInfoNeeded trap ignored', then either the extension_node_identifier or the management_agent_hostname attribute value for the corresponding extension node in the SDR DependentNode class is incorrect.

      Solution: Correct the attribute value.

    2. If the spmgr subsystem trace file in directory /var/adm/SPlogs/spmgr contains an entry indicating: 'SDR attribute <attrname> for dependent node <ext_id> in class <classname> has a null value for SNMP Agent on host <router_node_hostname>', or an entry indicating: 'SDRGetAllObjects() DependentAdapter failed with return code 4', then required configuration values are missing from the indicated SDR class.

      Solution: Supply the missing attribute values.

    3. If the spmgr subsystem trace file in directory /var/adm/SPlogs/spmgr contains an entry indicating: 'Dependent node <ext_id> managed by the SNMP Agent on host <router_node_hostname> is configured with a bad community name - switchInfoNeeded trap ignored', then the snmp_community_name attribute value specified for the corresponding node in the SDR DependentNode class does not match the community name specified for the control workstation in the /etc/snmpd.conf file on the router node.

      Note that if the snmp_community_name attribute value is null, the community name to specify on the router node is the one given in the Ascend documentation.

      Solution: Correct the community names in the /etc/snmpd.conf file on the router node and the snmp_community_name attribute for the corresponding SDR DependentNode class so that they match.
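The first steps of the checklist above can be expressed as two small helpers. This is a sketch under stated assumptions rather than a supported tool: the function names are illustrative, the first helper expects the contents of /etc/services from the control workstation, and the second expects the two switchInfoNeeded counters as you read them from the lssrc -ls spmgr response (the layout of that response is not parsed here).

```python
def find_service_port(services_text, name="spmgrd-trap", proto="udp"):
    """Return the port for name/proto in /etc/services-style text, or None."""
    for raw in services_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue
        port, _, line_proto = fields[1].partition("/")
        names = [fields[0]] + fields[2:]  # service name plus any aliases
        if name in names and line_proto == proto and port.isdigit():
            return int(port)
    return None


def classify_trap_counts(processed_ok, processed_failed):
    """Map the two switchInfoNeeded counters to a case in the checklist."""
    if processed_ok == 0 and processed_failed == 0:
        return ("not received: check the control workstation IP address "
                "and UDP port in /etc/snmpd.conf on the router node")
    if processed_ok == 0 and processed_failed > 0:
        return "received but rejected: check the spmgr trace file for the reason"
    return "at least one trap was processed successfully"
```

The port returned by find_service_port is the value that the trap destination in /etc/snmpd.conf on the router node must match.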

Some SNMP-related configuration problems occur when data is changed in the SDR after an initial configuration. Most of these problems are detected by the configuration-related commands, and messages are issued to the operator.

If you attempt to reconfigure a dependent node after doing one of the following:

These problems could occur:

  1. A timeout occurs on the enadmin command (invoked internally from the SMIT panels and from the endefnode and endefadapter commands). This could be caused by one of the following configuration problems:
    1. If the spmgr subsystem trace file in directory /var/adm/SPlogs/spmgr or the AIX error log contains an entry indicating '2536-007 An authentication failure notification was received from an SNMP Agent running on host <router_node_hostname> which supports Dependent Nodes', then the SDR snmp_community_name attribute value in the DependentNode class for the extension node contains a name that does not match the community name specified for the control workstation in the /etc/snmpd.conf file on the router node.

      Solution: Correct the community names in the /etc/snmpd.conf file on the router node and the snmp_community_name attribute for the corresponding SDR DependentNode class so that they match.

    2. If no authentication error exists in either the trace file or the AIX error log, then the value specified for the SDR management_agent_hostname attribute in the DependentNode class for the extension node is probably not the correct fully qualified name for the router node.

      Solution: Correct the management_agent_hostname attribute value in the DependentNode class for the extension node.
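The trace-file entries quoted in this section can be searched for mechanically. The sketch below is illustrative only: the name scan_trace and the hint wording are assumptions, and the fragments are copied from the messages quoted in this section, so an entry whose wording differs in your trace file will not be matched.

```python
# (message fragment, what to check) pairs taken from the entries quoted above.
KNOWN_ENTRIES = [
    ("init_io failed: udp port in use",
     "change the UDP port for spmgrd-trap in /etc/services"),
    ("is not configured in the SDR",
     "check extension_node_identifier and management_agent_hostname"),
    ("has a null value",
     "supply the missing SDR attribute value"),
    ("SDRGetAllObjects() DependentAdapter failed with return code 4",
     "supply the missing values in the DependentAdapter class"),
    ("is configured with a bad community name",
     "make snmp_community_name match /etc/snmpd.conf on the router node"),
    ("2536-007",
     "authentication failure: make snmp_community_name match the router node"),
]


def scan_trace(lines):
    """Return (line_number, hint) for every known entry found in lines."""
    findings = []
    for number, line in enumerate(lines, start=1):
        for fragment, hint in KNOWN_ENTRIES:
            if fragment in line:
                findings.append((number, hint))
    return findings
```

To use it, open the spmgr trace file in /var/adm/SPlogs/spmgr and pass the file object, or a list of its lines, to scan_trace.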

Note: If the extension_node_identifier attribute value for an extension node is erroneously set to the ID of another existing extension node on the router node that is managed by another SP system, the results are unpredictable because two SNMP managers are trying to configure the same SP Switch Router Adapter.

