ITEM: AS7712L
Understanding Netview alarms from HACMP...
Question:
model:r24
HACMP alarms
Response:
ABSTRACT: HACMP failure doesn't create all snmp mib entries
ENVIRONMENT: AIX 3.2.5 HACMP 3.1
Machine model: R24
Desc:
Q1: According to /usr/include/cluster/clsnmp.h, are the traps for the
2.0 release intended to be concatenated to those of the 1.0 release, or
are the traps for the 2.0 release intended to replace those of the 1.0
release?
A1: The traps for the 2.0 release are intended to replace the traps
for the 1.0 release.
Q2: Where are the integer values of the traps in Netview defined for
each of the events?
For example, here are two lines from a /usr/OV/log/trapd.log netview
log file for the "SubState" event id with integer values of 16 and 32:
816906333 3 Mon Nov 20 16:25:33 1995 tamp_ndi u
Trap: generic 6 specific 11 args (1): [1]
risc6000clsmuxpd.cluster.clusterSubState.0 (Integer): 16
816906396 3 Mon Nov 20 16:26:36 1995 tamp_ndi u
Trap: generic 6 specific 11 args (1): [1]
risc6000clsmuxpd.cluster.clusterSubState.0 (Integer): 32
I believe that the "SubState" event ids are defined in the
/usr/include/cluster/clsnmpd.h file as follows:
/*
* SMUX Cluster (Sub)states
*/
#define SMUX_UNSTABLE 0x10
#define SMUX_STABLE 0x20
#define SMUX_ERROR 0x40
The decimal equivalents of the hex values given above are as
follows:
Hex Dec
0x10 16
0x20 32
0x40 64
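The lookup described above can be sketched in a few lines of Python. This assumes the trap integers really do correspond to the defines quoted from the header, as suggested in the question. The dictionary and the helper name `substate_name` are made up for illustration and are not part of HACMP or Netview.

```python
# Values copied from the "SMUX Cluster (Sub)states" defines quoted above.
SUBSTATES = {
    0x10: "SMUX_UNSTABLE",
    0x20: "SMUX_STABLE",
    0x40: "SMUX_ERROR",
}


def substate_name(value):
    """Return the define name for a clusterSubState trap integer.

    Hypothetical helper: unknown values are reported, not guessed.
    """
    return SUBSTATES.get(value, "unrecognized (%d)" % value)


# The two trapd.log entries quoted above carry the integers 16 and 32.
for value in (16, 32):
    print("%d = 0x%02X -> %s" % (value, value, substate_name(value)))
```

Keeping the keys in hex makes the table easy to compare against the header, while the trap values from the log can be passed in as plain decimal integers.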
Here is another example for the "State" event ids:
815771741 3 Tue Nov 07 13:15:41 1995 ATL_NDS u
Trap: generic 6 specific 10 args (1): [1]
risc6000clsmuxpd.cluster.clusterState.0 (Integer): 2
815773319 3 Tue Nov 07 13:41:59 1995 ATL_NDS_2 u
Trap: generic 6 specific 10 args (1): [1]
risc6000clsmuxpd.cluster.clusterState.0 (Integer): 4
815773319 3 Tue Nov 07 13:41:59 1995 ATL_NDS u
Trap: generic 6 specific 10 args (1): [1]
risc6000clsmuxpd.cluster.clusterState.0 (Integer): 8
I believe that the "State" event ids are defined in the
/usr/include/cluster/clsnmpd.h file as follows:
/*
* SMUX Common States
*/
#define SMUX_INVALID 0x00
#define SMUX_VALID 0x01
#define SMUX_UP 0x02
#define SMUX_DOWN 0x04
#define SMUX_UNKNOWN 0x08
The decimal equivalents of the hex values given above are as
follows:
Hex Dec
0x01 1
0x02 2
0x04 4
0x08 8
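As a rough illustration, the variable line of a trapd.log entry can be pulled apart and the "State" integer looked up in the same way. The regular expression is written against the log excerpts shown above only, and the mapping assumes the integers correspond to the quoted defines. The names `parse_varbind` and `state_name` are hypothetical helpers, not Netview or HACMP functions.

```python
import re

# Values copied from the "SMUX Common States" defines quoted above.
STATES = {
    0x00: "SMUX_INVALID",
    0x01: "SMUX_VALID",
    0x02: "SMUX_UP",
    0x04: "SMUX_DOWN",
    0x08: "SMUX_UNKNOWN",
}

# Matches the variable line as it appears in the trapd.log excerpts above.
VARBIND = re.compile(r"risc6000clsmuxpd\.cluster\.(\w+)\.0 \(Integer\): (-?\d+)")


def parse_varbind(line):
    """Return (variable, value) from a trapd.log variable line, or None."""
    match = VARBIND.search(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def state_name(value):
    """Return the define name for a clusterState integer (assumed mapping)."""
    return STATES.get(value, "unrecognized (%d)" % value)


line = "risc6000clsmuxpd.cluster.clusterState.0 (Integer): 4"
variable, value = parse_varbind(line)
print(variable, value, state_name(value))
```

The pattern allows a leading minus sign so that the same helper also accepts the -1 value seen for clusterPrimary.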
One event id that I cannot find a definition for is the "Primary"
event id. Here are three lines from a /usr/OV/log/trapd.log netview
log file for a "Primary" event id with integer values of -1, 1, and 2:
815788132 3 Tue Nov 07 17:48:52 ATL_NDS_2 ?
[1] risc6000clsmuxpd.cluster.clusterPrimary.0 (Integer): 1
815788132 3 Tue Nov 07 17:48:52 ATL_NDS_2 ?
[1] risc6000clsmuxpd.cluster.clusterPrimary.0 (Integer): -1
815788134 3 Tue Nov 07 17:48:54 ATL_NDS_2 ?
[1] risc6000clsmuxpd.cluster.clusterPrimary.0 (Integer): 2
I was unable to determine where the integer values of the "Primary"
event id are defined.
Hello,
I am trying to confirm this with development, but I believe the
reason that there are no #defines in clsnmp.h for the clusterPrimary
event is that they reflect the node id, with -1 indicating a
transitional state. The trapd.log file entries for
risc6000clsmuxpd.cluster.clusterPrimary.0 (Integer): 1
would thus indicate that node 1 is becoming the primary node.
The subsequent:
risc6000clsmuxpd.cluster.clusterPrimary.0 (Integer): -1
followed by:
risc6000clsmuxpd.cluster.clusterPrimary.0 (Integer): 2
would indicate that the primary is changing from node 1 to node 2.
Does that make sense, given the sequence of events that was happening
on your cluster at the time these log entries were generated?
If I find out anything different or any additional details from
development, I will make another append to this pmr. In the meantime,
I thought this might be useful info for you.
Hello again,
I heard back from development and what I said in my previous
append is accurate - The integer is the id of the cluster node
that is the "primary" node, and it is equal to -1 when the cluster
is not yet up/stable. I should point out, however, that starting with
HACMP 3.1, which has a distributed lock manager, the notion of
a "primary" cluster manager is no longer relevant. There are still
"remnants" of this notion in HACMP, however, as you have found.
I am curious as to whether or not you are using this information
or just trying to understand the information being generated by
clsmuxpd that Netview is capturing.
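For anyone post-processing these entries, the interpretation given above (the integer is the node id of the primary, and -1 means the cluster is not yet up/stable) could be applied along the following lines. This is only a sketch: the function names and the idea of collecting the values into a list are assumptions made for illustration.

```python
TRANSITIONAL = -1  # per the explanation above: cluster not yet up/stable


def describe_primary(value):
    """Describe one clusterPrimary trap integer."""
    if value == TRANSITIONAL:
        return "no primary (cluster not yet up/stable)"
    return "node %d is primary" % value


def primary_changes(values):
    """Return (old, new) node id pairs, skipping transitional -1 values."""
    changes = []
    current = None
    for value in values:
        if value == TRANSITIONAL:
            continue
        if current is not None and value != current:
            changes.append((current, value))
        current = value
    return changes


# The three clusterPrimary values from the trapd.log excerpt, in log order.
sequence = [1, -1, 2]
for value in sequence:
    print(describe_primary(value))
print(primary_changes(sequence))
```

For the sequence 1, -1, 2 this reports a single change of primary from node 1 to node 2, which matches the reading given in the reply above.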
Dated: March 1996 Category: N/A