QUESTIONS ON CLSTRMGR OF HACMP V1.2

ITEM: RTA000068481



My customer tested the behavior of HACMP (V1.2) with Third-party                
takeover configuration (2 active nodes and 1 standby node) and                  
ran into the following problems.                                                
                                                                                
1. Several times, he could not stop the clstrmgr gracefully on
   one of the 3 nodes. He then stopped the clstrmgr with the
   'forced' option, which succeeded. Right after that, he
   started the clstrmgr on the same node again, but
   "topchng.rc" was never called. (The clstrmgr was active.)
                                                                                
   After testing several cases, he reached the following result.
   When he first stopped the clstrmgr on the primary or secondary
   active node and then stopped the clstrmgr on the standby node,
   the graceful stop succeeded on each node. Conversely, when he
   first stopped the clstrmgr on the standby node and then on the
   primary or secondary active node, he always ran into the above
   problem.
                                                                                
   Is this order of stopping clstrmgr officially supported?                     
   Or, is there any other possible cause of the above problem?                  
                                                                                
2. In some cases (3 out of 80 in his test scenario), SMIT
   on AIX-window ended abnormally with a core dump while starting
   the clstrmgr. He opened the clstrmgr start menu by entering
   "smitty clstart" in one AIX-window and pressed Enter with all
   default options (Start:now, BROADCAST message:no, Startup
   Cluster Lock Services:no, Startup Cluster SMUX Peer Daemon:no).
   After several seconds, SMIT ended abnormally and created a
   core dump. The clstrmgr, however, started normally.
   Do you have any suggestion on the cause of this problem?                     
   If you need the core dump, please let me know and I can send
   it to you.
                                                                                
The versions of AIX and AIX window are 3.2.4 and 1.2.5, respectively.
                                                                                
                                                                                
ANSWER                                                                          
I'll address your questions individually:                                       
                                                                                
1) In a third-party takeover configuration, you MUST stop clstrmgr on
   one of the servers (the active nodes) FIRST. Then you can stop
   clstrmgr on the other nodes. You cannot stop clstrmgr on the
   standby node first.
                                                                                
2) I don't know the cause of this problem, although I have seen the            
   same behavior on our systems running HACMP Version 1.2 (and even             
   HACMP Version 2.1). Because of this, you have two choices:                   
                                                                                
   a) Ignore the core dump since clstrmgr starts correctly. (It's just          
      SMIT that core dumps, not HACMP; I found this out by looking at           
      the core dump that we got).                                               
                                                                                
   b) Open a PMR with your defect support center. My understanding is           
      that you have access to RETAIN, which is the tool you use to do           
      this.                                                                     
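
The stop order from answer 1 can be sketched as a small planning
helper. This is only an illustration, not part of HACMP: the function
name, the node names, and the "active"/"standby" role labels are all
hypothetical, and the sketch only computes an order (it does not stop
anything).

```python
def graceful_stop_order(nodes):
    """Order nodes so that every active (server) node precedes any standby node.

    `nodes` is a list of (name, role) pairs; the role labels "active" and
    "standby" are assumptions made for this sketch, not HACMP identifiers.
    """
    active = [name for name, role in nodes if role != "standby"]
    standby = [name for name, role in nodes if role == "standby"]
    # Stopping a server first satisfies the rule in answer 1; standby nodes go last.
    return active + standby


# Hypothetical 3-node cluster: 2 active nodes and 1 standby node.
cluster = [("nodeC", "standby"), ("nodeA", "active"), ("nodeB", "active")]
print(graceful_stop_order(cluster))
```

With the hypothetical cluster above, the standby node "nodeC" is
placed last, so a graceful stop is never attempted on it first.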
                                                                                
Search-keywords:
HACMP CORE DUMP THIRD-PARTY TAKEOVER                                            
                                                                                
                                                                               


Dated: 05/1995 Category: ITSAIHA6000