ITEM: BY3135L
Troubleshooting Service Director Communications
ABSTRACT: Troubleshooting Service Director Communications
.SYMPTOM: Service Director is not calling out on problems reported by a
client, or a "Test Communications Path" returns "unknown
status code 0" error.
.STEP 1: Verify the Communications Path is Working from the Forwarding
Server.
. When a problem is detected on the forwarding server (the one with
the modem), the servdir.analyze program handles modem
communications. By testing the communications path from the
server, we can determine if the failure to call out is a problem
with the modem.
. 1) On the server, type in the command:
. # servdir.analyze ctest:
. 2) If you receive a return code of 0, the problem is with the
modem configuration. Verify your modem configuration:
. For the modem (consult your modem's user's guide):
Echo should be turned off;
Responses should be turned off;
Error control should be turned on;
RTS/CTS (hardware handshaking) should be turned on.
. For the TTY definition in AIX:
* Enable LOGIN disable
BAUD rate 9600
PARITY none
BITS per character 8
Number of STOP BITS 1
. * AIX 3.2.5 setting will be called "Enable program?"
. 3) If you receive a return code of 14, then the modem
communications are working correctly, and the problem most
likely lies with the client/server communications.
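. The TTY settings listed in Step 1 can be compared mechanically. The
following is a minimal sketch, not part of Service Director. It
assumes the attribute names speed, parity, bpc, stops, and login, as
typically shown by 'lsattr -E -l tty0'; the function name and the
device name tty0 are hypothetical, so adjust them for your system.

```shell
# Sketch: compare TTY attributes against the values listed in Step 1.
# Input (stdin): "attribute value ..." lines, e.g. from `lsattr -E -l tty0`.
# ASSUMPTION: the attribute names speed, parity, bpc, stops, and login
# match your AIX level; adjust them if your lsattr output differs.
check_tty_attrs() {
    awk '
        BEGIN {
            want["speed"]  = "9600"
            want["parity"] = "none"
            want["bpc"]    = "8"
            want["stops"]  = "1"
            want["login"]  = "disable"
        }
        # Only lines whose first field is one of the attributes above.
        ($1 in want) {
            seen[$1] = 1
            if ($2 != want[$1]) {
                printf "MISMATCH %s: found %s, expected %s\n", $1, $2, want[$1]
                bad = 1
            }
        }
        END {
            # Report expected attributes that never appeared in the input.
            for (name in want)
                if (!(name in seen)) {
                    printf "MISSING %s\n", name
                    bad = 1
                }
            if (!bad) print "OK"
            exit (bad ? 1 : 0)
        }
    '
}
```

. To use it on the server, pipe the attribute listing into the
function, for example: lsattr -E -l tty0 | check_tty_attrs
The function prints OK and returns 0 when all five settings match.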
.STEP 2: Verify Network Communications Between the Client and Server.
. When Service Director initiates the callhome process, it runs on a
port assigned to the IP address of the PRIMARY hostname (the name
you get when you run the 'hostname' command). However, the IP
address that the clients "call home" to is determined by the
hostname the server is registered with. Clients will not be able
to connect with the callhome process on the server if they attempt
to use any other hostname/IP address for the server.
. 1) Verify that the server was registered using its primary
hostname. If it was not, re-register the server with the
correct hostname, rebuild reporting topology files, and
re-distribute the keys (steps 1-4 of the Service Director
registration screen).
. 2) From the client, "ping" the server using the hostname specified
for the server when you registered your configuration.
. 3) If the ping fails, verify that the server's hostname and IP
address are correct.
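. The hostname comparison in Step 2 can be sketched as a small shell
function. This is an illustration only: the function name is
hypothetical, and it assumes that comparing the short names (the text
before the first dot), ignoring case, is enough to spot a server that
was registered under a secondary hostname.

```shell
# Sketch: confirm the registered server name is the primary hostname.
# $1 = primary hostname (output of `hostname`), $2 = registered name
# ASSUMPTION: comparing the short names (text before the first dot),
# ignoring case, is good enough to spot a mismatch.
same_short_name() {
    a=$(printf '%s\n' "$1" | cut -d. -f1 | tr '[:upper:]' '[:lower:]')
    b=$(printf '%s\n' "$2" | cut -d. -f1 | tr '[:upper:]' '[:lower:]')
    # Succeed only when both names are non-empty and identical.
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

. On the server, run: same_short_name "$(hostname)" REGISTERED_NAME
where REGISTERED_NAME is a placeholder for the name used at
registration. A non-zero result suggests the server should be
re-registered. From the client, 'ping -c 1 REGISTERED_NAME' then
confirms that the name resolves and the server answers.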
.STEP 3: Test Communications Between the Client(s) and Server.
. If you have multiple clients, ideally you should perform this test
for each client.
. 1) On the client, type in:
# servdir.analyze ctest:
. 2) If you receive a return code of 14, and you believe that this
client is not calling out problems when it should be, please
review the "Errors Reported by Service Director" and "Service
Director Limitations Based on Type of Error" sections in
Chapter 1 of the User's Guide. A copy of the User's Guide can
be found in /usr/lpp/servdir as UG.ASCII.
. 3) If you receive a return code of 0 from the communications
test, the problem lies either with the callhome program
running on the server or the RPC port configuration.
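. The return codes described in Steps 1 and 3 lead to different
follow-up actions depending on where the test was run. The sketch
below summarizes that decision. It is a reading aid, not part of
Service Director: the function name is hypothetical, and the code is
typed in by the operator because this document does not say how
servdir.analyze reports it.

```shell
# Sketch: map a communications-test return code to the next step.
# $1 = where the test was run: "server" or "client"
# $2 = the return code the test reported (0 or 14 in this document)
# ASSUMPTION: the code is supplied by the operator; this sketch does not
# run servdir.analyze or rely on how the program reports the code.
next_step() {
    case "$1:$2" in
        server:0)  echo "Check the modem and TTY configuration (Step 1)" ;;
        server:14) echo "Modem path is working; check client/server network (Step 2)" ;;
        client:0)  echo "Check callhome on the server and the RPC ports (Step 4)" ;;
        client:14) echo "Path is working; review the User's Guide on reported errors" ;;
        *)         echo "Code not covered by this document"; return 1 ;;
    esac
}
```

. For example, 'next_step client 0' points to Step 4, while any code
other than 0 or 14 returns a non-zero status so that it is not
silently treated as a known case.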
.STEP 4: Verify the 'callhome' Program Is Running on the Server.
. The callhome program runs on the server, "listening" for calls
from designated client machines. When a call is received from
an authorized client machine, the callhome program uses the
modem on the server to place a problem call to IBM.
. 1) On the server, run the command:
# ps -ef | grep callhome
. 2) If no callhome process is found, you can restart it in one of
two ways:
. a) # smit servdir_reg
select option 3, 'Build Reporting Topology', and answer yes
to use the forwarding daemon. This will rebuild the
reporting topology using the hostnames you used when you
registered your configuration and will restart the callhome
process.
b) # vi /etc/inittab
add the following line to the end of the inittab file:
callhome:2:respawn:/usr/lib/ras/callhome -p /usr/lib/ras
run the following command to restart callhome from inittab:
# telinit q
. NOTE: These steps will restart the default callhome process.
There are additional options available; please refer to
section 5.5 of the User's Guide for more information.
. 3) After restarting the callhome process, test the communications
path from the client machine again. If you're still getting a
return code of 0, go on to the next step.
. 4) If the callhome process is already running on the server, there
may be a problem with the RPC configuration.
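. When checking for the callhome process as in Step 4, note that
'ps -ef | grep callhome' usually lists the grep command itself, which
can be mistaken for a running daemon. The following sketch filters
that line out. The function name is hypothetical and the approach is
an illustration, not a Service Director tool.

```shell
# Sketch: decide from `ps -ef` output whether callhome is running.
# Reads process-listing text on stdin and ignores the grep process
# itself, which would otherwise always produce a match.
callhome_running() {
    # Keep callhome lines, drop the grep line, succeed if any line remains.
    grep 'callhome' | grep -v 'grep' | grep -q .
}
```

. On the server, run: ps -ef | callhome_running && echo running
If nothing is printed, no callhome process was found. On AIX levels
that provide it, 'lsitab callhome' can also show whether the inittab
entry from option b) is present.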
.STEP 5: Verify the Contents of the Reporting Topology Files.
. The reporting topology of Service Director is based on the
contents of two ASCII files: the /usr/lib/ras/callhomehost file
on the clients, and the /usr/lib/ras/probrephosts file on the
server.
. The probrephosts file on the server contains the hostnames of the
client machines it will accept problem reports from.
. The callhomehost file on the client machine contains the IP
addresses of hosts running the 'callhome' program; that is, the
server (or forwarder as it is referred to in the User's Guide).
If the primary server's callhome process becomes unavailable, the
client will automatically try the second IP address in the
callhomehost file. Please note that if a client can establish a
link with a callhome process, it will use that process even if
that connection then times out, and will not fall back to the
secondary IP address.
You should only use a secondary IP address in the callhomehost
files if you have a second server acting as a backup to the
primary server.
. 1) Verify that the callhomehost file on each client contains the
primary IP address of your server, and that the probrephosts
file on the server contains the hostnames of all client
machines. If you're missing an IP address or hostname, edit
the appropriate file to include the missing information and
perform the following steps on the server:
# smit servdir_reg
select option 3, "Build Reporting Topology" and answer
'yes' to use the forwarding daemon.
. 2) If you have more than one network connection between the
client(s) and the server and want to enable Service Director
to use the secondary connection(s) you can add the secondary
hostnames of the clients to the probrephosts file and rebuild
the reporting topology as described above.
. 3) After rebuilding the reporting topology, test the
communications path from the client again. If you're still
receiving the return code 0, go on to the next step.
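. The file checks in Step 5 can be sketched as follows. This is an
illustration only: the function name is hypothetical, and it assumes
the entries in callhomehost and probrephosts are whitespace-separated
tokens, since this document only says they are ASCII files.

```shell
# Sketch: check whether a topology file contains an expected entry.
# $1 = entry to look for (IP address or hostname), $2 = file to search
# ASSUMPTION: entries are whitespace-separated tokens; the source only
# says these are ASCII files, not how they are laid out.
has_entry() {
    awk -v want="$1" '
        # Appending "" forces a string comparison, never a numeric one.
        { for (i = 1; i <= NF; i++) if (($i "") == (want "")) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$2"
}
```

. On a client, run: has_entry SERVER_IP /usr/lib/ras/callhomehost
On the server, run: has_entry CLIENT_NAME /usr/lib/ras/probrephosts
once per client. SERVER_IP and CLIENT_NAME are placeholders. A
non-zero result means the entry is missing and the file should be
edited as described in item 1) above.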
.STEP 6: Check the RPC Port Mapping.
. If you are still receiving a return code of 0 from testing
the communications path from the client at this point, it is
likely that there is a problem with the RPC port mapping.
. 1) To check the status of the callhome process, run the following
command on the server:
# rpcinfo -t \.\ *.* LISTEN
. If you do not get any output, or if you don't see the
server's IP address in front of the port number, you will
need to perform problem determination on the network.
. 4) If all of the RPC info looks OK, but you're still receiving a
return code of 0 from testing the communications path from the
client, you can get additional information from the callhome
program by running it in the foreground. To run the callhome
program in the foreground, perform the following steps:
. a) # rmitab "callhome"
b) # ps -ef | grep callhome
If any callhome processes are still running, kill them.
c) # /usr/lib/ras/callhome -p /usr/lib/ras
Support Line: Troubleshooting Service Director Communications ITEM: BY3135L
Dated: February 1998 Category: N/A