The approach to troubleshooting a Network Information Service (NIS) problem depends on whether the problem is at the NIS client (see Identifying NIS Client Problems) or the NIS server (see Identifying NIS Server Problems).
NIS client problems most commonly occur at the following times:
Note: When attempting to solve one map problem, keep in mind that the same problem may be affecting other maps as well. See Files where NIS Appends Map Information for a more detailed explanation.
When a machine has two interfaces and both are given the same name, gethostbyname lookups for the rsh command fail if NIS is being used, because NIS returns only the first address found rather than both. This is an implementation limitation imposed by the New Database Manager (NDBM) and by performance considerations. The error message is:
0826-825: there is a host address that does not match
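To find host names that are affected by this limitation, it can help to list every name that is assigned to more than one address. The following is a minimal sketch, not part of NIS: dup_host_names is a hypothetical helper that assumes its input is in hosts-file format (an address followed by one or more names).

```shell
# Hypothetical helper: reads hosts-format lines (address name [aliases...])
# on stdin and prints each name that is mapped to more than one address.
dup_host_names() {
    awk '
        /^[ \t]*#/ || NF < 2 { next }       # skip comment, blank, and short lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break        # stop at a trailing comment
                key = $i SUBSEP $1
                if (!(key in seen)) {       # count each name/address pair once
                    seen[key] = 1
                    count[$i]++
                }
            }
        }
        END {
            for (name in count)
                if (count[name] > 1) print name
        }
    ' | sort
}
```

For example, running dup_host_names < /etc/hosts on the machine that holds the source hosts file prints one name per line. Any name it reports is a candidate for the lookup failure described above.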
The most common problem occurring at an NIS client node is for a command to hang. A command can appear to hang, even though the system seems to be operating correctly. In such a case, a message similar to the following can be generated at the console:
NIS: server not responding for domain domainname. Still trying
This error message indicates that the ypbind daemon on the local machine is unable to communicate with the ypserv daemon in the given domain because the systems that run the ypserv daemon have failed. It may also occur if the network or the NIS server machine is overloaded to the extent that the ypserv daemon cannot return a response to your ypbind daemon within the time-out period.
Under these circumstances, all the other NIS clients on your network show the same or similar problems. The condition is usually temporary. The messages are cleared when the NIS server machine reboots and the ypserv daemon restarts, or else when the load on the NIS server and the network decreases.
If the ypbind daemon is communicating with the ypserv daemon and the NIS server is not overloaded, one of the following problems may exist:
On a client system that is working normally, run the ypwhich command. If the ypwhich command never returns an answer, stop the command. Then type the following at the NIS server machine:
ps -ef | grep yp
Look for the ypserv and ypbind processes. If the server's ypbind daemon is not running, start the daemon, using the instructions in Starting and Stopping NIS Daemons.
If a ypserv process is running, run the ypwhich command on the NIS server machine. If this command returns no answer, stop and restart the ypserv daemon by following the instructions in Starting and Stopping NIS Daemons.
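The process check above can be sketched as a small filter over the ps output. This is only an illustration: missing_nis_daemons is a hypothetical helper name, and the sketch assumes that the daemons appear in the process list under the names ypserv and ypbind.

```shell
# Hypothetical helper: reads `ps -ef` output on stdin and prints the name of
# each NIS daemon (ypserv, ypbind) that does not appear in it.
missing_nis_daemons() {
    ps_out=$(cat)
    for daemon in ypserv ypbind; do
        # Drop lines for grep itself, then look for the daemon name at the
        # start of a line or after a space or slash, followed by a space or
        # the end of the line.
        if ! printf '%s\n' "$ps_out" | grep -v grep |
                grep -Eq "(^|[ /])$daemon( |\$)"; then
            echo "$daemon"
        fi
    done
}
```

On the NIS server machine, ps -ef | missing_nis_daemons prints nothing when both daemons are running. Each name it prints is a daemon to start by following the instructions in Starting and Stopping NIS Daemons.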
When other machines on the network appear to have no problems, but NIS service becomes unavailable on your system, a variety of symptoms can occur:
For example, messages such as the following might be generated:
ypcat myfile
ypcat: can't bind to NIS server for domain <wigwam>
Reason: can't communicate with ypbind.
/usr/etc/yp/yppoll myfile
RPC: timed out
When symptoms like these occur, do the following:
If the ypbind daemon repeatedly crashes immediately after it is started, look for a problem in some other part of the system. First, check for the portmap daemon:
ps -ef | grep portmap
If the portmap daemon is not running, reboot the system.
Try to communicate with the portmap daemon on your machine from a different machine that is operating normally. From such a machine, type:
rpcinfo -p client
where client is the host name of the machine.
program  vers  proto  port
100007   2     tcp    1024  ypbind
100007   2     udp    1028  ypbind
100007   1     tcp    1024  ypbind
100007   1     udp    1028  ypbind
100021   1     tcp    1026  nlockmgr
100024   1     udp    1052  status
100020   1     udp    1058  llockmgr
100020   1     tcp    1028  llockmgr
100021   2     tcp    1029  nlockmgr
100012   1     udp    1083  sprayd
100011   1     udp    1085  rquotad
100005   1     udp    1087  mountd
100008   1     udp    1089  walld
100002   1     udp    1091  rusersd
100002   2     udp    1091  rusersd
100001   1     udp    1094  rstatd
100001   2     udp    1094  rstatd
100001   3     udp    1094  rstatd
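In a long listing it can be easier to pull out only the registrations for one program. The following is a minimal sketch, assuming the five-column layout shown above. rpc_registrations is a hypothetical helper name, and 100007 is the ypbind program number that appears in the listing.

```shell
# Hypothetical helper: reads `rpcinfo -p` output on stdin and prints the
# "version protocol port" of every registration for the given program number.
rpc_registrations() {
    awk -v prog="$1" '$1 == prog { print $2, $3, $4 }'
}
```

For example, rpcinfo -p client | rpc_registrations 100007 prints one line for each ypbind registration. If it prints nothing, ypbind is not registered with the portmap daemon on that machine.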
When you use the ypwhich command several times at the same client node, the response varies because the status of the NIS server changes. The status changes are normal.
The binding of an NIS client to an NIS server changes over time on a busy network, when the NIS servers are busy. Whenever possible, the system stabilizes so that all clients get acceptable response time from the NIS servers. The source of an NIS service is not important, because an NIS server machine often gets its own NIS services from another NIS server on the network.
NIS server problems can most commonly occur at the following times:
Because NIS works by propagating maps among servers, you can sometimes find different versions of a map at the network servers. This is normal only as a temporary situation. Normal update is prevented when an NIS server or a router between NIS servers is down during a map transfer attempt. When all the NIS servers and all the routers between them are up and running, the ypxfr command should run successfully. If a particular slave server has problems updating a map, use the following procedure to detect and solve the problem:
cd /var/yp
touch ypxfr.log
This saves all output from the ypxfr command to the ypxfr.log file. The output looks much like what the ypxfr command creates when it is run interactively, but each line in the log file is time stamped. The time stamp tells when the ypxfr command began its work. It is normal to see unusual orderings in the time stamps. If copies of the ypxfr command ran simultaneously but their work took differing amounts of time, their summary status lines may be written to the log file in an order that differs from the order in which the commands were invoked.
When the ypserv process repeatedly crashes immediately after it is started, the debugging process is similar to that described for ypbind crashes. First, check for the portmap daemon:
ps -ef | grep portmap
If you do not find the portmap daemon, reboot the server. If there is a portmap daemon, type:
rpcinfo -p hostname
where hostname is the host name of the NIS server.
On your particular machine, the port numbers will be different. The four entries that represent the ypserv daemon are:
100004   2     udp    1027  ypserv
100004   2     tcp    1024  ypserv
100004   1     udp    1027  ypserv
100004   1     tcp    1024  ypserv
If these entries do not exist, the ypserv daemon is unable to register its services. Reboot the machine. If the ypserv entries exist, but they change each time you try to restart the ypserv daemon, reboot the machine again.
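The check for the four ypserv entries can be sketched as a filter over the rpcinfo output. This is an illustration only: missing_ypserv_entries is a hypothetical helper name, and the sketch assumes the column layout and the program number 100004 shown above.

```shell
# Hypothetical helper: reads `rpcinfo -p` output on stdin and prints each
# expected ypserv registration (program 100004, versions 1 and 2, udp and tcp)
# that is absent from it.
missing_ypserv_entries() {
    awk '
        $1 == 100004 { found[$2 " " $3] = 1 }   # remember "version protocol"
        END {
            n = split("1 udp,1 tcp,2 udp,2 tcp", want, ",")
            for (i = 1; i <= n; i++)
                if (!(want[i] in found)) print want[i]
        }
    '
}
```

For example, rpcinfo -p hostname | missing_ypserv_entries prints nothing when all four entries are registered. To see whether the entries change between restarts, save the ypserv lines from rpcinfo -p hostname before and after restarting the daemon and compare the two copies.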