The approach to troubleshooting a Network Information Service (NIS) problem depends on whether the problem is at the NIS client or the NIS server.
The most common NIS client problems are hung commands, unavailable NIS service, a ypbind daemon that crashes, and inconsistent output from the ypwhich command.
When an AIX machine has two interfaces that are both given the same name, gethostbyname lookups for the rsh command fail if NIS is being used. AIX NIS returns only the first address found, not both. This is an implementation limitation imposed by the New Database Manager (NDBM) and by performance considerations. The error message is:
0826-825: there is a host address that does not match
The most common problem occurring at an NIS client node is for a command to hang. Sometimes a command appears to hang, even though the system seems fine and other commands run. In such a case, a message like the following can be generated at the console:
NIS: server not responding for domain <wigwam>. Still trying
This error message indicates that the ypbind daemon on the local machine is unable to communicate with the ypserv daemon in the wigwam domain. This results when systems that run the ypserv daemon have failed. It may also occur if the network or the NIS server machine is so overloaded that the ypserv daemon cannot get a response back to your ypbind daemon within the time-out period.
Under these circumstances, all the other NIS clients on your network show the same or similar problems. The condition is usually temporary. The messages go away when the NIS server machine reboots and the ypserv daemon restarts, or else when the load on the NIS server and the network decreases.
If the ypbind daemon is communicating with the ypserv daemon and the NIS server is not overloaded, one of the following problems may exist:
Find a client system that is working normally and run the ypwhich command on it. If the ypwhich command never returns an answer, stop it. Then type the following at the NIS server machine:
ps -ef | grep yp
Look for the ypserv and ypbind processes. If the server's ypbind daemon is not running, start it using the instructions in "Starting and Stopping the NIS Daemons".
If a ypserv process is running, issue the ypwhich command on the NIS server machine. If this command returns no answer, the ypserv daemon is probably hung and should be restarted. Stop and restart the ypserv daemon by following the instructions in "Starting and Stopping the NIS Daemons".
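The process check described above can be sketched as a small filter over ps -ef output. This is only an illustration: missing_yp_daemons is a hypothetical helper name, not an AIX command, and it assumes the daemon name appears in the command column either bare or as the last component of a path.

```shell
# Hypothetical helper: read `ps -ef` output on stdin and print the name of
# each NIS daemon (ypserv, ypbind) that does NOT appear in it.
missing_yp_daemons() {
    awk '
        /grep/ { next }    # ignore the grep process that matched "yp" itself
        $0 ~ "(^|[ /])ypserv( |$)" { serv = 1 }
        $0 ~ "(^|[ /])ypbind( |$)" { bind = 1 }
        END {
            if (!serv) print "ypserv"
            if (!bind) print "ypbind"
        }'
}
```

On the NIS server it could be used as `ps -ef | missing_yp_daemons`. Empty output means both daemons were found; it does not show whether a running ypserv daemon is hung, which still requires the ypwhich test.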
When other machines on the network appear to have no problems, but NIS service becomes unavailable on your system, a variety of symptoms can show up:
ypcat myfile
ypcat: can't bind to NIS server for domain <wigwam>
Reason: can't communicate with ypbind.
When symptoms like these occur, issue the ls -l command on a directory containing files owned by many users, including users not in the local machine's /etc/passwd file. Use the following format:
If the ls -l command reports file owners that are not in the local machine's /etc/passwd file as numbers, rather than names, it means that NIS service is not working.
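The numeric-owner symptom can also be checked mechanically. The following is a minimal sketch, assuming the usual ls -l layout in which the owner is the third field; numeric_owners is a hypothetical helper name, and file names that contain spaces are reported by their last word only.

```shell
# Hypothetical helper: read `ls -l` output on stdin and print
# "<uid> <name>" for every entry whose owner is shown as a number.
numeric_owners() {
    awk 'NF >= 9 && $3 ~ /^[0-9]+$/ { print $3, $NF }'
}
```

It could be run as `ls -l somedir | numeric_owners`, where somedir stands for whatever directory you chose. A numeric owner can also mean that the user ID is not defined anywhere, so treat the result as a hint rather than proof that NIS is down.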
These symptoms usually indicate that your ypbind daemon is not running. Use the ps -ef command to check whether it is running. If you do not find a ypbind daemon, start it by following the instructions in "Starting and Stopping the NIS Daemons".
If the ypbind daemon repeatedly crashes immediately after it is started, you should look for a problem in some other part of the system. As with ypserv crashes, check that the portmap daemon is running, and then list the registered RPC programs with the rpcinfo -p command. Output similar to the following shows the registered daemons:
program vers proto   port
 100007    2   tcp   1024  ypbind
 100007    2   udp   1028  ypbind
 100007    1   tcp   1024  ypbind
 100007    1   udp   1028  ypbind
 100021    1   tcp   1026  nlockmgr
 100024    1   udp   1052  status
 100020    1   udp   1058  llockmgr
 100020    1   tcp   1028  llockmgr
 100021    2   tcp   1029  nlockmgr
 100012    1   udp   1083  sprayd
 100011    1   udp   1085  rquotad
 100005    1   udp   1087  mountd
 100008    1   udp   1089  walld
 100002    1   udp   1091  rusersd
 100002    2   udp   1091  rusersd
 100001    1   udp   1094  rstatd
 100001    2   udp   1094  rstatd
 100001    3   udp   1094  rstatd
If the daemons are not listed, the ypbind daemon is unable to register its services. Reboot the machine.
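The registration check can be sketched as a filter over rpcinfo -p output. This is an illustration under stated assumptions: rpc_registered is a hypothetical helper name, the program number 100007 for ypbind is taken from the listing above, and "registered" is assumed to mean that the program appears over both udp and tcp.

```shell
# Hypothetical helper: read `rpcinfo -p` output on stdin and succeed (exit 0)
# only if the RPC program number given as $1 is listed for both udp and tcp.
rpc_registered() {
    awk -v prog="$1" '
        $1 == prog && $3 == "udp" { udp = 1 }
        $1 == prog && $3 == "tcp" { tcp = 1 }
        END {
            if (udp && tcp) exit 0
            exit 1
        }'
}
```

For example, `rpcinfo -p | rpc_registered 100007 || echo "ypbind is not registered"` prints the warning when the ypbind entries are missing.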
When you use the ypwhich command several times at the same client node, the response varies because the status of the NIS server changes. The status changes are normal.
The binding of an NIS client to an NIS server changes over time on a busy network when the NIS servers are busy. Whenever possible, the system stabilizes so that all clients get acceptable response time from the NIS servers. The source of an NIS service is not important, because an NIS server machine often gets its own NIS services from another NIS server on the network.
The most common NIS server problems are different versions of a map on different servers and a ypserv daemon that crashes.
Because NIS works by propagating maps among servers, different servers on the network can sometimes hold different versions of a map. This is normal if it is temporary and abnormal otherwise.
Normal update is prevented when an NIS server or a router between NIS servers is down during a map transfer attempt. When all the NIS servers and all the routers between them are up and running, the ypxfr command should succeed.
If a particular slave server has problems updating a map, you can log in to that server and run the ypxfr command interactively. If this command fails, an error message returns to tell you why, so that you can fix the problem. If the command succeeds, but you want to check it regardless, create a log file to enable logging of messages by typing the following:
cd /var/yp
touch ypxfr.log
This saves all output from the ypxfr command. The output looks much like what the ypxfr command creates when it is run interactively, but each line in the log file is time stamped. The time stamp tells when the ypxfr command began its work. It is normal to see unusual orderings in the time stamps. If copies of the ypxfr command ran simultaneously but their work took differing amounts of time, the summary status line may be written to the log files in an order that differs from the order in which they were invoked.
Any pattern of intermittent failure shows up in the log. After you fix the problem, turn off logging by removing the log file. If you forget to remove the log file, it grows without limit.
While you are logged in to the NIS slave server, inspect the system crontab entries and the ypxfr shell scripts they invoke.
Make sure that the NIS slave server is in the ypservers map. If not, the yppush command will not notify the slave server when a new copy of a map exists.
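One way to check membership in the ypservers map is to list the map keys with ypcat -k and look for the slave server's name. The sketch below assumes that each output line begins with the server's host name exactly as you pass it; in_ypservers is a hypothetical helper name.

```shell
# Hypothetical helper: read `ypcat -k ypservers` output on stdin and succeed
# (exit 0) only if the host name given as $1 is one of the keys.
in_ypservers() {
    awk -v host="$1" '
        $1 == host { found = 1 }
        END {
            if (found) exit 0
            exit 1
        }'
}
```

For example, `ypcat -k ypservers | in_ypservers speed || echo "speed is not in ypservers"`, where speed is the example host name used in this section. If the map stores fully qualified names, pass the fully qualified name.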
When the ypserv process repeatedly crashes immediately after starting and does not stay up with repeated activations, the debugging process is similar to that described for ypbind crashes. First, you should check for the portmap daemon:
ps -ef | grep portmap
If you do not find the portmap daemon, reboot the server. If there is a portmap daemon, type:
rpcinfo -p speed
where speed is the hostname of the NIS server.
On your particular machine, the port numbers will be different. The four entries that represent the ypserv daemon are:
100004 2 udp 1027 ypserv
100004 2 tcp 1024 ypserv
100004 1 udp 1027 ypserv
100004 1 tcp 1024 ypserv
If these entries do not exist, the ypserv daemon is unable to register its services. Reboot the machine. If the ypserv entries are present, but they change each time you try to restart the /usr/lib/netsvc/yp/ypserv daemon, reboot the machine again.
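To see whether the ypserv entries change between restarts, two rpcinfo -p listings can be compared. This is a minimal sketch, assuming that "change" means the version, protocol, or port of a 100004 entry differs; ypserv_ports and ports_changed are hypothetical helper names.

```shell
# Hypothetical helper: keep only the ypserv (program 100004) entries from
# `rpcinfo -p` output on stdin, as sorted "vers proto port" lines.
ypserv_ports() {
    awk '$1 == 100004 { print $2, $3, $4 }' | sort
}

# Hypothetical helper: succeed (exit 0) if the ypserv entries differ between
# two captured listings passed as the strings $1 and $2.
ports_changed() {
    before=$(printf '%s\n' "$1" | ypserv_ports)
    after=$(printf '%s\n' "$2" | ypserv_ports)
    [ "$before" != "$after" ]
}
```

A possible use is to capture `first=$(rpcinfo -p speed)` before restarting the ypserv daemon and `second=$(rpcinfo -p speed)` afterward, then run `ports_changed "$first" "$second" && echo "ypserv entries changed"`.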