Most of the error symptoms have obvious solutions, such as "add the user to the database" or "have the user issue the correct login command". Errors of this type do not have responses and recovery actions listed here.
Some errors are not normally seen, but can be generated and then surface through the remote commands. Errors of this type are included here for reference.
Use this table to locate the problem symptom and the corresponding recovery action.
Table 51. Remote command (rsh/rcp) symptoms and recovery actions
Symptom | Recovery action |
---|---|
SIGINT error from rsh. | See SIGINT error from rsh command. |
Protocol failure errors - Kerberos V4. | See Protocol failure errors - Kerberos V4. |
Exec Format Error. | See Protocol failure errors - Kerberos V4. |
SP services fail due to Kerberos V5 remote command authentication problems. | See SP Services fail when using Kerberos V5. |
Non-SP system applications fail when using rsh or rcp. | See Non-SP System applications fail when using rsh or rcp commands. |
kshell or tcp: Unknown service. | See Unknown service: kshell or TCP. |
Unable to rsh, telnet, rlogin. ping shows host is up. | See Unable to rsh, rcp, telnet, rlogin, but ping shows host is up. |
Error messages pointing to or from authentication subsystem. | See |
Error messages indicating SP Security configuration problems. | See |
Kerberos rsh message 0041-010 Cannot import nflag or options variables. | See Kerberos rsh message 0041-010. |
Error messages pointing to connection problems. | See Remote command connection problems. |
Error messages pointing to server configuration errors (Kerberos V5). | See Remote command server configuration errors. |
Remote to remote rcp errors. | See Remote to remote rcp errors. |
Cannot contact KDC for realm (Kerberos V5). | See Cannot contact KDC (Kerberos V5). |
Decrypt integrity check failed (Kerberos V5). | See Decrypt integrity check failed (Kerberos V5). |
Unexpected remote command authorization failures. | See Remote command authorization failures. |
The SIGINT error from rsh is received if rsh is issued at the command line and placed in the background without the -n flag. The recovery action is to use the -n flag whenever rsh is issued in the background.
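The same rule applies when rsh is started from a script instead of the command line. The following is a minimal sketch, not a definitive implementation. The helper names `background_rsh_argv` and `run_in_background`, the host name `node05`, and the command `date` are illustrative assumptions that do not come from this guide.

```python
import subprocess


def background_rsh_argv(host, command):
    """Build an rsh argument list that is safe to run in the background.

    The -n flag makes rsh take its input from /dev/null, so the
    background process never tries to read from the terminal.
    """
    return ["rsh", "-n", host, command]


def run_in_background(host, command):
    """Start rsh without waiting for it to finish (hypothetical helper)."""
    return subprocess.Popen(
        background_rsh_argv(host, command),
        stdin=subprocess.DEVNULL,  # extra safety: no terminal input at all
    )


if __name__ == "__main__":
    # node05 and date are placeholder values chosen for illustration.
    print(" ".join(background_rsh_argv("node05", "date")))
```

The argument list is built separately from the process launch so that the command can be inspected or logged before it runs.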
A Kerberos V4 protocol failure message is usually received when the rsh client passes through a message from either the krshd server or one of the lower-level Kerberos V4 libraries, where an error occurred. The "passed through" message is normally appended to this message or appears on the next line.
If the message does not concern items like network errors or connection errors, the first recovery action is to verify that krshd is not having problems, by setting up syslog to capture any daemon error messages. See Error information. Remember to set up syslog on the target machine, not on the machine where you issued the rsh command.
Any error messages logged from krshd should help pinpoint the problem and a possible solution. The messages range from "cannot locate servers" or "cannot reach server" to errors caused by an incorrect principal format for authentication or authorization.
While a complete list of "passed through" error messages cannot be specified, some general types of errors and recovery actions are helpful:
Run the proper SP configuration scripts to update the database and authorization files.
For more information, see Diagnosing SP Security Services problems.
The Exec Format Error is returned when the return code of the Kerberos V4 or compatibility library call krb_recvauth is handled incorrectly. This routine returns either an error number or a message number. If an error number is returned and is treated as a message number, this error results. See Information to collect before contacting the IBM Support Center and contact the IBM Support Center.
SP services can fail using the remote commands with the Kerberos V5 (through DCE) authentication method because credentials have expired and the services cannot obtain new ones when the key used to log in has also expired.
An SP service must log in to DCE to obtain credentials for use with the remote commands. If the principal's key ("password") has expired, the service cannot log in to obtain credentials. Remote command failures may be the first indication of this situation, with other seemingly unrelated failures occurring as well. This type of error may affect services on only one node if that node was down when an automatic refresh of keys took place.
For information on how to handle this situation, see Diagnosing SP Security Services problems.
The AIX /usr/bin/rsh and /usr/bin/rcp commands now support Kerberos V5 (through DCE), Kerberos V4 (through an SP-supplied library) and Standard AIX. These commands expect users and applications to have the proper tickets or credentials for the authentication mechanism installed, and the method enabled on the SP system.
Non-SP system applications, such as database or workflow applications, may need to be updated to obtain tickets or credentials for the authentication method enabled. These applications may also need to be updated to place the proper principals, and create the proper accounts in the authentication database or DCE registry for that application's use.
The only bypass for applications that do not support the authentication methods installed, configured, and enabled on the SP system is to disable all authentication methods except for AIX Standard and to put in place the proper authorization files (.rhosts).
The "Unknown service" message for kshell or tcp usually means that the proper krshd configuration has not been done. The primary cause is a missing or commented-out kshell line in the /etc/services file on the source host, the target host, or both.
Verify that the following line is valid in the /etc/services file on both the source and target host:
kshell 544/tcp krcmd
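The check for the kshell line can be scripted. The following is a minimal sketch that looks for an active (not commented-out) kshell 544/tcp entry in the text of a services file. The function name `has_kshell_service` and the sample strings are assumptions made for illustration. Run the check on both the source and the target host.

```python
def has_kshell_service(services_text):
    """Return True if an active 'kshell 544/tcp' entry is present."""
    for raw in services_text.splitlines():
        # Everything after '#' is a comment, so a commented-out
        # entry becomes an empty line here and is skipped.
        line = raw.split("#", 1)[0].strip()
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "kshell" and fields[1] == "544/tcp":
            return True
    return False


def check_services_file(path="/etc/services"):
    """Read the services file on this host and run the check."""
    with open(path) as handle:
        return has_kshell_service(handle.read())


if __name__ == "__main__":
    # Sample text for illustration; use check_services_file() on a real host.
    sample = "shell 514/tcp cmd\n#kshell 544/tcp krcmd\n"
    print("kshell entry active:", has_kshell_service(sample))
```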
Being unable to use rsh, rcp, telnet, or rlogin while ping shows that the host is up can indicate one of two problems.
Verify that the following line is valid in the /etc/inetd.conf file on the target host for rsh and rcp problems:
kshell stream tcp nowait root /usr/sbin/krshd krshd
You can get a variety of error messages, or remote commands may hang under these circumstances. You can also validate an inetd problem by setting up syslog to capture log errors from these daemons to see if the daemons start.
It may help to stop and restart the inetd daemon with the AIX stopsrc and startsrc commands. Otherwise, consult the AIX manuals for help in debugging an inetd problem.
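The inetd.conf verification described above can be sketched in the same way. This example compares the active kshell entry with the line shown in this section. The function names and the sample text are illustrative assumptions, and the sketch treats any line that starts with # as a comment.

```python
# Fields of the kshell line shown in this section.
EXPECTED_FIELDS = ["kshell", "stream", "tcp", "nowait", "root",
                   "/usr/sbin/krshd", "krshd"]


def find_kshell_entry(inetd_conf_text):
    """Return the fields of the active kshell entry, or None if absent."""
    for raw in inetd_conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and commented-out entries
        fields = line.split()
        if fields[0] == "kshell":
            return fields
    return None


def kshell_entry_ok(inetd_conf_text):
    """True if the active kshell entry matches the expected fields."""
    fields = find_kshell_entry(inetd_conf_text)
    return fields is not None and fields[:7] == EXPECTED_FIELDS


if __name__ == "__main__":
    # Sample text for illustration; read /etc/inetd.conf on the target host.
    sample = "#kshell stream tcp nowait root /usr/sbin/krshd krshd\n"
    print("kshell entry valid:", kshell_entry_ok(sample))
```

A result of False only shows that the entry is missing, commented out, or different from the expected line. It does not show whether inetd has reread the file.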
When you receive messages indicating a problem with the underlying authentication mechanism, it is best to consult the proper manual for diagnosis and recovery information. Although the remote commands would not be the only function to see errors from the authentication mechanism, the remote commands may be the first indication of these errors.
Some authentication mechanism problems that can directly affect the remote commands are:
As indicated in the dependency section, the remote commands rely on the authentication method being installed and configured on the SP system, configured for SP system use and enabled. Depending on the authentication mechanism, the SP install and configure scripts perform a number of functions for the proper setup of the system.
Configuration and enablement of an authentication method depends on the choices you have made on the SP Security SMIT panels, and the resulting installation and configuration of the nodes after those choices have been made.
Some of the errors that directly affect the remote commands include:
For more information, consult Diagnosing node installation problems in this book, PSSP: Installation and Migration Guide, and RS/6000 SP: Planning Volume 2.
Message 0041-010 is received if you have not installed the AIX APAR IX85420 but have updated your SP system with ssp.clients fixes. Install the AIX APAR to obtain the companion fix that allows the Kerberos rsh routine to obtain the variables it needs from the AIX rsh client. The minimum AIX file set required is bos.net.tcp.client.4.3.2.4. If you have any questions, contact the IBM Support Center.
The target host may be in the process of rebooting, or the inetd system may be down. Check whether the krshd daemon is listed in the file /etc/inetd.conf on the target system. If you had to add the krshd daemon, be sure to stop and restart the inetd subsystem.
A "Kerberos V5 Connection abort" message means that DCE was deinstalled, but a DCE unconfig_admin was never done for the target.
For a "Kerberos V5 Connection ended by software" message, DCE configuration is incomplete. Client services are only partially available.
The authentication services (servers and clients) may be inactive even though the services are still enabled. The authentication services may have been unconfigured. Use the lsauthent and lsauthpar commands to verify which authentication services are enabled, then check that they are running.
In order for remote to remote rcp to work, credentials must be present on the intermediate source host. Kerberos V5 (through DCE) supports forwarding of credentials. Kerberos V4 does not support forwarding of credentials. With a choice of authentication now available, this command may or may not work, depending on the authentication methods enabled on the systems involved.
For example, suppose that you have three hosts named HostA, HostB, and HostC, and you issue the following command from HostA:
rcp HostB:/file HostC:/file
The following table shows what can happen and some of the necessary requirements.
Table 52. Remote to remote rcp
A "Cannot contact KDC for realm" message indicates that the /etc/krb5.conf file contains an error in the realms stanza: the kdc= entry does not contain the correct address for the security server's location. Usually, the problem is that the kdc entry address is on a different subnet than the current host.
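To review the kdc entries quickly, the file can be scanned with a short script. The following is a minimal sketch that assumes the entries are written as kdc = address or kdc = address:port with numeric IPv4 addresses. The function names, the realm name, and the addresses in the sample are hypothetical.

```python
import ipaddress
import re


def kdc_hosts(krb5_conf_text):
    """Return the host part of every kdc entry in the file text."""
    values = re.findall(r"^[ \t]*kdc[ \t]*=[ \t]*(\S+)",
                        krb5_conf_text, flags=re.MULTILINE)
    # Drop an optional :port suffix (assumes IPv4 or host names).
    return [value.split(":", 1)[0] for value in values]


def on_subnet(address, subnet):
    """True if a numeric address lies inside the given subnet."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(subnet)


if __name__ == "__main__":
    # Hypothetical realm, address, and subnet, for illustration only.
    sample = (
        "[realms]\n"
        "    EXAMPLE.REALM = {\n"
        "        kdc = 192.168.5.10:88\n"
        "    }\n"
    )
    local_subnet = "192.168.4.0/24"
    for host in kdc_hosts(sample):
        print(host, "on local subnet:", on_subnet(host, local_subnet))
```

A kdc address on another subnet is not always wrong, so treat a mismatch as a hint to confirm the security server's real address instead of as proof of an error.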
A "Decrypt integrity check failed" message usually indicates that a node's DCE services are out of sync with the rest of the DCE cell. Perform these steps:
If you experience unexpected failure when issuing remote commands, check the following:
Table 53. Secure remote command symptoms and recovery actions
Symptom | Recovery action |
---|---|
Parallel command (dsh, pcp) hangs with secure shell enabled | See Action 1. |
Parallel command (dsh, pcp) is using the wrong remote command method (rsh versus secure shell). | See Action 2. |
Secure connection to hostname refused. | See Action 3. |
After node install with secure shell enabled, failures in remote copy of files from the control workstation to the nodes from firstboot.cust. | See Action 4. |
Action 1. Perform these steps:
Action 2. Check your settings of the RCMD_PGM, DSH_REMOTE_CMD, and REMOTE_COPY_CMD environment variables. These variables determine which remote command method is used by parallel commands.
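The three variables can be displayed together before a parallel command is run. The following is a minimal sketch. The function name `remote_command_settings` is an assumption, and the sketch only reports what is set, without judging which values are correct for your system.

```python
import os

# Variables named in this section that select the remote command method.
REMOTE_COMMAND_VARS = ("RCMD_PGM", "DSH_REMOTE_CMD", "REMOTE_COPY_CMD")


def remote_command_settings(environ=None):
    """Return a dict of the three variables; unset ones map to None."""
    env = os.environ if environ is None else environ
    return {name: env.get(name) for name in REMOTE_COMMAND_VARS}


if __name__ == "__main__":
    for name, value in remote_command_settings().items():
        shown = value if value is not None else "(not set)"
        print(f"{name}={shown}")
```

Run the script in the same shell session that issues dsh or pcp, because the variables are read from that session's environment.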
Action 3. Check that the sshd daemon is running on the host that is listed in the error message received when the connection failed.
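A rough first test can be run from the machine that reported the failure. The following is a minimal sketch that assumes sshd listens on TCP port 22, the usual default, which may differ on your system. The function name and the host name `node05` are hypothetical. An open port only suggests that a daemon is listening, so confirm on the host itself that the process is sshd.

```python
import socket


def port_accepts_connections(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unknown host names.
        return False


if __name__ == "__main__":
    # node05 is a placeholder; use the host named in the error message.
    host = "node05"
    print(host, "accepts connections on port 22:",
          port_accepts_connections(host))
```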
Action 4. Check that the secure remote command program was installed, and that the daemon was started by script.cust on the node. Ensure that the sshd daemon was put in the /etc/inittab file right after the rctcpip entry.
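The position of the sshd entry in /etc/inittab can be checked with a short script. The following is a minimal sketch that assumes the usual identifier:runlevel:action:command layout and treats lines that begin with a colon as comments. The function names and the sshd line in the sample, including its path, are hypothetical.

```python
def inittab_entries(inittab_text):
    """Return (identifier, command) pairs for the active entries."""
    entries = []
    for raw in inittab_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(":"):
            continue  # blank line or comment (assumed ':' prefix)
        parts = line.split(":", 3)
        if len(parts) == 4:
            entries.append((parts[0], parts[3]))
    return entries


def sshd_follows_rctcpip(inittab_text):
    """True if the entry right after rctcpip starts an sshd command."""
    entries = inittab_entries(inittab_text)
    for index, (identifier, _command) in enumerate(entries[:-1]):
        if identifier == "rctcpip":
            return "sshd" in entries[index + 1][1]
    return False


if __name__ == "__main__":
    # Sample text for illustration; the sshd line is a made-up example.
    sample = (
        "rctcpip:2:wait:/etc/rc.tcpip > /dev/console 2>&1\n"
        "sshd:2:once:/usr/local/sbin/sshd\n"
    )
    print("sshd follows rctcpip:", sshd_follows_rctcpip(sample))
```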