
Installation and Migration Guide


Adding nodes

When you add a node, SP-attached server, or clustered enterprise server to your SP system, plan how it fits into your configuration. Consider the number of interfaces it will contain when planning your network configuration, and record this information in your SP configuration worksheets located in the RS/6000 SP: Planning, Volume 2, Control Workstation and Software Environment. You must also decide which node provides boot/install service for the new nodes: you can configure a new boot/install server for them or add them as clients of an existing boot/install server.

For more information on adding an extension node, refer to Chapter 10, Installing extension nodes.

Notes:

  1. When you add a node, ensure that the switch adapters are compatible with the switch adapters in the rest of your system and ensure that the nodes you are adding are supported by the switch in the system. For example, the SP Switch-8 only handles eight nodes.

  2. You cannot add a node in a location already defined for an SP Switch Router Adapter node or an SP-attached server. View the System Partition Map to determine which slots are valid for adding nodes.

  3. Do these steps in the system partition to which you add the nodes (SP Switch only).

  4. For SP Switch-8 users:

    Since the switch node numbers for these switches are computed sequentially, it is possible for new nodes to be numbered incorrectly when a group of nodes is added. You can avoid this problem by having your IBM Customer Engineer (CE) connect the nodes one at a time in ascending order. After each node is connected, verify that a node object has been created with the proper switch node number (see the example after these notes).

  5. If you are adding a new LPAR to an existing pSeries 690 server attached to your SP or clustered enterprise server system, perform the steps in Adding a new LPAR, then continue with the steps in this section.
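One way to perform the verification described in note 4 is to list the switch node numbers recorded in the SDR after each node is connected. This sketch uses the standard SDRGetObjects command; node_number and switch_node_number are attributes of the SDR Node class:

SDRGetObjects Node node_number switch_node_number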

Step 1: Archive the SDR

Note:
Perform this step only if you did not already back up the SDR as part of a preceding task.

Before reconfiguring your system, you should back up the SDR by issuing:

SDRArchive

Note the location and the name of the file created after you issue this command.
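SDRArchive also accepts an optional append string that is added to the archive file name, making the backup easier to identify later. For example (the append string is illustrative):

SDRArchive before_add_nodes

The archive is typically written under /spdata/sys1/sdr/archives.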

Step 2: Connect new nodes to the frame

Do this step if you are adding nodes to an existing frame. Your IBM Customer Engineer (CE) performs this step. See RS/6000 SP: Installation and Relocation for instructions.

When adding nodes with connected SP expansion I/O units, the I/O units must be powered on before the nodes are powered on in order for the connections to be properly recognized.

To add a new logical partition to an existing pSeries 690 attached server or clustered enterprise server, use the HMC interface to create the logical partition and assign resources to it. Once the partition has been created, hardmon automatically finds the new LPAR and a new node is added to the SDR for that pSeries 690 frame. To add a new LPAR, perform the following steps:

  1. Start the HMC interface. This can be done in one of the following ways:
  2. Select the pSeries 690 system object and follow HMC procedures for creating a new logical partition. Refer to the Hardware Management Console for pSeries Operations Guide for details on performing these operations. If this logical partition requires resources that are currently owned by other partitions in the server, follow the steps in Deleting Resources from an LPAR before proceeding with this step.
  3. View the properties for the LPAR and note the partition ID assigned by the HMC.

hardmon automatically finds the new LPAR, and a new node object is created in the SDR on the control workstation. The frame slot number of the LPAR is equal to the partition ID assigned to that LPAR by the HMC. This slot number is in turn used to derive the node number, following the standard PSSP frame-slot to node-number conversion algorithm.
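The standard PSSP conversion is node_number = (frame_number - 1) * 16 + slot_number. For example, an LPAR assigned partition ID 3 by the HMC in the pSeries 690 represented as frame 5 occupies frame slot 3 and becomes node number (5 - 1) * 16 + 3 = 67.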

Step 3: Update the state of the supervisor microcode

To ensure that you have the latest level of microcode required by the hardware on your SP system, issue the spsvrmgr command. For example, to get the status in report form of all of your frames, nodes, and switches, enter:

spsvrmgr -G -r status all

To update the microcode of the frame supervisor of frame 3, enter:

spsvrmgr -G -u 3:0
Note:
When using the spled application, a node in the process of having the microcode on its supervisor card updated will not be displayed in the window.

Refer to the PSSP: Command and Technical Reference for more information on using the spsvrmgr command.

Step 4: Enter the required node information

This step adds IP address-related information to the node objects in the SDR. It also creates adapter objects in the SDR for the SP Ethernet administrative LAN adapters on your nodes. This information is used during node customization and configuration.

Note:
The default route that you enter in this step is not the same as the default route on the node. The route that you enter here goes in the SDR Node class. It is the route over which the node communicates with its boot/install server (for example, install, customize, and so on). The default route must be a valid path from the SP Ethernet administrative LAN adapter to the node's boot/install server and the control workstation.

The default route on the node is the route it will use for its network communications if there is no specific route to the destination. During the boot process, this is set to the default route in the SDR. It can be changed later in the boot process or after the node is running, but it should not be changed permanently in the SDR. For FDDI, token ring, or other Ethernet adapters, create the route in firstboot.cust. The following example defines a route for an Ethernet adapter and also saves the route into the node's ODM.

# $lsattr, $grep, $awk, $tail, and $chdev are variables holding command
# paths, as used in firstboot.cust; replace <route> with your gateway address.
old_route_info=$($lsattr -E -l inet0 | $grep 'route *net,.*,0,[0-9.]*' | \
    $awk '{ print $2 }' | $tail -n 1)
if [[ -n "$old_route_info" ]]; then
    $chdev -l inet0 -a delroute="$old_route_info" > /dev/null 2>&1
fi
$chdev -l inet0 -a route="0,<route>"

In order for the route to remain set after customization, also set the route in /etc/inittab after the line that runs rc.sp. For the switch, set the route in /etc/inittab after the line that runs rc.switch.

Enter information about your nodes attached to each Ethernet adapter using Perspectives, SMIT, or the spadaptrs command.

The following example configures an SP Ethernet administrative LAN adapter network of 16 nodes with IP addresses ranging from 129.33.32.1 to 129.33.32.16, a netmask of 255.255.255.192, and a default route of 129.33.32.200 for a twisted-pair Ethernet using auto-negotiate for the communication transfer and rate.

spadaptrs -e 129.33.32.200 -t tp -d auto -f auto 1 1 16 en0 \
          129.33.32.1 255.255.255.192

The following example configures the adapter on the SP Ethernet administrative LAN for the first logical partition of a pSeries 690 server. The adapter is a twisted-pair Ethernet adapter with communication transfer and rate set to auto-negotiate. The IP address is 129.33.32.65 with a netmask of 255.255.255.192. The pSeries 690 server is represented as frame 5, the node is assigned slot 1, and the adapter is located at the physical location U1.9-P2-I2/E1.

spadaptrs -P U1.9-P2-I2/E1 -t tp -d auto -f auto 5 1 1 en \
          129.33.32.65 255.255.255.192

If you are adding an extension node to your system, you may want to enter required node information now. For more information, refer to Chapter 10, Installing extension nodes.

Step 5: Acquire the hardware Ethernet addresses

This step gets hardware Ethernet addresses for the SP Ethernet administrative LAN adapters for your nodes, either from a file or from the nodes themselves, and puts them into the Node objects in the SDR. That information is used to set up the /etc/bootptab files for your boot/install servers. This step also pings the default route set for this node.

If you know the hardware Ethernet addresses, you can speed this process by putting the addresses in the /etc/bootptab.info file. If you are performing this step for a pSeries 690 server, you may already have the hardware Ethernet addresses available to you from Step 4: Enter the required node information. Create the /etc/bootptab.info file with one line per node, giving the node number and the hardware Ethernet address.
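For example, a /etc/bootptab.info file for nodes 17 and 18 might look like the following; the node-number and hardware-address pairing is the expected format, and the addresses shown are illustrative:

17 02608C2E48D9
18 02608C2D6712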

The following example gets all hardware Ethernet addresses for an RS/6000 SP system:

sphrdwrad 1 1 rest

This example gets all hardware Ethernet addresses for the nodes specified in the node list (the -l flag):

sphrdwrad -l 10,12,17

If this step fails, look for the node conditioning instructions in the PSSP: Diagnosis Guide.

Step 6: Verify that the Ethernet addresses were acquired

This step verifies that Ethernet addresses were placed in the SDR node object.

Attention: If your system is large, splstdata returns great quantities of data. You may want to pipe the command output through a filter to reduce the amount of data you see.

To display SDR boot/install data, enter:

splstdata -b
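For example, to page through the output one screen at a time using the standard AIX pg filter:

splstdata -b | pg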

Step 7: Configure additional adapters for nodes

Perform this step if you have a switch or if you require any additional adapters.

Be sure to have your switch configuration worksheet on hand with all the switch information completed before attempting to perform this step. RS/6000 SP: Planning, Volume 2, Control Workstation and Software Environment explains how to fill out your worksheet.

This step creates adapter objects in the SDR for each node. The data in the adapter objects is used during the customization or installation steps to configure the adapters on the nodes. You can configure switch (css), Ethernet (en), token ring (tr), and FDDI (fi) adapters with this procedure.

To configure adapters such as ESCON and PCA, you must configure the adapter manually on each node using dsh or by modifying the firstboot.cust file.

Configuring the switch adapters

To configure your switch adapters for use with the RS/6000 SP system, use SMIT or issue the spadaptrs command. RS/6000 SP: Planning, Volume 2, Control Workstation and Software Environment contains additional information on IP addressing for the switch.

The following example adds SDR information for a css (SP Switch or SP Switch2) network of 30 nodes (frame 1 slot 1 to frame 2 slot 16, with a wide node as the first node in each frame and the rest thin nodes, and a switch on each frame) with IP addresses from 129.33.34.1 to 129.33.34.30 and a netmask of 255.255.255.0. The IP addressing corresponds to the slots in the frame, with each thin node incrementing the address by 1, each wide node by 2, and each high node by 4.

If you specify the -s flag to skip IP addresses when you are setting the css switch addresses, you must also specify -n no (do not use switch numbers for IP address assignment) and -a yes (use ARP).

spadaptrs -s yes -n no -a yes 1 1 30 css0 129.33.34.1 255.255.255.0

Configuring other additional adapters

To configure other additional adapters, for example Ethernet (en), token ring (tr), or FDDI (fi), you must select the Additional Adapter Database Information. For these adapters, you can select either the Start Frame, Start Slot, and Node Count fields, or the Node List field.

Notes:

  1. When using the token ring (tr) adapter, you must select the token ring rate (4 Mbps or 16 Mbps).

  2. For best results, exit and get back into this panel for each different type of adapter. This clears any extraneous values left behind in the panel.

  3. Enter a correct value for Ethernet speed (10, 100, 1000, or auto), Duplex (full, half, or auto), and Type (bnc, dix, tp, fiber, or NA) for every Ethernet adapter on each node.

  4. For pSeries 690 servers, specifying the adapter by its physical location code and adapter type is suggested, especially if there is more than one adapter of that type present in the node. The physical location code for an adapter can be determined in one of the following ways:

The distribution of your IP addresses determines how many times you perform this step. You may have to do it more than once if:

The following example adds SDR information for an fi0 (FDDI adapter) network of 30 nodes (frame 1 slot 1 to frame 2 slot 16, with a wide node as the first node in each frame and the rest thin nodes) with IP addresses from 129.33.34.1 to 129.33.34.30, and a netmask of 255.255.255.0. The IP addressing corresponds to the slots in the frame, with each wide node incrementing by 2 and each thin node incrementing by 1.

spadaptrs -s yes 1 1 30 fi0 129.33.34.1 255.255.255.0

This example adds SDR information for a tr0 (token ring adapter) for node 1 with IP address 129.33.35.1 and a netmask of 255.255.255.0, using the node list field (-l) and a 16 Mbps token ring rate (-r):

spadaptrs -l 1 -r 16 tr0 129.33.35.1 255.255.255.0

This example adds SDR information for an additional Ethernet adapter for the second logical partition in a pSeries 690 server. The adapter is a twisted-pair Ethernet adapter with duplex and speed set to auto-negotiate, and it is not the SP Ethernet adapter for the node. The IP address is 129.33.35.66 with a netmask of 255.255.255.0. The pSeries 690 server is represented as frame 5, the node is assigned slot 2, and the adapter is located at the physical location U1.9-P2-I2/E4.

spadaptrs -P U1.9-P2-I2/E4 -t tp -d auto -f auto 5 2 1 en \
          129.33.35.66 255.255.255.0

For nodes running DCE

You need to perform the following steps only if you have a new adapter on a node that is running DCE.

Note:
All adapters must have an ftp and a host account defined in the DCE database.
  1. Log in to the DCE cell as a cell administrator.
  2. Run kerberos.dce -type admin -ip_name hostname_of_adapter
    Note:
    You must either reboot or customize the node in order for the adapter to be configured.
  3. After the node has completed its reboot or customization, log in to the node as root and issue kerberos.dce -type local
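As a sketch, the complete sequence for one new adapter might look like the following; the cell administrator name and adapter host name are illustrative:

dce_login cell_admin
kerberos.dce -type admin -ip_name r05n07-en1.abc.com
    (reboot or customize the node, then log in to the node as root)
kerberos.dce -type local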

Step 8: Configure initial host names for nodes

Do this step if:

This step changes the default host name information in the SDR Node objects used during customization to set up the host name on each node, and allows you to indicate how you want to name your RS/6000 SP nodes. The default is the long form of the SP Ethernet administrative LAN adapter host name, which is how the spadaptrs command processes defaulted host names.

You can indicate an adapter name other than the SP Ethernet administrative LAN adapter for the node host names, as well as whether the long or short form should be used. When determining whether you want the nodes' host names in long or short form, be consistent with host name resolution on the control workstation. If the host command returns the short form of a host name, choose the short form for the node's initial host name.

Host names containing multibyte character data are not supported on the SP.

The following example indicates that the host name of each node is the long (fully qualified) form of the host name of the css0 adapter for a system with two frames and 32 nodes:

sphostnam -a css0 1 1 32

Step 9: Set up nodes to be installed

This step does the following:

The default installation assumes one of the following:

The default installation assumes your nodes have not been preinstalled. If you want to have them installed with your own install image, you must specify the following:

If you want different nodes to be installed by a different boot/install server, you must specify the target nodes and which node will serve as the boot/install server.

Selecting an installation disk

There are five ways you can specify the disk or disks to use for installation.

  1. The hardware location format

    IBM strongly suggests that you use this format for SCSI devices. It ensures that you install on the intended disk by targeting a specific disk at a specific location. The relative location of hdisks can change depending on the hardware installed or possible hardware failures. You should always use this format when external disk drives are present, because the manner in which the device names are defined may not be obvious and installation on external disk drives is not supported; targeting by physical location ensures that an internal disk is selected. For example, to specify a single SCSI drive, enter:

    00-00-00-0,0
    

    or enter multiple hardware locations separated by colons:

    00-00-00-0,0:00-00-00-1,0
    
  2. The device names format

    For example, to specify a single device name, enter:

    hdisk0
    

    or enter multiple device names separated by commas:

    hdisk0,hdisk1
    
  3. A combination of the parent and connwhere attributes for SSA devices

    To specify the parent-connwhere attribute:

    ssar//0123456789ABCDE

    or to specify multiple disks, separate using colons as follows:

    ssar//0123456789ABCDE:ssar//0123456789ABCDE

    The parent-connwhere format should only be used for SSA drives.

    For more information on acquiring ssar numbers, see AIX Kernel and Subsystems Technical Reference, Volume 2.

  4. The PVID format

    If a disk was previously configured as a physical volume in order for it to be assigned to a volume group, a physical volume identifier (PVID) was assigned to that disk by AIX. You can specify a disk by its PVID value as a string of 16 hexadecimal characters. For example:

    00d4c45202be737f

    To specify multiple disks by their PVID values, separate the specifications using colons:

    00d4c45202be737f:00d4c452eb639a2c

    Use the AIX lspv command to list the PVID values for the disks on your system. For more information on making an available disk a physical volume and setting its PVID, see AIX System Management Guide: Operating System and Devices.

  5. The SAN target and logical unit identifier format

    Fibre channel attached disks are identified by a worldwide port name and a logical unit identifier (LUN ID). To specify the SAN_DISKID, combine the two values into a single string separated by "//". For example, if the SAN target worldwide port name for a fibre channel attached disk is 0x50060482bfd12c9c and the LUN ID is 0x8000000000000000, the SAN_DISKID specification would be:

    0x50060482bfd12c9c//0x8000000000000000

    To specify multiple fibre channel disks, separate the specifications using colons:

    0x50060482bfd12c9c//0x8000000000000000:0x50060482bbffd7cb//0x0

    Use the AIX lsattr -EH -l hdisk_name command to determine the worldwide port name and LUN ID for a disk.

The hardware location, SSA parent-connwhere, PVID, and SAN_DISKID formats can be used together. Specify multiple mixed-format disk values using colons to separate the specifications as follows:

00-00-09-0,1:ssar//0123456789ABCDE:00d4c45202be737f

The device names format cannot be combined with any of the other format types.

You can use the spchvgobj command with the hardware location format for disk locations 00-07-00-0,0 and 00-07-00-1,0 for node 9. For example:

spchvgobj -r selected_vg -h 00-07-00-0,0:00-07-00-1,0 -l 9

If you need to change lppsource_name from default to a new lppsource_name such as aix433 for nodes 1 through 16, issue:

spchvgobj -r selected_vg -v aix433 1 1 16

If you need to change the install_image_name from default to an install_image_name such as bos.obj.ssp.433 for nodes 17, 18, 21, and 22, issue:

spchvgobj -r selected_vg -i bos.obj.ssp.433 -v aix433 -l 17,18,21,22

For more information on alternate root volume groups, see the "Managing root volume groups" appendix in PSSP: Administration Guide.

Note:
The AIX alt_disk_install function is not related to the SP alternate root volume group support and is not supported with PSSP installation.

Mirroring the root volume group

One way to significantly increase the availability of the SP system is to set up redundant copies of the operating system on different physical disks using the AIX disk mirroring feature. Mirroring the root volume group means that there will be multiple copies of the operating system image available to a workstation or node. Mirrored system images are distributed so that a node can remain in operation even after one of the mirrored units fails.

When installing a node, you have a choice of how many copies of the root volume group you would like. AIX allows one (the original), two (the original plus one), or three (the original plus two) copies of a volume group. IBM strongly suggests that the root volume group be mirrored for a total of at least two copies. PSSP provides commands to facilitate root volume group mirroring.

You can specify how many copies and which disks to use with the spchvgobj command. Care should be taken when specifying disks so that no other single point of failure is introduced. For example, the specified disks should not be attached to the same adapter.

The default setting for the number of copies is based on the node type. The default is one copy for all nodes except the POWER3 Symmetric Multiprocessor (SMP) High Node, which has a default of two copies. These nodes are assumed to contain dual internal disk drives as a standard configuration. The disks will automatically be used for mirroring. If these nodes were not configured with the dual internal disks or you do not want mirroring, use the spchvgobj command to change the settings before installing the node.

You can use the spchvgobj command with the hardware location format for disk locations 00-07-00-0,0 and 00-07-00-1,0 for node 9 and set the number of copies to two. For example:

spchvgobj -r selected_vg -h 00-07-00-0,0:00-07-00-1,0 -l 9 -c 2

For a complete description of how mirroring is handled by PSSP, see the "Managing root volume groups" appendix in PSSP: Administration Guide.

Step 10: Update the security information for the new nodes

The new nodes will need to be configured to the DCE database. Perform the following steps to add the new nodes. Step 1, Step 2, and Step 3 are required for DCE. Step 4 is required for both DCE and Kerberos V4.

  1. You must set a DCE host name for each node in the SDR. This step uses the nodes' reliable host name as the DCE host name if a DCE host name does not already exist. Run create_dcehostname to update the SDR Node class attribute dcehostname for the new nodes.
  2. Run setupdce so that the new nodes' principals can be added to the DCE registry.

    Notes:

    1. You must know the cell administrator password to perform this step.

    2. To run this command off of the SP, you must set the SP_NAME environment variable on the remote workstation to point to the SDR of the SP system being configured. The value must be a resolvable address. For example:
      export SP_NAME=spcws.abc.com
      
  3. Run config_spsec so that the new nodes' service principals can be created.

    Notes:

    1. You must have cell administrator authority to perform this step.

    2. To run this command off of the SP, you must set the SP_NAME environment variable on the remote workstation to point to the SDR of the SP system being configured. Refer to the config_spsec command in PSSP: Command and Technical Reference for a description of the -r (remote) flag.
  4. All nodes in the system partition need to be updated and, therefore, the control workstation's authorization files need to be updated as well. To create the authorization files issue updauthfiles.

Step 11: Refresh RSCT subsystems

The SDR has now been updated to reflect the new nodes that will run PSSP 3.4. You now need to refresh the RSCT subsystems on the control workstation and all nodes to pick up these changes. Run syspar_ctrl on the control workstation to refresh the subsystems on both the control workstation and the nodes:

syspar_ctrl -r -G

Step 12: Verify all node information

This step verifies that all the node information has been correctly entered into the SDR.
Use the splstdata command to display the data:

To display SDR:                Enter:

Site environment data          splstdata -e
Frame data                     splstdata -f
Node data                      splstdata -n
Adapter data                   splstdata -a
Boot/install data              splstdata -b
SP expansion I/O data          splstdata -x
SP security settings           splstdata -p
Switch data                    splstdata -s

If your system is large, splstdata returns great quantities of data. You may want to pipe the command output through a filter to reduce the amount of data you see.

Step 13: Add the new nodes to the root authorization files


The new nodes must be added to the root authorization files on previously installed nodes. Issue:

dsh -avG /usr/lpp/ssp/bin/updauthfiles

Step 14: Configure the boot/install server

Do this step if you have not run setup_server already using the SMIT Boot/Install Server Information window or the spbootins command.

This step uses the information entered in the previous steps to set up the control workstation and optional boot/install servers on nodes. It configures the control workstation as a boot/install server and configures the following options (when selected in your site environment):

You can perform this step more than once. If you encounter any errors, see the PSSP: Diagnosis Guide for further explanation. After you correct your errors, you can start the task again.

In previous releases of PSSP, most of the installation function that configures boot/install servers and clients was performed by a single program, setup_server, which you ran by issuing the setup_server command. This is still the suggested way to configure the control workstation. For more experienced system administrators, IBM provides a set of Perl scripts that perform the same configuration and let you diagnose how setup_server is progressing.

If you have a node defined as a boot/install server, you must also run setup_server on that server node. On the control workstation, enter the setup_server command with no parameters. For example:

setup_server

The first time setup_server runs, depending upon your configuration, it can take a significant amount of time to configure the control workstation as a NIM master.

Step 15: Change the default network tunable values

When a node is installed, migrated, or customized (set to customize and rebooted), and that node's boot/install server does not have a /tftpboot/tuning.cust file, a default file of system performance tuning variable settings (/usr/lpp/ssp/install/config/tuning.default) is copied to /tftpboot/tuning.cust on that node. You can override these values by following one of the methods described in the following list:

  1. Select an IBM-Supplied Alternate Tuning File

    IBM supplies three alternate tuning files which contain initial performance tuning parameters for three different SP environments:

    1. /usr/lpp/ssp/install/config/tuning.commercial contains initial performance tuning parameters for a typical commercial environment.
    2. /usr/lpp/ssp/install/config/tuning.development contains initial performance tuning parameters for a typical interactive/development environment.
    3. /usr/lpp/ssp/install/config/tuning.scientific contains initial performance tuning parameters for a typical engineering/scientific environment.
      Note:
      The SP-attached servers and clustered enterprise servers should not use the tuning.scientific file because of the large number of processors and the amount of traffic that they can generate.

    To select one of these files for use throughout the nodes in your system, use SMIT or issue the cptuning command (see the command-line example following the SMIT dialog below). When you select one of these files, it is copied to /tftpboot/tuning.cust on the control workstation and is propagated from there to each node in the system when it is installed, migrated, or customized. Each node inherits its tuning file from its boot/install server. Nodes whose boot/install server is another node (not the control workstation) obtain their tuning.cust file from that server node, so you must propagate the file to the server node before attempting to propagate it to the client node. The settings in the /tftpboot/tuning.cust file are maintained across a boot of the node.

  2. Create and Select Your Own Alternate Tuning File

    The following steps enable you to create your own customized set of network tunable values and have them propagated throughout the nodes in your system. These values are propagated to each node's /tftpboot/tuning.cust file from the node's boot/install server when the node is installed, migrated, or customized and are maintained across the boot of the node.

    1. On the control workstation, create the file /tftpboot/tuning.cust. You can choose to begin with a copy of the file located in /usr/lpp/ssp/samples/tuning.cust which contains a template of performance tuning settings which have been commented out. Or you may prefer to begin with a copy of one of the IBM-supplied alternate tuning files.
    2. Select the tunable values that are best for your system.
    3. Edit the /tftpboot/tuning.cust file by ensuring the appropriate lines are uncommented and that the tunable values have been properly set.

Using SMIT:

TYPE
    smit select_tuning

SELECT
    The desired tuning file
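From the command line, the equivalent selection can be made with cptuning. This is a sketch that assumes cptuning takes the source tuning file as its argument; see PSSP: Command and Technical Reference for the exact syntax:

cptuning /usr/lpp/ssp/install/config/tuning.commercial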

Once you have updated tuning.cust, continue installing the nodes. After the nodes are installed and customized, on all subsequent boots, the tunable values in tuning.cust will be automatically set on the nodes.

Note that each of the supplied network tuning parameter files, including the default tuning parameter file, contains the line /usr/sbin/no -o ipforwarding=1. IBM suggests that on non-gateway nodes, you change this line to read /usr/sbin/no -o ipforwarding=0. After a non-gateway node has been installed, migrated, or customized, you can make this change in the /tftpboot/tuning.cust file on that node.
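For example, after editing the file on a non-gateway node, you can confirm the changed line and the live setting (issuing no -o with no value displays the current setting):

grep ipforwarding /tftpboot/tuning.cust
no -o ipforwarding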

If you are configuring more than eight of one particular adapter type, you must change the ifsize parameter in the tuning.cust file.

For the latest performance and tuning information, refer to the RS/6000 Web site at:

http://www.rs6000.ibm.com/support/sp/perf

Step 16: Perform additional node customization

Do this step to perform additional customization such as:

IBM provides the opportunity to run customer-supplied scripts during node installation:

script.cust
This script is run from the PSSP NIM customization script (pssp_script) after the node's AIX and PSSP software have been installed, but before the node has been rebooted. It runs in a limited environment where not all services are fully configured. Because of this limited environment, restrict script.cust to functions that must be performed prior to the post-installation reboot of the node.

firstboot.cust
This script is run during the first boot of the node, immediately after it has been installed. It runs in a more "normal" environment where almost all services have been fully configured. This script is the preferred location for node customization functions that do not require a reboot of the node to become fully enabled.

firstboot.cmds
When in restricted root remote access mode and secure remote command mode, this sysctl script is run on the control workstation during node installation to copy critical files from the control workstation to the nodes. It is enabled in the firstboot.cust script. See the firstboot.cmds and firstboot.cust files for information on how to set up and enable this script for sysctl.
Note:
Your security environment is not set up during script.cust processing. If you are using AIX remote commands or SP Trusted Services, perform your customization during firstboot.cust processing. See Appendix E, User-supplied node customization scripts for additional information.

See Appendix E, User-supplied node customization scripts for more detailed information on:

Appendix E, User-supplied node customization scripts also discusses migration and coexistence issues and techniques to use the same set of customization scripts across different releases and versions of AIX and PSSP.

Note:
When PSSP installs a node, it uses the AIX sysdumpdev -e command to estimate the size of the dump for the node. PSSP creates a dump logical volume that is approximately 10 percent larger than the estimated dump size, and makes that logical volume the primary dump device. However, you may find that this dump device is not large enough to contain an entire dump due to large processes or applications running on your node.

Once your node is up and running, use:

sysdumpdev -e
To get the estimated size of the node's dump

sysdumpdev -l
To find the name of the primary dump device

lslv
To list the amount of space available in the primary dump device

extendlv
To expand the size of the dump logical volume if the estimated dump space is greater than the dump space available
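For example, assuming the primary dump device is the default hd7 logical volume (verify the actual name with sysdumpdev -l on your node):

sysdumpdev -e          # estimated dump size, in bytes
sysdumpdev -l          # primary dump device, for example /dev/hd7
lslv hd7               # logical partitions allocated to the dump device
extendlv hd7 2         # add two logical partitions if more space is needed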

Step 17: Additional switch configuration

Frames with Switches

If you have added a frame with a switch, you will need to perform Step 17.1: Select a topology file through Step 17.4: Storing the switch topology file in the SDR. If you have an SP Switch, you will also need to perform Step 17.5: Set the switch clock source for all switches (SP Switch only).

Nodes or SP-attached servers (SP switch only)

If you have only added nodes or an SP-attached server, you will need to perform only Step 17.3: Annotating a switch topology file and Step 17.4: Storing the switch topology file in the SDR.

Step 17.1: Select a topology file

Select the correct switch topology file by counting the number of node switch boards (NSBs) and intermediate switch boards (ISBs) in your system, then apply these numbers to the naming convention. The switch topology files are in the /etc/SP directory on the control workstation.

NSBs are switches mounted in slot 17 of frames containing nodes, or SP Switch2 switches mounted in slots 2 through 16 of frames designated as multiple NSB frames. Multiple NSBs are used in systems that require a large number of switch connections for SP-attached servers or clustered enterprise server configurations. ISBs are switches mounted in the switch frame. ISBs are used in large systems, where more than four switch boards exist, to connect many processor frames together. SP-attached servers never contain a node switch board; therefore, never include non-SP frames when determining your topology files.

The topology file naming convention is as follows:

expected.top.NSBnumnsb.ISBnumisb.type

where NSBnum is the number of NSBs, ISBnum is the number of ISBs, and type identifies the system type.

For example, expected.top.2nsb.0isb.0 is a file for a two-frame, two-switch system with no ISB switches.

The exception to this naming convention is the topology file for the SP Switch-8 configuration, which is expected.top.1nsb_8.0isb.1.

See the Etopology command in PSSP: Command and Technical Reference for additional information on topology file names.

Step 17.2: Managing the switch topology files

The switch topology file must be stored in the SDR. The switch initialization code uses the topology file stored in the SDR when starting the switch (Estart). When the switch topology file is selected for your system's switch configuration, it must be annotated with Eannotator, then stored in the SDR with Etopology. The switch topology file stored in the SDR can be overridden by having an expected.top file in /etc/SP on the primary node. Estart always checks for an expected.top file in /etc/SP before using the one stored in the SDR. The expected.top file is used when debugging or servicing the switch.

Notes:

  1. Be aware that Estart distributes the topology file to all the nodes in the system partition on the switch. In the case of expected.top, this is significant because if the topology file is left on a node and the primary is changed to that node, the topology file will be used. If you have an expected.top file in /etc/SP on any of the nodes, make sure that you remove it when it is no longer needed.

  2. Depending upon your configuration, the first Estart of the switch may take longer than subsequent Estarts.

  3. In a two-plane SP Switch2 system, the function of /etc/SP/expected.top is taken over by /etc/SP/expected.top.p0 for plane 0 and by /etc/SP/expected.top.p1 for plane 1.
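Referring to note 1, a leftover topology file can be removed from a node with a remote command; the node name shown is illustrative:

dsh -w k3n05 "/bin/rm /etc/SP/expected.top"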

Step 17.3: Annotating a switch topology file

Use the Eannotator command to update the switch topology file's connection labels with their correct physical locations. Use the -O yes flag to store the switch topology file in the SDR. Using Eannotator makes the switch hardware easier to debug because the switch diagnostics information is based on physical locations.

For example, to annotate a two-switch or maximum 32-node system, enter:

Eannotator -F /etc/SP/expected.top.2nsb.0isb.0 \
           -f /etc/SP/expected.top.annotated -O yes

Step 17.4: Storing the switch topology file in the SDR

If you entered Eannotator -O yes or yes on the Topology File Annotator menu in Step 17.3: Annotating a switch topology file, skip this step.

Use the Etopology command to store the switch topology file in the SDR and make sure that it has been annotated. For example, to store a two-switch or maximum 32-node configuration, enter:

Etopology /etc/SP/expected.top.2nsb.0isb.0.annotated

Step 17.5: Set the switch clock source for all switches (SP Switch only)

Use SMIT or the Eclock command to initialize the switch's clock source. The SMIT and Eclock interfaces require that you know the number of Node Switch Boards (NSBs) and Intermediate Switch Boards (ISBs) in your RS/6000 SP system.

Select the Eclock topology file from the control workstation's /etc/SP subdirectory, based on these numbers. For example, if your RS/6000 SP system has six node switch boards and four intermediate switch boards, you would select /etc/SP/Eclock.top.6nsb.4isb.0 as an Eclock topology file.

See PSSP: Command and Technical Reference for the Eclock topology file names.

Use the Eclock command to set the switch's clock source for all switches.

Continuing the previous example, enter:

Eclock -f /etc/SP/Eclock.top.6nsb.4isb.0

This command sets the proper clock source settings on all switches within a 96-way (6 NSB, 4 ISB) RS/6000 SP system.

To verify the switch configuration information, enter:

splstdata -s

Step 18: Redefine system partitions (SP Switch or switchless systems only)

If you want to partition your system, you can select an alternate configuration from a predefined set of system partitions to implement before booting the nodes or you can use the System Partitioning Aid to generate and save a new layout. Follow the procedure described in the "Managing system partitions" chapter in the PSSP: Administration Guide and refer to information in "The System Partitioning Aid" section of the "Planning SP system partitions" chapter in the RS/6000 SP: Planning, Volume 2, Control Workstation and Software Environment. You do not have to partition your system now as part of this installation. You can partition it later.

Note:
System partitioning is not supported on clustered enterprise servers or on systems with an SP Switch2 switch.

If you added a frame with a switch, or a frame in a switchless system, you need to redefine your system partition configuration to match the hardware. At this point, do not move any existing nodes to different system partitions. If you want to reconfigure your system partitions after completing this task, see PSSP: Administration Guide for more information.

Step 19: Network boot optional boot/install servers

If you are adding a node or nodes that will function as a boot/install server, you will need to perform this step.

SP Switch note: If you have set up system partitions, do this step in each partition.

SP-attached server and clustered enterprise server note: If you do not want to reinstall your existing SP-attached server or clustered enterprise server, but want to preserve its environment, perform Step 19.1: Upgrade AIX through Step 19.10: Reboot.

To monitor installation progress by opening the node's read-only console, issue:

s1term frame_id slot_id

If you have eight or more boot/install servers on a single Ethernet segment, you should network boot those nodes in groups of eight or less. See the "IP performance tuning" section in RS/6000 SP: Planning, Volume 2, Control Workstation and Software Environment for more information.

To network boot your nodes, issue:

nodecond frame_id slot_id &
Note:
For MCA nodes, the nodecond command remotely processes information from the initial AIX firmware menus. You should not change the language option on these menus. The language must be set to English in order for the nodecond command to run properly.

To check the LCD and LED display for each node, enter:

spmon -Led node<node_number>

or

spled &

Network installation progress

When a network installation is in progress, the LEDs for the nodes involved show various values that indicate the installation stage. Because the node installation process can be long, these values help you determine where you are in the process. Refer to PSSP: Diagnosis Guide for a complete list of PSSP-specific LED values.

Note:
Perform Step 19.1: Upgrade AIX through Step 19.10: Reboot only if you want to preserve the environment of your existing SP-attached server or clustered enterprise server.

SP-attached server and clustered enterprise server installation

Perform the following steps to add an SP-attached server or clustered enterprise server and preserve your existing software environment.

Step 19.1: Upgrade AIX

If your SP-attached server or clustered enterprise server is not at AIX 4.3.3, you must first upgrade to that level of AIX before proceeding.

Step 19.2: Set up name resolution of the SP-attached server or clustered enterprise server

In order to do PSSP customization, the following must be resolvable on the SP-attached server or clustered enterprise server:

Step 19.3: Set up routing to the control workstation host name

If you have a default route set up on the SP-attached server or clustered enterprise server, you will have to delete it. If you do not remove the route, customization will fail when it tries to set up the default route defined in the SDR. In order for customization to occur, you must define a static route to the control workstation's host name. For example, if the control workstation's host name resolves to its token ring address, such as 9.114.73.76, and your gateway is 9.114.73.254, issue:

route add -host 9.114.73.76 9.114.73.254

Step 19.4: FTP the SDR_dest_info file

During customization, certain information will be read from the SDR. In order to get to the SDR, you must FTP the /etc/SDR_dest_info file from the control workstation to the /etc/SDR_dest_info file on the SP-attached server or clustered enterprise server and check the mode and ownership of the file.
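A minimal sketch of the transfer, run on the SP-attached server; the control workstation host name is illustrative, and the mode and ownership shown should be checked against the control workstation's copy:

ftp spcws
ftp> get /etc/SDR_dest_info /etc/SDR_dest_info
ftp> quit
chmod 644 /etc/SDR_dest_info
chown root /etc/SDR_dest_info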

Step 19.5: Verify perfagent

Ensure that perfagent.tools 2.2.32.x is installed on your SP-attached server or clustered enterprise server.
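To check the installed level, issue the standard AIX lslpp command:

lslpp -l perfagent.tools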

Step 19.6: Mount the pssplpp directory

Mount the /spdata/sys1/install/pssplpp directory on the boot/install server from the SP-attached server or clustered enterprise server. For example, issue:

mount k3n01:/spdata/sys1/install/pssplpp /mnt

Step 19.7: Install ssp.basic

Install ssp.basic and its prerequisites onto the SP-attached server or clustered enterprise server. For example, issue:

installp -aXgd /mnt/PSSP-3.4 ssp.basic 2>&1 | tee /tmp/install.log

Step 19.8: Unmount the pssplpp directory

Unmount the /spdata/sys1/install/pssplpp directory on the boot/install server from the SP-attached server or clustered enterprise server. For example, issue:

umount /mnt

Step 19.9: Run pssp_script

Run the pssp_script by issuing:

/usr/lpp/ssp/install/bin/pssp_script

Step 19.10: Reboot

Perform a reboot. For example:

shutdown -Fr

Step 20: Verify that System Management tools were correctly installed on the boot/install servers

Now that the boot/install servers are powered up, run the verification test from the control workstation to check for correct installation of the System Management tools on these nodes.

To do this, enter:

SYSMAN_test

After the tests are run, the system creates a log in /var/adm/SPlogs called SYSMAN_test.log.

See the section on "Verifying System Management installation" in the PSSP: Diagnosis Guide for information on what this test does and what to do if the verification test fails.

Step 21: Network boot the remaining RS/6000 SP nodes

SP Switch note: If you have set up system partitions, do this step in each partition.

Repeat the procedure used in Step 19: Network boot optional boot/install servers to network boot and install, or to customize, the remaining nodes. Ensure that all setup_server processes have completed on the boot/install nodes before issuing a network boot on the remaining nodes. Refer to the /var/adm/SPlogs/sysman/node.console.log file on the boot/install node to see whether setup_server has completed.
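For example, to inspect the end of the log on a boot/install server node (the node name is illustrative; on your system the log file name may include the node's host name):

dsh -w k3n01 "tail /var/adm/SPlogs/sysman/node.console.log"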

If any of your boot/install servers have more than eight clients on a single Ethernet segment, you should Network Boot those nodes in groups of eight or less. See the "IP performance tuning" section in RS/6000 SP: Planning, Volume 2, Control Workstation and Software Environment for more information.

Using a token ring-bridge gateway

If you are using a token ring through a bridge as your default gateway to your nodes, and the token ring bridge is not on the same segment as your LAN, you must change the value of the broadcast field in the ODM for each node. The default value is set to No (confine broadcast to the local token ring) each time you install or customize a node; with this bridge setup, however, the default value leaves the network unusable when you boot the nodes.

To do this, enter:

chdev -P -l tr0 -a allcast=off

Step 22: Verify node installation

To check the hostResponds and powerLED indicators for each node, enter:

spmon -d -G

Step 23: Verify the SP expansion I/O unit configuration

To verify that the SP expansion I/O unit is properly configured in the SDR, issue:

splstdata -x

Step 24: Enable the s1_tty on the SP-attached server or clustered enterprise server (SAMI hardware protocol only)

If you just installed an SP-attached server or clustered enterprise server, you must ensure that the s1_tty is enabled on the server. Until the login is enabled on the tty, the s1term command from the control workstation to the SP-attached server or clustered enterprise server will not work.

On the SP-attached server or clustered enterprise server, determine which tty is mapped to 01-S1-00-00. For example, issue the following:

lsdev -C -c tty

In response, the system displays something similar to:

tty0 Available 01-S1-00-00 Asynchronous Terminal
tty1 Available 01-S2-00-00 Asynchronous Terminal

In the previous example, tty0 is mapped to 01-S1-00-00.

Set the login to enable. For example, issue the following:

chdev -l tty0 -a login=enable

Step 25: Start the switch (optional)

Do this step if you have a switch installed in your system. If you have set up system partitions (SP Switch only), do this step in each partition.

Estart
Note:
If you are using the Switch Admin daemon for node recovery, start it by issuing startsrc -s swtadmd on SP Switch systems or startsrc -s swtadmd2 on SP Switch2 systems before issuing the Estart command.

Step 26: Verify that the switch was installed correctly

Run a verification test to ensure that the switch is installed correctly. To do this, enter:

CSS_test

After the tests are run, the system creates a log in /var/adm/SPlogs called CSS_test.log.

If the verification test fails, see the section on "Diagnosing switch problems" in the PSSP: Diagnosis Guide.

To check the switchResponds and powerLED indicators for each node, enter:

spmon -d -G

Step 27: Tune the network adapters for added nodes

Various models of network adapters can have different values for transmit and receive queue sizes. The queue setting for Micro Channel adapters is 512. For PCI adapters, the queue setting is 256 or greater.

Note:
For AIX 4.2.1, the receive queue size is not tunable.

You can set these values using SMIT or the chdev command. If the adapter you are changing is also the adapter for the network you are logged in through, you must make the changes to the ODM database only, using the -P flag. To do this, enter:

chdev -P -l ent0 -a xmt_que_size=256

You must reboot the nodes in order for the changes to take effect.

Step 28: Reconfigure LoadLeveler to add the new node to the LoadLeveler cluster

If you are using LoadLeveler as your workload management system, add the new node to the LoadLeveler configuration. For more information on this step, see the "Administration tasks for parallel jobs" chapter in IBM LoadLeveler for AIX 5L: Using and Administering.

