IBM Netfinity Availability Extensions for Microsoft Cluster Service
README File for ServicePack1d

This README file contains the latest hints and tips to enhance the
reliability and operation of your Netfinity Cluster. Refer to the
"IBM Netfinity Availability Extensions for Microsoft Cluster Service
Software Installation and User's Guide" for complete installation and
configuration instructions.

Major Changes From Previous Version
___________________________________

a. After installing the base product, there is a separate procedure
   for installing the ServicePack. All nodes must be on the same
   version of the product. The details of this procedure are
   described below.

b. The ServicePack contains the CHCS Hotfix that was required for the
   base product. For new installations, the separate step of
   installing the CHCS Hotfix is not required.

c. Starting with ServicePack1, there is a recommended but not
   required change to the configuration files. The service pack
   contains a program that updates the configuration files. This is
   described below in the section "Installation and Uninstall of the
   ServicePack."

d. There is now a separate Japanese version of the product. That
   version contains all of the fixes in the service pack. For
   Japanese systems only, the service pack does not need to be
   installed.

e. Important Note: For Netfinity Availability Extensions for MSCS to
   support Lotus Domino 5, Domino transaction logging must be turned
   on. To enable Domino transaction logging, follow the directions in
   the Transaction Logging chapter of the Domino Release 5
   "Administering the Domino System" manual.

CONTENTS
________

 *1.0 Fibre Channel Controller Changes (required).
 *2.0 Fibre Channel Adapter Changes (required).
 *3.0 Operating Procedures (required).
  4.0 Installation Tips and Troubleshooting Hints.
  5.0 Uninstall Tips and Troubleshooting Hints.
  6.0 IBM System Manager Tips and Troubleshooting Hints.
  7.0 Adding Shared Storage to an Existing Cluster.
  8.0 Installation and Uninstall of the ServicePack.
  9.0 Netfinity SP Switch Service Installation.
 10.0 Adding a Node to an Existing Cluster.
 11.0 DB2 Installation Instructions.
 12.0 DB2 Tips and Troubleshooting Hints.
 13.0 Trademarks and Notices.

 * THESE CHANGES ARE MANDATORY FOR THE CLUSTER TO FUNCTION PROPERLY.

1.0 Fibre Channel Controller Changes (required).
________________________________________________

To ensure proper operation of the Fibre Channel Controller in the
Netfinity Availability Extensions environment, the following updates
must be performed. These instructions assume that the user is
familiar with SYMplicity Storage Manager (SYMSM) and the Fibre
Channel Controllers. These Fibre Channel Controller updates should be
done prior to configuration of LUNs and installation of MSCS. If you
are using SYMSM 6.22, follow the directions in section 1.1. If you
are using SYMSM 7, follow the steps in section 1.2.

NOTE: You must use SYMSM 7 with the EXP200.

1.1 Setup using SYMSM 6.22

Firmware file '93010222.apd' and NVRAM files 'softreseton.def' and
'nofua.def' are located on the IBM Netfinity Availability Extensions
CD.

Prerequisites:

1. Both controllers should be online and set to active-active
   (dual-active) mode.
2. The server node being used to load new controller firmware must
   see both controllers as "optimal". To view the status of the
   controllers, use the SYMplicity Storage Manager (SYMSM) Recovery
   utility and select "Options | Manual Recovery | Controller Pair ...".
3. No I/O should be issued to the Fibre Channel Controllers during
   the firmware upgrade.
4. These instructions assume some familiarity with SYMplicity Storage
   Manager (SYMSM) and the Fibre Channel Controllers.

a. Fibre Channel Controller firmware update.

   Perform the following steps to update the Fibre Channel Controller
   firmware:

   1. Run the SYMSM "Maintenance & Tuning" utility.
   2. Select "Options | Firmware Upgrade ...".
   3. Click the 'OK' button in the next window.
   4. Click the 'Online' button for the upgrade type.
   5. Enter the full path and file name of the firmware file
      '93010222.apd' in the path field (for example,
      D:\FibreChannel\93010222.apd) and click the 'OK' button to
      start the firmware update. Firmware will be downloaded to
      controller B followed by controller A.

b. Fibre Channel Controller NVRAM update.

   Perform the following steps to update the "Soft Reset Enable"
   setting:

   1. From a Windows NT command prompt, change directory to the
      SYMplicity Storage Manager program directory.
   2. Run "nvutil -v -f -n softreseton.def" to set the "soft reset"
      enable bit.
   3. Verify the option is set by using "nvutil -o 33". All
      controllers should show offset 0x33 with a value of 0x01.

   Perform the following steps to update the "Force Unit Access"
   setting:

   1. From a Windows NT command prompt, change directory to the
      SYMplicity Storage Manager program directory.
   2. Run "nvutil -v -f -n nofua.def" to set the "Force Unit Access"
      bit to off.
   3. Verify the option is set by using "nvutil -o 28". All
      controllers should show offset 0x28 with a value of 0x4_ (the
      first hex digit of the value at offset 0x28 should be 4; the
      second digit may vary).

c. Turn the Fibre Channel Controllers "OFF" then back "ON".
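   As an illustration, the two NVRAM updates in step b above can be
   applied and verified from a single batch file. This is a sketch
   only: the SYMSM program directory shown is an assumption (adjust
   the path for your installation), and it assumes the .def files
   have been copied there from the CD.

      @echo off
      rem Sketch: apply and verify both NVRAM settings.
      rem The SYMSM path below is an assumption; use your actual path.
      cd "C:\Program Files\SYMSM"
      nvutil -v -f -n softreseton.def
      nvutil -o 33
      rem Offset 0x33 should show 0x01 on all controllers.
      nvutil -v -f -n nofua.def
      nvutil -o 28
      rem Offset 0x28 should show 0x4_ on all controllers.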
1.2 Setup using SYMSM 7

To correct timing anomalies between the switch and controller in a
Netfinity Availability Extensions for MSCS configuration, perform the
following steps. The file "fccfg_fe.dlp" is located in the "Fibre
Channel Controller" directory of this install.

1. At the Netfinity Storage Manager Enterprise Management screen,
   double-click the specific storage subsystem unit to be updated.
2. Once you enter the Netfinity Storage Manager Subsystem Management
   screen, right-click the storage subsystem unit in the left pane
   and select "Download | NVSRAM".
3. Browse to the directory containing the "fccfg_fe.dlp" file and
   select the file.
4. Initiate the download.

2.0 Fibre Channel Adapter Changes (required).
_____________________________________________

a. Fibre Channel Adapter NVRAM update.

   To ensure proper operation of the Fibre Channel adapters in the
   Netfinity Availability Extensions environment, the following
   update must be performed. These instructions assume that the user
   is familiar with updating NVRAM on the Fibre Channel Adapters.
   This Fibre Channel Adapter update should be done prior to
   configuration of LUNs and installation of MSCS. The
   self-extracting diskette file (2100bios.exe) can be found on the
   following Web site:

      http://www.pc.ibm.com/support

   Perform the following steps to update the Fibre Channel Adapter
   NVRAM:

   1. Create a Fibre Channel Adapter BIOS diskette.
   2. Insert the diskette into the server and boot the server.
   3. At the command prompt, run "ql2xutil /L /U".
   4. Remove the diskette and restart the server.
   5. When the QLogic adapters are loading at boot up, press Alt-Q.
   6. Select an adapter.
   7. Select "Advanced Configuration Settings".
   8. Verify that the menu option "Enable Target Reset" is set to
      "yes".
   9. Repeat steps 6-8 for the other adapter.
   10. Exit the Alt-Q setup and reboot.

3.0 Operating Procedures (required).
____________________________________

a. Nodes in the cluster should be powered up one at a time and
   allowed to boot completely to the NT logon screen before powering
   up successive nodes.

b. Netfinity Availability Extensions must be started completely on
   each node before starting the service on successive nodes. In
   addition, all of a node's resources must be brought completely
   online on that node before starting the IBMCS service on the
   other nodes.

c. Ensure that both Fibre Channel controllers are brought online and
   set to dual-active mode before powering up and restarting NT on
   any node in the cluster.

4.0 Installation Tips and Troubleshooting Hints.
________________________________________________

a. Symptom: Failure retrieving IP Address network
            Failure retrieving cluster name
            Failure searching for Cs_topology in Cscluster.cfg

   Action: These messages indicate that a node in the cluster is
   unavailable. All nodes must be running to perform configuration.
   During the installation process, all nodes in the cluster must be
   restarted and running before starting the configuration process
   from the last node. If you have already installed Netfinity
   Availability Extensions on all the nodes and encounter one of
   these errors, you must reinstall Netfinity Availability
   Extensions.

b. Symptom: Cluster component cannot be installed after the Manager
   component is installed.

   Action: To install the Cluster component on a system where the
   Manager component is installed, first uninstall the Manager
   component, then install the Cluster component, and then reinstall
   the Manager component.

c. Symptom: Quorum drive letter cannot be changed.

   Action: Once the Quorum drive letter is defined during the
   installation, it cannot be redefined. You must reinstall Netfinity
   Availability Extensions to define a different drive letter for the
   Quorum drive.

d. Symptom: One or more nodes fail to boot past the NT kernel screen.

   Action: Shut down each node and power up one at a time (wait for
   the node to boot completely to the NT logon screen before starting
   the next node).

e. Symptom: Installation or startup of the Japanese version of the
   product fails.

   Action: The installation directory path, cluster name, and node
   names may contain only ASCII (English) characters. Ensure that
   there are no Japanese characters in the installation path, the
   cluster name, or the node names.

f. Symptom: Unable to communicate with other nodes in the cluster.
   Unable to ping other nodes in the cluster by name.

   Action: A DNS entry for each node in the cluster must be defined
   on a DNS server in the same network as the cluster, or in a
   "hosts" file on each node in the cluster.
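   For illustration, a minimal "hosts" file for a two-node cluster
   might contain entries like the following (the node names and
   addresses are hypothetical; substitute your own). The file is
   located at \WINNT\system32\drivers\etc\hosts on each node.

      # Hypothetical cluster node entries
      192.168.1.11    NODE1
      192.168.1.12    NODE2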
5.0 Uninstall Tips and Troubleshooting Hints.
_____________________________________________

a. Symptom: Drives are missing in Disk Administrator after
   uninstalling Netfinity Availability Extensions.

   Action: If drives used as Disk Resources are missing from Disk
   Administrator after uninstalling Netfinity Availability
   Extensions, uninstall MSCS on each node in the cluster, then
   reinstall MSCS and Netfinity Availability Extensions to recover
   the missing drives.

b. Symptom: Uninstall will not delete log files.

   Action: The uninstall program will not delete log files created
   after the install process. The log files must be deleted manually.
   The following log files will not be deleted: cs*.log, ibmcs.log,
   and *.trc.

6.0 IBM System Manager Tips and Troubleshooting Hints.
______________________________________________________

a. Symptom: Unable to connect multiple sessions of the IBM System
   Manager GUI to the same cluster.

   Action: More than one connection by the IBM System Manager GUI to
   the same cluster is not supported. This includes remote
   connections and server connections.

b. Symptom: Unable to double-click resources in the Dependencies
   screen.

   Action: Double-click selection on the Dependencies screen is not
   supported. You must select the resource, then use the arrow
   buttons to move the resource into a dependent state.

c. Symptom: No errors displayed when scheduling "Group Online" for a
   group that is already online.

   Action: Using the Scheduler function of IBM System Manager to
   schedule a group to be brought online that is already online will
   not produce an error message.

d. Symptom: Unable to create an IIS Virtual Root resource.

   Action: The creation of an IIS Virtual Root resource is not
   supported.

e. Symptom: "A cluster node is not available" when deleting a
   Physical Disk resource.

   Action: All nodes in the cluster should be online when deleting a
   Disk Resource.

f. Symptom: Cannot delete an online resource or online group.

   Action: To delete a resource or group, you must take the resource
   or group offline, then delete it.

g. Symptom: Dependent resources are not brought online when a group
   is moved to another node.

   Action: Manually bring the offline resources online, then manually
   refresh the IBM System Manager.

h. Symptom: Unable to discover clusters that are on a different
   network.

   Action: Clusters in different networks must be defined on a DNS
   server in the same network, or in a hosts file on the workstation
   where you are running IBM System Manager.

i. Symptom: Pop-up message "Access denied" when accessing the IBM
   System Manager GUI.

   Action: Close the IBM System Manager GUI, then restart it.

j. Symptom: ChClusterGroup takes 6 minutes to move from the current
   node to another node.

   Action: When moving ChClusterGroup, or scheduling the move of
   ChClusterGroup, it may take as long as 6 minutes for the group to
   move (2 intervals in the Scheduler).

k. Symptom: Extra drive letters show up in Disk Administrator after
   starting servers.

   Action: This sometimes occurs when nodes are started at the same
   time. To correct this problem, start each node one at a time
   (waiting for the node to start completely to the logon screen
   before starting the next one).

l. Symptom: IBM System Manager does not refresh automatically.

   Action: Auto refresh of IBM System Manager is not supported. Press
   the "F5" key to update IBM System Manager. If IBM System Manager
   does not refresh after pressing the "F5" key, restart the IBM
   System Manager.

m. Symptom: FAT-formatted shared storage used as a Disk Resource does
   not work.

   Action: FAT-formatted shared storage is not supported. Use NTFS
   format on shared storage drives used for Disk Resources.

n. Symptom: Domino service will not restart after failback.

   Action: Reboot the node before failing back the Domino service.

7.0 Adding Shared Storage to an Existing Cluster
________________________________________________

This procedure is written to add additional LUNs to an existing
cluster. IBM Netfinity Availability Extensions for Microsoft Cluster
Service clusters may have up to seven LUNs. You cannot use this
procedure to increase DASD capacity in an already-defined LUN in an
existing cluster.

1. Add the necessary hardware for the new drives. (For details on how
   to add the devices, refer to the documentation accompanying the
   hardware.)
2. Make backup images of all nodes in the cluster and of the existing
   shared storage drive data.
3. Stop IBM Netfinity Availability Extensions for Microsoft Cluster
   Service on all nodes in the cluster.
4. Shut down and power off all cluster nodes. Power off the Fibre
   Channel Controller, and then power off all of the external SCSI
   drive enclosures.
5. Add the new device or devices (refer to the documentation
   accompanying the hardware).
6. Power on all of the external SCSI drive enclosures.
7. Power on the Fibre Channel Controller.
8. Turn on one cluster node. Using the IBM SYMplicity Storage Manager
   'Configuration' utility, create LUNs with the newly added SCSI
   drives. Wait until the 'Configuration' utility finishes formatting
   the LUNs. The LUNs should show up as 'optimal' in the
   'Configuration' utility window, and the 'access' LEDs on the SCSI
   drives should stop blinking.
9. Exit the 'Configuration' utility and reboot the node.
10. Once the node is rebooted, open the Windows NT 'Disk
    Administrator' program to verify that the new LUNs show up as
    available disks. If not, reboot the node one more time.
11. Use the Windows NT 'Disk Administrator' program to create
    partitions, assign drive letters, and format NT disk partitions
    on the new LUNs.
12. Reboot the node.
13. Once the node is rebooted, start IBM Netfinity Availability
    Extensions for Microsoft Cluster Service and open the IBM
    Netfinity Availability Extensions for Microsoft Cluster Service
    Manager to define the Physical Disk (PD) resources. Do not try to
    bring the PD resources online yet. Create PD resources for all of
    the newly defined disks. Once all of the PD resources are
    created, reboot the node.
14. Once the node is rebooted, start IBM Netfinity Availability
    Extensions for Microsoft Cluster Service, open the IBM Cluster
    Manager, and bring the PD resources online.
15. Start the other nodes in the cluster and verify that the PD
    resources can be moved to and brought online on those nodes.

8.0 Installation and Uninstall of the ServicePack.
__________________________________________________

a. To install ServicePack1d, perform the following steps on each
   node.

   Update the executable and DLL files:

   1. Stop all cluster services on all nodes. This includes
      CHSchedulerService, CHAlertService, Cluster Service, IBMCS, and
      SCHEDSRV.
   2. Copy the ServicePack1d self-extracting file (CHR1SP1d.exe) to
      any directory on the node.
   3. Enter "CHR1SP1d". This will cause the current files to be
      backed up and the new files to be installed.

   Update the configuration files:

   4. Change to the %CS_INSTALL_DIR%\bin directory.
   5. Enter "ApplyService -updateconfig".

b. To uninstall ServicePack1d, perform the following steps on each
   node.

   1. Stop all cluster services on all nodes. This includes
      CHSchedulerService, CHAlertService, Cluster Service, IBMCS, and
      SCHEDSRV.
   2. Change to the %CS_INSTALL_DIR%\bin directory.
   3. Enter "ApplyService -uninstall". This will remove the service
      pack files and restore the previous version.

   Uninstall will not remove the file "ApplyService.exe". This file
   must be deleted manually.

   The configuration files are not updated when the service pack is
   uninstalled. The change to the configuration files that was made
   when the service pack was installed is recommended even if the
   service pack is uninstalled.

   If the original version of the product is going to be uninstalled,
   the service pack should be uninstalled first.
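   As an illustration, the install sequence in step a above could be
   scripted on each node along the following lines. This is a sketch
   only: it assumes the service names listed above are valid
   arguments to "net stop" (verify the exact names in the Services
   window first) and that CHR1SP1d.exe was copied to the current
   directory.

      rem Sketch: stop cluster services, apply ServicePack1d, and
      rem update the configuration files. Service names are
      rem assumptions taken from the list in step a.1 above.
      net stop CHSchedulerService
      net stop CHAlertService
      net stop SCHEDSRV
      net stop IBMCS
      net stop "Cluster Server"
      CHR1SP1d
      cd %CS_INSTALL_DIR%\bin
      ApplyService -updateconfig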
9.0 Netfinity SP Switch Service Installation
____________________________________________

a. Configuration of the Netfinity SP switch service with the MSCS
   installation.

   NOTE: The IBM Netfinity SP switch should be installed and
   configured with "Fabric" started before proceeding to step 1.

   1. Install MSCS as usual. However, before clicking 'OK' at the end
      of the MSCS installation, open the Windows NT "Services" window
      and change the startup type of the "Cluster Server" service
      from "automatic" to "manual". Click the "OK" button on the MSCS
      end-of-installation dialog to reboot the server.
   2. Once the system is rebooted, open the IBM Netfinity SP switch
      manager to make sure that the switch and "Fabric" have started
      before bringing MSCS online. If the MSCS service starts before
      the switch "Fabric" has started, MSCS will have an incorrect
      "private network name", which will prevent IBM Netfinity
      Availability Extensions for Microsoft Cluster Service from
      being installed successfully.

b. Configuration of the Netfinity SP switch service with the IBMCS
   service.

   1. At the end of the Netfinity Availability Extensions
      installation on each node, select the option "No, I want to
      restart my computer later" and click OK.
   2. Open the Windows NT "Services" window and change the startup
      type of the IBMCS service from "automatic" to "manual".
   3. Reboot the node.
   4. Make sure that the IBM Netfinity SP switch and "Fabric" have
      started on each node before continuing with the Netfinity
      Availability Extensions installation on the last node.

c. Configuration of the Netfinity SP switch 'depend' utility.

   NOTE: The IBM Netfinity SP switch 'depend' utility is used to set
   the 'IBMCS' service to start automatically after the IBM Netfinity
   SP switch and "Fabric" are up. To install this utility, go to the
   Netfinity SP Switch directory of this install package and enter
   "install".

   1. Open the Windows NT Services window and change the 'IBMCS'
      service startup type to 'automatic'.
   2. Run the executable 'depend.exe' to invoke the utility. A window
      will appear. First, use the pull-down list to select the
      service that the other services will depend upon. The service
      to select is ibmtb3_IP_x, where x is the adapter instance (for
      example, ibmtb3_IP_0). Then, in the first pane, select the
      services that will depend on the ibmtb3_IP_x service: select
      both the IBMCS and Cluster Server services. Click 'OK' to save
      the choices. From now on, every time a node is rebooted, the
      IBMCS and 'Cluster Server' services will not start until the
      ibmtb3_IP_x service has started. Repeat this for each instance
      of the communications adapter in the system.
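   As an aside, the startup type changed in sections a and b above is
   stored in the registry under the service's key, so it can also be
   checked there if the Services window is unavailable. A minimal
   sketch (the IBMCS key name follows the service name used above):

      HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\IBMCS
          Start (REG_DWORD): 2 = Automatic, 3 = Manual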
10.0 Adding a Node to an Existing Cluster
_________________________________________

You must install the IBM Netfinity Availability Extensions for MSCS
software on each node you want to add to the cluster. With IBM
Netfinity Availability Extensions for MSCS, you can create a cluster
of up to eight nodes.

1. Click Control Panel | Services | IBM Cluster Systems Management |
   Stop to stop IBM Netfinity Availability Extensions for MSCS. (This
   must be done on each existing node in the cluster.)

2. Install IBM Netfinity Availability Extensions for MSCS on the node
   you want to add to the cluster. For detailed installation
   instructions, see "Installing IBM Netfinity Availability
   Extensions for MSCS" on page 11 of the User's Guide.

   Reminder: Make certain you use the same directory locations and
   addresses for the new node that are used by the existing nodes in
   the cluster. Also, do not reconfigure the cluster when prompted
   during installation: click No in the Configure Cluster message
   window.

3. Do the following:
   a. Log on to the primary node (the node used for the original
      cluster configuration).
   b. Using the DOS command line or Windows NT Explorer, copy the
      following files located in the Configuration directory to the
      Config directory of the node you just added:

      CSCLUSTER.CFG
      CSCOMPUTER.CFG
      CHCONFIG.TXT

4. Open the CSCOMPUTER.CFG file in a text editor (for example,
   Notepad or DOS Editor), and do the following:
   a. Scroll the page to locate the following lines:

      CS_COMPUTER{
      ID=n

      where n is the numeric value representing the total number of
      nodes participating in the cluster.
   b. Edit the ID total to reflect the node you just added.
   c. Save this information.

5. Open the CSCLUSTER.CFG file in a text editor, and do the
   following:
   a. Scroll the page to locate the following lines:

      CS_TOPOLOGY{
      COMPUTER ID=n

      where n is the numeric value representing the total number of
      nodes participating in the cluster.
   b. Type the IP address for the current server in the appropriate
      line.
   c. Save this information.

6. Open the CHCONFIG.TXT file in a text editor, and do the following:
   a. Type the new node name, adding it at the bottom of the file
      (make sure you enter an additional carriage return after
      entering the new node name).
   b. Save this information.

   (A hypothetical sketch of the edits in steps 4 through 6 appears
   at the end of this section.)

7. From the DOS command line or Windows NT Explorer, copy the edited
   CHCONFIG.TXT and CSCLUSTER.CFG files to all nodes participating in
   the cluster.

8. From the new node, do the following:
   a. Rename CLUSAPI.DLL to MSCLUSAPI.DLL in the \WINNT\SYSTEM32
      directory.
   b. Copy \CLUSTER\BIN\IBMCLUSAPI.DLL to
      \WINNT\SYSTEM32\CLUSAPI.DLL.

9. From the DOS command line of the primary node (Node 1), type
   CHCONFIG -INSTALL, and then press Enter.

10. From the DOS command line or Windows NT Explorer, do the
    following:
    a. Start the IBM Netfinity Availability Extensions for MSCS
       service on the primary node (Node 1).
    b. After Node 1 has started, start IBM Netfinity Availability
       Extensions for MSCS on the remaining nodes in the cluster.
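   For illustration only, here is what the edits in steps 4 through 6
   might look like when adding a fifth node named NODE5 to a
   four-node cluster. All names are hypothetical, and only the
   relevant lines of each file are shown; the arrow annotations are
   not part of the files.

      CSCOMPUTER.CFG (step 4):

         CS_COMPUTER{
         ID=5                <- was ID=4 before the node was added

      CSCLUSTER.CFG (step 5):

         CS_TOPOLOGY{
         COMPUTER ID=5
         (add the new node's IP address on the appropriate line)

      CHCONFIG.TXT (step 6):

         NODE1
         NODE2
         NODE3
         NODE4
         NODE5               <- new node name, followed by an
                                additional carriage return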
11.0 DB2 Installation Instructions (DB2 6.1).
_____________________________________________

1. Install Universal Database Extended Enterprise Edition on every
   computer.
   a. Make sure to reserve enough ports for all DB2 nodes to fail
      over to a single computer.

2. Install intermediate FixPak 2 (or above) for DB2. This file is
   named NAESP1d-DB2.zip and is located on the same Web page as this
   README file and the NAESP1d service pack.
   a. This is FixPak 2 with additional fixes for IBM Netfinity
      Availability Extensions for MSCS.

3. Copy the db2cDB2MPP and db2iDB2MPP entries from the services file
   on the instance-owning computer to the services files on every
   other computer.
   a. This enables clients to connect to every computer.

4. Modify the DB2 environment variables from the DB2 command prompt:
   a. Run db2_all "db2set DB2_FALLBACK=ON"
   b. Run db2_all "db2set DB2_NUM_FAILOVER_NODES=*"
      1) Set * to the total number of DB2 nodes that were installed.
         This is the maximum number of DB2 nodes that one computer
         can run.

5. Check to make sure that DB2 is working properly:
   a. Run 'db2start' from the DB2 command prompt.
      1) Make sure that all of the DB2 nodes start.
   b. Run 'db2stop' to stop the DB2 nodes.

6. If 8 nodes are being used, the last DB2 node needs to be removed
   from the instance because of an eight-LUN limitation with the
   Fibre Channel RAID controller.
   a. Use the command 'db2ndrop /n:7' ONLY if no databases exist.
   b. If a database already exists, use the
      'db2stop drop nodenum 7' command.

7. Install IBM Netfinity Availability Extensions for MSCS.
   a. When adding computers to IBM Netfinity Availability Extensions
      for MSCS, the computers must be added in the same order as they
      were added to the DB2 instance. This is done during the
      configuration step of the IBM Netfinity Availability Extensions
      for MSCS install. If the order of the DB2 nodes is unclear,
      check the 'db2nodes.cfg' file in the instance directory.

8. Create a disk group for the DB2 migration:
   a. Create a single group which contains all of the disk resources
      that will be used for the DB2 nodes.
      1) After creating the disk resources, make sure to restart all
         of the machines before bringing them online.
   b. Bring the disk resources online.

9. Modify db2mscs.eee (a hypothetical sketch of this file appears
   after this list):
   a. This file is located in c:\sqllib\cfg\ (by default).
   b. DB2_INSTANCE should be 'db2mpp' (by default).
   c. DB2_LOGON_USERNAME should be the logon ID used during the
      install of DB2.
   d. DB2_LOGON_PASSWORD should be the password used during the
      install of DB2.
   e. CLUSTER_NAME should be the name of the IBM Netfinity
      Availability Extensions for MSCS cluster, NOT one of the
      individual MSCS cluster names.
   f. There are only 2 node groups by default (0 and 1). You will
      need to create another one for every DB2 node in the cluster.
      If the eighth DB2 node was dropped (see above), then only 7
      groups need to be created.
   g. Make sure that the GROUP_NAME is different for every group.
   h. Make sure that the DB2_NODE values go from 0 to the total
      number of DB2 nodes, minus one.
   i. Modify all of the 'IP_' fields.
   j. Modify all of the 'NETNAME_' fields.
   k. DISK_NAME should be set to the actual name of a disk resource,
      as seen in the IBM Cluster Manager.
      1) This should be different for each group.
      2) On the first DB2 node group (0), INSTPROF_DISK should be the
         same as DISK_NAME.
         a) No other groups use the INSTPROF_DISK variable.

10. Migrate the instance:
    a. Change to the directory that has the modified db2mscs.eee.
    b. Run db2mscs.exe -f:db2mscs.eee
       1) This will take a long time. You will notice activity in the
          IBM Cluster Manager. After all of the groups and resources
          are created, it will seem like nothing is happening for a
          long time. This is normal.
       2) If a problem occurs during the migration, db2mscs will back
          out all of the changes and then terminate with an error.
       3) The switch '-d:trace.fle' can be added to the command to
          produce a trace file.

11. Add drive mapping.

12. Update the registry so that more than 25 services can be run
    simultaneously on a single computer:
    a. See
       http://support.microsoft.com/support/kb/articles/Q142/6/76.asp
    b. This is needed for running all 7 DB2 nodes on one computer.
    c. Reboot the computers after modifying the registry.
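   For illustration, a fragment of a modified db2mscs.eee might look
   like the following. The keyword names are those used by the DB2
   db2mscs utility; all values shown (user ID, cluster name, address,
   network, and resource names) are hypothetical and must be replaced
   with your own. The parenthetical notes are annotations, not part
   of the file, and only the instance-wide fields and the first node
   group are shown.

      DB2_INSTANCE = db2mpp
      DB2_LOGON_USERNAME = db2admin        (hypothetical user ID)
      DB2_LOGON_PASSWORD = password        (hypothetical)
      CLUSTER_NAME = NAECLUSTER            (hypothetical NAE cluster
                                           name, not an MSCS cluster)

      GROUP_NAME = DB2 Group 0             (one group per DB2 node)
      DB2_NODE = 0
      IP_NAME = DB2 IP 0
      IP_ADDRESS = 192.168.1.101           (hypothetical)
      IP_SUBNET = 255.255.255.0
      IP_NETWORK = Public Network          (hypothetical)
      NETNAME_NAME = DB2 NetName 0
      NETNAME_VALUE = DB2NODE0             (hypothetical)
      NETNAME_DEPENDENCY = DB2 IP 0
      DISK_NAME = Disk E:                  (must match a disk
                                           resource name exactly)
      INSTPROF_DISK = Disk E:              (group 0 only)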
12.0 DB2 Tips and Troubleshooting Hints.
________________________________________

a. Symptom: Indoubt transactions may exist after a DB2 node fails
   over to a new computer.

   Action: Refer to the DB2 manuals for information on resolving
   indoubt transactions. The problem can be reduced in severity by
   including the command 'db2_all "||db2 restart db (dbname)"' in the
   'db2cpost.bat' file in the instance directory. (dbname) refers to
   the name of the database affected. If multiple databases are
   affected, include an additional 'restart' command for each
   database.

b. Symptom: A DB2 node fails to start on a computer with 3 other DB2
   nodes already active on that computer.

   Action: Set the DB2 variable DB2_NUM_FAILOVER_NODES to the maximum
   number of DB2 nodes that may fail over to, or be moved to, a
   single computer.

13.0 Trademarks and Notices
___________________________

The following terms are trademarks of the IBM Corporation in the
United States or other countries or both:

   IBM
   Netfinity

Microsoft, Windows, and Windows NT are trademarks of Microsoft
Corporation.

Any other company, product, and service names may be trademarks or
service marks of others.

THIS DOCUMENT IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND. IBM
DISCLAIMS ALL WARRANTIES, WHETHER EXPRESS OR IMPLIED, INCLUDING
WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF FITNESS FOR A
PARTICULAR PURPOSE AND MERCHANTABILITY WITH RESPECT TO THE
INFORMATION IN THIS DOCUMENT. BY FURNISHING THIS DOCUMENT, IBM GRANTS
NO LICENSES TO ANY PATENTS OR COPYRIGHTS.

Copyright (C) 1999 IBM Corporation. All rights reserved.

Note to U.S. Government Users -- Documentation related to restricted
rights -- Use, duplication, or disclosure is subject to restrictions
set forth in the GSA ADP Schedule Contract with IBM Corp.