IBM Cluster Systems Management for Linux(R)
Set-Up HOWTO
Version 1 Release 1
Document Number SA22-7853-00
5799-GNJ
Note: Before using this information and the product it supports, read the information in Appendix B, Notices.
First Edition (June 2001)
This edition of the IBM Cluster Systems Management for Linux Set-Up HOWTO applies to IBM Cluster Systems Management for Linux Version 1 Release 1, program number 5799-GNJ, and to all subsequent releases of this product until otherwise indicated in new editions.
IBM welcomes your comments. A form for readers' comments may be provided at the back of this publication, or you may address your comments to the following address:
If you would like a reply, be sure to include your name, address, telephone number, or FAX number.
Make sure to include the following in your comment or note:
When you send information to IBM, you grant IBM a nonexclusive right to use or distribute the information in any way it believes appropriate without incurring any obligation to you.
© Copyright International Business Machines Corporation 2001. All rights reserved.
U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
Setting Up IBM Cluster Systems Management for Linux
Appendix A. Node Attributes Template
This HOWTO describes the specified operating environment for the IBM Cluster Systems Management for Linux (CSM) set of tools and explains how to install and set up a CSM cluster on an existing group of nodes.
This HOWTO is intended for system administrators who want to use IBM Cluster Systems Management for Linux. It describes tools that are provided to make the installation of Cluster Systems Management easier. The system administrator should have experience in UNIX(R) administration and networked systems.
This book uses the following typographic conventions:
Typographic | Usage |
---|---|
Bold |
|
Italic |
|
Constant width | Examples and information that the system displays appear in constant width typeface. |
[ ] | Brackets enclose optional items in format and syntax descriptions. |
{ } | Braces enclose a list from which you must choose an item in format and syntax descriptions. |
| | A vertical bar separates items in a list of choices. (In other words, it means "or.") |
< > | Angle brackets (less-than and greater-than) enclose the name of a key on the keyboard. For example, <Enter> refers to the key on your terminal or workstation that is labeled with the word Enter. |
... | An ellipsis indicates that you can repeat the preceding item one or more times. |
<Ctrl-x> | The notation <Ctrl-x> indicates a control character sequence. For example, <Ctrl-c> means that you hold down the control key while pressing <c>. |
\ | The continuation character is used in coding examples in this book for formatting purposes. |
IBM Cluster Systems Management for Linux Monitoring HOWTO, SA22-7852-00
IBM Cluster Systems Management for Linux Overview HOWTO, SA22-7857-00
IBM Cluster Systems Management for Linux Remote Control HOWTO, SA22-7856-00
IBM Cluster Systems Management for Linux Technical Reference, SA22-7851-00
The IBM Cluster Systems Management for Linux publications are available as HTML and PDF files on the CD-ROM in the /doc directory or on the installed system in the /opt/csm/doc directory.
README information is available on the CD-ROM in the root directory (/).
The file names are as follows:
At the time of this release, publications for IBM Cluster Systems Management for Linux were also available at the following URL:
http://www.ibm.com/eserver/clusters/linux
The set-up process helps a system administrator get IBM Cluster Systems Management for Linux (CSM), hereafter known as Cluster Systems Management, up and running easily by setting up a management server and managed nodes on existing Linux systems. The application is available as an installation image in a directory or on the CSM CD-ROM. This document describes the minimum hardware and software requirements needed to use this product. See the Specified Operating Environment.
Information is also provided about the planning and pre-installation tasks that you need to perform so that the installation goes smoothly. Next, there is a step-by-step procedure for installing and setting up the cluster on an existing group of nodes. Finally, a troubleshooting section is provided in the form of frequently asked questions. Read this document carefully and become thoroughly familiar with it before you begin the installation and set-up tasks.
This section describes the hardware and software that are required for IBM Cluster Systems Management for Linux. For more detailed information, see the announcement.
This product is supported on the IBM xSeries 330 and 340.
To support remote control, the following hardware is required:
IBM Cluster Systems Management for Linux has requirements for non-IBM software as well as IBM-developed software. As a convenience, the required software that is not part of the Red Hat distribution is included on the CSM CD-ROM. Unless otherwise specified, the software is required on the management server and on the managed nodes.
The non-IBM software required is as follows:
The following IBM software packages are required:
TCP/IP and a network adapter
In configuring a Cluster Systems Management cluster, give particular attention to the following:
For the management server, a minimum of 128MB of memory and 120MB of disk space is required.
For the managed node, a minimum of 128MB of memory and 20MB of disk space is required.
A cluster of up to 32 nodes is supported.
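The memory and disk minimums above can be turned into a quick pre-installation check. The following is a minimal sketch and is not part of CSM: the function name meets_csm_minimum and its role keywords (server, node) are hypothetical, and the caller is assumed to supply the memory and free disk space in MB.

```shell
# Hypothetical helper, not a CSM command.
# Usage: meets_csm_minimum ROLE MEMORY_MB DISK_MB
#   ROLE is "server" (management server) or "node" (managed node).
# Returns 0 when the stated minimums are met, 1 when they are not,
# and 2 for an unknown role.
meets_csm_minimum() {
    case $1 in
        server) need_disk=120 ;;   # management server: 120MB of disk space
        node)   need_disk=20 ;;    # managed node: 20MB of disk space
        *) echo "unknown role: $1" >&2; return 2 ;;
    esac
    # Both roles need at least 128MB of memory.
    [ "$2" -ge 128 ] && [ "$3" -ge "$need_disk" ]
}
```

For example, meets_csm_minimum server 128 120 succeeds, and meets_csm_minimum node 64 500 fails because of the memory minimum.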
See IBM Cluster Systems Management for Linux Remote Control HOWTO for information on hardware configuration considerations and set-up instructions, including a filled out node-attribute table that you can use for guidance in filling out your own blank template. A blank template is provided in Appendix A, Node Attributes Template.
There are several tasks that the administrator must do to prepare for installation of Cluster Systems Management:
Before installing CSM, create a partition called /tftpboot that consists of 100MB of space. This will hold the required RPM and tarball packages for installation.
To create /tftpboot by using cfdisk, do the following:
Make a note of the device name and number of the new partition because you will need it for the next steps. Examples of the new partition name might be similar to /dev/hda7 or /dev/sda8.
mkfs /dev/device
(where device is the name of the new partition; for example, hda7 or sda8.)
mkdir -p /tftpboot
mount /dev/device /tftpboot
/dev/device /tftpboot ext2 defaults 1 2
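The /etc/fstab entry and the 100MB size requirement from the steps above can be sketched as two small helpers. This is an illustration under stated assumptions: the function names are hypothetical, and the partition size is assumed to be known in MB already.

```shell
# Hypothetical helpers, not CSM commands.

# Print the /etc/fstab line for the /tftpboot partition.
# $1 is the partition name without the /dev/ prefix, for example hda7.
tftpboot_fstab_entry() {
    printf '/dev/%s /tftpboot ext2 defaults 1 2\n' "$1"
}

# Succeed when a partition size in MB meets the 100MB requirement.
tftpboot_size_ok() {
    [ "$1" -ge 100 ]
}
```

Review the line that tftpboot_fstab_entry prints before you add it to /etc/fstab.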
A distributed shell program (dsh) is used to run commands on the nodes. It is contained in the csm.dsh RPM and installed by the installms command. The dsh program uses a remote shell of your choice to issue remote commands to the managed nodes from the management server. The following preparation to enable the remote shell is required on each node before dsh is installed:
The DSH_REMOTE_CMD environment variable is used to specify a remote shell other than the default. This environment variable should always be set when CSM commands are issued because some CSM commands use dsh internally and will use rsh as the default if DSH_REMOTE_CMD is not set.
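As a sketch of the behavior just described, the following sets DSH_REMOTE_CMD and shows which remote shell would be in effect. The path /usr/bin/ssh is only an assumed example of a remote shell of your choice, and effective_remote_shell is a hypothetical helper that mirrors the documented default (rsh) rather than anything dsh provides.

```shell
# Hypothetical helper: print the remote shell that would be in effect,
# falling back to rsh when DSH_REMOTE_CMD is not set.
effective_remote_shell() {
    printf '%s\n' "${DSH_REMOTE_CMD:-rsh}"
}

# Assumed example value; substitute the remote shell that you enabled
# on the nodes.
export DSH_REMOTE_CMD=/usr/bin/ssh
```

Placing the export line in the login profile of the administrator is one way to make sure that the variable is set whenever CSM commands are issued.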
For more information on dsh, see the man page or the IBM Cluster Systems Management for Linux Technical Reference.
To install Cluster Systems Management, you can take one of two approaches. For a simple installation without interim verification, you can run the following commands in sequence:
The addnode command runs definenode and then installnode automatically.
If you need more control and would like the ability to double-check and make interim changes during installation, run definenode and then installnode as follows:
All of these commands are run on the management server. Details on these commands can be found in their man pages or in IBM Cluster Systems Management for Linux Technical Reference.
Attention: After each node is installed by running installnode, you need to reboot the node to enable remote console support. The node does not have to be rebooted if remote console support is not being used.
The identd service is started for you on the management server and the managed nodes by the installation process. See the Security section of the IBM Cluster Systems Management for Linux Overview HOWTO for more information about identd.
The installms command performs the tasks that are necessary to make this system a management server. It installs the appropriate software listed in Specified Operating Environment on the management server automatically if it is not already installed or if it is installed at a previous level.
IBM suggests that you set up the /tftpboot partition before you run installms. You also need to provide and mount the CSM distribution CD-ROM. The default mount point is /mnt/cdrom.
The program first copies installation packages from a download directory or from the CD-ROM that contains the CSM application to /tftpboot/rpm and /tftpboot/tarball.
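One way to confirm that the packages were copied is to check that both target directories exist and are not empty. The following is a minimal sketch; check_install_dirs is a hypothetical helper, not a CSM command, and its optional argument exists only so that the check can be tried against another directory.

```shell
# Hypothetical helper, not a CSM command.
# Usage: check_install_dirs [BASE]    (BASE defaults to /tftpboot)
# Succeeds when BASE/rpm and BASE/tarball both contain at least one entry.
check_install_dirs() {
    base=${1:-/tftpboot}
    for dir in rpm tarball; do
        # ls -A prints nothing for an empty or missing directory.
        if [ -z "$(ls -A "$base/$dir" 2>/dev/null)" ]; then
            echo "missing or empty: $base/$dir"
            return 1
        fi
    done
    echo "packages present under $base"
}
```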
After installms has been run successfully, run definenode or addnode.
For more information on installms, see the man page or the IBM Cluster Systems Management for Linux Technical Reference.
After the management server is installed by running installms, run definenode to define all of the nodes in the cluster or addnode to define and install the nodes in the cluster. These commands have certain prerequisites, which you need to be aware of.
Before you run definenode, you must prepare certain information and do some manual set up.
Attention: A node that has already been defined cannot be redefined with the definenode command. Including such a node in the command-line input causes the command to fail without defining any nodes. Including such a node in the node definition (nodedef) file causes the definition of that node to fail, but the other nodes specified in the file are defined successfully. An error message is issued for the undefined node or nodes.
Before you run definenode or addnode, gather the required information and record it on a template similar to the example in Appendix A, Node Attributes Template. This information can be entered into a node definition (nodedef) file, or it can be entered at the command line.
If you intend to use a nodedef file, start with the sample file in /opt/csm/install/nodedef.sample and fill in the information from the node-attribute planning template that you completed earlier.
See the man page or IBM Cluster Systems Management for Linux Technical Reference for more details about the node definition file.
The definenode command defines all the nodes in the cluster. It does not actually install the nodes. Node installation is done by installnode. If you run addnode, you do not need to run installnode because addnode runs installnode for you.
If you omit a required argument, whether deliberately or inadvertently, the command prompts you for each missing piece of information.
You can either use a node-definition file to define the nodes, console servers, and service processors to the cluster, or you can enter the information from the command line. To use a node definition file in order to define the nodes, console server information and service processors, type:
definenode -f nodedef
To see the arguments that you need to enter from the command line, type:
definenode -h
All of the arguments are required when the command is run. The command prompts for missing information when some or all of the arguments are not provided. To use this method of input, type the command without any arguments:
definenode
See the man page or IBM Cluster Systems Management for Linux Technical Reference for details on definenode or addnode command-line syntax and more examples of the usage of the command. See Example of definenode command run interactively for an example that demonstrates the interactive approach.
After definenode has been run successfully, verify the node definitions, and then run installnode. See Verifying node definitions for details.
Some error messages may be returned if definenode is not completely successful. See FAQs, hints, tips, and troubleshooting for troubleshooting information.
If you run the definenode command without any options, the program prompts you for the required information. Also, if you miss a piece of required information, the program prompts you for that information.
The following example shows sample input for nodes, console servers, and service processors with an interactive program. The example uses the definenode command, but the addnode command can be used instead with the same usage and arguments. Instead of requiring you to enter everything at once on the command line, the interactive program allows you to enter a little bit at a time. User input is shown in bold type.
Enter starting node name (hostname or IP address): clsn01
Enter number of nodes to define (default = 1): 12
Enter list of Hardware Control Points (press ENTER for none):
Format: hwctrlpt[:method:spname][,...]
mspname = Hardware Control Point hostname or IP address.
method = Power method (default=netfinity)
spname = Starting service processor name or 'hostname' (default=node01)
Example: hwctrlpt1::node06,hwctrlpt2,hwctrlpt3
Example: hwctrlpt1::hostname,hwctrlpt2::hostname
mgtn03,mgtn04::hostname
Enter list of Console Servers (press ENTER for none):
Format: csname[:method:csnum:port][, ...]
csname = Console server name (hostname or IP address)
method = Console method (default=esp)
csnum = Console server number (default=1)
port = Starting console port number (default=1)
Example: cs1:::4,cs2:conserver,cs3
mgtn02
Enter Hardware Type (default = netfinity): netfinity
definenode: Adding CSM Nodes:
definenode: Adding Node clsn01.ppd.pok.ibm.com(9.114.133.193)
definenode: Adding Node clsn02.ppd.pok.ibm.com(9.114.133.194)
definenode: Adding Node clsn03.ppd.pok.ibm.com(9.114.133.195)
definenode: Adding Node clsn04.ppd.pok.ibm.com(9.114.133.196)
definenode: Adding Node clsn05.ppd.pok.ibm.com(9.114.133.197)
definenode: Adding Node clsn06.ppd.pok.ibm.com(9.114.133.198)
definenode: Adding Node clsn07.ppd.pok.ibm.com(9.114.133.199)
definenode: Adding Node clsn08.ppd.pok.ibm.com(9.114.133.200)
definenode: Adding Node clsn09.ppd.pok.ibm.com(9.114.133.201)
definenode: Adding Node clsn10.ppd.pok.ibm.com(9.114.133.202)
definenode: Adding Node clsn11.ppd.pok.ibm.com(9.114.133.203)
definenode: Adding Node clsn12.ppd.pok.ibm.com(9.114.133.204)
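The Hardware Control Point format in the prompt, hwctrlpt[:method:spname], leaves the method and the service processor name optional. The following sketch shows how one entry expands once the defaults stated in the prompt (netfinity and node01) are applied. It is an illustration only: parse_hwctrlpt is a hypothetical helper, and it does not claim to reproduce how definenode parses its input.

```shell
# Hypothetical helper, not a CSM command.
# Usage: parse_hwctrlpt ENTRY
# Splits one hwctrlpt[:method:spname] entry on colons and prints
# "hwctrlpt method spname", filling in the defaults shown in the prompt.
parse_hwctrlpt() {
    entry=$1
    old_ifs=$IFS
    IFS=:
    # Unquoted on purpose: split the entry into fields at each colon.
    # An empty field (as in "a::b") is kept as an empty string.
    set -- $entry
    IFS=$old_ifs
    printf '%s %s %s\n' "$1" "${2:-netfinity}" "${3:-node01}"
}
```

With this sketch, the entry mgtn04::hostname from the example expands to the control point mgtn04, the method netfinity, and the service processor name hostname, while a bare entry such as mgtn03 takes both defaults.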
After definenode has run, the management server holds all of the node information for CSM, and the cluster nodes are ready to be installed. These definitions, however, may not completely suit your needs. This section describes how to verify and customize the cluster node definitions before the nodes are installed. Because node installation has not happened yet, you can still change any node definition.
Verify the CSM node information as follows:
If something needs to be corrected, either you can use rmnode -P to remove the node that was not successfully defined and then rerun definenode with the correct arguments, or you can use chnode -P to make changes to any attributes of a node. Note that all of the attributes for a node might not be filled in at this point. See the chnode, definenode, lsnode, and rmnode man pages or the IBM Cluster Systems Management for Linux Technical Reference for more information.
The service processor password file is created from /etc/opt/csm/netfinity_power.config.templ when definenode is run. Afterwards you can modify the netfinity_power.config file to specify individual passwords and user IDs for each node, if needed.
The installnode command installs all the nodes in the cluster by running makenode. The appropriate software listed in Specified Operating Environment is installed automatically by the installnode command on the nodes if it is not already installed or if it is installed at a previous level.
The installnode command also displays the installation status for each node. In addition a log file is maintained in /var/log/csm/installnode.log that contains information on what happened during installation on each node.
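To review what happened on one node, you can filter the log file by host name. The following is a minimal sketch: installnode_log_for is a hypothetical helper, and because the layout of the log is not described here, it assumes only that relevant lines mention the node's host name.

```shell
# Hypothetical helper, not a CSM command.
# Usage: installnode_log_for NODE [LOGFILE]
# Prints the log lines that contain NODE as a fixed string.
# LOGFILE defaults to the installnode log path given above.
installnode_log_for() {
    grep -F -- "$1" "${2:-/var/log/csm/installnode.log}"
}
```

The helper returns a nonzero status when no line mentions the node, which can itself be a useful hint that installation never reached that node.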
For more information on installnode and makenode, see the man pages or the IBM Cluster Systems Management for Linux Technical Reference.
Before the installnode command can be run, the following must be done on the management server:
Attention: After each node is installed by running installnode, you need to reboot the node to enable remote console support. The node does not have to be rebooted if remote console support is not being used.
You can add another node to the cluster either by running definenode and then installnode again or by running addnode again. To add a new node to the cluster, do the following:
lsnode -Al
Then type the following to see the attributes of PreManaged nodes:
lsnode -AlP
definenode
You will be prompted for the required information.
lsnode -Al
Removing a node from a cluster does not uninstall CSM and its prerequisites from the node. Rather, it disassociates the node from its management server. It removes the node from the database of the management server, and it informs the node that it is no longer attached to the management server. To remove a node from the cluster, type:
rmnode hostname
A removed node can be added back into the cluster by running definenode or addnode again.
This section tells you how you can determine whether the installation was successful. It also gives you some suggestions on how to get started using Cluster Systems Management. After installation is successfully completed, remote RMC and CSM commands are enabled. To verify that the installation was successful, enter the following commands. If everything is as it should be, you should see the following results:
dsh -a date
A list of nodes with the date on each node is returned.
rpower -a query
A list of nodes with their associated power state is returned.
lsnode -p
The ping status of the nodes is returned.
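The three verification commands above can be run in one pass. The following is a minimal sketch, assuming that each command exits with a nonzero status when it fails; verify_csm_install is a hypothetical wrapper, not a CSM command, and it reports only the exit status, so you still need to read the actual output of each command.

```shell
# Hypothetical wrapper, not a CSM command.
# Runs each verification command from this section, discards its output,
# and reports whether it exited successfully.
verify_csm_install() {
    rc=0
    for check in "dsh -a date" "rpower -a query" "lsnode -p"; do
        # Unquoted on purpose so that the string splits into command
        # and arguments.
        if $check >/dev/null 2>&1; then
            echo "ok: $check"
        else
            echo "FAILED: $check"
            rc=1
        fi
    done
    return $rc
}
```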
To try out monitoring, use the following example.
startcondresp NodeReachability BroadcastEventsAnyTime
lscondition
lscondresp
This section has frequently asked questions that can help to troubleshoot problems or give hints and tips on how to do something more easily or efficiently. The first group of questions covers general troubleshooting. It is followed by a special group of questions on how to handle the RMC ACL file. In addition, for problems with the monitoring function, see the Diagnostics chapter in IBM Cluster Systems Management for Linux Monitoring HOWTO.
lssrc -s ctrmc
If the output shows that ctrmc is inoperative, run the following:
startsrc -s ctrmc
Then verify that ctrmc is now active by running the following again:
lssrc -s ctrmc
After ctrmc is active, you can run installnode again on the management server. This moves the PreManaged Nodes completely to ManagedNodes and finishes the necessary processing.
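The check-and-restart sequence above can be sketched as a single function. This is an illustration under stated assumptions: ensure_ctrmc_active is a hypothetical helper, and it assumes only that the lssrc output contains the word inoperative when the subsystem is down.

```shell
# Hypothetical helper, not a CSM command.
# Starts ctrmc when lssrc reports it as inoperative, then prints the
# current status so that you can confirm that it is active.
ensure_ctrmc_active() {
    if lssrc -s ctrmc | grep -q inoperative; then
        startsrc -s ctrmc
    fi
    lssrc -s ctrmc
}
```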
dsh -a mgmtsvr
The RMC ACL file is located in /var/ct/cfg/ctrmc.acls. The management server uses the RMC ACL file as its authorization mechanism. You may want to update the RMC ACL file manually to fix problems during installation. The following questions and answers guide you through updating the ACL file so that managed nodes have access to the resource classes on the management server.
refresh -s ctrmc
refresh -s ctrmc
IBM.PreManagedNode
    root@hostname1 * rw
    hostname1 * r
IBM.ManagedNode
    root@hostname1 * rw
    hostname1 * r
IBM.NodeGroup
    root@hostname1 * rw
    hostname1 * r
Save and close the file. Issue the refresh command:
refresh -s ctrmc
The output when the ctrmc.acls file is listed should show the following:
IBM.PreManagedNode
    root@hostname1 * rw
    hostname1 * r
IBM.ManagedNode
    root@hostname1 * rw
    hostname1 * r
IBM.NodeGroup
    root@hostname1 * rw
    hostname1 * r
OTHER
    root@LOCALHOST * rw
    LOCALHOST * r
IBM.PreManagedNode
    root@hostname2 * rw
    hostname2 * r
IBM.ManagedNode
    root@hostname2 * rw
    hostname2 * r
IBM.NodeGroup
    root@hostname2 * rw
    hostname2 * r
Save and close the file. Issue the refresh command:
refresh -s ctrmc
The output when the ctrmc.acls file is listed should show the following:
IBM.PreManagedNode
    root@hostname1 * rw
    hostname1 * r
    root@hostname2 * rw
    hostname2 * r
IBM.ManagedNode
    root@hostname1 * rw
    hostname1 * r
    root@hostname2 * rw
    hostname2 * r
IBM.NodeGroup
    root@hostname1 * rw
    hostname1 * r
    root@hostname2 * rw
    hostname2 * r
OTHER
    root@LOCALHOST * rw
    LOCALHOST * r
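The entries that a node needs follow the same pattern for each of the three resource classes, so they can be generated for review before you edit the file. The following is a minimal sketch: acl_stanzas_for is a hypothetical helper, and the indentation of the entry lines is an assumption about the file layout.

```shell
# Hypothetical helper, not a CSM or RMC command.
# Usage: acl_stanzas_for HOSTNAME
# Prints the entries that give root on HOSTNAME read/write access and
# everyone else on HOSTNAME read access to the three resource classes.
acl_stanzas_for() {
    for class in IBM.PreManagedNode IBM.ManagedNode IBM.NodeGroup; do
        printf '%s\n    root@%s * rw\n    %s * r\n' "$class" "$1" "$1"
    done
}
```

When a class already has a stanza in the file, add only the two entry lines for the new host under it instead of repeating the class name, and then run refresh -s ctrmc.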
See IBM Cluster Systems Management for Linux Remote Control HOWTO for an
example of a filled-out node-attributes template. Note that the console
port number is the physical port to which the serial port of the node is
connected in the console server hardware. Use the short host name (for
example, clsn01) instead of the fully qualified host name (for
example, clsn01.pok.ibm.com) in the following
template. Duplicate the template and fill it out before you install
CSM.
Table 1. Node Attributes Template
Hostname | HW Control Point | Power Method | Svc Proc Name | Console Server Name | Console Server Number | Console Method | Console PortNum | HWType |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to:
IBM Director of Licensing
For license inquiries regarding double-byte (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to:
IBM World Trade Asia Corporation
The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law:
INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.
IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.
Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created programs and other programs (including this one) and (ii) the mutual use of the information which has been exchanged, should contact:
IBM Corporation
Such information may be available, subject to appropriate terms and conditions, including in some cases, payment of a fee.
The licensed program described in this document and all licensed material available for it are provided by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement or any equivalent agreement between us.
Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrates programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. You may copy, modify, and distribute these sample programs in any form without payment to IBM for the purposes of developing, using, marketing, or distributing application programs conforming to IBM's application programming interfaces.
The following trademarks apply to this book:
IBM and AIX are registered trademarks of International Business Machines Corporation.
Linux is a registered trademark of Linus Torvalds.
Red Hat and RPM are trademarks of Red Hat, Inc.
UNIX is a registered trademark in the United States and other countries licensed exclusively through The Open Group.
Other company, product, and service names may be the trademarks or service marks of others.
IBM Cluster Systems Management for Linux includes software that is publicly available:
This book discusses the use of these products only as they apply specifically to the IBM Cluster Systems Management for Linux product.