extreme CSM tools
Author: Bruce Potter, Document Version 1.5, 1/2/07
Introduction
The xCSM package is a set of additional tools for CSM that give
you more capabilities for managing your cluster.
There is no official support for these tools, but you can
post your problems, questions, and suggestions to the CSM Forum and we will do our best to address them.
Commands - /opt/xcsm/bin
At least some of these commands accept the -h option, which prints extended
usage information.
- csm2hosts - write the hostname and IP info that is in the CSM DB to a file in the
/etc/hosts format. This includes network info in the UserComment
attribute. See also addhosts in the Utilities section.
- csmmpcli - run some hardware control operations
against the hardware control point using the MPCLI
- genhosts - create an /etc/hosts file for the
cluster
- helparp - increase the ARP table cache size for large clusters running Linux
- hosts2csm - put the hostname and IP info from the
/etc/hosts file into the CSM DB. This script
assumes the convention of
putting the adapter name after the node name for all the secondary adapter hostnames
(for example "n001-eth1.site.com" would be
the hostname for the eth1 adapter of node 1.) The
secondary adapter info is stored in the UserComment
attribute for now.
- lsboot - list the installation attributes of the
specified nodes
- lscons - list the console attributes of the
specified nodes
- lsip - list the IP address of the specified nodes
- lsisvr - list the install server for the specified
node
- lsmac - list the MAC address of the specified
nodes
- lsmode - list the CSM node mode of the specified
nodes
- lsos - list the installation attributes of the
specified nodes
- lspos - calculate the position of nodes in racks
based on IP address
- lspower - list the hardware control
attributes of the specified nodes
- mkpxeimage - create an initrd boot image
- rbeacon - light up the front panel of the
specified nodes
- reventlog-ipmi - manage BMC logs
- rinv-ipmi - return BMC inventory info
- rpower.dummy - rpower method that does nothing; use it to stop CSM from doing any power
operations. Link this to /opt/csm/bin/rpower when desired.
- rpower-ipmi - power operations via the BMC
- rvitals-ipmi - return BMC environmental info
- sethwctrlnodeidattr - set the HWControlNodeId
attribute of the specified nodes to be the short version
of the node's hostname
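The secondary adapter naming convention that hosts2csm assumes can be illustrated with a short shell sketch. This is not the hosts2csm implementation. The helper name split_adapter_hostname is hypothetical, and the sketch assumes the adapter name follows the last hyphen in the short hostname, so a primary hostname that itself contains a hyphen would be misread.

```shell
#!/bin/sh
# Hypothetical helper (not part of xCSM): split a hostname that follows the
# "<node>-<adapter>.<domain>" convention into its node and adapter parts.
split_adapter_hostname() {
    short=${1%%.*}                # drop the domain: n001-eth1.site.com -> n001-eth1
    case $short in
        *-*)
            node=${short%-*}      # text before the last hyphen: n001
            adapter=${short##*-}  # text after the last hyphen: eth1
            ;;
        *)
            node=$short           # no hyphen: treat as a primary hostname
            adapter=
            ;;
    esac
    printf 'node=%s adapter=%s\n' "$node" "$adapter"
}

split_adapter_hostname n001-eth1.site.com   # prints: node=n001 adapter=eth1
split_adapter_hostname n001.site.com        # prints: node=n001 adapter=
```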
These are commands and other scripts that are used by the xcsm.web browser
interface and are also useful from the command line in xcsm. All the files
in this directory are replicated in both the xcsm and xcsm.web rpms so
that neither rpm needs to prereq the other. To get additional info for
any of these, run the command with the -h flag, or look at the comments
in the file.
- addhosts - Adds IP address/hostname pairs to the /etc/hosts file, if they are not
already there. This can efficiently add a whole batch of entries in one
shot.
- addnodes - Adds a whole set of nodes to the CSM cluster according to a given naming
and positioning pattern. For each node, it creates name resolution for
it, defines it, and sets the physical location.
- burnmem - A sample script that chews up CPU and memory to help trigger the
performance-related RMC Conditions.
- emailevent - This script, to be used from an ERRM event response, mails event information
generated by ERRM to a specified email address. It also has a flag for
sending brief email to a mobile device (e.g. text messaging a cell phone).
- install-cfm-file - Sample .post script for CFM to install an rpm after it is replicated
by CFM. Still under construction.
- Sample RMC monitoring resources. These are created automatically using
the CSM command mkresources.
- resources/IBM.Condition/SensorRMDownOnMS.pm - Triggers if the SensorRM is not running on the CSM mgmt svr. You can
stop the SensorRM using "stopsrc -s IBM.SensorRM" and start it
again using "startsrc -s IBM.SensorRM".
- resources/IBM.Condition/SensorRMDownOnNode.pm - Triggers if the SensorRM is not running on any of the nodes. This is
a sample of monitoring something across the whole cluster.
- resources/IBM.Condition/TooManyLoggedOnMS.pm - Triggers if more than 1 person is logged into the CSM mgmt svr. This
is a sample of writing a condition for a sensor (WhoSensor).
- resources/IBM.Sensor/WhoSensor.pm - Periodically queries how many people are logged into the machine.
- resources/IBM.EventResponse/EmailCell.pm - Takes the event info passed to it and text messages it to a cell phone.
- startmon - Starts default monitoring by starting a set of conditions that will
typically be wanted by most administrators.
- test-response-script - Helps test event response scripts by setting the environment variables
that an Event Response will set before calling the script.
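As a rough illustration of what an event response script sees, the sketch below formats a one-line summary from a few ERRM environment variables and then simulates them by hand. The variable names (ERRM_TYPE, ERRM_COND_NAME, ERRM_NODE_NAME, ERRM_VALUE) are given as recalled from the RSCT documentation and should be checked against test-response-script itself. The function name summarize_event and the node name ms01 are hypothetical.

```shell
#!/bin/sh
# Hypothetical stand-in for an event response script: print a one-line
# summary built from the ERRM_* variables, with fallbacks when unset.
summarize_event() {
    printf '%s: %s on %s (value: %s)\n' \
        "${ERRM_TYPE:-Event}" \
        "${ERRM_COND_NAME:-unknown condition}" \
        "${ERRM_NODE_NAME:-unknown node}" \
        "${ERRM_VALUE:-n/a}"
}

# Simulate the environment in a subshell so the variables do not leak
# into the calling shell.
(
    export ERRM_TYPE=Event ERRM_COND_NAME=TooManyLoggedOnMS \
           ERRM_NODE_NAME=ms01 ERRM_VALUE=2
    summarize_event   # prints: Event: TooManyLoggedOnMS on ms01 (value: 2)
)
```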
Blade Utilities - /opt/xcsm/blade
- CSMBLADE.README - documentation on the scripts in this directory.
- mpname - set the textid of the blade. In CSM that is not really necessary any
more because CSM can use the slot # instead of the textid as the HWControlNodeId.
- pfwchk - check the firmware level on running nodes
- rbootseq - change the boot order of blades
- rvid - view a video (graphic as opposed to ASCII) console of a blade
Nagios - /opt/xcsm/nagios
- doc/CSM2Nagios.pdf - Paper describing how to connect CSM to the open source monitoring tool Nagios.
- bin/csm2nagios - Script for connecting CSM to Nagios.
These are commands that can be used to create and manage
database tables. The tables are stored in the RSCT
registry, using the RSCT DBI driver that is supplied with
CSM/RSCT. Note that these commands are only a convenience
for interactive use. To manipulate the tables
programmatically, use the DBI interface directly, since that is
more standard and will be faster. You can use these
commands below as examples for writing your own DBI scripts.
The DBI interface can also be used to access the RMC
classes in a standard way. See lscsmtab as an example.
Run any of these commands with the -h option to get
usage help. Before using these commands, or your own DBI
script, make sure the perl-DBI RPM from the distro CDs is
installed.
- mkcsmtab - create a new table.
- mkcsmrow - add a row to an existing table
- lscsmtab - list available tables, or list rows
from an existing table. Note: You must apply the RSCT.pm.patch
below for the -l option to work.
- chcsmrow - change rows in a table.
- rmcsmrow - delete rows in a table.
- rmcsmtab - delete a table.
- generic_power - an example user-defined power
method that uses WOL to wake up machines. To use it,
set these node attributes: PowerMethod=generic,
HWControlPoint=<mgmtsvr>, and
HWControlNodeId=<nodeMAC>. This power method
also implements some other rpower and lshwstat actions as
examples.
Console Methods - /opt/xcsm/consolemethods
- e325sol_console - CSM console method for
serial-over-LAN on e325
MAC Methods - /opt/xcsm/macmethods
- mmsnmp_mac - MAC collection method for the
BladeCenter management module that uses SNMP to get MACs
for adapters on the motherboard and on the first daughter
card. To use it, do the following:
- Ensure you have the prereq perl-Net-SNMP RPM
installed
- Make sure the HW ctrl node attributes are set
correctly
- Configure the BladeCenter management module in
the Control -> Network Protocols section as
follows: make sure SNMP v1 agent is enabled and
SNMP v3 agent and SNMP Traps are disabled. Set
the Community Name to "public", set the
Access Type to "set", and set the
hostname/IP address to the value for the network
adapter on the CSM management server that is
connected to the management module subnet.
- Then run: getadapters -w -n
<nodelist> -m mmsnmp
HPC Set Up Utilities - /opt/xcsm/hpc
- Scripts to help build mpich, pbs, and torque -
these haven't been updated in a while
Node Installation Customization Scripts - /opt/xcsm/install
In the bullets below "###" represents a 3-digit
number that is put at the beginning of the script name to control the
order in which the scripts are run.
- Pre-reboot scripts:
- ###CSM_syslog - configure syslog on the
nodes to forward messages to the management
server
- ###CSM_setupnis - configure NIS on the
node
- CSM00_xcatpost - run the xCAT post install scripts during the CSM post install time
- Post-reboot scripts:
- ###CSM_IBMCompiliers - install the IBM
compilers on the node
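The effect of the 3-digit prefix can be seen in a small shell sketch. It assumes the scripts are visited in sorted filename order, which is what the prefix convention suggests. It is not how CSM itself is implemented. The helper name list_in_run_order and the prefixes 010 and 020 are made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: print the scripts in a directory whose names start
# with three digits, in the order a sorted listing would visit them.
list_in_run_order() {
    # Pathname expansion returns matches sorted, so 010... precedes 020...
    for script in "$1"/[0-9][0-9][0-9]*; do
        if [ -f "$script" ]; then
            basename "$script"
        fi
    done
}

# Demonstration in a throwaway directory (prefixes chosen for illustration).
demo_dir=$(mktemp -d)
touch "$demo_dir/020CSM_setupnis" "$demo_dir/010CSM_syslog"
list_in_run_order "$demo_dir"   # prints 010CSM_syslog, then 020CSM_setupnis
rm -rf "$demo_dir"
```

Leaving gaps between the numbers (010, 020, and so on) makes it easy to slot a new script between two existing ones later.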
CSM can work with SystemImager to capture golden images and clone them
to CSM nodes. See /opt/xcsm/clone/README for the details.
- MultipleCondition-Response - combine multiple
conditions together to form a complex condition
(including conditions that have to occur for a certain
length of time). See the comments in the beginning
of this script for directions on how to use it.
- ODCsamples - examples of conditions and responses
from IBM's On Demand Center.
- HPSsamples - more examples of conditions and
responses done for a pSeries customer.
Samples - /opt/xcsm/samples
You can use the CSM TEC adapter to forward cluster events to
the Tivoli Enterprise Console. See the README in the tivoli
sub-directory for instructions on how to set it up.
CSM Patches - /opt/xcsm/csm-patches
Most of the code in xCSM is structured so that it can be run
on top of the standard CSM release, without changing any CSM
files. But in some limited cases, this is not possible.
The files in this sub-directory are fixes or enhancements
to CSM files. Using this directory as the root, the files are
given the full path name of the corresponding CSM file. In
some cases the file is a full replacement, in other cases it is a
patch file. In both cases, proceed with caution:
always keep a backup of the file being replaced/patched,
and restore the original files before upgrading CSM.
- /opt/csm/install/pkgdefs/Linux-SLES9-ppc64.pm -
defines all the SLES 9 CDs so copycds will copy all RPMs.
This is necessary to do a full installation of the
node.
- /usr/sbin/rsct/pm/DBD/RSCT.pm.patch - fixes a bug
with lscsmtab. This patch applies to the 1.3.3 version of
CSM/RSCT and above. Apply this patch using: cd /usr/sbin/rsct/pm/DBD;
patch -b </opt/xcsm/csm-patches/usr/sbin/rsct/pm/DBD/RSCT.pm.patch
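The advice to keep a backup and restore the original before upgrading CSM could be wrapped in two small helpers. This is a minimal sketch, not part of xCSM: the function names and the .xcsm-orig suffix are assumptions chosen for the example.

```shell
#!/bin/sh
# Hypothetical helpers: keep a pristine copy of a CSM file before replacing
# or patching it, and put it back before upgrading CSM.

backup_csm_file() {
    # Never overwrite an existing backup, so that running this twice does
    # not replace the pristine copy with an already modified file.
    if [ ! -e "$1.xcsm-orig" ]; then
        cp -p "$1" "$1.xcsm-orig"
    fi
}

restore_csm_file() {
    # Move the pristine copy back into place if one was saved.
    if [ -e "$1.xcsm-orig" ]; then
        mv "$1.xcsm-orig" "$1"
    fi
}
```

For example, call backup_csm_file /opt/csm/install/pkgdefs/Linux-SLES9-ppc64.pm before copying the replacement into place, and restore_csm_file with the same path before upgrading CSM.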
Switch Utilities - /opt/xcsm/dev/switch
A utility to get the MAC addresses of nodes from a Cisco switch.
HPC Utilities - /opt/xcsm/hpc
Utilities to help install and set up open source HPC applications in the
cluster. See the READMEs in the subdirectories for instructions. These
utilities have not been updated in a while.
xCSM now provides two xCAT commands designed to work in a CSM
cluster: addclusteruser and pushuser. These commands give you the
ability to quickly add users throughout the cluster and to set up
OpenSSH keys to allow users unprompted ssh access from any node
in the cluster to any other node in the cluster. To use
addclusteruser and pushuser, do the following:
- Install xCAT on your
management server (the requirement for xCAT will be
removed in a future release).
- Run the xCAT-provided script: /opt/xcat/csm/sbin/csm2xcat
to set up the xCAT tab files using your CSM data as a
starting point.
- Set nisdomain to NA in /opt/xcat/etc/site.tab
- Update your PATH to include /opt/xcsm/xcat/sbin and export it
- Run addclusteruser. For example to add a user called
clusterguest run: addclusteruser -n clusterguest
- Then either use CFM to push out /etc/passwd and
/etc/group to the nodes, or run pushuser [noderange]
clusterguest
- Finally, run cfmupdatenode to push out the user's home
directory (which includes the needed OpenSSH keys): cfmupdatenode
-n [noderange]. Please note you need CSM version
1.4.0.13 or later for this CFM functionality. If you are
using an earlier version of CSM, push the user's home
directory to the nodes via NFS or dcp.
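The per-user steps above can be collected into a dry-run sketch that prints each command instead of running it. The names DRY_RUN, run, and add_cluster_user are hypothetical, and the noderange n001-n010 is only an example value. The one-time setup (installing xCAT, running csm2xcat, and editing site.tab) is assumed to be done already, and the sketch uses pushuser rather than CFM to distribute the account.

```shell
#!/bin/sh
# Dry-run sketch of the per-user steps. By default each command is only
# printed; set DRY_RUN=0 to execute them on a real management server.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

add_cluster_user() {
    user=$1
    noderange=$2
    # Make the xCSM-provided xCAT commands reachable.
    PATH=$PATH:/opt/xcsm/xcat/sbin
    export PATH
    run addclusteruser -n "$user"        # create the user
    run pushuser "$noderange" "$user"    # push the account to the nodes
    run cfmupdatenode -n "$noderange"    # push the home directory and keys
}

add_cluster_user clusterguest n001-n010
```

Reviewing the printed commands before setting DRY_RUN=0 is a cheap way to catch a wrong noderange.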
References