IBM Books

Hardware Planning and Control Guide


Remote power software and configuration

Once CSM cluster hardware and networking are configured as required, CSM software must be installed and configured to enable remote power functions. The CSM installms command installs the required csm.server package, including the /opt/csm/bin/rpower remote power command, on the management server. For detailed CSM installation instructions, see the IBM CSM for Linux: Software Planning and Installation Guide. For detailed command usage information, see the installms, definenode, and rpower man pages or the IBM CSM for Linux: Administration Guide.


Remote power software

The CSM management server communicates with hardware control points through the IBM Distributed Management Server (DMS) resource manager (IBM.DMSRM). The DMS resource manager supports IBM hardware control libraries and customized programs or scripts. IBM hardware control libraries manage the IBM RSA hardware control point. Customized programs or scripts are required to manage other hardware control points. Each library (or script) is responsible for communicating with the hardware control point and requesting the hardware action. The library or script returns the results of the request to the DMS resource manager, which returns the information to the user.


Remote power configuration

To configure remote power, the default hardware control point user IDs and passwords must be changed using the utility disks and documentation provided with the hardware. For MPs and MPAs the default user ID shipped with the system is "USERID" and the default password is "PASSW0RD" (P-A-S-S-W-zero-R-D). When a user runs the rpower command, the user ID and password information is automatically retrieved and decrypted. The rpower command can be run only from the management server, which restricts remote power to users with root access.

The MP and MPA user IDs and passwords stored for nodes on the management server must match the nodes' physical user IDs and passwords in the hardware. The systemid command must be run once for each MP and MPA to encrypt password information on the management server. Password files generated by the systemid command have the following properties:

Directory location: /etc/opt/csm/system_config 
 
File permissions (owner/group and permissions):  root/root read-only. 
For example, -r-------- 1  root root 20 May 3 12:31 9.111.111.11  
 
Naming convention: IP Address of host (if resolvable). For example, 9.111.111.11; 
otherwise, the node ID specified (for example, node01).
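The permission property listed above can be spot-checked with a short shell function. This is a minimal sketch, not part of CSM: the function name list_loose_password_files is hypothetical, and it checks only the file mode (owner read-only, 0400), not the root/root ownership.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with CSM): print every regular file in the
# password directory whose mode is anything other than owner read-only (0400).
list_loose_password_files() {
    # Default to the directory location given above; allow an override.
    dir=${1:-/etc/opt/csm/system_config}
    # "-perm 400" matches the mode exactly; "!" inverts the match.
    find "$dir" -type f ! -perm 400 -print
}
```

If the function prints nothing, every password file it found has the expected mode. Ownership can still be confirmed separately with ls -l.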

The following examples show how to create system IDs on the management server (all examples will prompt for a valid password):

  1. To create a system ID for HWControlPoint clsn05.pok.ibm.com, enter:
    systemid clsn05.pok.ibm.com USERID 
     
    
  2. To create a system ID for a node with node ID clsn07, enter:
    systemid clsn07 USERID 
     
    
  3. To verify that the system IDs have been created, enter:
    systemid
     
    

    Output should be similar to:

    9.001.001.01        USERID
    clsn06.pok.ibm.com  USERID
    

The remote power PowerMethod library type provided with CSM for Linux is netfinity. The HWControlNodeID attribute must contain the value shown in Table 2 based on the PowerMethod library type used:

Table 2. Hardware control attribute values

HWControlPoint attribute                    HWControlNodeID attribute            PowerMethod library
------------------------------------------  -----------------------------------  -------------------
host name of the IBM xSeries RSA PCI MPA    text ID associated with MP or MPA    netfinity

The following example shows how to change a management processor (MP) hardware text ID, user ID, and password using a Web browser session. For detailed information on the Cluster 1300 see the IBM Linux clusters hardware Web site listed in Related information. To change the text ID, user ID, and password of an MP in a cluster node:

  1. Use a Web browser to open a session to the RSA controller's IP address. You will be prompted for a user ID and password. Use the default user name and password to log in:
    User Name: USERID
    Password: PASSW0RD (P-A-S-S-W-zero-R-D)
    
  2. Click the Continue button.
  3. Click the Access Remote ASM button. You should see the nodes controlled by this MPA.
  4. Choose a node from the ASM Name list and click login to establish a session on the target node. Use the default user name and password to log in:
    User Name: USERID 	
    Password: PASSW0RD (P-A-S-S-W-zero-R-D)
    
  5. You should see a status window for the node. Click ASM Control => System Settings. Change the text ID for the node in the Name window and click Save.
  6. Click ASM Control => Login Profiles. Click on a profile with a USERID login ID, change the login ID, and reset the password. Click Save. Repeat this step to change additional profiles.
  7. Click Log Off Remote ASM to return to the Remote ASM Access window.

To change login information for an IBM RSA management processor adapter (MPA), perform the above steps but skip steps 3 and 4.


Remote power attributes

CSM remote power functions require attributes to be defined in the CSM database for the specific hardware control points used in the cluster. When new nodes are defined in a cluster using the definenode command, the required PowerMethod, HWControlPoint, and HWControlNodeID attributes are created in the CSM database. See the IBM CSM for Linux: Software Planning and Installation Guide or the man page for definenode command examples.

PowerMethod
The PowerMethod attribute specifies the type of hardware control points used in the cluster. The PowerMethod attribute value specifies which power method library the rpower command will use to control power functions on that node.

HWControlPoint
The HWControlPoint attribute specifies the short host name, long host name, or IP address of the Ethernet connection for the MPA. For IBM xSeries systems the MPA or hardware control point is the Ethernet connection to the IBM RSA PCI. The attribute value is passed from the rpower command to the corresponding PowerMethod library to contact the target node.

HWControlNodeID
The HWControlNodeID attribute specifies the IBM xSeries MP or MPA hardware text ID of the target node for the rpower hardware control action. The node's HWControlNodeID attribute value is passed through the netfinity library.

Setting a node's HWControlNodeID attribute to the short host name of the node can simplify the node definition process. The following definenode command example defines the node short host names as the HWControlNodeID attribute values. If the HWControlNodeID attribute values were not set to the node short host names, then the nodedef file would be used to specify each attribute value. If a node's HWControlNodeID is the short host name, then the following command can be run once to define all nodes attached to the hardware control point:

definenode -n clsn01.pok.ibm.com -c 20 -H mgtn03.pok.ibm.com:10,- mgtn04.pok.ibm.com:10 \
-C mgtn02.pok.ibm.com:1:0:16 mgtn05.pok.ibm.com:2:0:16 PowerMethod=netfinity ConsoleMethod=esp

All HWControlNodeID node attribute values attached to a hardware control point must be unique. For xSeries 330 and 342 nodes, the HWControlNodeID value must match the text ID set in the hardware. If HWControlNodeID values are changed to the short host names of the nodes, then the systemid command must be subsequently run to correctly set the new user ID and password information in the CSM database.


Using remote power with other hardware

If a hardware control point other than those listed in Table 2 is used, then additional software and configuration are required to enable remote power functions. Each hardware control point must have an associated Perl or shell command for communicating with the management server. The PowerMethod attribute for a node must be set accordingly by the definenode command when the node is defined. A corresponding PowerMethod_power command must also be provided. For example, if the PowerMethod attribute on a node is set to vendor1, then rpower will attempt to access a vendor1_power command to carry out remote power requests. Each node's HWControlPoint attribute must be set, using the definenode command, to the value expected by the PowerMethod_power command. For command usage examples see the definenode man page and the IBM CSM for Linux: Software Planning and Installation Guide.

To use remote power with other hardware control points, the following steps are required. In this example, vendor1 is used as the PowerMethod node attribute value:

  1. Connect the hardware control points to the management server and nodes on the management VLAN.
  2. Specify a value for the PowerMethod node attribute: for example, vendor1.
  3. Use the definenode or chnode command to set the PowerMethod value for each node to vendor1.
  4. Write a Perl or shell command named vendor1_power, and save the command in the /opt/csm/bin directory on the management server. The vendor1_power command will be run by the rpower command for nodes with PowerMethod attribute values set to vendor1. The vendor1_power command will carry out the requested power operation on one node, and must support some or all of the actions that rpower supports: on, off, reboot, query, resetsp_host, and resetsp_hcp. Actions not supported by vendor1_power should result in an error message written to standard error and a negative value return code. A successful call to vendor1_power should result in a return code of 0. The vendor1_power Perl or shell command should have root:system ownership and permissions of 500 (read and execute for the owner only).
    Note:
    The Perl or shell command provides the same function that a power method library does for IBM hardware. A power method library is not required for non-IBM hardware.
  5. The vendor1_power command will be passed the following parameters in the order shown:
    1. option_string (-v only)
    2. Hostname value
    3. HWControlPoint value
    4. HWControlNodeID value
    5. action (on, off, reboot, query, resetsp_host, or resetsp_hcp).
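The steps above can be sketched as a shell skeleton for vendor1_power. This is an illustration under stated assumptions, not code shipped with CSM: it assumes that option_string is always passed as the first of five parameters and is either -v or empty, it uses return code 255 (the value a calling process sees when a command exits with -1) for the "negative value return code", and vendor1_send is a hypothetical placeholder for whatever vendor-specific transport reaches the hardware control point. The sketch implements on, off, reboot, and query and rejects the other actions.

```shell
#!/bin/sh
# Sketch of a vendor1_power command (hypothetical; not shipped with CSM).

# Placeholder for the vendor-specific transport. A real command would contact
# the hardware control point here; this stub only echoes the request.
vendor1_send() {
    # $1 = HWControlPoint value, $2 = HWControlNodeID value, $3 = action
    echo "vendor1: $3 requested for $2 via $1"
}

vendor1_power() {
    # Assumption: rpower always passes five parameters, with option_string
    # first and either "-v" or empty.
    if [ "$#" -ne 5 ]; then
        echo "vendor1_power: expected 5 parameters, got $#" >&2
        return 255
    fi
    option_string=$1
    hostname=$2
    hcp=$3
    nodeid=$4
    action=$5

    case "$action" in
        on|off|reboot|query)
            if [ "$option_string" = "-v" ]; then
                # Verbose detail goes to standard error so that standard
                # output carries only the result of the request.
                echo "vendor1_power: $action $hostname ($nodeid at $hcp)" >&2
            fi
            vendor1_send "$hcp" "$nodeid" "$action"
            ;;
        *)
            # Unsupported action: message on standard error, nonzero return.
            echo "vendor1_power: action '$action' is not supported" >&2
            return 255
            ;;
    esac
}

# When installed as a command, the file would end with:
#   vendor1_power "$@"
```

Keeping the transport in a separate function means that only vendor1_send has to change when the sketch is adapted to real hardware. The parameter handling and return codes stay the same.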

Testing remote power hardware control

To ensure the cluster is configured correctly, all CSM remote power hardware control functions should be tested before using them in a production environment. The rpower command should be run with the query, power on, power off, reboot, resetsp_host, and resetsp_hcp options to verify that all nodes are configured correctly and are responding accordingly. See the rpower man page or the IBM CSM for Linux: Administration Guide for examples.

Node power status is determined by polling. The PowerStatus attribute value is the status returned from polling MPs and MPAs: on, off, or unknown. The polling interval is set to 300 seconds by default, but can be changed if required using the chrsrc command.

The following examples provide some methods for testing remote power configuration:

  1. To view current attribute values for nodes in a cluster, enter the following command on the management server:
    lsnode -l
    

    Output for each node in the cluster is similar to:

    ManagementServer = csmlinux.pok.ibm.com
    Mode = PreManaged
    Name = clsn02.pok.ibm.com
    NodeNameList = {csmlinux.pok.ibm.com}
    PowerMethod = esp
    PowerStatus = on
    Status = on
    UniversalId = 0
    UpdatenodeFailed = 0
    ------------------- 
    Hostname = clsn03.pok.ibm.com
    AllowManageRequest = 0 (no)
    CSMVersion =
    ConfigChanged = 0 (no)
    ConsoleMethod = esp
    ConsolePortNum =
    ConsoleServerName = clsn03.ppd.pok.ibm.com
    ConsoleServerNumber =
    HWControlNodeId = clsn03_NODEID
    HWControlPoint = clsn03.ppd.pok.ibm.com
    HWModel =
    HWSerialNum =
    HWType =
    InstallAdapterDuplex =
    InstallAdapterMacaddr = 00:00:00:00:00:00
    InstallAdapterSpeed =
    InstallAdapterType =
    InstallCSMVersion = 1.2.0
    InstallDisk =
    InstallDiskType =
    InstallDistributionName = RedHat
    InstallDistributionVersion = 7.2
    InstallKernelVersion =
    InstallMethod = kickstart
    InstallOSName = Linux
    InstallPkgArchitecture = i386
    LParID =
    LastCFMUpdateTime =
    ManagementServer = csmlinux.pok.ibm.com
    Mode = PreManaged
    Name = clsn03.pok.ibm.com
    NodeNameList = {csmlinux.pok.ibm.com}
    PowerMethod = esp
    PowerStatus = on
    Status = on
    UniversalId = 0
    UpdatenodeFailed = 0                             
    
  2. To power on multiple cluster nodes simultaneously, enter:
    rpower -n clsn01,clsn07,clsn13,clsn20 on
    
  3. To change the power status polling interval to the minimum allowed value of 30 seconds, enter:
    chrsrc -s 'Name=="clsn07.pok.ibm.com"' IBM.HwCtrlPoint PollingInterval=30
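When reviewing lsnode -l output such as that in example 1, it can help to flag the remote power attributes that are empty. The following sketch is not a CSM command: the function name unset_power_attrs is hypothetical, and it assumes the "Attribute = value" layout shown in the example output and that its input contains the attributes of a single node.

```shell
#!/bin/sh
# Hypothetical helper: read one node's "Attribute = value" lines on standard
# input and print each remote power attribute that is missing or has no value.
unset_power_attrs() {
    awk '
        BEGIN {
            need["HWControlPoint"] = 1
            need["HWControlNodeId"] = 1
            need["PowerMethod"] = 1
        }
        # A line such as "PowerMethod = esp" has at least three fields;
        # an empty attribute such as "HWModel =" has only two.
        ($1 in need) && $2 == "=" && NF >= 3 { seen[$1] = 1 }
        END {
            n = split("HWControlPoint HWControlNodeId PowerMethod", order, " ")
            for (i = 1; i <= n; i++)
                if (!(order[i] in seen)) print order[i]
        }
    '
}
```

No output means that all three attributes have values. Any name that is printed needs to be set before rpower can act on the node.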
    

Remote power diagnostics

The following examples show problems and solutions for specific instances of using remote power:

  1. Problem:
    Connectivity to the target hardware control point cannot be established.

    Description:
    Attempts to use trace routes or ping to the target hardware control point are unsuccessful. Connectivity from the management server to the target hardware control point cannot be established.

    Action:
    Contact your network administrator and check the hardware connectivity documentation to diagnose and solve the network connectivity problem.
  2. Problem:
    Cannot log in to the target hardware control point.

    Description:
    Connection to the targeted hardware control point cannot be established. Attempts to log in to the targeted hardware control point were unsuccessful or returned an error.

    Action:
    Confirm that the user ID and password file for the hardware control point exists in the /etc/opt/csm/system_config directory. Confirm that the user ID and password for the user are entered and encrypted correctly.
  3. Problem:
    Java interface error for method <action>: communication session is not valid.

    Description:
    The hardware control library successfully logged in to the service processor specified by the node HWControlPoint attribute, but could not log in to the node's service processor.

    Action:
    Confirm the user ID and password for the hardware control node ID for the node and rerun the systemid command with the correct user ID and password.
  4. Problem:
    Java interface error for method <action>: node not found.

    Description:
    The hardware control point specified by the node HWControlPoint attribute is not configured to control this node.

    Action:
    Check the CSM configuration to ensure the specified node HWControlPoint and HWControlNodeID attributes are correct for the node. If correct, ensure that the hardware control point service processor is configured correctly to control the node.
  5. Problem:
    Java interface error for method <connect>: service processor host name is not valid.

    Description:
    The service processor specified by the node's HWControlPoint attribute is not valid for the node's PowerMethod attribute specified.

    Action:
    Verify that the node HWControlPoint and PowerMethod attributes are valid for the node using the lsnode -F <Hostname> command. If they are not correct, change them using the chnode <Hostname> HWControlPoint=<HWControlPoint> PowerMethod=<PowerMethod> command.
  6. Problem:
    Java interface error for method <connect>: SPException.

    Description:
    The hardware control library was unable to log in to the service processor specified by the node HWControlPoint attribute.

    Action:
    Confirm the user ID and password for the hardware control point and run the systemid command again with the correct user ID and password.
  7. Problem:
    Could not perform action because one or more HWControlPoint, HWControlNodeId, and PowerMethod attributes are not set.

    Description:
    For the command to work properly the HWControlPoint, HWControlNodeId and PowerMethod attributes must be defined.

    Action:
    Verify that the HWControlPoint, HWControlNodeId, and PowerMethod attributes are defined using the /opt/csm/bin/lsnode -F <hostname> command. If these attributes are not set, then set them to their correct values using the /opt/csm/bin/chnode command and rerun the rpower command.
  8. Problem:
    Could not load hardware control library.

    Description:
    CSM hardware control requires the PowerMethod attribute to be set to "netfinity" for an IBM xSeries controlled node. The hardware control library corresponding to this attribute must also reside in /opt/csm/lib. For example, if the power method is netfinity then the hardware control library must be /opt/csm/lib/libnetfinity_power.so.

    Action:
    Verify that the PowerMethod attribute is set correctly using the lsnode -F <Hostname> command. For an IBM xSeries node, the power method value must be "netfinity". If it is not correct then set the value using the chnode <Hostname> PowerMethod=netfinity command. If the power method is set correctly, then verify that the /opt/csm/lib/libnetfinity_power.so library exists. If it does not, then reinstall the csm.server package.
  9. Problem:
    The hardware control point address specified is not valid.

    Description:
    The hardware control library could not resolve the host name or IP address of the service processor specified by the HWControlPoint attribute.

    Action:
    Verify that the HWControlPoint attribute is valid for the node and that the management server can reach the node.
  10. Problem:
    The Hostname attribute value specified for the node is not valid.

    Description:
    The specified host name is not a defined node resource.

    Action:
    Verify that the node specified by the Hostname attribute is a node resource using the lsnode <Hostname> command. If it is not valid then choose a valid node returned by the lsnode command or add the host name as a node resource.
  11. Problem:
    Cannot run the command because the IBM.DMSRM resource manager is not available.

    Description:
    The rpower command could not make a connection to the IBM.DMSRMd daemon. This daemon contains the IBM.NodeHwCtrl and IBM.HwCtrlPoint resource classes. The rpower command runs actions on these resource classes, which in turn call the appropriate hardware control library.

    Action:
    Ensure that the IBM.DMSRM daemon is running on the management server using the lssrc -s IBM.DMSRM command. If the output status field is "inoperative" then start the daemon using the startsrc -s IBM.DMSRM command.
  12. Problem:
    Could not find hardware control point for the node.

    Description:
    There is no corresponding hardware control point resource for the HWControlPoint attribute. This is an internal error that should not occur.

    Action:
    Stop the IBM.DMSRMd daemon using the stopsrc -s IBM.DMSRM command. Remove the hardware control point resources from the registry table using the /usr/sbin/rsct/bin/rmsrtbl/IBM/HwCtrlPoint/Resources command. Restart the IBM.DMSRMd daemon using the startsrc -s IBM.DMSRM command. Rerun the command. If this problem persists contact your IBM service representative.
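The status check in problem 11 can be scripted by extracting the status field from the lssrc output. The following is a minimal sketch: the function name dmsrm_status is hypothetical, and it assumes that lssrc prints the subsystem name as the first field of a line and the status as the last field.

```shell
#!/bin/sh
# Hypothetical helper: read "lssrc -s IBM.DMSRM" output on standard input and
# print the status of the IBM.DMSRM subsystem.
# Assumption: the subsystem name is the first field and the status the last.
dmsrm_status() {
    awk '
        $1 == "IBM.DMSRM" { print $NF; found = 1 }
        # Return nonzero if no line for the subsystem was seen.
        END { if (!found) exit 1 }
    '
}
```

A possible use is to pipe lssrc -s IBM.DMSRM into dmsrm_status and run startsrc -s IBM.DMSRM when the result is inoperative.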

Web and telnet access

Management processor adapters (MPAs) and management processors (MPs) can be controlled through a telnet session or Web browser. To access an MPA or MP using a telnet or Web connection, the http server must be installed and running, and the telnet session or Web browser must be targeted to the host name or IP address of the hardware control point. The physical login process for telnet and Web connections is the same as that shown in Figure 1. User IDs and passwords are required, and appropriate security measures should be implemented to restrict remote power control to users on the management VLAN, as described in CSM hardware and network requirements. Telnet and Web access provide alternative interfaces for tasks such as debugging.

Remote power logging

Log files are generated by specifying the -v flag on hardware control commands such as rpower. For example, the command:

rpower -v

could produce log files similar to the following examples:

An example trace file:

--------  Wed May 15 06:43:36 EDT 2002
06:43:36  CLASSPATH=/opt/csm/codebase:/opt/csm/codebase/asmlibrary.jar:/opt/csm/codebase/sniacimom.jar:  \
      /opt/csm/codebase/xerces.jar
06:43:36  LIBPATH=/opt/csm/lib
06:43:36  hardware control point -> clsn01.pok.ibm.com (9.111.111.111)
06:43:36  using default userid/password
06:43:36  connect(9.111.111.111,USERID,********)
06:43:36  >>>>> connect	: 274 ms
06:43:36  connect() returns 0
06:43:36  invoke_method (query,clsn01.pok.ibm.com,clsn09)
06:43:36  using default userid/password
06:43:36  searching for clsn09
06:43:36  8 nodes found: 
06:43:36  	clsn01
06:43:36  	clsn02
06:43:36  	clsn03
06:43:36  	clsn04
06:43:36  	clsn05
06:43:36  	cl2n06-> found on 485 bus
06:43:37  >>>>> connect485	: 1030 ms
06:43:37  send_command (query,clsn01.pok.ibm.com,clsn09)
06:43:37  c5n64 state = 4
06:43:38  >>>>> query	: 474 ms
06:43:38  query() returns 0
 

An example error file:

[9.111.111.11]: com.ibm.sysmgt.lib.exception.DestinationInvalidException
	at com.ibm.sysmgt.lib.comm.IP.DCSocketBase.openDestinationHelper(DCSocketBase.java:178)
	at com.ibm.sysmgt.lib.comm.IP.DCSocketBase.logon(DCSocketBase.java:335)
	at com.ibm.csm.hcnetfinity.Netfinity.connect(Netfinity.java:1443)
 

Command usage with output to stdout:

rpower -v -n clsn05.pok.ibm.com query
clsn05.pok.ibm.com on
 
resource class: IBM.NodeHwCtrl
response for: query
        mc_errnum = 0x00000000
        mc_ffdc_id =
        mc_error_msg =
        SD value is :
                Element 0: type=8, value=clsn05.pok.ibm.com
                Element 1: type=3, value=0
                Element 2: type=8, value=on
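
The timing lines in the example trace file (those marked with >>>>>) can be pulled out to see where a slow request spends its time. This is a minimal sketch that assumes the line layout shown in the example trace. The function name trace_timings is hypothetical, and the trace is read from standard input because the location of the trace file is not given here.

```shell
#!/bin/sh
# Hypothetical helper: print "operation milliseconds" for each timing line in
# a trace read on standard input. Timing lines have the form
#   06:43:36  >>>>> connect	: 274 ms
trace_timings() {
    # Field 2 is the ">>>>>" marker, field 3 the operation name, and the
    # last two fields are the elapsed time and the unit "ms".
    awk '$2 == ">>>>>" && $NF == "ms" { print $3, $(NF - 1) }'
}
```

Lines without the marker, such as "connect() returns 0", are ignored.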

