Last updated: July 06, 2012 - Raj Patel ( rajpat@us.ibm.com )
Recovering nodes when "cluster -status" shows them as "DOWN",
so that addnode or rmnode can be used successfully.
Output when one of the nodes is "DOWN" (in the example below, vio2 is DOWN):
============================================================================
$ cluster -status -clustername [cluster-name]
Cluster Name    State
ssp_cluster1    DEGRADED
Node Name    MTM                  Partition Num    State    Pool State
vio1         8202-E4B02067FECP    1                OK       OK
vio2                              0                DOWN
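The check above can be scripted. The following is a minimal sketch, not part of the original procedure: the function name down_nodes is hypothetical, and it assumes the status output has the layout shown above (node name in the first column, and a DOWN node's line ending with the word DOWN after at least two other fields).

```shell
# List node names whose line in "cluster -status" output ends with DOWN.
# Reads the status text on stdin.
# Assumption: a node line has at least 3 fields (name, partition number,
# state), which keeps the two-field "cluster name / state" line from matching.
down_nodes() {
    awk 'NF >= 3 && $NF == "DOWN" { print $1 }'
}
```

For example, piping the output of "cluster -status -clustername [cluster-name]" into down_nodes would print one node name per line, or nothing if no node is DOWN.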
Problem:
========
$ cluster -addnode -clustername [cluster-name] -hostname vio2.sspgroup.com
Node is already a cluster member
vio2.sspgroup.com
Command did not complete.
Solution:
=========
Try the steps below:
a) On vio1: export VIO_API_DEBUG=7
b) On vio1: cluster -list
c) On vio1: check whether vio_daemon lists the DBN
( it should, since the vio1 node is UP )
$ lssrc -ls vio_daemon
d) On vio1: cluster -status -clustername [cluster-name]
e) On vio1: cluster -sync -clustername [cluster-name]
f) On vio2: check caa daemons & ctrmc
$ lssrc -s ctrmc; lssrc -g caa
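Step f) can be reduced to a quick filter over the lssrc output. This is a sketch under assumptions: the function name inactive_subsystems is hypothetical, and it relies on the usual lssrc layout in which each header line starts with "Subsystem" and the last column of each subsystem line is its status.

```shell
# Print the names of subsystems that lssrc does not report as "active".
# Reads lssrc output on stdin and skips the header lines.
# Assumption: the status is the last field on each subsystem line.
# An lssrc error message, if present, would also be printed (first word only).
inactive_subsystems() {
    awk '$1 != "Subsystem" && NF >= 2 && $NF != "active" { print $1 }'
}
```

Piping "lssrc -s ctrmc; lssrc -g caa" (grouped in parentheses) into inactive_subsystems and getting no output would suggest the daemons are running, which points to case 2) rather than case 1).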
1) Pool daemons were not started/running
1a) On vio2: clstartstop to stop pool
$ clstartstop -stop -n [cluster-name] -m vio2
May report:
clmain.c cl_startstop 2955 Local node has not been STOPPED.
1b) On vio2: clstartstop to start pool
$ clstartstop -start -n [cluster-name] -m vio2
May report:
clmain.c cl_startstop 2955 Local node has not been STOPPED.
1c) On vio2: Wait up to 10 minutes for pool to be started.
1d) On vio1: cluster -sync -clustername [cluster-name] ( just to be sure )
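The wait in step 1c) can be turned into a polling loop instead of a fixed pause. The helper below is a generic sketch and an assumption of this note, not an IBM-supplied tool: wait_until is a hypothetical name, and the check command passed to it is whatever test you choose for "pool has started".

```shell
# Run a check command repeatedly until it succeeds or a timeout expires.
# usage: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 once the timeout is reached.
# Note: only the sleep time is counted, not the time COMMAND itself takes.
wait_until() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
}
```

For the 10-minute limit above, a call such as "wait_until 600 30 pool_is_up" would poll every 30 seconds, where pool_is_up is a hypothetical check that you would have to supply.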
2) Pool daemons were running, but pool didn't start
2a) On vio1: remove the vio2 node from the cluster
$ cluster -rmnode -clustername [cluster-name] -hostname vio2.sspgroup.com
May report:
Partition vio2 has been removed from the [cluster-name] cluster
2b) On vio1: check that the cluster no longer lists vio2
$ cluster -status -clustername [cluster-name]
Cluster Name      State
[cluster-name]    OK
Node Name    MTM                  Partition Num    State    Pool State
vio1         8202-E4B02067FECP    1                OK       OK
2c) On vio1: add vio2 back to cluster
$ cluster -addnode -clustername [cluster-name] -hostname vio2.sspgroup.com
Partition vio2 has been added to the [cluster-name] cluster.
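The verification in step 2b) can be sketched as a small test on the status output, to confirm the node is really gone before it is added back in step 2c). The function name node_listed is hypothetical, and the sketch assumes that the status output shows the short node name (vio2) in the first column, as in the examples above, rather than the fully qualified host name.

```shell
# Succeed (exit 0) if NODE appears as the first field of any line of
# "cluster -status" output read from stdin; fail (exit 1) otherwise.
# usage: node_listed NODE < status-output
node_listed() {
    awk -v node="$1" '$1 == node { found = 1 } END { exit found ? 0 : 1 }'
}
```

After the rmnode in step 2a), node_listed vio2 is expected to fail when fed the new status output; if it still succeeds, the removal has not taken effect and re-adding the node is likely to report "Node is already a cluster member" again.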
REF:
http://pic.dhe.ibm.com/infocenter/powersys/v3r1m5/index.jsp?topic=/p7hcgl/clstartstop.htm
Side Notes:
If a node has been rebooted and these services need to be restarted,
run "startsrc -g rsct" and "startsrc -g caa".