Tivoli Netcool/Impact Clustering

Nameserver clustering

  - What is clustering?
  - Why use clustering?

Before you configure clustering, be sure that you have installed Impact correctly and that the NCHOME environment variable is set correctly. Clustering is a feature that allows you to install multiple instances of the Netcool®/Impact nameserver and configure them to provide failover capability. Clustering prevents the nameserver from becoming a single point of failure in a Netcool/Impact installation.

Nameserver clustering

  - Install Impact on another machine (the Secondary Cluster member)
  - Create a new server instance, using one of two methods:
      - During installation
      - By running $NCHOME/impact/install/nci_new_server (-console)
  - The Cluster Group name has to be identical to the Primary's
  - Use a unique Instance name for easier server identification, for example NCI_B
  - Editing the nameserver.props and web.xml files requires a server restart for the properties to take effect
  - First stop both servers, if they are running

The secondary nameservers are responsible for providing failover functionality for the primary nameserver. They do not communicate with the Netcool applications in real time, unless the primary nameserver fails and a secondary nameserver assumes the role of the primary nameserver. You must create a new server instance on the machine where your secondary Impact installation resides. You can specify the instance name and cluster group name either during installation or afterwards with the nci_new_server program, which is located in the $NCHOME/impact/install directory. The "-console" option runs this program in console mode. The Secondary Cluster Group name must be identical to the Primary Cluster Group name, and it is strongly recommended that you use a unique Instance name for easier server identification.
An example of a cluster group name, which is shared by both the primary and secondary cluster members, is "NCICLUSTER". An example of an instance name for the primary is NCI_P, and for the secondary NCI_B. Before you make any changes, make sure that you have stopped both the Primary and Secondary Impact cluster servers.

Nameserver clustering

Edit both servers' $NCHOME/impact/etc/nameserver.props:

    impact.nameserver.0.host=<Primary's hostname>
    impact.nameserver.0.port=<Primary's HTTP port>
    impact.nameserver.0.location=/nameserver/services
    impact.nameserver.1.host=<Secondary's hostname>
    impact.nameserver.1.port=<Secondary's HTTP port>
    impact.nameserver.1.location=/nameserver/services
    impact.nameserver.count=2
    impact.nameserver.ssl_enabled=false
    impact.nameserver.netcall_timeout=5
    impact.nameserver.userid=admin
    impact.nameserver.password=netcool

Edit the properties file on both the Primary and the Secondary server. The file is located at $NCHOME/impact/etc/nameserver.props. The text highlighted in blue on the slide marks what needs to be modified in, or added to, this file. The files on the two servers need to be identical. You will need to know the hostname and HTTP port of each server.

Nameserver clustering

Edit both servers' $NCHOME/guiserver/etc/nameserver.props:

    nameserver.0.host=<Primary's hostname>
    nameserver.0.port=<Primary's HTTP port>
    nameserver.0.location=/nameserver/services
    nameserver.1.host=<Secondary's hostname>
    nameserver.1.port=<Secondary's HTTP port>
    nameserver.1.location=/nameserver/services
    nameserver.count=2
    nameserver.ssl_enabled=false
    nameserver.netcall_timeout=5
    nameserver.userid=admin
    nameserver.password=netcool

Edit this properties file in the same way as described on the previous slide. Note that this file belongs to the GUI server, and that its property names do not carry the "impact." prefix.
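Because the two nameserver.props files must list the same cluster members on both servers, it can help to generate their contents from one list rather than typing them by hand. The following is a minimal sketch, not part of the product: the function name nameserver_props, the hostnames, and the port 8080 are illustrative assumptions, and the output is only printed, so you would still copy it into the files yourself.

```python
def nameserver_props(members, prefix="impact.nameserver",
                     userid="admin", password="netcool"):
    """Render nameserver.props text for a list of (hostname, http_port) members.

    The first entry is the Primary, the second the Secondary.
    This helper is hypothetical; it only formats the properties shown above.
    """
    lines = []
    for index, (host, port) in enumerate(members):
        lines.append(f"{prefix}.{index}.host={host}")
        lines.append(f"{prefix}.{index}.port={port}")
        lines.append(f"{prefix}.{index}.location=/nameserver/services")
    lines.append(f"{prefix}.count={len(members)}")
    lines.append(f"{prefix}.ssl_enabled=false")
    lines.append(f"{prefix}.netcall_timeout=5")
    lines.append(f"{prefix}.userid={userid}")
    lines.append(f"{prefix}.password={password}")
    return "\n".join(lines) + "\n"


# Hypothetical hostnames and ports; substitute the values for your servers.
MEMBERS = [("primary.example.com", 8080), ("secondary.example.com", 8080)]

# Contents for $NCHOME/impact/etc/nameserver.props
impact_props = nameserver_props(MEMBERS)
# Contents for $NCHOME/guiserver/etc/nameserver.props (no "impact." prefix)
gui_props = nameserver_props(MEMBERS, prefix="nameserver")

print(impact_props)
print(gui_props)
```

Generating both files from the same MEMBERS list keeps the Primary at index 0 and the Secondary at index 1 in every file, which is the ordering the slides use.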
Nameserver clustering

Edit both servers' $NCHOME/guiserver/install/stage/nameserver/WEB-INF/web.xml:

    REPLICANT.COUNT        2
    REPLICANT.0.HOST       <Primary's hostname>
    REPLICANT.0.PORT       <Primary's HTTP port>
    REPLICANT.0.LOCATION   /nameserver/services

Edit the web.xml file at the path listed here. The changes are highlighted in blue on the slide.

Nameserver clustering

Edit both servers' $NCHOME/guiserver/install/stage/nameserver/WEB-INF/web.xml (cont.):

    REPLICANT.1.HOST       <Secondary's hostname>
    REPLICANT.1.PORT       <Secondary's HTTP port>
    REPLICANT.1.LOCATION   /nameserver/services
    SELFINDEX              0   (set this to 0 on the Primary and 1 on the Secondary)

Make the same changes to the web.xml file at the same path on the secondary server. Be sure to set the SELFINDEX parameter correctly: 0 in the primary cluster member's web.xml file and 1 in the secondary cluster member's web.xml file.

Impact clustering

Redeploying the GUI server

  - Start the primary clustered server first
  - Run $NCHOME/guiserver/install/ncgui_createear
  - Check the netcool.log file for proper server startup:

    13:21:46,498 DEBUG [Debug] Event Processor got primary status update 1 PrimaryMode: true
    13:21:46,498 INFO [ClusterMember] NCI_P is now acting as primary cluster member
    13:21:46,549 INFO [ImpactServerGBean] Server startup in 27707 ms
    13:21:46,549 INFO [ImpactServerGBean] Impact instance [NCI_P] started successfully

Whenever you change the web.xml files, you should redeploy the Primary GUI server. First make sure that the GUI server is running. Then run the $NCHOME/guiserver/install/ncgui_createear program.
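Before you redeploy, it can be worth double-checking that the web.xml parameters on the two servers differ only in SELFINDEX. The following is a minimal sketch of that check on plain name/value pairs. It does not read or write web.xml, because the XML element layout is not shown on the slides. The function names and hostnames are illustrative assumptions.

```python
def replicant_params(members, self_index):
    """Build the REPLICANT.* and SELFINDEX name/value pairs for one server.

    members    -- list of (hostname, http_port) tuples, Primary first
    self_index -- position of this server in the list (0 = Primary, 1 = Secondary)
    """
    if not 0 <= self_index < len(members):
        raise ValueError("self_index must refer to an entry in members")
    params = {"REPLICANT.COUNT": str(len(members))}
    for index, (host, port) in enumerate(members):
        params[f"REPLICANT.{index}.HOST"] = host
        params[f"REPLICANT.{index}.PORT"] = str(port)
        params[f"REPLICANT.{index}.LOCATION"] = "/nameserver/services"
    params["SELFINDEX"] = str(self_index)
    return params


def differing_keys(first, second):
    """Return the sorted parameter names whose values differ between two servers."""
    return sorted(key for key in set(first) | set(second)
                  if first.get(key) != second.get(key))


# Hypothetical hostnames and ports; substitute the values for your servers.
MEMBERS = [("primary.example.com", 8080), ("secondary.example.com", 8080)]

primary_params = replicant_params(MEMBERS, self_index=0)
secondary_params = replicant_params(MEMBERS, self_index=1)

# Only SELFINDEX should differ between the two servers.
print(differing_keys(primary_params, secondary_params))
```

If anything other than SELFINDEX shows up in the list of differences, the REPLICANT entries on the two servers are out of step and should be corrected before running ncgui_createear.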
Impact clustering

Redeploy the Secondary GUI server

  - Start the secondary cluster member (NCI_B)
  - Run $NCHOME/guiserver/install/ncgui_createear
  - Check the netcool.log file for proper startup of the secondary:

    13:48:43,734 INFO [ClusterMember] Starting the Cluster member with name: NCI_B
    13:48:43,893 INFO [UddiClusterBootStrapper] Established connection with Nameserver [rmi://9.52.130.213:45589/NCICLUSTER]; found primary cluster member ClusterMember_Stub[UnicastRef2 [liveRef: [endpoint:[9.52.130.213:45591,com.micromuse.common.rmi.LoggingRMIClientSocketFactory@1f](remote),objID:[-14a8e9d:11e3c8616b3:-8000, 2]]]]
    13:48:43,894 INFO [ClusterMember] A valid primary is running with name: NCI_P
    13:48:43,899 INFO [UddiClusterBootStrapper] Established connection with Nameserver [rmi://9.52.130.213:45589/NCICLUSTER]; found primary cluster member ClusterMember_Stub[UnicastRef2 [liveRef: [endpoint:[9.52.130.213:45591,com.micromuse.common.rmi.LoggingRMIClientSocketFactory@1f](remote),objID:[-14a8e9d:11e3c8616b3:-8000, 2]]]]
    13:48:43,924 INFO [ClusterMember] NCI_B is now acting as Secondary cluster member.
    …
    13:49:06,979 INFO [ClusterMember] ClusterMember started standby services in standby mode
    13:49:07,007 INFO [ImpactServerGBean] Server startup in 24690 ms
    13:49:07,007 INFO [ImpactServerGBean] Impact instance [NCI_B] started successfully

Now redeploy the Secondary GUI server. Make sure that the GUI server is running, then run the $NCHOME/guiserver/install/ncgui_createear program. After the program finishes, you should see messages similar to these in the netcool.log file, indicating a proper startup.
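The startup check can also be scripted by searching netcool.log for the two messages that matter: the "is now acting as ... cluster member" line and the "started successfully" line. The following is a minimal sketch under stated assumptions: it assumes the log wording matches the excerpts shown on these slides, and the function name startup_summary is illustrative. Reading the actual log file is left out because its location is not given here.

```python
import re

# Patterns based only on the log excerpts shown on the slides.
ROLE_PATTERN = re.compile(
    r"\[ClusterMember\] (\S+) is now acting as (primary|secondary) cluster member",
    re.IGNORECASE,
)
STARTED_PATTERN = re.compile(r"Impact instance \[([^\]]+)\] started successfully")


def startup_summary(log_text):
    """Report the instance name, its last announced role, and whether startup completed."""
    summary = {"instance": None, "role": None, "started": False}
    for line in log_text.splitlines():
        role_match = ROLE_PATTERN.search(line)
        if role_match:
            summary["instance"] = role_match.group(1)
            summary["role"] = role_match.group(2).lower()
        if STARTED_PATTERN.search(line):
            summary["started"] = True
    return summary


# Excerpt copied from the secondary startup log shown above.
SAMPLE_LOG = """\
13:48:43,894 INFO [ClusterMember] A valid primary is running with name: NCI_P
13:48:43,924 INFO [ClusterMember] NCI_B is now acting as Secondary cluster member.
13:49:06,979 INFO [ClusterMember] ClusterMember started standby services in standby mode
13:49:07,007 INFO [ImpactServerGBean] Server startup in 24690 ms
13:49:07,007 INFO [ImpactServerGBean] Impact instance [NCI_B] started successfully
"""

print(startup_summary(SAMPLE_LOG))
```

Run against the primary's log excerpt, the same function should report the role as primary for NCI_P. A result with started still False suggests that the server has not finished starting, or that the log wording differs from these excerpts.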
Impact clustering

If the primary cluster member (NCI_P) goes down…

  - Check the NCI_B netcool.log file to verify that it has taken over as the primary server:

    12:21:06,263 WARN [SoapServiceNameserverUtil] Unable to create a DynamicBinding for http://dingy:8080/nameserver/services DynamicBindingException- LABEL: NSSERVICES_COMM_ERROR DESC: DynamicBindingException: Communication problems with Nameserver Services: Connection refused
    …
    12:21:06,284 INFO [Server] Registered Impact soap endpoint at port '8080' of host [9.52.130.197] with the Nameserver
    12:21:06,284 DEBUG [Debug] Event Processor got primary status update 1 PrimaryMode: true
    12:21:06,284 DEBUG [Debug] RoundRobinEventQueueManager got primary status update 1 PrimaryMode: true
    12:21:06,284 INFO [ClusterMember] Assumed the role of primary

  - If the secondary cluster member becomes the primary cluster member and the former primary is restarted, the former primary becomes the new secondary.

If the Primary server goes down, you will see messages similar to these in the netcool.log file of the secondary server. Note that the server that has been up the longest is always the Primary Cluster member, even if the original Primary Cluster member goes down and is restarted. The original primary becomes Primary again only if you stop the original Secondary Cluster member, which is now acting as the Primary.