If the Workload Management component is not properly distributing the workload across servers in a multi-node configuration, use these steps to isolate the problem.
Troubleshooting the Workload Management component involves the following basic steps:
Eliminate environment or configuration issues
First, determine the health of the cluster. In other words, are the servers capable of serving the applications for which they have been enabled? To do this, you must identify the cluster that is exhibiting the problem.
If you are experiencing workload management problems related to HTTP requests, such as HTTP requests not being served by all members of the cluster, be aware that the HTTP plug-in balances the load across all servers that are defined in the PrimaryServers list if affinity has not been established. If you do not have a PrimaryServers list defined, the plug-in load balances across all servers defined in the cluster if affinity has not been established. If affinity has been established, the plug-in should go directly to that server for all requests.
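The selection rule described above can be sketched as a small function. This is only an illustration of the rule as stated in this article, not the plug-in's actual implementation; the function name `candidate_servers` and its parameters are hypothetical.

```python
from typing import List, Optional, Sequence


def candidate_servers(
    cluster_members: Sequence[str],
    primary_servers: Sequence[str],
    affinity_server: Optional[str] = None,
) -> List[str]:
    """Return the servers a request may be routed to (hypothetical helper).

    Mirrors the rule in the text:
    - affinity established      -> only the affinity server
    - PrimaryServers defined    -> balance across that list
    - no PrimaryServers defined -> balance across all cluster members
    """
    if affinity_server is not None:
        return [affinity_server]
    if primary_servers:
        return list(primary_servers)
    return list(cluster_members)
```

Under this rule, a cluster member that is missing from a defined PrimaryServers list receives no non-affinity HTTP requests, which is one configuration cause worth ruling out when some members appear idle.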
For workload management problems relating to enterprise bean requests, such as enterprise bean requests not getting served by all members of a cluster:
Note: The remainder of this article deals with enterprise bean workload balancing only. For more help on diagnosing problems in distributing Web (HTTP) requests, view the topics HTTP plug-in component troubleshooting tips and Web resource (JSP, servlet, html file, image, etc) will not display.
Browse log files for WLM errors and CORBA minor codes
If you still encounter problems with enterprise bean workload management, the next step is to check the activity log for entries that show:
To do this, open the service log (activity.log) on the affected servers, and look for the following entries:
Note: It is not unusual for a server to be marked unusable. The server may be tagged unusable for normal operational reasons, such as a ripple start being executed while there is still a load on the server from a client.
If any of these warnings are encountered, follow the user response given in the log. If the warnings persist after you follow the user response, use the Log Analyzer to review any other errors and warnings on the affected servers, looking for:
You may also see exceptions with "CORBA" as part of the exception name, because WLM uses CORBA (Common Object Request Broker Architecture) to communicate between processes. Look for a statement in the exception stack specifying a "minor code". These codes denote the specific reason a CORBA call or response could not complete. WLM minor codes fall in the range 0x4921040 - 0x492104F. For an explanation of minor codes related to WLM, see the Javadoc for the package and class com.ibm.websphere.wlm.WsCorbaMinorCodes.
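When many exceptions have to be triaged, a short range check can separate WLM minor codes from the rest. The sketch below uses the range exactly as given in this article and assumes the minor code has already been copied out of the stack trace as a hexadecimal string; the helper names are hypothetical.

```python
# Range boundaries as stated in this article.
WLM_MINOR_CODE_MIN = 0x4921040
WLM_MINOR_CODE_MAX = 0x492104F


def parse_minor_code(text: str) -> int:
    """Parse a hexadecimal minor code, with or without a leading 0x.

    Assumption: the code is written in hexadecimal in the log.
    """
    return int(text.strip(), 16)


def is_wlm_minor_code(code: int) -> bool:
    """Return True if the code falls within the WLM range (inclusive)."""
    return WLM_MINOR_CODE_MIN <= code <= WLM_MINOR_CODE_MAX
```

For example, `is_wlm_minor_code(parse_minor_code("0x4921042"))` is true, whereas a code of `0x4921050` falls just outside the range and points to a different component.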
Analyze PMI data
The purpose of analyzing the PMI data is to understand the workload arriving for each member of a cluster. The data for any one member of the cluster is only useful within the context of the data of all the members of the cluster.
Use the Tivoli Performance Viewer to verify that, based on the weights assigned to the cluster members (the steady-state weights), each server is getting the correct proportion of the requests.
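As a rough aid for this comparison, the expected proportion for each member can be computed from the weights and set against the observed request counts. The following is a minimal sketch, assuming the weights and per-server request counts have been read manually from the administrative configuration and the Tivoli Performance Viewer; the function names and the sample values are hypothetical.

```python
from typing import Dict


def expected_shares(weights: Dict[str, float]) -> Dict[str, float]:
    """Expected fraction of requests per member: weight / sum of weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("total weight must be positive")
    return {name: w / total for name, w in weights.items()}


def share_deviation(
    weights: Dict[str, float], request_counts: Dict[str, int]
) -> Dict[str, float]:
    """Observed share minus expected share for each cluster member.

    A value near zero means the member receives roughly the proportion
    of requests that its weight implies.
    """
    expected = expected_shares(weights)
    total_requests = sum(request_counts.get(name, 0) for name in weights)
    if total_requests == 0:
        raise ValueError("no requests observed")
    return {
        name: request_counts.get(name, 0) / total_requests - expected[name]
        for name in weights
    }
```

With hypothetical weights of 2, 2, and 4 and observed counts of 500, 0, and 500, the deviations come out as +0.25, -0.25, and 0.0: the first member is over-served and the second receives nothing. Small deviations are normal over short sampling windows, so compare counts accumulated over a longer period.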
To turn on PMI metrics using the Tivoli Performance Viewer:
WLM PMI metrics can be viewed on a server-by-server basis. In the Tivoli Performance Viewer, select Node -> Server -> WorkloadManagement -> Server/Client. By default, the data is shown in raw form in a table, collected every 10 seconds, as an aggregate number. You can also choose to see the data as a delta or rate, add or remove columns, clear the buffer, reset the metrics to zero, and change the collection rate and buffer size.
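The relationship between the aggregate, delta, and rate views can be illustrated with a few lines of arithmetic. This is a sketch of the general idea only, assuming the aggregate is a running total sampled at a fixed interval (10 seconds by default, per the text); it does not reproduce how the Tivoli Performance Viewer computes its views, and the function names are hypothetical.

```python
from typing import List, Sequence


def deltas(samples: Sequence[float]) -> List[float]:
    """Change between consecutive aggregate samples."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


def rates(samples: Sequence[float], interval_seconds: float = 10.0) -> List[float]:
    """Per-second rate for each interval: delta divided by interval length."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return [d / interval_seconds for d in deltas(samples)]
```

For aggregate samples of 100, 130, 190, and 190, the deltas are 30, 60, and 0, and the rates at a 10-second interval are 3.0, 6.0, and 0.0 requests per second. A negative delta in this sketch would indicate that the metrics were reset to zero between samples.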
Resolve problem or contact IBM support
If the client logs indicate an error in WLM, collect the following information and contact IBM support.
If none of these steps solves the problem, check to see if the problem has been identified and documented using the links in Diagnosing and fixing problems: Resources for learning. If you do not see a problem that resembles yours, or if the information provided does not solve your problem, contact IBM support for further assistance.
For current information available from IBM Support on known problems and their resolution, see the IBM Support page.
IBM Support has documents that can save you time gathering information needed to resolve this problem. Before opening a PMR, see the IBM Support page.