WebSphere Extended Deployment, Version 6.0.x
Operating Systems: AIX, HP-UX, Linux, Solaris, Windows, z/OS


Health management

With health management, the system takes a policy-driven approach to monitoring the application server environment and takes action when specified conditions are detected.

Health monitoring and management subsystem

The health management subsystem continuously monitors the state of servers and the work that is performed by the servers in your environment. The health management subsystem consists of two main elements: the health controller and health policies.

The health controller is the autonomic manager that controls the health monitoring and management subsystem, and acts on your health policies to ensure that certain conditions exist. The health controller runs on one of the nodes in your environment. You can enable or disable health management through the health controller while keeping multiple health policies defined on the system. You can also limit how often a server restarts or prohibit restarts during certain periods.
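
To illustrate the restart limits described above, the following Python sketch models a guard that refuses a restart when the same server restarted too recently or when the current time falls inside a prohibited period. It is a conceptual model only: the class name `RestartGuard`, its parameters, and the time-window representation are hypothetical and are not part of the WebSphere Extended Deployment configuration or API.

```python
from datetime import datetime, time, timedelta


class RestartGuard:
    """Hypothetical model of restart limits; not a product API."""

    def __init__(self, min_interval, prohibited_windows=()):
        # Minimum time that must pass between two restarts of one server.
        self.min_interval = min_interval
        # (start, end) times of day during which restarts are not allowed.
        self.prohibited_windows = list(prohibited_windows)
        # Server name -> time of its most recent restart.
        self._last_restart = {}

    def _in_prohibited_window(self, now):
        t = now.time()
        for start, end in self.prohibited_windows:
            if start <= end:
                if start <= t < end:
                    return True
            elif t >= start or t < end:
                # The window wraps past midnight, for example 22:00 to 06:00.
                return True
        return False

    def may_restart(self, server, now):
        """Return True if a restart of `server` is allowed at time `now`."""
        if self._in_prohibited_window(now):
            return False
        last = self._last_restart.get(server)
        return last is None or now - last >= self.min_interval

    def record_restart(self, server, now):
        self._last_restart[server] = now
```

In this model a caller checks `may_restart` before acting on a policy violation and calls `record_restart` afterward, so the frequency limit is tracked separately for each server.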

Health policies define the health conditions that you want to monitor in your environment and the health actions to take if these conditions are not met.

The health management subsystem functions when WebSphere Extended Deployment is in automatic or supervised operating mode. When the reaction mode on the policy is set to automatic, the health management subsystem takes action when a health policy violation is detected. When the reaction mode is set to supervised, the health management subsystem creates a runtime task that proposes one or more reactions. The system administrator can approve or deny the proposed actions.
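
The difference between the two reaction modes can be sketched in Python as follows. This is a conceptual model under stated assumptions, not the product implementation: `ReactionMode`, `RuntimeTask`, `HealthReactor`, and the action strings are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class ReactionMode(Enum):
    AUTOMATIC = "automatic"
    SUPERVISED = "supervised"


@dataclass
class RuntimeTask:
    """Hypothetical stand-in for a runtime task that awaits a decision."""
    server: str
    proposed_actions: list
    status: str = "pending"  # becomes "approved" or "denied"


@dataclass
class HealthReactor:
    executed: list = field(default_factory=list)  # (server, action) pairs that ran
    tasks: list = field(default_factory=list)     # tasks created in supervised mode

    def on_violation(self, mode, server, actions):
        if mode is ReactionMode.AUTOMATIC:
            # Automatic mode: act immediately, with no administrator step.
            self.executed.extend((server, a) for a in actions)
            return None
        # Supervised mode: only propose the actions and wait for a decision.
        task = RuntimeTask(server, list(actions))
        self.tasks.append(task)
        return task

    def approve(self, task):
        task.status = "approved"
        self.executed.extend((task.server, a) for a in task.proposed_actions)

    def deny(self, task):
        task.status = "denied"
```

In the sketch, a supervised violation changes nothing on the server until `approve` is called, which mirrors the administrator decision described above.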

Health conditions

Health conditions define the variables that you want to monitor in your environment. Several categories of health policy conditions exist. The following list defines the existing health conditions:
  • Excessive memory consumption, which can indicate a memory leak
  • Excessive response time, which can indicate that the server is in an endless loop
  • Excessive request timeout, which can indicate that the server is in an endless loop
  • The volume of work performed by a server
  • Storm drain detection, which relies on change point detection on given time series data
  • The age of the server

Health actions

Health actions define the process to use when a health condition is not met. Depending on the conditions that you define, the actions can vary.

The following list defines the health actions that can be run in your environment:

Health policy targets

Health policy targets can be a single server, each of the servers in a cluster or dynamic cluster, or each of the servers in a cell. You can define multiple health policies to monitor the same set of servers.
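
As a rough illustration of how a policy target expands to individual servers, and of how several policies can cover the same server, consider the following Python sketch. The topology dictionary, the policy dictionaries, and both function names are hypothetical; the product stores and resolves targets in its own configuration.

```python
def resolve_targets(topology, target_type, name=None):
    """Expand a policy target into the list of servers that it monitors.

    topology: {"clusters": {cluster_name: [servers]}, "standalone": [servers]}
    (a hypothetical structure that stands in for the cell configuration).
    """
    if target_type == "server":
        return [name]
    if target_type == "cluster":
        return list(topology["clusters"][name])
    if target_type == "cell":
        servers = list(topology["standalone"])
        for members in topology["clusters"].values():
            servers.extend(members)
        return servers
    raise ValueError(f"unknown target type: {target_type}")


def policies_for_server(policies, topology, server):
    """Return the names of all policies whose target includes `server`."""
    return [
        p["name"]
        for p in policies
        if server in resolve_targets(topology, p["type"], p.get("target"))
    ]
```

For example, a server that belongs to a cluster is covered by a policy that targets that server, by a policy that targets its cluster, and by any cell-wide policy at the same time.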

Default health policies [Version 6.0.1 and later]

Default health policies are a set of predefined, supervised mode, cell-level policies that are installed with WebSphere Extended Deployment. You can modify the default policies for your environment, or delete them. Because the default health policies monitor each server in supervised mode, you can use these policies to prevent health problems. In addition to the default policies, you can define policies with more detailed settings or automatic mode operation for particular servers or collections of servers.

Five default cell-wide health policies are created during installation. Each default health policy corresponds to one policy type:
  • Default memory leak: Default standard detection level
  • Default excessive memory usage: Set to 95 percent of the JVM heap size for 15 minutes
  • Default excessive request timeout: Set for 5 percent of the requests timing out
  • Default excessive response time: Set to 120 seconds
  • Default storm drain: Default standard detection level
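
The numeric defaults in this list can be read as simple threshold checks. The following Python sketch shows one possible interpretation of the excessive memory usage and excessive request timeout defaults. The function names, the sampling format, and the choice of "at or above" comparisons are assumptions made for illustration; the source does not describe how the product evaluates these conditions internally.

```python
from datetime import datetime, timedelta

# Default values taken from the list above.
MEMORY_PERCENT = 95                   # percent of the JVM heap size
MEMORY_DURATION = timedelta(minutes=15)
TIMEOUT_PERCENT = 5                   # percent of requests that time out


def sustained_above(samples, limit, duration):
    """Return True if the most recent samples stay at or above `limit`
    for a continuous span of at least `duration`.

    samples: list of (timestamp, value) pairs in ascending time order.
    """
    span_start = None
    for ts, value in samples:
        if value >= limit:
            if span_start is None:
                span_start = ts
        else:
            span_start = None  # a dip below the limit resets the span
    if span_start is None:
        return False
    return samples[-1][0] - span_start >= duration


def memory_violation(samples, heap_size):
    """Assumed reading of '95 percent of the JVM heap size for 15 minutes'."""
    limit = heap_size * MEMORY_PERCENT / 100
    return sustained_above(samples, limit, MEMORY_DURATION)


def timeout_violation(timed_out, total):
    """Assumed reading of '5 percent of the requests timing out'."""
    if total == 0:
        return False
    return 100 * timed_out / total >= TIMEOUT_PERCENT
```

Because the default policies run in supervised mode, a violation in this model would lead to a proposed action for the administrator to review rather than to an immediate reaction.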

To view the recommendations that are made by the default health policies and to take actions on these recommendations, click System administration > Task management > Runtime tasks.




Related concepts
Health management and long-running work
Related tasks
Configuring health management
Creating health policies
Managing runtime tasks

Last updated: Nov 30, 2007 3:58:31 PM EST
http://publib.boulder.ibm.com/infocenter/wxdinfo/v6r0/index.jsp?topic=/com.ibm.websphere.xd.doc/info/odoe_task/codhealth.html