Health management
With the health management feature in Liberty, you can take a policy-driven approach to monitoring the application server environment and take action when unhealthy conditions are detected.
You can define health policies, which include the health conditions to monitor in your environment and the health actions to take when those conditions are met.
Health conditions
Health conditions define the variables that you want to monitor in your environment. The condition element defines the behavior that triggers a health policy. Only one condition element can be defined per health policy. You can choose from the following predefined health conditions:
- Excessive request timeout condition
- Specifies a percentage of HTTP requests that can time out. When the percentage of requests that time out exceeds the defined value, the health actions run. The timeout value depends on your environment configuration.
<excessiveRequestTimeout timeoutPercentage="5"/>
Note: Dynamic Routing must be enabled to use this condition.
- Excessive response time condition
- Tracks the average amount of time that requests take to complete. If the time exceeds the defined response time threshold, the health actions run.
<excessiveResponseTime responseTime="10s"/>
Note: Requests that exceed the timeout value that is configured for the excessive request timeout condition are not counted toward this health condition. For example, if the default timeout value is set to 60 seconds, any request that exceeds 60 seconds times out and is not included in the average response time calculation for this health condition. This restriction applies even if you do not have an excessive request timeout condition defined.
Note: Dynamic Routing must be enabled to use this condition.
- Memory condition: excessive memory usage
- Tracks the memory usage for a member. When the memory usage exceeds a percentage of the heap size for a specified time, the health actions run.
<excessiveMemoryUsage heapSizePercentage="85" timePeriod="5m"/>
- Memory condition: memory leak
- When a downward trend in free memory is detected, the health actions run.
<memoryLeak/>
Health actions
Health actions define the activities to perform when a health condition is met. Each action element defines an action to take in response to a detected condition. All actions share the element type of <action>, and the action attribute determines which action is taken. Multiple actions can be defined for each health policy, and they run in the order in which they are specified in the policy. The following table lists the health actions that are supported in Liberty server environments:
Health action | Liberty servers that run in the same collective controller |
---|---|
Restart server | Supported |
Take thread dumps | Supported |
Take Java™ virtual machine (JVM) heap dumps | Supported for servers that are running on the IBM® JRE or JDK |
Enter server into maintenance mode | Supported |
Exit server out of maintenance mode | Supported |
<action action="generateThreadDump"/>
<action action="generateHeapDump"/>
<action action="restartServer"/>
<action action="enterMaintenanceMode"/>
<action action="exitMaintenanceMode"/>
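Because actions run in the order in which they are specified, a policy can collect diagnostic data before it restarts a server. The following fragment is a sketch of one such ordering. It uses only the action values listed above; the choice of these three actions is illustrative, and heap dumps apply only to servers that run on the IBM JRE or JDK.

```xml
<!-- Illustrative ordering only: actions run top to bottom, as specified. -->
<!-- Collect diagnostic data first, while the unhealthy state can still be observed. -->
<action action="generateThreadDump"/>
<action action="generateHeapDump"/>
<!-- Restart the server last. -->
<action action="restartServer"/>
```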
Health targets
Each target type has a unique element that is used to define it within the healthPolicy element. More than one target can be specified per health policy. A policy can target:
- A host
<host hostName="someHost"/>
- Each of the servers in a cluster
<cluster clusterName="someCluster"/>
- A single server
<server hostName="Host" wlpUsrDirectory="/opt/ibm/liberty/wlp" serverName="Server"/>
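As an illustration of how a condition, actions, and targets fit together, the following sketch combines them in a single policy. Only the child elements and their attributes come from the examples on this page. The placement of the condition and action elements directly inside healthPolicy, the id attribute, and its value memoryPolicy are assumptions made for this example, so verify them against the configuration reference for your Liberty version.

```xml
<!-- Hypothetical policy: the id value and the nesting shown here are assumptions. -->
<healthPolicy id="memoryPolicy">
    <!-- Only one condition element per policy. -->
    <excessiveMemoryUsage heapSizePercentage="85" timePeriod="5m"/>

    <!-- Actions run in the order in which they are listed. -->
    <action action="generateHeapDump"/>
    <action action="restartServer"/>

    <!-- More than one target can be specified per policy. -->
    <cluster clusterName="someCluster"/>
    <host hostName="someHost"/>
</healthPolicy>
```

In this sketch, a member of someCluster or a server on someHost that stays above 85 percent of its heap size for five minutes would first produce a heap dump and then be restarted.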