PQ57190: ENHANCEMENT TO HARDWARE FAILOVER DETECTION
APAR status
Closed as program error.

Error description
The HTTP Transport is loaded under a parent HTTP process that spawns multiple threads. Each HTTP Server thread calls the HTTP Transport to determine whether a request is intended for WebSphere. Once that determination is made, the HTTP Transport selects an AppServer to service the request. If the selected AppServer is unavailable due to a hardware failure, the HTTP request is directed to another AppServer and the unresponsive AppServer is flagged as unavailable. After a Retry interval has expired, the HTTP Transport attempts to connect to the failed AppServer.

The detection of a failed AppServer by a single thread has minimal impact on performance. However, when multiple threads go through the same process of discovering a failed AppServer node, the performance impact can be significant. This APAR ensures that only one HTTP Transport thread attempts to reconnect to a previously failed AppServer, minimizing the performance impact of hardware failures.

Local fix
No workaround exists.

Problem summary
USERS AFFECTED: WebSphere Application Server version 4.0.0, 4.0.1, and 4.0.2 users who use the webserver plugins.
PROBLEM DESCRIPTION: All webserver threads get stuck trying to see if a backend app server has come back up. As a result, no new incoming work can be handled.
RECOMMENDATION:

The plugin did not limit the attempt to rediscover a downed clone to a single webserver thread. If the clone could not be contacted to determine whether the port was up or down, all threads could end up stuck waiting for the connect to time out.

Problem conclusion
The plugin now uses only one of the webserver threads to see if a downed clone has come back up. As a result, the other threads are free to handle incoming requests and use only the app servers that are known to be up.

Temporary fix

Comments
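The single-thread retry behavior described in the problem conclusion can be illustrated with a minimal Python sketch. This is not the plugin's code: the names BackendState, pick_server, and connect are hypothetical, and the sketch assumes that a lock-protected flag is enough to let exactly one thread claim the reconnect attempt after the retry interval has expired.

```python
import threading
import time


class BackendState:
    """Tracks whether one app server is believed up and who may re-probe it.

    Hypothetical helper for illustration only; not part of the plugin.
    """

    def __init__(self, retry_interval, clock=time.monotonic):
        self._lock = threading.Lock()
        self._retry_interval = retry_interval  # seconds to wait before a re-probe
        self._clock = clock
        self._down_since = None  # None means the server is believed to be up
        self._probing = False    # True while one thread is re-checking the server

    def mark_down(self):
        # Record the failure time and release any probe claim.
        with self._lock:
            self._down_since = self._clock()
            self._probing = False

    def mark_up(self):
        with self._lock:
            self._down_since = None
            self._probing = False

    def is_up(self):
        with self._lock:
            return self._down_since is None

    def try_claim_probe(self):
        """Return True for at most one caller once the retry interval has expired."""
        with self._lock:
            if self._down_since is None or self._probing:
                return False
            if self._clock() - self._down_since < self._retry_interval:
                return False
            self._probing = True
            return True


def pick_server(backends, connect):
    """Return the name of a usable server, or None if none is available.

    `connect` is an assumed callable that returns True when the named
    server accepts a connection. Only the thread that wins the probe
    claim calls it for a downed server; every other thread skips that
    server and moves on to the next one.
    """
    for name, state in backends.items():
        if state.is_up():
            return name
        if state.try_claim_probe():
            if connect(name):
                state.mark_up()
                return name
            state.mark_down()  # still down: restart the retry interval
    return None
```

In this sketch a thread that loses the claim never blocks on the connect timeout, which mirrors the intent of the fix: the other threads keep serving requests from servers known to be up. The first-available selection order is a simplification and does not model the plugin's load balancing.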
APAR is sysrouted FROM one or more of the following:

APAR is sysrouted TO one or more of the following:

Modules/Macros
Document Information
Product categories: Software > Application Servers > Distributed Application & Web Servers > WebSphere Application Server > General
Operating system(s):
Software version: 400
Software edition:
Reference #: PQ57190
IBM Group: Software Group
Modified date: Feb 20, 2002
(C) Copyright IBM Corporation 2000, 2006. All Rights Reserved.