PQ67826: NETWORK IS UNREACHABLE BECAUSE THERE IS NO SO_KEEP_ALIVE OPTION FOR THE ORB
APAR status
Closed as program error.

Error description
The ORB team has provided the customer with a test patch that enables the socket's so_keep_alive feature, so that a network failure can be detected when a user detaches a network cable. With or without the socket's so_keep_alive enabled, the ORB application does not receive any IOException back from the socket object used in AIX JDK 1.3.1 (for WAS404).

Local fix
Code change

Problem summary
****************************************************************
* USERS AFFECTED: All WebSphere Application Server users of    *
*                 WLM client for quicker failover.             *
****************************************************************
* PROBLEM DESCRIPTION: WebSphere client (WLMed or non-WLMed)   *
*                      is either slow or unable to detect a    *
*                      broken network when a peer connection   *
*                      is unreachable.                         *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
When using two AIX machines, one running the client and the other running the server, the client application can hang for a long time if the user unplugs the network cable on the client box. In this typical failure scenario, the user has a WLMed client and multiple backend EJB servers. When one of the EJB servers is unreachable, the WLMed client hangs for a long time before switching to a working clone.

To fix this problem, apply this efix to allow the socket object to have so_keepalive enabled. In addition to this fix, also set -tcp_keepidle to 'no' in the JVM and TCP/IP layer to allow the socket timeout to take effect. Also, change the TCP/IP timeout to a smaller value. This efix is applicable to all platforms, although it was originally reported by an AIX user.

Problem conclusion
When a remote process is unreachable, WAS might hang for a long period of time and be unable to switch to a new WLM server.

This efix adds a setKeepAlive call for the sockets used for RMI-IIOP connections. This allows the socket to throw an IOException back to the ORB component, and the ORB, in turn, throws CORBA.COMM_FAILURE back to its caller (that is, WLM).

Temporary fix

Comments
APAR is sysrouted FROM one or more of the following:

APAR is sysrouted TO one or more of the following:

Modules/Macros

SRLS
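The behavior described under Problem conclusion can be illustrated with a minimal Java sketch. It is not the efix code: the class KeepAliveSketch, the methods openWithKeepAlive and readByte, and the CommFailure exception are hypothetical names introduced here, and CommFailure only stands in for the ORB's CORBA.COMM_FAILURE. The sketch also assumes a JDK newer than 1.3.1, because it uses the connect-with-timeout call. The only socket API it relies on is the standard java.net.Socket.setKeepAlive method.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class KeepAliveSketch {

    /** Hypothetical stand-in for the ORB's communication-failure exception. */
    public static class CommFailure extends RuntimeException {
        public CommFailure(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Connects to host:port and enables SO_KEEPALIVE on the socket.
     * connectTimeoutMs bounds only the connect attempt; how quickly a dead
     * peer is noticed afterwards depends on the operating system's TCP
     * keepalive settings, which are outside the scope of this sketch.
     */
    public static Socket openWithKeepAlive(String host, int port, int connectTimeoutMs)
            throws IOException {
        Socket socket = new Socket();
        try {
            socket.connect(new InetSocketAddress(host, port), connectTimeoutMs);
            socket.setKeepAlive(true); // ask the TCP layer to probe an idle peer
            return socket;
        } catch (IOException e) {
            socket.close();
            throw e;
        }
    }

    /**
     * Reads one byte and translates a socket-level IOException into the
     * caller-visible failure, mirroring the ORB-to-WLM hand-off in the APAR.
     */
    public static int readByte(Socket socket) {
        try {
            return socket.getInputStream().read();
        } catch (IOException e) {
            throw new CommFailure("peer unreachable: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) throws IOException {
        // A local listener is used only so the sketch can run on one machine.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = openWithKeepAlive("127.0.0.1", server.getLocalPort(), 2000)) {
            System.out.println("SO_KEEPALIVE enabled: " + client.getKeepAlive());
        }
    }
}
```

In this sketch a caller such as a workload-management layer would catch CommFailure and retry against another server. Enabling keepalive on the socket does not by itself shorten detection time; that is governed by the TCP/IP keepalive timers that the APAR recommends lowering.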
Document Information
Product categories: Software > Application Servers > Distributed Application & Web Servers > WebSphere Application Server > General
Operating system(s):
Software version: 400
Software edition:
Reference #: PQ67826
IBM Group: Software Group
Modified date: Nov 14, 2002
(C) Copyright IBM Corporation 2000, 2006. All Rights Reserved.