PQ98500: ORB CONNECTIONS WILL NOT BE CLOSED RESULTING IN MEMORY LEAK

 A fix is available

PQ98500; 4.0.6: Memory leak from Orb connections left open



APAR status
Closed as program error.

Error description
ORB connections stop being closed by the server and are only
removed by I/O exceptions on the client side.  This connection
leak may also appear to be a memory leak, since each connection
also uses native resources.
Local fix
Problem summary
****************************************************************
* USERS AFFECTED: All WebSphere Application Server users.      *
****************************************************************
* PROBLEM DESCRIPTION: Application Server stops closing TCP/IP *
*                      connections resulting in memory growth  *
*                      until an out of memory condition is     *
*                      reached.                                *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
Under certain conditions, the Application Server ORB will stop
closing inactive connections.  When in this state, connections
are only closed by an explicit close from the client side, or
by an I/O exception raised when a client application is stopped
without explicitly closing the connection.  This condition can
be observed as an unusually large number of open connections on
the server (with netstat, for example).  If left long enough,
the JVM that Application Server is running under can no longer
create threads for new connections because it has run out of
memory.
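
One way to quantify the connection build-up is to tally netstat
output by TCP state for the ORB listener port.  The following is
a minimal sketch, not part of the fix.  It assumes Linux-style
"netstat -an" columns (protocol, Recv-Q, Send-Q, local address,
foreign address, state); other platforms format the output
differently.  The class name, method name, and port are
hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: counts TCP connections per state for one local port.
// Assumes Linux-style "netstat -an" lines such as:
//   tcp  0  0 10.0.0.5:9000  10.0.0.9:51234  ESTABLISHED
final class NetstatCountSketch {

    static Map<String, Integer> countByState(String netstatOutput, int localPort) {
        Map<String, Integer> counts = new TreeMap<>();
        String suffix = ":" + localPort;
        for (String line : netstatOutput.split("\\R")) {
            String[] fields = line.trim().split("\\s+");
            // Skip headers, non-TCP lines, and anything without a state column.
            if (fields.length < 6 || !fields[0].startsWith("tcp")) {
                continue;
            }
            // fields[3] is the local address, fields[5] is the TCP state.
            if (!fields[3].endsWith(suffix)) {
                continue;
            }
            counts.merge(fields[5], 1, Integer::sum);
        }
        return counts;
    }
}
```

A count of ESTABLISHED connections on the ORB port that keeps
growing across samples, without a matching rise in client
activity, is consistent with the condition described above.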

The following is an example of what you may see in an ORB trace:

[1/24/05 17:59:02:420 EST] 558fb8dd  e
UOW=16089-558fb8dd-8538311:
win16 source=ORBRas class=com.ibm.rmi.corba.PluginRegistry
method=ensureDefaultPrereqsGroupTwo:495 user23 org=IBM
prod=JDK
component=ORB
          java.lang.OutOfMemoryError:
JVMCI015:OutOfMemoryError,
cannot create anymore threads due to memory or resource
constraints
parm1=java.lang.OutOfMemoryError: JVMCI015:OutOfMemoryError,
cannot
create anymore threads due to memory or resource constraints
 at java.lang.Thread.start(Native Method)
 at com.ibm.rmi.iiop.WorkerThread.<init>(ThreadPoolImpl.java
(Inlined Compiled Code))
 at com.ibm.rmi.iiop.WorkerPool.createWorkerThread
(ThreadPoolImpl.java(Compiled Code))
 at com.ibm.rmi.iiop.WorkerPool.createThread
(ThreadPoolImpl.java(Inlined Compiled Code))
<...>
Problem conclusion
What is happening is that a connection cache entry for a port
gets orphaned.  This means that it shares a real connection
with other entries in the cache for that host/port, but it has
lost its place in the alias table.  When the other connections
are removed, the orphan isn't in the alias table and it isn't
removed from the cache.  When the cleanUp code gets to this
orphan entry in the cache, since it shared the real connection
object with the other cache entries, it is now the lucky
holder of a connection that is closed.  When a cleanUp is
attempted on an entry with state=closed, an exception is
thrown, which is a problem for the base ORB's cleanUp method
(APAR PQ99675).

The problem is that the routine that is doing the "put" to the
connection cache assumed that it would only be called for a
brand new connection.  It creates a new alias vector for this
connection and overlays whatever was there before.  The code
that adds an additional cache entry for that connection calls
the "put" routine, then adds itself to the alias vector, which
gives a weird bookend effect [two entries for the same
hostname showing up in the cache both before and after its IP
address]. The code change will make the "put" to the connection
cache tolerant of being used to add a new connection or an
additional cache entry for an alias.  It will also fix the
double-IP and the bookend effect.
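
The difference between the two "put" behaviors can be sketched
with a simplified model.  This is an illustration only, not the
IBM ORB source: the class, the method names, and the use of maps
and lists in place of the real cache and alias vector are all
assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a connection cache with an alias table.
final class ConnectionCacheSketch {

    // Stand-in for a real connection shared by several cache entries.
    static final class Connection {
        final String id;
        Connection(String id) { this.id = id; }
    }

    // Cache key (for example "host:port") -> connection.
    private final Map<String, Connection> cache = new HashMap<>();
    // Connection -> every cache key that refers to it (the alias list).
    private final Map<Connection, List<String>> aliases = new HashMap<>();

    // Models the faulty behavior: assumes the connection is brand new and
    // overlays whatever alias list was already recorded for it.
    void putOverlaying(String key, Connection c) {
        cache.put(key, c);
        List<String> fresh = new ArrayList<>();
        fresh.add(key);
        aliases.put(c, fresh);
    }

    // Models the tolerant behavior: reuses an existing alias list, so the
    // same call works for a new connection or an additional alias.
    void putTolerant(String key, Connection c) {
        cache.put(key, c);
        List<String> list = aliases.computeIfAbsent(c, k -> new ArrayList<>());
        if (!list.contains(key)) {
            list.add(key);
        }
    }

    // Removes a connection together with every cache entry listed as its alias.
    void remove(Connection c) {
        List<String> list = aliases.remove(c);
        if (list != null) {
            for (String key : list) {
                cache.remove(key);
            }
        }
    }

    int size() { return cache.size(); }
}
```

In this model, adding a hostname entry and then an IP-address
entry for the same connection with putOverlaying leaves the
hostname entry behind after remove, which corresponds to the
orphan described above.  With putTolerant, both entries are
removed.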

This fix will be available as iFix PQ98500_Fix.jar.
Temporary fix
Comments
APAR information
APAR number PQ98500
Reported component name WEBSPHERE AE AI
Reported component ID 5630A2200
Reported release 400
Status CLOSED PER
PE NoPE
HIPER NoHIPER
Submitted date 2004-12-15
Closed date 2005-01-26
Last modified date 2005-01-26

APAR is sysrouted FROM one or more of the following:

APAR is sysrouted TO one or more of the following:

Modules/Macros
ORB          

SRLS

Fix information

Applicable component levels
R400 PSY    UP


Document Information


Product categories: Software > Application Servers > Distributed Application & Web Servers > WebSphere Application Server > General
Operating system(s):
Software version: 400
Software edition:
Reference #: PQ98500
IBM Group: Software Group
Modified date: Jan 26, 2005