APAR status
Closed as program error.
Error description
A WebSphere "hang" within the nodeagent occurs intermittently
when security is enabled. A deadlock has occurred within the
security code. The customer must stop and restart the
nodeagent to recover. These exceptions are seen in the
SystemOut.log for the nodeagent:
...
[1/6/04 14:42:00:603 EST] 3bcd04 LdapRegistryI E SECJ0352E:
Could not get the users matching the pattern root because of
the following exception javax.naming.CommunicationException:
mmfd0009:389. Root exception is java.net.ConnectException:
Connection refused
(stack trace not shown to save space)
...
[1/6/04 14:42:00:611 EST] 3bcd04 LdapRegistryI E SECJ0336E:
Authentication failed for user root because of the following
exception com.ibm.websphere.security.CustomRegistryException:
mmfd0009:389.
(stack trace not shown to save space)
...
Root exception is java.net.ConnectException: Connection refused
...
[1/6/04 14:42:00:616 EST] 3bcd04 LTPAServerObj E SECJ0369E:
Authentication failed when using LTPA. The exception is
com.ibm.websphere.security.CustomRegistryException: mmfd0009:389
(stack trace not shown to save space)
...
Root exception is java.net.ConnectException: Connection refused
...
[1/6/04 14:42:00:636 EST] 3bcd04 JaasLoginHelp E SECJ4001E:
Login failed for root/mmfd0009:389
com.ibm.websphere.security.auth.WSLoginFailedException:
mmfd0009:389
...
[1/6/04 14:42:00:643 EST] 3bcd04 ContextManage E SECJ0270E:
Failed to get actual credentials. The exception is
com.ibm.websphere.security.auth.WSLoginFailedException:
mmfd0009:389
(stack trace not shown to save space)
...
Root exception is java.net.ConnectException: Connection refused
...
[1/6/04 14:42:00:656 EST] 20cf3b LdapRegistryI E SECJ0336E:
Authentication failed for user root because of the following
exception com.ibm.websphere.security.CustomRegistryException:
mmfd0009:389
------------ thread dump analysis --------------
Analysis of the thread dump of the hang for the nodeagent shows
a Java-level deadlock:
"Alarm : 8":
waiting to lock monitor 0xbc388 (object 0xe8d70eb0, a [B),
which is locked by "Thread-9794"
"Thread-9794":
waiting to lock monitor 0xbc3c0 (object 0xe8d59c80, a
com.ibm.ws.security.auth.ContextManagerImpl),
which is locked by "Alarm : 8"
---
Java Stack for "Alarm : 8":
at com.ibm.ws.security.auth.WSCredentialImpl.getExpiration
(WSCredentialImpl.java:666)
at com.ibm.ws.security.auth.ContextManagerImpl.getServerSubject
(ContextManagerImpl.java:1018)
... (rest of stack not shown for brevity)
.
Java Stack for "Thread-9794":
==========
at com.ibm.ws.security.auth.ContextManagerImpl.
getServerCredential(ContextManagerImpl.java:1106)
- waiting to lock <e8d59c80> (a
com.ibm.ws.security.auth.ContextManagerImpl)
at com.ibm.ISecurityLocalObjectBaseL13Impl.VaultImpl.
getServerCred(VaultImpl.java:2620)
at com.ibm.ISecurityLocalObjectBaseL13Impl.
CredentialsImpl.is_valid(CredentialsImpl.java:2023)
at com.ibm.ws.security.auth.WSCredentialImpl.isCurrent
(WSCredentialImpl.java:957)
at com.ibm.ws.security.auth.WSCredentialImpl._assert
(WSCredentialImpl.java:1164)
at com.ibm.ws.security.auth.WSCredentialImpl.
getCredentialToken(WSCredentialImpl.java:444)
- locked <e8d70eb0>
at com.ibm.ISecurityLocalObjectTokenBaseImpl.
WSSecurityContextLTPAImpl.initSecContext
(WSSecurityContextLTPAImpl.java:124)
... (rest of stack not shown for brevity)
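
The lock-order inversion in the dump above can be reproduced in miniature. The following is only a sketch, not the WebSphere code: the class name, thread names, and the two lock objects are hypothetical stand-ins for the ContextManagerImpl monitor and the byte array monitor. It starts two daemon threads that take the same two monitors in opposite order, then uses the standard ThreadMXBean API to confirm that the JVM reports them as deadlocked.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class LockOrderDeadlockSketch {

    // Starts a daemon thread that locks 'first', waits until the peer thread
    // also holds its first lock, and then tries to lock 'second'.
    static Thread lockInOrder(String name, Object first, Object second,
                              CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await();
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // Never reached: the peer holds 'second' and wants 'first'.
                }
            }
        }, name);
        t.setDaemon(true); // lets the JVM exit although the threads stay stuck
        t.start();
        return t;
    }

    // Returns true if the JVM reports both threads in a monitor deadlock.
    public static boolean detectsDeadlock() {
        Object contextLock = new Object(); // stand-in for ContextManagerImpl
        Object tokenLock = new byte[0];    // stand-in for the "[B" monitor
        CountDownLatch bothHoldFirst = new CountDownLatch(2);

        Thread alarmLike = lockInOrder("alarm-like", contextLock, tokenLock, bothHoldFirst);
        Thread requestLike = lockInOrder("request-like", tokenLock, contextLock, bothHoldFirst);

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        try {
            for (int attempt = 0; attempt < 100; attempt++) {
                long[] ids = mx.findMonitorDeadlockedThreads(); // null if none
                if (ids != null && contains(ids, alarmLike.getId())
                        && contains(ids, requestLike.getId())) {
                    return true;
                }
                Thread.sleep(50);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return false;
    }

    private static boolean contains(long[] ids, long id) {
        for (long candidate : ids) {
            if (candidate == id) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("deadlock detected: " + detectsDeadlock());
    }
}
```

The latch only makes the interleaving deterministic for the demonstration. In the reported hang the same interleaving happens by chance, which is consistent with the problem being intermittent.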
Keywords: node agent hang wait dead lock
Local fix
Problem summary
****************************************************************
* USERS AFFECTED: WebSphere Application Server users who have *
* enabled security. *
****************************************************************
* PROBLEM DESCRIPTION: The server may intermittently hang. *
****************************************************************
* RECOMMENDATION: *
****************************************************************
The server may intermittently hang. A thread dump shows that
the methods getServerCredential() in the class
com.ibm.ws.security.auth.ContextManagerImpl and getExpiration()
in the class com.ibm.ws.security.auth.WSCredentialImpl
are each waiting on a monitor held by the other thread and are
deadlocked.
Problem conclusion
Synchronization in getServerCredential() and getExpiration()
was not necessary and was removed.
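
As an illustration of why removing the synchronization resolves the hang, here is a minimal sketch. It is not the actual fix: the Credential and Manager classes are hypothetical, and the use of a final field and a volatile field to make the unsynchronized reads safe is an assumption, since the APAR does not say how the real code keeps those reads safe. Each thread still holds one monitor while calling into the other object, but because the getters no longer take a lock, neither thread blocks.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class UnsynchronizedGetterSketch {

    // Hypothetical credential: the expiration is fixed at construction,
    // so reading it needs no monitor.
    static final class Credential {
        final byte[] token = new byte[0]; // monitor held while building a token
        private final long expirationMillis;

        Credential(long expirationMillis) {
            this.expirationMillis = expirationMillis;
        }

        long getExpiration() { // not synchronized
            return expirationMillis;
        }
    }

    // Hypothetical manager: the credential reference is volatile,
    // so the getter can return it without locking the manager.
    static final class Manager {
        private volatile Credential serverCredential;

        Manager(Credential serverCredential) {
            this.serverCredential = serverCredential;
        }

        Credential getServerCredential() { // not synchronized
            return serverCredential;
        }
    }

    // Counts down and waits until both threads hold their first monitor.
    private static void meet(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Returns true if both threads complete within five seconds.
    public static boolean bothThreadsFinish() {
        Credential cred = new Credential(System.currentTimeMillis() + 60_000L);
        Manager mgr = new Manager(cred);
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch done = new CountDownLatch(2);

        Thread alarmLike = new Thread(() -> {
            synchronized (mgr) {        // holds the manager monitor
                meet(bothHoldFirst);
                cred.getExpiration();   // no second monitor needed
            }
            done.countDown();
        }, "alarm-like");

        Thread requestLike = new Thread(() -> {
            synchronized (cred.token) { // holds the token monitor
                meet(bothHoldFirst);
                mgr.getServerCredential(); // no second monitor needed
            }
            done.countDown();
        }, "request-like");

        alarmLike.setDaemon(true);
        requestLike.setDaemon(true);
        alarmLike.start();
        requestLike.start();

        try {
            return done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("both threads finished: " + bothThreadsFinish());
    }
}
```

The sketch forces the same interleaving that produces the deadlock when the getters are synchronized. With the locks removed from the getters, the cycle of waiting threads cannot form.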
Temporary fix
Provide test fix
Comments
APAR information
APAR number: PQ83910
Reported component name: WAS BASE 5.0
Reported component ID: 5630A3600
Reported release: 00S
Status: CLOSED PER
PE: NoPE
HIPER: NoHIPER
Special Attention: NoSpecatt
Submitted date: 2004-01-29
Closed date: 2004-03-16
Last modified date: 2004-03-16
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Modules/Macros
Publications Referenced
Applicable component levels
R003 PSY UP
R00A PSY UP
R00H PSY UP
R00I PSY UP
R00P PSY UP
R00S PSY UP
R00W PSY UP
R103 PSY UP
R10A PSY UP
R10H PSY UP
R10I PSY UP
R10P PSY UP
R10S PSY UP
R10W PSY UP