NullPointerException is written to trace.log but all processes complete successfully
 Technote (troubleshooting)
 
Problem (Abstract)
A NullPointerException is written to trace.log, but all processes complete successfully:
java.lang.NullPointerException at com.ibm.bpe.framework.ProcessMDB.processEngineMessage(ProcessMDB.java:675)
 
Resolving the problem

Exception in trace.log
java.lang.NullPointerException at com.ibm.bpe.framework.ProcessMDB.processEngineMessage(ProcessMDB.java:675)
(...Stack trace continues)

Quiesce/Resume algorithm
Process Choreographer uses JMS messages to navigate through multi-transacted processes. One of these messages cannot be processed if its transaction has to be rolled back.
For this case, Process Choreographer implements a Quiesce/Resume algorithm that makes sure that messages are copied to a hold queue if and only if they are poisoned.
In short, the algorithm works as follows:
  • A message is retried 3 times.
  • If the message could not be processed in 3 attempts, it is removed from the BPEIntQueue and copied to the BPERetQueue (the retention queue), which works as a buffer. If the buffer contains more than a configurable number of messages, the system switches to quiesce mode: because several messages in a row could not be processed, the infrastructure is assumed to be failing.
  • In both quiesce mode and normal processing mode, the next message on the queue is processed. If that works, that is, the transaction can be committed, all messages in the retention queue are copied back to the BPEIntQueue for a retry. If the system was in quiesce mode, the successful processing of this message signals that the infrastructure is working again, and the system switches back to normal processing mode.
  • Messages that have passed through the retention queue a configurable number of times are assumed to be poisoned and are copied to the BPEHoldQueue.
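
The steps above can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not Process Choreographer code: the class, the Msg type, and the parameter names retentionLimit and maxRetentionPasses are hypothetical, the three queues are plain Java collections rather than JMS queues, and the sketch assumes that a message leaves the BPEIntQueue on its third failed attempt.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

/** Toy in-memory model of the steps above; not Process Choreographer code. */
class QuiesceResumeSketch {
    static final int MAX_RETRIES = 3; // retry limit named in the text

    /** Hypothetical message: an id plus the two counters the steps rely on. */
    static final class Msg {
        final String id;
        int retryCount;       // failed attempts since the message last entered the queue
        int retentionPasses;  // times the message has been moved to the retention queue
        Msg(String id) { this.id = id; }
    }

    final Deque<Msg> intQueue = new ArrayDeque<>();  // stands in for BPEIntQueue
    final List<Msg> retQueue = new ArrayList<>();    // stands in for BPERetQueue
    final List<Msg> holdQueue = new ArrayList<>();   // stands in for BPEHoldQueue
    final int retentionLimit;      // assumed name: configurable buffer threshold
    final int maxRetentionPasses;  // assumed name: configurable pass limit
    boolean quiesced;

    QuiesceResumeSketch(int retentionLimit, int maxRetentionPasses) {
        this.retentionLimit = retentionLimit;
        this.maxRetentionPasses = maxRetentionPasses;
    }

    void enqueue(String id) { intQueue.add(new Msg(id)); }

    /** Processes the next message; the handler returns true when the transaction commits. */
    void step(Predicate<Msg> handler) {
        Msg m = intQueue.poll();
        if (m == null) return;                       // queue is empty, nothing to do
        if (handler.test(m)) {
            // Success: the infrastructure works, so retry everything that was buffered.
            for (Msg r : retQueue) { r.retryCount = 0; intQueue.add(r); }
            retQueue.clear();
            quiesced = false;
        } else if (++m.retryCount < MAX_RETRIES) {
            intQueue.addFirst(m);                    // rollback: message is delivered again
        } else if (m.retentionPasses >= maxRetentionPasses) {
            holdQueue.add(m);                        // assumed to be poisoned
        } else {
            m.retentionPasses++;
            retQueue.add(m);
            if (retQueue.size() > retentionLimit) quiesced = true;
        }
    }

    public static void main(String[] args) {
        QuiesceResumeSketch s = new QuiesceResumeSketch(0, 1);
        s.enqueue("poison");
        s.enqueue("ok");
        for (int i = 0; i < 7; i++) {
            s.step(m -> !m.id.equals("poison"));     // only "poison" ever fails
            System.out.println("step " + (i + 1) + ": quiesced=" + s.quiesced
                    + " ret=" + s.retQueue.size() + " hold=" + s.holdQueue.size());
        }
    }
}
```

In this model a message that always fails first lands in the retention buffer, is retried after the next successful message, and ends up in the hold list once it has used up its retention passes. The real thresholds are configuration values of the product and are not shown here.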

Reason for the Exception
If messages are left in the retention queue because they could not be processed earlier, and the instance data of the process they belong to has been deleted (which should not happen), the following scenario occurs:
  • An interruptible process is started.
  • Messages are put into the BPEIntQueue.
  • If one of these messages is processed successfully, the message from the retention queue is copied back into the BPEIntQueue and is therefore processed later.
  • The NullPointerException occurs because the workflow engine needs a context for the message. If this context does not exist for any reason, the message cannot be processed. This is a genuinely exceptional situation: messages should never be left in the system when their context no longer exists in the database.

Note that the exception is not caused by the current test case, which completes successfully. The message that causes the exception belongs to a process that has already been deleted from the database.
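
The failure mode can be illustrated with a toy lookup. The internals of ProcessMDB are not shown in this technote, so everything in the following sketch is hypothetical: a map stands in for the process instance data in the database, and navigate stands in for the point where the engine needs the context of a message.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy illustration of a stale message whose process context is gone. */
class MissingContextSketch {
    /** Hypothetical stand-in for the instance data stored in the database. */
    static final class ProcessContext {
        final String state;
        ProcessContext(String state) { this.state = state; }
    }

    // Stands in for the database table of process instances (hypothetical).
    static final Map<String, ProcessContext> instances = new HashMap<>();

    /** Handles one message; throws NullPointerException if the instance is gone. */
    static String navigate(String processInstanceId) {
        ProcessContext ctx = instances.get(processInstanceId); // null if the instance was deleted
        return ctx.state;                                      // dereferencing null throws the NPE
    }

    public static void main(String[] args) {
        instances.put("current", new ProcessContext("running"));
        System.out.println(navigate("current"));   // message of the current test case
        try {
            navigate("deleted");                   // stale message from the retention queue
        } catch (NullPointerException e) {
            System.out.println("stale message failed: no context for its process instance");
        }
    }
}
```

The message for the existing instance is handled normally, while the stale message fails on its own. This matches the observation that the current test case completes even though the exception appears in trace.log.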

Proposed solution
If there are only a few poisoned messages in the system, do nothing. The quiesce/resume algorithm will filter them out soon.
This can be seen in trace.log, where the retry counter is incremented:
Retry Count was set to 1 on EngineMessage
Retry Count was set to 2 on EngineMessage.
Retry Count was set to 3 on EngineMessage.
The message is then copied to the hold queue.
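
To check how far the retry counter has advanced, the trace file can be scanned for these entries. The following is a minimal sketch: the class and method names are hypothetical, and the pattern assumes only the "Retry Count was set to N on EngineMessage" text quoted above, not any particular layout of the rest of the trace line.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Scans trace lines for the retry counter entries quoted above. */
class RetryCountScan {
    // Matches the quoted entry anywhere in a line; the rest of the line is ignored.
    static final Pattern RETRY =
            Pattern.compile("Retry Count was set to (\\d+) on EngineMessage");

    /** Returns the highest retry count found, or 0 if no entry matches. */
    static int maxRetryCount(List<String> lines) {
        int max = 0;
        for (String line : lines) {
            Matcher m = RETRY.matcher(line);
            if (m.find()) {
                max = Math.max(max, Integer.parseInt(m.group(1)));
            }
        }
        return max;
    }

    public static void main(String[] args) throws Exception {
        // Expects the path to trace.log as the only argument.
        // ISO-8859-1 is used so that unexpected bytes never abort the read.
        List<String> lines =
                Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
        System.out.println("highest retry count seen: " + maxRetryCount(lines));
    }
}
```

A result of 3 corresponds to the last entry in the sequence shown above.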

In test environments, it might be better to simply empty all queues and restart the test case that "caused" the exception. It will then work without any problems.

A queue can be checked and emptied using the following MQSC commands, shown here for WQ_BPERetQueue:
dis qlocal('WQ_BPERetQueue') CURDEPTH
clear qlocal('WQ_BPERetQueue')
 
 
 


Document Information


Product categories: Software > Application Servers > Distributed Application & Web Servers > WebSphere Application Server > Enterprise Edition (EE)
Operating system(s): Windows
Software version: 5.0
Software edition:
Reference #: 1110819
IBM Group: Software Group
Modified date: Dec 3, 2004