A long-running process spans multiple transactions. If a transaction fails because of an infrastructure failure, Business Flow Manager provides a facility for automatically recovering from these failures.
In a long-running process, the Business Flow Manager sends itself request messages that trigger follow-on navigation. For each incoming request message, a new transaction is started and the request message is passed to the Business Flow Manager for processing. Each transaction consists of the following actions:
Cause | Response |
---|---|
Unavailable infrastructure | In normal processing mode, for a specified time, all messages are kept available until the infrastructure is operational again. This problem might be caused by a database failure, for example. |
Damaged message | After a specified number of retries, the message is put into the hold queue. From the hold queue, it can also be moved back to the input queue, to retry the transaction. |
If the infrastructure is unavailable, and the retention queue is full, message processing is switched from normal processing to quiesce mode. In quiesce mode, the message processing is slowed down until the infrastructure is available again. When the infrastructure becomes available, message processing switches back to normal mode.
During normal processing, a message is processed as follows:
In quiesce mode, processing a message is attempted periodically. Messages that fail to be processed are put back in the input queue, without incrementing either the delivery count or the retention queue traversal count. As soon as a message can be processed successfully, message processing is switched back to normal mode.
The retry limit defines the maximum number of times that a message can be transferred to the retention queue before it is put in the hold queue.
To be put in the retention queue, the processing of a message must fail three times.
For example, if the retry limit is 5, a message must go to the retention queue five times (it must fail for 3 * 5 = 15 times), before the last retry is started. If the last retry fails two more times, the message is put in the hold queue. This means that a message must fail (3 * RetryLimit) + 2 times before it is put in the hold queue.
In a performance-critical application running in a reliable infrastructure, the retry limit should be small: one or two, for example. If the retry limit is set to zero, a repeatedly failing message is retried three times and then it goes immediately into the hold queue.
This Business Flow Manager property is specified in the administrative console. Click Configuration tab, under Business Integration, click .
, or if Business Process Choreographer is configured on a cluster. On theThe retention queue message limit defines the maximum number of messages that can be in the retention queue. If the retention queue overflows, the system goes into quiesce mode. To make the system enter quiesce mode as soon as a message fails, set the value to zero. To make Business Flow Manager more tolerant of infrastructure failures, increase the value.
This property is specified in the administrative console. Click Configuration tab, under Business Integration, click .
, or if Business Process Choreographer is configured on a cluster. On theThe administrator can move the messages from the hold or retention queues back to the internal queue. This can be done using the administrative console, administrative scripts, or failed event manager.