WebSphere Application Server Network Deployment, Version 6.0.x Operating Systems: AIX, HP-UX, Linux, Solaris, Windows

Service integration messaging - troubleshooting tips

This topic provides a set of specific tips to help you troubleshoot problems with service integration messaging.

Messaging engine cannot start up because of a known error in the Informix JDBC Driver 3.00JC1

When attempting to use the Informix JDBC driver 3.00JC1 to store data, the messaging engine cannot start up and the following error message might appear in the WebSphere Application Server SystemOut.log file:

00000022 SibMessage E [RetireBus:retire_web.000- RetireBus] CWSIS0002E: 
The messaging engine encountered an exception while starting. 
Exception: com.ibm.ws.sib.msgstore.PersistenceException: CWSIS1501E: 
The data source has produced an unexpected exception: java.sql.BatchUpdateException: Unique constraint 
(informix.u114_62) violated.
00000022 SibMessage E [RetireBus:retire_web.000- RetireBus] CWSID0035E: 
Messaging engine retire_web.000-RetireBus cannot be started;
detected error reported during com.ibm.ws.sib.msgstore.impl.MessageStoreImpl start() 
00000022 SibMessage E [RetireBus:retire_web.000- RetireBus] CWSID0027I: 
Messaging engine retire_web.000-RetireBus cannot be restarted because a serious error has been reported. 
00000022 SibMessage I [RetireBus:retire_web.000- RetireBus] CWSID0016I: 
Messaging engine retire_web.000-RetireBus is in state Stopped.  

There is a known defect (PTS 172471) in the Informix JDBC Driver 3.00JC1. To avoid this error, upgrade the Informix JDBC Driver to 3.00JC2.
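
To check whether a SystemOut.log file contains this failure signature, you can search for the CWSIS1501E message followed by the unique constraint violation. The following Python fragment is a minimal sketch, not an IBM-supplied tool. The function name and the 200-character window used to tolerate wrapped log lines are assumptions made for illustration. A match shows only that the signature is present, so confirm the driver version before you upgrade.

```python
import re

# Signature taken from the sample log above: CWSIS1501E followed shortly by a
# BatchUpdateException that reports a violated unique constraint.
# Log lines can wrap, so whitespace and newlines are matched loosely and the
# two parts may be up to 200 characters apart (an arbitrary, assumed window).
_SIGNATURE = re.compile(
    r"CWSIS1501E:.{0,200}?java\.sql\.BatchUpdateException:\s*Unique\s+constraint",
    re.DOTALL,
)


def has_batch_update_signature(log_text):
    """Return True if the log text contains the failure signature shown above."""
    return _SIGNATURE.search(log_text) is not None


# Excerpt copied from the sample log in this topic.
sample = (
    "Exception: com.ibm.ws.sib.msgstore.PersistenceException: CWSIS1501E: \n"
    "The data source has produced an unexpected exception: "
    "java.sql.BatchUpdateException: Unique constraint \n"
    "(informix.u114_62) violated."
)
print(has_batch_update_signature(sample))
```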

Problem determination for a data store

You can perform a dump, in reduced form, of the data in the data store for a messaging engine. The output is intended for use by IBM Service personnel.
If there is a problem with the data in the data store, it can be hard to diagnose from the trace output. However, you can create a dump of the data in XML format. This makes diagnosis easier because XML is a human-readable representation that can be transformed to other formats as required. To create a data store dump, enter the following command in the wsadmin tool:
$AdminControl invoke [$AdminControl queryNames type=SIBMessagingEngine,name=messagingenginename,*] 
 dump com.ibm.ws.sib.msgstore.*

The dump is created as an XML file in the $WAS_HOME/logs/server1 directory. The file is named according to the format: messaging_engine_nameUUIDtimestamp.xml

The format of the file is illustrated in the following example:
<MessageStore>
    <itemStreams>
        <ItemStreamLink id="0" state="Available">
            <class>com.ibm.ws.sib.msgstore.ItemStream</class>
            <priority>5</priority>
            <canExpireSilently></canExpireSilently>
            <storageStrategy>STORE_NEVER</storageStrategy>
            <expiryTime>0</expiryTime>
            <sequence>0</sequence>
            <tranID>null</tranID>
            <tickValue>0</tickValue>
            <items>
                <ItemLink id="2" state="Available" refCount="3" refCountDecreasing="false">
                    <class>com.ibm.ws.sib.msgstore.Item</class>
                    <priority>5</priority>
                    <canExpireSilently></canExpireSilently>
                    <storageStrategy>STORE_NEVER</storageStrategy>
                    <expiryTime>0</expiryTime>
                    <sequence>1</sequence>
                    <tranID>null</tranID>
                    <tickValue>0</tickValue>
                </ItemLink>
            </items>
        </ItemStreamLink>
    </itemStreams>
</MessageStore>
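
Because the dump is XML, it can be summarized with a standard XML parser. The following Python fragment is a minimal sketch that counts the items in each item stream. It relies only on the element and attribute names shown in the example above. Real dumps might contain additional elements or nesting that this sketch does not handle, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example above, used here as sample input.
SAMPLE_DUMP = """<MessageStore>
  <itemStreams>
    <ItemStreamLink id="0" state="Available">
      <class>com.ibm.ws.sib.msgstore.ItemStream</class>
      <storageStrategy>STORE_NEVER</storageStrategy>
      <items>
        <ItemLink id="2" state="Available" refCount="3" refCountDecreasing="false">
          <class>com.ibm.ws.sib.msgstore.Item</class>
          <storageStrategy>STORE_NEVER</storageStrategy>
        </ItemLink>
      </items>
    </ItemStreamLink>
  </itemStreams>
</MessageStore>"""


def summarize_dump(xml_text):
    """Return one (stream id, stream state, item count) tuple per ItemStreamLink."""
    root = ET.fromstring(xml_text)
    summary = []
    for stream in root.iter("ItemStreamLink"):
        # Count only the ItemLink elements directly under this stream's <items>.
        item_count = len(stream.findall("./items/ItemLink"))
        summary.append((stream.get("id"), stream.get("state"), item_count))
    return summary


for stream_id, state, item_count in summarize_dump(SAMPLE_DUMP):
    print(f"stream {stream_id}: state={state}, items={item_count}")
```

To process a real dump, read the file contents into a string and pass the string to summarize_dump.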

Possible causes of the XAResourceNotAvailableException exception and how to take appropriate action

When the deleteNode command is used for a node that hosts messaging engines, those messaging engines are deleted. When new messaging engines are re-created following the addNode command, they have different identifiers and so during transaction recovery it is not possible to connect to the old messaging engines. A message identifying the XAResourceNotAvailableException exception is generated in the SystemOut.log file for each server that hosts a messaging engine.

To solve this problem, you must follow the procedure described in Resolving in-doubt transactions.

The XAResourceNotAvailableException exception can also be thrown when a server in a cluster bus member fails over. In this case, no operator intervention is required to recover and resolve transactions.

Problems when you re-create a service integration bus

If you delete a service integration bus, and later create a new bus with the same name, the messaging engine fails to start and messages like the following are generated in SystemOut.log:
[8/11/04 21:55:01:439 CDT] 0000000f SibMessage    I   
[LateBus:xyzsun15.server1-LateBus] isAlive: MessagingEngine suffered common mode error. Correct error (see 
logs) and restart server.
[8/11/04 21:55:01:468 CDT] 0000000f SibMessage    I   
[LateBus:xyzsun15.server1-LateBus] isAlive: MessagingEngine will be stopped because of common mode error. 
No failover will occur.
[8/11/04 21:55:01:493 CDT] 0000000f SibMessage    I   
[LateBus:xyzsun15.server1-LateBus] Messaging Engine 
xyzsun15.server1-LateBus not in state from which stop is valid: Starting
[8/11/04 21:55:01:513 CDT] 0000000f SibMessage    I   
[LateBus:xyzsun15.server1-LateBus] isAlive: MessagingEngine stopped because of common mode error. Correct 
error (see logs) and restart server.
[8/11/04 21:57:01:431 CDT] 0000000e SibMessage    I   
[LateBus:xyzsun15.server1-LateBus] isAlive: MessagingEngine suffered common mode error. Correct error (see 
logs) and restart server.

The messaging engine fails to start because its database directory still exists after the bus is deleted; you must remove the directory manually. To delete the Cloudscape database for a messaging engine that no longer exists, delete the database directory that is located in profile_root/databases/com.ibm.ws.sib, where profile_root is the directory in which profile-specific information is stored.

You must stop WebSphere Application Server before you can delete the database files.

For other databases, you can either delete all of the rows from the data store tables or you can drop all of the tables. The names of the data store tables all begin with SIB, and are in the schema that you configured for the data store.
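
The cleanup statements for other databases can be generated from a list of table names. The following Python fragment is a minimal sketch that only builds SQL text and does not connect to a database. The function name, the schema name MYSCHEMA, and the sample table names are hypothetical. Review the generated statements before you run them, because an unrelated table whose name begins with SIB would also be selected.

```python
def sib_cleanup_statements(table_names, schema, drop=False):
    """Build one SQL statement per table whose name begins with SIB.

    With drop=False the statements delete all rows; with drop=True they
    drop the tables.
    """
    verb = "DROP TABLE" if drop else "DELETE FROM"
    statements = []
    for name in sorted(table_names):
        if name.upper().startswith("SIB"):
            statements.append(f"{verb} {schema}.{name}")
    return statements


# Hypothetical table listing for the schema configured for the data store.
tables = ["SIB000", "SIBOWNER", "CUSTOMER"]
for statement in sib_cleanup_statements(tables, "MYSCHEMA"):
    print(statement)
```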

For more information, see Data store life cycle.

Problems when re-creating bus members

If you created a bus and added a bus member that uses the default data source, and you later delete that bus member and then try to add it again, the following exception appears:
ADMG0037E: A new instance of the DataSource object  cannot be created because the jndiName attribute 
 of an existing DataSource object has the same value as jdbc/com.ibm.ws.sib/<NODENAME>.<SERVERNAME>-
<BUSNAME>
Note: The bus member is recreated but the messaging engine fails to start.
To resolve the problem, delete the data source manually. To do this, use the administrative console to complete the following steps:
  1. Delete the bus member that is causing the exception.
  2. Select Resources > JDBC Providers.
  3. Change the scope to the scope of the data source that you want to delete. For example, on a single server install, you would choose Server, but this may vary with other topologies.
  4. In the list of JDBC providers, click Cloudscape JDBC Provider. Note: Select the non-XA option.
  5. On the Configuration tab, under Additional Properties, click Data sources.
  6. Select the check box next to the data source specified in the error message.
  7. Click Delete.
  8. Save your changes to the master configuration.
You should now be able to recreate the bus member without difficulty.
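
To identify the data source to delete, you can derive its JNDI name from the node, server, and bus names, following the pattern shown in the error message. The following Python fragment is a minimal sketch. The function name is hypothetical, and the sample values are borrowed from the log example earlier in this topic.

```python
def default_data_source_jndi_name(node_name, server_name, bus_name):
    """Build the JNDI name pattern shown in the ADMG0037E message above."""
    return f"jdbc/com.ibm.ws.sib/{node_name}.{server_name}-{bus_name}"


print(default_data_source_jndi_name("xyzsun15", "server1", "LateBus"))
```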

Problems communicating with foreign buses

To enable communication between two buses, you must create a foreign bus and a service integration bus link on each bus. On the first bus, the name of the foreign bus must match the name of the second bus; on the second bus, the name of the foreign bus must match the name of the first bus. The service integration bus link must have the same name on both buses.

You may see the following type of error if your configuration is not correct, for example because the service integration bus links do not match:

SibMessage    E   [TechBus:TechCluster.000-TechBus] CWSIT0057E: The inter-bus 
connection BookstoreBus failed in the remote messaging engine on host 
aixp401.rchland.ibm.com with reason: CWSIT0067E: Inter-bus connection BookstoreBus 
in bus BookstoreBus is not available.
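
The naming rules for foreign buses and links can be expressed as a simple consistency check. The following Python fragment is a minimal sketch that compares values you have copied from the configuration of each bus; it does not read the configuration itself. The LinkSide structure, the function name, and the link names are hypothetical. The bus names are taken from the error message above.

```python
from dataclasses import dataclass


@dataclass
class LinkSide:
    """Link settings as seen from one bus (hypothetical structure)."""
    bus_name: str          # name of this bus
    foreign_bus_name: str  # foreign bus defined on this bus
    link_name: str         # service integration bus link defined on this bus


def find_link_mismatches(first, second):
    """Return a list of problems; the list is empty if all names agree."""
    problems = []
    if first.foreign_bus_name != second.bus_name:
        problems.append(
            f"foreign bus on {first.bus_name} is {first.foreign_bus_name!r}, "
            f"expected {second.bus_name!r}"
        )
    if second.foreign_bus_name != first.bus_name:
        problems.append(
            f"foreign bus on {second.bus_name} is {second.foreign_bus_name!r}, "
            f"expected {first.bus_name!r}"
        )
    if first.link_name != second.link_name:
        problems.append(
            f"link names differ: {first.link_name!r} on {first.bus_name}, "
            f"{second.link_name!r} on {second.bus_name}"
        )
    return problems


# The foreign bus names agree, but the link names (hypothetical) do not.
tech = LinkSide("TechBus", "BookstoreBus", "TechToBookstore")
bookstore = LinkSide("BookstoreBus", "TechBus", "BookstoreToTech")
for problem in find_link_mismatches(tech, bookstore):
    print(problem)
```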

Problems when attempting to communicate with a renamed foreign bus

The administrative console panel used for configuring the properties of a service integration bus link also allows you to change the name of the foreign bus that the link points to. However, you must not alter the name of the foreign bus after it has been configured. If you do, any messaging engines that already hold state information about the link cannot use the link until the foreign bus name is reset to its previous value.

Possible causes of a JMSException with a wrapped SILimitExceeded exception

When the number of messages held by a destination reaches its limiting threshold, any attempt to send a message to that destination fails with a JMSException with a wrapped SILimitExceeded exception. The destination continues to fail with this exception until the number of messages held by the destination reduces below the limiting threshold.

To obtain an accurate count of the number of available messages, you can monitor the Available Message Count PMI statistic for queue and topicspace destinations. If the number of available messages increases, take action to balance the system. Consider stopping producers from sending new messages until the destination consumes the available messages.
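
The decision to slow down producers can be based on a series of Available Message Count readings. The following Python fragment is a minimal sketch that works on values you have already collected; it does not read PMI data. The function name, the limit of 50000, and the 80 percent warning fraction are assumptions chosen for illustration, not product defaults.

```python
def should_throttle_producers(available_counts, limit, warn_fraction=0.8):
    """Return True if the message count is rising and close to the limit.

    available_counts: chronological samples of the Available Message Count.
    limit: the limiting threshold configured for the destination.
    warn_fraction: assumed fraction of the limit at which to react.
    """
    if len(available_counts) < 2:
        return False  # not enough samples to see a trend
    latest = available_counts[-1]
    rising = latest > available_counts[0]
    return rising and latest >= warn_fraction * limit


# Hypothetical samples taken at regular intervals.
samples = [10000, 25000, 42000]
print(should_throttle_producers(samples, limit=50000))
```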

Examine the following list for possible causes and solutions for this problem:

Corruption problems on system restarts

It is possible, although rare, for a messaging engine, destination, or link to be corrupted after a restart of the system. If this corruption occurs, you will see a message indicating the problem. If the problem lies with the messaging engine, the messaging engine will not start. If a destination or link is corrupted, the relevant messaging engine will start, but the destination or link will not be usable on that messaging engine.

If you do not know the cause of the problem, contact your IBM service representative to establish the cause before attempting to resolve the situation.

If you know the cause of the problem, for example, you are aware of an issue with your database, resolve it by completing the following steps:
  1. Ensure that the configuration files are synchronized across your system by clicking System administration > Nodes > Full resynchronize. This operation can take several minutes to run.
  2. If the problem still exists, perform one of the following tasks:
    • Delete the corrupted object and recreate it. Messages produced or received before the corruption occurred will be lost.
    • Restore your system from a backup; see Restoring a data store and recovering its messaging engine. Messages produced or received since the backup was taken will be lost.

Retrieving the status of messaging engines in the administrative console

To be able to retrieve the status of messaging engines, you must be logged into the administrative console with at least monitor authority. If you do not have this authority, the messaging engine status is displayed as "Unavailable", even if the messaging engine has started.

If you are not logged in with the authority needed to retrieve the status of messaging engines, an error message like the following is logged in the server's SystemOut.log file:
[4/20/05 10:49:57:083 CDT] 0000004b RoleBasedAuth A   SECJ0305I: The role-based  authorization check 
failed for admin-authz operation SIBMessagingEngine:stateExtended. The user UNAUTHENTICATED (unique ID: 
unauthenticated) was not granted any of the following required roles: administrator, operator, 
configurator, monitor.
The user ID shown in the actual message is the user ID that you used to log in to the administrative console.
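
To find out which user and which roles are involved, you can extract the fields from the SECJ0305I message. The following Python fragment is a minimal sketch based only on the message layout shown above. The function name is hypothetical, and messages with a different layout are not matched.

```python
import re

# Layout taken from the sample message above; the message can wrap across
# lines, so whitespace and newlines are matched loosely.
_SECJ0305I = re.compile(
    r"SECJ0305I:.*?operation\s+(?P<operation>\S+?)\.\s+"
    r"The\s+user\s+(?P<user>\S+).*?"
    r"required\s+roles:\s*(?P<roles>[^.]*)\.",
    re.DOTALL,
)


def parse_authorization_failure(log_text):
    """Return the operation, user, and required roles, or None if no match."""
    match = _SECJ0305I.search(log_text)
    if match is None:
        return None
    roles = [role.strip() for role in match.group("roles").split(",")]
    return {
        "operation": match.group("operation"),
        "user": match.group("user"),
        "roles": [role for role in roles if role],
    }


# Sample message copied from this topic, including its line breaks.
sample = (
    "[4/20/05 10:49:57:083 CDT] 0000004b RoleBasedAuth A   SECJ0305I: "
    "The role-based  authorization check \n"
    "failed for admin-authz operation SIBMessagingEngine:stateExtended. "
    "The user UNAUTHENTICATED (unique ID: \n"
    "unauthenticated) was not granted any of the following required roles: "
    "administrator, operator, \n"
    "configurator, monitor."
)
print(parse_authorization_failure(sample))
```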

Enabling an application to be started before a required messaging engine has started

If an application depends on a messaging engine being available, the messaging engine must be started before the application can run. If you want the application server to start an application automatically, develop the application to test that any required messaging engine has been started and, if needed, to wait for the messaging engine. If this technique is used in a startup bean, the startup bean method should perform the test-and-wait work in a separate thread (using the standard WorkManager methods), so that the application server startup is not delayed.

For an example of code to test and wait for a messaging engine, see Enabling an application to be started before a messaging engine.
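
The test-and-wait pattern itself is independent of the programming language. The following Python fragment is a minimal sketch of the pattern only: it polls a readiness check in a background thread so that the caller is not blocked. It is not the Java startup bean code from the linked topic and does not use WorkManager. The function names and the is_ready callback are hypothetical placeholders for whatever check your application uses to detect that the messaging engine has started.

```python
import threading
import time


def wait_until(is_ready, timeout_seconds, poll_interval_seconds=1.0):
    """Poll is_ready() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_seconds)


def start_when_ready(is_ready, on_ready, timeout_seconds, poll_interval_seconds=1.0):
    """Run the wait in a separate thread so that the caller is not delayed."""
    def worker():
        if wait_until(is_ready, timeout_seconds, poll_interval_seconds):
            on_ready()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


# Usage sketch: the event stands in for "the messaging engine has started".
engine_started = threading.Event()
work_done = []
worker_thread = start_when_ready(
    engine_started.is_set,
    lambda: work_done.append("application work started"),
    timeout_seconds=5,
    poll_interval_seconds=0.01,
)
engine_started.set()   # simulate the messaging engine becoming available
worker_thread.join(5)
print(work_done)
```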

Related concepts
Data store

Last updated: 15 Mar 2007
http://publib.boulder.ibm.com/infocenter/wasinfo/v6r0/index.jsp?topic=/com.ibm.websphere.pmc.nd.doc\ref\rjk_prob0.html

© Copyright IBM Corporation 2004, 2007. All Rights Reserved.