Name: PK21171
=============

Summary: Race condition in scheduler when cluster nodes start up concurrently

Problem Description:
The DataStoreService needs to recreate the scheduled tasks at startup to work around a design issue in WAS. This includes two or three calls to the scheduler EJBs:
- query if the task exists
- cancel the existing task (but copy its settings first)
- create a new task (either with default settings or with the old task's settings)
Each of those EJB calls is wrapped in its own transaction and takes quite a long time (about 10 seconds). When cluster nodes start up concurrently, the steps may therefore overlap. The result can be several tasks being created, no task being created, or NullPointerExceptions (NPEs) and other exceptions in the code that prevent the node from starting up at all.

Problem Solution:
The cancel and create steps have been merged into one EJB call, so that the EJB's transaction handling prevents race conditions. In addition, the code was made more robust to avoid NPEs.
Note: This APAR requires a special installation procedure.

Failing Module(s): Database

Affected Users: All users

Version Information:
Portal Version(s): 5.1.0.4
Pre-Requisite(s):
Co-Requisite(s): ---

Platform Specific: This fix applies to all platforms.

Installation:
NOTE: YOU MUST FIRST DOWNLOAD THE UPDATE INSTALLER TOOL IN ORDER TO INSTALL A FIX.
The Portal Update Installer can be downloaded from the following link:
http://www.ibm.com/software/genservers/portal/support

1. Create a temporary "fix" directory to store the jar file.
2. Copy the jar file to this directory.
3. Shut down WebSphere Portal.
4. Follow the fix installation instructions that are packaged with the Portal Update Installer on how to install the fix.

Special Instructions start here
In directory /config issue the command
wpsconfig.[sh|bat] apply-pk21171
This task checks out wps.ear, modifies the Scheduler.ejb and redeploys wps.ear.
In a clustered environment this task must be executed only on one node.
After this task has run and all changes have been synchronized to all nodes via ND, the whole portal cluster has to be restarted for the changes to take effect.
Note: The APAR itself has to be installed on all cluster nodes, because additional files are modified. Only the wpsconfig task is run just once.
Special Instructions end here

5. Restart WebSphere Portal.
6. The temporary directory may be removed.

Un-Installation:
NOTE: FIXES MUST BE REMOVED IN THE ORDER THEY WERE APPLIED. DO NOT REMOVE A FIX UNLESS ALL FIXES APPLIED AFTER IT HAVE FIRST BEEN REMOVED. YOU MAY REAPPLY ANY REMOVED FIX.

1. Shut down WebSphere Portal.
2. Follow the instructions that are packaged with the Portal Update Installer on how to uninstall the fix.
3. Restart WebSphere Portal.
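
Illustration (not part of the fix):
The idea behind the Problem Solution, merging the query, cancel and create steps into one atomic call, can be sketched in plain Java as shown below. This is only a minimal sketch and not the shipped Scheduler EJB code. The names SchedulerSketch, Task, recreateTask, liveTaskCount, settingsOf and simulateConcurrentStartup, the task name "cleanup" and the settings strings are all hypothetical. A synchronized method stands in for the single EJB transaction, and threads stand in for cluster nodes.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical in-memory stand-in for the scheduler; NOT the WebSphere Scheduler API.
class SchedulerSketch {

    // A scheduled task: a name plus an opaque settings string.
    static final class Task {
        final String name;
        final String settings;

        Task(String name, String settings) {
            this.name = name;
            this.settings = settings;
        }
    }

    private final List<Task> tasks = new ArrayList<>();

    // Query, cancel and create as ONE atomic step. The synchronized keyword
    // stands in for the single EJB transaction (an assumption of this sketch).
    synchronized Task recreateTask(String name, String defaultSettings) {
        String settings = defaultSettings;
        Iterator<Task> it = tasks.iterator();
        while (it.hasNext()) {
            Task existing = it.next();
            if (existing.name.equals(name)) {
                // Copy the old settings before cancelling; guard against null.
                if (existing.settings != null) {
                    settings = existing.settings;
                }
                it.remove(); // cancel the existing task
            }
        }
        Task fresh = new Task(name, settings);
        tasks.add(fresh);
        return fresh;
    }

    synchronized int liveTaskCount() {
        return tasks.size();
    }

    // Returns the settings of the named task, or null if no such task exists.
    synchronized String settingsOf(String name) {
        for (Task t : tasks) {
            if (t.name.equals(name)) {
                return t.settings;
            }
        }
        return null;
    }

    // Starts one thread per simulated node; all of them recreate the same task at once.
    static SchedulerSketch simulateConcurrentStartup(int nodes) {
        final SchedulerSketch scheduler = new SchedulerSketch();
        final CountDownLatch start = new CountDownLatch(1);
        Thread[] threads = new Thread[nodes];
        for (int i = 0; i < nodes; i++) {
            threads[i] = new Thread(() -> {
                try {
                    start.await(); // wait so that all nodes start together
                    scheduler.recreateTask("cleanup", "interval=60");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        start.countDown();
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for nodes", e);
        }
        return scheduler;
    }

    public static void main(String[] args) {
        SchedulerSketch s = simulateConcurrentStartup(4);
        System.out.println("live tasks: " + s.liveTaskCount()); // prints "live tasks: 1"
    }
}
```

Because the three steps run under one lock, concurrent callers are serialized and exactly one task remains, carrying the settings of the task it replaced. If the steps were three separately locked calls instead, two callers could both see "no task" and both create one, which is the kind of overlap described in the Problem Description.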