Fix (APAR): PQ75605
Status: Fix
Release: 4.0.6,4.0.5,4.0.4
Operating System: All
Supersedes Fixes: None
CMVC Defect: PQ75605
Byte size of APAR: 1110256
Date: 08/19/03

Abstract: This APAR marks only remote nodes offline, contains more

Description/symptom of problem:
Indefinite wait for epoch synchronization causes hang.

Directions to apply fix:

1) Create temporary "Fix" directory to store the jar file:
   UNIX: /tmp/WebSphere/Fix
   Windows: c:\temp\WebSphere\Fix
2) Copy jar file to the directory
3) Shut down WebSphere
4) Run the jar file with the following command, answering
   questions/prompts as they appear:
   java -jar
5) Restart WebSphere
6) The temp directory may be removed but the jar file should be saved.
   Do not remove any files created and stored in the /Fix/PQ68569
   directories. These files are required if a Fix is to be removed.

Directions to remove fix:

NOTE: FIXES MUST BE REMOVED IN THE ORDER THEY WERE APPLIED.
DO NOT REMOVE A FIX UNLESS ALL FIXES APPLIED AFTER IT HAVE FIRST BEEN
REMOVED. YOU MAY RE-APPLY ANY REMOVED FIX.

Example: If your system has Fix1, Fix2, and Fix3 applied in that order
and Fix2 is to be removed, Fix3 must be removed first, Fix2 removed,
and Fix3 re-applied.

1) Change directory to the Fix location (/Fix/PQ75605).
2) Shut down WebSphere
3) Run the backup jar file with the following command:
   java -jar PQ75605_Fix_backup.jar
4) Restart WebSphere

Directions to re-apply fix:

Follow the instructions for applying a Fix. If the backup files still
exist (from the previous Fix application), you will be prompted to
overwrite. Answer "yes" at the overwrite prompts.

Additional Information:
-----------------------
0. For WAS 4.0.5 and 4.0.6 the fix can be applied straight away.
   For WAS 4.0.4 the fix needs the SM cumulative fix
   WAS_SysMgmt_05-01-2003_4.0.5-4.0.4-4.0.3-4.0.2_AE_Solaris_cumulative
   APAR as a required prerequisite.
1. Previously, a node was marked offline whenever no information was
   received from it. With this fix, only remote nodes are taken
   offline in that case.
2. The indefinite wait for the active object to sync up the epoch
   values has been changed to a timed wait and a request to the user
   to retry the transaction. Extra notification mechanisms have also
   been added.
3. RemoteExceptions that occur when an active object is forwarded are
   now tracked with extra trace messages.
4. We have seen that this fix is most beneficial in conjunction with
   increasing the ping timeout and ping interval values for all
   application servers. There is no single right value, because these
   are tuning parameters to be determined by gradual change. We
   recommend that the ping timeout failure messages be monitored in
   the trace file and the ping values increased gradually. Another
   suggested guideline is that the minimum ping interval be more than
   the ApplicationServer's startup time and the ping timeout be double
   that.
5. Extra trace points have been added to monitor admin task execution
   paths.
6. The recommended prerequisite is the SM cumulative fix of May 5,
   2003. This fix has been tested on that configuration.
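
Illustration for item 2 above: the change from an indefinite wait to a
timed wait with a retry request can be sketched as follows. This is a
minimal sketch of the general pattern only, not the code shipped in this
fix. The names EpochSyncWait, markSynced, awaitSync and
RetryTransactionException are hypothetical, and the sketch uses
java.util.concurrent (Java 5 and later, so newer than this fix).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: none of these names come from WebSphere.
class EpochSyncWait {

    /** Hypothetical signal asking the caller to retry the transaction. */
    static class RetryTransactionException extends RuntimeException {
        RetryTransactionException(String message) {
            super(message);
        }
    }

    // Released once the epoch values are considered synchronized.
    private final CountDownLatch synced = new CountDownLatch(1);

    /** Called by whatever component completes the synchronization. */
    public void markSynced() {
        synced.countDown();
    }

    /**
     * Waits at most timeoutMillis for synchronization. A timeout or an
     * interruption is reported as a retry request instead of blocking
     * the caller forever.
     */
    public void awaitSync(long timeoutMillis) {
        boolean done;
        try {
            done = synced.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            done = false;
        }
        if (!done) {
            throw new RetryTransactionException(
                "Epoch sync not complete after " + timeoutMillis
                    + " ms; please retry the transaction");
        }
    }
}
```

A caller would invoke awaitSync with a bounded timeout and, on
RetryTransactionException, surface the retry request to the user rather
than hang. The timeout value itself is a tuning choice and is not
specified by this readme.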