PQ75807: MESSAGE BBON1017E SERVER INSTANCE X ON SERVER Y COULD NOT BE STOPPED.

APAR status
Closed as program error.

Error description
During deployment, Systems Management (SM) is producing the following
message (note that this is from a script-based commit, but the
same result is obtained through the SM/EUI):
status 4
message.1 BBON3199E Method commitconversation failed.
message.2 BBON1141I The following activation steps already
          succeeded:
message.3 BBON1142I Environment files for conversation CONV
          are successfully written.
message.4 BBON1143I Conversation CONV is now the active
          conversation.
message.5 BBON1144I Homes successfully queued for registration.
message.6 BBON1017E Server instance WASC1A02 on server WASC1A
          could not be stopped.
count 0.
Our deployment process is therefore assuming that the deploy did
not work. The lightning strike is not applied to the conversation
when viewed through the SM/EUI, so we assume that the status
is not officially 'active' for the committed conversation. SM
complains that it cannot stop the server, but in fact the server
was stopped. In this case the customer has automation
Local fix
The BBON1017E message can be avoided if the automation tool
waits for 12 seconds between the time it detects a server has
been stopped and it attempts to restart the server. This is not
a complete fix, because even if automation waits 12 seconds,
there is no way for the automation tool to know if SM has
finished doing all the HFS file copies yet. There is a slight
chance that a server region could be started while some or all of
the updated jar files are still downlevel copies because the HFS
updates are not complete. Only Systems Management knows when the
HFS copies are complete.
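
As an illustration of the workaround above, the following Python sketch
delays a restart by 12 seconds after automation detects that a server
has stopped. The function name and the restart callback are
hypothetical and are not part of any WebSphere or automation product
interface. The sketch shares the limitation described above: it cannot
tell whether SM has finished its HFS copies.

```python
import time

# Grace period taken from the local fix text; not a guaranteed-safe value.
RESTART_GRACE_SECONDS = 12


def restart_after_grace(restart, grace=RESTART_GRACE_SECONDS, sleep=time.sleep):
    """Wait `grace` seconds after a detected stop, then call `restart`.

    `restart` is a hypothetical callback supplied by the automation tool.
    `sleep` is injectable so the wait can be replaced in tests.
    """
    sleep(grace)
    return restart()
```
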
Problem summary
****************************************************************
* USERS AFFECTED: All users of WebSphere Application Server    *
*                 version 4.0.1 for z/OS and OS/390            *
****************************************************************
* PROBLEM DESCRIPTION: Customer automation tools cannot tell   *
*                      if a down server is due to SM recycling *
*                      the server as part of conversation      *
*                      activation or it is due to an           *
*                      application or system error.            *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
During deployment, Systems Management (SM) is producing the
following message (note that this is from a script-based commit,
but the same result is obtained through the SM/EUI):
status 4
message.1 BBON3199E Method commitconversation failed.
message.2 BBON1141I The following activation steps already
          succeeded:
message.3 BBON1142I Environment files for conversation CONV are
          successfully written.
message.4 BBON1143I Conversation CONV is now the active
          conversation.
message.5 BBON1144I Homes successfully queued for registration.
message.6 BBON1017E Server instance WASC1A02 on server WASC1A
          could not be stopped.
count 0.

The customer's deployment process therefore assumes that the
deploy did not work. The lightning strike is not applied to the
conversation when viewed through the SM/EUI, so the customer
assumes that the status is not officially 'active' for the
committed conversation. SM complains that it cannot stop the
server, but in fact the server was stopped. In this
case the customer has automation.
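
A deployment script could inspect the individual messages rather than
treating a nonzero status as a complete failure. The Python sketch
below parses output in the shape shown above and lists the message IDs
that end in "E". It assumes the "status", "message.N", and "count" line
layout seen in this APAR and the usual convention that an "E" suffix
marks an error and an "I" suffix marks information. The function names
are hypothetical.

```python
import re

# One "message.N ID text" line; continuation lines carry no such prefix.
MESSAGE_LINE = re.compile(r"^message\.(\d+)\s+([A-Z]{4}\d{4}[A-Z])\s+(.*)$")

SAMPLE = """\
status 4
message.1 BBON3199E Method commitconversation failed.
message.2 BBON1141I The following activation steps already
          succeeded:
message.3 BBON1142I Environment files for conversation CONV are
          successfully written.
message.4 BBON1143I Conversation CONV is now the active
          conversation.
message.5 BBON1144I Homes successfully queued for registration.
message.6 BBON1017E Server instance WASC1A02 on server WASC1A
          could not be stopped.
count 0.
"""


def parse_commit_output(text):
    """Return (status, messages) parsed from commit output text."""
    status = None
    messages = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("status "):
            status = int(line.split()[1])
            continue
        match = MESSAGE_LINE.match(line)
        if match:
            messages.append({"id": match.group(2), "text": match.group(3)})
        elif messages and not line.startswith("count"):
            # Wrapped continuation of the previous message.
            messages[-1]["text"] += " " + line
    return status, messages


def error_ids(messages):
    """Return the IDs whose suffix letter is E (assumed to mean error)."""
    return [m["id"] for m in messages if m["id"].endswith("E")]


status, messages = parse_commit_output(SAMPLE)
errors = error_ids(messages)
```

With the sample above, the only errors besides the summary message
BBON3199E is BBON1017E, while BBON1143I shows that the conversation was
activated.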
Local Fix: The BBON1017E message can be avoided if the
automation tool waits for 12 seconds between the time it
detects a server has been stopped and it attempts to restart
the server.  This is not a complete fix, because even if
automation waits 12 seconds, there is no way for the automation
tool to know if SM has finished doing all the HFS file copies
yet. There is a slight chance that a server region could be
started while some or all of the updated jar files are still
downlevel copies because the HFS updates are not complete. Only
Systems Management knows when the HFS copies are complete.
Problem conclusion
Four new automation messages and a new configurable delay were
added.  The messages are written to the operator console
(1) before SM intentionally stops a server for maintenance
(2) before SM restarts the server
(3) after the server has been successfully restarted, and
(4) in the event that the server fails to restart (error
    condition).

The configurable delay - specified using a new environment
variable - will take effect (a) immediately after SM issues
message (1) above, but before SM actually stops the server,
and (b) immediately after SM issues message (2) above, but
before SM actually restarts the server.  This delay is designed
to give customers' automation tools time to reset the desired
state of the server.  The new environment variable,
AUTOMATION_DELAY, will have meaning only for the SM server and
will default to 0, so the default behavior will be the same as
it always has been.
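
The ordering of messages and delays described above can be modeled as
follows. This Python sketch is only an illustrative model of the
sequence, not SM's implementation. The function name, the callbacks,
and the message wording are hypothetical.

```python
import time


def recycle_server(name, stop, start, emit=print, delay=0, sleep=time.sleep):
    """Model one server recycle with the configurable delay.

    `stop` and `start` are hypothetical callbacks; `start` returns True
    on success. `delay` plays the role of AUTOMATION_DELAY (default 0).
    """
    emit(f"(1) stopping {name} for maintenance")
    sleep(delay)  # (a) after message (1), before the stop
    stop(name)
    emit(f"(2) restarting {name} after maintenance")
    sleep(delay)  # (b) after message (2), before the restart
    if start(name):
        emit(f"(3) {name} successfully restarted")
        return True
    emit(f"(4) {name} failed to restart")
    return False
```

With the default delay of 0 the messages are still issued, but
automation gets no extra time to react before each state change.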

APAR PQ75807 requires a change to WebSphere V4.0.1 for z/OS and
OS/390 documentation:

NOTE: Periodically, we refresh the documentation on our Web
site, so these changes might have been made before you read this
text. To access the latest on-line documentation, go to the
product library page at URL:

www.ibm.com/software/webservers/appserv/zos_os390/library/

A change to V4.0.1 WebSphere for z/OS: Installation and
Customization, GA22-7834-07, and V4.0.1 WebSphere for z/OS:
Assembling J2EE Applications, SA22-7836-06, will be available in
the next refresh of the documentation.

The change is to the table that starts on page 325 of
Installation and Customization and page 303 of Assembling J2EE
Applications, which now reads:

|--------------------------------------------------------------|
| Env variable=<default>   | Dmn | SM  | Nam | IR  | App | zOS |
|--------------------------------------------------------------|
| ...                      |     |     |     |     |     |     |
|--------------------------------------------------------------|
| AUTOMATION_DELAY=0       |     | O   |     |     |     |     |
|--------------------------------------------------------------|
| ...                      |     |     |     |     |     |     |
|--------------------------------------------------------------|

and text on page 332, which now reads:

AUTOMATION_DELAY=n
Specifies a time, in seconds, that the System Management (SM)
server delays in two scenarios:
  - After issuing message BBOU0827I and before stopping the
    application server
  - After issuing message BBOU0830I and before restarting the
    application server.
This delay is designed to allow automation tools time to adjust
to the imminent change in a server's state. The default is 0.
Note: AUTOMATION_DELAY is used only by the System Management
server. If specified on any other server, it is ignored.
Example:
AUTOMATION_DELAY=10
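
For tooling that needs to know the configured value, a setting of this
form can be read from NAME=value text as in the Python sketch below.
The function name is hypothetical. Taking the first occurrence and
rejecting non-integer values with an exception are assumptions of this
sketch, not documented SM behavior.

```python
def automation_delay(env_text, default=0):
    """Return the AUTOMATION_DELAY value in seconds, or `default` if unset."""
    for line in env_text.splitlines():
        name, sep, value = line.strip().partition("=")
        if sep and name.strip() == "AUTOMATION_DELAY":
            # int() raises ValueError for a non-numeric setting.
            return int(value.strip())
    return default
```
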

----------------------------------------------------------------

A change to V4.0.1 WebSphere for z/OS: Messages and Diagnosis
GA22-7837-07 will be available in the next refresh of the
documentation.

The change is to Chapter 13, page 273:

BBOU0827I SM STOPPING SERVER servername FOR MAINTENANCE.
Explanation: This message is written to the console before the
System Management (SM) process stops the specified server during
a server recycle operation. When a new conversation is
activated, SM recycles active application servers whose
configurations have changed in the new conversation.
User Response: Do not restart the specified server manually; SM
will automatically restart it after performing maintenance. If
you are monitoring application server activity with automation
tools, you can use this message to signal to the tools that the
specified server will stop intentionally.

BBOU0830I SM RESTARTING SERVER servername AFTER MAINTENANCE.
Explanation: This message is written to the console before the
System Management (SM) process restarts the specified server
during a server recycle operation. When a new conversation is
activated, SM recycles active application servers whose
configurations have changed in the new conversation.
User Response: If you are monitoring application server activity
with automation tools, you can use this message to signal to the
tools that the specified server will automatically restart.

BBOU0828I SM SUCCESSFULLY RESTARTED SERVER servername.
Explanation: This message is written to the console after the
System Management (SM) process successfully restarts the
specified server during a server recycle operation. When a new
conversation is activated, SM recycles active application
servers whose configurations have changed in the new
conversation.
User Response: If you are monitoring application server activity
with automation tools, you can use this message to signal to the
tools that the specified server was successfully restarted.

BBOU0829E SM FAILED TO RESTART SERVER servername.
Explanation: This message is written to the console when the
System Management (SM) process fails to restart the specified
server during a server recycle operation. When a new
conversation is activated, SM recycles active application
servers whose configurations have changed in the new
conversation. This message indicates an error occurred during
such a recycle operation.
User Response: Check the WebSphere error log for information
about errors preceding this one that could have caused the
server restart failure. If you are monitoring application server
activity with automation tools, you can use this message to
signal to the tools that the specified server failed to restart.


and Appendix B. Automation-geared messages, page 459, where
"BBOU0827I," "BBOU0830I," "BBOU0828I," and "BBOU0829E" will be
added to the table.
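
An automation tool might use the four messages above to tell a planned
recycle from an unexpected outage. The Python sketch below is one
possible approach. The class, the event names, and the assumption that
servername consists of letters, digits, and underscores are
hypothetical. Only the message IDs and message text come from this
APAR.

```python
import re

# Message IDs from this APAR mapped to hypothetical event names.
EVENTS = {
    "BBOU0827I": "stopping",
    "BBOU0830I": "restarting",
    "BBOU0828I": "restarted",
    "BBOU0829E": "restart_failed",
}

PATTERN = re.compile(r"(BBOU08(?:27I|28I|29E|30I)) SM .* SERVER (\w+)")


class MaintenanceTracker:
    """Track which servers SM is currently recycling."""

    def __init__(self):
        self.in_maintenance = set()

    def handle(self, line):
        """Return (event, server) for a recognized message, else None."""
        match = PATTERN.search(line)
        if match is None:
            return None
        msg_id, server = match.groups()
        event = EVENTS[msg_id]
        if event in ("stopping", "restarting"):
            self.in_maintenance.add(server)
        else:
            # Recycle finished, successfully or not.
            self.in_maintenance.discard(server)
        return event, server

    def is_planned_outage(self, server):
        """True while SM has announced a recycle that has not finished."""
        return server in self.in_maintenance
```

A monitor would call handle() for each console line and consult
is_planned_outage() before reacting to a server that is down.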


APAR PQ75807 is associated with SERVICE LEVEL W401511 of
WebSphere Application Server version 4.0.1 for z/OS and OS/390.
Temporary fix
Comments
APAR information
APAR number PQ75807
Reported component name WASKBASE
Reported component ID 5655A9801
Reported release 401
Status CLOSED PER
PE NoPE
HIPER NoHIPER
Submitted date 2003-06-30
Closed date 2003-08-08
Last modified date 2003-09-05

APAR is sysrouted FROM one or more of the following:

APAR is sysrouted TO one or more of the following:

Modules/Macros
BBOUBINF          

Fix information
Fixed component name WASKBASE
Fixed component ID 5655A9801

Applicable component levels
R401 PSY UQ79317    UP03/08/14 P F308


