The documentation required to diagnose child process crashes includes
If core dumps are not being saved for the child process crashes, the first step is to perform any necessary operating system and web server configuration so that core dumps are saved. Core dump configuration information is described here.
When a core dump is available, the ServerDoc tool provided with ihsdiag automates much of the work of gathering and formatting the required documentation. The user runs ServerDoc with the IHS installation directory and the path to the core file; ServerDoc then creates a new directory and stores the required documentation in it.
Once the ServerDoc tool has completed, the user should copy any remaining log files and configuration files used by the web server and the plug-in into the new directory, and send the directory to IBM support.
Note: If IBM HTTP Server has been upgraded to a newer maintenance level since the core dump was generated, the core dump needs to be reproduced with the new level of product code. Otherwise, the crash information will be incorrect since the core dump and the product won't match.
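One rough way to spot the mismatch described in the note is to compare file modification times. The sketch below is only a heuristic: the function name core_predates_binary is hypothetical, it assumes the server binary is at IHSROOT/bin/httpd, and it assumes that applying maintenance updates the binary's modification time. It is not part of ihsdiag.

```shell
#!/bin/sh
# Hypothetical helper, not part of ihsdiag.
# Returns 0 if the httpd binary was modified after the core file was written
# (a hint that the product may have been upgraded since the crash),
# 1 if not, and 2 if either file is missing.
# Assumption: the binary lives at IHSROOT/bin/httpd.
core_predates_binary() {
    ihs_root=$1
    core_file=$2
    httpd_bin="$ihs_root/bin/httpd"
    [ -f "$httpd_bin" ] && [ -f "$core_file" ] || return 2
    # find -newer prints the path only when httpd_bin is newer than core_file.
    if [ -n "$(find "$httpd_bin" -newer "$core_file")" ]; then
        return 0
    fi
    return 1
}
```

For example, core_predates_binary /usr/HTTPServer /tmp/core returning 0 suggests the crash should be reproduced before documentation is gathered.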
In addition to submitting the documentation described below, we also recommend enabling mod_whatkilledus and mod_backtrace so that key information about each subsequent crash is recorded in the web server error log. This provides additional insight into the crashes without requiring that the steps outlined in this document be followed for each and every crash.
These modules are not supported with very old maintenance levels of IBM HTTP Server. Check the Supported server versions section in the documentation for each module to confirm that the module works with your level of IBM HTTP Server.
These levels of the plug-in have an issue with random load balancing which can cause crashes with IBM HTTP Server. The crash can be in arbitrary code and may not consistently occur in the same place. To resolve this known problem, either change LoadBalance to a different value or apply the plug-in fix for APAR PK43752, which is targeted for 5.1.1.15, 6.0.2.21, and 6.1.0.9.
Make sure required Solaris AF_UNIX fixes have been applied, using one of the patches below or equivalent:
If crashes occur after apachectl restart or apachectl graceful on AIX 5.2, check for the following LoadModule directives in the configuration file (uncommented):
LoadModule dav_module modules/mod_dav.so
...
LoadModule dav_fs_module modules/mod_dav_fs.so
A child process crash can occur after a web server restart on AIX 5.2 if these are enabled.
If a DAV On directive is active elsewhere in the configuration, update the LDR_CNTRL setting in the IHSROOT/bin/envvars file to include @IGNOREUNLOAD in the value, as in the following example:
LDR_CNTRL="MAXDATA=0x60000000@IGNOREUNLOAD"
Simply add @IGNOREUNLOAD to the end of the current value of MAXDATA.
Stop IHS then start it again to activate the configuration change. It will not be activated across a restart.
If DAV is not being used, comment out the LoadModule directives instead:
# LoadModule dav_module modules/mod_dav.so
...
# LoadModule dav_fs_module modules/mod_dav_fs.so
Stop IHS then start it again to activate the configuration change. A restart is not sufficient due to the nature of this problem.
AIX APAR IY78080 resolves the problem for AIX 5.3. This APAR fix is not available for AIX 5.2, so one of the configuration changes described above must be used.
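To decide which of the two configuration changes applies, it helps to know whether the DAV modules are loaded and whether DAV is switched on. The following is a minimal sketch with hypothetical function names (active_dav_modules, dav_is_enabled). It inspects a single file only and does not follow Include directives, so treat its answer as a starting point.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting one configuration file.

# Print uncommented LoadModule lines for mod_dav and mod_dav_fs.
active_dav_modules() {
    grep -Ei '^[[:space:]]*LoadModule[[:space:]]+dav(_fs)?_module[[:space:]]' "$1"
}

# Succeed if an uncommented "DAV On" directive is present.
dav_is_enabled() {
    grep -Eiq '^[[:space:]]*DAV[[:space:]]+On[[:space:]]*$' "$1"
}
```

If active_dav_modules prints lines and dav_is_enabled succeeds, the @IGNOREUNLOAD change is the applicable one; if the modules are loaded but DAV is not enabled, commenting out the LoadModule directives is the simpler option.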
The most common cause of a SIGBUS crash on these platforms is that a file is truncated while the web server is trying to send it to a client. Some file replacement methods cause the existing file to be truncated and then the new contents written, instead of writing the new contents to a temporary file and then renaming to the proper name.
If you have static files served from IHS which can be modified in place, try EnableMMap Off to see if the problem is resolved.
Note: On Solaris, many other types of crashes result in SIGBUS.
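The safer replacement method mentioned above (write the new contents to a temporary file, then rename) can be sketched as follows. The function name replace_file_atomically is hypothetical, and the sketch assumes the temporary file can be created in the same directory as the target so that mv performs a rename rather than a copy.

```shell
#!/bin/sh
# Hypothetical helper: replace DEST with the contents of SRC without ever
# truncating DEST in place.
replace_file_atomically() {
    src=$1
    dest=$2
    # Create the temporary file next to DEST so the final mv is a rename
    # within one filesystem.
    tmp="$(dirname "$dest")/.$(basename "$dest").tmp.$$"
    cp "$src" "$tmp" || { rm -f "$tmp"; return 1; }
    mv -f "$tmp" "$dest"
}
```

Because the file being served is never shortened, a request already in progress keeps reading the old contents. Note that the new file gets default permissions from cp, which may need adjusting.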
For U40xx or S0C4 abend in LE CELQLIB at httpd child process termination, check for applicability of LE APAR PK34252.
The PHP manual recommends against using PHP in a multithreaded web server; see "Why shouldn't I use Apache2 with a threaded MPM in a production environment?".
IHS 2.0.42 and higher is multithreaded on all platforms. (IHS 1.3 is multithreaded only on Windows or with certain third-party modules.)
Thread safety problems in PHP applications or third-party libraries referenced by PHP can cause crashes in a threaded web server. The recommended solution is to configure PHP as a FastCGI application and use mod_fastcgi to communicate with it.
A core dump and related information are critical for diagnosing the cause of child process crashes. Without the information, IBM support is limited to suggesting that the customer move to the current level of fixes. With the information, IBM support anticipates being able to make the following initial determination:
In cases where an IBM component crashed, the documentation often contains enough information to address the root cause of previously unknown problems. Even when the root cause cannot be determined from a particular core dump, the documentation is used to decide the next step.
In cases where a third-party component crashed, the vendor of that component will need to investigate further; IBM support is unable to diagnose problems in third-party components.
Please refer to these instructions for verifying that required support programs are installed.
Run the tool as root to avoid any permissions problems with reading the core file or other files, such as log files and configuration files. (More information about the requirement to run this tool as root is available here.)
ServerDoc is passed three parameters for gathering crash documentation: the keyword GatherCrashDoc, the IHS installation directory, and the path to the core file.
# java -jar ServerDoc.jar GatherCrashDoc /path/to/IHS /path/to/corefile
The tool creates a new directory which contains a timestamp in the name, and the crash documentation will be saved in that directory.
For this example, IHS is installed in /usr/HTTPServer, the core dump was written to /tmp/core, and ihsdiag was unpacked into /root/ihsdiag-1.1.0.
# cd /tmp
# java -jar /root/ihsdiag-1.1.0/ServerDoc.jar GatherCrashDoc \
    /usr/HTTPServer /tmp/core
Reports, log files, and configuration files have been saved to directory CrashDoc.200404121310
If you have additional log files or configuration files, copy them there before packing up the directory.
Hint for packing up the directory:
  tar -cf CrashDoc.200404121310.tar CrashDoc.200404121310
  gzip CrashDoc.200404121310.tar
# ls -l CrashDoc.200404121310/
total 8136
-rw-r--r-- 1 root system 8779 Apr 12 13:10 access_log
-rw-r--r-- 1 root system 7094 Apr 12 13:10 apachectl
-rw-r--r-- 1 root system 3593703 Apr 12 13:10 core
-rw-r--r-- 1 root system 478483 Apr 12 13:10 core_file_strings
-rw-r--r-- 1 root system 14419 Apr 12 13:10 error_log
-rw-r--r-- 1 root system 37141 Apr 12 13:10 httpd.conf
-rw-r--r-- 1 root system 7500 Apr 12 13:10 log
-rw-r--r-- 1 root system 173 Apr 12 13:10 report
The next step is to copy any other web server or plug-in configuration files and logs into the new CrashDoc directory. Here is a list of files to copy if they are being used:
The last step is to pack up and compress the documentation directory using zip, tar followed by gzip, or tar followed by compress. The easiest way is to cut and paste the messages displayed by ServerDoc previously which showed the commands to use. The suggested commands will vary by platform. On z/OS, for example, compress will be suggested instead of gzip.
# tar -cf CrashDoc.200404121310.tar CrashDoc.200404121310
# gzip CrashDoc.200404121310.tar
The resulting compressed file is the file to send to IBM support.
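The copy and pack steps can be combined into one small script. This is a sketch only: the function name pack_crashdoc is hypothetical, the extra files passed to it are whatever additional logs and configuration files apply to your installation, and it uses tar followed by gzip, so on platforms where ServerDoc suggests compress the last command would need to change.

```shell
#!/bin/sh
# Hypothetical helper: copy extra files into a CrashDoc directory, then
# tar and gzip the directory.
# Usage: pack_crashdoc CrashDoc.200404121310 [extra-file ...]
pack_crashdoc() {
    doc_dir=$1
    shift
    [ -d "$doc_dir" ] || { echo "no such directory: $doc_dir" >&2; return 1; }
    for f in "$@"; do
        if [ -r "$f" ]; then
            cp "$f" "$doc_dir/" || return 1
        else
            echo "skipping unreadable file: $f" >&2
        fi
    done
    parent=$(dirname "$doc_dir")
    name=$(basename "$doc_dir")
    # Run from the parent directory so the archive holds relative paths.
    (cd "$parent" && tar -cf "$name.tar" "$name" && gzip "$name.tar")
}
```

The result is a file named after the directory with a .tar.gz suffix, written next to the CrashDoc directory.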
root requirement
When gathering information on web server crashes, the tool must be able to read core files created for web server processes and web server logs and configuration files. Often the web server logs and configuration files are readable by normal user ids, but core files are readable only by root or by the web server user id (e.g., nobody or www).
If the web server is started as root, the permissions on generated core files and log files and configuration files can be changed to allow a non-root user to run the crash documentation tool.
If the web server is not started as root, there are no such concerns, and the crash documentation tool may be run by the user id which starts the web server.
If the tool is run as non-root and it is unable to gather the required information, permissions on the core file or other files can be changed and the tool may be run again. It may not be possible to determine if this problem occurred until the documentation has been analyzed by IBM HTTP Server support.