[Tue Mar 8 17:12:35 2005] [emerg] (9)Bad file number: fcntl: F_SETLKW:
Error getting accept lock, exiting! Perhaps you need to use the
LockFile directive to place your lock file on a local disk!
The only case where this error has been seen is on Solaris with IHS 1.3 and a third-party module. The "Bad file number" message indicates that the IHS file mutex descriptor has been closed inadvertently.
We recommend switching the IHS mutex mechanism to SysV semaphore to
avoid this problem. Add AcceptMutex sysvsem
to
httpd.conf. Also, there are likely changes to the Solaris kernel
semaphore tuning. Refer to these
notes.
[warn] (22)Invalid argument: setsockopt: (TCP_NODELAY)
[error] (22)Invalid argument: getsockname
[warn] (22)Invalid argument: apr_socket_opt_set: (TCP_NODELAY)
This message is logged when the client drops the connection right after IHS has started processing it but before IHS has started reading the request. It is to be expected that some subset of clients will drop the connection at unexpected times (this could be a real user at a web browsers pressing the stop button), and this is one of the types of messages that can be logged because of that.
Unfortunately, the error returned from the kernel (EINVAL) has a text description that makes the failure look like a programming error, but that is how these calls are rejected when the TCP connection has already been dropped by the client.
Some load balancers test availability of the server on a regular basis in such a way that IHS often encounters a failure on one of these calls. Recent levels of IHS 1.3 have lowered the severity of these messages and included the client IP address in the log message so that they can be more easily ignored by most folks, but that people that see them can tell whether or not it is the same client over and over.
(info) (9) bad file number: core_output_filter: writing
data to the network.
The significance with this message is the text "bad file number." If
it says something like "Broken pipe" or "Connection reset", it is a
different issue.
To correct some problems which lead to the "bad file number" message, the customer needs the PQ85834 e-fix for 2.0.42.2, or 2.0.47.
[info] [client 1.2.3.4] (32)Broken pipe: client stopped
connection before rflush completed
[info] (32)Broken pipe: core_output_filter: writing data to the network
This message is logged when the client drops the connection while IHS is writing the response. It is to be expected that some subset of clients will drop the connection while IHS is writing to it. The user could have pressed the stop button or closed the browser or there could have been a temporary network problem.
Another normal condition where this may occur is when a .pdf file is requested from a browser with the Adobe plug-in installed. The browser first requests the entire file, then when it sees the type of response the plug-in takes over, cancels the initial request (though IHS may not have written the entire response yet), then re-issues the request for the .pdf file in the form of a range request.
A rare condition where these messages may occur is if there is some application layer issue (e.g., browser incompatibility with application response) which leads to the browser abruptly dropping the connection to IHS when it receives certain response data. In order to diagnose the problem, network traces or mod_net_trace are normally needed so that IHS support can see what was sent to the browser. For SSL connections, mod_net_trace is preferred since it traces the data in unencrypted form.
[Tue Feb 15 01:02:07 2005] [debug] D:\Build\WWWIHS1328\apache\ibm\modules\afpa\mod_ibm_afpa.c(595): (2)No such file or directory: afpa_init: hafpa = 0, cfg->init_afpa = 1, cfg->bBaseServerConfig = 1, cfg->afpaEnabled = 0
This is a debug message and is not formatted for customer consumption. The key issue here is the value printed for "cfg->afpaEnabled." If the value is 0, then AFPA is disabled. Otherwise, AFPA is enabled.
/opt/IBMHttpServer/bin/httpd: relocation error:
/opt/IBMHttpServer/bin/httpd: symbol __pthread_atfork, version GLIBC_2.0 not defined in file libpthread.so.0 with link time reference
This is an issue seen with IHS 1.3.28.x on RedHat 3 Update 4/x86 and SLES 9/x86 caused by the vendor dropping some programming interfaces which were available in prior releases. IHS will continually try to fork new child processes but each fork will fail with these messages. As of 20050315 there is no permanent solution. A work-around is to avoid the use of LDAP in /etc/nsswitch.conf.
The issue is being tracked at RedHat with bug id 67074. The customer can check with RedHat on the status of that report.
[notice] caught SIGTERM, shutting down
This message is a normal shutdown message for IHS. If a user or automated task (e.g., cron job) does "apachectl stop" or sends SIGTERM to the IHS parent process, this message will be written.
Note: The SIGTERM "signal" shouldn't be confused with other signals that indicate problems, such as SIGSEGV, SIGBUS, SIGABRT, and SIGILL. SIGTERM is the normal mechanism to tell a program to terminate on Unix, so apachectl sends SIGTERM to the IHS parent process to tell it to go away (and to clean up other IHS processes).
[warn] child process 26354 still did not exit, sending a SIGTERM
and/or [error] child process 26354 still did not exit, sending a SIGKILL
A child process is not exiting as quickly as expected during shutdown. During shutdown the parent will tell the child processes to exit. If after a short amount of time a child has still not exited, the parent will send the SIGTERM signal to the child process again and write the message to the log. If after another interval the child process still does not exit, then the parent process forcefully brings down the child by sending the SIGKILL signal to the child process.
[Wed Sep 10 15:20:03 2003] [crit] the listener thread didn't exit
This is sometimes written when a child process exits. It means that one of the threads in a child process didn't exit in the cleanest possible manner, but it doesn't indicate any operational problem. This is a debug message in current levels of IHS.
[Tue Oct 07 15:37:14 2003] [warn] new file descriptor 1025
is too large; you probably need to rebuild Apache with a larger FD_SETSIZE
(currently 1024)
This is caused by a problem which is fixed with 2.0.42.2-PQ85834 and 2.0.47.1.
(24)Too many open files: file permissions deny server
access: /opt/IBMHttpServer/htdocs/en_US/index.html
The ulimits for number of file descriptors per process could be too low, or there could be a file descriptor leak. Stop IHS, increase the hard and soft limits for number of file descriptors, start IHS again, and see if the problem reappears.
[Thu Dec 4 11:49:44 2003] [error] [client 1.2.3.4] (11)Resource temporarily unavailable: couldn't spawn child process: /path/to/cgi
This message occurs when the operation system cannot spawn
(fork()
) another child process for the web server for a
CGI request. The request will fail with a 500 error. The specific cause is
usually one of the following:
More information may be available in the
operation system documentation for the fork()
subroutine.
AIX
The AIX documentation for fork()
is
http://publib16.boulder.ibm.com/doc_link/en_US/a_doc_lib/libs/basetrf1/fork.htm.
The applicable limit is maxuproc
. The following
commands show how to display the current value and set it to a new
value:
# lsattr -El sys0 -a maxuproc maxuproc 2048 Maximum number of PROCESSES allowed per user True # chdev -l sys0 -a maxuproc=4096 sys0 changed
[Sat Dec 13 11:52:48 2003] [warn] long lost child came home!
(pid 11380)
This can occur with piped loggers (e.g., rotatelogs) during a graceful restart. One of these messages may appear for every piped logger.
With levels of IHS prior to interim fix PK01070, it can also occur during steady state for httpd processes which begin exiting due to MaxSpareThreads or MaxRequestsPerChild processing, but take a long time to exit.
[Wed Feb 20 12:17:16 2002] [alert] (11)Resource temporarily unavailable: apr_thread_create: unable to create worker thread
This is a tuning issue. The operating system failed a request by the web server to create another thread. The operating system can be tuned to allow more threads, or the web server can be tuned to create fewer threads per process so that it doesn't encounter this issue.
Things to check:
It may be too high for current operating system limits on memory or processes or threads.
This controls how much of the address space can be used for heap storage, and heap storage is the main requirement on AIX for being able to create a new thread.
[Tue Jan 20 10:35:57 2004] [error] server reached MaxClients
setting, consider raising the MaxClients setting
All we know for sure when this message appears is that at one time there were not any workers available to process additional requests. I.e., all child processes/threads are already in use handling existing requests. The message is only written once during the life of the server, so the peak condition could have happened more than once.
This is an important message to watch out for when there is a symptom such as a hung server. In some types of hang conditions (e.g., all IHS child process/threads tied up waiting for app server to respond, and app server isn't responding because the servlet processing the request is hung), this "MaxClients" message will be written to the error log.
If there is no other symptom, the customer may wish to increase MaxClients to increase the peak capacity of IHS, but unless any users are having problems with long response times then perhaps nothing can be done. The customer can get a good idea of how many of the IHS child processes/threads are in use at a given time by activating mod_status and checking the server-status page.
SemCounter ERROR: Cannot remove semaphore: 655433
This does not come from IHS. It comes from a third-party module.
A customer seeing this message needs to contact the vendor of whichever module is issuing the message.[crit] (67)Address already in use: make_sock: could not bind
to address nnn.nnn.nnn.nnn port 80
This can occur at startup or restart if some other process is listening to the specified port and optional address.
Ouch! malloc failed in malloc_block()
This occurs if a storage allocation fails. The child process then exits.
The storage allocation can fail due to reaching an OS limit for memory, or because a plug-in module is asking for an inappropriate amount of storage.
In later IHS 1.3 (PQ87084 for 1.3.19.6 or 1.3.26.2, or IHS >= 1.3.28), the number of bytes for the failing allocation is traced to make it easier to decide the condition.
[error] [client 127.0.0.1] request failed: error reading the
headers
This message can be triggered by one of the following protocol errors encountered while reading a request from the client:
The exact cause is written in the error message sent to the client,
but it is not logged anywhere unless the "error-notes" note for the
request is logged via mod_log_config. (This would be rather
voluminous and is not recommended except for brief testing periods.)
The note can be logged by adding %{error-notes}n
to your
log format string.
Even without logging the error-notes note, you can consult the access log to see the URL of the request which failed and see what error message was returned to the browser. Look for requests which failed with 400 with the same timestamp as the messages in the error log.
[error] SSL0108E: Initialization error, GSK internal
error.
If using 2.0.42.x on Linux/PPC: This can be caused by GSkit getting installed without the OpenSSL functions. The root cause shows up in an strace of server startup like this:
open("/usr/local/ibm/gsk6//icc/icclib/libicclib.so", O_RDONLY) = -1 ENOENT (No such file or directory)
To resolve: In the directory where IHS was unpacked there should be files iccrte.tar and osslrte.tar, as well as gskit.sh and the GSkit RPM. Run this to re-install GSkit with the OpenSSL function:
# rpm --erase gsk6bas # ./gskit.sh 1
This bypasses a license dialog. If you install IHS normally via the GUI, you'll have a chance to see the license and accept it.
The message can appear on other platforms when there is a problem with the GSKit installation or there is a system library compatibility issue.
piped log program '(null)' failed unexpectedly
This message can appear with IHS 2 and above at shutdown or restart time when piped log programs such as rotatelogs are configured. It does not result in an operational problem.
[error] [client nnn.nnn.nnn.nnn] (13)Permission denied:
access to some url failed because search permissions are missing on a
component of the path
To find the exact permission problem, first determine which file
should be served for the specified url. As an example, if the url is
/
, this would usually result in serving file index.html
in the IHS DocumentRoot
directory. With the following
configuration in httpd.conf
DocumentRoot /opt/local/HTTPServer/htdocs/en_US ... User nobody Group nobodythe actual file to be served would be
/opt/local/HTTPServer/htdocs/en_US/index.html
, and the
user/group nobody
/nobody
must be able to
read and search each component of the path up through index.html. So
run these commands and verify from the permissions of each component
of the path that user/group nobody
can read and search
that component.
ls -ld / ls -ld /opt ls -ld /opt/local ls -ld /opt/local/HTTPServer ls -ld /opt/local/HTTPServer/htdocs ls -ld /opt/local/HTTPServer/htdocs/en_US ls -l /opt/local/HTTPServer/htdocs/en_US/index.html
If the directory where you choose to install IHS is not within a directory created by the system, you may need to add read and search permissions to that directory so that the web server user/group can access the files.
[crit] SSL0600S: Unable to connect to session ID cache
This level of IHS does not log the exact failure. It may be caused
by a temporary issue during extremely high loads, where the session ID
cache daemon is temporarily behind in servicing requests. A common
cause is a configuration error which prevents the Unix socket used by
sidd from being accessible by the web server user id. See the
description below under Permission denied: SSL0600S: Unable to
connect to session ID cache
for more information about that
problem.
[crit] (13)Permission denied: SSL0600S: Unable to connect to session
ID cache
This will result in degraded SSL performance. This problem is caused by the web server user id (e.g., "www" or "nobody") not having permission to read the Unix socket used by sidd, the session id cache daemon). Consider /opt/IBMIHS as the example IHS install directory, and assume that customer did not use the SSLCachePortFilename directive to specify the location of the sidd socket, and www is the web server user id (value of User directive).
When IHS starts up, sidd will create the file /opt/IBMIHS/logs/siddport. When a new client SSL connection is received, mod_ibm_ssl will be running as user "www" and will try to connect to the sidd socket. So user "www" must have read and execute permissions to these directories:
/opt /opt/IBMIHS /opt/IBMIHS/logsAnd user "www" must have read permission to this "file":
/opt/IBMIHS/logs/siddportNormally, when IHS is installed the directories will be world readable and executable. If the customer changes those permissions (on /opt, /opt/IBMIHS, or /opt/IBMIHS/logs) then permission errors will be received when new SSL connections are being established and mod_ibm_ssl tries to connect to the sidd socket. The SSLCachePortFilename directive can be used to place the sidd socket somewhere else.
Example:
SSLCachePortFilename /var/run/siddport
The actual file needs to be in a directory structure which, on your system, the web server user id can access.
[error] [client nnn.nnn.nnn.nnn] (13)Permission denied: unable to connect to cgi daemon after multiple tries: /path/to/cgi
This message can be received for CGI requests with IBM HTTP Server
2.0 or above on Unix and Linux systems. mod_cgid
implements the CGI feature in these levels of IHS. At IHS
initialization, mod_cgid creates a Unix socket in the filesystem which
is used when executing CGI scripts. The default name for this Unix
socket is logs/cgisock
within the ServerRoot
directory. (Example name: /usr/IBMIHS/logs/cgisock
) The name
of this Unix socket in the filesystem can be changed with the
ScriptSock
directive. Example:
ScriptSock /tmp/scriptsock
There are two requirements for the filesystem path of this Unix socket:
ServerRoot
directive, the ScriptSock
directive must be used in each configuration file to specify a
unique name. Otherwise, security exposures can occur if different web
servers have different credentials, since CGI requests may end up
being executed by the wrong IHS instance. Beginning with cumulative
e-fix PK01070 for IHS 2.0.x, mod_cgid will refuse to execute a CGI
request if this configuration error exists, since unexpected
privileges could be used otherwise.
User
and Group
directives).
Assume the following configuration:
User nobody Group nobody ScriptSock /usr/IBMIHS/logs/scriptsock
If the permission denied error is written to the error log when a CGI request is made, the expected cause is that there is a filesystem permissions issue which prevents the web server user id from accessing the Unix socket. Perform the following checks:
Here is an example where the logs directory has been made readable only by root. In this case, the permission denied error would occur when trying to process a CGI request. The fix would be to use the ScriptSock directive to specify a Unix socket file that the web server user id could access, or to make the logs directory readable by the web server user id.
[trawick@b80-2 /]$ ls -ld /usr drwxr-xr-x 63 bin bin 1536 Mar 7 16:15 /usr [trawick@b80-2 /]$ ls -ld /usr/IBMIHS drwxr-xr-x 22 root system 512 May 19 2004 /usr/IBMIHS [trawick@b80-2 /]$ ls -ld /usr/IBMIHS/logs drwx------ 2 root system 512 May 19 2004 /usr/IBMIHS/logs
If the problem persists, send in the following information to IBM support:
[Tue Jan 20 10:27:34 2004] [error] [client 1.2.3.4] (13)Permission denied: Could not open file /apps/opt/IBMHTTPD-SSO/logs/filter.52954
This appears to be a typical IHS error message at first glance, but it is issued by SiteMinder. A customer seeing this message needs to contact CA/Netegrity to determine how to resolve the problem.