Descriptions for common error log messages

  • [Tue Mar 8 17:12:35 2005] [emerg] (9)Bad file number: fcntl: F_SETLKW: Error getting accept lock, exiting! Perhaps you need to use the LockFile directive to place your lock file on a local disk!

    The only case where this error has been seen is on Solaris with IHS 1.3 and a third-party module. The "Bad file number" message indicates that the IHS file mutex descriptor has been closed inadvertently.

    We recommend switching the IHS mutex mechanism to SysV semaphore to avoid this problem. Add AcceptMutex sysvsem to httpd.conf. Also, there are likely changes to the Solaris kernel semaphore tuning. Refer to these notes.

  • [warn] (22)Invalid argument: setsockopt: (TCP_NODELAY)
    [error] (22)Invalid argument: getsockname
    [warn] (22)Invalid argument: apr_socket_opt_set: (TCP_NODELAY)

    This message is logged when the client drops the connection right after IHS has started processing it but before IHS has started reading the request. It is to be expected that some subset of clients will drop the connection at unexpected times (this could be a real user at a web browsers pressing the stop button), and this is one of the types of messages that can be logged because of that.

    Unfortunately, the error returned from the kernel (EINVAL) has a text description that makes the failure look like a programming error, but that is how these calls are rejected when the TCP connection has already been dropped by the client.

    Some load balancers test availability of the server on a regular basis in such a way that IHS often encounters a failure on one of these calls. Recent levels of IHS 1.3 have lowered the severity of these messages and included the client IP address in the log message so that they can be more easily ignored by most folks, but that people that see them can tell whether or not it is the same client over and over.

  • (info) (9) bad file number: core_output_filter: writing data to the network. The significance with this message is the text "bad file number." If it says something like "Broken pipe" or "Connection reset", it is a different issue.

    To correct some problems which lead to the "bad file number" message, the customer needs the PQ85834 e-fix for 2.0.42.2, or 2.0.47.

  • [info] [client 1.2.3.4] (32)Broken pipe: client stopped connection before rflush completed
    [info] (32)Broken pipe: core_output_filter: writing data to the network

    This message is logged when the client drops the connection while IHS is writing the response. It is to be expected that some subset of clients will drop the connection while IHS is writing to it. The user could have pressed the stop button or closed the browser or there could have been a temporary network problem.

    Another normal condition where this may occur is when a .pdf file is requested from a browser with the Adobe plug-in installed. The browser first requests the entire file, then when it sees the type of response the plug-in takes over, cancels the initial request (though IHS may not have written the entire response yet), then re-issues the request for the .pdf file in the form of a range request.

    A rare condition where these messages may occur is if there is some application layer issue (e.g., browser incompatibility with application response) which leads to the browser abruptly dropping the connection to IHS when it receives certain response data. In order to diagnose the problem, network traces or mod_net_trace are normally needed so that IHS support can see what was sent to the browser. For SSL connections, mod_net_trace is preferred since it traces the data in unencrypted form.

  • [Tue Feb 15 01:02:07 2005] [debug] D:\Build\WWWIHS1328\apache\ibm\modules\afpa\mod_ibm_afpa.c(595): (2)No such file or directory: afpa_init: hafpa = 0, cfg->init_afpa = 1, cfg->bBaseServerConfig = 1, cfg->afpaEnabled = 0

    This is a debug message and is not formatted for customer consumption. The key issue here is the value printed for "cfg->afpaEnabled." If the value is 0, then AFPA is disabled. Otherwise, AFPA is enabled.

  • /opt/IBMHttpServer/bin/httpd: relocation error:
    /opt/IBMHttpServer/bin/httpd: symbol __pthread_atfork, version GLIBC_2.0 not defined in file libpthread.so.0 with link time reference

    This is an issue seen with IHS 1.3.28.x on RedHat 3 Update 4/x86 and SLES 9/x86 caused by the vendor dropping some programming interfaces which were available in prior releases. IHS will continually try to fork new child processes but each fork will fail with these messages. As of 20050315 there is no permanent solution. A work-around is to avoid the use of LDAP in /etc/nsswitch.conf.

    The issue is being tracked at RedHat with bug id 67074. The customer can check with RedHat on the status of that report.

  • [notice] caught SIGTERM, shutting down

    This message is a normal shutdown message for IHS. If a user or automated task (e.g., cron job) does "apachectl stop" or sends SIGTERM to the IHS parent process, this message will be written.

    Note: The SIGTERM "signal" shouldn't be confused with other signals that indicate problems, such as SIGSEGV, SIGBUS, SIGABRT, and SIGILL. SIGTERM is the normal mechanism to tell a program to terminate on Unix, so apachectl sends SIGTERM to the IHS parent process to tell it to go away (and to clean up other IHS processes).

  • [warn] child process 26354 still did not exit, sending a SIGTERM and/or [error] child process 26354 still did not exit, sending a SIGKILL

    A child process is not exiting as quickly as expected during shutdown. During shutdown the parent will tell the child processes to exit. If after a short amount of time a child has still not exited, the parent will send the SIGTERM signal to the child process again and write the message to the log. If after another interval the child process still does not exit, then the parent process forcefully brings down the child by sending the SIGKILL signal to the child process.

  • [Wed Sep 10 15:20:03 2003] [crit] the listener thread didn't exit

    This is sometimes written when a child process exits. It means that one of the threads in a child process didn't exit in the cleanest possible manner, but it doesn't indicate any operational problem. This is a debug message in current levels of IHS.

  • [Tue Oct 07 15:37:14 2003] [warn] new file descriptor 1025 is too large; you probably need to rebuild Apache with a larger FD_SETSIZE (currently 1024)

    This is caused by a problem which is fixed with 2.0.42.2-PQ85834 and 2.0.47.1.

  • (24)Too many open files: file permissions deny server access: /opt/IBMHttpServer/htdocs/en_US/index.html

    The ulimits for number of file descriptors per process could be too low, or there could be a file descriptor leak. Stop IHS, increase the hard and soft limits for number of file descriptors, start IHS again, and see if the problem reappears.

  • [Thu Dec 4 11:49:44 2003] [error] [client 1.2.3.4] (11)Resource temporarily unavailable: couldn't spawn child process: /path/to/cgi

    This message occurs when the operation system cannot spawn (fork()) another child process for the web server for a CGI request. The request will fail with a 500 error. The specific cause is usually one of the following:

  • limits on processes for the user which started the web server may be exceeded
  • system-wide limits on processes may be exceeded
  • there may be a system resource shortage

    More information may be available in the operation system documentation for the fork() subroutine.

    AIX

    The AIX documentation for fork() is http://publib16.boulder.ibm.com/doc_link/en_US/a_doc_lib/libs/basetrf1/fork.htm. The applicable limit is maxuproc. The following commands show how to display the current value and set it to a new value:

    # lsattr -El sys0 -a maxuproc
    maxuproc 2048 Maximum number of PROCESSES allowed per user True
    # chdev -l sys0 -a maxuproc=4096
    sys0 changed
    
  • [Sat Dec 13 11:52:48 2003] [warn] long lost child came home! (pid 11380)

    This can occur with piped loggers (e.g., rotatelogs) during a graceful restart. One of these messages may appear for every piped logger.

    With levels of IHS prior to interim fix PK01070, it can also occur during steady state for httpd processes which begin exiting due to MaxSpareThreads or MaxRequestsPerChild processing, but take a long time to exit.

  • [Wed Feb 20 12:17:16 2002] [alert] (11)Resource temporarily unavailable: apr_thread_create: unable to create worker thread

    This is a tuning issue. The operating system failed a request by the web server to create another thread. The operating system can be tuned to allow more threads, or the web server can be tuned to create fewer threads per process so that it doesn't encounter this issue.

    Things to check:

    1. value of ThreadsPerChild in the worker MPM configuration in serverroot/conf/httpd.conf

      It may be too high for current operating system limits on memory or processes or threads.

    2. AIX: value of LDR_CNTRL variable in serverroot/bin/envvars file

      This controls how much of the address space can be used for heap storage, and heap storage is the main requirement on AIX for being able to create a new thread.

  • [Tue Jan 20 10:35:57 2004] [error] server reached MaxClients setting, consider raising the MaxClients setting

    All we know for sure when this message appears is that at one time there were not any workers available to process additional requests. I.e., all child processes/threads are already in use handling existing requests. The message is only written once during the life of the server, so the peak condition could have happened more than once.

    This is an important message to watch out for when there is a symptom such as a hung server. In some types of hang conditions (e.g., all IHS child process/threads tied up waiting for app server to respond, and app server isn't responding because the servlet processing the request is hung), this "MaxClients" message will be written to the error log.

    If there is no other symptom, the customer may wish to increase MaxClients to increase the peak capacity of IHS, but unless any users are having problems with long response times then perhaps nothing can be done. The customer can get a good idea of how many of the IHS child processes/threads are in use at a given time by activating mod_status and checking the server-status page.

  • SemCounter ERROR: Cannot remove semaphore: 655433

    This does not come from IHS. It comes from a third-party module.

    A customer seeing this message needs to contact the vendor of whichever module is issuing the message.

  • [crit] (67)Address already in use: make_sock: could not bind to address nnn.nnn.nnn.nnn port 80

    This can occur at startup or restart if some other process is listening to the specified port and optional address.

  • Ouch! malloc failed in malloc_block()

    This occurs if a storage allocation fails. The child process then exits.

    The storage allocation can fail due to reaching an OS limit for memory, or because a plug-in module is asking for an inappropriate amount of storage.

    In later IHS 1.3 (PQ87084 for 1.3.19.6 or 1.3.26.2, or IHS >= 1.3.28), the number of bytes for the failing allocation is traced to make it easier to decide the condition.

  • [error] [client 127.0.0.1] request failed: error reading the headers

    This message can be triggered by one of the following protocol errors encountered while reading a request from the client:

  • the number of request header fields exceeds the server's limit (400 will be logged in access log)
  • the length of a request header field exceeds the server's limit (400 will be logged in access log)
  • a request header field is missing a colon separator (400 will be logged in access log)

    The exact cause is written in the error message sent to the client, but it is not logged anywhere unless the "error-notes" note for the request is logged via mod_log_config. (This would be rather voluminous and is not recommended except for brief testing periods.) The note can be logged by adding %{error-notes}n to your log format string.

    Even without logging the error-notes note, you can consult the access log to see the URL of the request which failed and see what error message was returned to the browser. Look for requests which failed with 400 with the same timestamp as the messages in the error log.

  • [error] SSL0108E: Initialization error, GSK internal error.

    If using 2.0.42.x on Linux/PPC: This can be caused by GSkit getting installed without the OpenSSL functions. The root cause shows up in an strace of server startup like this:

    open("/usr/local/ibm/gsk6//icc/icclib/libicclib.so", O_RDONLY) = -1 ENOENT (No such file or directory)
    

    To resolve: In the directory where IHS was unpacked there should be files iccrte.tar and osslrte.tar, as well as gskit.sh and the GSkit RPM. Run this to re-install GSkit with the OpenSSL function:

    # rpm --erase gsk6bas
    # ./gskit.sh 1
    

    This bypasses a license dialog. If you install IHS normally via the GUI, you'll have a chance to see the license and accept it.

    The message can appear on other platforms when there is a problem with the GSKit installation or there is a system library compatibility issue.

  • piped log program '(null)' failed unexpectedly

    This message can appear with IHS 2 and above at shutdown or restart time when piped log programs such as rotatelogs are configured. It does not result in an operational problem.

  • [error] [client nnn.nnn.nnn.nnn] (13)Permission denied: access to some url failed because search permissions are missing on a component of the path

    To find the exact permission problem, first determine which file should be served for the specified url. As an example, if the url is /, this would usually result in serving file index.html in the IHS DocumentRoot directory. With the following configuration in httpd.conf

    DocumentRoot /opt/local/HTTPServer/htdocs/en_US
    ...
    User nobody
    Group nobody
    
    the actual file to be served would be /opt/local/HTTPServer/htdocs/en_US/index.html, and the user/group nobody/nobody must be able to read and search each component of the path up through index.html. So run these commands and verify from the permissions of each component of the path that user/group nobody can read and search that component.

    ls -ld /
    ls -ld /opt
    ls -ld /opt/local
    ls -ld /opt/local/HTTPServer
    ls -ld /opt/local/HTTPServer/htdocs
    ls -ld /opt/local/HTTPServer/htdocs/en_US
    ls -l /opt/local/HTTPServer/htdocs/en_US/index.html
    

    If the directory where you choose to install IHS is not within a directory created by the system, you may need to add read and search permissions to that directory so that the web server user/group can access the files.

  • [crit] SSL0600S: Unable to connect to session ID cache

    This level of IHS does not log the exact failure. It may be caused by a temporary issue during extremely high loads, where the session ID cache daemon is temporarily behind in servicing requests. A common cause is a configuration error which prevents the Unix socket used by sidd from being accessible by the web server user id. See the description below under Permission denied: SSL0600S: Unable to connect to session ID cache for more information about that problem.

  • [crit] (13)Permission denied: SSL0600S: Unable to connect to session ID cache

    This will result in degraded SSL performance. This problem is caused by the web server user id (e.g., "www" or "nobody") not having permission to read the Unix socket used by sidd, the session id cache daemon). Consider /opt/IBMIHS as the example IHS install directory, and assume that customer did not use the SSLCachePortFilename directive to specify the location of the sidd socket, and www is the web server user id (value of User directive).

    When IHS starts up, sidd will create the file /opt/IBMIHS/logs/siddport. When a new client SSL connection is received, mod_ibm_ssl will be running as user "www" and will try to connect to the sidd socket. So user "www" must have read and execute permissions to these directories:

            
    /opt                                                                    
    /opt/IBMIHS                                                             
    /opt/IBMIHS/logs                                                        
    
    And user "www" must have read permission to this "file":
    /opt/IBMIHS/logs/siddport
    
    Normally, when IHS is installed the directories will be world readable and executable. If the customer changes those permissions (on /opt, /opt/IBMIHS, or /opt/IBMIHS/logs) then permission errors will be received when new SSL connections are being established and mod_ibm_ssl tries to connect to the sidd socket. The SSLCachePortFilename directive can be used to place the sidd socket somewhere else.

    Example:

    SSLCachePortFilename /var/run/siddport                         
    

    The actual file needs to be in a directory structure which, on your system, the web server user id can access.

  • [error] [client nnn.nnn.nnn.nnn] (13)Permission denied: unable to connect to cgi daemon after multiple tries: /path/to/cgi

    This message can be received for CGI requests with IBM HTTP Server 2.0 or above on Unix and Linux systems. mod_cgid implements the CGI feature in these levels of IHS. At IHS initialization, mod_cgid creates a Unix socket in the filesystem which is used when executing CGI scripts. The default name for this Unix socket is logs/cgisock within the ServerRoot directory. (Example name: /usr/IBMIHS/logs/cgisock) The name of this Unix socket in the filesystem can be changed with the ScriptSock directive. Example:

    ScriptSock /tmp/scriptsock
    

    There are two requirements for the filesystem path of this Unix socket:

    1. If there are multiple instances of IHS running on this machine, the different instances must be configured to use different Unix sockets. When the different instances have the same value for the ServerRoot directive, the ScriptSock directive must be used in each configuration file to specify a unique name. Otherwise, security exposures can occur if different web servers have different credentials, since CGI requests may end up being executed by the wrong IHS instance. Beginning with cumulative e-fix PK01070 for IHS 2.0.x, mod_cgid will refuse to execute a CGI request if this configuration error exists, since unexpected privileges could be used otherwise.
    2. The entire path to the socket must be accessible by the web server user id (e.g., nobody, or whatever is specified by the User and Group directives).

      Assume the following configuration:

      User nobody
      Group nobody
      ScriptSock /usr/IBMIHS/logs/scriptsock
      

      If the permission denied error is written to the error log when a CGI request is made, the expected cause is that there is a filesystem permissions issue which prevents the web server user id from accessing the Unix socket. Perform the following checks:

      1. Run "ls -ld /usr" and verify that user/group nobody has permissions to read (r) and search (x) the directory.
      2. Run "ls -ld /usr/IBMIHS" and verify that user/group nobody has permissions to read (r) and search (x) the directory.
      3. Run "ls -ld /usr/IBMIHS/logs" and verify that user/group nobody has permissions to read (r) and search (x) the directory.
      4. Run "ls -l /usr/IBMIHS/logs/scriptsock" and verify that user/group nobody has permissions to read (r) and write (w) the Unix socket file.

      Here is an example where the logs directory has been made readable only by root. In this case, the permission denied error would occur when trying to process a CGI request. The fix would be to use the ScriptSock directive to specify a Unix socket file that the web server user id could access, or to make the logs directory readable by the web server user id.

      [trawick@b80-2 /]$ ls -ld /usr
      drwxr-xr-x  63 bin      bin            1536 Mar  7 16:15 /usr
      [trawick@b80-2 /]$ ls -ld /usr/IBMIHS
      drwxr-xr-x  22 root     system          512 May 19 2004  /usr/IBMIHS
      [trawick@b80-2 /]$ ls -ld /usr/IBMIHS/logs
      drwx------   2 root     system          512 May 19 2004  /usr/IBMIHS/logs
      

    If the problem persists, send in the following information to IBM support:

    1. IHS conf file
    2. IHS error log covering a time when CGI requests failed
    3. output of ls -ld on every directory component of the path to the Unix socket
    4. output of ls -l on the Unix socket itself
  • [Tue Jan 20 10:27:34 2004] [error] [client 1.2.3.4] (13)Permission denied: Could not open file /apps/opt/IBMHTTPD-SSO/logs/filter.52954

    This appears to be a typical IHS error message at first glance, but it is issued by SiteMinder. A customer seeing this message needs to contact CA/Netegrity to determine how to resolve the problem.