This document provides a checklist of common causes of truncated, missing, and corrupted Java thread dumps (javacore files), Java object dumps (heapdump files), and/or process core dumps (core files) when using IBM Java for AIX.

Note:
Application developers, system administrators, and application vendors are expected to confirm that each of these scenarios has been reviewed and eliminated as a possible cause prior to opening an IBM support call.

Overview

IBM Java for AIX includes feature-rich diagnostic capabilities such as the creation of Java thread dumps (javacore files), Java object dumps (heapdump files), and/or process core dumps (system or core files).

These diagnostic files may be generated automatically or manually depending on the configuration and the situations that occur. This document identifies common scenarios leading to truncated, missing, and/or corrupted diagnostic files and provides tips for resolving the issue.

When you contact IBM Support, the support specialists will confirm that this checklist has been completed. Therefore, to expedite the resolution of an issue, please complete this checklist of scenarios and tips prior to contacting IBM Support.

This document will not provide information for using and/or analyzing the diagnostic files.

Checking Process Environment Variables

Several tips in this document involve checking whether a process environment variable has been configured. Process environment variables are also known as shell variables. This section provides examples of how to view the configured process environment variables for a running process.

For the examples in this section, replace references to:

- JAVA_PID with the process id of the active process
- JAVA_HOME with the parent directory of the Java installation being used for the application
- LARGE_TMP_DIR with an existing directory with adequate free space (e.g., 2GB - 6GB (or higher depending on your process size))

a. Use the ps command to list the process environment variables
(The output can be truncated for very long command lines, which is common for Java processes)

# ps ewwww JAVA_PID | sed -e "s| \([^=]*=\)|#\1|g" | tr '#' '\n'

b. Generate an AIX process core file (e.g., gencore), then search (e.g., grep) for strings that match command lines.
(Only run the gencore command one time, then use command line tools such as grep, egrep, and strings to inspect the process core file)

{the following command is likely to also display unprintable characters}
# cd /LARGE_TMP_DIR
# gencore JAVA_PID core.dmp
# strings core.dmp | grep "^[A-Za-z0-9_][A-Za-z0-9_]*=" | sort -u


c. Generate an AIX process core file (e.g., gencore), then extract (e.g., jdmpview) the command line sections.
(Only run the gencore command one time, then run the jdmpview command to inspect the process core file)


# cd /LARGE_TMP_DIR
# gencore JAVA_PID core.dmp

{For Java 6.0}
# JAVA_HOME/jre/bin/jextract -J-Xms1024M -J-Xmx1024M core.dmp core.dmp.zip
# echo "info proc quit " | JAVA_HOME/bin/jdmpview -zip core.dmp.zip

{For Java 7.0, 7.1, and 8.0}
# echo "info process quit " | JAVA_HOME/jre/bin/jdmpview -core core.dmp

d. Download and execute the free software tool called getevars (usually included in perfpmr)

{Download the version of getevars for your AIX level, then save as /tmp/getevars on your AIX System: see oslevel -s command to determine AIX level}
AIX 61: ftp://ftp.software.ibm.com/aix/tools/java/tools/getevars61
AIX 71: ftp://ftp.software.ibm.com/aix/tools/java/tools/getevars71
AIX 72: ftp://ftp.software.ibm.com/aix/tools/java/tools/getevars72

{Execute the following commands}
# cd /tmp
# chmod 755 getevars
# getevars JAVA_PID

*Note:
Before using the gencore command, as the root user, from a command prompt, execute the following command to ensure a complete process core is generated. If you do not perform this step, you may not see valid results from the above commands.

# chdev -lsys0 -afullcore=true
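
The listings produced by the commands above can be long. As a minimal sketch, and not part of any IBM tooling, the following shell function reduces a NAME=VALUE listing (one variable per line) to the process environment variables that this document refers to. The function name filter_dump_env is hypothetical, and the variable list is taken only from the sections of this document.

```shell
# Hypothetical helper: read a NAME=VALUE listing (one variable per line)
# on stdin and keep only the environment variables named in this document.
filter_dump_env() {
    grep -E '^(LDR_CNTRL|TMPDIR|DISABLE_JAVADUMP|IBM_NOSIGHANDLER|IBM_HEAPDUMP|IBM_HEAP_DUMP|IBM_JAVACOREDIR|IBM_HEAPDUMPDIR|IBM_COREDIR)='
}
```

For example, it could be placed between the strings and sort commands shown in example b:

# strings core.dmp | filter_dump_env | sort -u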

Checking Java Command Line Parameters

Several tips in this document involve checking whether Java command line options have been configured. This section provides examples of how to view the configured Java command line options for the running process.

For the examples in this section, replace references to:

- JAVA_PID with the process id of the active process
- JAVA_HOME with the parent directory of the Java installation being used for the application
- LARGE_TMP_DIR with an existing directory with adequate free space (e.g., 2GB - 6GB (or higher depending on your process size))

a. Use the ps command to list the Java command line options
(The output can be truncated for very long command lines, which is common for Java processes)

# ps avwwwg JAVA_PID | tr ' ' '\n' | grep "^-[A-Za-z0-9]"

b. Generate an AIX process core file (e.g., gencore), then search (e.g., grep) for strings that match command lines.
(Only run the gencore command one time, then use command line tools such as grep, egrep, and strings to inspect the process core file)

{the following command is likely to also display unprintable characters}
# cd /LARGE_TMP_DIR
# gencore JAVA_PID core.dmp
# strings core.dmp | grep "^-[A-Za-z0-9]"


c. Generate an AIX process core file (e.g., gencore), then extract (e.g., jdmpview) the command line sections.
(Only run the gencore command one time, then run the jdmpview command to inspect the process core file)

# cd /LARGE_TMP_DIR
# gencore JAVA_PID core.dmp


{For Java 6.0}
# JAVA_HOME/jre/bin/jextract -J-Xms1024M -J-Xmx1024M core.dmp core.dmp.zip
# echo "info proc quit " | JAVA_HOME/bin/jdmpview -zip core.dmp.zip

{For Java 7.0, 7.1, and 8.0}
# echo "info process quit " | JAVA_HOME/jre/bin/jdmpview -core core.dmp


*Note:
Before using the gencore command, as the root user, from a command prompt, execute the following command to ensure a complete process core is generated. If you do not perform this step, you may not see valid results from the above commands.

# chdev -lsys0 -afullcore=true
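
To narrow a long Java command line down to the options that matter for diagnostic files, a small filter can help. The sketch below is illustrative only: the function name filter_dump_options is hypothetical, and the option prefixes are limited to those mentioned in this document (-Xdump, -Xrs, -Xdisablejavadump, -Xmx). It assumes that option values do not contain spaces.

```shell
# Hypothetical helper: split a command line read from stdin into one token
# per line, then keep only the Java options referred to in this document.
# Assumption: no option value contains an embedded space.
filter_dump_options() {
    tr ' ' '\n' | grep -E '^-(Xdump|Xrs|Xdisablejavadump|Xmx)'
}
```

It can be fed from the ps command shown in example a:

# ps avwwwg JAVA_PID | filter_dump_options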

Configuring Java Command Line Options and Process Environment Variables

This document refers to checking and configuring Java command line options and process environment variables. It assumes that application developers and systems administrators have at least a basic working knowledge of AIX and/or UNIX environments and processes.

For most Java processes, the Java command line options and the process environment variables are configured in startup shell scripts or application configuration files. This requires more effort by the developers and administrators to identify, review, and modify the configuration. If the configuration settings cannot be located, please work with the application vendor or development team that provided the application environment to identify the location of the settings.

As a courtesy to our customers, this technote is provided to assist developers and administrators with links to online documentation for popular application environments to assist with the identification and configuration of command line options and process environment variables.

Low Free File System Space

One of the common reasons for truncated or corrupted diagnostic files is insufficient free space in the file system containing the diagnostic files.

This situation occurs when diagnostic files (e.g., javacore, heapdump, and process core files) are generated frequently or are simply left behind in the file system.

To check the file system free space, from a command prompt, execute the following command:

{replace JAVA_DIAGNOSTIC_FILE with the complete path and name of a javacore file or heapdump file}
{the location of the JAVA_DIAGNOSTIC_FILE may be noted in the application log file or process standard error output}

$ df -mP JAVA_DIAGNOSTIC_FILE

(e.g., df -mP /tmp/javacore.20160101.121212.1234567.001.txt)

If the value listed for the Available column (a.k.a., free space) is low (e.g., has less than 5 GB of free space), then consider these corrective actions:

a. Look for older diagnostic files, then remove them or move them to another file system

b. Increase the size of the file system using the chfs or smitty chfs commands

c. Configure the IBM Java process to generate the diagnostic files in an alternative file system that has adequate free space.
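
The free space check above can be scripted so that it is easy to repeat. The following is a minimal sketch, assuming the df -mP output has a single header line followed by one data row whose fourth field is the Available column. The function names are hypothetical, and the 5120 MB default simply mirrors the 5 GB example given above.

```shell
# Print the Available column (4th field of the data row) from
# "df -mP" style output supplied on stdin.
df_available_mb() {
    awk 'NR == 2 { print $4 }'
}

# Exit 0 when AVAILABLE_MB ($1) is below MIN_MB ($2). awk performs the
# comparison because df may print fractional megabyte values.
is_low_space() {
    awk -v avail="$1" -v min="$2" 'BEGIN { exit !(avail + 0 < min + 0) }'
}

# Report whether the file system holding the path in $1 has at least
# MIN_MB ($2, default 5120) megabytes available.
check_dump_space() {
    avail=$(df -mP "$1" | df_available_mb)
    if is_low_space "$avail" "${2:-5120}"; then
        echo "LOW: ${avail} MB available for $1"
    else
        echo "OK: ${avail} MB available for $1"
    fi
}
```

For example: check_dump_space /tmp/javacore.20160101.121212.1234567.001.txt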

Disk Quotas

Many times, application developers and application administrators are not aware that the AIX systems administrator has enabled disk quotas (using the AIX disk quota feature or another vendor solution).

If a disk quota feature or application has been enabled, then consider these corrective actions:

a. Increase the disk quota

b. Configure the IBM Java process to generate the diagnostic files in an alternative file system that has no quota or a higher quota

System, Process, User, File, and Core Limits

One of the common reasons for truncated, missing, and corrupted diagnostic files is due to the configured user limits (a.k.a., ulimits).

The user limits can be configured using the /etc/security/limits file, the ulimit command, and the configuration of open source, custom, or other vendor authentication software (e.g., open source sshd daemon).

To confirm that the truncated, missing, or corrupted diagnostic files are not caused by incorrect user limits, consider these corrective actions:

a. Set the fsize to unlimited to remove the file size restriction by executing the following command as the root user:

{replace USERID with the AIX user id that is used to run the Java application}

# chuser fsize=-1 USERID


b. Increase the size of nofiles to allow for more open files by executing the following command as the root user:

{replace USERID with the AIX user that is used to run the Java application}

{**Important, once the # of open files has been ruled out, configure the # of open files (nofiles) back to the original setting.}
{This will require a restart of the application(s)}

# chuser nofiles=30000 USERID


c. For process core dumps (e.g., core files), set the data and core limits to unlimited to remove any restrictions by executing the following commands as the root user:

{replace USERID with the AIX user that is used to run the Java application}

# chdev -lsys0 -afullcore=true
# chuser core=-1 data=-1 USERID


d. In the case of an open source, custom, or other vendor authentication software being used, refer to that application's documentation or contact that vendor for assistance.


** Important:
After executing the chuser command, for the settings to become active, it is required to:

1. Stop the process and its parent processes (e.g., node manager, node agent)
2. Re-login as the user id
3. Confirm the settings are enabled by executing the command ulimit -a as the user id
4. Restart all processes.
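
The confirmation in step 3 can be narrowed to the limits discussed in this section. The sketch below is illustrative: the function names are hypothetical, and it assumes the shell's ulimit builtin accepts the -f, -c, -d, and -n flags. It reports the soft limits of the shell it runs in, so it should be run as the user id that starts the Java process.

```shell
# Hypothetical helper: label a limit value as printed by the ulimit builtin.
report_limit() {
    if [ "$2" = "unlimited" ]; then
        echo "$1: unlimited"
    else
        echo "$1: limited to $2"
    fi
}

# Print the soft limits most relevant to diagnostic files for this shell.
# Assumption: the shell's ulimit builtin supports -f, -c, -d and -n.
show_dump_limits() {
    report_limit "fsize" "$(ulimit -f)"
    report_limit "core" "$(ulimit -c)"
    report_limit "data" "$(ulimit -d)"
    report_limit "nofiles" "$(ulimit -n)"
}
```

Any line reported as limited for fsize, core, or data is a candidate for the chuser commands shown above.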

32-bit Process Limitation

This scenario only applies to 32-bit IBM Java for AIX implementations (i.e., this does not apply to 64-bit Java).

32-bit processes have a maximum addressable native memory space of 4 GB. Depending on the values used for the maximum Java heap size (-Xmx parameter) and the LDR_CNTRL=MAXDATA setting, the native memory for the process may be constrained or may have reached the upper limit of the available native memory for the 32-bit Java process.

This situation can contribute to truncated, missing, or corrupted diagnostic files. There are two possible corrective actions:

a. Set the following process environment variable prior to starting the Java process:

export LDR_CNTRL=MAXDATA=0xB0000000@DSA

Click here for information on LDR_CNTRL and click here for information on how to configure process environment variables.

b. Determine if the application can be migrated to a 64-bit JVM by working with the application development team or application vendor.
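
To put the MAXDATA value from action a into perspective, the arithmetic can be done in the shell. This is a sketch under two assumptions that are not stated in this document: that the MAXDATA value is a byte count, and that a 32-bit AIX address space is divided into 256 MB (0x10000000 byte) segments. The function names are hypothetical.

```shell
# Assumption: 32-bit AIX segments are 256 MB (0x10000000 bytes) each.
# Print how many such segments a MAXDATA value covers.
maxdata_segments() {
    echo $(( $1 / 0x10000000 ))
}

# Print the same MAXDATA value in megabytes.
maxdata_mb() {
    echo $(( $1 / 1048576 ))
}
```

Under these assumptions, maxdata_segments 0xB0000000 prints 11 and maxdata_mb 0xB0000000 prints 2816, that is, 2.75 GB of the 4 GB address space.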

Permissions or Extended Access Control Lists

Incorrect directory permissions can prevent the diagnostic files from being generated. The permission issues may be the result of:


- The directory or one of the parent directories having write access disabled for the user id running the process

- Extended access control lists (ACL) have been configured

- Incorrect permissions for the underlying file system mount point


Take these corrective actions to resolve permission issues. To identify the location of the diagnostic files, review the application logs and/or application standard error output for references to the filenames.

a. Review the permissions of the directory and/or parent directories using the ls -ld PATH command.

b. Work with the system administrator to understand if extended ACLs have been enabled for the directory and/or parent directories.

c. Follow the instructions in this technote to identify if the underlying file system mount point has incorrect permissions
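
Action a can be repeated for every parent directory in one pass. The following is a minimal sketch with hypothetical function names; it only runs ls -ld on each directory from the given path up to the root and does not inspect extended ACLs or the permissions hidden underneath a mount point.

```shell
# Print the given path followed by each of its parent directories,
# one per line, ending at "/" (or "." for a relative path).
list_parents() {
    dir=$1
    while :; do
        printf '%s\n' "$dir"
        case $dir in
            /|.) break ;;
        esac
        dir=$(dirname "$dir")
    done
}

# Run "ls -ld" on the path and on every parent directory.
show_path_permissions() {
    list_parents "$1" | while IFS= read -r d; do
        ls -ld "$d"
    done
}
```

For example, show_path_permissions PATH lists the permissions of PATH and of every directory above it, which makes a parent directory without write or search access easier to spot.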

Security Software or Disk Software

When security and disk applications are applied to systems, they may install device driver level software that can impair or restrict the creation of diagnostic files.

Consider these actions to determine whether non-standard software is installed and whether it is interfering with the creation of diagnostic files:

a. Contact the system administrator to identify any non-standard security or disk software

b. Temporarily disable any security software

c. Configure the IBM Java process to generate the diagnostic files in an alternative file system that has no quota or a higher quota

Remote File System Constraints

In situations where a remote file server (e.g., NFS, Samba) is used as the destination for the diagnostic files, consider these corrective actions to ensure that the remote file server or its configuration is not contributing to the truncated, missing, or corrupted files:

a. Confirm that the local file system and remote file system are writeable by the user id running the Java process

b. Confirm the permissions of the local and remote directory and/or parent directories are writable by the user id

c. Confirm that the remote file system is mounted and operational

d. If using automount for the destination directory, use an alternative location or disable automount

e. Confirm there are no network issues between the local and remote systems (e.g., performance, dropped packets, etc.)

f. Follow the instructions in this technote to identify if the underlying file system mount point has incorrect permissions
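
A quick way to exercise several of these checks at once is to write a scratch file into the destination directory and confirm that the full size arrives. The sketch below is an illustration only; the function name probe_write and the scratch file name are hypothetical. Run it as the user id that runs the Java process, and choose a size that is safe for the file system.

```shell
# Write SIZE_KB ($2, default 1024) kilobytes to a scratch file in the
# directory $1, confirm the full size was stored, then remove the file.
# Exit 0 only when the write succeeded and the size matches.
probe_write() {
    dir=$1
    size_kb=${2:-1024}
    probe="$dir/dump-write-probe.$$"
    dd if=/dev/zero of="$probe" bs=1024 count="$size_kb" 2>/dev/null || {
        rm -f "$probe"
        return 1
    }
    written=$(( $(wc -c < "$probe") ))
    rm -f "$probe"
    [ "$written" -eq $(( size_kb * 1024 )) ]
}
```

A failure here points to a permission, quota, mount, or space problem on the destination rather than to the JVM.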

Too Frequent Generation

There are situations when diagnostic files are generated automatically. The events triggering the diagnostic files may occur too frequently, leaving the Java process in a state where it is unable to service the requests and generates truncated or corrupted diagnostic files. The automated generation of these files can be the result of frequent occurrences of exceptions, out-of-memory conditions, assertions, GPFs, and other processes (e.g., admin and monitor programs) polling the process (e.g., sending kill -3 to the Java process).

Additionally, there may be situations where users are manually generating diagnostic files by executing the command kill -3 JAVA_PID. Executing this command too frequently can put the Java process in an unstable state that can contribute to truncated and/or corrupted diagnostic files.

Consider these corrective actions to limit the frequency of diagnostic files being generated:

a. Address the source of the exceptions or issues contributing to the frequent generation of diagnostic data

b. Use the Xdump IBM Java command line option with the range suboption. Use of this option will be specific to the situation. Visit this technote for configuration details for your environment. For example: -Xdump:java:events=systhrow,throw,range=1..2 (creates javacore files for the first two exceptions only)

c. When manually executing multiple kill -3 JAVA_PID commands, confirm that the generation of the previous diagnostic data has completed before attempting to execute the next kill -3 JAVA_PID command
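
For action c, one heuristic for deciding that the previous file has finished is to wait until its size stops changing. The sketch below assumes that an unchanged size between two polls means the file is complete, which is an assumption rather than a guarantee. The function name and the default interval and poll count are hypothetical.

```shell
# Poll the file $1 every INTERVAL ($2, default 5) seconds until its size is
# unchanged between two consecutive polls, giving up after MAX_POLLS
# ($3, default 60) polls. Exit 0 when the size is stable, 1 otherwise.
wait_for_stable_size() {
    file=$1
    interval=${2:-5}
    max_polls=${3:-60}
    last=-1
    polls=0
    while [ "$polls" -lt "$max_polls" ]; do
        [ -f "$file" ] || return 1
        size=$(( $(wc -c < "$file") ))
        if [ "$size" -eq "$last" ]; then
            return 0
        fi
        last=$size
        polls=$(( polls + 1 ))
        sleep "$interval"
    done
    return 1
}
```

It could be called with the name of the most recent javacore file before the next kill -3 JAVA_PID command is issued. Any completion messages written by the JVM to the process standard error output remain the more direct confirmation.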

Compaction on Generation

By default, when the Java diagnostic files are automatically generated, the JVM will perform a compaction of the Java object heap. Depending on the situation, the compaction may worsen the situation, thus impacting the ability to generate complete diagnostic files.

Consider disabling the default behavior of compaction of the Java object heap before the generation of the diagnostic files by using the following IBM Java -Xdump settings:

-Xdump:none
-Xdump:java+heap+snap:events=user,systhrow,throw,request=exclusive+prepwalk

-Xdump:system+java+snap:events=gpf,abort,traceassert,request=exclusive+prepwalk

Visit this technote for configuration details for your environment.

Generating Unnecessary Diagnostic Files

It is recommended to generate only the diagnostic files required to troubleshoot the current issue. Attempting to generate all of the diagnostic files or unnecessary files could impact the ability to generate complete files.

The first step is to identify if any custom Xdump options have been enabled. Execute the ps avwwwg command and identify the complete command line used for the Java process, then identify any modified Xdump options that may enable multiple or all diagnostic files (e.g., -Xdump:java+heap+core).

The next step is to remove any unnecessary options. For example,

a. If troubleshooting thread stacks, use the options:

-Xdump:none
-Xdump:java:events=user,throw,systhrow


b. If troubleshooting Java object heap and Java object issues, use the options:

-Xdump:none
-Xdump:java+heap:events=user,throw,systhrow


c. If troubleshooting aborts or GPFs, use the options:

-Xdump:none
-Xdump:java+snap+system:events=gpf,abort



Visit this technote for configuration details for your environment.
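
When reviewing the command line for the first step, it can help to see at a glance which dump types each -Xdump option names. The following is a minimal sketch; the function name list_dump_agents is hypothetical, and it only parses option text of the form shown in this section. It does not query the JVM.

```shell
# Hypothetical helper: for each -Xdump option on stdin (one per line), print
# the dump types it names, for example "-Xdump:java+heap:events=user"
# prints "java heap". "-Xdump:none" prints "none". Options whose first
# suboption is not followed by ":" or end of line are ignored.
list_dump_agents() {
    sed -n 's/^-Xdump:\([a-z+]*\)\(:.*\)\{0,1\}$/\1/p' | tr '+' ' '
}
```

For example, the option -Xdump:java+heap+core from the first step is reported as "java heap core", which shows that a single option requests three kinds of diagnostic files.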

Disabled Signal Handlers

The ability to generate the diagnostic files is provided by native code called signal handlers. Signal handlers are functions that wait for specific signals and take specified actions when those signals arrive, one of which is generating logs. When these signal handlers are disabled or overwritten (e.g., by other native code), the JVM is no longer capable of generating the diagnostic files, although some logs are still generated by the OS because some signals cannot be disabled.

Review the configuration of the application to confirm if any of these situations apply:

a. Confirm if the Java command line option -Xrs (which will disable the Java signal handlers) is enabled

b. Confirm if the process environment variable IBM_NOSIGHANDLER (which will disable the Java signal handlers) is enabled

c. Some applications install Java Native code (a.k.a., JNI code) that can alter the signal handlers. You will need to check with your application vendor for more details.

** Important:

Caution must be taken before you simply remove these settings. Confirm with the application developer or application vendor that it is supported to remove these options, otherwise unpredictable results may occur.


Visit this technote for configuration details for your environment.

Disabled Diagnostic Files

IBM Java provides several options for enabling and disabling the creation of diagnostic files in order to support the many customer configurations. In some situations, one or more types of diagnostic file may have been disabled for a past situation and then forgotten.

If diagnostic files are not being generated, review the configuration of the application to confirm if any of these situations apply:

a. Check if the process environment variable DISABLE_JAVADUMP (javacore files) has been enabled

b. Check if process environment variables such as IBM_HEAPDUMP and/or IBM_HEAP_DUMP (heapdump files) are set to FALSE or 0

c. Check if the Java command line option -Xdisablejavadump (javacore files) has been enabled

d. Check if Java command line options such as -Xdump:none (for all files), -Xdump:java:none (javacore files), -Xdump:heap:none (heapdump files), -Xdump:system:none (AIX process core files), -Xdump:snap:none (Java snap / trace files) have been enabled

e. Check if the Java command line option -Xdump:nofailover (prevents using alternative locations such as /tmp) has been enabled


Visit this technote for configuration details for your environment.

Alternate Locations

IBM Java provides several options for changing the default location where diagnostic data is generated. For some configurations, the default location may have been changed due to past situations and forgotten. When these options are used, these situations may occur:

- The new location was forgotten and the diagnostic files can not be identified
- The new location was removed or the permissions were changed to prevent the files from being created
- If the JVM is unable to use the new location, it will attempt to use alternative directories

If the diagnostic files are not generated or not found, review the configuration of the application and confirm if any of these options are used:

a. Check if the Java command line options -Xdump:java:file= (javacore files), -Xdump:heap:file= (heapdump files), -Xdump:system:file= (AIX process core files), -Xdump:snap:file= (Java snap / trace files) have been enabled

b. Check if the Java command line option -Xdump:directory has been enabled

c. Check if the process environment variable IBM_JAVACOREDIR (javacore directory) has been set

d. Check if the process environment variable IBM_HEAPDUMPDIR (heapdump directory) has been set

e. Check if the process environment variable IBM_COREDIR (AIX process core and Java snap/trace directory) has been set

f. Check if the process environment variable TMPDIR has been set

g. Check if the Java command line option -Xdump:nofailover has been enabled


Visit this technote for configuration details for your environment.
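
Once the candidate directories are known, they can be searched in one pass. The sketch below is illustrative and uses a hypothetical function name; the file name prefixes (javacore, heapdump, Snap, core) are the ones referred to in this document. It looks only at files directly inside each directory and skips arguments that are empty or are not directories.

```shell
# Print diagnostic files found directly inside each directory argument.
# Empty arguments and paths that are not directories are skipped.
find_dump_files() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        for f in "$dir"/javacore* "$dir"/heapdump* "$dir"/Snap* "$dir"/core*; do
            [ -f "$f" ] && printf '%s\n' "$f"
        done
    done
    return 0
}
```

Pass the values found for the options and variables above, for example the values of IBM_JAVACOREDIR, IBM_HEAPDUMPDIR, IBM_COREDIR, and TMPDIR taken from the Java process environment (not from the current shell), followed by /tmp.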

Additional Techniques

If the location of the diagnostic files still can not be determined, try these additional methods.

For the examples in this section, replace references to:

- JAVA_PID with the process id of the active process

a. Use the procwdx command to identify the current working directory, then look in that directory

# procwdx JAVA_PID

In order to run the procwdx command, you must be root user, user with system authority, or logged in as the owner of the process.

b. In cases when AIX core [dump] files are automatically generated, check the AIX system error log:

# errpt -a

-> Look for entries that start with "LABEL: CORE_DUMP"
-> Look for "CORE_DUMP" entries where "PROGRAM NAME" shows java
-> The location of the diagnostic file (core) will be listed in the "CORE FILE NAME" section of the matching entries
-> Look in the directory identified for other diagnostic files


c. Use the truss command to identify the directory that the process is attempting to save the diagnostic files:

# cd /tmp
# truss -dealfo java-truss.out -p JAVA_PID &
# TRUSS_PID="${!}"
# kill -3 JAVA_PID
# sleep 3
# kill -9 ${TRUSS_PID}
# egrep "/javacore|/heapdump|/Snap|/core" java-truss.out | egrep "statx|open"

-> Look for references to paths containing references to any of the diagnostic files.
-> If you do not see any reference, then the generation of the diagnostic files may have been disabled


In order to run the truss command, you must be root user, user with system authority, or logged in as the owner of the process.
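
For technique b, the core file locations can be pulled out of a long error log report with a short filter. This sketch assumes that each core file path appears on the line immediately following a line that begins with CORE FILE NAME; that layout is an assumption about the errpt -a report and should be checked against the actual output on your system. The function name is hypothetical.

```shell
# Print the line that follows each "CORE FILE NAME" heading in the text
# read from stdin.
# Assumption: the report places the core file path on the next line.
core_file_names() {
    awk '/^CORE FILE NAME/ { if ((getline line) > 0) print line }'
}
```

For example:

# errpt -a | core_file_names | sort -u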

Known Issues

There may be known fixes for truncated and/or corrupted diagnostic files being generated.

Prior to contacting support, please visit the web page:

IBM Java for AIX Reference: Consolidated List of IBM Java Fixes

which contains links to web pages of known fixes for each version of IBM Java for AIX.
Select the link for the version of Java being used, then perform a browser text search (e.g., ctrl+f key strokes) looking for words such as:

- truncate
- corrupted
- javacore
- heapdump
- zero
- core
- Xdump


to identify any known issue and the Java release (update) that contains the fixes for truncated, missing, or corrupted diagnostic files. Then apply the more recent updates containing the fixes to the Java installation being used following the instructions on the download, installation, and upgrade web page.

Troubleshoot and Support

If, after all of the above items have been checked and corrected, the diagnostic files continue to be truncated, missing, or corrupted, please follow these steps for further assistance (note: all steps must be completed, otherwise there may be delays in the resolution of your situation).

For the examples in this section, replace references to:

- JAVA_PID with the process id of the active process
- JAVA_HOME with the parent directory of the Java installation being used for the application
- LARGE_TMP_DIR with an existing directory with adequate free space (e.g., 6GB - 10GB (or higher depending on your process size))

Step 1. Collect data

From a command prompt while logged in as the root user, execute the following commands:

# mkdir -p /LARGE_TMP_DIR/data
# cd /LARGE_TMP_DIR/data
# truss -dealfo truss.out -p JAVA_PID &
# TRUSS_PID="${!}"
# sleep 2
# kill -3 JAVA_PID
# sleep 5
# kill -9 ${TRUSS_PID}
# trace -a -d -n -l -T268435184 -L536870368 -p -r PURR -o "${PWD}/trace.raw"
# trcon
# sleep 2
# kill -3 JAVA_PID
# sleep 5
# trcstop
# trcrpt -r -o trace.rpt trace.raw
# trcnm > trcnm.out
# gennames > gennames.out
# cp /etc/trcfmt trace.fmt
# LDR_CNTRL=MAXDATA=0xB0000000@DSA gensyms > gensyms.out
# trcrpt -C all -n gennames.out -o trace.txt -O exec=on,timestamp=1,cpuid=on,pid=on,svc=on,tid=on -t trace.fmt trace.raw
# gencore JAVA_PID ../core.dmp
# JAVA_HOME/jre/bin/jextract -J-Xms1024M -J-Xmx1024M ../core.dmp core.dmp.Z
# JAVA_HOME/jre/bin/java -version > java-version.out 2>&1

# ps avwwwwg > ps-avwg.out 2>&1
# ps ewwwww > ps-ew.out 2>&1


Step 2. Package the data

# cd /LARGE_TMP_DIR/
# tar -cf - data | gzip -c > $(hostname)-data.tgz
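
Because this document is about truncated files, it may be worth confirming that the package itself is complete before uploading it. The following is a minimal sketch with a hypothetical function name; it tests the gzip stream, counts the archive members, and prints a checksum that can be compared after the upload.

```shell
# Verify that the gzip-compressed tar archive $1 is readable end to end,
# then print its member count and its checksum. Exit 1 if gzip reports
# that the file is damaged.
verify_archive() {
    gzip -t "$1" || return 1
    members=$(( $(gzip -dc "$1" | tar -tf - | wc -l) ))
    echo "members: $members"
    cksum "$1"
}
```

For example: verify_archive $(hostname)-data.tgz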


Step 3. Open a new IBM service request (a.k.a., open a new PMR)

Step 4. Upload the data

Once the service request has been opened, upload the data using these instructions to the new support call. The IBM AIX Java team will provide further assistance to diagnose the cause of the truncated, missing, and/or corrupted diagnostic files.


Document Type: Technical Document
Content Type: General
Hardware: all Power
Operating System: all AIX Versions
IBM Java: all Java Versions
Author(s): Roger Leuckie
Reviewer(s): NA