A. Java heap Out Of Memory
Check the SIGINFO line in the generated javacore file to confirm the java heap "Out Of Memory" error:
# grep -i siginfo START_PATH/javacore*
Output should be similar to:
1TISIGINFO Dump Event "systhrow" (00040000) Detail
"java/lang/OutOfMemoryError" "Java heap space" received.
Refer to the "IBM Java for AIX MustGather: Data collection procedure for Java heap issues" technote for instructions on collecting data for a Java heap Out Of Memory issue:
http://www-01.ibm.com/support/docview.wss?uid=isg3T1022801
For any other SIGINFO output, such as the examples below, which indicate a Java native memory issue, continue with the rest of the instructions:
....... :"Failed to fork OS thread" received.
....... :"Unable to create native thread" received
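The check above can be scripted when several javacore files are present. The following is a minimal sketch, not part of the documented procedure; the function name classify_javacore is hypothetical, and it assumes the SIGINFO record sits on a single line of the javacore file.

```shell
# classify_javacore FILE (hypothetical helper): read the SIGINFO line of a
# javacore and print "heap" for a Java heap OutOfMemoryError, "native" for
# the two native memory messages quoted above, and "other" for anything else.
classify_javacore() {
    siginfo=$(grep -i siginfo "$1" 2>/dev/null)
    case $siginfo in
        *"Java heap space"*)
            echo heap ;;
        *"Failed to fork OS thread"*|*"Unable to create native thread"*)
            echo native ;;
        *)
            echo other ;;
    esac
}

# Example: classify every javacore in START_PATH.
# for f in START_PATH/javacore*; do echo "$f: $(classify_javacore "$f")"; done
```

A result of "heap" points to the Java heap technote; "native" means continuing with the steps below.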
B. Native memory leak
After the application starts and reaches a stable point, at the time of high native memory usage, check 20 iterations of the "svmon" command output for a pattern of continuous increase in the "Inuse" column value of the native memory segments:
# svmon -P JAVA_PID -O \
segment=category,filtercat=exclusive,filtertype=working \
-i 300 20 | grep -v mmap
B. Confirm native memory leak:
To collect svmon command output for Java process PID 11993326 every 10 minutes for 20 iterations, execute the command:
# svmon -P 11993326 -O \
segment=category,filtercat=exclusive,filtertype=working \
-i 600 20 | grep -v mmap
Output will be similar to the following. Check the "Inuse" column for a pattern of increasing memory usage:
Unit: page
-------------------------------------------------------------------------------
Pid Command Inuse Pin Pgsp Virtual
11993326 java 4332 3 0 4332
...............................................................................
EXCLUSIVE segments Inuse Pin Pgsp Virtual
4332 3 0 4332
Vsid Esid Type Description PSize Inuse Pin Pgsp Virtual
9721d7 3 work working storage s 4068 0 0 4068
812261 f work working storage s 242 0 0 242
862246 2 work process private s 22 3 0 22
For a moderate rate of memory growth, specify an interval of 300 seconds, to collect svmon output every 5 minutes for 20 iterations.
Based on the rate of memory growth, increase or decrease the interval for "svmon" command execution.
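If the svmon output is saved to a file, the "Inuse" pattern can be summarized with a short script. This is a sketch under assumptions: the helper name inuse_trend is hypothetical, and it relies on the process summary line having the layout shown in the sample output (PID, Command, Inuse, Pin, Pgsp, Virtual).

```shell
# inuse_trend PID FILE (hypothetical helper): print the process-level Inuse
# value from every svmon iteration saved in FILE, then a one-line verdict.
# Assumes the summary line layout "PID Command Inuse Pin Pgsp Virtual".
inuse_trend() {
    awk -v pid="$1" '
        $1 == pid && NF >= 6 {
            v = $3 + 0
            print v
            if (n == 0) first = v
            else if (v < prev) dropped = 1
            prev = v
            n++
        }
        END {
            if (n < 2) print "not enough samples"
            else if (!dropped && prev > first) print "increasing"
            else print "not steadily increasing"
        }' "$2"
}

# Example: inuse_trend 11993326 svmon.out
```

The verdict is only a first indication; the per-segment "Inuse" values still need to be reviewed by eye.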
Before proceeding with the setup and data collection instructions to collect complete diagnostic data for analysis, click Step 6. Upload Data for instructions to upload any existing logs and data already collected.
To prepare for these data collection procedures, the process environment needs to be configured to save the additional debug information to a log file.
A. Set the user ulimits and the system attribute:
From a command prompt, while logged in as the root user, execute the following commands to:
{ set file, data, core file ulimit sizes to unlimited }
# chuser fsize=-1 data=-1 core=-1 USERID
{ set the fullcore system attribute to true }
# chdev -l sys0 -a fullcore=true
B. Redirect or save standard error (stderr) messages to a file.
Commonly used application servers may already save standard out and standard error messages to a log file (e.g., SystemOut.log, native_stdout.log, SystemErr.log, native_stderr.log) or to the application log file.
For custom applications, redirect the standard error messages by appending "2>LOG_FILE", or redirect both stdout and stderr to a file by appending ">LOG_FILE 2>&1".
C. Perform the following actions in order for the changes to take effect:
- Stop the application
- Log in again as the "USERID" used in Step 2.A
- Confirm that full core is enabled and the new ulimits are in effect by executing the commands:
# ulimit -a
# lsattr -El sys0 | grep -i fullcore
- Do not restart the application until Step 3. Configure has been completed.
A. Enabling debug options will result in additional data being stored in memory buffers and written to application logs. The process file, data, and core file size limits should be increased during the data collection to ensure the data is complete.
If there are multiple processes executed by multiple user ids experiencing the issue, then all preparation steps must be repeated for each id and process.
To confirm the process environment is configured correctly, login using the "USERID" specified in the steps, then run the command:
# ulimit -a
The values for "file", "data" and "coredump" should show:
file(blocks) unlimited
data(kbytes) unlimited
coredump(blocks) unlimited
Confirm "fullcore" system attribute is set to true:
# lsattr -El sys0 | grep -i fullcore
fullcore true Enable full CORE dump True
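The ulimit check can also be automated. The sketch below is illustrative only: check_limits is a hypothetical name, and it assumes "ulimit -a" prints lines in the AIX format shown above, such as "file(blocks) unlimited".

```shell
# check_limits (hypothetical helper): read "ulimit -a" output on stdin and
# succeed only if the file, data and coredump limits are all "unlimited".
# Assumes the AIX line format shown above, e.g. "file(blocks) unlimited".
check_limits() {
    awk '
        /^(file|data|coredump)\(/ {
            seen++
            if ($NF != "unlimited") { print "NOT unlimited: " $0; bad = 1 }
        }
        END {
            if (seen < 3) { print "missing file/data/coredump lines"; exit 1 }
            exit bad
        }'
}

# Example, run as USERID after logging in again:
# ulimit -a | check_limits && echo "limits OK"
```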
B. As an example, to save standard error messages to the file /tmp/stderr.log, use a command line syntax similar to:
# java YOUR_APP 2>/tmp/stderr.log
To save both standard out and standard error messages to the file /tmp/out.log, specify:
# java YOUR_APP > /tmp/out.log 2>&1
To confirm the messages are being redirected to the log file, view the contents of the log file.
A. Set environment variables:
Note: The environment variables set on the AIX command line in the current USERID session apply to all the processes started by USERID. To limit them to a specific process, set the environment variables in the startup/profile script for the specific Java application.
For the 32-bit java process, logged in as USERID, set the "LDR_CNTRL=MAXDATA" environment variable to increase the total native memory of the java process:
# export LDR_CNTRL=MAXDATA=0xB0000000@DSA
If the svmon command output in "Step 1.B" confirms a pattern of increasing native memory usage, set the environment variable:
# export MALLOCDEBUG=log:extended,stack_depth:12
B. Execute vmstat command, preferably before the application is started, and let it run continuously:
# vmstat -It 2 > vmstat.out &
C. Add options to the java command line:
Generate verbose GC logging:
-verbose:gc
By default, verbose GC logging is written to stderr.
If unable or unsure how to redirect standard error for the verbose GC data, specify a log file name:
-Xverbosegclog:SPECIFIC_PATH/gc.log
Generate AIX core file, javacore and snap trace for Out Of Memory issues:
-Xdump:system+java+snap:events=systhrow,filter=java/lang/OutOfMemoryError,range=1..3
Enable manual generation of javacore, snap trace for native memory leak issues:
-Xdump:java+snap:events=user
To specify a directory to write the generated logs to instead of the default directory of START_PATH, specify:
-Xdump:directory=SPECIFIC_PATH
D. Restart the java application (e.g., node agent/manager) from the USERID new login session.
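The options from Step C can be kept together in the application's startup script so that none is missed. The following is a minimal sketch; the function name java_diag_opts and the directory /large_fs/diag are placeholders, and the option strings are the ones listed above.

```shell
# java_diag_opts DIR (hypothetical helper): print the diagnostic options from
# Step C, one per line, with the GC log and dumps written under DIR.
java_diag_opts() {
    dir=$1
    printf '%s\n' \
        "-verbose:gc" \
        "-Xverbosegclog:$dir/gc.log" \
        "-Xdump:system+java+snap:events=systhrow,filter=java/lang/OutOfMemoryError,range=1..3" \
        "-Xdump:java+snap:events=user" \
        "-Xdump:directory=$dir"
}

# Example startup script body (DIR must not contain spaces, because the
# unquoted substitution is split into separate arguments):
# export MALLOCDEBUG=log:extended,stack_depth:12   # only if the leak is confirmed
# java $(java_diag_opts /large_fs/diag) YOUR_APP 2>/tmp/stderr.log
```

Setting the variables inside the startup script, rather than in the login session, limits them to the one Java process, as the note in Step A recommends.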
A. Execute the following commands to collect the required diagnostic data.
# mkdir -p /TMP_PATH/MM-DD/memory_issues/data
# cd /TMP_PATH/MM-DD/memory_issues/data
B. Collect data for native memory leak
1. If "Step 1.B" indicates a native memory leak, collect svmon data after the java process starts, stabilizes and grows to a few segments, but before the native memory is exhausted:
# date >> svmon.out
# svmon -G >> svmon.out
# echo BEGIN >> svmon.out
# svmon -P JAVA_PID -O \
segment=category,filtercat=exclusive,filtertype=working \
-i 20 | grep -v mmap >> svmon.out
# echo END >> svmon.out
# date >> svmon.out
# svmon -G >> svmon.out
2. Collect two AIX core files manually, preferably when the memory growth is seen in different segments, leaving an interval of 10 minutes between the two commands:
# gencore JAVA_PID core.001
Generate another core file after the memory growth continues in a new segment:
# gencore JAVA_PID core.002
Generate javacore and snap traces:
# kill -3 JAVA_PID
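The manual sequence in item 2 can be wrapped in a small script and previewed before it is run against the live process. This is a sketch only: run and collect_cores are hypothetical names, and the fixed wait is a simplification, since the second core file is best taken once growth is seen in a new segment.

```shell
# run CMD [ARGS...]: execute a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# collect_cores PID [GAP_SECONDS] (hypothetical helper): generate two core
# files GAP_SECONDS apart (default 600, the 10-minute interval above), then
# request a javacore and snap trace with kill -3.
collect_cores() {
    pid=$1
    gap=${2:-600}
    run gencore "$pid" core.001
    run sleep "$gap"
    run gencore "$pid" core.002
    run kill -3 "$pid"
}

# Preview the commands without touching the process:
# DRY_RUN=1 collect_cores JAVA_PID
```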
C. Collect libraries associated with core file:
# JAVA_PATH/jre/bin/jextract START_PATH/core.001
If the above command reports any errors, execute the command:
# snapcore -d START_PATH/core.001 START_PATH/java
For automatically generated core files:
# JAVA_PATH/jre/bin/jextract START_PATH/core*.dmp
or
# snapcore -d START_PATH/core.001 START_PATH/java
D. Collect output of AIX commands from the same AIX LPAR:
# errpt -a > errpt-a.out 2>&1
# oslevel -s > oslevel-s.out 2>&1
# prtconf > prtconf.out 2>&1
# lsps -a > lsps-a.out 2>&1
# lslpp -hac > lslpp-hac.out 2>&1
# instfix -i > instfix-i.out 2>&1
# emgr -lv3 > emgr-lv3.out 2>&1
# ps avwwwg > ps-avwwwg.out 2>&1
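The same redirections can be driven from one helper so that a missing command does not stop the collection. A minimal sketch; save_cmd is a hypothetical name, and the commands passed to it are the ones listed above.

```shell
# save_cmd NAME CMD [ARGS...] (hypothetical helper): write stdout and stderr
# of CMD to NAME.out; record a note instead of aborting when CMD is not
# installed or exits with an error.
save_cmd() {
    name=$1
    shift
    if command -v "$1" >/dev/null 2>&1; then
        "$@" > "$name.out" 2>&1 || echo "exit status $?" >> "$name.out"
    else
        echo "skipped: $1 not found" > "$name.out"
    fi
}

# Example, run from the data directory:
# save_cmd errpt-a errpt -a
# save_cmd oslevel-s oslevel -s
# save_cmd lsps-a lsps -a
```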
E. Copy the generated files to the /TMP_PATH/MM-DD/memory_issues/data directory created in "Step 4.A".
# cp /START_PATH/core* ./
# cp /START_PATH/javacore* ./
# cp /START_PATH/Snap*trc ./
and one of:
# cp /START_PATH/core*zip ./
# cp /START_PATH/snap*txt ./
Also copy standard error, standard output, SystemOut, SystemErr, gc.log, application logs and any other logs generated.
A. Examples of commands to be executed:
# mkdir -p /large_fs/01-31/memory_issues/data
# cd /large_fs/01-31/memory_issues/data
C. Determine the default directory "START_PATH" for the generated logs:
# ps -ef | grep -i java
rt 3211380 1 0 May 31 - 1109:46 java -Dsrse_property=/rt/pmr/test/classloader/JvmTest-dir/j_test/etc/aib.ini....
# kill -3 3211380
If not sure of the default directory "START_PATH", execute command:
# procwdx 3211380
3211380: /rt/pmr/test/
Check for the generated javacore in the above directory:
# cd /rt/pmr/test
# ls -l *javacore*
-rw-r--r-- 1 rtstaff 234048 Aug 25 10:54 javacore.20150825.105402.3211380.0010.txt
E. Confirm all files and directories have been saved to the data directory:
# cp /var/myapp/*.log ./
# ls javacore.*.txt core* Snap*
# ls *.log *.out *.txt
** MANDATORY **
Prior to packaging and uploading, confirm that the following files have been saved in the "/TMP_PATH/MM-DD/memory_issues/data" directory:
a. Javacore files
b. AIX core files
c. libraries associated with core file
d. Snap traces
e. verbose GC log
f. standard error, standard output, SystemOut, SystemErr, and application logs and any other logs generated.
g. svmon, vmstat and all other AIX commands output
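Part of this checklist can be verified with a script before packaging. The sketch below is illustrative: check_data_dir is a hypothetical name and the file name patterns are assumptions based on the names used in this procedure, so adjust them to the files actually generated. It does not check item c separately.

```shell
# check_data_dir DIR (hypothetical helper): report which checklist patterns
# have at least one matching file in DIR; fail if any pattern has no match.
check_data_dir() {
    dir=$1
    missing=0
    for pat in 'javacore*' 'core*' 'Snap*' '*.log' '*.out'; do
        set -- "$dir"/$pat          # expand the pattern inside DIR
        if [ -e "$1" ]; then
            echo "ok       $pat"
        else
            echo "MISSING  $pat"
            missing=1
        fi
    done
    return $missing
}

# Example: check_data_dir /TMP_PATH/MM-DD/memory_issues/data
```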
Packaging the files may simplify the upload of the diagnostic data collected. From the command line, and while logged in as the root user, execute the commands:
# cd /TMP_PATH/MM-DD/memory_issues
# tar -cf - data | gzip -c > PMR.MM-DD.tgz
A. Examples of commands to be executed:
# cd /large_fs/01-31/memory_issues
# tar -cf - data | gzip -c > 12345.678.000.01-31.tgz
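Listing the archive after it is created confirms that it is readable before it is uploaded. A minimal sketch, assuming the "data" directory layout used above; pack_data is a hypothetical name, and the archive name is taken relative to the parent directory.

```shell
# pack_data PARENT_DIR ARCHIVE (hypothetical helper): archive PARENT_DIR/data
# into PARENT_DIR/ARCHIVE with tar and gzip, then list the archive contents
# to confirm it can be read back.
pack_data() {
    (
        cd "$1" || exit 1
        tar -cf - data | gzip -c > "$2" || exit 1
        gzip -dc "$2" | tar -tf -
    )
}

# Example:
# pack_data /large_fs/01-31/memory_issues 12345.678.000.01-31.tgz
```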
Upload the packaged data or individual files to the IBM secured server using one of the upload options provided in the "IBM Java for AIX MustGather: How to upload diagnostic data and testcases to IBM" web page:
http://www-01.ibm.com/support/docview.wss?uid=isg3T1022619
If this step was reached by clicking "Step 6. Upload Data" in Step 2, then once the existing logs and data have been uploaded, click Step 2. Prepare Environment to continue with the data collection instructions.