Troubleshooting the node

Actions to troubleshoot the Platform Analytics node

Change the default log level of your log files

Change the default log level if your log files contain too little or too much detail for your needs.

  1. If you are logged into a UNIX host, source the LSF environment.
    • For csh or tcsh: source LSF_TOP/conf/cshrc.lsf

    • For sh, ksh, or bash: . LSF_TOP/conf/profile.lsf

  2. If you are logged into a UNIX host, source the PERF environment.
    • For csh or tcsh: source $PERF_TOP/conf/cshrc.perf

    • For sh, ksh, or bash: . $PERF_TOP/conf/profile.perf

  3. Edit the log4j.properties file.

    This file is located in the PERF configuration directory:

    • UNIX: $PERF_CONFDIR

    • Windows: %PERF_CONFDIR%

  4. Navigate to the section representing the service you want to change, or to the default loader configuration if you want to change the log level of the data loaders, and look for the *.logger.* variable.

    For example, to change the log level of the loader controller log files, navigate to the following section, which is set to the default INFO level:

    # Loader controller ("plc") configuration
    log4j.logger.com.platform.perf.dataloader=INFO com.platform.perf.dataloader
  5. Change the *.logger.* variable to the new logging level.

    In decreasing level of detail, the valid values are ALL (for all messages), DEBUG, INFO, WARN, ERROR, FATAL, and OFF (for no messages). The services or data loaders only log messages of the same or lower level of detail as specified by the *.logger.* variable. Therefore, if you change the log level to ERROR, the service or data loaders will only log ERROR and FATAL messages.

    For example, to change the loader controller log files to the ERROR log level:

    # Loader controller ("plc") configuration
    log4j.logger.com.platform.perf.dataloader=ERROR com.platform.perf.dataloader
  6. Restart the service that you changed (or the loader controller if you changed the data loader log level).
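
Steps 4 and 5 can also be scripted. The following is a minimal sketch and is not part of the product: set_log_level is a hypothetical helper that rewrites only the level token of one log4j.logger.* line and leaves the rest of the line unchanged. It assumes a POSIX shell and sed, and that the level appears immediately after the equal sign, as in the example above.

```shell
# Hypothetical helper (not shipped with PERF): change the level of one logger.
# Usage: set_log_level FILE LOGGER_NAME NEW_LEVEL
set_log_level() {
    file=$1
    logger=$2
    level=$3
    # Escape the dots so that sed matches the logger name literally.
    pattern=$(printf '%s' "$logger" | sed 's/\./\\./g')
    tmp="$file.tmp.$$"
    # Replace only the level token that follows the equal sign, then copy the
    # result back so that the original file keeps its owner and permissions.
    sed "s/^\(log4j\.logger\.$pattern=\)[A-Z]*/\1$level/" "$file" > "$tmp" &&
        cat "$tmp" > "$file" && rm -f "$tmp"
}
```

For example, set_log_level "$PERF_CONFDIR/log4j.properties" com.platform.perf.dataloader ERROR would apply the change shown in step 5. Back up the file first, and restart the affected service afterward as described in step 6.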

Disable data collection for individual data loaders

To prevent unwanted data from being logged in the database, disable data collection for individual data loaders.

  1. If you are logged into a UNIX host, source the LSF environment.
    • For csh or tcsh: source LSF_TOP/conf/cshrc.lsf

    • For sh, ksh, or bash: . LSF_TOP/conf/profile.lsf

  2. If you are logged into a UNIX host, source the PERF environment.
    • For csh or tcsh: source $PERF_TOP/conf/cshrc.perf

    • For sh, ksh, or bash: . $PERF_TOP/conf/profile.perf

  3. Edit the plc configuration files for your data loaders.
    • For host-related data loaders, edit plc_ego.xml and plc_coreutil.xml.

    • For job-related data loaders (LSF data loaders), edit plc_lsf.xml and plc_bjobs-sp012.xml.

    • For advanced job-related data loaders (advanced LSF data loaders), edit plc_lsf_advanced_data.xml.

    • For license-related data loaders (FLEXnet data loaders), edit plc_license.xml.

    These files are located in the LSF environment directory:

    • UNIX: $LSF_ENVDIR

    • Windows: %LSF_ENVDIR%

  4. Navigate to the specific <DataLoader> tag with the Name attribute matching the data loader that you want to disable.
    For example:
    <DataLoader Name="hostgrouploader" ... Enable="true" .../>
  5. Edit the Enable attribute to "false".
    For example, to disable data collection for this data loader:
    <DataLoader Name="hostgrouploader" ... Enable="false" ... />
  6. Restart the plc service.
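
Steps 4 and 5 can be sketched as a small shell function. This is an illustration only: disable_loader is a hypothetical helper, and it assumes that the whole <DataLoader .../> tag sits on a single line and that the attribute is written exactly as Enable="true". If your plc configuration files wrap the tag across lines, edit them by hand instead.

```shell
# Hypothetical helper (not shipped with PERF): set Enable="false" for one loader.
# Assumes the whole <DataLoader .../> tag sits on a single line.
# Usage: disable_loader PLC_XML_FILE LOADER_NAME
disable_loader() {
    file=$1
    name=$2
    tmp="$file.tmp.$$"
    # Only lines that hold the named DataLoader tag are modified.
    sed "/<DataLoader[^>]*Name=\"$name\"/s/Enable=\"true\"/Enable=\"false\"/" \
        "$file" > "$tmp" && cat "$tmp" > "$file" && rm -f "$tmp"
}
```

For example, disable_loader "$LSF_ENVDIR/plc_ego.xml" hostgrouploader, assuming that hostgrouploader is defined in that file. Keep a backup of the XML file, and restart the plc service afterward.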

Check the status of the loader controller

  1. If you are logged into a UNIX host, source the LSF environment.
    • For csh or tcsh: source LSF_TOP/conf/cshrc.lsf

    • For sh, ksh, or bash: . LSF_TOP/conf/profile.lsf

  2. If you are logged into a UNIX host, source the PERF environment.
    • For csh or tcsh: source $PERF_TOP/conf/cshrc.perf

    • For sh, ksh, or bash: . $PERF_TOP/conf/profile.perf

  3. Navigate to the PERF binary directory.
    • UNIX: cd $PERF_TOP/version_number/bin

    • Windows: cd %PERF_TOP%\version_number\bin

  4. View the status of the loader controller (plc) and other PERF services.

    perfadmin list

  5. Verify that there are no errors in the loader controller log file.

    The loader controller log file is located in the log directory:

    • UNIX: $PERF_LOGDIR

    • Windows: %PERF_LOGDIR%
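
On UNIX, the log check in step 5 can be approximated with grep. The sketch below is not a product command: count_log_errors is a hypothetical helper, and it assumes that the log4j level names ERROR and FATAL appear as plain text in the log lines.

```shell
# Hypothetical helper: count the lines that mention ERROR or FATAL in a log file.
# Assumes the log4j level name appears as plain text in each log line.
# Usage: count_log_errors LOG_FILE
count_log_errors() {
    # grep -c prints the number of matching lines (and exits 1 if it is 0).
    grep -c -E 'ERROR|FATAL' "$1"
}
```

A result of 0 suggests that no errors were logged. A match can also be the word ERROR inside an ordinary message, so read the matching lines before drawing conclusions.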

Check the status of the data loaders

  1. If you are logged into a UNIX host, source the LSF environment.
    • For csh or tcsh: source LSF_TOP/conf/cshrc.lsf

    • For sh, ksh, or bash: . LSF_TOP/conf/profile.lsf

  2. If you are logged into a UNIX host, source the PERF environment.
    • For csh or tcsh: source $PERF_TOP/conf/cshrc.perf

    • For sh, ksh, or bash: . $PERF_TOP/conf/profile.perf

  3. Verify that there are no errors in the LSF data loader log files.

    The data loader log files (data_loader_name.log.host_name) are located in the dataloader subdirectory of the log directory:

    • UNIX: $PERF_LOGDIR/dataloader

    • Windows: %PERF_LOGDIR%\dataloader
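
Because each data loader writes its own log file, it can help to list only the files that contain problems. The following sketch assumes the data_loader_name.log.host_name naming described above and that ERROR and FATAL appear as plain text in the log lines. The function name loader_logs_with_errors is hypothetical.

```shell
# Hypothetical helper: list data loader log files that mention ERROR or FATAL.
# Usage: loader_logs_with_errors LOG_DIRECTORY
loader_logs_with_errors() {
    for f in "$1"/*.log.*; do
        # Skip the unexpanded pattern when no file matches.
        [ -f "$f" ] || continue
        # grep -l prints the file name if at least one line matches.
        grep -l -E 'ERROR|FATAL' "$f"
    done
    return 0
}
```

For example, loader_logs_with_errors "$PERF_LOGDIR/dataloader" prints one path per affected log file and prints nothing when no file matches.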

Check the status of the Platform Analytics node database connection

  1. If you are logged into a UNIX host, source the LSF environment.
    • For csh or tcsh: source LSF_TOP/conf/cshrc.lsf

    • For sh, ksh, or bash: . LSF_TOP/conf/profile.lsf

  2. If you are logged into a UNIX host, source the PERF environment.
    • For csh or tcsh: source $PERF_TOP/conf/cshrc.perf

    • For sh, ksh, or bash: . $PERF_TOP/conf/profile.perf

  3. Navigate to the PERF binary directory.
    • UNIX: cd $PERF_TOP/version_number/bin

    • Windows: cd %PERF_TOP%\version_number\bin

  4. View the status of the node database connection.
    • UNIX: dbconfig.sh

    • Windows: dbconfig

Check core dump on the Platform Analytics node

Check and enable core dumps on the following operating systems.

Core dump on Linux

  1. If you are logged into a UNIX host, source the LSF environment.
    • For csh or tcsh: source LSF_TOP/conf/cshrc.lsf

    • For sh or bash: . LSF_TOP/conf/profile.lsf

  2. If you are logged into a UNIX host, source the PERF environment.
    • For csh or tcsh: source $PERF_TOP/conf/cshrc.perf

    • For sh or bash: . $PERF_TOP/conf/profile.perf

  3. Check if core dump is enabled.
    • For csh or tcsh: limit coredumpsize

    • For sh or bash: ulimit -c

    If the reported limit is 0, core dumps are disabled.

  4. Enable core dump.
    • For csh or tcsh: limit coredumpsize unlimited

    • For sh or bash: ulimit -c unlimited

  5. Restart the loader controller to apply your changes.

    perfadmin stop all

    perfadmin start all

  6. Collect the stack trace from the node host.
    • Source the environment variables

    • Use gdb to load the core file.

      gdb ${JAVA_HOME}/bin/java core_file

      where core_file is the core dump file generated by the Analytics node.

    • Print the stack trace: bt

  7. Collect the following output to verify that the environment is set up correctly.

    For environment variables: env

    For csh or tcsh: limit

    For sh or bash: ulimit -a

    For the installed glibc rpm packages: rpm -qa | grep glibc
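
For sh or bash, the core dump check and the stack trace collection can be sketched as follows. Both function names are hypothetical. The first function only interprets the value that ulimit -c prints. The second assumes that gdb is installed and that JAVA_HOME is set, and it uses the gdb options -batch and -ex to run bt without an interactive session.

```shell
# Hypothetical helper: interpret the value printed by "ulimit -c" (sh or bash).
# Usage: describe_core_limit "$(ulimit -c)"
describe_core_limit() {
    case "$1" in
        0)         echo "disabled" ;;
        unlimited) echo "enabled (unlimited)" ;;
        *)         echo "enabled (limited to $1 blocks)" ;;
    esac
}

# Hypothetical helper: write the backtrace of a core file to a text file.
# Usage: save_backtrace CORE_FILE OUTPUT_FILE   (requires gdb and JAVA_HOME)
save_backtrace() {
    # -batch makes gdb exit after the commands given with -ex have run.
    gdb -batch -ex bt "${JAVA_HOME}/bin/java" "$1" > "$2" 2>&1
}
```

Saving the backtrace to a file makes it easier to attach the output to a support request.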

Core dump on Solaris

  1. If you are logged into a UNIX host, source the LSF environment.
    • For csh or tcsh: source LSF_TOP/conf/cshrc.lsf

    • For sh or bash: . LSF_TOP/conf/profile.lsf

  2. If you are logged into a UNIX host, source the PERF environment.
    • For csh or tcsh: source $PERF_TOP/conf/cshrc.perf

    • For sh or bash: . $PERF_TOP/conf/profile.perf

  3. Check if core dump is enabled.
    • For csh or tcsh: limit coredumpsize

    • For sh or bash: ulimit -c

    If the reported limit is 0, core dumps are disabled.

  4. Enable core dump.
    • For csh or tcsh: limit coredumpsize unlimited

    • For sh or bash: ulimit -c unlimited

  5. Restart the loader controller to apply your changes.

    perfadmin stop all

    perfadmin start all

  6. Collect the stack trace from the node host.

    /usr/proc/bin/pstack core_file >pstack.out

    /usr/proc/bin/pmap core_file >pmap.out

    /usr/proc/bin/pldd core_file >pldd.out

    where core_file is the core dump file generated by the Analytics node.

  7. Use dbx to collect the stack trace (recommended).
    • Source the environment variables

    • Use dbx to load the core file.

      dbx ${JAVA_HOME}/bin/java core_file

    • Print the stack trace: where

  8. Collect the following output to verify that the environment is set up correctly.

    For environment variables: env

    For csh or tcsh: limit

    For sh or bash: ulimit -a

    For patches currently installed: showrev -p

    For detailed information about the packages installed on a system: pkginfo -l

Core dump on AIX and HP-UX

  1. If you are logged into a UNIX host, source the LSF environment.
    • For csh or tcsh: source LSF_TOP/conf/cshrc.lsf

    • For sh or bash: . LSF_TOP/conf/profile.lsf

  2. If you are logged into a UNIX host, source the PERF environment.
    • For csh or tcsh: source $PERF_TOP/conf/cshrc.perf

    • For sh or bash: . $PERF_TOP/conf/profile.perf

  3. Check if core dump is enabled.
    • For csh or tcsh: limit coredumpsize

    • For sh or bash: ulimit -c

    If the reported limit is 0, core dumps are disabled.

  4. Enable core dump.
    • For csh or tcsh: limit coredumpsize unlimited

    • For sh or bash: ulimit -c unlimited

  5. Restart the loader controller to apply your changes.

    perfadmin stop all

    perfadmin start all

  6. Use dbx to collect the stack trace (recommended).
    • Source the environment variables

    • Use dbx to load the core file.

      dbx ${JAVA_HOME}/bin/java core_file

      where core_file is the core dump file generated by the Analytics node.

    • Print the stack trace: where

  7. Collect the following output to verify that the environment is set up correctly.

    For environment variables: env

    For csh or tcsh: limit

    For sh or bash: ulimit -a

    For release number of the OS: uname -a
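
The diagnostic commands in the last step can be captured in a single file for later review. The sketch below covers the sh or bash variants only, and the function name collect_diagnostics is hypothetical.

```shell
# Hypothetical helper: save the diagnostic output from the last step to a file.
# Usage: collect_diagnostics OUTPUT_FILE   (sh or bash)
collect_diagnostics() {
    {
        echo "== env ==";       env
        echo "== ulimit -a =="; ulimit -a
        echo "== uname -a ==";  uname -a
    } > "$1" 2>&1
}
```

Run the function in the same session in which you sourced the LSF and PERF environments, so that the captured variables match what the loader controller sees.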

Debug LSF API

Enable debugging for the LSF API.

  1. Set the following environment variables for the current session.
    • For sh or bash:

      export LSF_DEBUG_CMD="LC_EXEC LC_COMM LC_TRACE"

      export LSF_CMD_LOG_MASK=LOG_DEBUG3

      export LSF_CMD_LOGDIR="log_path"

      export LSB_DEBUG_CMD="LC_EXEC LC_COMM LC_TRACE"

      export LSB_CMD_LOG_MASK=LOG_DEBUG3

      export LSB_CMD_LOGDIR="log_path"

      where log_path is the full path where debugging log files are generated.

    • For csh or tcsh: Set the same variables and values as for sh or bash, but use setenv instead of export and omit the equal sign (for example, setenv LSF_CMD_LOG_MASK LOG_DEBUG3).

  2. Restart the loader controller in the same command line session where you set the environment variables.

    perfadmin stop all

    perfadmin start all

  3. When the data loaders start to collect data from LSF, the following log files are generated in the specified directory.
    • lscmd.log.host_name

    • bcmd.log.host_name

    where host_name is the name of the Analytics node host.
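
Step 1 can be wrapped in a function so that all the variables are set consistently in the current session. This is a sketch for sh or bash. The function name enable_lsf_api_debug is hypothetical, and the sketch assumes that the LSB_* variables (LSB_DEBUG_CMD, LSB_CMD_LOG_MASK, LSB_CMD_LOGDIR) mirror the LSF_* variables for the batch commands.

```shell
# Hypothetical helper: export the LSF API debug variables for this session.
# Usage: enable_lsf_api_debug LOG_PATH   (sh or bash; call it in the current shell)
enable_lsf_api_debug() {
    # Create the log directory if it does not exist yet.
    mkdir -p "$1" || return 1
    LSF_DEBUG_CMD="LC_EXEC LC_COMM LC_TRACE"
    LSF_CMD_LOG_MASK=LOG_DEBUG3
    LSF_CMD_LOGDIR=$1
    LSB_DEBUG_CMD="LC_EXEC LC_COMM LC_TRACE"
    LSB_CMD_LOG_MASK=LOG_DEBUG3
    LSB_CMD_LOGDIR=$1
    export LSF_DEBUG_CMD LSF_CMD_LOG_MASK LSF_CMD_LOGDIR
    export LSB_DEBUG_CMD LSB_CMD_LOG_MASK LSB_CMD_LOGDIR
}
```

Call the function with the full path of the log directory, then run perfadmin stop all and perfadmin start all in the same session so that the loader controller inherits the variables.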

Analytics node did not respond

If no new INFO-level messages appear in the ANALYTICS_TOP/log/plc.log.host_name file for more than one hour, the Analytics node might not be responding. Check the following possible causes to resolve this issue.

  1. Check if the specified maximum heap size is less than the minimum memory required for the data volume. Look for the following entries in the log file:

    Memory info before gc: memory in bytes

    Memory info after gc: memory in bytes

    If the specified heap size is less than the minimum memory requirement, increase the heap size by changing the Java settings in the ANALYTICS_TOP/conf/wsm/wsm_plc.conf file.

    For example: JAVA_OPTS=-Xms64m -Xmx2048m

    Note:

    On 32-bit Windows, the maximum heap size that you can set is 1600M. On 32-bit Linux or UNIX, you can set it to 4096M. On 64-bit machines, you can set it to any value.

  2. Check if there is enough disk space on the Analytics node host. If disk space is the problem, contact your administrator to resolve it. After the disk space is increased, restart the loader controller.
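
The checks in this section can be partly automated on UNIX. The sketch below uses the file modification time as a rough stand-in for "no new INFO-level messages", which is an assumption: a file that receives only other message levels would still look fresh. It relies on find -mmin, which GNU and BSD find support but which is not part of POSIX. Both function names are hypothetical.

```shell
# Hypothetical helper: succeed if FILE was last modified more than MINUTES ago.
# Uses find -mmin, which GNU and BSD find support but which is not in POSIX.
# Usage: log_is_stale FILE MINUTES
log_is_stale() {
    [ -n "$(find "$1" -mmin +"$2" 2>/dev/null)" ]
}

# Hypothetical helper: print the two most recent "Memory info" lines of a log.
# Usage: last_memory_info FILE
last_memory_info() {
    grep 'Memory info' "$1" | tail -n 2
}
```

For example, run log_is_stale on the plc.log.host_name file with a threshold of 60, and then inspect the output of last_memory_info on the same file. For the disk space check in step 2, df -k on the log directory reports the free space on that file system.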