Log files contain important run-time information about the general health of EGO daemons and EGO system events. Log files are an essential troubleshooting tool during production and testing.
The naming convention for most EGO log files is the name of the daemon plus the host name the daemon is running on.
The following table outlines the EGO daemons and their associated log file names. Most log files on Windows hosts have a .txt extension.
Daemon or component | Log file name
Application configuration wizard | In the Platform Management Console,
datasourcetools (Database Configuration Tool)* | datasourcetools.host_name.log
egoconsumerresloader (Consumer Resource Data Loader)* | egoconsumerresloader.host_name.log
egodynamicresloader (Dynamic Metric Data Loader)* | egodynamicresloader.host_name.log
egoeventsloader (EGO Events Data Loader)* | egoeventsloader.host_name.log
egosc (EGO Service Controller) | egoservice.audit.log, esc.log.host_name
egostatisticresloader (Static Attribute Data Loader)* | egostatisticresloader.host_name.log
fam (File Access Manager) | fam.host_name.log
lim (Load Information Manager) | lim.log.host_name
named (Service Director) | named.log
pem (Process Execution Manager) | pem.log.host_name
pim (Process Information Manager) | pim.log.host_name (Linux only)
plc (Loader Controller)* | plc.host_name.log
purger (Data Purger)* | purger.host_name.log
rfa (Remote File Access) | cli.host_name.log
rs (Repository Service) | rs.host_name.log, repositoryservice.audit.log
tomcat (WEBGUI) | catalina.out
vemkd (EGO Kernel Daemon) | ego.audit.log, vemkd.log.host_name
WSG (Web Service Gateway) | wsg.log
WSM (Platform Management Console/WEBGUI) | wsm.log.host_name
The majority of log entries are informational in nature. It is not uncommon to have a large (and growing) log file and still have a healthy cluster.
*Indicates daemons and log files associated with the reporting feature.
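As a rough illustration of the naming patterns in the table above, the following Python sketch expands the log file names for a few of the listed daemons on a given host. The dictionary LOG_NAME_TEMPLATES and the function log_file_names are hypothetical helpers written for this example, not part of EGO. The templates are copied from the table, and the .txt extension used on Windows hosts is not handled.

```python
# Hypothetical lookup built from the table above; this is not an EGO API.
# "{host}" stands for the host_name portion of the log file name.
LOG_NAME_TEMPLATES = {
    "egosc": ["egoservice.audit.log", "esc.log.{host}"],
    "fam": ["fam.{host}.log"],
    "lim": ["lim.log.{host}"],
    "named": ["named.log"],
    "pem": ["pem.log.{host}"],
    "pim": ["pim.log.{host}"],  # Linux only, per the table
    "plc": ["plc.{host}.log"],
    "purger": ["purger.{host}.log"],
    "rs": ["rs.{host}.log", "repositoryservice.audit.log"],
    "vemkd": ["ego.audit.log", "vemkd.log.{host}"],
    "wsg": ["wsg.log"],
    "wsm": ["wsm.log.{host}"],
}


def log_file_names(daemon, host_name):
    """Return the log file names listed for a daemon on the given host."""
    templates = LOG_NAME_TEMPLATES[daemon]
    return [template.format(host=host_name) for template in templates]


if __name__ == "__main__":
    # "hostA" is a made-up host name used only for illustration.
    print(log_file_names("vemkd", "hostA"))
```

Note that the position of the host name differs between daemons (for example, lim.log.host_name versus fam.host_name.log), which is why the sketch uses a per-daemon template rather than a single rule.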
Log file locations
By default, most EGO log files are found in EGO_TOP\kernel\log (Windows) or EGO_TOP/kernel/log (Linux).
Component | Log file name | Windows | Linux
Application configuration wizard | | In the Platform Management Console: |
EGO kernel daemon | ego.audit.log, vemkd.log.host_name | EGO_TOP\kernel\log | EGO_TOP/kernel/log
EGO service controller | | EGO_TOP\eservice\esc\log | EGO_TOP/eservice/esc/log
File access manager | | EGO_TOP\kernel\log | EGO_TOP/kernel/log
Load information manager | | EGO_TOP\kernel\log | EGO_TOP/kernel/log
Process execution manager | | EGO_TOP\kernel\log | EGO_TOP/kernel/log
Process information manager | | N/A | EGO_TOP/kernel/log
Platform management console/WEBGUI | catalina.out, wsm.log.host_name | EGO_TOP\gui\logs | EGO_TOP/gui/logs
Remote file access: client side | | Where rfa was run from | Where rfa was run from
Remote file access: server side | | EGO_TOP\eservice\rs\log | EGO_TOP/eservice/rs/log
Reporting: Database configuration tool, Dynamic metric data loader, EGO events data loader, Static attribute data loader, Loader controller, Data purger | datasourcetools.host_name.log, egodynamicresloader.host_name.log, egoeventsloader.host_name.log, egostatisticresloader.host_name.log, plc.host_name.log, purger.host_name.log | EGO_TOP\perf\logs | EGO_TOP/perf/logs
Repository service and file access manager | | EGO_TOP\eservice\rs\log | EGO_TOP/eservice/rs/log
Service director | | EGO_TOP\eservice\esd\conf\named\namedb | EGO_TOP/eservice/esd/conf/named/namedb
Web service gateway | | EGO_TOP\eservice\wsg\log | EGO_TOP/eservice/wsg/log
Log files can also be accessed through the Platform Management Console (from System Logs > Standard Logs).
Log entry format
The standard format for log file entries is:
date time_zone log_level [process_id:thread_id] action:description/message
where the date is expressed in YYYY-MM-DD hh:mm:ss.sss.
For example, 2006-03-14 11:02:44.000 Eastern Standard Time ERROR [2488:1036] vemkdexit: vemkd is halting.
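The entry format above lends itself to simple parsing. The following Python sketch splits an entry into its fields with a regular expression. The function name parse_entry and the field names are hypothetical, and the pattern assumes that every entry follows the standard format exactly as shown, with a time zone that may contain spaces.

```python
import re
from datetime import datetime

# Pattern for: date time_zone log_level [process_id:thread_id] action:message
# Assumption: the time zone may contain spaces; the log level does not.
ENTRY_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<time_zone>.+?) "
    r"(?P<log_level>\S+) "
    r"\[(?P<process_id>[^:\]]+):(?P<thread_id>[^:\]]+)\] "
    r"(?P<action>[^:]+):\s*(?P<message>.*)$"
)


def parse_entry(line):
    """Return a dict of fields for one log entry, or None if it does not match."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the YYYY-MM-DD hh:mm:ss.sss text into a datetime object.
    fields["date"] = datetime.strptime(fields["date"], "%Y-%m-%d %H:%M:%S.%f")
    return fields


if __name__ == "__main__":
    sample = ("2006-03-14 11:02:44.000 Eastern Standard Time ERROR "
              "[2488:1036] vemkdexit: vemkd is halting.")
    print(parse_entry(sample))
```

Lines that do not match the standard format return None, so a caller can skip or collect them separately instead of failing.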
Log classes for vemkd and pem
Use the following parameters to specify the log class:
vemkd: EGO_DEBUG_VEMKD
For example, EGO_DEBUG_VEMKD=LC_AUTH.
pem: EGO_DEBUG_PEM
For example, EGO_DEBUG_PEM=LC_PEM.
Every log entry belongs to a log class. You can use log class as a mechanism to filter log entries by area. Log classes in combination with log levels allow you to troubleshoot using log entries that only address, for example, configuration.
Log classes (as well as log levels) can be filtered at run time using egosh debug.
Valid logging classes are as follows:
Log class | Description
LC_TRACE | Logs significant program steps.
LC_COMM | Logs messages related to communications.
LC_AUTH | Logs messages related to users and authentication.
LC_MEM | Logs messages related to memory allocation.
LC_SYS | Logs messages related to system calls.
LC_PERF | Logs messages related to performance.
LC_RSRC | Logs messages related to resources, including host status changes.
LC_ALLOC | Logs messages related to the resource allocation engine.
LC_ACTIVITY | Logs messages related to activities.
LC_PEM | Logs messages related to the process execution manager (pem).
LC_EVENT | Logs messages related to the event notification service.
LC_QUERY | Logs messages related to client queries.
LC_RECOVER | Logs messages related to recovery and data persistence.
LC_CONF | Logs messages related to configuration.
LC_CLIENT | Logs messages related to clients.
Log classes for lim
Use EGO_DEBUG_LIM to specify the log class. For example, EGO_DEBUG_LIM=LC_MEMORY.
Every log entry belongs to a log class. You can use log class as a mechanism to filter log entries by area. Log classes in combination with log levels allow you to troubleshoot using log entries that only address, for example, configuration.
Log classes (as well as log levels) can be filtered at run time using egosh debug.
Valid logging classes are as follows:
Log class | Description
LC_SCHED | Logs LSF scheduler (mbschd) messages.
LC_EXEC | Logs significant steps for job execution.
LC_TRACE | Logs significant program steps.
LC_COMM | Logs messages related to communications.
LC_XDR | Logs everything transferred by XDR.
LC_CHKPNT | Logs checkpointing messages.
LC_LICENSE | Logs license management messages.
LC_FILE | Logs file transfer messages.
LC_AFS | Logs AFS messages.
LC_AUTH | Logs messages related to users and authentication.
LC_HANG | Marks where a program might hang.
LC_MULTI | Logs messages pertaining to MultiCluster.
LC_SIGNAL | Logs messages pertaining to signals.
LC_DCE | Logs messages pertaining to DCE support.
LC_PIM | Logs PIM messages.
LC_MEMORY | Logs memory limit messages.
LC_SYS | Logs system call messages.
LC_JLIMIT | Logs job slot limit messages.
LC_FAIR | Logs fairshare policy messages.
LC_PREEMPT | Logs preemption policy messages.
LC_PEND | Logs messages related to job pending reasons.
LC_EEVENTD | Logs eeventd messages.
LC_LOADINDX | Logs load index messages.
LC_RESOURCE | Logs information used by resource broker (resource gathering and reporting).
LC_JGRP | Logs job group messages.
LC_JARRAY | Logs job array messages.
LC_MPI | Logs MPI messages.
LC_ELIM | Logs ELIM messages.
LC_M_LOG | Logs multievent logging messages.
LC_PERFM | Logs performance messages.
LC_HPC | Logs information specific to HPC integration.
LC_LICSCHED | Logs LSF License Scheduler messages.
Log levels
Use EGO_LOG_MASK to specify the log level. For example, EGO_LOG_MASK=LOG_CRIT.
For most logs, there are nine log levels that allow administrators to control the level of event information that is logged. For logs associated with the reporting feature, there are seven log levels.
When you are troubleshooting, increase the log level to obtain as much detailed information as you can. When you are finished troubleshooting, decrease the log level to prevent the log files from becoming too large and to enhance daemon performance.
Valid logging levels are as follows (not including the reporting feature log levels):
Log level | Description
LOG_EMERG | Logs only those messages in which the system is unusable.
LOG_ALERT | Logs those messages for which action must be taken immediately.
LOG_CRIT | Logs those messages that are critical.
LOG_ERR | Logs those messages that indicate error conditions.
LOG_WARNING | Logs those messages that are warnings or more serious messages. This is the default level of debug information.
LOG_NOTICE | Logs those messages that indicate normal but significant conditions or warnings and more serious messages.
LOG_INFO | Logs all informational messages and more serious messages.
LOG_DEBUG | Logs all debug-level messages.
LOG_TRACE | Logs all available messages. Note: LOG_TRACE is not supported by the LIM. If you set LOG_TRACE for the LIM, it is automatically changed to LOG_DEBUG.
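To make the effect of EGO_LOG_MASK concrete, the following Python sketch models the levels in the table above as an ordered list, from most to least serious, and checks whether a message of a given level would be written. This is only a model for illustration: the function is_logged is hypothetical, and it assumes that every level also includes all more serious levels, which the table states explicitly only for some levels.

```python
# Levels from the table above, ordered from most serious to most detailed.
LOG_LEVELS = [
    "LOG_EMERG", "LOG_ALERT", "LOG_CRIT", "LOG_ERR", "LOG_WARNING",
    "LOG_NOTICE", "LOG_INFO", "LOG_DEBUG", "LOG_TRACE",
]


def is_logged(message_level, log_mask="LOG_WARNING", daemon=None):
    """Return True if a message at message_level passes the given log mask.

    Assumption: a mask admits its own level and every more serious level.
    """
    # Per the note in the table, the LIM treats LOG_TRACE as LOG_DEBUG.
    if daemon == "lim" and log_mask == "LOG_TRACE":
        log_mask = "LOG_DEBUG"
    return LOG_LEVELS.index(message_level) <= LOG_LEVELS.index(log_mask)


if __name__ == "__main__":
    # With the default mask, errors pass and informational messages do not.
    print(is_logged("LOG_ERR"))
    print(is_logged("LOG_INFO"))
```

Under this model, raising the mask from LOG_WARNING to LOG_DEBUG admits four additional levels of messages, which is why log files grow much faster while troubleshooting.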
Valid log levels for reporting feature are as follows:
Log level | Description
OFF | Logs no messages.
FATAL | Logs messages that were fatal to the reporting feature.
ERROR | Logs those messages that indicate error conditions.
WARN | Logs those messages that are warnings or more serious messages.
INFO | Logs all informational messages and more serious messages. (Default)
DEBUG | Logs all debug-level messages.
ALL | Logs all messages.
Conf files where log level and class information are retrieved
The lim, pem, and vemkd daemons read ego.conf to retrieve the following information (as applicable to each daemon):
EGO_LOG_MASK: The log level used to determine the amount of detail logged.
EGO_DEBUG_LIM: The log class setting for lim.
EGO_DEBUG_PEM: The log class setting for pem.
EGO_DEBUG_VEMKD: The log class setting for vemkd.
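The following Python sketch shows one way to pull these four parameters out of the text of ego.conf, for example to confirm the current settings before troubleshooting. It assumes that the file consists of simple NAME=VALUE lines and that # starts a comment; neither detail is confirmed here, and the function read_log_settings is hypothetical.

```python
# Parameters described above that control log level and log class.
LOG_PARAMETERS = {
    "EGO_LOG_MASK",
    "EGO_DEBUG_LIM",
    "EGO_DEBUG_PEM",
    "EGO_DEBUG_VEMKD",
}


def read_log_settings(conf_text):
    """Return the logging parameters found in the text of an ego.conf file.

    Assumptions: one NAME=VALUE pair per line, and "#" starts a comment.
    """
    settings = {}
    for raw_line in conf_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and padding
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        name = name.strip()
        if name in LOG_PARAMETERS:
            settings[name] = value.strip()
    return settings


if __name__ == "__main__":
    # Made-up file content, reusing the example values from this document.
    sample = "EGO_LOG_MASK=LOG_CRIT\nEGO_DEBUG_VEMKD=LC_AUTH\n"
    print(read_log_settings(sample))
```

The function takes the file content as a string rather than a path, so the caller decides where ego.conf is read from.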
The service director daemon ("named") reads named.conf to retrieve the following information:
logging severity: The configured severity log class, which controls the level of event information that is logged (critical, error, warning, notice, info, debug, or dynamic). When the log class is set to debug, a log level is also required to determine the amount of detail logged for debugging purposes. The higher the log level number, the more debug detail is logged. Refer to third-party documentation for more information about BIND and logging.
The egosc daemon reads egosc_conf.xml.
The wsg daemon reads wsg.conf to retrieve the following information:
WSG_DEBUG_DETAIL: The log level used to determine the amount of detail logged for debugging purposes. The configured severity log class controls the level of event information that is logged (critical, error, warning, notice, info, debug, or dynamic). When the log class is set to debug, the logging is either on (1) or off (0).
WSG_LOGDIR: Where to write wsg.log files.
The wsm daemon reads wsm.conf to retrieve the following information:
LOG_LEVEL: The configured log level controlling the level of event information that is logged (INFO, ERROR, WARNING, or DEBUG).
If the system is running well, set the log level to INFO, or even WARNING, to minimize the number of messages.
Note: The daemons associated with the reporting feature read various .xml files to retrieve information. For more information, see the Reports chapters.
Why do log files grow so quickly?
Every time an EGO system event occurs, a log file entry is added to a log file. Most entries are informational in nature, except when there is an error condition. If your log levels provide entries for all information (for example, if you have set them to LOG_DEBUG), the files grow quickly.
Suggested settings:
During regular EGO operation, set your log levels to LOG_WARNING. With this setting, warnings and more serious messages are logged but informational entries are not, keeping the log file size to a minimum.
For troubleshooting purposes, set your log level to LOG_DEBUG. Because of the quantity of messages you receive when subscribed to this log level, change the level back to LOG_WARNING as soon as you are finished troubleshooting.
Note: If your log files are too large, you can always rename them for archive purposes. New, fresh log files are then created and log all new events.
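The renaming approach in the note above can be scripted. The following Python sketch renames a log file by appending a timestamp, relying on the behavior described in the note that a fresh log file is then created. The function archive_log and the timestamp suffix format are hypothetical choices for this example, not an EGO convention.

```python
import os
import time


def archive_log(path, now=None):
    """Rename a log file by appending a timestamp suffix.

    Returns the archived path, or None if the log file does not exist.
    The suffix format (YYYYMMDD-HHMMSS) is an arbitrary choice for this sketch.
    """
    if not os.path.isfile(path):
        return None
    # time.localtime(None) uses the current time.
    stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(now))
    archived = f"{path}.{stamp}"
    os.rename(path, archived)
    return archived


if __name__ == "__main__":
    # "lim.log.hostA" is a made-up file name following the lim naming pattern.
    print(archive_log("lim.log.hostA"))
```

A scheduled job could call a function like this for each log file, although a dedicated log rotation utility also handles compression and removal of old archives.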
How often should I maintain log files?
The growth rate of the log files depends on the log level and the complexity of your cluster. If you have a large cluster, daily log file maintenance may be required.
We recommend using a log file rotation utility to do unattended maintenance of your log files. Failure to do timely maintenance could result in a full file system, which hinders system performance and operation.