IBM FileNet P8, Version 5.2.1            

Monitoring system metrics to improve indexing performance

You can improve indexing performance by monitoring system metrics and adjusting configuration settings.

Use the Content Platform Engine content-based retrieval counters to view the overall content-based retrieval process. The counters provide real-time details for a particular index job or index area; average metrics for batch sizes; processing time; documents created, updated, and deleted; content-based retrieval search metrics; and more. For more information, see Counter interpretation.

You can also gather information about the IBM Content Search Services server (such as memory usage and queue size) during indexing. This information is written to the monitor.csv file in the YourCSSfolder\log directory. Information is also collected by IBM® System Dashboard for Enterprise Content Management, if available. Most of the information that is stored in the monitor.csv file can also be viewed as meter counters in the dashboard, in addition to other diagnostic information.

The following IBM Content Search Services server information is provided in both the monitor.csv output file and IBM System Dashboard for Enterprise Content Management, unless otherwise indicated.

Table 1. Information provided in the monitor.csv output file and IBM System Dashboard for Enterprise Content Management
Name Description
Time The current time (in seconds). This information is not displayed in IBM System Dashboard for Enterprise Content Management.
Total number of processed documents The total number of documents that were processed by IBM Content Search Services for all full-text indexes since server startup. Processing includes adds, updates, and deletes. Documents are counted regardless of whether the processing was successful or not.
Total number of add requests that failed The total number of failed add requests that were processed by IBM Content Search Services for all full-text indexes since the server started. This information is not displayed in the IBM System Dashboard for Enterprise Content Management.
Total number of successful add requests The total number of successful add requests that were processed by IBM Content Search Services for all full-text indexes since the server started. This information is not displayed in the IBM System Dashboard for Enterprise Content Management.
Total size of successful add requests (KB) The total memory size (in KB) of successful add requests that were processed by IBM Content Search Services for all full-text indexes since the server started. This information is not displayed in the IBM System Dashboard for Enterprise Content Management.
Total number of delete requests that failed The total number of failed delete requests that were processed by IBM Content Search Services for all full-text indexes since the server started. This information is not displayed in the IBM System Dashboard for Enterprise Content Management.
Total number of successful delete requests The total number of successful delete requests that were processed by IBM Content Search Services for all full-text indexes since the server started. This information is not displayed in the IBM System Dashboard for Enterprise Content Management.
Total size of processed documents The total memory size (in KB) of documents that were processed by IBM Content Search Services for all collections since server startup.
Documents in input queue The number of documents in the IBM Content Search Services input queue.
Input queue size (in bytes) The memory size (in bytes) of documents in the IBM Content Search Services input queue.
Documents in the output queue The number of documents in the IBM Content Search Services output queue.
Output queue size (in bytes) The memory size (in bytes) of documents in the IBM Content Search Services output queue.
Documents waiting for preprocessing The number of documents in the initial stage of the IBM Content Search Services indexing pipeline that are waiting for preprocessing.
Documents currently in preprocessing The number of documents in the second stage of the IBM Content Search Services indexing pipeline, which is preprocessing (text extraction, tokenization, and language analysis).
Documents waiting for indexing The number of documents in the third stage of the IBM Content Search Services indexing pipeline that are waiting to be indexed.
Documents currently being indexed The number of documents in the final stage of the IBM Content Search Services indexing pipeline.
Number of concurrent queries The number of ongoing queries that are currently running in the system. This number includes all searches that have started but not yet completed at the time of measurement. This information is not displayed in the IBM System Dashboard for Enterprise Content Management.
Total number of queries The total number of search requests that were processed by IBM Content Search Services since the server started. This information is not displayed in the IBM System Dashboard for Enterprise Content Management.
Used heap memory (MB) The amount of heap memory that is used by the JVM before Java™ memory garbage collection. This information is not displayed in IBM System Dashboard for Enterprise Content Management.
Thread count The number of threads that are used by the IBM Content Search Services server. This information is not displayed in IBM System Dashboard for Enterprise Content Management.
System load Provides an indication of the average system load for the previous minute, as provided by the JVM. This information might not be available on all platforms. This information is not displayed in IBM System Dashboard for Enterprise Content Management.
Open file descriptors The number of open operating system file descriptors. This information is available only for AIX, Linux, and Solaris systems on which the lsof utility is installed.
Free physical memory Provides an indication of the free physical memory on the computer, as provided by the JVM. This information might not be available on all platforms. This information is not displayed in IBM System Dashboard for Enterprise Content Management.
Batches in progress
In the monitor.csv output file:
Provides information about the indexing batches that are currently being processed. For each batch, the following information is provided in brackets, for example, [1;A;L;1000;1000]:
  • Batch ID
  • Type of batch. Possible values:
    • A: Add or update
    • D: Delete
  • State of the batch. Possible values:
    • I: Initial. The server did not yet accept the entire batch from the client.
    • L: Last document was received. The server accepted the entire batch from the client.
  • Number of documents that were received so far for this batch.
  • Number of documents from this batch that are in the input queue.
In IBM System Dashboard for Enterprise Content Management:
Displays the number of indexing batches (tasks) that are currently being processed.
Active merges The number of index segment merges that are currently taking place.
Merge size (MB) The total size (in megabytes) of index segment merges that are currently taking place.
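
The bracketed format that is described for the Batches in progress entry, for example [1;A;L;1000;1000], can be split into its five fields with a few lines of Python. The following is a minimal sketch, not part of the product. The names BatchInfo and parse_batches are hypothetical, and the sketch assumes that several batches appear as consecutive bracketed groups in the same value, which the description above does not state explicitly.

```python
import re
from dataclasses import dataclass
from typing import List

# Codes taken from the "Batches in progress" description.
BATCH_TYPES = {"A": "add or update", "D": "delete"}
BATCH_STATES = {"I": "initial", "L": "last document received"}

# Matches the text between one pair of square brackets.
_BATCH_PATTERN = re.compile(r"\[([^\[\]]*)\]")


@dataclass
class BatchInfo:
    """One batch entry (hypothetical helper type, not an IBM API)."""
    batch_id: str
    batch_type: str
    state: str
    received: int        # documents received so far for this batch
    in_input_queue: int  # documents from this batch in the input queue


def parse_batches(value: str) -> List[BatchInfo]:
    """Parse every [id;type;state;received;queued] group found in value.

    Assumption: multiple batches are written as consecutive bracketed
    groups. Groups that do not have five fields, or whose counts are
    not integers, are skipped.
    """
    batches = []
    for body in _BATCH_PATTERN.findall(value):
        fields = [field.strip() for field in body.split(";")]
        if len(fields) != 5:
            continue
        batch_id, type_code, state_code, received, queued = fields
        try:
            batches.append(BatchInfo(
                batch_id=batch_id,
                batch_type=BATCH_TYPES.get(type_code, type_code),
                state=BATCH_STATES.get(state_code, state_code),
                received=int(received),
                in_input_queue=int(queued),
            ))
        except ValueError:
            continue
    return batches
```

For the example value [1;A;L;1000;1000], parse_batches returns one entry with batch ID 1, type "add or update", state "last document received", and 1000 for both document counts.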

You can troubleshoot IBM Content Search Services by monitoring the status of documents at each stage of the indexing pipeline.

Figure 1. Document indexing stages
Stages of document processing: 1. Input queue, 2. Preprocessing, 3. Output queue, 4. Index
Table 2. Stages in the document processing pipeline
Stage in document indexing pipeline | Column in the monitor.csv file
1. Input queue: Contains documents that are waiting for preprocessing | Number of documents that are waiting for preprocessing
2. Document preprocessing: Text extraction, tokenization, language analysis | Number of documents that are currently being preprocessed
3. Output queue: Contains preprocessed documents that are waiting to be indexed | Number of documents that are waiting to be indexed
4. Index: Contains indexed documents | Number of documents that are currently being indexed
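
One way to use the four stage counts for troubleshooting is to find the stage that currently holds the most documents. The following Python sketch illustrates the idea under stated assumptions: the function largest_backlog is hypothetical, and the column names in STAGE_COLUMNS are taken from the Name column of Table 1 and might not match the header text in an actual monitor.csv file, so check them against your file before use.

```python
# Pipeline stage paired with an ASSUMED monitor.csv column name.
# The names come from Table 1; verify them against the real file header.
STAGE_COLUMNS = [
    ("input queue", "Documents waiting for preprocessing"),
    ("preprocessing", "Documents currently in preprocessing"),
    ("output queue", "Documents waiting for indexing"),
    ("index", "Documents currently being indexed"),
]


def largest_backlog(row):
    """Return (stage, count) for the stage that holds the most documents.

    row is a mapping of column name to value, for example one row that
    was read with csv.DictReader. Missing or empty values count as 0.
    Returns (None, 0) when every stage is empty. On a tie, the earliest
    stage in the pipeline is reported.
    """
    counts = {
        stage: int(row.get(column, 0) or 0)
        for stage, column in STAGE_COLUMNS
    }
    stage = max(counts, key=counts.get)
    if counts[stage] == 0:
        return (None, 0)
    return (stage, counts[stage])
```

A count that stays high in the output queue over many samples, for example, suggests that documents are preprocessed faster than they are indexed. A single sample is not enough to draw that conclusion.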

The monitor.csv file is rotated like a log file (monitor0.csv, monitor1.csv, and so on). By default, queue status information is printed to the file every 10 seconds. To change the frequency, use the configuration tool to set a new value for the monitorQueuesFrequency parameter. You can disable queue monitoring by specifying a value of zero for the monitorQueuesFrequency parameter.
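
Because the output is spread across rotated files, a script that analyzes it first has to collect them. The following Python sketch lists files that follow the monitorN.csv naming pattern mentioned above, ordered by their numeric suffix. The function rotated_monitor_files is hypothetical, and the sketch makes no assumption about which suffix holds the most recent data.

```python
import re
from pathlib import Path

# Matches monitor0.csv, monitor1.csv, and so on.
_ROTATED = re.compile(r"^monitor(\d+)\.csv$")


def rotated_monitor_files(log_dir):
    """Return rotated monitor files in log_dir, sorted by numeric suffix.

    Sorting by the integer value keeps monitor10.csv after monitor2.csv,
    which a plain string sort would not.
    """
    found = []
    for path in Path(log_dir).iterdir():
        match = _ROTATED.match(path.name)
        if match and path.is_file():
            found.append((int(match.group(1)), path))
    found.sort(key=lambda item: (item[0], item[1].name))
    return [path for _, path in found]
```

Pass the log directory of your IBM Content Search Services installation (YourCSSfolder\log in this topic) as log_dir.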



Last updated: October 2015

© Copyright IBM Corporation 2015.