During indexing, the text extracted from a text document might be truncated because the number of characters extracted exceeds a configuration setting.
When a text document is indexed, the following error appears in the IBM Content Search Services error log:
Error details: IQQI0005E The document with ID <id> cannot be indexed.
Causes of the problem: IQQP0012W The document <id> exceeds the limit of the size of the document in text format. The indexed document will be truncated.
The number of characters extracted from the document is larger than the value of the max.text.size parameter in the parser_config.xml file for the document’s collection.
If the truncated text is not needed for searching, then you can safely ignore this message. However, if the text being truncated is important for searching, you can adjust the system configuration to handle a larger text size. Note, however, that increasing the maximum text size will increase the memory requirements of the server.
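To decide whether the truncated text matters, it can help to list which documents were affected. The following is a minimal sketch, assuming the error log is plain text and that each warning contains the IQQP0012W wording quoted above. The function name and the assumption that a document ID is a single token without spaces are illustrative and not taken from the product documentation.

```python
import re

# Matches the warning text quoted in this document. Assumption: the document
# ID is one whitespace-free token; adjust the pattern if your IDs differ.
TRUNCATION_WARNING = re.compile(
    r"IQQP0012W The document (\S+) exceeds the limit"
)


def truncated_document_ids(log_lines):
    """Return IDs of documents reported as truncated, in first-seen order."""
    seen = []
    for line in log_lines:
        match = TRUNCATION_WARNING.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen
```

To use it, open the error log for your installation and pass the file object (or a list of its lines) to truncated_document_ids. The log file name and location depend on your deployment and are not assumed here.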
To modify the maximum text size:
<Parameter Name="max.text.size">60000000</Parameter>
Note that setting this value too high can cause the server to run out of memory. Also note that this parameter sets the limit for text documents only; text documents that exceed it are truncated. For binary and XML files, the limits are controlled by the max.binary.text.size and max.xml.text.size settings, respectively. If one of these files exceeds its configured limit, the document is not indexed.
<install-location>\IBM\ECMTextSearch\config\defaults\parser_config.xml
<install-location>\IBM\ECMTextSearch\config\config.xml
Restart the IBM Content Search Services server after you modify the value of the startupHeapSize parameter.
Test your changes in the user environment, because the limits vary with the operating system and available memory. For example, on a 64-bit machine with a large amount of memory, you can set the startupHeapSize parameter to a larger value, which in turn lets you set the max.text.size parameter to a larger value.
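As part of testing, you might want to check a sample document against the configured limit before reindexing. The sketch below reads max.text.size from the contents of a parser_config.xml file and compares it with the length of a text string. It assumes that the Parameter elements carry no XML namespace and that the server counts characters in roughly the same way as Python's len(). Both are assumptions, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def read_max_text_size(config_xml):
    """Return max.text.size as an int from parser_config.xml content.

    Returns None if the parameter is missing or has no value.
    """
    root = ET.fromstring(config_xml)
    # Look for <Parameter Name="max.text.size">...</Parameter> at any depth.
    for param in root.iter("Parameter"):
        if param.get("Name") == "max.text.size" and param.text:
            return int(param.text.strip())
    return None


def would_be_truncated(text, max_text_size):
    """Return True if the text has more characters than the limit."""
    return len(text) > max_text_size
```

Read the contents of the parser_config.xml file for the collection and pass them to read_max_text_size, then pass the extracted text of a sample document to would_be_truncated. Because the text that the server extracts can differ from the raw file contents, treat the result as an estimate only.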
See Techdoc 7020031 for a related known limitation.