The Content Platform Engine
generates an error stating that no token breaks or line breaks were detected when
a content element file that is not a text file incorrectly has a MIME type of text/plain.
Symptoms
An error similar to the following is returned in the
p8_server_error.log file:
While indexing a large text document (ID: <document_ID>,
size: 7996160) in 1000000 character chunks, no token breaks such as
spaces or line breaks were detected between characters 3832272 and
4832271. Ensure that the document is valid. Continuing to index the
document, but each chunk might be handled as a single token.
In this message, <document_ID> is the Globally Unique Identifier (GUID) assigned to the document.
Causes
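If the Content Platform Engine sends
content with a MIME type of text/plain to the IBM® Content Search Services server, and the file
is not a text file, the IBM Content Search Services server
generates this error. Because the content is indexed as text in large character chunks, non-text
data can produce chunks with no spaces or line breaks, and each such chunk might be handled as a
single token. The resulting index data is probably invalid.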
Resolving the problem
Correct the MIME type by checking out the document and
checking it back in with the correct MIME type.
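If the correct MIME type is not known, it can often be inferred from the file name extension before the document is checked back in. The sketch below uses only the Python standard library mimetypes module; the function name suggest_mime_type is hypothetical, and the approach assumes that the file name extension is trustworthy. It does not call any Content Platform Engine API.

```python
import mimetypes


def suggest_mime_type(filename, declared="text/plain"):
    """Return a MIME type guessed from the extension if it differs from the declared one.

    Hypothetical helper: returns None when no guess is available or when
    the guess matches the declared type.
    """
    guessed, _encoding = mimetypes.guess_type(filename)
    if guessed is None or guessed == declared:
        return None
    return guessed


if __name__ == "__main__":
    # A PDF that was checked in as text/plain should be flagged.
    print(suggest_mime_type("report.pdf"))
```

A return value other than None indicates a candidate MIME type to supply when the document is checked back in.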