Documents that are checked into the Content Engine require a class. A document can be classified manually, by selecting the document's class, or automatically when the document is checked in. The Content Engine provides an extensible framework that automatically assigns an incoming document of a specified MIME type to a target document class and sets selected properties of that target class based on values that are found in the incoming document. A classification component, or classifier, does the work of assigning a document class. One classifier that is packaged with the Content Engine is the XML classifier. For more information, see Classification Flowchart and Understanding the XML Classifier.
You can also plug custom classifiers that are implemented with JavaScript or Java™ into the document classification framework. For information, see Understanding Automatic Document Classification.
To plug a custom classifier into the document classification framework, follow these tasks:
For a classifier that is implemented with Java, you can package the class in a JAR file, and check in your class or JAR file as a CodeModule object in a Content Engine object store. Alternatively, you can specify the classifier in the class path of the application server where the Content Engine is running. A document classifier runs asynchronously on the Content Engine.
For code examples that show how to implement a document classifier and how to create a DocumentClassificationAction object, see Working with Document Classification-related Objects. For more information, see Action Handlers.
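To make the classifier's job concrete, the following is a minimal, self-contained sketch of what a classifier does: it inspects the incoming content, assigns a target class, and sets a property from a value found in the content. It does not use the Content Engine API. SketchDocument, SketchClassifier, InvoiceTextClassifier, the "Invoice" class name, and the "InvoiceNumber" property are all hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a checked-in document; not a Content Engine type.
class SketchDocument {
    final String mimeType;
    final String content;
    String className; // current document class
    final Map<String, String> properties = new HashMap<>();

    SketchDocument(String mimeType, String content, String initialClass) {
        this.mimeType = mimeType;
        this.content = content;
        this.className = initialClass;
    }
}

// Hypothetical stand-in for the contract that a classifier fulfills.
interface SketchClassifier {
    void classify(SketchDocument doc);
}

// Assigns the assumed "Invoice" class to text whose first line is "INVOICE <number>".
class InvoiceTextClassifier implements SketchClassifier {
    private static final String PREFIX = "INVOICE ";

    @Override
    public void classify(SketchDocument doc) {
        // Only the first line is inspected; \R matches any line break.
        String firstLine = doc.content.split("\\R", 2)[0].trim();
        if (!firstLine.startsWith(PREFIX)) {
            // Signal failure without touching the document's class or properties.
            throw new IllegalArgumentException("not an invoice: " + firstLine);
        }
        doc.className = "Invoice";
        doc.properties.put("InvoiceNumber", firstLine.substring(PREFIX.length()).trim());
    }
}
```

In this sketch the classifier changes the document only after it has recognized the content, so a failed classification leaves the initial class in place.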
You can automatically classify documents with a content type that matches the MIME type property of an existing DocumentClassificationAction object. To automatically classify a new document, you create a Document object and follow these tasks:
The Document object is checked into the object store with an initial class, and the object's ClassificationStatus property is set to CLASSIFICATION_PENDING. Because document classification is asynchronous, the auto-classification request is queued as a DocumentClassificationQueueItem object.
The Classification Manager is responsible for dequeuing a classification request and processing it. The Classification Manager obtains the MIME type from the target document, locates the DocumentClassificationAction object that is registered for that MIME type, and starts the classifier that is identified in the DocumentClassificationAction object. A classifier operates with the same access permissions as the user who initiates the document check-in.
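The dequeue-and-dispatch flow described above can be sketched as a queue plus a lookup table keyed by MIME type. This is an in-memory illustration only, not the Content Engine's implementation. QueuedRequest and SketchClassificationManager are hypothetical names, and exact string matching of the MIME type is an assumption of the sketch.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.function.Consumer;

// Hypothetical stand-in for a queued classification request.
class QueuedRequest {
    final String documentId;
    final String mimeType;

    QueuedRequest(String documentId, String mimeType) {
        this.documentId = documentId;
        this.mimeType = mimeType;
    }
}

// Hypothetical stand-in for the dequeue-and-dispatch role.
class SketchClassificationManager {
    private final Map<String, Consumer<QueuedRequest>> classifiersByMimeType = new HashMap<>();
    private final Queue<QueuedRequest> queue = new ArrayDeque<>();

    // Plays the role of registering a classification action for one MIME type.
    void register(String mimeType, Consumer<QueuedRequest> classifier) {
        classifiersByMimeType.put(mimeType, classifier);
    }

    // Check-in would add a request here instead of classifying inline.
    void enqueue(QueuedRequest request) {
        queue.add(request);
    }

    // Drains the queue and returns how many requests found a registered classifier.
    int processAll() {
        int dispatched = 0;
        QueuedRequest request;
        while ((request = queue.poll()) != null) {
            Consumer<QueuedRequest> classifier = classifiersByMimeType.get(request.mimeType);
            if (classifier != null) {
                classifier.accept(request);
                dispatched++;
            }
        }
        return dispatched;
    }
}
```

For example, after registering a classifier for "text/xml" and enqueuing one "text/xml" request and one "image/png" request, processAll() would invoke the classifier once and return 1.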
When document classification is complete, the Document object's ClassificationStatus property is updated to indicate success or failure. If classification fails, the initial class that is assigned to the document remains. If classification succeeds, a new class is assigned to the document and a ClassifyCompleteEvent object is triggered.
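The two outcomes can be sketched as a small state update. This is an illustration under stated assumptions, not Content Engine code: SketchStatus, PendingDocument, and OutcomeSketch are hypothetical names, the status constants other than CLASSIFICATION_PENDING are placeholders, and the event is modeled as a string appended to a list.

```java
import java.util.List;

// Hypothetical status values; only CLASSIFICATION_PENDING is named in the text above.
enum SketchStatus { CLASSIFICATION_PENDING, CLASSIFICATION_COMPLETE, CLASSIFICATION_FAILED }

// Hypothetical stand-in for a document that is awaiting classification.
class PendingDocument {
    String className;
    SketchStatus status = SketchStatus.CLASSIFICATION_PENDING;

    PendingDocument(String initialClass) {
        this.className = initialClass;
    }
}

class OutcomeSketch {
    // Applies a classifier result; a null targetClass models a failed classification.
    static void applyOutcome(PendingDocument doc, String targetClass, List<String> events) {
        if (targetClass == null) {
            // Failure: the initial class is kept and no event is recorded.
            doc.status = SketchStatus.CLASSIFICATION_FAILED;
            return;
        }
        // Success: the new class replaces the initial class and an event is recorded.
        doc.className = targetClass;
        doc.status = SketchStatus.CLASSIFICATION_COMPLETE;
        events.add("ClassifyCompleteEvent");
    }
}
```

Passing the event list in as a parameter keeps the sketch free of shared state, so each outcome can be checked in isolation.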