IBM FileNet P8, Version 5.2.1            

Constructor configuration file: XML filter constructor reference

The constructors.xml file contains the XML filter constructors that associate your surplus XML elements with Content Platform Engine classes. Each IBM® Content Search Services server has an associated constructors.xml file in the server-home/config directory.

You can configure multiple XML filter constructors. Each XML filter constructor associates a set of Content Platform Engine classes with a set of surplus XML elements. For optimal indexing performance, associate a Content Platform Engine class only with the surplus XML elements that apply to the XML content of that class. For example, suppose that you have one XML filter constructor that associates five Content Platform Engine classes with five surplus XML elements. If each surplus XML element applies to only one of the Content Platform Engine classes, the filtering configuration is roughly 20% efficient: for each XML document, at most one surplus element applies, but the XML filtering utility must check for the presence of all five elements. In this example, the optimal configuration is five separate XML filter constructors, each of which associates one Content Platform Engine class with its one surplus XML element.

Add one <constructor> element to the constructors.xml file for each XML filter constructor that you want to configure. The following example shows the <constructor> element format:

	<?xml version="1.0" encoding="UTF-8"?> 
	<constructors>     
		<constructor>         
			<name>XMLFilter1</name>         
			<class>com.ibm.filenet.cse.cascade.xmlfilter.XMLFilterConstructor</class>         
			<batchSize>15</batchSize>         
			<customConfig name="FileNet">             
				<Type>XMLFilter</Type>             
				<SymbolicClassName>LaurelTree,OliveTree,OrangeTree</SymbolicClassName>             
				<ElementsToRemove>/document/item/rawitemdata,//font,/Configuration</ElementsToRemove>            
				<CleanupOldDirTime>1</CleanupOldDirTime>            
				<SleepIntervalForCleanup>2000</SleepIntervalForCleanup>        
			</customConfig>     
		</constructor> 
	</constructors>

To configure multiple XML filter constructors, add multiple <constructor> elements as shown in the following example:

	<?xml version="1.0" encoding="UTF-8"?> 
	<constructors>     
		<constructor> 
			...    
		</constructor>     
		<constructor>         
			...    
		</constructor>     
		<constructor>         
			...    
		</constructor> 
	</constructors> 
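
As a minimal sketch of the optimization described earlier, the following fragment splits a shared configuration so that each class is checked only for its own surplus element. The constructor names and the pairing of each class with an element are hypothetical, chosen for illustration. The class names, element paths, and remaining values are reused from the first example. The XML comments are explanatory only.

	<?xml version="1.0" encoding="UTF-8"?>
	<constructors>
		<!-- Hypothetical pairing: LaurelTree content is checked for one element only -->
		<constructor>
			<name>XMLFilterLaurelTree</name>
			<class>com.ibm.filenet.cse.cascade.xmlfilter.XMLFilterConstructor</class>
			<batchSize>15</batchSize>
			<customConfig name="FileNet">
				<Type>XMLFilter</Type>
				<SymbolicClassName>LaurelTree</SymbolicClassName>
				<ElementsToRemove>/document/item/rawitemdata</ElementsToRemove>
				<CleanupOldDirTime>1</CleanupOldDirTime>
				<SleepIntervalForCleanup>2000</SleepIntervalForCleanup>
			</customConfig>
		</constructor>
		<!-- Hypothetical pairing: OliveTree content is checked for a different element -->
		<constructor>
			<name>XMLFilterOliveTree</name>
			<class>com.ibm.filenet.cse.cascade.xmlfilter.XMLFilterConstructor</class>
			<batchSize>15</batchSize>
			<customConfig name="FileNet">
				<Type>XMLFilter</Type>
				<SymbolicClassName>OliveTree</SymbolicClassName>
				<ElementsToRemove>//font</ElementsToRemove>
				<CleanupOldDirTime>1</CleanupOldDirTime>
				<SleepIntervalForCleanup>2000</SleepIntervalForCleanup>
			</customConfig>
		</constructor>
	</constructors>

Any further classes would follow the same pattern, with one <constructor> element per class and a unique name for each.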

The following XML shows the <constructor> element format with a placeholder, such as filter-name, in place of each required item of information. The details of each item are described in the table that follows the XML.

	<?xml version="1.0" encoding="UTF-8"?>
	<constructors>    
		<constructor>        
			<name>filter-name</name>       
			<class>filter-class</class>        
			<batchSize>filter-batchsize</batchSize>        
			<customConfig name="FileNet">             
				<Type>config-type</Type>            
				<SymbolicClassName>config-classlist</SymbolicClassName>            
				<ElementsToRemove>config-elementlist</ElementsToRemove>            
				<CleanupOldDirTime>config-cleanupage</CleanupOldDirTime>            
				<SleepIntervalForCleanup>config-sleepinterval</SleepIntervalForCleanup>        
			</customConfig>     
		</constructor>
	</constructors> 

Item
    Description

filter-name
    The name that you assign to the XML filter. The name must be unique within constructors.xml.

filter-class
    The name of the filter class. This name must be com.ibm.filenet.cse.cascade.xmlfilter.XMLFilterConstructor.

filter-batchsize
    The maximum number of XML documents that can be sent to the XML filter in one request.

config-type
    The name of the configuration type. The name must be XMLFilter.

config-classlist
    The Content Platform Engine class names of the objects whose XML document content is filtered. Specify the names in a comma-delimited list.

    A class name is not specific to an object store. For example, suppose that you specify LaurelTree as one of the class names. If the LaurelTree class exists in two object stores, the XML content of the LaurelTree objects in both object stores is filtered.

    When an XML document is filtered, the document is examined for the presence of surplus XML elements. An XML document is always filtered in this sense. Any detected surplus XML elements are removed from the XML document before the remainder of the document is indexed. No XML elements are removed from an XML document, however, if no surplus XML elements exist within the document.

    The encoding for an XML document is assumed to be UTF-8 unless otherwise specified in the document.

config-elementlist
    The surplus XML elements that are removed from the XML content. Specify the names in a comma-delimited list. Specify each name in one of the following ways:

    Specific node
        Use single slashes to specify the full path for a surplus element. For example, /zoo/mammal/lion indicates that lion elements are removed only when all of the following conditions are true: the parent element of lion is mammal, the parent element of mammal is zoo, and zoo is the root element of the XML document.

    Pathless node
        Use double slashes to specify no path or a partial path. For example, //mammal/lion indicates that lion elements are removed whenever the parent element of lion is mammal, wherever mammal appears in the document.

config-cleanupage
    The minimum age, in hours, of the filtering work directories that are deleted during cleanup runs. For example, if the value of this item is 1, the XML filtering utility deletes any work directory that is at least one hour old.

    Work directories are created as part of the XML filtering process in the following location:

	JAVA_TEMP_LOCATION/IQQ_server_port_user_id/XMLFilteringUtil

    The server_port is the port of the IBM Content Search Services index server. The user_id is the name of the operating system user account for the index server.

    One work directory is created for each index batch. The name of a work directory is FilteredDir_ followed by a random number.

config-sleepinterval
    The number of milliseconds between work directory cleanup runs. The XML filtering utility deletes work directories during the cleanup runs.
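
To illustrate the two ways of specifying a config-elementlist entry, consider the following hypothetical XML document. The exhibit and statue elements are assumptions added for illustration. The comments mark which lion elements each form would match according to the rules described for specific and pathless nodes.

	<?xml version="1.0" encoding="UTF-8"?>
	<zoo>
		<mammal>
			<!-- Matched by /zoo/mammal/lion and by //mammal/lion -->
			<lion>...</lion>
		</mammal>
		<exhibit>
			<mammal>
				<!-- Matched by //mammal/lion only; the full path here is /zoo/exhibit/mammal/lion -->
				<lion>...</lion>
			</mammal>
			<statue>
				<!-- Matched by neither form; the parent element is not mammal -->
				<lion>...</lion>
			</statue>
		</exhibit>
	</zoo>

With /zoo/mammal/lion in the element list, only the first lion element would be removed before indexing. With //mammal/lion, the first two would be removed.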


Last updated: March 2016

© Copyright IBM Corporation 2016.