    <?xml version="1.0" encoding="UTF-8"?>
    <filenetBridgeConfiguration version="1">
      <server>
        <url>http://localhost:9080/wsi/FNCEWS40DIME</url>
        <username>user</username>
        <password>password</password>
      </server>
      <domain></domain>
      <objectStore>MyObjectStore</objectStore>
      <documentSelection>
        <folders>
          <folder>/docs/analyze</folder>
        </folders>
      </documentSelection>
      <contentMapping name="DocumentContent">
        <textContentPattern encoding="MS932">text/plain</textContentPattern>
        <textContentPattern encoding="UTF-8">text/html</textContentPattern>
        <binaryContentPattern>^application/pdf$</binaryContentPattern>
      </contentMapping>
      <propertyMappings>
        <propertyMapping>
          <symbolicName>DocumentTitle</symbolicName>
          <mappingTarget>
            <title />
          </mappingTarget>
        </propertyMapping>
        <propertyMapping>
          <symbolicName>DateCreated</symbolicName>
          <mappingTarget>
            <date />
          </mappingTarget>
        </propertyMapping>
        <propertyMapping>
          <symbolicName>Comment</symbolicName>
          <mappingTarget>
            <text name="Comment" />
          </mappingTarget>
        </propertyMapping>
      </propertyMappings>
      <outputATML>
        <basename>filenet_data</basename>
        <maxDocuments>2000</maxDocuments>
      </outputATML>
      <categoryRecord property="Category">
        <serialOperation ignoreError="false" />
      </categoryRecord>
    </filenetBridgeConfiguration>

This configuration file specifies the following information:

- The web services URL of the FileNet P8 server, and the user name and password for that server.
- The object store to read from (MyObjectStore).
- The folder from which documents are selected (/docs/analyze).
- The content mapping: text/plain content is read as MS932 text, text/html content is read as UTF-8 text, and application/pdf content is treated as binary. The extracted text is named DocumentContent.
- The property mappings: DocumentTitle is mapped to the title, DateCreated to the date, and Comment to a text named Comment.
- The ATML output settings: a basename of filenet_data and a maxDocuments value of 2000.
- The categoryRecord settings, which name the Category property.

Run the bridge with this configuration file to generate the ATML files:

    takmi_filenet2atml config.xml

After the text analysis of the resulting ATML files, you can write back the category information from the generated MIML file as follows:

    takmi_miml2filenet config.xml filenet_data_XXXX.miml

On Linux and AIX, add .sh to the command names.

The server element contains the following elements:

| Element | Description |
|---|---|
| url | URL of the web services interface of the FileNet P8 server. |
| username | User name for the server. |
| password | Password for the server. |

The following elements and attributes control document selection:

| Element | Attribute | Description | Cardinality or type |
|---|---|---|---|
| folders | | Contains one or more folder elements. | 0..1 |
| folder | | Path of the target FileNet folder. | 0..n |
| | recursive | Whether to select documents recursively in subfolders. | Boolean, defaults to true |
| querySQL | | Custom SQL query that selects FileNet documents. The query should include the Id column in its SELECT list so that document IDs can be generated. See the FileNet manuals for details. | 0..1 |
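
As an illustration of the document selection settings, the following fragment is a minimal sketch of two alternatives. The path /docs/archive and the query text are hypothetical. Two structural points are assumptions read from the table rather than confirmed by the sample configuration: that recursive is an attribute of folder, and that querySQL is a direct child of documentSelection. Check the query syntax against the FileNet manuals before using it.

```xml
<!-- Alternative 1: select documents by folder. -->
<documentSelection>
  <folders>
    <!-- recursive defaults to true, so subfolders are included. -->
    <folder>/docs/analyze</folder>
    <!-- Hypothetical folder: only documents directly in it are selected
         (assumes recursive is an attribute of folder). -->
    <folder recursive="false">/docs/archive</folder>
  </folders>
</documentSelection>

<!-- Alternative 2: select documents by query
     (assumes querySQL is a direct child of documentSelection). -->
<documentSelection>
  <!-- The SELECT list includes Id so that document IDs can be generated. -->
  <querySQL>SELECT Id FROM Document WHERE IsCurrentVersion = TRUE</querySQL>
</documentSelection>
```

Only one documentSelection element would appear in a real configuration file. The two alternatives are shown together here for comparison.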

The following elements and attributes control content mapping:

| Element | Attribute | Description | Cardinality or type |
|---|---|---|---|
| contentMapping | | Top-level element of the content mapping configuration. This element contains all the elements and attributes that are described in this table. | 0..1 |
| | name | Name of the text that is created during content extraction. This name is shown in the MINER application. | string |
| | maxLength | Maximum length of the extracted text. The text cannot be longer than this value. | integer, defaults to 65535 |
| contentReplacement | | Replaces specified characters in the retrieved content by using Java™ regular expression syntax. | 0..1 |
| | pattern | Regular expression pattern to be replaced. | string |
| | replacement | Replacement characters. This attribute can contain references to capturing groups in the pattern attribute. | string |
| textContentPattern | | Regular expression pattern for the MIME types of the files to be treated as text files. | 0..n |
| | encoding | Encoding of the text file. | string, defaults to "UTF-8" |
| binaryContentPattern | | Regular expression pattern for the MIME types of the files to be treated as binary files. | 0..n |
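
To show how these content mapping settings might fit together, here is a minimal sketch that is not taken from the product documentation. The maxLength value and the replacement rule are hypothetical, and writing contentReplacement as an empty child element of contentMapping with pattern and replacement attributes is an assumption based on the table. The replacement uses `$1`, `$2`, and `$3`, which is how Java regular expressions refer to capturing groups.

```xml
<contentMapping name="DocumentContent" maxLength="20000">
  <!-- Hypothetical rule: rewrite dates such as 2010/01/31 as 2010-01-31.
       $1, $2 and $3 refer to the three capturing groups in pattern. -->
  <contentReplacement pattern="(\d{4})/(\d{2})/(\d{2})" replacement="$1-$2-$3" />
  <!-- Treat plain text and HTML as UTF-8 text. -->
  <textContentPattern encoding="UTF-8">^text/(plain|html)$</textContentPattern>
  <!-- Treat PDF files as binary content. -->
  <binaryContentPattern>^application/pdf$</binaryContentPattern>
</contentMapping>
```

The patterns are anchored with `^` and `$` so that they describe the whole MIME type, which gives the same result whether the bridge matches the full string or only part of it.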

The following elements and attributes control property mapping:

| Element | Attribute | Description | Cardinality or type |
|---|---|---|---|
| propertyMappings | | Top-level element of the property mapping configuration. This element contains zero or more propertyMapping elements. | 0..1 |
| propertyMapping | | Represents one property mapping. It specifies a FileNet document property as the mapping source and one or more mapping targets as child elements. The child elements are described in the following rows of this table. | 0..n |
| symbolicName | | FileNet property name that is the source of the mapping. The normalized symbolic name is used rather than the display name. | 1 |
| mappingTarget | | The target of the mapping. Because one mapping can have multiple targets, this element can contain more than one of the child elements that are described in the following rows. | 1 |
| standardFeature | | Standard feature of IBM Content Analyzer as the mapping target. | 0..1 |
| | category | Category path of the standard feature. | string |
| dynamicPath | | Allows flexible mapping to the category path based on the property value. | 0..n |
| | value | FileNet property value to be matched. If a property value matches this configuration, a standard feature is generated with the specified category path. This value becomes the value of the generated standard feature. | string |
| | category | Category path of the standard feature. | string |
| text | | Maps a FileNet property value to a text in ATML documents. Unlike the other mapping targets, the text is subject to text analysis. Because an ATML document can have multiple texts, one or more FileNet properties can be mapped to ATML texts. | 0..1 |
| | name | Name of the text that is shown in the MINER application. | string |
| date | | Maps a FileNet property value to the special date property in ATML documents. The property type must be DateTime or String (in the IBM Content Analyzer string format for dates). | 0..1 |
| title | | Maps a FileNet property value to the title in ATML documents. | 0..1 |
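
The following fragment is a minimal sketch of two mappings that the sample configuration does not show: one property with two targets, and one property mapped to a standard feature. The choice of the Creator property, the text name Title, and the category path `.creator` are hypothetical. The category path format in particular is an assumption and should be checked against the category definitions of your IBM Content Analyzer installation.

```xml
<propertyMappings>
  <!-- One source property with two targets: the document title,
       and a text named "Title" that is also analyzed. -->
  <propertyMapping>
    <symbolicName>DocumentTitle</symbolicName>
    <mappingTarget>
      <title />
      <text name="Title" />
    </mappingTarget>
  </propertyMapping>
  <!-- Hypothetical mapping of a property to a standard feature.
       The category path ".creator" is an assumed example value. -->
  <propertyMapping>
    <symbolicName>Creator</symbolicName>
    <mappingTarget>
      <standardFeature category=".creator" />
    </mappingTarget>
  </propertyMapping>
</propertyMappings>
```

The first mapping relies on the rule that a single mappingTarget can hold several target elements, so the same property value is used as the title and is also included in text analysis.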