IBM Content Analyzer Installation Guide
Edition Notice
This edition applies to version 8, release 4 of IBM® Content™ Analyzer and to all subsequent releases and modifications until otherwise indicated in new editions.

This document contains proprietary information of IBM. This proprietary information is provided in accordance with the license conditions and is protected by copyright. Information contained in this document provides no warranties whatsoever for any products. Also, no descriptions provided in this document should be interpreted as product warranties. Depending on the system environment, the yen symbol may be displayed as the backslash symbol, or the backslash symbol may be displayed as the yen symbol.

© Copyright International Business Machines Corporation 2007, 2008. All rights reserved.

US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

1 Overview
This document describes how to install IBM Content Analyzer. The following topics describe the product modules and system requirements.
1.1 System Configuration and Operating Environment
IBM Content Analyzer consists of the following modules.

Module             Details of processing
Preparation        This module handles batch processing such as language processing, information extraction, and index creation.
Text Miner         This module is an enterprise application that you can use to perform various types of analysis.
Alerting System    This module is an enterprise application that you can use to specify settings for alerts.
Dictionary Editor  This module is an enterprise application that you can use to create and maintain dictionaries.
DOCAT              This module, which is available only in Japanese, is an enterprise application that you can use to specify settings for category extraction.
Manual             This module displays online documentation for IBM Content Analyzer.
Browser            This module is connected to the IBM Content Analyzer enterprise applications and displays the mining results.



Although it is possible to operate everything on one computer, for optimum performance and load distribution, you should install the modules on separate computers, as follows:
  • Preparation
  • Enterprise applications (Text Miner, Alerting System, Dictionary Editor, DOCAT, and Manual)
  • Browser
If all modules are installed and run on one computer, the operating environment must conform to the following requirements:

Operating system: Microsoft® Windows® 2003
  Hardware:
    Intel® 32 bit or AMD 32 bit CPU
    6 GB memory (minimum: 2 GB)
    60 GB hard disk
  Software:
    Microsoft Windows Service Pack 2
    IBM WebSphere® Application Server 6.1.0
    IBM Java Runtime Environment, Java 2 Technology Edition, Version 5.0

Operating system: AIX® 5.3
  Hardware:
    Power 64 bit CPU
    6 GB memory (minimum: 2 GB)
    60 GB hard disk
  Software:
    IBM WebSphere® Application Server 6.1.0
    IBM Java Runtime Environment, Java 2 Technology Edition, Version 5.0

Operating system: Red Hat Enterprise Linux 4
  Hardware:
    Japanese: 32 bit x86 or 64 bit x86 CPU
    English: 32 bit x86 or 64 bit x86 CPU
    6 GB memory (minimum: 2 GB)
    60 GB hard disk
  Software:
    IBM WebSphere® Application Server 6.1.0
    IBM Java Runtime Environment, Java 2 Technology Edition, Version 5.0

Note: Microsoft Internet Explorer 6 is required as a client.
1.2 Environment Variables
The environment variables used in IBM Content Analyzer are as follows.

Environment variable
    Meaning

TAKMI_HOME
    Specifies the directory in which the IBM Content Analyzer modules are located.
      • This variable is read when the configuration file is loaded during operations.

PATH
    Must include the bin directory under TAKMI_HOME.
    When analyzing Japanese, PATH must also include the uima/components/jsa/lib directory under TAKMI_HOME.
      • This variable is read when the native code used in language processing is called.

TAKMI_HOME (WebSphere Application Server system property)
    Specifies the same value as the system environment variable TAKMI_HOME.
      • This property is read by the enterprise applications.

uima.home (WebSphere Application Server system property)
    Specifies the uima directory under TAKMI_HOME.
      • This property is read by the enterprise applications.

ws.ext.dirs (WebSphere Application Server system property)
    Specifies the following directories under TAKMI_HOME: lib, uima/lib, uima/components/TAKMI_NLP/lib, uima/components/jsa/lib, and uima/components/LW/lib.
      • Note that ws.ext.dirs can also be set by the WAS_EXT_DIRS setting in setupCmdLine.bat on WebSphere Application Server.
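
The TAKMI_HOME and PATH requirements described above can be checked before the product is started. The following Python sketch is only an illustration and is not part of IBM Content Analyzer; the function name check_takmi_environment and its japanese parameter are hypothetical. It compares PATH entries as plain strings, so on Windows a difference in letter case would be reported as a missing entry.

```python
import os


def check_takmi_environment(env=None, japanese=False):
    """Return a list of problems found in TAKMI_HOME and PATH (empty if none)."""
    env = os.environ if env is None else env
    home = env.get("TAKMI_HOME")
    if not home:
        return ["TAKMI_HOME is not set"]

    # Normalize the entries so that trailing separators do not affect the comparison.
    entries = {os.path.normpath(p) for p in env.get("PATH", "").split(os.pathsep) if p}

    # The bin directory is always required; jsa/lib is required for Japanese analysis.
    required = [os.path.join(home, "bin")]
    if japanese:
        required.append(os.path.join(home, "uima", "components", "jsa", "lib"))

    problems = []
    for directory in required:
        if os.path.normpath(directory) not in entries:
            problems.append("PATH does not contain " + directory)
    return problems
```

Calling check_takmi_environment() with no arguments inspects the environment of the current process; passing a dictionary makes it possible to test a planned configuration before applying it.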
1.3 Browser Settings
Microsoft Internet Explorer 6 is required as a client.
Set the browser in accordance with the security policy of your environment.

Pop-up windows must be enabled while you use the IBM Content Analyzer enterprise applications. Depending on the browser settings, a pop-up blocker might suppress these windows; if so, disable the pop-up blocker so that you can use the enterprise applications.
2 Installation Procedure
This topic describes how to install IBM Content Analyzer.
2.1 Installation
2.1.1 Using GUI
To launch the installation program in GUI mode, run the takmisetupwin32.exe command on Windows, the takmisetupaix.bin command on AIX, or the takmisetuplinux.bin command on Linux.
The following example shows the installation on Windows.
Command:
takmisetupwin32.exe
  • The Welcome screen is displayed.



    Click the Next button to proceed.

  • The program license agreement is displayed.



    Specify that you accept the license agreement, and click the Next button to proceed.

  • The screen for specifying the directory for installing IBM Content Analyzer is displayed.



    Specify the directory and click the Next button.

  • The screen for selecting an installation type is displayed.



    Select an installation type and click the Next button.

  • If you selected Customize as the installation type, the screen for selecting the features to be installed is displayed.



    Select the features to be installed, and click the Next button.

  • The screen to confirm the features to be installed is displayed.



    Click the Install button to start the installation.

  • The screen to show the results of installing the product is displayed.



    Click the Finish button to complete the installation.

2.1.2 Using console mode
To launch the installation program in console mode, run the takmisetupwin32.exe command on Windows, the takmisetupaix.bin command on AIX, or the takmisetuplinux.bin command on Linux with the option "-console".
The following example shows the installation on Windows.
Command:
takmisetupwin32.exe -console
  • The Welcome screen is displayed.



    Press Enter to proceed.

  • The program license agreement is displayed.



    Specify 1 to accept the license agreement, and press Enter to proceed.

  • The screen for specifying the directory for installing IBM Content Analyzer is displayed.



    Specify the directory and press Enter. Then press 1 to continue the installation.

  • The screen for selecting an installation type is displayed.



    Select an installation type and press 0 to continue the installation.

  • If you selected Customize as the installation type, the screen for selecting the features to be installed is displayed.



    Select the features to be installed, and press 0 to continue the installation.

  • The screen to confirm the features to be installed is displayed.



    Press 1 to start the installation.

  • The screen to show the results of installing the product is displayed.



    Press 3 to complete the installation.

2.2 Updating the Configuration File
Update the following file in the directory where IBM Content Analyzer is installed:

   conf/global_config.xml

Specify the database entry within the database_entry tag as follows:

    <database_entry name="SAMPLE" path_type="absolute" path="C:/Program Files/IBM/takmi/databases/sample"/>

where

   name = name of database (this name is shown on the database selection screen and you can set it to any name).
   path_type = type of the path to the database directory (relative or absolute).
   path = the path to the database directory (use the slash character as a directory separator).

If a sample database was selected at the time of installation, the sample database entry is added.
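
After editing conf/global_config.xml, the database entries can be checked against the rules listed above. The following Python sketch is an illustration under stated assumptions: the function name check_database_entries is hypothetical, and because the element that encloses database_entry is not shown in this guide, the sketch searches the whole document for database_entry elements.

```python
import xml.etree.ElementTree as ET


def check_database_entries(config_xml):
    """Return a list of (name, problem) pairs for the database_entry elements."""
    problems = []
    root = ET.fromstring(config_xml)
    # iter() finds database_entry elements at any depth, so the name of the
    # enclosing element does not need to be known.
    for entry in root.iter("database_entry"):
        name = entry.get("name")
        if entry.get("path_type") not in ("relative", "absolute"):
            problems.append((name, "path_type must be relative or absolute"))
        if "\\" in entry.get("path", ""):
            problems.append((name, "use the slash character as the directory separator"))
    return problems
```

An empty list means that every entry uses a valid path_type and slash-separated paths. The sketch does not check whether the database directory exists.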
2.3 Uninstallation
The uninstaller is created in the _uninst directory in the directory where IBM Content Analyzer is installed.
Because the IBM Content Analyzer library is loaded by the WebSphere Application Server extension class loader, stop WebSphere Application Server before you launch the uninstallation program if you want to uninstall the "Program Files" feature.

2.3.1 Using GUI
To launch the uninstallation program in GUI mode, run the uninstaller.exe command on Windows or the uninstaller.bin command on AIX and Linux.
The following example shows the uninstallation on Windows.
Command:
uninstaller.exe
  • The Welcome screen is displayed.



    Click the Next button to proceed.

  • The screen for selecting the features to be uninstalled is displayed.



    Select the features to be uninstalled, and click the Next button.

  • The screen to confirm the features to be uninstalled is displayed.



    Click the Uninstall button to start the uninstallation.

  • The screen to show the results of uninstalling the product is displayed.



    Click the Finish button to complete the uninstallation.

2.3.2 Using console mode
To launch the uninstallation program in console mode, run the uninstaller.exe command on Windows or the uninstaller.bin command on AIX and Linux with the option "-console".
The following example shows the uninstallation on Windows.
Command:
uninstaller.exe -console
  • The Welcome screen is displayed.



    Press 1 to proceed.

  • The screen for selecting the features to be uninstalled is displayed.



    Select the features to be uninstalled, and press 0 to continue the uninstallation.

  • The screen to confirm the features to be uninstalled is displayed.



    Press 1 to start the uninstallation.

  • The screen to show the results of uninstalling the product is displayed.



    Press 3 to complete the uninstallation.
3 Making Settings for WebSphere Application Server
This topic describes how to specify settings for WebSphere Application Server.
3.1 Running the wsadmin Command
Before running the wsadmin command on AIX or Linux, confirm that the uima/components/jsa/lib directory under TAKMI_HOME is included in the LIBPATH environment variable (on AIX) or the LD_LIBRARY_PATH environment variable (on Linux). If it is not, log out, log in again with the following command, and then check the variable as shown in the examples:
    su - [username]

AIX:
    echo $LIBPATH
    /opt/IBM/takmi/uima/components/jsa/lib

Linux:
    echo $LD_LIBRARY_PATH
    /opt/IBM/takmi/uima/components/jsa/lib
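
The check shown in the echo examples can also be scripted. The following Python sketch is an illustration only; the function name library_path_contains is hypothetical. It treats the variable as a colon-separated list, which is the convention for LIBPATH and LD_LIBRARY_PATH on AIX and Linux.

```python
import os
import posixpath


def library_path_contains(directory, variable, env=None):
    """Return True if the path-list variable (for example LIBPATH) contains directory."""
    env = os.environ if env is None else env
    wanted = posixpath.normpath(directory)
    # Split on ":" and normalize each entry so that a trailing slash is ignored.
    entries = env.get(variable, "").split(":")
    return any(posixpath.normpath(entry) == wanted for entry in entries if entry)
```

For example, library_path_contains("/opt/IBM/takmi/uima/components/jsa/lib", "LD_LIBRARY_PATH") checks the environment of the current process on Linux.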

In the ear directory in the directory where IBM Content Analyzer is installed, run the command appropriate for your operating environment:

Windows:
    "C:\Program Files\IBM\WebSphere\AppServer\bin\wsadmin.bat" -f .\TAKMI.jacl

AIX or Linux:
    /usr/IBM/WebSphere/AppServer/bin/wsadmin.sh -f ./TAKMI.jacl

The TAKMI.jacl file makes settings for the application server "server1". If you want to make settings for a different server, edit the TAKMI.jacl file accordingly before you run the command.

This command automatically runs the processing described in 3.2 Making Settings for the Application Server and 3.3 Installation of Enterprise Applications.

If global security is enabled in WebSphere Application Server, you must specify -user and -password parameters when you run the wsadmin command.

See the WebSphere Application Server documentation for details on the wsadmin command.
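
When the wsadmin command is run from a script, it can help to assemble the arguments as a list so that paths containing spaces need no extra quoting. The following Python sketch is an illustration only; the function name build_wsadmin_command is hypothetical, and it uses only the -f, -user, and -password options mentioned in this topic.

```python
def build_wsadmin_command(wsadmin, script, user=None, password=None):
    """Return the argument list for running a JACL script with wsadmin."""
    command = [wsadmin, "-f", script]
    # -user and -password are needed only when global security is enabled.
    if user is not None and password is not None:
        command += ["-user", user, "-password", password]
    return command
```

The returned list can be passed to subprocess.run with the cwd argument set to the ear directory of the IBM Content Analyzer installation, so that the relative path to TAKMI.jacl resolves as in the commands above.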
3.2 Making Settings for the Application Server
Follow the procedure described below to make settings for the application server.
See 1.2 Environment Variables for the meanings of the environment variables used in IBM Content Analyzer.
  • Launch WebSphere Application Server and log in to the administrative console.
  • Select Server > Application Server > server1 > Java and Process Management > Process Definition > Java Virtual Machine > Custom Properties.
  • Click the New button, type TAKMI_HOME in "Name", type the IBM Content Analyzer installation directory in "Value" (the default directory in Windows is C:\Program Files\IBM\takmi, and the default directory in AIX and Linux is /opt/IBM/takmi), and then click OK.
  • Click the New button, type uima.home in "Name", type the UIMA installation directory in "Value" (the default directory in Windows is C:\Program Files\IBM\takmi\uima and the default directory in AIX and Linux is /opt/IBM/takmi/uima), and then click OK.
  • Click the New button, type ws.ext.dirs in "Name", type the following directories under TAKMI_HOME in "Value": lib, uima/lib, uima/components/TAKMI_NLP/lib, uima/components/jsa/lib, and uima/components/LW/lib. Separate the directories with a semicolon on Windows or a colon on AIX and Linux, as in the defaults shown below, and then click OK.

    The default in Windows is:
       C:\Program Files\IBM\takmi\lib;C:\Program Files\IBM\takmi\uima\lib;C:\Program Files\IBM\takmi\uima\components\TAKMI_NLP\lib;C:\Program Files\IBM\takmi\uima\components\jsa\lib;C:\Program Files\IBM\takmi\uima\components\LW\lib

    The default in AIX and Linux is:
       /opt/IBM/takmi/lib:/opt/IBM/takmi/uima/lib:/opt/IBM/takmi/uima/components/TAKMI_NLP/lib:/opt/IBM/takmi/uima/components/jsa/lib:/opt/IBM/takmi/uima/components/LW/lib

  • Click Save to save the changes in the master configuration.
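
Because the ws.ext.dirs value is long and easy to mistype, it can be generated from the installation directory. The following Python sketch is an illustration only; the names LIB_DIRS and build_ws_ext_dirs are hypothetical. It joins the five library directories listed above with a semicolon for Windows or a colon for AIX and Linux.

```python
import ntpath
import posixpath

# Library directories relative to TAKMI_HOME, in the order listed in this topic.
LIB_DIRS = [
    ("lib",),
    ("uima", "lib"),
    ("uima", "components", "TAKMI_NLP", "lib"),
    ("uima", "components", "jsa", "lib"),
    ("uima", "components", "LW", "lib"),
]


def build_ws_ext_dirs(takmi_home, windows):
    """Return the ws.ext.dirs value for the given installation directory."""
    # ntpath and posixpath give the same result regardless of the platform
    # on which this sketch is run.
    path_module = ntpath if windows else posixpath
    separator = ";" if windows else ":"
    return separator.join(path_module.join(takmi_home, *parts) for parts in LIB_DIRS)
```

Called with the default installation directories, the function reproduces the default values shown above for Windows and for AIX and Linux.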
3.3 Installation of Enterprise Applications
Install the .ear files in the IBM Content Analyzer installation directory to WebSphere Application Server.

    ear/TAKMI_MINER.ear
    ear/TAKMI_ALERT.ear
    ear/TAKMI_DIC.ear
    ear/TAKMI_DOCAT.ear (included only in the Japanese version)
    ear/TAKMI_MANUAL.ear


  • Launch WebSphere Application Server and log in to the administrative console.
  • Select Applications > Install New Application.
  • In the "Preparing for the application installation" screen, select the .ear file to be installed, and click the Next button.
  • Click the Next button in the "Preparing for the application installation" screen.
  • Click the Next button in the "Step 1: Select installation options" screen.
  • Click the Next button in the "Step 2: Map modules to servers" screen.
  • Click the Next button in the "Step 3: Map virtual hosts for Web modules" screen.
  • For files other than TAKMI_MANUAL.ear, map the security roles to users/groups in the "Step 4: Map security roles to users/groups" screen, and click the Next button.
    See 3.4 Security Settings for the security settings of WebSphere Application Server.
  • In the "Step 5: Summary" screen (in the "Step 4: Summary" screen for TAKMI_MANUAL.ear), click the Finish button.
  • Ensure that the application is successfully installed, and click Save to save it in the master configuration.
See the WebSphere Application Server documentation for details on how to install enterprise applications.

After you install the enterprise applications, restart WebSphere Application Server.

The URL of each enterprise application is as follows.

Module             URL
Text Miner         http://hostname:port/TAKMI_MINER/
Alerting System    http://hostname:port/TAKMI_ALERT/
Dictionary Editor  http://hostname:port/TAKMI_DIC/
DOCAT              Connect from Text Miner.
Manual             http://hostname:port/TAKMI_MANUAL/

For example, the URL for accessing Text Miner launched on the local host through the default server1 of WebSphere Application Server is:

    http://localhost:9080/TAKMI_MINER/

Check the host name and port number with the WebSphere Application Server administrator as they vary with the environment.
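
Once the host name and port number are known, the URLs in the table above can be generated for every module at once. The following Python sketch is an illustration only; the names CONTEXT_ROOTS and application_urls are hypothetical. DOCAT is omitted because it is opened from Text Miner rather than through its own URL.

```python
# Context roots taken from the URL table in this topic.
CONTEXT_ROOTS = {
    "Text Miner": "TAKMI_MINER",
    "Alerting System": "TAKMI_ALERT",
    "Dictionary Editor": "TAKMI_DIC",
    "Manual": "TAKMI_MANUAL",
}


def application_urls(hostname, port):
    """Return a dictionary that maps each module name to its URL."""
    return {
        module: "http://%s:%d/%s/" % (hostname, port, context_root)
        for module, context_root in CONTEXT_ROOTS.items()
    }
```

For the local host and port 9080, the entry for Text Miner is the example URL given above.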
3.4 Security Settings
Make the following security settings to protect the installed enterprise applications through user authentication.
  • Launch WebSphere Application Server and log in to the administrative console.
  • Select Security > Global Security.
  • Configure the user registry that you want to specify in "Active user registry", and set its properties.
  • Check Enable global security under General properties. Enforce Java 2 security is checked automatically at the same time; be sure to uncheck it.
  • Select the configured user registry type from the Active user registry option, and click OK.
  • Select Applications > Enterprise Applications.
  • Select an installed enterprise application.
  • Click Map security roles to users/groups under Additional Properties, map the security roles to Users/Groups, and click OK.
    The security role for each enterprise application is as follows:

    Enterprise application  Security role
    TAKMI MINER_EAR         takmi_miner
    TAKMI ALERT_EAR         takmi_alerting_system
    TAKMI DIC_EAR           takmi_dictionary_editor
    TAKMI DOCAT_EAR         takmi_docat

  • Click Save to save changes in the master configuration.
  • Restart WebSphere Application Server.
See the WebSphere Application Server documentation for details on specifying security settings.
4 Operation Check
This topic describes how to check the operation of IBM Content Analyzer.
4.1 Checking the Operation of Enterprise Applications
Restart WebSphere Application Server and access each of the enterprise applications through the browser.
  • Text Miner
    Access http://hostname:port/TAKMI_MINER/ (example: http://localhost:9080/TAKMI_MINER/) and verify that the following screen is displayed:

    Screen shot of TEXT_MINER

    For information on how to use Text Miner, click Manual at the top right of the Text Miner screen and refer to the online instruction manual.

  • Alerting System
    Access http://hostname:port/TAKMI_ALERT/ (example: http://localhost:9080/TAKMI_ALERT/) and verify that the following screen is displayed:

    Screen shot of ALERTING_SYSTEM

  • Dictionary Editor
    Access http://hostname:port/TAKMI_DIC/ (example: http://localhost:9080/TAKMI_DIC/) and verify that the following screen is displayed:

    Screen shot of DICTIONARY_EDITOR

4.2 Checking the Operation of the Preparatory Processing
By using the sample database, check that the preparatory processing (data conversion, language processing, and indexing) can be run.
See the Operation Manual for details on the preparatory processing.
  • Stop WebSphere Application Server.
  • In the databases/INDEXED_DATA_SAMPLE_EN/bin directory, which is in the IBM Content Analyzer installation directory, run the takmi_preprocess_all.bat command (when using Windows) or the takmi_preprocess_all.sh command (when using AIX).
  • Follow the instructions on the screen and enter Y.



  • Data deletion, data conversion, language processing, and indexing are run in this order.
  • Follow the instructions on the screen and press any key to complete the processing.



  • Launch WebSphere Application Server.
  • Use TAKMI_MINER to check the newly created index for the INDEXED_DATA_SAMPLE_EN database.
5 Migration
This topic describes how to migrate IBM Content Analyzer from an earlier version to a new version.
5.1 Uninstalling the Old Version and Installing the New Version
Before you install the new version, uninstall the old version of IBM Content Analyzer (named OmniFind Analytics Edition in V8.4.1 and earlier) from the system.

To reuse the databases, do not delete the file "global_config.xml" when you uninstall the old version and install the new version. Additional actions are required to reuse the databases when you migrate from OmniFind Analytics Edition V8.4 to OmniFind Analytics Edition V8.4.1, or from OmniFind Analytics Edition V8.4.1 to OmniFind Analytics Edition V8.4.1 Hotfix or IBM Content Analyzer V8.4.2.

After you install the new version, reconfigure the settings of WebSphere Application Server to update the enterprise applications.
5.2 Actions Required to Reuse the Databases on OmniFind Analytics Edition V8.4.1
When you migrate OmniFind Analytics Edition from version 8.4 to version 8.4.1, you must modify the file "database_config.xml" to reuse the databases. Under the database directory, modify the file

    conf/database_config.xml

as follows.
  1. Add the following <mining_function_entry> tag to the <mining_function_entries> tag in the common parameters.

        <mining_function_entry name="com.ibm.takmi.mining.topview.TopViewFunction" impl="com.ibm.takmi.impl.std.mining.topview.StandardTopViewFunction"/>
  2. Move the two tags <category_entries> and <date_format_entries> from the standard parameters to the common parameters.
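
The two steps above can also be applied with a script when many databases must be migrated. The following Python sketch is an illustration under stated assumptions and is not an IBM-supplied tool: the function name migrate_database_config is hypothetical, the standard parameters are taken to be the <impl name="standard"> element, and the moved tags are placed directly before that element. The sketch requires Python 3.8 or later to keep XML comments, and it assumes that the <mining_function_entries> tag is present. Back up database_config.xml before replacing it with generated output.

```python
import xml.etree.ElementTree as ET

TOP_VIEW_NAME = "com.ibm.takmi.mining.topview.TopViewFunction"
TOP_VIEW_IMPL = "com.ibm.takmi.impl.std.mining.topview.StandardTopViewFunction"
MOVED_TAGS = ("category_entries", "date_format_entries")


def migrate_database_config(xml_text):
    """Return database_config.xml text with the two V8.4 to V8.4.1 changes applied."""
    # insert_comments=True keeps the XML comments in the tree (Python 3.8 or later).
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    root = ET.fromstring(xml_text, parser=parser)

    # Step 1: add the TopViewFunction entry unless it is already present.
    mining = root.find("mining_function_entries")
    names = [entry.get("name") for entry in mining.findall("mining_function_entry")]
    if TOP_VIEW_NAME not in names:
        new_entry = ET.Element("mining_function_entry",
                               {"name": TOP_VIEW_NAME, "impl": TOP_VIEW_IMPL})
        mining.insert(0, new_entry)

    # Step 2: move the two tags from <impl name="standard"> to the common level.
    impl = root.find("impl[@name='standard']")
    if impl is not None:
        for tag in MOVED_TAGS:
            element = impl.find(tag)
            if element is not None:
                impl.remove(element)
                # Assumption: place the tag directly before the <impl> element.
                root.insert(list(root).index(impl), element)

    # The XML declaration is not emitted here; add it when writing the file.
    return ET.tostring(root, encoding="unicode")
```

Running the function a second time on its own output makes no further changes, so it is safe to apply to a file that has already been migrated.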
The following sample shows how to modify the file "database_config.xml" of the database INDEXED_DATA_SAMPLE_EN, which was installed with OmniFind Analytics Edition V8.4, so that the database can be reused on OmniFind Analytics Edition V8.4.1.

  • The database_config.xml on OmniFind Analytics Edition V8.4

    <?xml version="1.0" encoding="utf-8"?>
    <database_config>
    <!-- common -->
    <params>
    <param name="language" value="en"/>
    <param name="date_category_yyyymmdd" value=".date/yyyymmdd"/>
    <param name="date_category_yyyy" value=".date/yyyy"/>
    <param name="date_category_yyyymm" value=".date/yyyymm"/>
    <param name="date_category_yyyyww" value=".date/yyyyww"/>
    <param name="date_category_day_of_week" value=".date/dow"/>
    <param name="date_category_day_of_month" value=".date/dd"/>
    <param name="date_year_start_month" value="1"/>
    <param name="max_doc_length_to_display" value="10000"/>
    <param name="max_keyword_length" value="255"/>
    <param name="max_category_name_length" value="63"/>
    <param name="max_category_path_length" value="255"/>
    <param name="max_category_depth" value="20"/>
    <!-- parameters for NLP -->
    <param name="max_doc_length_for_nlp" value="65535"/>
    <param name="max_id_length" value="255"/>
    <param name="max_title_length" value="5000"/>
    <param name="max_text_name_length" value="63"/>
    </params>
    <application_entries>
    <application_entry name="Dictionary" impl="com.ibm.takmi.impl.std.dic.edit.config.StandardDictionaryFactoryImpl"/>
    <application_entry name="Indexer" impl="com.ibm.takmi.impl.std.idx.config.StandardIndexerFactoryImpl"/>
    <application_entry name="AlertingSystem" impl="com.ibm.takmi.impl.std.alerting.config.StandardAlertingFactoryImpl"/>
    <application_entry name="NLPResource" impl="com.ibm.takmi.impl.std.nlprsc.config.StandardNLPResourceFactoryImpl"/>
    </application_entries>
    <search_function_entries>
    <search_function_entry name="com.ibm.takmi.impl.common.search.CategorySearchFunction" impl="com.ibm.takmi.impl.std.search.StandardCategorySearchFunction"/>
    <search_function_entry name="com.ibm.takmi.impl.common.search.KeywordSearchFunction" impl="com.ibm.takmi.impl.std.search.StandardKeywordSearchFunction"/>
    <search_function_entry name="com.ibm.takmi.impl.common.search.NumberSearchFunction" impl="com.ibm.takmi.impl.std.search.StandardNumberSearchFunction"/>
    <search_function_entry name="com.ibm.takmi.impl.common.search.InvalidSearchFunction" impl="com.ibm.takmi.impl.std.search.StandardInvalidSearchFunction"/>
    </search_function_entries>
    <mining_function_entries>
    <mining_function_entry name="com.ibm.takmi.mining.doclistview.DocListViewFunction" impl="com.ibm.takmi.impl.std.mining.doclistview.StandardDocListViewFunction"/>
    <mining_function_entry name="com.ibm.takmi.mining.iddocview.IdDocViewFunction" impl="com.ibm.takmi.impl.std.mining.iddocview.StandardIdDocViewFunction"/>
    <mining_function_entry name="com.ibm.takmi.mining.categoryview.CategoryViewFunction" impl="com.ibm.takmi.impl.std.mining.categoryview.StandardCategoryViewFunction"/>
    <mining_function_entry name="com.ibm.takmi.mining.timeseriesview.TimeSeriesViewFunction" impl="com.ibm.takmi.impl.std.mining.timeseriesview.StandardTimeSeriesViewFunction"/>
    <mining_function_entry name="com.ibm.takmi.mining.topicview.TopicViewFunction" impl="com.ibm.takmi.impl.std.mining.topicview.StandardTopicViewFunction"/>
    <mining_function_entry name="com.ibm.takmi.mining.twodmapview.TwoDMapViewFunction" impl="com.ibm.takmi.impl.std.mining.twodmapview.StandardTwoDMapViewFunction"/>
    </mining_function_entries>
    <frequently_used_category_entries>
    <frequently_used_category_entry path=".tkm_en_base_word"/>
    <frequently_used_category_entry path=".tkm_en_base_phrase"/>
    </frequently_used_category_entries>
    <infrequently_used_category_entries>
    <infrequently_used_category_entry path=".date"/>
    </infrequently_used_category_entries>
    <!-- standard -->
    <impl name="standard">
    <params>
    <param name="number_of_groups" value="3"/>
    <param name="number_of_keywords_per_group" value="1000000"/>
    </params>
    <category_entries>
    <!-- Specifies subroot categories for system-reserved categories. -->
    <category_entry name="reserved_by_system" value=".date"/>
    <category_entry name="reserved_by_system" value=".date/dd"/>
    <category_entry name="reserved_by_system" value=".date/dow"/>
    <category_entry name="reserved_by_system" value=".date/yyyy"/>
    <category_entry name="reserved_by_system" value=".date/yyyymm"/>
    <category_entry name="reserved_by_system" value=".date/yyyyww"/>
    <category_entry name="reserved_by_system" value=".date/yyyymmdd"/>
    <category_entry name="reserved_by_system" value=".tkm_en_base_word"/>
    <category_entry name="reserved_by_system" value=".tkm_en_base_phrase"/>
    <category_entry name="reserved_by_system" value=".tkm_en_appl_sample_voc"/>
    <!-- Specifies the auto generated category pathes by a regular expression to avoid recursive generation.
    The following regular expression represents exclusive pathes which end with
    ".base_phrase",
    ".base_phrase.-[MN]-npred",
    ".base_phrase.-[MN]-verb((be)|(ing))?",
    ".base_phrase.-[MN]-adjs", and
    ".base_phrase.adj[s]?-[MN]-",
    ".base_phrase.verb((ing)|(be))?-[MN]-",
    ".base_phrase.prep-[MN]-",
    ".appl_(sample_)?voc_phrase",
    ".appl_(sample_)?voc_phrase.-[MN]-bad",
    ".appl_(sample_)?voc_phrase.-[MN]-good",
    ".appl_(sample_)?voc_phrase.-[MN]-request",
    ".appl_(sample_)?voc_phrase.-[MN]-thanks", and
    ".appl_(sample_)?voc_phrase.-[MN]-question",
    -->
    <category_entry name="auto_generated_path_pattern" value="\.((base)|(appl_(sample_)?voc))_phrase(\.((-[MN]-npred)|(-[MN]-verb((be)|(ing))?)|(-[MN]-adjs)|(adj[s]?-[MN]-)|(-[NM]-good)|(-[MN]-bad)|(-[MN]-thanks)|(-[MN]-request)|(-[MN]-question)|(verb((be)|(ing))?-N-)|(prep-N-)))?$"/>
    <!-- Specifies subroot categories for TAKMI_DOCAT -->
    <category_entry name="non_dictionary" value=".0"/>
    </category_entries>
    <file_indexer_entries>
    <file_indexer_entry name="com.ibm.takmi.impl.std.idx.keyid.KeyIdIndexWriter"/>
    <file_indexer_entry name="com.ibm.takmi.impl.std.idx.date.DateIndexWriter"/>
    <file_indexer_entry name="com.ibm.takmi.impl.std.idx.cattodoc.CatToDocIndexWriter"/>
    <file_indexer_entry name="com.ibm.takmi.impl.std.idx.doctokey.DocToKeyIndexWriter"/>
    <file_indexer_entry name="com.ibm.takmi.impl.std.idx.keytodoc.KeyToDocIndexWriter"/>
    <file_indexer_entry name="com.ibm.takmi.impl.std.idx.numtodoc.NumToDocIndexWriter"/>
    <file_indexer_entry name="com.ibm.takmi.impl.std.idx.idtodoc.IdToDocIndexWriter"/>
    <file_indexer_entry name="com.ibm.takmi.impl.std.idx.metadata.MetadataIndexWriter"/>
    </file_indexer_entries>
    <index_file_merger_entries>
    <index_file_merger_entry name="com.ibm.takmi.impl.std.idx.cattodoc.CatToDocIndexFileMerger"/>
    <index_file_merger_entry name="com.ibm.takmi.impl.std.idx.doctokey.DocToKeyIndexFileMerger"/>
    <index_file_merger_entry name="com.ibm.takmi.impl.std.idx.keytodoc.KeyToDocIndexFileMerger"/>
    <index_file_merger_entry name="com.ibm.takmi.impl.std.idx.numtodoc.NumToDocIndexFileMerger"/>
    <index_file_merger_entry name="com.ibm.takmi.impl.std.idx.idtodoc.IdToDocIndexFileMerger"/>
    <index_file_merger_entry name="com.ibm.takmi.impl.std.idx.metadata.MetadataIndexFileMerger"/>
    </index_file_merger_entries>
    <index_group_merger_entries>
    <index_group_merger_entry name="com.ibm.takmi.impl.std.idx.cattodoc.CatToDocIndexGroupMerger"/>
    <index_group_merger_entry name="com.ibm.takmi.impl.std.idx.doctokey.DocToKeyIndexGroupMerger"/>
    <index_group_merger_entry name="com.ibm.takmi.impl.std.idx.keytodoc.KeyToDocIndexGroupMerger"/>
    <index_group_merger_entry name="com.ibm.takmi.impl.std.idx.numtodoc.NumToDocIndexGroupMerger"/>
    <index_group_merger_entry name="com.ibm.takmi.impl.std.idx.idtodoc.IdToDocIndexGroupMerger"/>
    <index_group_merger_entry name="com.ibm.takmi.impl.std.idx.metadata.MetadataIndexGroupMerger"/>
    </index_group_merger_entries>
    <path_entries>
    <path_entry name="index_root" value="db/index"/>
    <path_entry name="keyword_id" value="keyword_id"/>
    <path_entry name="doc_to_key" value="doc_to_key"/>
    <path_entry name="key_to_doc" value="key_to_doc"/>
    <path_entry name="cat_to_doc" value="cat_to_doc"/>
    <path_entry name="num_to_doc" value="num_to_doc"/>
    <path_entry name="id_to_doc" value="id_to_doc"/>
    <path_entry name="metadata" value="metadata"/>
    </path_entries>
    <data_entries min_doc_id="0" max_doc_id="0">
    </data_entries>
    <date_format_entries path=".date">
    <date_format_entry value="yyyy MMM" locale="en_US"/>
    <date_format_entry value="yyyyMMdd" locale="ja_JP"/>
    </date_format_entries>
    </impl>
    </database_config>

  • The database_config.xml on OmniFind Analytics Edition V8.4.1

    <?xml version="1.0" encoding="utf-8"?>
    <database_config>
    <!-- common -->
    <params>
    <param name="language" value="en"/>
    <param name="date_category_yyyymmdd" value=".date/yyyymmdd"/>
    <param name="date_category_yyyy" value=".date/yyyy"/>
    <param name="date_category_yyyymm" value=".date/yyyymm"/>
    <param name="date_category_yyyyww" value=".date/yyyyww"/>
    <param name="date_category_day_of_week" value=".date/dow"/>
    <param name="date_category_day_of_month" value=".date/dd"/>
    <param name="date_year_start_month" value="1"/>
    <param name="max_doc_length_to_display" value="10000"/>
    <param name="max_keyword_length" value="255"/>
    <param name="max_category_name_length" value="63"/>
    <param name="max_category_path_length" value="255"/>
    <param name="max_category_depth" value="20"/>
    <!-- parameters for NLP -->
    <param name="max_doc_length_for_nlp" value="65535"/>
    <param name="max_id_length" value="255"/>
    <param name="max_title_length" value="5000"/>
    <param name="max_text_name_length" value="63"/>
    </params>
    <application_entries>
    <application_entry name="Dictionary" impl="com.ibm.takmi.impl.std.dic.edit.config.StandardDictionaryFactoryImpl"/>
    <application_entry name="Indexer" impl="com.ibm.takmi.impl.std.idx.config.StandardIndexerFactoryImpl"/>
    <application_entry name="AlertingSystem" impl="com.ibm.takmi.impl.std.alerting.config.StandardAlertingFactoryImpl"/>
    <application_entry name="NLPResource" impl="com.ibm.takmi.impl.std.nlprsc.config.StandardNLPResourceFactoryImpl"/>
    </application_entries>
    <search_function_entries>
    <search_function_entry name="com.ibm.takmi.impl.common.search.CategorySearchFunction" impl="com.ibm.takmi.impl.std.search.StandardCategorySearchFunction"/>
    <search_function_entry name="com.ibm.takmi.impl.common.search.KeywordSearchFunction" impl="com.ibm.takmi.impl.std.search.StandardKeywordSearchFunction"/>
    <search_function_entry name="com.ibm.takmi.impl.common.search.NumberSearchFunction" impl="com.ibm.takmi.impl.std.search.StandardNumberSearchFunction"/>
    <search_function_entry name="com.ibm.takmi.impl.common.search.InvalidSearchFunction" impl="com.ibm.takmi.impl.std.search.StandardInvalidSearchFunction"/>
    </search_function_entries>
    <mining_function_entries>
    <mining_function_entry name="com.ibm.takmi.mining.topview.TopViewFunction" impl="com.ibm.takmi.impl.std.mining.topview.StandardTopViewFunction"/>
    <mining_function_entry name="com.ibm.takmi.mining.doclistview.DocListViewFunction" impl="com.ibm.takmi.impl.std.mining.doclistview.StandardDocListViewFunction"/>
    <mining_function_entry name="com.ibm.takmi.mining.iddocview.IdDocViewFunction" impl="com.ibm.takmi.impl.std.mining.iddocview.StandardIdDocViewFunction"/>
    <mining_function_entry name="com.ibm.takmi.mining.categoryview.CategoryViewFunction" impl="com.ibm.takmi.impl.std.mining.categoryview.StandardCategoryViewFunction"/>
    <mining_function_entry name="com.ibm.takmi.mining.timeseriesview.TimeSeriesViewFunction" impl="com.ibm.takmi.impl.std.mining.timeseriesview.StandardTimeSeriesViewFunction"/>
    <mining_function_entry name="com.ibm.takmi.mining.topicview.TopicViewFunction" impl="com.ibm.takmi.impl.std.mining.topicview.StandardTopicViewFunction"/>
    <mining_function_entry name="com.ibm.takmi.mining.twodmapview.TwoDMapViewFunction" impl="com.ibm.takmi.impl.std.mining.twodmapview.StandardTwoDMapViewFunction"/>
    </mining_function_entries>
    <frequently_used_category_entries>
    <frequently_used_category_entry path=".tkm_en_base_word"/>
    <frequently_used_category_entry path=".tkm_en_base_phrase"/>
    </frequently_used_category_entries>
    <infrequently_used_category_entries>
    <infrequently_used_category_entry path=".date"/>
    </infrequently_used_category_entries>
    <category_entries>
    <!-- Specifies subroot categories for system-reserved categories. -->
    <category_entry name="reserved_by_system" value=".date"/>
    <category_entry name="reserved_by_system" value=".date/dd"/>
    <category_entry name="reserved_by_system" value=".date/dow"/>
    <category_entry name="reserved_by_system" value=".date/yyyy"/>
    <category_entry name="reserved_by_system" value=".date/yyyymm"/>
    <category_entry name="reserved_by_system" value=".date/yyyyww"/>
    <category_entry name="reserved_by_system" value=".date/yyyymmdd"/>
    <category_entry name="reserved_by_system" value=".tkm_en_base_word"/>
    <category_entry name="reserved_by_system" value=".tkm_en_base_phrase"/>
    <category_entry name="reserved_by_system" value=".tkm_en_appl_sample_voc"/>
    <!-- Specifies the auto-generated category paths by a regular expression to avoid recursive generation.
    The following regular expression represents exclusive paths that end with
    ".base_phrase",
    ".base_phrase.-[MN]-npred",
    ".base_phrase.-[MN]-verb((be)|(ing))?",
    ".base_phrase.-[MN]-adjs",
    ".base_phrase.adj[s]?-[MN]-",
    ".base_phrase.verb((ing)|(be))?-[MN]-",
    ".base_phrase.prep-[MN]-",
    ".appl_(sample_)?voc_phrase",
    ".appl_(sample_)?voc_phrase.-[MN]-bad",
    ".appl_(sample_)?voc_phrase.-[MN]-good",
    ".appl_(sample_)?voc_phrase.-[MN]-request",
    ".appl_(sample_)?voc_phrase.-[MN]-thanks", and
    ".appl_(sample_)?voc_phrase.-[MN]-question".
    -->
    <category_entry name="auto_generated_path_pattern" value="\.((base)|(appl_(sample_)?voc))_phrase(\.((-[MN]-npred)|(-[MN]-verb((be)|(ing))?)|(-[MN]-adjs)|(adj[s]?-[MN]-)|(-[NM]-good)|(-[MN]-bad)|(-[MN]-thanks)|(-[MN]-request)|(-[MN]-question)|(verb((be)|(ing))?-N-)|(prep-N-)))?$"/>
    <!-- Specifies subroot categories for TAKMI_DOCAT -->
    <category_entry name="non_dictionary" value=".0"/>
    </category_entries>
    <date_format_entries path=".date">
    <date_format_entry value="yyyy MMM" locale="en_US"/>
    <date_format_entry value="yyyyMMdd" locale="ja_JP"/>
    </date_format_entries>

    <!-- standard -->
    <impl name="standard">
    <params>
    <param name="number_of_groups" value="3"/>
    <param name="number_of_keywords_per_group" value="1000000"/>
    </params>
    <file_indexer_entries>
    <file_indexer_entry name="com.ibm.takmi.impl.std.idx.keyid.KeyIdIndexWriter"/>
    <file_indexer_entry name="com.ibm.takmi.impl.std.idx.date.DateIndexWriter"/>
    <file_indexer_entry name="com.ibm.takmi.impl.std.idx.cattodoc.CatToDocIndexWriter"/>
    <file_indexer_entry name="com.ibm.takmi.impl.std.idx.doctokey.DocToKeyIndexWriter"/>
    <file_indexer_entry name="com.ibm.takmi.impl.std.idx.keytodoc.KeyToDocIndexWriter"/>
    <file_indexer_entry name="com.ibm.takmi.impl.std.idx.numtodoc.NumToDocIndexWriter"/>
    <file_indexer_entry name="com.ibm.takmi.impl.std.idx.idtodoc.IdToDocIndexWriter"/>
    <file_indexer_entry name="com.ibm.takmi.impl.std.idx.metadata.MetadataIndexWriter"/>
    </file_indexer_entries>
    <index_file_merger_entries>
    <index_file_merger_entry name="com.ibm.takmi.impl.std.idx.cattodoc.CatToDocIndexFileMerger"/>
    <index_file_merger_entry name="com.ibm.takmi.impl.std.idx.doctokey.DocToKeyIndexFileMerger"/>
    <index_file_merger_entry name="com.ibm.takmi.impl.std.idx.keytodoc.KeyToDocIndexFileMerger"/>
    <index_file_merger_entry name="com.ibm.takmi.impl.std.idx.numtodoc.NumToDocIndexFileMerger"/>
    <index_file_merger_entry name="com.ibm.takmi.impl.std.idx.idtodoc.IdToDocIndexFileMerger"/>
    <index_file_merger_entry name="com.ibm.takmi.impl.std.idx.metadata.MetadataIndexFileMerger"/>
    </index_file_merger_entries>
    <index_group_merger_entries>
    <index_group_merger_entry name="com.ibm.takmi.impl.std.idx.cattodoc.CatToDocIndexGroupMerger"/>
    <index_group_merger_entry name="com.ibm.takmi.impl.std.idx.doctokey.DocToKeyIndexGroupMerger"/>
    <index_group_merger_entry name="com.ibm.takmi.impl.std.idx.keytodoc.KeyToDocIndexGroupMerger"/>
    <index_group_merger_entry name="com.ibm.takmi.impl.std.idx.numtodoc.NumToDocIndexGroupMerger"/>
    <index_group_merger_entry name="com.ibm.takmi.impl.std.idx.idtodoc.IdToDocIndexGroupMerger"/>
    <index_group_merger_entry name="com.ibm.takmi.impl.std.idx.metadata.MetadataIndexGroupMerger"/>
    </index_group_merger_entries>
    <path_entries>
    <path_entry name="index_root" value="db/index"/>
    <path_entry name="keyword_id" value="keyword_id"/>
    <path_entry name="doc_to_key" value="doc_to_key"/>
    <path_entry name="key_to_doc" value="key_to_doc"/>
    <path_entry name="cat_to_doc" value="cat_to_doc"/>
    <path_entry name="num_to_doc" value="num_to_doc"/>
    <path_entry name="id_to_doc" value="id_to_doc"/>
    <path_entry name="metadata" value="metadata"/>
    </path_entries>
    <data_entries min_doc_id="0" max_doc_id="0">
    </data_entries>
    </impl>
    </database_config>
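
The auto_generated_path_pattern entry in the sample above is easier to follow when it is tried against a few paths. The following Python sketch copies the pattern verbatim from the sample. It assumes that the pattern is applied as an unanchored search (the pattern ends with $ and has no ^); how the product actually applies it is not stated in this guide. The function name and the category paths are hypothetical and serve only as illustration.

```python
import re

# Pattern copied verbatim from the auto_generated_path_pattern entry,
# split across several string literals only for readability.
PATTERN = re.compile(
    r"\.((base)|(appl_(sample_)?voc))_phrase"
    r"(\.((-[MN]-npred)|(-[MN]-verb((be)|(ing))?)|(-[MN]-adjs)|(adj[s]?-[MN]-)"
    r"|(-[NM]-good)|(-[MN]-bad)|(-[MN]-thanks)|(-[MN]-request)|(-[MN]-question)"
    r"|(verb((be)|(ing))?-N-)|(prep-N-)))?$"
)


def is_auto_generated(path):
    """Return True if the end of the path matches the pattern.

    Assumption: an unanchored search, which is not confirmed by the guide.
    """
    return PATTERN.search(path) is not None


if __name__ == "__main__":
    # Hypothetical category paths, made up for this illustration.
    for path in (
        ".sample.base_phrase",
        ".sample.base_phrase.-N-npred",
        ".sample.appl_sample_voc_phrase.-M-good",
        ".sample.base_phrase.prep-N-",
        ".sample.base_phrase.prep-M-",
        ".sample.base_word",
    ):
        print(path, is_auto_generated(path))
```

Under these assumptions, the first four paths match and the last two do not. As written, the pattern accepts only N in the "verb...-N-" and "prep-N-" forms, although the comment in the sample lists [MN] for them.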

5.3 Necessary actions to reuse the databases on the OmniFind Analytics Edition V8.4.1 Hotfix and IBM Content Analyzer V8.4.2
When you migrate from OmniFind Analytics Edition V8.4.1 to the V8.4.1 Hotfix or to IBM Content Analyzer V8.4.2, you must modify the database_config.xml file to reuse the databases. The required additions are shown in blue in the following sample.
Add the following line as the first line inside the <index_file_merger_entries> tag:
<index_file_merger_entry name="com.ibm.takmi.impl.std.idx.keyid.KeyIdIndexFileMerger"/>
Add the following line as the first line inside the <index_group_merger_entries> tag:
<index_group_merger_entry name="com.ibm.takmi.impl.std.idx.keyid.KeyIdIndexGroupMerger"/>
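
If many databases must be migrated, the two additions can be scripted. The following Python sketch is one possible approach, not a tool supplied with the product. It uses the standard xml.etree.ElementTree module; the function names are hypothetical, and migrate_file assumes Python 3.8 or later so that existing comments are kept. Back up database_config.xml before rewriting it, because the output may differ from the original in quoting and whitespace.

```python
import xml.etree.ElementTree as ET

# Class names of the two entries named in this section.
FILE_MERGER = "com.ibm.takmi.impl.std.idx.keyid.KeyIdIndexFileMerger"
GROUP_MERGER = "com.ibm.takmi.impl.std.idx.keyid.KeyIdIndexGroupMerger"


def add_first_entry(root, container_tag, entry_tag, class_name):
    """Insert <entry_tag name=class_name/> as the first child of every
    <container_tag> element. Returns the number of entries added."""
    added = 0
    for container in list(root.iter(container_tag)):
        names = [e.get("name") for e in container.findall(entry_tag)]
        if class_name in names:
            continue  # already present; leave this container untouched
        new = ET.Element(entry_tag, {"name": class_name})
        new.tail = container.text  # reuse the whitespace before the old first child
        container.insert(0, new)
        added += 1
    return added


def add_keyid_mergers(root):
    """Apply both additions; returns how many entries were inserted."""
    return (
        add_first_entry(root, "index_file_merger_entries",
                        "index_file_merger_entry", FILE_MERGER)
        + add_first_entry(root, "index_group_merger_entries",
                          "index_group_merger_entry", GROUP_MERGER)
    )


def migrate_file(path):
    """Hypothetical helper: rewrite a database_config.xml file in place."""
    # insert_comments=True (Python 3.8+) keeps the XML comments in the tree.
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    tree = ET.parse(path, parser=parser)
    if add_keyid_mergers(tree.getroot()):
        tree.write(path, encoding="utf-8", xml_declaration=True)
```

The functions skip a container that already lists the entry, so running the script twice on the same file does not add duplicates.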

The following sample shows the database_config.xml file on OmniFind Analytics Edition V8.4.1 Hotfix or IBM Content Analyzer V8.4.2. Red lines were modified in OmniFind Analytics Edition V8.4.1, and blue lines were modified in OmniFind Analytics Edition V8.4.1 Hotfix and IBM Content Analyzer V8.4.2.
<?xml version="1.0" encoding="utf-8"?>
<database_config>
<!-- common -->
<params>
<param name="language" value="en"/>
<param name="date_category_yyyymmdd" value=".date/yyyymmdd"/>
<param name="date_category_yyyy" value=".date/yyyy"/>
<param name="date_category_yyyymm" value=".date/yyyymm"/>
<param name="date_category_yyyyww" value=".date/yyyyww"/>
<param name="date_category_day_of_week" value=".date/dow"/>
<param name="date_category_day_of_month" value=".date/dd"/>
<param name="date_year_start_month" value="1"/>
<param name="max_doc_length_to_display" value="10000"/>
<param name="max_keyword_length" value="255"/>
<param name="max_category_name_length" value="63"/>
<param name="max_category_path_length" value="255"/>
<param name="max_category_depth" value="20"/>
<!-- parameters for NLP -->
<param name="max_doc_length_for_nlp" value="65535"/>
<param name="max_id_length" value="255"/>
<param name="max_title_length" value="5000"/>
<param name="max_text_name_length" value="63"/>
</params>
<application_entries>
<application_entry name="Dictionary" impl="com.ibm.takmi.impl.std.dic.edit.config.StandardDictionaryFactoryImpl"/>
<application_entry name="Indexer" impl="com.ibm.takmi.impl.std.idx.config.StandardIndexerFactoryImpl"/>
<application_entry name="AlertingSystem" impl="com.ibm.takmi.impl.std.alerting.config.StandardAlertingFactoryImpl"/>
<application_entry name="NLPResource" impl="com.ibm.takmi.impl.std.nlprsc.config.StandardNLPResourceFactoryImpl"/>
</application_entries>
<search_function_entries>
<search_function_entry name="com.ibm.takmi.impl.common.search.CategorySearchFunction" impl="com.ibm.takmi.impl.std.search.StandardCategorySearchFunction"/>
<search_function_entry name="com.ibm.takmi.impl.common.search.KeywordSearchFunction" impl="com.ibm.takmi.impl.std.search.StandardKeywordSearchFunction"/>
<search_function_entry name="com.ibm.takmi.impl.common.search.NumberSearchFunction" impl="com.ibm.takmi.impl.std.search.StandardNumberSearchFunction"/>
<search_function_entry name="com.ibm.takmi.impl.common.search.InvalidSearchFunction" impl="com.ibm.takmi.impl.std.search.StandardInvalidSearchFunction"/>
</search_function_entries>
<mining_function_entries>
<mining_function_entry name="com.ibm.takmi.mining.topview.TopViewFunction" impl="com.ibm.takmi.impl.std.mining.topview.StandardTopViewFunction"/>
<mining_function_entry name="com.ibm.takmi.mining.doclistview.DocListViewFunction" impl="com.ibm.takmi.impl.std.mining.doclistview.StandardDocListViewFunction"/>
<mining_function_entry name="com.ibm.takmi.mining.iddocview.IdDocViewFunction" impl="com.ibm.takmi.impl.std.mining.iddocview.StandardIdDocViewFunction"/>
<mining_function_entry name="com.ibm.takmi.mining.categoryview.CategoryViewFunction" impl="com.ibm.takmi.impl.std.mining.categoryview.StandardCategoryViewFunction"/>
<mining_function_entry name="com.ibm.takmi.mining.timeseriesview.TimeSeriesViewFunction" impl="com.ibm.takmi.impl.std.mining.timeseriesview.StandardTimeSeriesViewFunction"/>
<mining_function_entry name="com.ibm.takmi.mining.topicview.TopicViewFunction" impl="com.ibm.takmi.impl.std.mining.topicview.StandardTopicViewFunction"/>
<mining_function_entry name="com.ibm.takmi.mining.twodmapview.TwoDMapViewFunction" impl="com.ibm.takmi.impl.std.mining.twodmapview.StandardTwoDMapViewFunction"/>
</mining_function_entries>
<frequently_used_category_entries>
<frequently_used_category_entry path=".tkm_en_base_word"/>
<frequently_used_category_entry path=".tkm_en_base_phrase"/>
</frequently_used_category_entries>
<infrequently_used_category_entries>
<infrequently_used_category_entry path=".date"/>
</infrequently_used_category_entries>
<category_entries>
<!-- Specifies subroot categories for system-reserved categories. -->
<category_entry name="reserved_by_system" value=".date"/>
<category_entry name="reserved_by_system" value=".date/dd"/>
<category_entry name="reserved_by_system" value=".date/dow"/>
<category_entry name="reserved_by_system" value=".date/yyyy"/>
<category_entry name="reserved_by_system" value=".date/yyyymm"/>
<category_entry name="reserved_by_system" value=".date/yyyyww"/>
<category_entry name="reserved_by_system" value=".date/yyyymmdd"/>
<category_entry name="reserved_by_system" value=".tkm_en_base_word"/>
<category_entry name="reserved_by_system" value=".tkm_en_base_phrase"/>
<category_entry name="reserved_by_system" value=".tkm_en_appl_sample_voc"/>
<!-- Specifies the auto-generated category paths by a regular expression to avoid recursive generation.
The following regular expression represents exclusive paths that end with
".base_phrase",
".base_phrase.-[MN]-npred",
".base_phrase.-[MN]-verb((be)|(ing))?",
".base_phrase.-[MN]-adjs",
".base_phrase.adj[s]?-[MN]-",
".base_phrase.verb((ing)|(be))?-[MN]-",
".base_phrase.prep-[MN]-",
".appl_(sample_)?voc_phrase",
".appl_(sample_)?voc_phrase.-[MN]-bad",
".appl_(sample_)?voc_phrase.-[MN]-good",
".appl_(sample_)?voc_phrase.-[MN]-request",
".appl_(sample_)?voc_phrase.-[MN]-thanks", and
".appl_(sample_)?voc_phrase.-[MN]-question".
-->
<category_entry name="auto_generated_path_pattern" value="\.((base)|(appl_(sample_)?voc))_phrase(\.((-[MN]-npred)|(-[MN]-verb((be)|(ing))?)|(-[MN]-adjs)|(adj[s]?-[MN]-)|(-[NM]-good)|(-[MN]-bad)|(-[MN]-thanks)|(-[MN]-request)|(-[MN]-question)|(verb((be)|(ing))?-N-)|(prep-N-)))?$"/>
<!-- Specifies subroot categories for TAKMI_DOCAT -->
<category_entry name="non_dictionary" value=".0"/>
</category_entries>
<date_format_entries path=".date">
<date_format_entry value="yyyy MMM" locale="en_US"/>
<date_format_entry value="yyyyMMdd" locale="ja_JP"/>
</date_format_entries>

<!-- standard -->
<impl name="standard">
<params>
<param name="number_of_groups" value="3"/>
<param name="number_of_keywords_per_group" value="1000000"/>
</params>
<file_indexer_entries>
<file_indexer_entry name="com.ibm.takmi.impl.std.idx.keyid.KeyIdIndexWriter"/>
<file_indexer_entry name="com.ibm.takmi.impl.std.idx.date.DateIndexWriter"/>
<file_indexer_entry name="com.ibm.takmi.impl.std.idx.cattodoc.CatToDocIndexWriter"/>
<file_indexer_entry name="com.ibm.takmi.impl.std.idx.doctokey.DocToKeyIndexWriter"/>
<file_indexer_entry name="com.ibm.takmi.impl.std.idx.keytodoc.KeyToDocIndexWriter"/>
<file_indexer_entry name="com.ibm.takmi.impl.std.idx.numtodoc.NumToDocIndexWriter"/>
<file_indexer_entry name="com.ibm.takmi.impl.std.idx.idtodoc.IdToDocIndexWriter"/>
<file_indexer_entry name="com.ibm.takmi.impl.std.idx.metadata.MetadataIndexWriter"/>
</file_indexer_entries>
<index_file_merger_entries>
<index_file_merger_entry name="com.ibm.takmi.impl.std.idx.keyid.KeyIdIndexFileMerger"/>
<index_file_merger_entry name="com.ibm.takmi.impl.std.idx.cattodoc.CatToDocIndexFileMerger"/>
<index_file_merger_entry name="com.ibm.takmi.impl.std.idx.doctokey.DocToKeyIndexFileMerger"/>
<index_file_merger_entry name="com.ibm.takmi.impl.std.idx.keytodoc.KeyToDocIndexFileMerger"/>
<index_file_merger_entry name="com.ibm.takmi.impl.std.idx.numtodoc.NumToDocIndexFileMerger"/>
<index_file_merger_entry name="com.ibm.takmi.impl.std.idx.idtodoc.IdToDocIndexFileMerger"/>
<index_file_merger_entry name="com.ibm.takmi.impl.std.idx.metadata.MetadataIndexFileMerger"/>
</index_file_merger_entries>
<index_group_merger_entries>
<index_group_merger_entry name="com.ibm.takmi.impl.std.idx.keyid.KeyIdIndexGroupMerger"/>
<index_group_merger_entry name="com.ibm.takmi.impl.std.idx.cattodoc.CatToDocIndexGroupMerger"/>
<index_group_merger_entry name="com.ibm.takmi.impl.std.idx.doctokey.DocToKeyIndexGroupMerger"/>
<index_group_merger_entry name="com.ibm.takmi.impl.std.idx.keytodoc.KeyToDocIndexGroupMerger"/>
<index_group_merger_entry name="com.ibm.takmi.impl.std.idx.numtodoc.NumToDocIndexGroupMerger"/>
<index_group_merger_entry name="com.ibm.takmi.impl.std.idx.idtodoc.IdToDocIndexGroupMerger"/>
<index_group_merger_entry name="com.ibm.takmi.impl.std.idx.metadata.MetadataIndexGroupMerger"/>
</index_group_merger_entries>
<path_entries>
<path_entry name="index_root" value="db/index"/>
<path_entry name="keyword_id" value="keyword_id"/>
<path_entry name="doc_to_key" value="doc_to_key"/>
<path_entry name="key_to_doc" value="key_to_doc"/>
<path_entry name="cat_to_doc" value="cat_to_doc"/>
<path_entry name="num_to_doc" value="num_to_doc"/>
<path_entry name="id_to_doc" value="id_to_doc"/>
<path_entry name="metadata" value="metadata"/>
</path_entries>
<data_entries min_doc_id="0" max_doc_id="0">
</data_entries>
</impl>
</database_config>

5.4 Notes on the NLP (natural language processing) for Japanese
In OmniFind Analytics Edition V8.4.1, natural language processing for Japanese was improved as listed below. To obtain consistent analysis results, you can rerun natural language processing on the existing data.
  • Periods no longer delimit sentences when they appear between alphabetic characters.
  • Symbols such as slashes now delimit words.
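
The two rules can be illustrated with a toy example. The following Python sketch is not the product's language processing; it only mimics the rules as stated above with regular expressions, and it assumes that "alphabetic characters" means ASCII letters and that whitespace also separates words. The function names and the sample strings are hypothetical.

```python
import re

# A period ends a sentence unless it sits between two ASCII letters.
SENTENCE_BREAK = re.compile(r"(?<![A-Za-z])\.|\.(?![A-Za-z])")
# Whitespace and slashes both separate words.
WORD_BREAK = re.compile(r"[\s/]+")


def split_sentences(text):
    """Split text at sentence-ending periods and drop empty pieces."""
    return [s.strip() for s in SENTENCE_BREAK.split(text) if s.strip()]


def split_words(sentence):
    """Split a sentence into words at whitespace and slashes."""
    return [w for w in WORD_BREAK.split(sentence) if w]


if __name__ == "__main__":
    print(split_sentences("Visit www.ibm.com. Thanks."))
    print(split_words("input/output error"))
```

In this sketch, the periods inside "www.ibm.com" do not split the sentence, and "input/output" is separated into two words.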

Terms of Use

Notices
This information was developed for products and services offered in the U.S.A.

IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to:

IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A.

For license inquiries regarding double-byte (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to:

IBM World Trade Asia Corporation
Licensing
2-31 Roppongi 3-chome, Minato-ku
Tokyo 106-0032, Japan

The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.

IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created programs and other programs (including this one) and (ii) the mutual use of the information which has been exchanged, should contact:

IBM Corporation
Silicon Valley Lab
Building 090/H-410
555 Bailey Avenue
San Jose, CA 95141-1003
U.S.A.

Such information may be available, subject to appropriate terms and conditions, including in some cases, payment of a fee.

The licensed program described in this document and all licensed material available for it are provided by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement or any equivalent agreement between us.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

All statements regarding IBM's future direction or intent are subject to change or withdrawal without notice, and represent goals and objectives only.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

Copyright License
This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.

Trademarks
This topic lists IBM trademarks and certain non-IBM trademarks.

See http://www.ibm.com/legal/copytrade.shtml for information about IBM trademarks.

The following terms are trademarks or registered trademarks of other companies:

Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.

Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.

Intel, Intel Inside (logos), MMX and Pentium are trademarks of Intel Corporation in the United States, other countries, or both.

UNIX is a registered trademark of The Open Group in the United States and other countries.

Linux is a trademark of Linus Torvalds in the United States, other countries, or both.

Other company, product or service names might be trademarks or service marks of others.