IBM Content Analyzer Dictionary Editor Guide

 

Edition Notice
This edition applies to version 8, release 4 of IBM Content Analyzer and to all subsequent releases and modifications until otherwise indicated in new editions.

This document contains proprietary information of IBM. This proprietary information is provided in accordance with the license conditions and is protected by copyright. Information contained in this document provides no warranties whatsoever for any products. Also, no descriptions provided in this document should be interpreted as product warranties. Depending on the system environment, the yen symbol may be displayed as the backslash symbol, or the backslash symbol may be displayed as the yen symbol.

© Copyright International Business Machines Corporation 2007, 2008. All rights reserved.

US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

1 Introduction
This document describes how to use the IBM Content Analyzer Dictionary Editor application.
1.1 Functional Overview
The Dictionary Editor is a Web application that you can use to edit the category tree, keywords, and synonyms of a dictionary. See the Overview document for definitions of terms such as category, keyword, and synonym.
The following figure shows the relationship between editing a dictionary with Dictionary Editor and analysis by Text Miner:

1.2 Dictionary Resource Files
The Dictionary Editor supports editing operations by multiple users. To avoid editing conflicts, Dictionary Editor locks the files that are being edited so that other users cannot edit the same files at the same time. Editing goes more smoothly if users know which files can cause conflicts when they are edited. A description of each file type is as follows:
1.3 Page Transition
Dictionary Editor provides the following screens for editing the category tree, keywords, and synonyms.

2 Before Editing the Dictionary
2.1 Initial Screen and Database Selection
In the initial state, no database is selected, and of the menu items listed on the left side of the screen, only "Select Database" and "Help" are active.

Select a database in the initial screen:
  1. Click Select Database under Menu.
  2. Select the database that is to be used in the dictionary edit operation from the list.
  3. Click OK.
2.2 After Selecting a Database
The following screen is shown after you select a database.

After selecting a database:
  1. A message saying that the selected database has been loaded is displayed.
  2. The selected database is shown in the Current Database area.
  3. Configuration, Edit Category Tree, Edit Keywords and Edit Rules become active.
2.3 Editing the Settings
To change the keyword edit settings, click Configuration under Menu.

Configuration screen:
  1. Click Configuration under Menu to open the Configuration screen.
  2. Use the select box to specify the number of keywords to be displayed in a page.
  3. Click the Save button to save the change.
Note: The value set here is the maximum number of lines shown per page in the candidate word list and the registered keyword list of the Edit keyword screen (see 4.4 Candidate Words Display Mode).
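The effect of this setting can be pictured as simple paging of a word list. The following is a minimal sketch only, not the product's implementation; the function name paginate and the sample words are hypothetical.

```python
from typing import List, Sequence


def paginate(words: Sequence[str], per_page: int) -> List[List[str]]:
    """Split a word list into pages of at most per_page entries (hypothetical helper)."""
    if per_page <= 0:
        raise ValueError("per_page must be positive")
    return [list(words[i:i + per_page]) for i in range(0, len(words), per_page)]


# Five sample words shown two per page give three pages.
pages = paginate(["printer", "monitor", "keyboard", "mouse", "cable"], per_page=2)
for number, page in enumerate(pages, start=1):
    print(number, page)
```

A smaller value gives shorter pages and more paging; a larger value shows more words at once.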
2.4 After Editing the Settings
The following screen is shown after you edit the settings.

After editing the settings:


A message to confirm that the new settings have been saved is displayed.

3 Editing the Category Tree
3.1 How to Start
To edit the category tree, click Edit Category Tree from the Menu.

How to start editing the category tree:
  1. Click Edit Category Tree under Menu.
  2. Currently registered categories are displayed.
3.2 Warning when Editing the Category Tree
When multiple users are editing the dictionary, you must ensure that no other users are using Dictionary Editor when you edit the category tree. The following warning message appears if you try to edit the category tree while another user is editing the category tree or keywords.

Category tree edit interrupt warning:


Click the OK button to interrupt the edit operation. Unsaved data that is being edited by another user will be discarded. The same warning message also appears if you, or the user who last edited the category tree or keywords before you, closed the window without properly completing the edit operation. In that case, clicking OK to start editing does not affect that user, who has already finished editing.

3.3 Adding a Category
Adding a category:
  1. To add a category, click Add Subcategory at the next hierarchy level.
  2. Type a category name in the dialog box.
  3. Click OK.
Notes:
3.4 Renaming a Category
Renaming a category:
  1. Click the Rename link to the right of the category name that you want to change.
  2. Type a category name in the dialog box.
  3. Click OK.
Note: The category name will not be changed if you click Cancel in the dialog box.
3.5 Deleting a Category
Deleting a category:
  1. Click the Delete link to the right of the category name that you want to delete.
  2. Click OK in the confirmation dialog box.
Notes:
3.6 Saving and Exiting Edit Mode
After editing the category tree, you must run the termination processing (exit the edit mode), whether or not you added or deleted categories and whether or not you want to save your changes. If you close the screen without exiting the edit mode, the category file (see 1.2 Dictionary Resource Files) remains locked, and other users must interrupt the edit operation before they can edit the category tree or keywords.

Category save/quit menu:


(1) Save the current changes and continue the operation. The file stays locked; therefore, termination processing (2) or (3) is necessary.

(2) Save the changes and exit the category tree edit mode. The file will be unlocked.

(3) Exit the category tree edit mode without saving changes. The file will be unlocked.

The following screen is shown after you save and exit the edit mode:


(1) The termination message appears.

(2) The file is unlocked, and Edit Category Tree, Edit Keywords and Edit Rules become active again.
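The lock behavior described in this section and in 3.2 can be summarized with a small model. This is an illustrative sketch under stated assumptions, not how Dictionary Editor is implemented: the class LockedFile, its method names, and the user names are hypothetical, and a single file stands in for the category file.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LockedFile:
    """Hypothetical model of one dictionary resource file with an edit lock."""
    saved: List[str] = field(default_factory=list)    # contents as last saved
    working: List[str] = field(default_factory=list)  # unsaved working copy
    locked_by: Optional[str] = None

    def start_edit(self, user: str, interrupt: bool = False) -> None:
        # An existing lock (even a stale one) must be interrupted explicitly.
        if self.locked_by is not None and not interrupt:
            raise PermissionError(f"file is locked by {self.locked_by}")
        # Interrupting discards whatever the previous editor had not saved.
        self.working = list(self.saved)
        self.locked_by = user

    def _require_lock(self, user: str) -> None:
        if self.locked_by != user:
            raise PermissionError(f"{user} does not hold the lock")

    def add(self, user: str, item: str) -> None:
        self._require_lock(user)
        self.working.append(item)

    def save(self, user: str) -> None:
        """Menu item (1): save and continue; the file stays locked."""
        self._require_lock(user)
        self.saved = list(self.working)

    def save_and_exit(self, user: str) -> None:
        """Menu item (2): save, then unlock."""
        self.save(user)
        self.locked_by = None

    def exit_without_saving(self, user: str) -> None:
        """Menu item (3): drop unsaved changes, then unlock."""
        self._require_lock(user)
        self.working = list(self.saved)
        self.locked_by = None


tree = LockedFile(saved=["Product"])
tree.start_edit("alice")
tree.add("alice", "Hardware")
tree.save("alice")                      # saved, but still locked by alice
tree.add("alice", "Software")           # not saved yet
tree.start_edit("bob", interrupt=True)  # alice's unsaved "Software" is discarded
tree.save_and_exit("bob")
print(tree.saved, tree.locked_by)
```

In this model, closing the window without exiting corresponds to never calling save_and_exit or exit_without_saving, which leaves locked_by set and forces the next user to interrupt.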

3.7 Automatically Generated Dependency Categories
When the category tree is edited, categories for dependency keywords are automatically created. These categories cannot be seen while using Dictionary Editor, but they can be used with Text Miner.

Dictionary Editor category tree
Product
     Hardware
     Software

    ↓

Text Miner category tree
Product
     Hardware
         Dependency
             Hardware .. bad reputation
             Hardware .. verb
             Hardware .. problem
             Hardware .. good reputation
             Hardware .. senses
             Hardware .. requests
             Hardware .. questions
     Software
         Dependency
             Software .. bad reputation
             Software .. verb
             Software .. problem
             Software .. good reputation
             Software .. senses
             Software .. requests
             Software .. questions
     Dependency
         Product .. bad reputation
         Product .. verb
         Product .. problem
         Product .. good reputation
         Product .. senses
         Product .. requests
         Product .. questions

In this example, a "Dependency" category is added immediately below each of the "Product," "Hardware," and "Software" categories, and below it, categories are added for phrases that use various types of declinable words. The dependency expressions that belong to these categories are phrases consisting of keywords registered in the individual category and indeclinable words, in the same manner as the basic dependency categories described in 3 System-defined Categories. Note, however, that the "Dependency" category immediately below the "Product" category covers only phrases consisting of keywords that belong to the "Product" category itself and various indeclinable words; dependencies involving keywords in the "Hardware" and "Software" categories are not included.
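The expansion shown in the example above can be sketched as a recursive transformation of the category tree. This is only an illustration of the pattern, not the product's code: the nested-dictionary representation, the function names, and the DEPENDENCY_TYPES list (copied from the labels in the example) are assumptions.

```python
from typing import Dict, List

# Labels taken from the example tree above; the real list may differ.
DEPENDENCY_TYPES = [
    "bad reputation", "verb", "problem", "good reputation",
    "senses", "requests", "questions",
]


def with_dependency(tree: Dict[str, dict]) -> Dict[str, dict]:
    """Return a copy of the tree with a Dependency category under every category."""
    result: Dict[str, dict] = {}
    for name, children in tree.items():
        expanded = with_dependency(children)
        # Each Dependency category covers only keywords of its own parent category.
        expanded["Dependency"] = {f"{name} .. {kind}": {} for kind in DEPENDENCY_TYPES}
        result[name] = expanded
    return result


def render(tree: Dict[str, dict], depth: int = 0) -> List[str]:
    """Format the tree as indented lines, one category per line."""
    lines: List[str] = []
    for name, children in tree.items():
        lines.append("    " * depth + name)
        lines.extend(render(children, depth + 1))
    return lines


editor_tree = {"Product": {"Hardware": {}, "Software": {}}}
miner_tree = with_dependency(editor_tree)
print("\n".join(render(miner_tree)))
```

The printed tree has the same shape as the Text Miner category tree in the example: "Hardware" and "Software" each receive their own "Dependency" subtree, and "Product" receives one after its subcategories.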
4 Editing Keywords
4.1 How to Start
To edit keywords, click Edit Keywords from the Menu.

How to start editing keywords:


Click Edit Keywords to open the keyword file selection dialog.

  1. Select the check box if you want to use candidate word files (see 1.2 Dictionary Resource Files).
  2. Specify the keyword file to be edited (see 1.2 Dictionary Resource Files).
  3. Click OK. The dialog closes and the candidate words display mode of the edit keyword screen starts.
Hint: To create a new keyword file, select New File, type a file name without an extension in the text field, and then click OK.
4.2 Warning when Editing Keywords
If another user is editing the category tree, a pop-up warning message is displayed.

Keyword edit interrupt warning:


Click the OK button to interrupt the edit operation. Unsaved category tree data that is being edited by another user will be discarded. The same warning message also appears if you, or the user who last edited the category tree before you, closed the window without properly completing the edit operation. In that case, clicking OK to start editing does not affect that user, who has already finished editing.

4.3 Keyword Files Currently Being Edited
In the keyword file selection dialog box, the message "Used by another user" appears to the right of any keyword file that another user is currently editing.

Keyword file selection dialog while the keyword file is being edited:


Select the keyword file that is being edited and then click OK to open a dialog to confirm interrupting the edit. Click OK again. Unsaved edit data created by another user will be discarded.

4.4 Candidate Words Display Mode
The structure of the Edit keyword screen in the candidate words display mode is as follows.

Edit keyword screen in the candidate words display mode:


(1) Search/Sort menu: use this area to narrow down or sort candidate words or keywords that are displayed in (3) and (4). Operations in this area will be reflected in both lists at the same time.

(2) Display mode select box: use this select box to switch the display modes between the candidate words display mode and category tree display mode.

(3) Candidate word list: a list of candidate words will be displayed when two or more candidate files are selected in the keyword file selection dialog (see 1.2 Dictionary Resource Files).

(4) Registered keyword list: this is a list of keywords that are already registered in the keyword file (see 1.2 Dictionary Resource Files). You can register keywords while comparing the candidate word list with the registered keyword list.

(5) Add and Delete: use these arrows to add or delete keywords. Use the right arrow to add keywords and use the left arrow to delete keywords.

(6) Save buttons: Use these buttons to save keyword edit information and exit the edit mode.

4.5 Search and Sort
The candidate word list and the registered keyword list can be further narrowed or sorted by using the word type filter (search), string match (search), and sort functions.
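The three functions can be thought of as two filters followed by a sort. The sketch below is an assumption-laden illustration, not the product's behavior: the Entry type, the word type labels, case-insensitive substring matching, and alphabetical ordering are all hypothetical choices.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Entry:
    word: str
    word_type: str  # hypothetical labels such as "noun" or "verb"


def search_and_sort(
    entries: Iterable[Entry],
    word_type: Optional[str] = None,
    contains: Optional[str] = None,
    descending: bool = False,
) -> List[Entry]:
    """Apply the word type filter and string match, then sort alphabetically."""
    selected = [
        e for e in entries
        if (word_type is None or e.word_type == word_type)
        and (contains is None or contains.lower() in e.word.lower())
    ]
    return sorted(selected, key=lambda e: e.word.lower(), reverse=descending)


entries = [
    Entry("printer", "noun"),
    Entry("print", "verb"),
    Entry("monitor", "noun"),
    Entry("paper", "noun"),
]
result = search_and_sort(entries, word_type="noun", contains="p")
print([e.word for e in result])
```

Because the same conditions apply to both lists, the function would be called once for the candidate word list and once for the registered keyword list with identical arguments.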

Search/Sort menu:


4.6 Adding Keywords from the List of Candidate Words
To use particular candidate words as keywords, follow the procedures below.

Adding candidate words as keywords:
  1. Select the check box of the candidate word to be added. You can select multiple candidate words at the same time.
  2. Click the arrow button to add the candidate words.
The newly added keywords are shown and highlighted at the top of the Registered keyword list.

After adding candidate words as keywords:


Hint: Clicking the Select All button above the Candidate word list selects all of the check boxes. This is a useful function when you want to add many keywords at the same time. After you click the Select All button, this button changes into the Cancel All button, and clicking this button will clear all of the selected check boxes.

4.7 Adding New Keywords by Entering Character Strings
To add keywords by directly entering character strings instead of selecting them from the candidate word list, follow the procedures below.

Adding a new keyword:
  1. Click the New Keyword button of the Registered keyword list.
  2. When the dialog box appears, type the keyword.
  3. Click OK.
Notes:
4.8 Deleting Keywords
To delete keywords, follow the procedures below.

Deleting keywords:
  1. Select the check box of the keyword in the Registered keyword list that you want to delete. You can select multiple keywords simultaneously.
  2. Click the arrow button for deleting keywords.
Notes:
4.9 Editing Synonyms
To edit synonyms of a particular keyword, click the Edit button to the right of the keyword that you want to edit in the Registered keyword list.

Edit synonyms:


Click the Edit button to open the Edit synonym screen.

Edit synonym screen:


(1) Use these radio buttons to select a synonym to be regarded as a keyword (standard form). If the identical character string has been registered as a synonym of a different keyword, that character string (candidate for a synonym) will appear on a different screen, and the radio buttons operate accordingly. This behavior lets users know that a word that is registered as a synonym of a different keyword can also be registered separately as a keyword.

(2) Use these check boxes to select which words are to be used as synonyms. The check box will be automatically checked for the one with the radio button checked in the Keyword column.

(3) This area shows choices (candidates) for the keyword and synonyms.

(4) This area shows types of synonym candidates. The meaning of each type is as follows:

Type: Meaning
Current keyword: A keyword for which the Edit button is clicked in the Edit keyword screen.
Current synonym: A synonym that is currently registered as a synonym of the current keyword.
Unused synonym candidate: Among the synonym candidates for the current keyword and the current synonym, a word that is currently registered as a separate keyword or a keyword to which a word registered as a separate synonym belongs.
Aforementioned synonym: A word that is registered in the candidate word file as a synonym candidate, or, a newly added synonym which is currently registered as a synonym of a different keyword.
Unregistered synonym: A word that is registered in the candidate word file as a synonym candidate, or a newly added synonym which is not yet registered as a keyword or synonym.

(5) Click OK to apply the synonym settings to the Edit keyword screen. The keyword file is not yet saved when you click OK. To save the keyword file, you must save it in the Edit keyword screen.

Edit keyword screen after editing synonyms:


In the Edit keyword screen, synonyms are listed to the right of the equal sign.
4.10 Registering New Synonyms
To register new synonyms by entering character strings in the edit synonym screen, follow the procedures below.

Adding a new synonym:
  1. Click the New Synonym button to open the dialog box for entering a synonym.
  2. Type a synonym in the dialog box.
  3. Click OK.
After adding a new synonym:


The specified character string is added as a synonym candidate with the Synonym check box checked. Click the OK button at the bottom of the screen to add it as a new synonym. The keyword file is not saved at this point; therefore, it is necessary to save it in the Edit keyword screen.
4.11 Category Tree Display Mode
The structure of the Edit keyword screen in the category tree display mode is as follows.

Edit keyword screen in the category tree display mode:


The difference between this mode and 4.4 Candidate Words Display Mode is that in this mode, the category tree is displayed instead of the candidate word list.
4.12 Category Search
In the category tree display mode of the Edit keyword screen, you can specify a category to search registered keywords.

Category search:


(1) When the category name is clicked, the message "Selected" appears for that category.

(2) Keywords listed in the Registered keyword list are narrowed to the keywords registered in the specified category. This function can be used with other search or sorting functions.

(3) Click Reset Category Search to cancel the search and restore the original list.
4.13 Registering Keywords in a Category
To register a keyword in a category, follow the procedures below.

Registering a keyword in a category:
  1. In the Registered keyword list, select the check box of a keyword that you want to register in a particular category. You can select multiple keywords simultaneously.
  2. Click the Add button to the right of the category name to register the selected keyword in that category.
After the keyword is registered, the category name appears in the Category area in the Registered keyword table.

After registering a keyword in a category:


Click the Remove button on the lower right side of the category name to cancel the category registration.
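The Add and Remove operations, together with the category search in 4.12, amount to maintaining a mapping from each keyword to the categories it is registered in. The following sketch illustrates that idea only; the class KeywordCategories, its methods, and the sample data are hypothetical and do not reflect the actual keyword file format.

```python
from typing import Dict, Iterable, List, Set


class KeywordCategories:
    """Hypothetical in-memory mapping from keywords to registered categories."""

    def __init__(self) -> None:
        self._categories: Dict[str, Set[str]] = {}

    def register(self, keywords: Iterable[str], category: str) -> None:
        # Corresponds to selecting check boxes and clicking Add for a category.
        for keyword in keywords:
            self._categories.setdefault(keyword, set()).add(category)

    def remove(self, keyword: str, category: str) -> None:
        # Corresponds to clicking Remove next to the category name.
        self._categories.get(keyword, set()).discard(category)

    def categories_of(self, keyword: str) -> List[str]:
        return sorted(self._categories.get(keyword, set()))

    def keywords_in(self, category: str) -> List[str]:
        # Corresponds to the category search: keep only keywords in the category.
        return sorted(k for k, cats in self._categories.items() if category in cats)


registry = KeywordCategories()
registry.register(["laptop", "printer"], "Hardware")
registry.register(["laptop"], "Product")
registry.remove("printer", "Hardware")
print(registry.keywords_in("Hardware"))
print(registry.categories_of("laptop"))
```

As with the other edit operations, changes in such a mapping would exist only in the working copy until the keyword file is saved.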
4.14 Saving and Exiting Edit Mode
After editing the keywords, you must run the termination processing (exit the edit mode), whether or not you added or deleted keywords and whether or not you want to save your changes. If you close the screen without exiting the edit mode, the keyword file (see 1.2 Dictionary Resource Files) remains locked, and other users must interrupt the edit operation before they can edit the category tree or keywords.

Keyword file save/quit menu:


(1) Save the current changes and continue the operation. The file stays locked; therefore, the termination processing (2) or (3) is necessary.

(2) Save the changes and exit the keyword edit mode. The file will be unlocked.

(3) Exit the keyword edit mode without saving changes. The file will be unlocked.

Screen shown after saving and exiting the edit mode:


(1) The termination message appears.

(2) The file is unlocked, and Edit Category Tree, Edit Keywords and Edit Rules become active again.

Terms of Use
Notices
This information was developed for products and services offered in the U.S.A.

IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to:

IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A. 
For license inquiries regarding double-byte (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to:

IBM World Trade Asia Corporation
Licensing
2-31 Roppongi 3-chome, Minato-ku
Tokyo 106-0032, Japan 
The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.

IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created programs and other programs (including this one) and (ii) the mutual use of the information which has been exchanged, should contact:

IBM Corporation
Silicon Valley Lab
Building 090/H-410
555 Bailey Avenue
San Jose, CA 95141-1003
U.S.A.
Such information may be available, subject to appropriate terms and conditions, including in some cases, payment of a fee.

The licensed program described in this document and all licensed material available for it are provided by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement or any equivalent agreement between us.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

All statements regarding IBM's future direction or intent are subject to change or withdrawal without notice, and represent goals and objectives only.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

Copyright License
This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.

Trademarks
This topic lists IBM trademarks and certain non-IBM trademarks.

See http://www.ibm.com/legal/copytrade.shtml for information about IBM trademarks.

The following terms are trademarks or registered trademarks of other companies:

Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.

Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.

Intel, Intel Inside (logos), MMX and Pentium are trademarks of Intel Corporation in the United States, other countries, or both.

UNIX is a registered trademark of The Open Group in the United States and other countries.

Linux is a trademark of Linus Torvalds in the United States, other countries, or both.

Other company, product or service names might be trademarks or service marks of others.