IBM FileNet P8, Version 5.2.1

Text normalization

During indexing and searching, some characters, such as accented characters and characters with umlauts, are replaced with equivalent characters from the basic Latin alphabet. This type of replacement is called normalization. Because it applies to both indexed text and search terms, words such as Müller and Mueller are treated as identical, and a search for either one returns all instances of both.

The following table shows some of the characters that are normalized.

Character    Replacement characters
é            e
ç            c
æ            ae
ü            ue
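
The effect of these replacements can be illustrated with a short sketch. This is not the product's implementation: the REPLACEMENTS mapping and the normalize and matches functions are hypothetical names, and the mapping contains only the four rows from the table, whereas the product normalizes more characters than these. The sketch also assumes that input text is first composed to Unicode NFC form so that a letter followed by a combining mark is treated as a single character.

```python
import unicodedata

# Hypothetical mapping built only from the four table rows above.
# The actual product normalizes more characters than these.
REPLACEMENTS = {
    "é": "e",
    "ç": "c",
    "æ": "ae",
    "ü": "ue",
}


def normalize(text: str) -> str:
    """Replace each mapped character with its Latin replacement."""
    # Compose base letter + combining mark into one code point first,
    # so that "u" followed by U+0308 is seen as "ü".
    composed = unicodedata.normalize("NFC", text)
    return "".join(REPLACEMENTS.get(ch, ch) for ch in composed)


def matches(query: str, indexed_term: str) -> bool:
    """Compare a search term and an indexed term after normalizing both."""
    return normalize(query) == normalize(indexed_term)
```

With this sketch, matches("Müller", "Mueller") evaluates to True, because both terms normalize to Mueller. Applying the same function on the indexing side and the search side is what allows either spelling to find the other.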


Last updated: March 2016

© Copyright IBM Corporation 2016.