IBM FileNet P8, Version 5.2.1

Text normalization

During searching and indexing, some characters, such as accented characters and characters with umlauts, are replaced with equivalent characters from the Latin alphabet. This type of replacement is called normalization. As a result, words that differ only in such characters, such as Müller and Mueller, are treated as identical words: a search for either word returns all instances of both.

The following table shows some of the characters that are normalized.

Character   Replacement characters
é           e
ç           c
æ           ae
ü           ue
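
The effect of these replacements can be illustrated with a minimal sketch. It is not the product's implementation: the names NORMALIZATION_MAP, normalize, and matches are hypothetical, and the mapping contains only the four lowercase characters listed in the table, whereas the product normalizes more characters than are shown here. The sketch assumes that a query term and an indexed term match when their normalized forms are equal.

```python
import unicodedata

# Hypothetical mapping built only from the four table rows above;
# the product's actual normalization covers more characters.
NORMALIZATION_MAP = {
    "é": "e",
    "ç": "c",
    "æ": "ae",
    "ü": "ue",
}


def normalize(text: str) -> str:
    """Replace each mapped character with its Latin-alphabet equivalent."""
    # Compose base letter + combining mark into one code point first,
    # so that "u" followed by a combining diaeresis is seen as "ü".
    composed = unicodedata.normalize("NFC", text)
    return "".join(NORMALIZATION_MAP.get(ch, ch) for ch in composed)


def matches(query: str, indexed_term: str) -> bool:
    """Assumption: two terms match when their normalized forms are equal."""
    return normalize(query) == normalize(indexed_term)


if __name__ == "__main__":
    print(normalize("Müller"))
    print(matches("Müller", "Mueller"))
```

Because the same function is applied to both the query and the indexed term in this sketch, Müller and Mueller reduce to the same string, which is why a search for either one would return both.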


Last updated: October 2015

© Copyright IBM Corporation 2015.