ULENGTH
The ULENGTH function returns an integer value that is
equal to the number of UTF-8 or UTF-16 characters in a character data
item argument that contains UTF-8 or UTF-16 data.
The function type is integer.
- argument-1
- Must be of class alphabetic, alphanumeric, or national.
argument-1 must contain valid UTF-8 or UTF-16 encoded characters:
- If argument-1 is of class alphabetic or alphanumeric, it must contain valid UTF-8 data.
- If argument-1 is of class national, it must contain valid UTF-16 data.
The returned value is the number of UTF-8 or UTF-16 characters in argument-1.
Example 1
If argument-1 contains UTF-8 data that includes combining characters, each combining character is counted individually when the length is determined. For example, the Unicode character ä can be encoded in UTF-8 as x'C3A4' (precomposed form) or as x'61CC88' (base letter followed by a combining diaeresis). The ULENGTH function returns a different value for each encoding. See the following table for details.


Character | Unicode encoding | UTF-8 encoding | Returned value of the ULENGTH function |
---|---|---|---|
ä | U+00E4 (precomposed form, latin small letter a with diaeresis) | x'C3A4' | 1 |
ä | U+0061 + U+0308 (canonical decomposition, latin small letter a + combining diaeresis) | x'61CC88' | 2 |
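
The two counts in the table can be checked outside COBOL. The following is a minimal Python sketch, not COBOL and not part of the ULENGTH function itself. It assumes that each "UTF-8 character" in the table corresponds to one Unicode code point; the helper name `count_code_points_utf8` is hypothetical and introduced only for illustration.

```python
import unicodedata


def count_code_points_utf8(data: bytes) -> int:
    """Count Unicode code points in UTF-8 data.

    Assumption: this mirrors the counting rule shown in the table.
    Invalid UTF-8 raises UnicodeDecodeError.
    """
    return len(data.decode("utf-8"))


precomposed = bytes.fromhex("C3A4")    # U+00E4
decomposed = bytes.fromhex("61CC88")   # U+0061 followed by U+0308

print(count_code_points_utf8(precomposed))  # 1
print(count_code_points_utf8(decomposed))   # 2

# NFC normalization composes the base letter and the combining mark,
# so the normalized bytes match the precomposed form.
normalized = unicodedata.normalize("NFC", decomposed.decode("utf-8")).encode("utf-8")
print(normalized == precomposed)  # True
```

If a count of user-perceived characters is needed, one option is to normalize the data to a composed form before counting, as the last step of the sketch suggests. Not every combining sequence has a precomposed equivalent, so normalization does not guarantee one code point per perceived character.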

Example 2
If argument-1 is a national data item that contains UTF-16 data with surrogate pairs, each pair of high and low surrogates is counted as one UTF-16 character. For example, if B is a national item that contains the UTF-16 value x'005400F6006200750072D858DC6B0073' ('Töbur𦁫s'), ULENGTH(B) returns 7. The character 𦁫 (x'D858DC6B') is counted as one UTF-16 character.
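
The difference between counting characters and counting 16-bit code units can be illustrated with a short Python sketch. This is not COBOL; it assumes that the hexadecimal value in the example is big-endian UTF-16, and the helper names are hypothetical.

```python
def count_code_points_utf16be(data: bytes) -> int:
    """Count Unicode code points in big-endian UTF-16 data.

    A high/low surrogate pair decodes to a single code point.
    Assumption: the data is big-endian, as the example value is written.
    """
    return len(data.decode("utf-16-be"))


def count_code_units_utf16(data: bytes) -> int:
    """Count 16-bit code units (two bytes each)."""
    return len(data) // 2


b_value = bytes.fromhex("005400F6006200750072D858DC6B0073")

print(count_code_points_utf16be(b_value))  # 7
print(count_code_units_utf16(b_value))     # 8
```

The value occupies eight 16-bit code units, but the surrogate pair x'D858DC6B' contributes only one character, which gives the count of 7 stated in the example.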
