When interpreting commands, the database manager must identify which characters are valid, and which are not. To do this, the database manager uses an internal character classification table.
In the table, each of the 256 possible SBCS hexadecimal values is assigned a classification. The database manager uses these classifications to tell whether a character is, for example, a delimiter or a numeric. There are 12 classes, and each hexadecimal value is assigned exactly one of them. The only hexadecimal values that you can reclassify are those that, in the ENGLISH character set, are classified as 0 or 3. Values classified as 0 can be reclassified as 3, and values classified as 3 can be reclassified as 0. No other reclassifications are allowed. The only exception to this rule occurs with certain class 6 characters; see the class 6 description below for details. See Table 57 for the ENGLISH character set classes. Classes other than 0 and 3 are shown only for reference:
Class 0: Any hexadecimal value assigned to this class cannot be used in keywords or unquoted identifiers.
Class 1: The hexadecimal value assigned to this class is a blank. A blank, in the SQL language, is a delimiter between keywords. The database manager uses X'40' for blanks.
Class 2: The hexadecimal value assigned to this class is an apostrophe ('). An apostrophe, in the SQL language, is the delimiter for character constants. The database manager uses X'7D' for an apostrophe.
Class 3: Numerics, uppercase English alphabetics, and underscores all belong to other classes. In the default ENGLISH character set, the lowercase alphabetics along with $, #, and @ are assigned to this class. In the sample character sets, characters such as n-tilde and o-umlaut are assigned to this class.
Class 4: Any hexadecimal value assigned to this class is a numeric. The SQL language defines X'F0' to X'F9' to represent the digits 0 through 9. You must not assign class 4 to any other hexadecimal values, nor can you reassign hexadecimal values X'F0' to X'F9' to some other class.
Class 5: Any hexadecimal value assigned to this class is a period. A period, in the SQL language, is the delimiter between a qualifier (such as an owner) and a data object (such as a table). The database manager uses X'4B' for a period.
Class 6: Hexadecimal values assigned to this class have special meanings in the SQL language, just as numerics do. You must not assign class 6 to any hexadecimal values other than those listed below, nor can you reassign the hexadecimal values shown to some other class. The only exceptions are the characters whose hexadecimal value depends on the application server default CCSID. If a listed hexadecimal value maps to a character used in the SQL language for your application server default CCSID, do not reassign it. If a listed hexadecimal value does not map to a character used in the SQL language for your application server default CCSID, you can assign it to class 0 or class 3.
For example, X'5A' maps to the exclamation mark (!) for CCSID 37, and to the right square bracket (]) for CCSID 500. For CCSID 37, the hexadecimal value should be class 6. For CCSID 500, the hexadecimal value could be either class 0 or class 3. In the SQL language the following hex values have these meanings:
Class 7: Any hexadecimal value assigned to this class is a double quotation mark ("). A double quotation mark, in the SQL language, is the delimiter for quoted identifiers. The database manager uses X'7F' for a double quotation mark.
Class 8: You should not assign any hexadecimal value to this class. When the DBCS option is YES, the database manager assigns this class to X'0E'.
Class 9: You should not assign any hexadecimal value to this class. When the DBCS option is YES, the database manager assigns this class to X'0F'.
Class A: This class is restricted to the English uppercase alphabetics (hexadecimal values X'C1' to X'C9', X'D1' to X'D9', and X'E2' to X'E9'). English uppercase alphabetics can be used in unquoted identifiers and keywords. (This is true no matter what SBCS character set is specified.)
Class B: Any hexadecimal value assigned to this class is an underscore. An underscore, in the SQL language, can be used in an unquoted identifier except as a starting character. The database manager uses X'6D' for an underscore.
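The classification table can be pictured as a 256-entry lookup indexed by hexadecimal value. The following is a minimal sketch, not the database manager's internal format: it copies the ENGLISH class rows from Table 57 and fills X'00' to X'3F' with class 0. The names `ENGLISH_CLASS_ROWS` and `build_english_classes` are hypothetical and exist only for this illustration.

```python
# ENGLISH classes for X'40' to X'FF', copied row by row from Table 57.
# Each line is "<first hex value of the row>: <16 class codes>".
ENGLISH_CLASS_ROWS = """
40: 1 0 0 0 0 0 0 0 0 0 0 5 6 6 6 6
50: 6 0 0 0 0 0 0 0 0 0 6 3 6 6 6 6
60: 6 6 0 0 0 0 0 0 0 0 0 6 6 B 6 6
70: 0 0 0 0 0 0 0 0 0 0 6 3 3 2 6 7
80: 0 3 3 3 3 3 3 3 3 3 0 0 0 0 0 0
90: 0 3 3 3 3 3 3 3 3 3 0 0 0 0 0 0
A0: 0 0 3 3 3 3 3 3 3 3 0 0 0 0 0 0
B0: 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
C0: 0 A A A A A A A A A 0 0 0 0 0 0
D0: 0 A A A A A A A A A 0 0 0 0 0 0
E0: 0 0 A A A A A A A A 0 0 0 0 0 0
F0: 4 4 4 4 4 4 4 4 4 4 0 0 0 0 0 0
"""


def build_english_classes():
    """Return a 256-entry list mapping each hex value to its class code."""
    # X'00' to X'3F' are not in Table 57; they start out as class 0.
    table = ["0"] * 256
    for line in ENGLISH_CLASS_ROWS.strip().splitlines():
        start, codes = line.split(":")
        values = codes.split()
        assert len(values) == 16, "each table row covers 16 hex values"
        base = int(start, 16)
        for offset, code in enumerate(values):
            table[base + offset] = code
    return table


if __name__ == "__main__":
    classes = build_english_classes()
    for value in (0x40, 0x4B, 0x6D, 0x7D, 0x7F, 0xC1, 0xF0):
        print(f"X'{value:02X}' is class {classes[value]}")
```

Looking up X'40', X'7D', X'4B', X'7F', and X'6D' in this list returns classes 1, 2, 5, 7, and B, which matches the class descriptions above.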
When you have defined a character set, you must classify each hexadecimal value that has a different representation in your character set than it does in the ENGLISH character set.
The database manager always sets the first 64 hexadecimal values (X'00' to X'3F') to class 0. You can set only the remaining 192 hexadecimal values. Therefore, if any character in your set has a hexadecimal value within X'00' to X'3F', you can use that hexadecimal value only in quoted identifiers.
The only hexadecimal values that the database manager reclassifies in the first 64 are X'0E' and X'0F'. Those hexadecimal values are permanently defined to the database manager as the DBCS shift-out and shift-in characters. When the DBCS option is YES, the database manager reclassifies X'0E' to class 8 and X'0F' to class 9. For more information, see Using Double-Byte Character Set (DBCS).
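The handling of the first 64 hexadecimal values can be sketched as follows. This is an illustration of the rule as described here, not the database manager's code; the function name `low_range_classes` and its `dbcs_option` parameter are hypothetical.

```python
def low_range_classes(dbcs_option):
    """Return the classes for X'00' to X'3F' as a dict of value -> class code.

    dbcs_option is True when the DBCS option is YES.
    """
    # The database manager always sets these 64 values to class 0.
    classes = {value: "0" for value in range(0x00, 0x40)}
    if dbcs_option:
        # Shift-out and shift-in are reclassified only when DBCS is YES.
        classes[0x0E] = "8"
        classes[0x0F] = "9"
    return classes


if __name__ == "__main__":
    for option in (False, True):
        classes = low_range_classes(option)
        print(f"DBCS={option}: X'0E' -> {classes[0x0E]}, X'0F' -> {classes[0x0F]}")
```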
Not all SBCS character sets can be classified for use with the database manager, because it reserves certain hexadecimal values. For example, all hexadecimal values that (in the ENGLISH character set) represent uppercase English letters are reserved. The database manager reserves hexadecimal values so it can correctly interpret SQL statements.
Use Table 57 to classify your character set. The first column gives the hexadecimal value. The next two columns identify the ENGLISH classification and translation values for each of those hexadecimal values. (Translation values are discussed in the next step.) The fourth and fifth columns, labeled Brazilian in the table, show the classification and translation values for the PORTUGUESE example. The remaining two columns are for your own character set.
Note: All hexadecimal values are reserved except those that are classified as 0 or 3 in the ENGLISH character set.
Any hexadecimal value that is classified in the ENGLISH character set as 0 or 3 can be reclassified as 3 or 0. Keep in mind that a hexadecimal value classified as 0 cannot be used in keywords or unquoted identifiers. Therefore, you would not want to classify as 0 any letter that is within your language's alphabet, because you would not be able to use that letter in unquoted identifiers.
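The reclassification rule can be expressed as a small check. The sketch below assumes you already know the ENGLISH class of a hexadecimal value (from Table 57) and the class you want to give it; the function name `can_reclassify` is hypothetical. It does not model the CCSID-dependent exception for certain class 6 characters.

```python
# Only values whose ENGLISH class is 0 or 3 may change class,
# and only to the other of those two classes.
RECLASSIFIABLE = {"0", "3"}


def can_reclassify(english_class, new_class):
    """Return True if moving a value from english_class to new_class is allowed."""
    if new_class == english_class:
        # Leaving a value in its ENGLISH class is not a reclassification.
        return True
    return english_class in RECLASSIFIABLE and new_class in RECLASSIFIABLE


if __name__ == "__main__":
    print(can_reclassify("0", "3"))  # a national letter promoted to class 3
    print(can_reclassify("A", "3"))  # an uppercase English letter is reserved
```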
The English alphabet consists of the following letters: A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, and Z. Most likely, the hexadecimal values for letters in your language that are not in the English alphabet are classified as 0 in the ENGLISH character set. You would typically change the classification to 3.
If you must reclassify a hexadecimal value, but the hexadecimal value is reserved, then it is not possible to completely classify the character set. In that situation, it may not be to your advantage to specify an alternative character set. For example, if a character in your alphabet has a hexadecimal value that is assigned class 6 in the ENGLISH character set, you cannot reclassify that hexadecimal value (the only exceptions are the hexadecimal values associated with the characters |, !, ¬ and ^).
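To see why the characters |, !, ¬ and ^ are treated as exceptions, it helps to compare their hexadecimal values under two CCSIDs. The sketch below uses Python's `cp037` and `cp500` codecs, on the assumption that they correspond to CCSID 37 and CCSID 500; the helper names `hex_value` and `char_at` are hypothetical.

```python
# Characters named in the text as CCSID-dependent class 6 exceptions.
CCSID_DEPENDENT = ["|", "!", "\u00ac", "^"]  # "\u00ac" is the not sign

# Assumption: these Python codecs correspond to CCSID 37 and CCSID 500.
CODECS = {37: "cp037", 500: "cp500"}


def hex_value(char, ccsid):
    """Return the single-byte value that encodes char under the given CCSID."""
    return char.encode(CODECS[ccsid])[0]


def char_at(value, ccsid):
    """Return the character that a hexadecimal value maps to under the CCSID."""
    return bytes([value]).decode(CODECS[ccsid])


if __name__ == "__main__":
    for ch in CCSID_DEPENDENT:
        columns = ", ".join(
            f"CCSID {ccsid}: X'{hex_value(ch, ccsid):02X}'" for ccsid in CODECS
        )
        print(f"{ch}  {columns}")
    # The X'5A' case from the class 6 description.
    print(char_at(0x5A, 37), char_at(0x5A, 500))
```

A hexadecimal value that holds one of these characters under one CCSID may hold an unrelated character under another, which is why such a value can be moved to class 0 or class 3 when it does not map to an SQL character for your application server default CCSID.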
The rationale used in classifying the PORTUGUESE character set hexadecimal values that are different from the ENGLISH character set is as follows:
Having reclassified the characters, you next need to consider the translation values of those characters.
Table 57. Character Classification and Translation Table
| Hex Value | English Class. | English Trans. | Brazilian Class. | Brazilian Trans. | Your Class. | Your Trans. |
|---|---|---|---|---|---|---|
| 40 41 42 43 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F | 1 0 0 0 0 0 0 0 0 0 0 5 6 6 6 6 | 40 41 42 43 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F | 3 | | | |
| 50 51 52 53 54 55 56 57 58 59 5A 5B 5C 5D 5E 5F | 6 0 0 0 0 0 0 0 0 0 6 3 6 6 6 6 | 50 51 52 53 54 55 56 57 58 59 5A 5B 5C 5D 5E 5F | 0 | | | |
| 60 61 62 63 64 65 66 67 68 69 6A 6B 6C 6D 6E 6F | 6 6 0 0 0 0 0 0 0 0 0 6 6 B 6 6 | 60 61 62 63 64 65 66 67 68 69 6A 6B 6C 6D 6E 6F | 3 | X'5B' | | |
| 70 71 72 73 74 75 76 77 78 79 7A 7B 7C 7D 7E 7F | 0 0 0 0 0 0 0 0 0 0 6 3 3 2 6 7 | 70 71 72 73 74 75 76 77 78 79 7A 7B 7C 7D 7E 7F | 3 | X'7C' | | |
| 80 81 82 83 84 85 86 87 88 89 8A 8B 8C 8D 8E 8F | 0 3 3 3 3 3 3 3 3 3 0 0 0 0 0 0 | 80 C1 C2 C3 C4 C5 C6 C7 C8 C9 8A 8B 8C 8D 8E 8F | | | | |
| 90 91 92 93 94 95 96 97 98 99 9A 9B 9C 9D 9E 9F | 0 3 3 3 3 3 3 3 3 3 0 0 0 0 0 0 | 90 D1 D2 D3 D4 D5 D6 D7 D8 D9 9A 9B 9C 9D 9E 9F | | | | |
| A0 A1 A2 A3 A4 A5 A6 A7 A8 A9 AA AB AC AD AE AF | 0 0 3 3 3 3 3 3 3 3 0 0 0 0 0 0 | A0 A1 E2 E3 E4 E5 E6 E7 E8 E9 AA AB AC AD AE AF | | | | |
| B0 B1 B2 B3 B4 B5 B6 B7 B8 B9 BA BB BC BD BE BF | 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | B0 B1 B2 B3 B4 B5 B6 B7 B8 B9 BA BB BC BD BE BF | 0 6 6 | | | |
| C0 C1 C2 C3 C4 C5 C6 C7 C8 C9 CA CB CC CD CE CF | 0 A A A A A A A A A 0 0 0 0 0 0 | C0 C1 C2 C3 C4 C5 C6 C7 C8 C9 CA CB CC CD CE CF | 3 | X'7B' | | |
| D0 D1 D2 D3 D4 D5 D6 D7 D8 D9 DA DB DC DD DE DF | 0 A A A A A A A A A 0 0 0 0 0 0 | D0 D1 D2 D3 D4 D5 D6 D7 D8 D9 DA DB DC DD DE DF | 3 | X'4A' | | |
| E0 E1 E2 E3 E4 E5 E6 E7 E8 E9 EA EB EC ED EE EF | 0 0 A A A A A A A A 0 0 0 0 0 0 | E0 E1 E2 E3 E4 E5 E6 E7 E8 E9 EA EB EC ED EE EF | | | | |
| F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 FA FB FC FD FE FF | 4 4 4 4 4 4 4 4 4 4 0 0 0 0 0 0 | F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 FA FB FC FD FE FF | | | | |