The following sections describe the planning tasks for XML columns.
After you choose an access and storage method, you can determine whether to validate your data and you can specify the DAD. You validate XML data using a DTD. Using a DTD ensures that the XML document is valid and lets you perform structured searches on your XML data. The DTD is stored in the DTD repository.
Recommendation: Validate XML data with a DTD, unless you are storing XML documents for archival purposes. To validate, you need to have a DTD in the XML Extender repository. See Storing a DTD in the DTD repository to learn how to insert a DTD into the repository.
You can validate documents in the same XML column using different DTDs. In other words, you can have documents that have a similar structure, with similar elements and attributes, that call DTDs that are different. To reference multiple DTDs, use the following guidelines:
Important: Make the decision whether to validate before inserting XML data into DB2. The XML Extender does not support the validation of data that has already been inserted into DB2.
Considerations:
You store an XML document in an XML column as a UDT. See Table 4 for the available UDTs.
Table 4. The XML Extender UDTs
User-defined type column | Source data type | Usage description |
---|---|---|
XMLVARCHAR | VARCHAR(varchar_len) | Stores an entire XML document as VARCHAR inside DB2. |
XMLCLOB | CLOB(clob_len) | Stores an entire XML document as CLOB inside DB2. |
XMLFILE | VARCHAR(1024) | Stores the file name of an XML document in DB2, and stores the XML document in a file local to the DB2 server. |
When planning for side tables, you must consider how to organize the tables, how many tables to create, and whether to create a default view for the side tables. These decisions are partly based on serveral issues: whether elements and attributes can occur multiple times, and the requirements for query performance.
When a document has multiple occuring location paths, XML Extender will add a column DXX_SEQNO of type INTEGER in each side table to keep track of the order of elements that occur more than once. With DXX_SEQNO, you can retrieve a list of the elements using the same order as the original XML document by specifying ORDER BY DXX_SEQNO in an SQL query.
When you enable an XML column, you can specify a default, read-only view that joins the application table with the side tables using a unique ID, called the ROOT ID. With the default view, you can search XML documents by querying the side tables. For example, if you have the application table SALES_TAB, and the side tables ORDER_TAB, PART_TAB and SHIP_TAB:
SELECT sales_person FROM sales_order_view WHERE price > 2500.00
The SQL statement returns the names of sales people in SALES_TAB who have orders stored in the column ORDER, and where the PRICE is greater than 2500.00.
The advantage of querying the default view is that it provides a virtual single view of the application table and side tables. However, the more side tables that are created, the more expensive the query. Therefore, creating the default view is only recommended when the total number of side table columns is small. Applications can create their own views, joining the important side table columns.
An important planning decision is whether to index your XML column document. This decision should be made based on how often you need to access the data and how critical performance is during structural searches.
When using XML columns, which contain entire XML documents, you can create side tables to contain columns of XML element or attribute values, then create indexes on these columns. You must determine for which elements and attributes you need to create the index.
XML column indexing allows frequently queried data of general data types, such as integer, decimal, or date, to be indexed using the native DB2 index support from the database engine. The XML Extender extracts the values of XML elements or attributes from XML documents and stores them in the side tables, allowing you to create indexes on these side tables.
You can specify each column of a side table with a location path that identifies an XML element or attribute and an SQL data type. Figure 8 shows an XML column with side tables.
Figure 8. An XML column with side tables
![]() |
The XML Extender automatically populates the side table when you store XML documents in the XML column.
For fast search, create indexes on these columns using the DB2 B-tree indexing technology. The methods that are used to create an index vary on different operating systems, and the XML Extender supports these methods.
For example, you might want to create an index on /Order/Part/ExtendedPrice and specify /Order/Part/ExtendedPrice to be of data type REAL. In this case, the XML Extender stores the value of /Order/Part/ExtendedPrice in the PRICE column in a side table.
If the single primary key does not exist in the application table, or for some reason you don't want to use it, the XML Extender alters the application table to add a column DXXROOT_ID, which stores a unique ID that is created at the insertion time. All side tables have a DXXROOT_ID column with the unique ID. If the primary key is used as the ROOT ID, all side tables have a column with the same name and type as the primary key column in the application table, and the values of the primary keys are stored.
A location path is a sequence of XML tags that identify an XML element or attribute. The XML Extender uses the location path in the following situations:
Figure 9 shows an example of a location path and its relationship to the structure of the XML document.
Figure 9. Storing documents as structured XML documents in a DB2 table column
![]() |
The following list describes the location path syntax that is supported by the XML Extender. A single slash (/) path indicates that the context is the whole document.
Wildcards: You can substitute an asterisk for an element in a location path to match any string.
Simple location path is a sequence of element type
names that are connected by a single slash (/). The attribute values
are enclosed within square brackets following its element type. The
syntax descriptions for 3 and 6 in Location path syntax describe simple location paths. Table 5 describes the syntax.
Table 5. Simple location path syntax
Subject | Location path | Description |
---|---|---|
XML element | /tag1/tag2/..../tagn-1/tagn | An element content identified by the element named tagn and its parents |
XML attribute |
/tag_1/tag_2/..../tag_n-1/tag_n/@attr1
| An attribute with name attr1 of the element identified by tagn and its parents |
The XML Extender has restrictions for using the location path when defining
the element or attribute in the DAD. Because the XML Extender uses
one-to-one mapping between an element or attribute, and a DB2 column, it
requires special rules for the location path. Table 6 describes the restrictions for location path. The
numbers that are specified in the location path supported column refer to the
syntax representations in Location path syntax.
Table 6. The XML Extender's restrictions using location path
Use of the location path | Location path supported |
---|---|
Element in the DAD | 3, 6 (simple location path described in Table 5) |
Extracting UDFs | 1-9 |
Text Extender's search UDF | 1-9 |
For XML columns, the DAD primarily specifies how documents that are stored in an XML column are to be indexed. The DAD is an XML-formatted document, residing at the client. If you choose to validate XML documents with a DTD, the DAD file can be associated with that DTD. The DAD file has a data type of CLOB.
The DAD file for XML columns contains an XML header, specifies the directory paths on the client for the DAD file and DTD, and provides a map of any XML data that is to be stored in side tables for indexing.
To specify the XML column access and storage method, you use the following tag in the DAD file.
An XML-enabled column is of the XML Extender's UDT. Applications can include the column in any user table. You access the XML column data mainly through SQL statements and the XML Extender's UDFs.
You can use the XML Extender administration wizard or an editor to create and update the DAD.