When planning for XML collections, you have different considerations for composing documents from DB2 data, decomposing XML document into DB2 data, or both. The following sections address planning issues for XML collections, and address composition and decomposition considerations.
After you choose an access and storage method, you can determine whether to validate your data. You validate XML data using a DTD. Using a DTD ensures that the XML document is valid and lets you perform structured searches on your XML data. The DTD is stored in the DTD repository.
Recommendation: Validate XML data with a DTD. To validate, you need to have a DTD in the XML Extender repository. To learn how to insert a DTD into the repository, see Storing a DTD in the DTD repository. The DTD requirements differ depending on whether you are composing or decomposing XML documents.
Important: Make the decision whether to validate XML data before inserting XML data into DB2. The XML Extender does not support the validation of data that has already been inserted into DB2.
Considerations:
For XML collections, the DAD file maps the structure of the XML document to the DB2 tables from which you either compose the document, or to where you decompose the document.
For example, if you have an element called <Tax> in your XML document, you might need to map <Tax> to a column called TAX. You define the relationship between the XML data and the relational data in the DAD.
The DAD file is specified either while enabling a collection, or when you use the DAD file in XML collection stored procedures. The DAD is an XML-formatted document, residing at the client. If you choose to validate XML documents with a DTD, the DAD file can be associated with that DTD. When used as the input parameter of the XML Extender stored procedures, the DAD file has a data type of CLOB.
To specify the XML collection access and storage method, you use the following tag in the DAD file:
An XML collection is a virtual name for a set of relational tables that contains XML data. Applications can enable an XML collection of any user tables. These user tables can be existing tables of legacy business data or tables that the XML Extender recently created. You access XML collection data mainly through the stored procedures that the XML Extender provides.
The DAD file defines the XML document tree structure, using the following kinds of nodes:
Figure 10 shows a fragment of the mapping that is used in a DAD file. The nodes map the XML document content to table columns in a relational table.
<?xml version="1.0"?> <!DOCTYPE DAD SYSTEM "c:\dtd\dad.dtd"> <DAD> ... <Xcollection> <SQL_stmt> ... </SQL_stmt> <prolog>?xml version="1.0"?</prolog> <doctype>!DOCTYPE DAD SYSTEM "c:\dxx\sample\dtd\getstart.dtd"</doctype> <root_node> <element_node name="Order"> --> Identifies the element <Order> <attribute_node name="key"> --> Identifies the attribute "key" <column name="order_key"/> --> Defines the name of the column, "order_key", to which the element and attribute are mapped </attribute_node> <element_node name="Customer"> --> Identifies a child element of <Order> as <Customer> <text_node> --> Specifies the CDATA text for the element <Customer> <column name="customer"> --> Defines the name of the column, "customer", to which the child element is mapped </text_node> </element_node> ... </element_node> ... <root_node> </Xcollection> </DAD> |
In this example, the first two columns in the SQL statement have elements and attributes mapped to them.
You can use the XML Extender administration wizard or an editor to create and update the DAD file.
If you are using an XML collection, you must select a mapping scheme that defines how XML data is represented in a relational database. Because XML collections must match a hierarchical structure that is used in XML documents with a relational structure, you should understand how the two structures compare. Figure 11 shows how the hierarchical structure can be mapped to relational table columns.
Figure 11. XML document structured mapped to relational table columns
![]() |
The XML Extender uses the mapping scheme when composing or decomposing XML documents that are located in multiple relational tables. The XML Extender provides a wizard that assists you in creating the DAD file. However, before you create the DAD file, you must think about how your XML data is mapped to the XML collection.
The mapping scheme is specified in the <Xcollection> element in the DAD file. The XML Extender provides two types of mapping schemes: SQL mapping and Relational Database (RDB_node) mapping. Both methods use the XSLT model to define the hierarchy of the XML document.
The SQL mapping gives DB2 users the power to map the data using SQL. When using SQL mapping, you must be able to join all tables in one SELECT statement to form a query. If one SQL statement is not sufficient, consider using RDB_node mapping. To tie all tables together, the primary key and foreign key relationship is recommended among these tables.
The RDB_node contains one or more node definitions for tables, optional columns, and optional conditions. The tables and columns are used to define how the XML data is to be stored in the database. The condition specifies the criteria for selecting XML data or the way to join the XML collection tables.
To define a mapping scheme, you create a DAD with an <Xcollection> element. Figure 12 shows a fragment of a sample DAD file with an XML collection SQL mapping that composes a set of XML documents from data in three relational tables.
<?xml version="1.0"?> <!DOCTYPE DAD SYSTEM "c:\dtd\dad.dtd"> <DAD> <dtdid>c:\dxx\samples\dad\getstart.dtd</dtdid> <validation>YES</validation> <Xcollection> <SQL_stmt> SELECT o.order_key, customer, p.part_key, quantity, price, tax, date, mode, comment FROM order_tab o, part_tab p, table(select substr(char(timestamp(generate_unique())), as ship_id, date, mode, from ship_tab) as s WHERE p.price > 2500.00 and s.date > "1996-06-01" AND p.order_key = o.order_key and s.part_key = p.part_key </SQL_stmt> <prolog>?xml version="1.0"?</prolog> <doctype>!DOCTYPE DAD SYSTEM "c:\dxx\samples\dtd\getstart.dtd"</doctype> <root_node> <element_node name="Order"> <attribute_node name="key"> <column_name="order_key"/> </attribute_node> <element_node name="Customer"> <text_node> <column name="customer"/> </text_node> <element_node> ... </element_node><!-end Part-> </element_node><!-end Order-> </root_node> </Xcollection> </DAD> |
The XML Extender provides several stored procedures that manage data in an XML collection. These stored procedures support both types of mapping, but require that the DAD file follow the rules that are described in Mapping scheme requirements.
The following sections describe requirements for each type of the XML collection mapping schemes.
In this mapping scheme, you must specify the SQL_stmt element in the DAD <Xcollection> element. The SQL_stmt should contain a single SQL statement that can join multiple relational tables with the query predicate. In addition, the following clauses are required:
SELECT o.order_key, customer, p.part_key, quantity, price, tax, ship_id, date, mode
In this example, order_key and customer from table ORDER_TAB have the highest relational level because they are higher on the hierarchical tree of the XML document. The ship_id, date, and mode from table SHIP_TAB are at the lowest relational level.
FROM order_tab as o, part_tab as p, table(select substr(char(timestamp(generate_unique())),16) as ship_id, date, mode from ship_tab) as s
In this example, the generate_unique() function is cast to a CHAR data type of TIMESTAMP, and it is given an alias named ship_id.
WHERE p.price > 2500.00 AND s.date > "1996-06-01" AND p.order_key = o.order_key AND s.part_key = p.part_key
Although the SQL_stmt has the preceding requirements, it is powerful because you can specify any predicate in your WHERE clause, as long as the expression in the predicate uses the columns in the tables.
When using this mapping method, do not use the element SQL_stmt in the <Xcollection> element of the DAD file. Instead, use the RDB_node element in each of the top nodes for element_node and for each attribute_node and text_node.
The top element_node in the DAD file represents the root element of the XML document. Specify an RDB_node for the top element_node as follows:
<element_node name="Order"> <RDB_node> <table name="order_tab"/> <table name="part_tab"/> <table name="ship_tab"/> <condition> order_tab.order_key = part_tab.order_key AND part_tab.part_key = ship_tab.part_key </condition> </RDB_node>
<table name="part_tab" key="part_key, price"/>
The information specified for decomposition is ignored when composing a document.
You must explicitly spell out the table name and the column name.
In this mapping scheme, the data resides in the attribute_node and text_node for each element_node. Therefore, the XML Extender needs to know from where in the database it needs to find the data. You need to specify an RDB_node for each attribute_node and text_node, telling the stored procedure from which table, which column, and under which query condition to get the data. You must specify the table and column values; the condition value is optional.
<element_node name="Price"> <text_node> <RDB_node> <table name="part_tab"/> <column name="price"/> <condition> price > 2500.00 </condition> </RDB_node> </text_node> </element_node>
<column name="order_key" type="integer"/>
The information specified for decomposition is ignored when composing a document.
With the RDB_node mapping approach, you don't need to supply SQL statements. However, putting complex query conditions in the RDB_node element can be more difficult. For example, using a union, expression, or operation is somewhat less powerful than the SQL-to-XML approach.
Decomposition uses RDB_node mapping to specify how an XML document is decomposed into DB2 tables by extracting the element and attribute values into table rows. The values from each XML document are stored in one or more DB2 tables. Each table can have a maximum of 1024 rows decomposed from each document.
For example, if an XML document is decomposed into five tables, each of the five tables can have up to 1024 rows for that particular document. If the table has rows for multiple documents, it can have up to 1024 rows for each document. If the table has 20 documents, it can have 20,480 rows, 1024 for each document.
Using multiple-occuring elements (elements with location paths that can occur more than once in the XML structure) affects the number of rows. For example, a document that contains an element <Part> that occurs 20 times, might be decomposed as 20 rows in a table. When using multiple occuring elements, consider this table size restriction.