Administration Guide
To support user-defined index types, DB2 Universal Database allows you to
create and apply your own logic for the primary components that make up how an
index works. Those components that can be substituted are:
- Index maintenance. This allows the ability to map index column
content to an index key. Such a mapping is done through a user-defined
mapping function. Exactly one structured type column can participate in
an extended index. Unlike an ordinary index, an extended index may have
more than one index entry per row. Multiple index entries per row could
enable a text document to be stored as an object with a separate index entry
for each keyword in the document.
- Index exploitation. This enables the application designer to
associate filtering conditions (range predicates) with an arbitrary
user-defined function (UDF) that would otherwise be opaque to the
optimizer. This enables DB2 to avoid making a separate UDF call for
each row, and thereby avoids context switching between client and server,
greatly improving performance.
Note: | The user-defined function definition must be deterministic and must not allow
external actions in order to be exploitable by the optimizer.
|
An optional data filter function can also be specified. The
optimizer uses the filter against the fetched tuple before the user-defined
function is evaluated.
Only a structured type or distinct type column can use the index extension
to create a user-defined extended index type on these objects. The
user-defined extended index type must not:
- Be defined with clustering indexes
- Have INCLUDE columns
You define two of the components that make up the operations of an
index through the CREATE INDEX EXTENSION statement.
Index maintenance is the process of transforming the index column content
(or source key) to a target index key. The transformation process is
defined using a table function that has previously been defined in the
database.
The FROM SOURCE KEY clause specifies a structured data type or distinct
type for the source key column supported by this index extension. A
single parameter name and data type are given and associated with the source
key column.
The GENERATE KEY USING clause specifies the user-defined table function
used to generate the index key. The output from this function must be
specified in the TARGET KEY clause specification (which is presented
below). The output from this function can also be used as input for the
index filtering function specified on the FILTER USING clause.
Index searching maps search arguments to search ranges.
The WITH TARGET KEY clause specifies the target key parameters that are the
output of the key generation function specified on the GENERATE KEY USING
clause. A single parameter name and data type are given and associated
with the target key column. This parameter corresponds to the columns
of the RETURNS table of the user-defined table function of the GENERATE KEY
USING clause.
The SEARCH METHODS clause introduces one or more search methods defined for
the index. Each search method consists of a method name, search
arguments, a range producing function, and an optional index filter
function. Each search method defines how index search ranges for the
underlying user-defined index are produced by a user-defined table
function. Further, each search method defines how the index entries in
a particular search range can be further qualified by a user-defined scalar
function.
- The WHEN clause associates a label with a search method. The label
is an SQL identifier that relates to the method name specified in the index
exploitation rule (found in the PREDICATES clause of a user-defined
function). One or more parameter names and data types are given for use
as arguments in the range function and/or index filtering
function. The WHEN clause specifies the action that can be taken by the
optimizer when the PREDICATES clause of the CREATE FUNCTION statement matches
an incoming query.
- The RANGE THROUGH clause specifies the user-defined external table
function that produces index key ranges. This enables the optimizer to
avoid calling the associated UDF when the index keys fall outside the key
ranges.
- The FILTER USING clause is an optional way of specifying a user-defined
external table function or a case expression used to filter index entries
returned from the range-producing function. If the value returned by
the index filter function or case expression is 1, the row corresponding to
the index entry is retrieved from the table. If the value returned is
something other than 1, the index entry is discarded. This feature is
valuable when the cost of the secondary filter is low compared to the cost of
evaluating the original method, and the selectivity of the secondary filter is
relatively low.
Index exploitation occurs in the evaluation of the search
method.
The CREATE FUNCTION (External Scalar) statement creates a user-defined
predicate used with the search methods defined for the index extension.
The PREDICATES clause identifies those predicates using this function that
can possibly exploit the index extensions (and that can possibly use the
optional SELECTIVITY clause for the predicate's search condition).
If the PREDICATES clause is specified, the function must be defined as
DETERMINISTIC with NO EXTERNAL ACTION.
- The WHEN clause introduces a specific use of the function being defined in
a predicate with a comparison operator (=, >, <, and others) and a
constant or expression (using the EXPRESSION AS clause). When a
predicate uses this function with the same comparison operator and the given
constant or expression, filtering and index exploitation may be used.
The use of a constant is provided mainly to cover Boolean expressions where
the result type is either a 1 or a 0. For all other cases, the
EXPRESSION AS clause is the better choice.
- The FILTER USING clause identifies a filter function that can be used to
perform additional filtering of the result table. It is an alternative
and faster version of the defined function (used in the predicate) that
reduces the number of rows on which the user-defined predicate must be
executed to determine if rows qualify. Should the results produced by
the index be close to the results expected by the user-defined predicate, then
the application of this filter function may be redundant.
- You can optionally define a set of rules for each search method of an
index extension to exploit the index. You can also define a search
method in the index extension to describe the search targets, the search
arguments, and how these can be used to perform the index search.
- The SEARCH BY INDEX EXTENSION clause identifies the index
extension.
- The optional EXACT clause indicates that the index lookup is exact in its
predicate evaluation. This clause tells the database not to apply the
original user-provided predicate function or the filter function after the
index lookup. If the index lookup is not used, then the original
predicate and the filter functions have to be applied. If the EXACT
clause is not used, then the original user-provided predicate is applied after
the index lookup. The EXACT predicate is useful when the index lookup
returns the same results as the predicate. This prevents the query
execution from applying the user-defined predicate on the results obtained
from the index lookup. If the index is expected to provide only an
approximation of the predicate, do not specify the EXACT clause.
- The WHEN KEY clause defines the search target. Only one search
target is specified for a key. The value given following the WHEN KEY
clause identifies a parameter name of the function being defined. This
clause is evaluated as true when the values of the named parameter are columns
that are covered by an index based on the index extension specified.
- The USE clause defines the search argument. The search argument
identifies which method defined in the index extension will be used.
The method name given here must match a method defined in the index
extension. The one or more parameter values identify parameter names of
the function being defined and which must be different from any of the
parameter names specified in the search target. The number of parameter
values and the data type of each must match the parameters defined for the
method in the index extension. The match must be exact for built-in and
distinct data types, and be within the same structure types.
A scenario for defining an index extension follows:
- Define the structured types (for shapes). Use the CREATE TYPE
statement to define a type hierarchy where shape is a supertype and nullshape,
point, line, and polygon are subtypes. These structured types model
spatial entities. For example, the location of a store is a point;
the path of a river is a line; and, the boundary of a business zone is a
polygon. A minimum bounded rectangle (mbr) is an attribute. The
gtype attribute identifies whether the associated entity is a point, a line,
or a polygon. Geographical boundaries are modeled by numpart, numpoint,
and geometry attributes. All other attributes are ignored because they
are of no interest to this scenario.
- Create the index extension.
- Use the CREATE FUNCTION statement to create functions that are used for
key transformation (gridentry), range-producing (gridrange), and index filter
(checkduplicate and mbroverlap).
- Use the CREATE INDEX EXTENSION statement to create the remaining needed
components of the index.
- Create the key transformation which corresponds to the index maintenance
component of an index.
CREATE INDEX EXTENSION iename (parm_name datatype, ...)
FROM SOURCE KEY (parm_name datatype)
GENERATE KEY USING table_function_invocation
...
The FROM SOURCE KEY clause identifies the parameter and data type of the
key transformation. The GENERATE KEY USING clause identifies the
function used to map the source key with the value generated from the
function.
- Define the range-producing and index-filter functions which correspond to
the index search component of an index.
CREATE INDEX EXTENSION iename (parm_name datatype, ...)
...
WITH TARGET KEY
WHEN method_name (parm_name datatype, ...)
RANGE THROUGH range_producing_function_invocation
FILTER USING index_filtering_function_invocation
The WITH TARGET KEY clause identifies the search method definition.
The WHEN clause identifies the method name. The RANGE THROUGH clause
identifies the function used to limit the scope of the index to be
used. The FILTER USING clause identifies the function used to eliminate
unnecessary items from the resulting index values.
Note: | The FILTER USING clause could identify a case expression instead of an index
filtering function.
|
- Define the predicates to exploit the index extension.
CREATE FUNCTION within (x shape, y shape)
RETURNS INTEGER
...
PREDICATES
WHEN = 1
FILTER USING mbrWithin (x..mbr..xmin, ...)
SEARCH BY INDEX EXTENSION grid_extension
WHEN KEY (parm_name) USE method_name(parm_name)
The PREDICATES clause introduces one or more predicates that are started
with each WHEN clause. The WHEN clause begins the specification for the
predicate with a comparison operator followed by either a constant or an
EXPRESSION AS clause. The FILTER USING clause identifies a filter
function that can be used to perform additional filtering of the result
table. This is a cheaper version of the defined function (used in the
predicate) that reduces the number of rows on which the user-defined predicate
must be executed to determine the rows that qualify. The SEARCH BY
INDEX EXTENSION clause specifies where the index exploitation takes
place. Index exploitation defines the set of rules using the search
method of an index extension that can be used to exploit the index. The
WHEN KEY clause specifies the exploitation rule. The exploitation rule
describes the search targets and search arguments as well as how they can be
used to perform the index search through a search method.
- Define a filter function.
CREATE FUNCTION mbrWithin (...)
The function defined here is created for use in the predicate of the
index extension.
In order for the query optimizer to successfully exploit indexes created to
improve query performance, a SELECTIVITY option is available on function
invocation. In cases where you have some idea of the percentage of rows
that the predicate may return, you can use the SELECTIVITY option on function
invocation to help the DB2 optimizer choose a more efficient access
path.
In the following example, the within user-defined function
computes the center and radius (based on the first and second parameters,
respectively), and builds a statement string with an appropriate
selectivity:
SELECT * FROM customer
WHERE within(loc, circle(100, 100, 10) = 1 SELECTIVITY .05
In this example, the indicated predicate (SELECTIVITY
.05) filters out 95 percent of the rows in the
customer table.
[ Top of Page | Previous Page | Next Page ]