SQL Reference

Function Resolution

Given a function invocation, the database manager must decide which of the possible functions with the same name is the "best" fit. This includes resolving functions from the built-in and user-defined functions.

An argument is a value passed to a function upon invocation. When a function is invoked in SQL, it is passed a list of zero or more arguments. They are positional in that the semantics of an argument are determined by its position in the argument list. A parameter is a formal definition of an input to a function. When a function is defined to the database, either internally (the built-in functions) or by a user (user-defined functions), its parameters (zero or more) are specified, the order of their definitions defining their positions and thus their semantics. Therefore, every parameter is a particular positional input of a function. On invocation, an argument corresponds to a particular parameter by virtue of its position in the list of arguments.

The database manager uses the name of the function given in the invocation, the number and data types of the arguments, all the functions with the same name in the SQL path, and the data types of their corresponding parameters as the basis for deciding whether or not to select a function. The following are the possible outcomes of the decision process:

A particular function is deemed to be the best fit. For example, given the functions named RISK in the schema TEST with signatures defined as:
```
    TEST.RISK(INTEGER)
    TEST.RISK(DOUBLE)
```
a SQL path including the TEST schema and the following function reference (where DB is a DOUBLE column):
```
    SELECT ... RISK(DB) ...
```
then, the second RISK will be chosen.
The following function reference (where SI is a SMALLINT column):
```
    SELECT ... RISK(SI) ...
```
would choose the first RISK, since SMALLINT can be promoted to INTEGER and is a better match than DOUBLE which is further down the precedence list (as shown in Table 5).
No function is deemed to be an acceptable fit. For example, given the same two functions in the previous case and the following function reference (where C is a CHAR(5) column):
```
    SELECT ... RISK(C) ...
```
the argument is inconsistent with the parameter of both RISK functions.
A particular function is selected based on the SQL path and the number and data types of the arguments passed on invocation. For example, given functions named RANDOM with signatures defined as:
```
    TEST.RANDOM(INTEGER)
    PROD.RANDOM(INTEGER)
```
and a SQL path of:
```
    "TEST","PROD"
```
Then the following function reference:
```
    SELECT ... RANDOM(432) ...
```
will choose TEST.RANDOM since both RANDOM functions are equally good matches (exact matches in this particular case) and both schemas are in the path, but TEST precedes PROD in the SQL path.

Method of Choosing the Best Fit

A comparison of the data types of the arguments with the defined data types of the parameters of the functions under consideration forms the basis for the decision of which function in a group of like-named functions is the "best fit". Note that the data type of the result of the function or the type of function (column, scalar, or table) under consideration does not enter into this determination.

Function resolution is done using the steps that follow.

First, find all functions from the catalog (SYSCAT.FUNCTIONS) and built-in functions such that all of the following are true:
1. For invocations where the schema name was specified (i.e. qualified references), then the schema name and the function name match the invocation name.
2. For invocations where the schema name was not specified (i.e. unqualified references), then the function name matches the invocation name and has a schema name that matches one of the schemas in the SQL path.
3. The number of defined parameters matches the invocation.
4. Each invocation argument matches the function's corresponding defined parameter in data type, or is "promotable" to it (see Promotion of Data Types).
Next, consider each argument of the function invocation, from left to right. For each argument, eliminate all functions that are not the best match for that argument. The best match for a given argument is the first data type appearing in the precedence list corresponding to the argument data type in Table 5 for which there exists a function with a parameter of that data type. Lengths, precisions, scales and the "FOR BIT DATA" attribute are not considered in this comparison. For example, a DECIMAL(9,1) argument is considered an exact match for a DECIMAL(6,5) parameter, and a VARCHAR(19) argument is an exact match for a VARCHAR(6) parameter.
If more than one candidate function remains after Step 2, then it has to be the case (the way the algorithm works) that all the remaining candidate functions have identical signatures but are in different schemas. Choose the function whose schema is earliest in the user's SQL path.
If there are no candidate functions remaining after step 2, an error is returned (SQLSTATE 42884).

Conservative Binding Semantics

There are situations within a database where the functions and data types are resolved when the statement is processed and the database manager must be able to repeat this resolution. This is true of references to functions and data types in:

static DML statements in packages,
views,
triggers, and
check constraints.

For static DML statements in packages, the function and data type references are resolved during the bind operation. Function and data type references in views, triggers, and check constraints are resolved when the object is created.

If function resolution is performed again on any function references in these objects, it could change the behaviour if a new function has been added with a signature that is a better match but the actual executable perfoms different operations. Similarly, if resolution is performed again on any data type in these objects, it could change the behaviour if a new data type has been added with the same name in a different schema that is also on the SQL path. In order to avoid this, where necessary, the database manager applies the concept of conservative binding semantics. This concept ensures that the function and data type references will be resolved using the same SQL path as when it was bound and that the creation timestamp of functions ²⁹ and data types considered during resolution is not later than the time when the statement was bound ³⁰ . In this way, only the functions and data types that were considered during function and data type resolution when the statement was originally processed will be considered. Hence, newly created functions and data types are not considered when conservative binding semantics are applied.

For static DML in packages, the packages can rebind either implicitly or by explicitly issuing the REBIND command (or corresponding API) or the BIND command (or corresponding API). The implicit rebind is always performed to resolve functions with conservative binding semantics. The REBIND command provides the choice to resolve with conservative binding semantics (RESOLVE CONSERVATIVE option) or to resolve considering any new functions (by default or using the RESOLVE ANY option).

Function Path Considerations for Built-in Functions

Built-in functions reside in a special schema called SYSIBM. Additional functions are available in the SYSFUN schema which are not considered built-in functions since they are developed as user-defined functions and have no special processing considerations. Users can not define additional functions in SYSIBM or SYSFUN schemas (or in any schema whose name begins with the letters "SYS").

As already stated, the built-in functions participate in the function resolution process exactly as do the user-defined functions. One difference between built-in and user-defined functions, from a function resolution perspective, is that the built-in function must always be considered by function resolution. Therefore, omission of SYSIBM from the path results in an assumption for function and data type resolution that SYSIBM is the first schema on the path.

For example, if a user's SQL path is defined as:

     "SHAREFUN","SYSIBM","SYSFUN"

and there is a LENGTH function defined in schema SHAREFUN with the same number and types of arguments as SYSIBM.LENGTH, then an unqualified reference to LENGTH in this user's SQL statement will result in selecting SHAREFUN.LENGTH. However, if the user's SQL path is defined as:

     "SHAREFUN","SYSFUN"

and the same SHAREFUN.LENGTH function exists, then an unqualified reference to LENGTH in this user's SQL statement will result in selecting SYSIBM.LENGTH since SYSIBM is implicitly first in the path because it was not specified.

It is possible to minimize potential problems in this area by:

never using the names of built-in functions for user-defined functions, or
qualifying any references to these functions, if for some reason it is deemed necessary to create a user-defined function with the same name as a built-in function.

Example of Function Resolution

The following is an example of successful function resolution.

There are six FOO functions, in two different schemas, registered as (note that not all required keywords appear):

 CREATE FUNCTION AUGUSTUS.FOO (CHAR(5), INT, DOUBLE)  SPECIFIC FOO_1 ...
 CREATE FUNCTION AUGUSTUS.FOO (INT, INT, DOUBLE)      SPECIFIC FOO_2 ...
 CREATE FUNCTION AUGUSTUS.FOO (INT, INT, DOUBLE, INT) SPECIFIC FOO_3 ...
 CREATE FUNCTION JULIUS.FOO   (INT, DOUBLE, DOUBLE)   SPECIFIC FOO_4 ...
 CREATE FUNCTION JULIUS.FOO   (INT, INT, DOUBLE)      SPECIFIC FOO_5 ...
 CREATE FUNCTION JULIUS.FOO   (SMALLINT, INT, DOUBLE) SPECIFIC FOO_6 ...
 CREATE FUNCTION NERO.FOO     (INT, INT, DEC(7,2))    SPECIFIC FOO_7 ...

The function reference is as follows (where I1 and I2 are INTEGER columns, and D is a DECIMAL column):

    SELECT ... FOO(I1, I2, D) ...

Assume that the application making this reference has a SQL path established as:

    "JULIUS","AUGUSTUS","CAESAR"

Following through the algorithm...

FOO_7 is eliminated as a candidate, because the schema "NERO" is not included in the SQL path.

FOO_3 is eliminated as a candidate, because it has the wrong number of parameters. FOO_1 and FOO_6 are eliminated because in both cases the first argument cannot be promoted to the data type of the first parameter.

Because there is more than one candidate remaining, the arguments are then considered in order.

For the first argument, all remaining functions -- FOO_2, FOO_4 and FOO_5 are an exact match with the argument type. No functions can be eliminated from consideration, therefore the next argument must be examined.

For this second argument, FOO_2 and FOO_5 are exact matches while FOO_4 is not, so it is eliminated from consideration. The next argument is examined to determine some differentiation between FOO_2 and FOO_5.

On the third and last argument, neither FOO_2 nor FOO_5 match the argument type exactly, but both are equally good.

There are two functions remaining, FOO_2 and FOO_5, with identical parameter signatures. The final tie-breaker is to see which function's schema comes first in the SQL path, and on this basis FOO_5 is finally chosen.

Function Invocation

Once the function is selected, there are still possible reasons why the use of the function may not be permitted. Each function is defined to return a result with a specific data type. If this result data type is not compatible with the context in which the function is invoked, an error will occur. For example, given functions named STEP defined, this time, with different data types as the result:

    STEP(SMALLINT) returns CHAR(5)
    STEP(DOUBLE)   returns INTEGER

and the following function reference (where S is a SMALLINT column):

    SELECT ... 3 + STEP(S) ...

then, because there is an exact match on argument type, the first STEP is chosen. An error occurs on the statement because the result type is CHAR(5) instead of a numeric type as required for an argument of the addition operator.

A couple of other examples where this can happen are as follows, both of which will result in an error on the statement:

The function was referenced in a FROM clause, but the function selected by the function resolution step was a scalar or column function.
The reverse case, where the context calls for a scalar or column function, and function resolution selects a table function.

In cases where the arguments of the function invocation were not an exact match to the data types of the parameters of the selected function, the arguments are converted to the data type of the parameter at execution using the same rules as assignment to columns (see Assignments and Comparisons). This includes the case where precision, scale, or length differs between the argument and the parameter.

Footnotes:

²⁹: Built-in functions added starting with Version 6.1 have a creation timestamp that is based on the time of database creation or migration.
³⁰: For views, conservative binding also ensures that the columns of the view are the same as those that existed at the time the view was created. For example, a view defined using the asterisk in the select list will not consider any columns added to the underlying tables after the view was created.

[ Top of Page | Previous Page | Next Page | Table of Contents | Index ]

[ DB2 List of Books | Search the DB2 Books ]