Application Development Guide

Using Functions and Methods

Scalar and column UDFs and methods can be invoked within an SQL statement wherever an expression is valid (there are additional rules for all column functions that limit validity). Table UDFs can only be referenced in the FROM clause of a SELECT. The SQL Reference discusses all these contexts in detail. The discussion and examples used in this section focus on relatively simple SELECT statement contexts, but note that their use is not restricted to these contexts.

See UDF And Method Concepts for a summary of the use and importance of the function path and the function selection algorithm. You can find the details for both of these concepts in the SQL Reference. The resolution of any Data Manipulation Language (DML) reference to a function uses the function selection algorithm, so it is important to understand how it works.

Referring to Functions

Each reference to a function, whether it is a UDF, or a built-in function, contains the following syntax:

>>-function_name--(--+--------------------+---)----------------><
                     |  .-,------------.  |
                     |  V              |  |
                     '----expression---+--'

In the above, function_name can be either an unqualified or a qualified function name, and the arguments can number from 0 to 90, and are expressions which may contain:

A column name, qualified or unqualified
A constant
A host variable
A special register
A parameter marker. (For information on the limitations of parameter marker use, refer to the section in the SQL Reference that describes the rules for parameter markers.)

The position of the arguments is important and must conform to the function definition for the semantics to be correct. Both the position of the arguments and the function definition must conform to the function body itself. DB2 does not attempt to shuffle arguments to better match a function definition, and DB2 does not understand the semantics of the individual function parameters.

Use of column names in UDF argument expressions requires that the table references which contain the column have proper scope. For table functions referenced in a join, this means that for any argument which involves columns from another table or table function, that other table or table function must appear before the table function containing the reference, in the FROM clause. For a complete discussion of the rules for using columns in the arguments of table functions, refer to the SQL Reference.

Examples of Function Invocations

Some valid examples of function invocations are:

      AVG(FLOAT_COLUMN) 
      BLOOP(COLUMN1) 
      BLOOP(FLOAT_COLUMN + CAST(? AS INTEGER)) 
      BLOOP(:hostvar :indicvar) 
      BRIAN.PARSE(CHAR_COLUMN CONCAT USER, 1, 0, 0, 1) 
      CTR() 
      FLOOR(FLOAT_COLUMN) 
      PABLO.BLOOP(A+B) 
      PABLO.BLOOP(:hostvar) 
      "search_schema"(CURRENT FUNCTION PATH, 'GENE')
      SUBSTR(COLUMN2,8,3) 
      SYSFUN.FLOOR(AVG(EMP.SALARY)) 
      SYSFUN.AVG(SYSFUN.FLOOR(EMP.SALARY)) 
      SYSIBM.SUBSTR(COLUMN2,11,LENGTH(COLUMN3)) 
      SQRT(SELECT SUM(length*length) 
           FROM triangles 
           WHERE id= 'J522' 
           AND legtype <> 'HYP')

Note that if any of the above functions are table functions, the syntax to reference them is slightly different than presented above. For example, if PABLO.BLOOP is a table function, to properly reference it, use:

     TABLE(PABLO.BLOOP(A+B)) AS Q

Using Parameter Markers in Functions

An important restriction involves parameter markers; you cannot simply code the following:

     BLOOP(?)

As the function selection logic does not know what data type the argument may turn out to be, it cannot resolve the reference. You can use the CAST specification to provide a type for the parameter marker, for example INTEGER, and then the function selection logic can proceed:

     BLOOP(CAST(? AS INTEGER))

Using Qualified Function Reference

If you use a qualified function reference, you restrict DB2's search for a matching function to that schema. For example, you have the following statement:

     SELECT PABLO.BLOOP(COLUMN1) FROM T

Only the BLOOP functions in schema PABLO are considered. It does not matter that user SERGE has defined a BLOOP function, or whether or not there is a built-in BLOOP function. Now suppose that user PABLO has defined two BLOOP functions in his schema:

     CREATE FUNCTION BLOOP (INTEGER) RETURNS ... 
     CREATE FUNCTION BLOOP (DOUBLE) RETURNS ...

BLOOP is thus overloaded within the PABLO schema, and the function selection algorithm would choose the best BLOOP, depending on the data type of the argument, column1. In this case, both of the PABLO.BLOOPs take numeric arguments, and if column1 is not one of the numeric types, the statement will fail. On the other hand if column1 is either SMALLINT or INTEGER, function selection will resolve to the first BLOOP, while if column1 is DECIMAL, DOUBLE, REAL, or BIGINT, the second BLOOP will be chosen.

Several points about this example:

It illustrates argument promotion. The first BLOOP is defined with an INTEGER parameter, yet you can pass it a SMALLINT argument. The function selection algorithm supports promotions among the built-in data types (for details, refer to the SQL Reference) and DB2 performs the appropriate data value conversions.
If for some reason you want to invoke the second BLOOP with a SMALLINT or INTEGER argument, you have to take an explicit action in your statement as follows:
```
     SELECT PABLO.BLOOP(DOUBLE(COLUMN1)) FROM T
```
Alternatively, if you want to invoke the first BLOOP with a DECIMAL or DOUBLE argument, you have your choice of explicit actions, depending on your exact intent:
```
     SELECT PABLO.BLOOP(INTEGER(COLUMN1)) FROM T
     SELECT PABLO.BLOOP(FLOOR(COLUMN1)) FROM T 
     SELECT PABLO.BLOOP(CEILING(COLUMN1)) FROM T 
     SELECT PABLO.BLOOP(INTEGER(ROUND(COLUMN1,0))) FROM T
```
You should investigate these other functions in the SQL Reference. The INTEGER function is a built-in function in the SYSIBM schema. The FLOOR, CEILING, and ROUND functions are UDFs shipped with DB2, which you can find in the SYSFUN schema along with many other useful functions.

Using Unqualified Function Reference

If, instead of a qualified function reference, you use an unqualified function reference, DB2's search for a matching function normally uses the function path to qualify the reference. In the case of the DROP FUNCTION or COMMENT ON FUNCTION functions, the reference is qualified using the current authorization ID, if they are unqualified. Thus, it is important that you know what your function path is, and what, if any, conflicting functions exist in the schemas of your current function path. For example, suppose you are PABLO and your static SQL statement is as follows, where COLUMN1 is data type INTEGER:

     SELECT BLOOP(COLUMN1) FROM T

You have created the two BLOOP functions cited in Using Qualified Function Reference, and you want and expect one of them to be chosen. If the following default function path is used, the first BLOOP is chosen (since column1 is INTEGER), if there is no conflicting BLOOP in SYSIBM or SYSFUN:

     "SYSIBM","SYSFUN","PABLO"

However, suppose you have forgotten that you are using a script for precompiling and binding which you previously wrote for another purpose. In this script, you explicitly coded your FUNCPATH parameter to specify the following function path for another reason that does not apply to your current work:

     "KATHY","SYSIBM","SYSFUN","PABLO"

If Kathy has written a BLOOP function for her own purposes, the function selection could very well resolve to Kathy's function, and your statement would execute without error. You are not notified because DB2 assumes that you know what you are doing. It becomes your responsibility to identify the incorrect output from your statement and make the required correction.

Summary of Function References

For both qualified and unqualified function references, the function selection algorithm looks at all the applicable functions, both built-in and user-defined, that have:

The given name
The same number of defined parameters as arguments in the function reference
Each parameter identical to or promotable from the type of the corresponding argument.

(Applicable functions means functions in the named schema for a qualified reference, or functions in the schemas of the function path for an unqualified reference.) The algorithm looks for an exact match, or failing that, a best match among these functions. The current function path is used, in the case of an unqualified reference only, as the deciding factor if two identically good matches are found in different schemas. The details of the algorithm can be found in the SQL Reference.

An interesting feature, illustrated by the examples at the end of Using Qualified Function Reference, is the fact that function references can be nested, even references to the same function. This is generally true for built-in functions as well as UDFs; however, there are some limitations when column functions are involved.

Refining an earlier example:

     CREATE FUNCTION BLOOP (INTEGER) RETURNS INTEGER ... 
     CREATE FUNCTION BLOOP (DOUBLE) RETURNS INTEGER ...

Now consider the following DML statement:

     SELECT BLOOP( BLOOP(COLUMN1)) FROM T

If column1 is a DECIMAL or DOUBLE column, the inner BLOOP reference resolves to the second BLOOP defined above. Because this BLOOP returns an INTEGER, the outer BLOOP resolves to the first BLOOP.

Alternatively, if column1 is a SMALLINT or INTEGER column, the inner bloop reference resolves to the first BLOOP defined above. Because this BLOOP returns an INTEGER, the outer BLOOP also resolves to the first BLOOP. In this case, you are seeing nested references to the same function.

A few additional points important for function references are:

By defining a function with the name of one of the SQL operators, you can actually invoke a UDF using infix notation. For example, suppose you can attach some meaning to the "+" operator for values which have distinct type BOAT. You can define the following UDF:
```
     CREATE FUNCTION "+" (BOAT, BOAT) RETURNS ...
```
Then you can write the following valid SQL statement:
```
     SELECT BOAT_COL1 + BOAT_COL2 
     FROM BIG_BOATS 
     WHERE BOAT_OWNER = 'Nelson Mattos'
```
But you can also write the equally valid statement:
```
     SELECT "+"(BOAT_COL1, BOAT_COL2) 
     FROM BIG_BOATS 
     WHERE BOAT_OWNER = 'Nelson Mattos' 
```
Note that you are not permitted to overload the built-in conditional operators such as >, =, LIKE, IN, and so on, in this way. See Example: Integer Divide Operator for an example of a UDF which overloads the divide (/) operator.
The function selection algorithm does not consider the context of the reference in resolving to a particular function. Look at these BLOOP functions, modified a bit from before:
```
     CREATE FUNCTION BLOOP (INTEGER) RETURNS INTEGER ... 
     CREATE FUNCTION BLOOP (DOUBLE) RETURNS CHAR(10)...
```
Now suppose you write the following SELECT statement:
```
     SELECT 'ABCDEFG' CONCAT BLOOP(SMALLINT_COL) FROM T
```
Because the best match, resolved using the SMALLINT argument, is the first BLOOP defined above, the second operand of the CONCAT resolves to data type INTEGER. The statement fails because CONCAT demands string arguments. If the first BLOOP was not present, the other BLOOP would be chosen and the statement execution would be successful.
Another type of contextual inconsistency that causes a statement to fail is if a given function reference resolves to a table function in a context that requires a scalar or column function. The reverse could also occur. A reference could resolve to a scalar or column function when a table function is necessary.
UDFs and methods can be defined with parameters or results having any of the LOB types: BLOB, CLOB, or DBCLOB. DB2 will materialize the entire LOB value in storage before invoking such a function, even if the source of the value is a LOB locator host variable. For example, consider the following fragment of a C language application:
```
     EXEC SQL BEGIN DECLARE SECTION; 
       SQL TYPE IS CLOB(150K) clob150K ;        /* LOB host var */ 
       SQL TYPE IS CLOB_LOCATOR clob_locator1;  /* LOB locator host var */ 
       char                   string[40];       /* string host var */ 
     EXEC SQL END DECLARE SECTION; 
```
Either host variable :clob150K or :clob_locator1 is valid as an argument for a function whose corresponding parameter is defined as CLOB(500K). Thus, referring to the FINDSTRING defined in Example: String Search, both of the following are valid in the program:
```
     ... SELECT FINDSTRING (:clob150K, :string) FROM ... 
     ... SELECT FINDSTRING (:clob_locator1, :string) FROM ...
```
UDF parameters or results which have one of the LOB types can be created with the AS LOCATOR modifier. In this case, the entire LOB value is not materialized prior to invocation. Instead, a LOB LOCATOR is passed to the UDF, which can then use the special UDF APIs to manipulate the actual bytes of the LOB value (see Using LOB Locators as UDF Parameters or Results for details).
You can also use this capability on UDF parameters or results which have a distinct type that is based on a LOB. This capability is limited to UDFs defined as not-fenced. Note that the argument to such a function can be any LOB value of the defined type; it does not have to be a host variable defined as one of the LOCATOR types. The use of host variable locators as arguments is completely orthogonal to the use of AS LOCATOR in UDF parameters and result definitions.
UDFs and methods can be defined with distinct types as parameters or as the result. (Earlier examples have illustrated this.) DB2 will pass the value to the UDF in the format of the source data type of the distinct type.
Distinct type values which originate in a host variable and which are used as arguments to a UDF which has its corresponding parameter defined as a distinct type, must be explicitly cast to the distinct type by the user. There is no host language type for distinct types. DB2's strong typing necessitates this. Otherwise your results may be ambiguous. So, consider the BOAT distinct type which is defined over a BLOB, and consider the BOAT_COST UDF from Example: External Function with UDT Parameter, which takes an object of type BOAT as its argument. In the following fragment of a C language application, the host variable :ship holds the BLOB value that is to passed to the BOAT_COST function:
```
     EXEC SQL BEGIN DECLARE SECTION; 
       SQL TYPE IS BLOB(150K) ship; 
     EXEC SQL END DECLARE SECTION;
```
Both of the following statements correctly resolve to the BOAT_COST function, because both cast the :ship host variable to type BOAT:
```
     ... SELECT BOAT_COST (BOAT(:ship)) FROM ... 
     ... SELECT BOAT_COST (CAST(:ship AS BOAT)) FROM ...
```
If there are multiple BOAT distinct types in the database, or BOAT UDFs in other schema, you must exercise care with your function path. Otherwise your results may be ambiguous.

[ Top of Page | Previous Page | Next Page ]