IBM Books

Application Development Guide


Interface between DB2 and a UDF

This section discusses some of the details of the interface between DB2 and a UDF, and discusses the sqludf.h include file which makes the interface manageable. This include file only applies to C and C++ UDFs. For information on coding UDFs in Java, see Coding a Java UDF.

The Arguments Passed from DB2 to a UDF

In addition to the SQL arguments which are specified in the DML reference to the function, DB2 passes additional arguments to the external UDF. For C and C++, all of these arguments are passed in the order shown in Passing Arguments to a UDF. Java UDFs take only the SQL-argument and SQL-result arguments, but can call extra methods to access the other information. Java UDFs have the same restrictions on the resulting SQL-state and diagnostic-message arguments documented below. For information on coding UDFs in Java, see Coding a Java UDF.

Syntax for Passing Arguments to a UDF
 
                              .--------------.
                              V              |
>>-+----------------------+-----SQL-result---+------------------>
   |  .----------------.  |
   |  V                |  |
   '----SQL-argument---+--'
 
                                     .------------------.
                                     V                  |
>-----+--------------------------+-----SQL-result-ind---+------->
      |  .--------------------.  |
      |  V                    |  |
      '----SQL-argument-ind---+--'
 
>----SQL-state---function-name---specific-name------------------>
 
>----diagnostic-message---+------------+---+-----------+-------->
                          '-scratchpad-'   '-call-type-'
 
>----+--------+------------------------------------------------><
     '-dbinfo-'
 

Note:Each of the above arguments passed to the external function is a pointer to the value, and not the actual value.

The arguments are described as follows:

SQL-argument
This argument is set by DB2 before calling the UDF. This value repeats n times, where n is the number of arguments specified in the function reference. The value of each of these arguments is taken from the expression specified in the function invocation. It is expressed in the data type of the corresponding parameter definition in the CREATE FUNCTION statement. How these data types map to C language constructs is described in How the SQL Data Types are Passed to a UDF.

SQL-result
This argument is set by the UDF before returning to DB2. For scalar functions there is exactly one SQL-result. For table functions there is one SQL-result for each result column of the function defined in the RETURNS TABLE clause of the CREATE FUNCTION statement. They correspond by position to the columns defined in the RETURNS TABLE clause. That is, the first SQL-result argument corresponds to the first column defined in the RETURNS TABLE clause, and so on.

For both scalar functions and table functions, DB2 allocates the buffer and passes its address to the UDF. The UDF puts each result value into the buffer. Enough buffer space is allocated by DB2 to contain the value expressed in the data type. For scalar functions, this data type is defined in the CAST FROM clause, if it is present, or in the RETURNS clause, if no CAST FROM clause is present. For table functions, the data types are defined in the RETURNS TABLE(...) clause. For information on how these types map to C language constructs, see How the SQL Data Types are Passed to a UDF.

Note that for table functions, DB2 defines a performance optimization where every defined column does not have to be returned to DB2. If you write your UDF to take advantage of this feature, it returns only the columns required by the statement referencing the table function.

For example, consider a CREATE FUNCTION statement for a table function defined with 100 result columns. If a given statement referencing the function is only interested in two of them, this optimization enables the UDF to return only those two columns for each row and not spend time on the other 98 columns. See the dbinfo argument below for more information on this optimization.

For each value returned, (that is, a single value for a scalar function, and in general, multiple values for a table function), the UDF code should not return more bytes than is required for the data type and length of the result. DB2 will attempt to determine if the UDF body has written beyond the end of the result buffer by a few bytes, returning SQLCODE -450 (SQLSTATE 39501). However, a major overwrite by the UDF that DB2 does not detect can cause unpredictable results or an abnormal termination.

SQL-argument-ind
This argument is set by DB2 before calling the UDF. It can be used by the UDF to determine if the corresponding SQL-argument is null or not. The nth SQL-argument-ind corresponds to the nth SQL-argument (described above). It contains one of the following values:

0
The argument is present and not null.

-1
The argument is present and its value is null.

If the function is defined with NOT NULL CALL, the UDF body does not need to check for a null value. However, if it is defined with NULL CALL, any argument can be NULL and the UDF should check it.

The indicator takes the form of a SMALLINT value, and this can be defined in your UDF as described in How the SQL Data Types are Passed to a UDF.

SQL-result-ind
This argument is set by the UDF before returning to DB2. There is one of these for each SQL-result argument.

This argument is used by the UDF to signal if the particular result value is null:

0 or positive
The result is not null
negative
The result is the null value. For more information, see Interpreting Negative SQL-result-ind Values.

Interpreting Negative SQL-result-ind Values:

DB2 treats the function result as null (-2) if the following is true:

This is also true if you define the function with the NOT NULL CALL option.

Even if the function is defined with NOT NULL CALL, the UDF body must set the indicator of the result. For example, a divide function could set the result to null when the denominator is zero.

The indicator takes the form of a SMALLINT value, and this can be defined in your UDF as described in How the SQL Data Types are Passed to a UDF.

If the UDF takes advantage of table function optimization using the RESULT column list, then only the indicators corresponding to the required columns need be set.

SQL-state
This argument is set by the UDF before returning to DB2. It takes the form of a CHAR(5) value. Ensure that the argument definition in the UDF is appropriate for a CHAR(5) as described in How the SQL Data Types are Passed to a UDF, and can be used by the UDF to signal warning or error conditions. It contains the value '00000', when the function is called. The UDF can set the value to the following:

00000
The function code did not detect any warning or error situations.

01Hxx
The function code detected a warning situation. This results in a SQL warning, SQLCODE +462 (SQLSTATE 01Hxx ). Here 'xx' is any string.

02000
Only valid for the FETCH call to table functions, it means that there are no more rows in the table.

38502
A special value for the case where the UDF body attempted to issue an SQL call and received an error, SQLCODE -487 (SQLSTATE 38502). because SQL is not allowed in UDFs), and chose to pass this same error back through to DB2.

Any other 38xxx
The function code detected an error situation. It results in a SQL error, SQLCODE -443 (SQLSTATE 38xxx). Here 'xxx' is any string. Do not use 380xx through 384xx because those values are reserved by the draft extensions to the SQL92 international standard, or 385xx because those values are reserved by IBM.

Any other value is treated as an error situation resulting in SQLCODE -463 (SQLSTATE 39001).

function-name
This argument is set by DB2 before calling the UDF. It is the qualified function name, passed from DB2 to the UDF code. This variable takes the form of a VARCHAR(27) value. Ensure that the argument definition in the UDF is appropriate for a VARCHAR(27). See How the SQL Data Types are Passed to a UDF for more information.

The form of the function name that is passed is:

     <schema-name>.<function-name>

The parts are separated by a period. Two examples are:

 
     PABLO.BLOOP       WILLIE.FINDSTRING

This form enables you to use the same UDF body for multiple external functions, and still differentiate between the functions when it is invoked.
Note:Although it is possible to include the period in object names and schema names, it is not recommended. For example, if a function, rotate is in a schema, obj.op, the function name that is returned is obj.op.rotate, and it is not obvious if the schema name is obj or obj.op.

specific-name
This argument is set by DB2 before calling the UDF. It is the specific name of the function passed from DB2 to the UDF code. This variable takes the form of a VARCHAR(18) value. Ensure that the argument definition in the UDF is appropriate for a VARCHAR(18). See How the SQL Data Types are Passed to a UDF for more information. Two examples are:
 
     willie_find_feb99       SQL9904281052440430

This first value is provided by the user in his CREATE FUNCTION statement. The second value is generated by DB2 from the current timestamp if the user does not specify a value.

As with the function-name argument, the reason for passing this value is to give the UDF the means of distinguishing exactly which specific function is invoking it.

diagnostic-message
This argument is set by the UDF before returning to DB2. The UDF can use this argument to insert a message text in a DB2 message. It takes the form of a VARCHAR(70) value. Ensure that the argument definition in the UDF is appropriate for a VARCHAR(70). See How the SQL Data Types are Passed to a UDF for more information.

When the UDF returns either an error or a warning, using the SQL-state argument described above, it can include descriptive information here. DB2 includes this information as a token in its message.

DB2 sets the first character to null before calling the UDF. Upon return, it treats the string as a C null-terminated string. This string will be included in the SQLCA as a token for the error condition. At least the first part of this string will appear in the SQLCA or DB2 CLP message. However, the actual number of characters which will appear depends on the lengths of the other tokens, because DB2 may truncate the tokens to conform to the restrictive limit on total token length imposed by the SQLCA. Avoid using X'FF' in the text since this character is used to delimit tokens in the SQLCA.

The UDF code should not return more text than will fit in the VARCHAR(70) buffer which is passed to it. DB2 will attempt to determine if the UDF body has written beyond the end of this buffer by a few characters, SQLCODE -450 (SQLSTATE 39501). However, an overwrite by the UDF can cause unpredictable results or an abend, as it may not be detected by DB2.

DB2 assumes that any message tokens returned from the UDF to DB2 are in the same code page as the database. Your UDF should ensure that If this is the case. If you use the 7-bit invariant ASCII subset, your UDF can return the message tokens in any code page.

scratchpad
This argument is set by DB2 before calling the UDF. It is only present if the CREATE FUNCTION statement for the UDF specified the SCRATCHPAD keyword. This argument is a structure, exactly like the structure used to pass a value of any of the LOB data types, with the following elements:

The scratchpad can be mapped in your UDF using the same type as either a CLOB or a BLOB, since the argument passed has the same structure. See How the SQL Data Types are Passed to a UDF for more information.

Ensure your UDF code does not make changes outside of the scratchpad buffer. DB2 attempts to determine if the UDF body has written beyond the end of this buffer by a few characters, SQLCODE -450 (SQLSTATE 39501), but a major overwrite by the UDF can cause unpredictable results, or an abend, and may not result in a graceful failure by DB2.

If a scalar UDF which uses a scratchpad is referenced in a subquery, DB2 may decide to refresh the scratchpad between invocations of the subquery. This refresh occurs after a final-call is made, if FINAL CALL is specified for the UDF.

call-type
This argument, if present, is set by DB2 before calling the UDF. For scalar functions this argument is only present if FINAL CALL is specified in the CREATE FUNCTION statement, but for table functions it is ALWAYS present. It follows the scratchpad argument; or the diagnostic-message argument if the scratchpad argument is not present. This argument takes the form of an INTEGER value. Ensure that this argument definition in the UDF is appropriate for INTEGER. See How the SQL Data Types are Passed to a UDF for more information.

Note that even though all the current possible values are listed below, your UDF should contain a switch or case statement which explicitly tests for all the expected values, rather than containing "if A do AA, else if B do BB, else it must be C so do CC" type logic. This is because it is possible that additional call types may be added in the future, and if you don't explicitly test for condition C you will have trouble when new possibilities are added.

Notes:

  1. For all the call-types, it may be appropriate for the UDF to set a SQL-state and diagnostic-message return value. This information will not be repeated in the following descriptions of each call-type. For all calls DB2 will take the indicated action as described previously for these arguments.

  2. The include file sqludf.h is intended for use with UDFs and is described in The UDF Include File: sqludf.h. The file contains symbolic defines for the following call-type values, which are spelled out as constants.

For scalar functions call-type contains:

-1
This is the FIRST call to the UDF for this statement. The scratchpad (if any) is set to binary zeros when the UDF is called. All argument values are passed, and the UDF should do whatever one-time initialization actions are required. In addition, a FIRST call to a scalar UDF is like a NORMAL call, in that it is expected to develop and return an answer.

Note that if SCRATCHPAD is specified but FINAL CALL is not, then the UDF will not have this call-type argument to identify the very first call. Instead it will have to rely on the all-zero state of the scratchpad.

0
This is a NORMAL call. All the SQL input values are passed, and the UDF is expected to develop and return the result. The UDF may also return SQL-state and diagnostic-message information.

1
This is a FINAL call, that is no SQL-argument or SQL-argument-ind values are passed, and attempts to examine these values may cause unpredictable results. If a scratchpad is also passed, it is untouched from the previous call. The UDF is expected to release resources at this point.

Releasing resources.

A scalar UDF is expected to release resources it has required, for example, memory. If FINAL CALL is specified for the UDF, then that FINAL call is a natural place to release resources, provided that SCRATCHPAD is also specified and is used to track the resource. If FINAL CALL is not specified, then any resource acquired should be released on the same call.

For table functions call-type contains:

-2
This is the FIRST call, which only occurs if the FINAL CALL keyword was specified for the UDF. The scratchpad is set to binary zeros before this call. Argument values are passed to the table function, and it may choose to acquire memory or perform other one-time only resource initialization. Note that this is not an OPEN call, that call follows this one. On a FIRST call the table function should not return any data to DB2 as DB2 ignores the data.

-1
This is the OPEN call. The scratchpad will be initialized if NO FINAL CALL is specified, but not necessarily otherwise. All SQL argument values are passed to the table function on OPEN. The table function should not return any data to DB2 on the OPEN call.

0
This is a FETCH call, and DB2 expects the table function to return either a row comprising the set of return values, or an end-of-table condition indicated by SQLSTATE value '02000'. If a scratchpad is passed to the UDF, then on entry it is untouched from the previous call.

1
This is a CLOSE call to the table function. It balances the OPEN call, and can be used to perform any external CLOSE processing (for example, closing a source file), and resource release (particularly for the NO FINAL CALL case).

In cases involving a join or a subquery, the OPEN/FETCH.../CLOSE call sequences can repeat within the execution of a statement, but there is only one FIRST call and only one FINAL call. The FIRST and FINAL call only occur if FINAL CALL is specified for the table function.

2
This is a FINAL call, which only occurs if FINAL CALL was specified for the table function. It balances the FIRST call, and occurs only once per execution of the statement. It is intended for the purpose of releasing resources.

Releasing resources.

Write UDFs to release any resources that they acquire. For table functions, there are two natural places for this release: the CLOSE call and the FINAL call. The CLOSE call balances each OPEN call and can occur multiple times in the execution of a statement. The FINAL call only occurs if FINAL CALL is specified for the UDF, and occurs only once per statement.

If you can apply a resource across all OPEN/FETCH/CLOSE sequences of the UDF, write the UDF to acquire the resource on the FIRST call and free it on the FINAL call. The scratchpad is a natural place to track this resource. For table functions, if FINAL CALL is specified, the scratchpad is initialized only before the FIRST call. If FINAL CALL is not specified, then it is reinitialized before each OPEN call.

If a resource is specific to each OPEN/FETCH/CLOSE sequence, write the UDF to free the resource on the CLOSE call. (Note that when a table function is in a subquery or join, it is very possible that there will be multiple occurrences of the OPEN/FETCH/CLOSE sequence, depending on how the DB2 Optimizer chooses to organize the execution of the statement.)

dbinfo
This argument is set by DB2 before calling the UDF. It is only present if the CREATE FUNCTION statement for the UDF specifies the DBINFO keyword. The argument is the sqludf_dbinfo structure defined in the header file sqludf.h, which is discussed in The UDF Include File: sqludf.h. The variables in this structure that contain names and identifiers may be longer than the longest value possible in this release of DB2, but are defined this way for compatibility with future releases. You can use the length variable that complements each name and identifier variable to read or extract the portion of the variable that is actually used. The dbinfo structure contains the following elements:

  1. Data base name length (dbnamelen)

    The length of data base name below. This field is an unsigned short integer.

  2. Data base name (dbname)

    The name of the currently connected database. This field is a long identifier of 128 characters. The data base name length field described above identifies the actual length of this field. It does not contain a null terminator or any padding.

  3. Application Authorization ID Length (authidlen)

    The length of application authorization ID below. This field is an unsigned short integer.

  4. Application authorization ID (authid)

    The application run time authorization ID. This field is a long identifier of 128 characters. It does not contain a null terminator or any padding. The application authorization ID length field described above identifies the actual length of this field.

  5. Database code page (codepg)

    This is a union of two 48-byte long structures; one is used by DB2 Universal Database, the other is reserved for future use. The structure used by DB2 Universal Database contains the following fields:

    1. SBCS. Single byte code page, an unsigned long integer.
    2. DBCS. Double byte code page, an unsigned long integer.
    3. COMP. Composite code page, an unsigned long integer.

  6. Schema name length (tbschemalen)

    The length of schema name below. Contains 0 (zero) if a table name is not passed. This field is an unsigned short integer.

  7. Schema name (tbschema)

    Schema for the table name below. This field is a long identifier of 128 characters. It does not contain a null terminator or any padding. The schema name length field described above identifies the actual length of this field.

  8. Table name length (tbnamelen)

    The length of the table name below. Contains 0 (zero) if a table name is not passed. This field is an unsigned short integer.

  9. Table name (tbname)

    This is the name of the table being updated or inserted. This field is set only if the UDF reference is the right-hand side of a SET clause in an UPDATE statement, or an item in the VALUES list of an INSERT statement. This field is a long identifier of 128 characters. It does not contain a null terminator or any padding. The table name length field described above, identifies the actual length of this field. The schema name field above, together with this field form the fully qualified table name.

  10. Column name length (colnamelen)

    Length of column name below. It contains a 0 (zero) if a column name is not passed. This field is an unsigned short integer.

  11. Column name (colname)

    Under the exact same conditions as for table name, this field contains the name of the column being updated or inserted; otherwise not predictable. This field is a long identifier of 128 characters. It does not contain a null terminator or any padding. The column name length field described above, identifies the actual length of this field.

  12. Version/Release number (ver_rel)

    An 8 character field that identifies the product and its version, release, and modification level with the format pppvvrrm where:

    • ppp identifies the product as follows:
      DSN
      DB2 for MVS/ESA or OS/390
      ARI
      SQL/DS
      QSQ
      DB2 Universal Database for AS/400
      SQL
      DB2 Universal Database
    • vv is a two digit version identifier.
    • rr is a two digit release identifier.
    • m is a one digit modification level identifier.

  13. Platform (platform)

    The operating platform for the application server, as follows:

    SQLUDF_PLATFORM_AIX
    AIX
    SQLUDF_PLATFORM_HP
    HP-UX
    SQLUDF_PLATFORM_MVS
    OS/390
    SQLUDF_PLATFORM_NT
    Windows NT
    SQLUDF_PLATFORM_OS2
    OS/2
    SQLUDF_PLATFORM_SUN
    Solaris
    SQLUDF_PLATFORM_WINDOWS
    Windows 95 and Windows 98
    SQLUDF_PLATFORM_UNKNOWN
    Unknown platform

    For additional platforms that are not contained in the above list, see the contents of the sqludf.h file.

  14. Number of table function column list entries (numtfcol)

    The number of non-zero entries in the table function column list specified in the table function column list field below.

  15. Reserved field (resd1)

    This field is for future use. It is defined as 24 characters long.

  16. Table function column list (tfcolumn)

    If this is a table function, this field is a pointer to an array of short integers which is dynamically allocated by DB2. If this is a scalar function, this pointer is null.

    This field is used only for table functions. Only the first n entries, where n is specified in the number of table function column list entries field, numtfcol, are of interest. n may be equal to 0, and is less than or equal to the number of result columns defined for the function in the RETURNS TABLE(...) clause of the CREATE FUNCTION statement. The values correspond to the ordinal numbers of the columns which this statement needs from the table function. A value of '1' means the first defined result column, '2' means the second defined result column, and so on, and the values may be in any order. Note that n could be equal to zero, that is, the variable numtfcol might be zero, for a statement similar to SELECT COUNT(*) FROM TABLE(TF(...)) AS QQ, where no actual column values are needed by the query.

    This array represents an opportunity for optimization. The UDF need not return all values for all the result columns of the table function, only those needed in the particular context, and these are the columns identified (by number) in the array. Since this optimization may complicate the UDF logic in order to gain the performance benefit, the UDF can choose to return every defined column.

  17. Unique application identifier (appl_id)

    This field is a pointer to a C null-terminated string which uniquely identifies the application's connection to DB2. It is regenerated at each database connect.

    The string has a maximum length of 32 characters, and its exact format depends on the type of connection established between the client and DB2. Generally it takes the form

            <x>.<y>.<ts>
    

    where the <x> and <y> vary by connection type, but the <ts> is a 12 character time stamp of the form YYMMDDHHMMSS, which is potentially adjusted by DB2 to ensure uniqueness.

          Example:  *LOCAL.db2inst.980707130144
    

  18. Reserved field (resd2)

    This field is for future use. It is defined as 20 characters long.

Summary of UDF Argument Use

The following is a summary of the arguments described above, and how you use them in the interface between DB2 and an external UDF.

For scalar functions, the arguments are:

A table function logically returns a table to the SQL statement that references it, but the physical interface between DB2 and the table function is row by row. For table functions, the arguments are:

Observe that the normal value outputs of the UDF, as well as the SQL-result, SQL-result-ind, and SQL-state, are returned to DB2 using arguments passed from DB2 to the UDF. Indeed, the UDF is written not to return anything in the functional sense (that is, the function's return type is void). See the void definition and the return statement in the following example:

 
     #include ... 
      void SQL_API_FN divid( 
           ... arguments ... ) 
     { 
           ... UDF body ... 
           return; 
     }

In the above example, SQL_API_FN is a macro that specifies the calling convention for a function that may vary across supported operating systems. This macro is required when you write stored procedures or UDFs.

For programming examples of UDFs, see Examples of UDF Code.

How the SQL Data Types are Passed to a UDF

This section identifies the valid types, for both UDF parameters and result, and specifies for each how the corresponding argument should be defined in your C or C++ language UDF. For type definitions in Java UDFs, see Supported SQL Data Types. Note that if you use the sqludf.h include file and the types defined there, you can automatically generate language variables and structures that are correct for the different data types and compilers. For example, for BIGINT you can use the SQLUDF_BIGINT data type to hide differences in the name of the 64 bit integer type between different compilers. This include file is discussed in The UDF Include File: sqludf.h.

It is the data type for each function parameter defined in the CREATE FUNCTION statement that governs the format for argument values. Promotions from the argument data type may be needed to get the value in that format. Such promotions are performed automatically by DB2 on the argument values; argument promotion is discussed in the SQL Reference.

For the function result, it is the data type specified in the CAST FROM clause of the CREATE FUNCTION statement that defines the format. If no CAST FROM clause is present, then the data type specified in the RETURNS clause defines the format.

In the following example, the presence of the CAST FROM clause means that the UDF body returns a SMALLINT and that DB2 casts the value to INTEGER before passing it along to the statement where the function reference occurs:

 
     ... RETURNS INTEGER CAST FROM SMALLINT ...

In this case the UDF must be written to generate a SMALLINT, as defined below. Note that the CAST FROM data type must be castable to the RETURNS data type, so one cannot just arbitrarily choose another data type. Casting between data types is discussed in the SQL Reference.

The following is a list of the SQL types and their C language representations. For a list of SQL type representations for Java, see Supported SQL Data Types. It includes information on whether each type is valid as a parameter or a result. Also included are examples of how the types could appear as an argument definition in your C or C++ language UDF:

The type udf_locator is defined in the header file sqludf.h, which is discussed in The UDF Include File: sqludf.h. The use of these locators is discussed in Using LOB Locators as UDF Parameters or Results.

The UDF Include File: sqludf.h

This include file contains structures, definitions and values which are useful when writing your UDF. Its use is optional, however, and in the sample UDFs shown in Examples of UDF Code, some examples use the include file. When compiling your UDF, you need to reference the directory which contains this file. This directory is sqllib/include.

The sqludf.h include file is self-describing. Following is a brief summary of its content:

  1. Structure definitions for the passed arguments which are structures:

  2. C language type definitions for all the SQL data types, for use in the definition of UDF arguments corresponding to SQL arguments and result having the data types. These are the definitions with names SQLUDF_x and SQLUDF_x_FBD where x is a SQL data type name, and FBD represents For Bit Data.

    Also included is a C language type for an argument or result which is defined with the AS LOCATOR appendage.

  3. Definition of C language types for the scratchpad and call-type arguments, with an enum type definition of the call-type argument.

  4. Macros for defining the standard trailing arguments, both with and without the inclusion of scratchpad and call-type arguments. This corresponds to the presence and absence of SCRATCHPAD and FINAL CALL keywords in the function definition. These are the SQL-state, function-name, specific-name, diagnostic-message, scratchpad and call-type UDF invocation arguments defined in The Arguments Passed from DB2 to a UDF. Also included are definitions for referencing these constructs, and the various valid SQLSTATE values.

  5. Macros for testing whether the SQL arguments are null.

  6. Function prototypes for the APIs which can be used to manipulate LOB values by means of LOB locators passed to the UDF.

Some of the UDF examples in the next section illustrate the inclusion and use of sqludf.h.


[ Top of Page | Previous Page | Next Page | Table of Contents | Index ]

[ DB2 List of Books | Search the DB2 Books ]