Application Development Guide

Interface between DB2 and a UDF

This section discusses some of the details of the interface between DB2 and a UDF, and discusses the sqludf.h include file which makes the interface manageable. This include file only applies to C and C++ UDFs. For information on coding UDFs in Java, see Coding a Java UDF.

The Arguments Passed from DB2 to a UDF

In addition to the SQL arguments which are specified in the DML reference to the function, DB2 passes additional arguments to the external UDF. For C and C++, all of these arguments are passed in the order shown in Passing Arguments to a UDF. Java UDFs take only the SQL-argument and SQL-result arguments, but can call extra methods to access the other information. Java UDFs have the same restrictions on the resulting SQL-state and diagnostic-message arguments documented below. For information on coding UDFs in Java, see Coding a Java UDF.

Syntax for Passing Arguments to a UDF
 
                              .--------------.
                              V              |
>>-+----------------------+-----SQL-result---+------------------>
   |  .----------------.  |
   |  V                |  |
   '----SQL-argument---+--'
 
                                     .------------------.
                                     V                  |
>-----+--------------------------+-----SQL-result-ind---+------->
      |  .--------------------.  |
      |  V                    |  |
      '----SQL-argument-ind---+--'
 
>----SQL-state---function-name---specific-name------------------>
 
>----diagnostic-message---+------------+---+-----------+-------->
                          '-scratchpad-'   '-call-type-'
 
>----+--------+------------------------------------------------><
     '-dbinfo-'

Note: Each of the above arguments passed to the external function is a pointer to the value, and not the actual value.

The arguments are described as follows:

SQL-argument

This argument is set by DB2 before calling the UDF. This value repeats n times, where n is the number of arguments specified in the function reference. The value of each of these arguments is taken from the expression specified in the function invocation. It is expressed in the data type of the corresponding parameter definition in the CREATE FUNCTION statement. How these data types map to C language constructs is described in How the SQL Data Types are Passed to a UDF.

SQL-result

This argument is set by the UDF before returning to DB2. For scalar functions there is exactly one SQL-result. For table functions there is one SQL-result for each result column of the function defined in the RETURNS TABLE clause of the CREATE FUNCTION statement. They correspond by position to the columns defined in the RETURNS TABLE clause. That is, the first SQL-result argument corresponds to the first column defined in the RETURNS TABLE clause, and so on.

For both scalar functions and table functions, DB2 allocates the buffer and passes its address to the UDF. The UDF puts each result value into the buffer. Enough buffer space is allocated by DB2 to contain the value expressed in the data type. For scalar functions, this data type is defined in the CAST FROM clause, if it is present, or in the RETURNS clause, if no CAST FROM clause is present. For table functions, the data types are defined in the RETURNS TABLE(...) clause. For information on how these types map to C language constructs, see How the SQL Data Types are Passed to a UDF.

Note that for table functions, DB2 defines a performance optimization where every defined column does not have to be returned to DB2. If you write your UDF to take advantage of this feature, it returns only the columns required by the statement referencing the table function.

For example, consider a CREATE FUNCTION statement for a table function defined with 100 result columns. If a given statement referencing the function is only interested in two of them, this optimization enables the UDF to return only those two columns for each row and not spend time on the other 98 columns. See the dbinfo argument below for more information on this optimization.

For each value returned, (that is, a single value for a scalar function, and in general, multiple values for a table function), the UDF code should not return more bytes than is required for the data type and length of the result. DB2 will attempt to determine if the UDF body has written beyond the end of the result buffer by a few bytes, returning SQLCODE -450 (SQLSTATE 39501). However, a major overwrite by the UDF that DB2 does not detect can cause unpredictable results or an abnormal termination.

SQL-argument-ind

This argument is set by DB2 before calling the UDF. It can be used by the UDF to determine if the corresponding SQL-argument is null or not. The nth SQL-argument-ind corresponds to the nth SQL-argument (described above). It contains one of the following values:

0: The argument is present and not null.
-1: The argument is present and its value is null.

If the function is defined with NOT NULL CALL, the UDF body does not need to check for a null value. However, if it is defined with NULL CALL, any argument can be NULL and the UDF should check it.

The indicator takes the form of a SMALLINT value, and this can be defined in your UDF as described in How the SQL Data Types are Passed to a UDF.

SQL-result-ind

This argument is set by the UDF before returning to DB2. There is one of these for each SQL-result argument.

This argument is used by the UDF to signal if the particular result value is null:

0 or positive: The result is not null
negative: The result is the null value. For more information, see Interpreting Negative SQL-result-ind Values.

Interpreting Negative SQL-result-ind Values:

DB2 treats the function result as null (-2) if the following is true:

The database configuration parameter DFT_SQLMATHWARN is 'YES'
One of the input arguments is a null because of an arithmetic error
The SQL-result-ind is negative.

This is also true if you define the function with the NOT NULL CALL option.

Even if the function is defined with NOT NULL CALL, the UDF body must set the indicator of the result. For example, a divide function could set the result to null when the denominator is zero.

The indicator takes the form of a SMALLINT value, and this can be defined in your UDF as described in How the SQL Data Types are Passed to a UDF.

If the UDF takes advantage of table function optimization using the RESULT column list, then only the indicators corresponding to the required columns need be set.

SQL-state

This argument is set by the UDF before returning to DB2. It takes the form of a CHAR(5) value. Ensure that the argument definition in the UDF is appropriate for a CHAR(5) as described in How the SQL Data Types are Passed to a UDF, and can be used by the UDF to signal warning or error conditions. It contains the value '00000', when the function is called. The UDF can set the value to the following:

00000: The function code did not detect any warning or error situations.
01Hxx: The function code detected a warning situation. This results in a SQL warning, SQLCODE +462 (SQLSTATE 01Hxx ). Here 'xx' is any string.
02000: Only valid for the FETCH call to table functions, it means that there are no more rows in the table.
38502: A special value for the case where the UDF body attempted to issue an SQL call and received an error, SQLCODE -487 (SQLSTATE 38502). because SQL is not allowed in UDFs), and chose to pass this same error back through to DB2.
Any other 38xxx: The function code detected an error situation. It results in a SQL error, SQLCODE -443 (SQLSTATE 38xxx). Here 'xxx' is any string. Do not use 380xx through 384xx because those values are reserved by the draft extensions to the SQL92 international standard, or 385xx because those values are reserved by IBM.

Any other value is treated as an error situation resulting in SQLCODE -463 (SQLSTATE 39001).

function-name

This argument is set by DB2 before calling the UDF. It is the qualified function name, passed from DB2 to the UDF code. This variable takes the form of a VARCHAR(27) value. Ensure that the argument definition in the UDF is appropriate for a VARCHAR(27). See How the SQL Data Types are Passed to a UDF for more information.

The form of the function name that is passed is:

     <schema-name>.<function-name>

The parts are separated by a period. Two examples are:

 
     PABLO.BLOOP       WILLIE.FINDSTRING

This form enables you to use the same UDF body for multiple external functions, and still differentiate between the functions when it is invoked.
Note: Although it is possible to include the period in object names and schema names, it is not recommended. For example, if a function, rotate is in a schema, obj.op, the function name that is returned is obj.op.rotate, and it is not obvious if the schema name is obj or obj.op.

specific-name

This argument is set by DB2 before calling the UDF. It is the specific name of the function passed from DB2 to the UDF code. This variable takes the form of a VARCHAR(18) value. Ensure that the argument definition in the UDF is appropriate for a VARCHAR(18). See How the SQL Data Types are Passed to a UDF for more information. Two examples are:

 
     willie_find_feb99       SQL9904281052440430

This first value is provided by the user in his CREATE FUNCTION statement. The second value is generated by DB2 from the current timestamp if the user does not specify a value.

As with the function-name argument, the reason for passing this value is to give the UDF the means of distinguishing exactly which specific function is invoking it.

diagnostic-message

This argument is set by the UDF before returning to DB2. The UDF can use this argument to insert a message text in a DB2 message. It takes the form of a VARCHAR(70) value. Ensure that the argument definition in the UDF is appropriate for a VARCHAR(70). See How the SQL Data Types are Passed to a UDF for more information.

When the UDF returns either an error or a warning, using the SQL-state argument described above, it can include descriptive information here. DB2 includes this information as a token in its message.

DB2 sets the first character to null before calling the UDF. Upon return, it treats the string as a C null-terminated string. This string will be included in the SQLCA as a token for the error condition. At least the first part of this string will appear in the SQLCA or DB2 CLP message. However, the actual number of characters which will appear depends on the lengths of the other tokens, because DB2 may truncate the tokens to conform to the restrictive limit on total token length imposed by the SQLCA. Avoid using X'FF' in the text since this character is used to delimit tokens in the SQLCA.

The UDF code should not return more text than will fit in the VARCHAR(70) buffer which is passed to it. DB2 will attempt to determine if the UDF body has written beyond the end of this buffer by a few characters, SQLCODE -450 (SQLSTATE 39501). However, an overwrite by the UDF can cause unpredictable results or an abend, as it may not be detected by DB2.

DB2 assumes that any message tokens returned from the UDF to DB2 are in the same code page as the database. Your UDF should ensure that If this is the case. If you use the 7-bit invariant ASCII subset, your UDF can return the message tokens in any code page.

scratchpad

This argument is set by DB2 before calling the UDF. It is only present if the CREATE FUNCTION statement for the UDF specified the SCRATCHPAD keyword. This argument is a structure, exactly like the structure used to pass a value of any of the LOB data types, with the following elements:

An INTEGER containing the length of the scratchpad (always 100)
The actual scratchpad (100 bytes in length), initialized to all binary 0's as follows. For scalar functions it is initialized before the first call, and not generally looked at or modified by DB2 thereafter.
For table functions, the scratchpad is initialized as above prior to the FIRST call to the UDF if FINAL CALL is specified on the CREATE FUNCTION. After this call, the scratchpad content is totally under control of the table function.
If NO FINAL CALL was specified or defaulted for a table function, then the scratchpad is initialized as above for each OPEN call, and the scratchpad content is completely under control of the table function between OPEN calls. (This can be very important for a table function used in a join or subquery. If it is necessary to maintain the content of the scratchpad across OPEN calls, then FINAL CALL must be specified in your CREATE FUNCTION statement. With FINAL CALL specified, in addition to the normal OPEN, FETCH and CLOSE calls, the table function will also receive FIRST and FINAL calls, for the purpose of scratchpad maintenance and resource release.)

The scratchpad can be mapped in your UDF using the same type as either a CLOB or a BLOB, since the argument passed has the same structure. See How the SQL Data Types are Passed to a UDF for more information.

Ensure your UDF code does not make changes outside of the scratchpad buffer. DB2 attempts to determine if the UDF body has written beyond the end of this buffer by a few characters, SQLCODE -450 (SQLSTATE 39501), but a major overwrite by the UDF can cause unpredictable results, or an abend, and may not result in a graceful failure by DB2.

If a scalar UDF which uses a scratchpad is referenced in a subquery, DB2 may decide to refresh the scratchpad between invocations of the subquery. This refresh occurs after a final-call is made, if FINAL CALL is specified for the UDF.

call-type

This argument, if present, is set by DB2 before calling the UDF. For scalar functions this argument is only present if FINAL CALL is specified in the CREATE FUNCTION statement, but for table functions it is ALWAYS present. It follows the scratchpad argument; or the diagnostic-message argument if the scratchpad argument is not present. This argument takes the form of an INTEGER value. Ensure that this argument definition in the UDF is appropriate for INTEGER. See How the SQL Data Types are Passed to a UDF for more information.

Note that even though all the current possible values are listed below, your UDF should contain a switch or case statement which explicitly tests for all the expected values, rather than containing "if A do AA, else if B do BB, else it must be C so do CC" type logic. This is because it is possible that additional call types may be added in the future, and if you don't explicitly test for condition C you will have trouble when new possibilities are added.

Notes:

For all the call-types, it may be appropriate for the UDF to set a SQL-state and diagnostic-message return value. This information will not be repeated in the following descriptions of each call-type. For all calls DB2 will take the indicated action as described previously for these arguments.
The include file sqludf.h is intended for use with UDFs and is described in The UDF Include File: sqludf.h. The file contains symbolic defines for the following call-type values, which are spelled out as constants.

For scalar functions call-type contains:

-1

This is the FIRST call to the UDF for this statement. The scratchpad (if any) is set to binary zeros when the UDF is called. All argument values are passed, and the UDF should do whatever one-time initialization actions are required. In addition, a FIRST call to a scalar UDF is like a NORMAL call, in that it is expected to develop and return an answer.

Note that if SCRATCHPAD is specified but FINAL CALL is not, then the UDF will not have this call-type argument to identify the very first call. Instead it will have to rely on the all-zero state of the scratchpad.

0

This is a NORMAL call. All the SQL input values are passed, and the UDF is expected to develop and return the result. The UDF may also return SQL-state and diagnostic-message information.

1

This is a FINAL call, that is no SQL-argument or SQL-argument-ind values are passed, and attempts to examine these values may cause unpredictable results. If a scratchpad is also passed, it is untouched from the previous call. The UDF is expected to release resources at this point.

Releasing resources.

A scalar UDF is expected to release resources it has required, for example, memory. If FINAL CALL is specified for the UDF, then that FINAL call is a natural place to release resources, provided that SCRATCHPAD is also specified and is used to track the resource. If FINAL CALL is not specified, then any resource acquired should be released on the same call.

For table functions call-type contains:

-2

This is the FIRST call, which only occurs if the FINAL CALL keyword was specified for the UDF. The scratchpad is set to binary zeros before this call. Argument values are passed to the table function, and it may choose to acquire memory or perform other one-time only resource initialization. Note that this is not an OPEN call, that call follows this one. On a FIRST call the table function should not return any data to DB2 as DB2 ignores the data.

-1

This is the OPEN call. The scratchpad will be initialized if NO FINAL CALL is specified, but not necessarily otherwise. All SQL argument values are passed to the table function on OPEN. The table function should not return any data to DB2 on the OPEN call.

0

This is a FETCH call, and DB2 expects the table function to return either a row comprising the set of return values, or an end-of-table condition indicated by SQLSTATE value '02000'. If a scratchpad is passed to the UDF, then on entry it is untouched from the previous call.

1

This is a CLOSE call to the table function. It balances the OPEN call, and can be used to perform any external CLOSE processing (for example, closing a source file), and resource release (particularly for the NO FINAL CALL case).

In cases involving a join or a subquery, the OPEN/FETCH.../CLOSE call sequences can repeat within the execution of a statement, but there is only one FIRST call and only one FINAL call. The FIRST and FINAL call only occur if FINAL CALL is specified for the table function.

2

This is a FINAL call, which only occurs if FINAL CALL was specified for the table function. It balances the FIRST call, and occurs only once per execution of the statement. It is intended for the purpose of releasing resources.

Releasing resources.

Write UDFs to release any resources that they acquire. For table functions, there are two natural places for this release: the CLOSE call and the FINAL call. The CLOSE call balances each OPEN call and can occur multiple times in the execution of a statement. The FINAL call only occurs if FINAL CALL is specified for the UDF, and occurs only once per statement.

If you can apply a resource across all OPEN/FETCH/CLOSE sequences of the UDF, write the UDF to acquire the resource on the FIRST call and free it on the FINAL call. The scratchpad is a natural place to track this resource. For table functions, if FINAL CALL is specified, the scratchpad is initialized only before the FIRST call. If FINAL CALL is not specified, then it is reinitialized before each OPEN call.

If a resource is specific to each OPEN/FETCH/CLOSE sequence, write the UDF to free the resource on the CLOSE call. (Note that when a table function is in a subquery or join, it is very possible that there will be multiple occurrences of the OPEN/FETCH/CLOSE sequence, depending on how the DB2 Optimizer chooses to organize the execution of the statement.)

dbinfo

This argument is set by DB2 before calling the UDF. It is only present if the CREATE FUNCTION statement for the UDF specifies the DBINFO keyword. The argument is the sqludf_dbinfo structure defined in the header file sqludf.h, which is discussed in The UDF Include File: sqludf.h. The variables in this structure that contain names and identifiers may be longer than the longest value possible in this release of DB2, but are defined this way for compatibility with future releases. You can use the length variable that complements each name and identifier variable to read or extract the portion of the variable that is actually used. The dbinfo structure contains the following elements:

Data base name length (dbnamelen)
The length of data base name below. This field is an unsigned short integer.
Data base name (dbname)
The name of the currently connected database. This field is a long identifier of 128 characters. The data base name length field described above identifies the actual length of this field. It does not contain a null terminator or any padding.
Application Authorization ID Length (authidlen)
The length of application authorization ID below. This field is an unsigned short integer.
Application authorization ID (authid)
The application run time authorization ID. This field is a long identifier of 128 characters. It does not contain a null terminator or any padding. The application authorization ID length field described above identifies the actual length of this field.
Database code page (codepg)
This is a union of two 48-byte long structures; one is used by DB2 Universal Database, the other is reserved for future use. The structure used by DB2 Universal Database contains the following fields:
1. SBCS. Single byte code page, an unsigned long integer.
2. DBCS. Double byte code page, an unsigned long integer.
3. COMP. Composite code page, an unsigned long integer.
Schema name length (tbschemalen)
The length of schema name below. Contains 0 (zero) if a table name is not passed. This field is an unsigned short integer.
Schema name (tbschema)
Schema for the table name below. This field is a long identifier of 128 characters. It does not contain a null terminator or any padding. The schema name length field described above identifies the actual length of this field.
Table name length (tbnamelen)
The length of the table name below. Contains 0 (zero) if a table name is not passed. This field is an unsigned short integer.
Table name (tbname)
This is the name of the table being updated or inserted. This field is set only if the UDF reference is the right-hand side of a SET clause in an UPDATE statement, or an item in the VALUES list of an INSERT statement. This field is a long identifier of 128 characters. It does not contain a null terminator or any padding. The table name length field described above, identifies the actual length of this field. The schema name field above, together with this field form the fully qualified table name.
Column name length (colnamelen)
Length of column name below. It contains a 0 (zero) if a column name is not passed. This field is an unsigned short integer.
Column name (colname)
Under the exact same conditions as for table name, this field contains the name of the column being updated or inserted; otherwise not predictable. This field is a long identifier of 128 characters. It does not contain a null terminator or any padding. The column name length field described above, identifies the actual length of this field.
Version/Release number (ver_rel)
An 8 character field that identifies the product and its version, release, and modification level with the format pppvvrrm where:
- ppp identifies the product as follows:
  
  DSN
  DB2 for MVS/ESA or OS/390
  ARI
  SQL/DS
  QSQ
  DB2 Universal Database for AS/400
  SQL
  DB2 Universal Database
- vv is a two digit version identifier.
- rr is a two digit release identifier.
- m is a one digit modification level identifier.
Platform (platform)
The operating platform for the application server, as follows:

SQLUDF_PLATFORM_AIX
AIX
SQLUDF_PLATFORM_HP
HP-UX
SQLUDF_PLATFORM_MVS
OS/390
SQLUDF_PLATFORM_NT
Windows NT
SQLUDF_PLATFORM_OS2
OS/2
SQLUDF_PLATFORM_SUN
Solaris
SQLUDF_PLATFORM_WINDOWS
Windows 95 and Windows 98
SQLUDF_PLATFORM_UNKNOWN
Unknown platform

For additional platforms that are not contained in the above list, see the contents of the sqludf.h file.
Number of table function column list entries (numtfcol)
The number of non-zero entries in the table function column list specified in the table function column list field below.
Reserved field (resd1)
This field is for future use. It is defined as 24 characters long.
Table function column list (tfcolumn)
If this is a table function, this field is a pointer to an array of short integers which is dynamically allocated by DB2. If this is a scalar function, this pointer is null.
This field is used only for table functions. Only the first n entries, where n is specified in the number of table function column list entries field, numtfcol, are of interest. n may be equal to 0, and is less than or equal to the number of result columns defined for the function in the RETURNS TABLE(...) clause of the CREATE FUNCTION statement. The values correspond to the ordinal numbers of the columns which this statement needs from the table function. A value of '1' means the first defined result column, '2' means the second defined result column, and so on, and the values may be in any order. Note that n could be equal to zero, that is, the variable numtfcol might be zero, for a statement similar to SELECT COUNT(*) FROM TABLE(TF(...)) AS QQ, where no actual column values are needed by the query.
This array represents an opportunity for optimization. The UDF need not return all values for all the result columns of the table function, only those needed in the particular context, and these are the columns identified (by number) in the array. Since this optimization may complicate the UDF logic in order to gain the performance benefit, the UDF can choose to return every defined column.
Unique application identifier (appl_id)

This field is a pointer to a C null-terminated string which uniquely identifies the application's connection to DB2. It is regenerated at each database connect.
The string has a maximum length of 32 characters, and its exact format depends on the type of connection established between the client and DB2. Generally it takes the form
```
        <x>.<y>.<ts>
```
where the <x> and <y> vary by connection type, but the <ts> is a 12 character time stamp of the form YYMMDDHHMMSS, which is potentially adjusted by DB2 to ensure uniqueness.
```
      Example:  *LOCAL.db2inst.980707130144
```
Reserved field (resd2)
This field is for future use. It is defined as 20 characters long.

Summary of UDF Argument Use

The following is a summary of the arguments described above, and how you use them in the interface between DB2 and an external UDF.

For scalar functions, the arguments are:

SQL-argument.
This argument passes the values identified in the function reference from DB2 to the UDF. There is one of these arguments for each SQL argument.
SQL-result.
This argument passes the result value generated by the UDF back to DB2 and to the SQL statement where the function reference occurred.
SQL-argument-ind.
This argument corresponds positionally to SQL-argument, and tells the UDF whether or not a particular argument is null. There is one of these for each SQL-argument.
SQL-result-ind.
This argument is used by the UDF to report back to DB2 whether the function result in SQL-result contains nulls.
SQL-state and diagnostic-message.
These arguments are used by the UDF to signal exception information back to DB2.
function-name and specific-name.
These arguments are used by DB2 to pass the identity of the referenced function to the UDF.
scratchpad and call-type.
These arguments are used by DB2 to manage the saving of UDF state between calls. The scratchpad is created and initialized by DB2 and thereafter managed by the UDF. DB2 signals the type of call to the UDF using the call-type argument.
dbinfo.
A structure passed by DB2 to the UDF containing additional information.

A table function logically returns a table to the SQL statement that references it, but the physical interface between DB2 and the table function is row by row. For table functions, the arguments are:

SQL-argument.
This argument passes the values identified in the function reference from DB2 to the UDF. The argument has the same value for FETCH calls as it did for the OPEN and FIRST calls. There is one of these for each SQL argument.
SQL-result.
This argument is used to pass back the individual column values for the row being returned by the UDF. There is one of these arguments for each result column value defined in the RETURNS TABLE (...) clause of the CREATE FUNCTION statement.
SQL-argument-ind.
This argument corresponds positionally to SQL-argument values, and tells the UDF whether the particular argument is null. There is one of these for each SQL argument.
SQL-result-ind.
This argument is used by the UDF to report back to DB2 whether the individual column values returned in the table function output row is null. It corresponds positionally to the SQL-result argument.
SQL-state and diagnostic-message.
These arguments are used by the UDF to signal exception information and the end-of-table condition back to DB2.
function-name and specific-name.
These arguments are used by DB2 to pass the identity of the referenced function to the UDF.
scratchpad and call-type.
These arguments are used by DB2 to manage the saving of UDF state between calls. The scratchpad is created and initialized by DB2 and thereafter managed by the UDF. DB2 signals the type of call to the UDF using the call-type argument. For table functions these call types are OPEN, FETCH, CLOSE, and optionally FIRST and FINAL.
dbinfo.
This is a structure passed by DB2 to the UDF containing additional information.

Observe that the normal value outputs of the UDF, as well as the SQL-result, SQL-result-ind, and SQL-state, are returned to DB2 using arguments passed from DB2 to the UDF. Indeed, the UDF is written not to return anything in the functional sense (that is, the function's return type is void). See the void definition and the return statement in the following example:

 
     #include ... 
      void SQL_API_FN divid( 
           ... arguments ... ) 
     { 
           ... UDF body ... 
           return; 
     }

In the above example, SQL_API_FN is a macro that specifies the calling convention for a function that may vary across supported operating systems. This macro is required when you write stored procedures or UDFs.

For programming examples of UDFs, see Examples of UDF Code.

How the SQL Data Types are Passed to a UDF

This section identifies the valid types, for both UDF parameters and result, and specifies for each how the corresponding argument should be defined in your C or C++ language UDF. For type definitions in Java UDFs, see Supported SQL Data Types. Note that if you use the sqludf.h include file and the types defined there, you can automatically generate language variables and structures that are correct for the different data types and compilers. For example, for BIGINT you can use the SQLUDF_BIGINT data type to hide differences in the name of the 64 bit integer type between different compilers. This include file is discussed in The UDF Include File: sqludf.h.

It is the data type for each function parameter defined in the CREATE FUNCTION statement that governs the format for argument values. Promotions from the argument data type may be needed to get the value in that format. Such promotions are performed automatically by DB2 on the argument values; argument promotion is discussed in the SQL Reference.

For the function result, it is the data type specified in the CAST FROM clause of the CREATE FUNCTION statement that defines the format. If no CAST FROM clause is present, then the data type specified in the RETURNS clause defines the format.

In the following example, the presence of the CAST FROM clause means that the UDF body returns a SMALLINT and that DB2 casts the value to INTEGER before passing it along to the statement where the function reference occurs:

 
     ... RETURNS INTEGER CAST FROM SMALLINT ...

In this case the UDF must be written to generate a SMALLINT, as defined below. Note that the CAST FROM data type must be castable to the RETURNS data type, so one cannot just arbitrarily choose another data type. Casting between data types is discussed in the SQL Reference.

The following is a list of the SQL types and their C language representations. For a list of SQL type representations for Java, see Supported SQL Data Types. It includes information on whether each type is valid as a parameter or a result. Also included are examples of how the types could appear as an argument definition in your C or C++ language UDF:

SMALLINT
Valid. Represent in C as short.
When defining integer UDF parameters, consider using INTEGER rather than SMALLINT as DB2 does not promote SMALLINT arguments to INTEGER. For example, suppose you define a UDF as follows:
```
     CREATE FUNCTION SIMPLE(SMALLINT)... 
```
Example:
```
 
     short    *arg1;          /* example for SMALLINT */ 
     short    *arg1_null_ind; /* example for any null indicator */ 
```
If you invoke the SIMPLE function using INTEGER data, (... SIMPLE(1)...), you will receive an SQLCODE -440 (SQLSTATE 42884) error indicating that the function was not found, and end-users of this function may not perceive the reason for the message. In the above example, 1 is an INTEGER, so you can either cast it to SMALLINT or define the parameter as INTEGER.

INTEGER or INT

Valid. Represent in C as long.

Example:

 
     long     *arg2;          /* example for INTEGER */

BIGINT
Valid. Represent in C as sqlint64.
Example:
```
 
     sqlint64 *arg3;          /* example for INTEGER */
```
DB2 defines the sqlint64 C language type to overcome differences between definitions of the 64 bit signed integer in compilers and operating systems. You must #include sqludf.h to pick up the definition.
DECIMAL(p,s) or NUMERIC(p,s)
Not valid, because there is no C language representation. If you want to pass a decimal value, you must define the parameter to be of a data type castable from DECIMAL (for example CHAR or DOUBLE), and explicitly cast the argument to this type. In the case of DOUBLE, you do not need to explicitly cast a decimal argument to a DOUBLE parameter as DB2 promotes it automatically .
Suppose you have two columns, WAGE as DECIMAL(5,2) and HOURS as DECIMAL(4,1), and you wish to write a UDF to calculate weekly pay based on wage, number of hours worked and some other factors. The UDF could be as follows:
```
     CREATE FUNCTION WEEKLY_PAY (DOUBLE, DOUBLE, ...)
            RETURNS DECIMAL(7,2) CAST FROM DOUBLE
            ...;
```
For the above UDF, the first two parameters correspond to the wage and number of hours. You invoke the UDF WEEKLY_PAY in your SQL select statement as follows:
```
     SELECT WEEKLY_PAY (WAGE, HOURS, ...) ...;
```
Note that no explicit casting is required because the DECIMAL arguments are castable to DOUBLE.
Alternatively, you could define WEEKLY_PAY with CHAR arguments as follows:
```
     CREATE FUNCTION WEEKLY_PAY (VARCHAR(6), VARCHAR(5), ...)
            RETURNS DECIMAL (7,2) CAST FROM VARCHAR(10)
            ...;
```
You would invoke it as follows:
```
     SELECT WEEKLY_PAY (CHAR(WAGE), CHAR(HOURS), ...) ...;
```
Observe the explicit casting that is required because DECIMAL arguments are not promotable to VARCHAR.

An advantage of using floating point parameters is that it is easier to perform arithmetic on the values in the UDF; an advantage of using character parameters is that it is always possible to exactly represent the decimal value. This is not always possible with floating point.

REAL

Valid. Represent in C as float.

Example:

 
     float *result;          /* example for REAL */

DOUBLE or DOUBLE PRECISION or FLOAT
Valid. Represent in C as double.
Example:
```
 
     double  *result;        /* example for DOUBLE   */
```
CHAR(n) or CHARACTER(n) with or without the FOR BIT DATA modifier.
Valid. Represent in C as char...[n+1] (this is a C null-terminated string, the last character is a null, that is X'00').
Example:
```
 
     char     arg1[14];      /* example for CHAR(13)   */ 
     char    *arg1;          /* also perfectly acceptable */
```
For a CHAR(n) parameter, DB2 always moves n bytes of data to the buffer and sets the n+1 byte to null. For a RETURNS CHAR(n) value, DB2 always takes the n bytes and ignores the n+1 byte. For this RETURNS CHAR(n) case, you are warned against the inadvertent inclusion of a null-character in the first n characters. DB2 will not recognize this as anything but a normal part of the data, and it might later on cause seemingly anomalous results if it was not intended.

If FOR BIT DATA is specified, exercise caution about using the normal C string handling functions in the UDF. Many of these functions look for a null to delimit the string, and the null-character (X'00') could be a legitimate character in the middle of the data value.
When defining character UDF parameters, consider using VARCHAR rather than CHAR as DB2 does not promote VARCHAR arguments to CHAR. For example, suppose you define a UDF as follows:
```
     CREATE FUNCTION SIMPLE(INT,CHAR(1))...
```
If you invoke the SIMPLE function using VARCHAR data, (... SIMPLE(1,'A')...), you will receive an SQLCODE -440 (SQLSTATE 42884) error indicating that the function was not found, and end-users of this function may not perceive the reason for the message. In the above example, 'A' is VARCHAR, so you can either cast it to CHAR or define the parameter as VARCHAR.
VARCHAR(n) FOR BIT DATA or LONG VARCHAR with or without the FOR BIT DATA modifier.
Valid. Represent in C as a structure similar to:
```
 
     struct sqludf_vc_fbd 
     { 
        unsigned short length;         /* length of data */ 
        char           data[1];        /* first char of data */ 
     };
```
The [1] is merely to indicate an array to the compiler. It does not mean that only one character is passed; because the address of the structure is passed, and not the actual structure, it just provides a way to use array logic.
These values are not represented as C null-terminated strings because the null-character could legitimately be part of the data value. The length is explicitly passed to the UDF for parameters using the structure variable length. For the RETURNS clause, the length that is passed to the UDF is the length of the buffer. What the UDF body must pass back, using the structure variable length, is the actual length of the data value.
Example:
```
     struct sqludf_vc_fbd *arg1;  /* example for VARCHAR(n) FOR BIT DATA */ 
     struct sqludf_vc_fbd *result; /* also for LONG VARCHAR FOR BIT DATA */
```
VARCHAR(n) without FOR BIT DATA.
Valid. Represent in C as char...[n+1]. (This is a C null-terminated string.)
For a VARCHAR(n) parameter, DB2 will put a null in the (k+1) position, where k is the length of the particular occurrence. The C string-handling functions are thus well suited for manipulation of these values. For a RETURNS VARCHAR(n) value, the UDF body must delimit the actual value with a null, because DB2 will determine the result length from this null character.
Example:
```
     char     arg2[51];      /* example for VARCHAR(50)   */ 
     char    *result;        /* also perfectly acceptable */ 
```
GRAPHIC(n)
Valid. Represent in C as sqldbchar[n+1]. (This is a null-terminated graphic string). Note that you can use wchar_t[n+1] on platforms where wchar_t is defined to be 2 bytes in length; however, sqldbchar is recommended. See Selecting the wchar_t or sqldbchar Data Type for more information on these two data types.
For a GRAPHIC(n) parameter, DB2 moves n double-byte characters to the buffer and sets the following two bytes to null. Data passed from DB2 to a UDF is in DBCS format, and the result passed back is expected to be in DBCS format. This behavior is the same as using the WCHARTYPE NOCONVERT precompiler option described in The WCHARTYPE Precompiler Option. For a RETURNS GRAPHIC(n) value, DB2 always takes the n double-byte characters and ignores the following bytes.
When defining graphic UDF parameters, consider using VARGRAPHIC rather than GRAPHIC as DB2 does not promote VARGRAPHIC arguments to GRAPHIC. For example, suppose you define a UDF as follows:
```
     CREATE FUNCTION SIMPLE(GRAPHIC)... 
```
If you invoke the SIMPLE function using VARGRAPHIC data, (... SIMPLE('graphic_literal')...), you will receive an SQLCODE -440 (SQLSTATE 42884) error indicating that the function was not found, and end-users of this function may not understand the reason for this message. In the above example, graphic_literal is a literal DBCS string that is interpreted as VARGRAPHIC data, so you can either cast it to GRAPHIC or define the parameter as VARGRAPHIC.
Example:
```
     sqldbchar   arg1[14];      /* example for GRAPHIC(13)   */ 
     sqldbchar  *arg1;          /* also perfectly acceptable */ 
```
VARGRAPHIC(n)
Valid. Represent in C as sqldbchar[n+1]. (This is a null-terminated graphic string). Note that you can use wchar_t[n+1] on platforms where wchar_t is defined to be 2 bytes in length; however, sqldbchar is recommended. See Selecting the wchar_t or sqldbchar Data Type for more information on these two data types.
For a VARGRAPHIC(n) parameter, DB2 will put a graphic null in the (k+1) position, where k is the length of the particular occurrence. A graphic null refers to the situation where all the bytes of the last character of the graphic string contain binary zeros ('\0's). Data passed from DB2 to a UDF is in DBCS format, and the result passed back is expected to be in DBCS format. This behavior is the same as using the WCHARTYPE NOCONVERT precompiler option described in The WCHARTYPE Precompiler Option. For a RETURNS VARGRAPHIC(n) value, the UDF body must delimit the actual value with a graphic null, because DB2 will determine the result length from this graphic null character.
Example:
```
     sqldbchar   args[51],    /* example for VARGRAPHIC(50) */ 
     sqldbchar   *result,     /* also perfectly acceptable  */
```
LONG VARGRAPHIC
Valid. Represent in C as a structure:
```
     struct sqludf_vg 
     { 
        unsigned short length;         /* length of data */ 
        sqldbchar      data[1];        /* first char of data */ 
     };
```
Note that in the above structure, you can use wchar_t in place of sqldbchar on platforms where wchar_t is defined to be 2 bytes in length, however, the use of sqldbchar is recommended. See Selecting the wchar_t or sqldbchar Data Type for more information on these two data types.
The [1] merely indicates an array to the compiler. It does not mean that only one graphic character is passed. Because the address of the structure is passed, and not the actual structure, it just provides a way to use array logic.
These are not represented as null-terminated graphic strings. The length, in double-byte characters, is explicitly passed to the UDF for parameters using the structure variable length. Data passed from DB2 to a UDF is in DBCS format, and the result passed back is expected to be in DBCS format. This behavior is the same as using the WCHARTYPE NOCONVERT precompiler option described in The WCHARTYPE Precompiler Option. For the RETURNS clause, the length that is passed to the UDF is the length of the buffer. What the UDF body must pass back, using the structure variable length, is the actual length of the data value, in double byte characters.
Example:
```
     struct sqludf_vg *arg1;  /* example for VARGRAPHIC(n)   */ 
     struct sqludf_vg *result; /* also for LONG VARGRAPHIC   */
```
DATE
Valid. Represent in C same as CHAR(10), that is as char...[11]. The date value is always passed to the UDF in ISO format: yyyy-mm-dd.

Example:
```
     char     arg1[11];      /* example for DATE       */ 
     char    *result;        /* also perfectly acceptable */ 
```
TIME
Valid. Represent in C same as CHAR(8), that is, as char...[9]. The time value is always passed to the UDF in ISO format: hh.mm.ss.
Example:
```
     char    *arg;           /* example for DATE          */ 
     char     result[9];     /* also perfectly acceptable */
```
TIMESTAMP
Valid. Represent in C same as CHAR(26), that is. as char...[27]. The timestamp value is always passed with format: yyyy-mm-dd-hh.mm.ss.nnnnnn.
Example:
```
     char     arg1[27];      /* example for TIMESTAMP  */ 
     char    *result;        /* also perfectly acceptable */ 
```
BLOB(n) and CLOB(n)
Valid. Represent in C as a structure:
```
     struct sqludf_lob 
     { 
         unsigned long length;      /* length in bytes */ 
         char          data[1];      /* first byte of lob */ 
     }; 
```
The [1] merely indicates an array to the compiler. It does not mean that only one character is passed; because the address of the structure is passed, and not the actual structure, it just provides a way to use array logic.
These are not represented as C null-terminated strings. The length is explicitly passed to the UDF for parameters using the structure variable length. For the RETURNS clause, the length that is passed back to the UDF, is the length of the buffer. What the UDF body must pass back, using the structure variable length, is the actual length of the data value.
Example:
```
     struct sqludf_lob *arg1;  /* example for BLOB(n), CLOB(n) */ 
     struct sqludf_lob *result;
```
DBCLOB(n)
Valid. Represent in C as a structure:
```
     struct sqludf_lob 
     { 
         unsigned long length;      /* length in graphic characters */ 
         sqldbchar data[1];         /* first byte of lob */ 
     }; 
```
Note that in the above structure, you can use wchar_t in place of sqldbchar on platforms where wchar_t is defined to be 2 bytes in length, however, the use of sqldbchar is recommended. See Selecting the wchar_t or sqldbchar Data Type for more information on these two data types.
The [1] merely indicates an array to the compiler. It does not mean that only one graphic character is passed; because the address of the structure is passed, and not the actual structure, it just provides a way to use array logic.
These are not represented as null-terminated graphic strings. The length is explicitly passed to the UDF for parameters using the structure variable length. Data passed from DB2 to a UDF is in DBCS format, and the result passed back is expected to be in DBCS format. This behavior is the same as using the WCHARTYPE NOCONVERT precompiler option described in The WCHARTYPE Precompiler Option. For the RETURNS clause, the length that is passed to the UDF is the length of the buffer. What the UDF body must pass back, using the structure variable length, is the actual length of the data value, with all of these lengths expressed in double byte characters.

Example:
```
 
     struct sqludf_lob *arg1;  /* example for DBCLOB(n) */ 
     struct sqludf_lob *result; 
```

Distinct Types

Valid or invalid depending on the base type. Distinct types will be passed to the UDF in the format of the base type of the UDT, so may be specified if and only if the base type is valid.

Example:

 
     struct sqludf_lob *arg1;  /* for distinct type based on BLOB(n) */ 
     double            *arg2;  /* for distinct type based on DOUBLE  */ 
     char              res[5]; /* for distinct type based on CHAR(4) */

Distinct Types AS LOCATOR, or any LOB type AS LOCATOR
The AS LOCATOR type modifier is valid only in UDF parameter and result definitions. It may only be used to modify the LOB types or any distinct type that is based on a LOB type. If you specify the type modifier, a four byte locator is passed to the UDF rather the entire LOB value.
Example:
```
 
     sqludf_locator       *arg1;  /* locator argument */ 
     sqludf_locator       *result; /* locator result */ 
```

The type udf_locator is defined in the header file sqludf.h, which is discussed in The UDF Include File: sqludf.h. The use of these locators is discussed in Using LOB Locators as UDF Parameters or Results.

The UDF Include File: sqludf.h

This include file contains structures, definitions and values which are useful when writing your UDF. Its use is optional, however, and in the sample UDFs shown in Examples of UDF Code, some examples use the include file. When compiling your UDF, you need to reference the directory which contains this file. This directory is sqllib/include.

The sqludf.h include file is self-describing. Following is a brief summary of its content:

Structure definitions for the passed arguments which are structures:
- VARCHAR FOR BIT DATA arguments and result
- LONG VARCHAR (with or without FOR BIT DATA) arguments and result
- LONG VARGRAPHIC arguments and result
- All the LOB types, SQL arguments and result
- The scratchpad
- The dbinfo structure.
C language type definitions for all the SQL data types, for use in the definition of UDF arguments corresponding to SQL arguments and result having the data types. These are the definitions with names SQLUDF_x and SQLUDF_x_FBD where x is a SQL data type name, and FBD represents For Bit Data.
Also included is a C language type for an argument or result which is defined with the AS LOCATOR appendage.
Definition of C language types for the scratchpad and call-type arguments, with an enum type definition of the call-type argument.
Macros for defining the standard trailing arguments, both with and without the inclusion of scratchpad and call-type arguments. This corresponds to the presence and absence of SCRATCHPAD and FINAL CALL keywords in the function definition. These are the SQL-state, function-name, specific-name, diagnostic-message, scratchpad and call-type UDF invocation arguments defined in The Arguments Passed from DB2 to a UDF. Also included are definitions for referencing these constructs, and the various valid SQLSTATE values.
Macros for testing whether the SQL arguments are null.
Function prototypes for the APIs which can be used to manipulate LOB values by means of LOB locators passed to the UDF.

Some of the UDF examples in the next section illustrate the inclusion and use of sqludf.h.

[ Top of Page | Previous Page | Next Page | Table of Contents | Index ]

[ DB2 List of Books | Search the DB2 Books ]