In writing DB2 applications, you have a choice when implementing desired actions or operations:
Although it may seem easier to implement new operations as subroutines in your application, there are good reasons why you should consider using UDFs and methods:
If the new operation is something of which other users or programs at your site can take advantage, then UDFs and methods can help to reuse it. In addition, the operation can be invoked directly in SQL wherever an expression can be used by any user of the database. For UDFs, the database will take care of many data type promotions of the function arguments automatically, for example DECIMAL to DOUBLE, allowing your operation to be applied to different, but compatible data types.
It may seem easier to implement your new operation as a subroutine and then make it available to others for use in their programs, thereby avoiding the need to define the function to DB2. This requires that you inform all other interested application developers, and package the subroutine effectively for their use. However, it ignores the interactive users like those who normally use the Command Line Processor (CLP) to access the database. CLP users cannot use your function unless it is a UDF or method in the database. This also applies to any other tools that use SQL (such as Lotus Approach), that can not be recompiled.
Invoking the UDF or method directly from the database engine instead of from your application can have a considerable performance advantage, particularly when the operation qualifies data for further processing. Consider a simple scenario where you want to process some data, provided you can meet some selection criteria which can be expressed as a function SELECTION_CRITERIA(). Your application could issue the following select statement:
SELECT A,B,C FROM T
When it receives each row, it runs SELECTION_CRITERIA against the data to decide if it is interested in processing the data further. Here, every row of table T must be passed back to the application. But if SELECTION_CRITERIA() is implemented as a UDF, your application can issue the following statement:
SELECT A,B,C FROM T WHERE SELECTION_CRITERIA(A,B) = 1
In this case, only the rows of interest are passed across the interface between the application and the database. For large tables, or for cases where SELECTION_CRITERIA supplies significant filtering, the performance improvement can be very significant.
Another case where a UDF can offer a performance benefit is when dealing with Large Objects (LOB). If you have a function whose purpose is to extract some information from a value of one of the LOB types, you can perform this extraction right on the database server and pass only the extracted value back to the application. This is more efficient than passing the entire LOB value back to the application and then performing the extraction. The performance value of packaging this function as a UDF could be enormous, depending on the particular situation. (Note that you can also extract a portion of a LOB by using a LOB locator. See Example: Deferring the Evaluation of a LOB Expression for an example of a similar scenario.)
In addition, you can use the RETURNS TABLE clause of the CREATE FUNCTION statement to define UDFs called table functions. Table functions enable you to very efficiently use relational operations and the power of SQL on data that resides outside a DB2 database (including non-relational data stores). A table function takes individual scalar values of different types and meanings as its arguments, and returns a table to the SQL statement that invokes it. You can write table functions that generate only the data in which you are interested, eliminating any unwanted rows or columns. For more information on table functions, including rules on where you can use them, refer to the SQL Reference.
You cannot create a method that returns a table.
You can implement the behavior of a user-defined distinct type (UDT), also called distinct type, using a UDF. For more information on UDTs, see User-defined Distinct Types. For additional details on UDTs and the important concept of castability discussed therein, refer to the SQL Reference. When you create a distinct type, you are automatically provided cast functions between the distinct type and its source type, and you may be provided comparison operators such as =, >, <, and so on, depending on the source type. You have to provide any additional behavior yourself. Because it is clearly desirable to keep the behavior of a distinct type in the database where all of the users of the distinct type can easily access it, UDFs can be used as the implementation mechanism.
For example, suppose you have a BOAT distinct type, defined over a one megabyte BLOB. The BLOB contains the various nautical specifications, and some drawings. You may wish to compare sizes of boats, and with a distinct type defined over a BLOB source type, you do not get the comparison operations automatically generated for you. You can implement a BOAT_COMPARE function which decides if one boat is bigger than another based on a measure that you choose. These could be: displacement, length over all, metric tonnage, or another calculation based on the BOAT object. You create the BOAT_COMPARE function as follows:
CREATE FUNCTION BOAT_COMPARE (BOAT, BOAT) RETURNS INTEGER ...
If your function returns 1 if the first BOAT is bigger, 2 if the second is bigger, and 0 if they are equal, you could use this function in your SQL code to compare boats. Suppose you create the following tables:
CREATE TABLE BOATS_INVENTORY ( BOAT_ID CHAR(5), BOAT_TYPE VARCHAR(25), DESIGNER VARCHAR(40), OWNER VARCHAR(40), DESIGN_DATE DATE, SPEC BOAT, ... ) CREATE TABLE MY_BOATS ( BOAT_ID CHAR(5), BOAT_TYPE VARCHAR(25), DESIGNER VARCHAR(40), DESIGN_DATE DATE, ACQUIRE_DATE DATE, ACQUIRE_PRICE CANADIAN_DOLLAR, CURR_APPRAISL CANADIAN_DOLLAR, SPEC BOAT, ... )
You can execute the following SQL SELECT statement:
SELECT INV.BOAT_ID, INV.BOAT_TYPE, INV.DESIGNER, INV.OWNER, INV.DESIGN_DATE FROM BOATS_INVENTORY INV, MY_BOATS MY WHERE MY.BOAT_ID = '19GCC' AND BOAT_COMPARE(INV.SPEC, MY.SPEC) = 1
This simple example returns all the boats from BOATS_INVENTORY that are bigger than a particular boat in MY_BOATS. Note that the example only passes the rows of interest back to the application because the comparison occurs in the database server. In fact, it completely avoids passing any values of data type BOAT. This is a significant improvement in storage and performance as BOAT is based on a one megabyte BLOB data type.