This section documents additional considerations for
implementing a UDF, items to keep in mind, and items to avoid.
The following are recommendations to consider to successfully implement
your UDF:
- UDF bodies need to be protected. The executable function
bodies are not captured or protected in any way by DB2. The CREATE
FUNCTION statement merely points to the body. To preserve the integrity
of the function and the database applications which depend on the function,
you must, by managing access to the directory containing the function and by
protecting the body itself, prevent the function body from being inadvertently
or intentionally deleted or replaced.
- DB2 passes pointers to all of the buffers in the interface between DB2 and
SQL (that is, all the SQL arguments and the function return value). Be
sure you define your UDF arguments as pointers.
- All SQL argument values are buffered. This means that a copy of the
value is made and presented to the UDF. If a UDF changes its input
parameters, the changes have no effect on SQL values or processing, but may
cause DB2 to malfunction.
- For OLE automation, do not change the input parameters, otherwise memory
resources may not be freed and you may encounter a memory leak.
In case of a major OLE library version mismatch or a failure in
initializing the OLE library, the database manager returns SQLCODE -465
(SQLSTATE 58032) with reason code 34, (Failure to initialize OLE
library).
- Re-entrancy is strongly recommended for UDFs on all operating
platforms, so that one copy of it can be used for multiple concurrent
statements and applications.
Note that the SCRATCHPAD facility can be used to circumvent many of the
limitations imposed by re-entrancy.
- If the body of a function currently being used is modified (for example,
recompiled and relinked), DB2 will not change functions in
mid-transaction. However, the copy used in a subsequent transaction may
be different if this kind of dynamic modification is taking place. Your
operating system may also prevent you from changing a UDF body that is in
use. This practice is not recommended.
- If you allocate dynamic memory in the UDF, it should be freed before
returning to DB2. This is especially important for the NOT FENCED
case. The SCRATCHPAD facility can be used, however, to anchor dynamic
memory needed by the UDF across invocations. If you use the scratchpad
in this manner, specify the FINAL CALL attribute on the CREATE FUNCTION for
the UDF so that it can free the allocated memory at end-of-statement
processing. The reason for this is that the system could run out of
memory over time, with repeated use of the UDF.
This reasoning holds as well for other system resources used by the
UDF.
- Use the NOT NULL CALL option if it makes sense to do so. With this
CREATE FUNCTION option, you do not have to check whether each SQL argument is
null, and it performs better when you do have NULL values.
- Use the NOT DETERMINISTIC option if the result from your UDF depends on
anything other than the input SQL arguments. This option prevents the
SQL compiler from performing certain optimizations which can cause
inconsistent results.
- Use the EXTERNAL ACTION option if your UDF has any side effects which need
to be reliably performed. EXTERNAL ACTION prevents the SQL compiler
from performing certain optimizations which can prevent invocation of your UDF
in certain circumstances.
- Regarding the choice between FENCED and UNFENCED:
- FENCED UDF
- A FENCED UDF runs in its own process and thus cannot access most DB2
internal control and data areas, whether inadvertently or deliberately.
This makes a FENCED UDF a safer choice for the database. However, it is
still possible for a FENCED UDF that contains programming errors to bring down
DB2, though not as easy as for a NOT FENCED UDF. A UDF that performs a
massive overwrite of a return variable, for example, can cause DB2 to
abend.
- UNFENCED UDF
- A NOT FENCED UDF performs better than a FENCED UDF, because a NOT FENCED
UDF is loaded and executed directly in the DB2 engine process. NOT
FENCED UDFs avoid the performance expense of process communication
overhead. However, a NOT FENCED UDF could conceivably access or alter
DB2 internal control or data areas. It is easier for an improperly
written NOT FENCED UDF to bring down DB2 than a FENCED UDF.
Obviously, with both FENCED and NOT FENCED UDFs, you should:
- ensure the UDF is robustly written
- subject the UDF to a rigorous design and code review
- test the UDF in an environment where no harm can be done if it is not
correctly written; for example, a test database.
Most abends caused by a UDF are caught by DB2, which returns a -430
SQLCODE and prevents the database from being corrupted. However,
certain types of UDF misbehavior, including a massive overwrite of a return
value buffer, can cause DB2 to fail as well as the UDF. Pay attention
particularly to any UDF which returns variable-length data, or which
calculates how many bytes it must move to the return value buffer.
- For considerations on using UDFs with EUC code sets, see Considerations for UDFs.
- For an application running NOT FENCED UDFs, the first time such a UDF is
invoked, a block of memory of the size indicated by the UDF_MEM_SZ
configuration parameter is created. Thereafter, on a statement by
statement basis, memory for interfacing between DB2 and NOT FENCED UDFs is
allocated and deallocated from this block of memory as needed.
For FENCED UDFs, a different block of memory is used in the same
way. It is different because the memory is shared between
processes. In fact, if an application uses both NOT FENCED and FENCED
UDFs, two separate blocks of memory, each of the size indicated by the
UDF_MEM_SZ parameter are used. Refer to the Administration Guide for more information about this configuration parameter.
- Use the DISALLOW PARALLELISM option in the following situations:
- On scalar UDFs, if your UDF absolutely depends on running the same
copy. Generally, this will be the case for NOT DETERMINISTIC SCRATCHPAD
UDFs. (For an example, see the counter UDF specified in Scratchpad Considerations.)
- If you do not want the UDF to run on multiple partitions at once for a
single reference.
- If you are specifying a table function.
Otherwise, ALLOW PARALLELISM (the default) should be specified.