DB2 Server for VSE & VM: Application Programming

Joining Tables

With joins, you can write a query against the combined data of two or more tables. (You can also join views.)

To join tables, follow these steps:

In the FROM clause, list all the tables you want to join.

In the WHERE clause, specify a join condition to express a relationship between the tables to be joined.

Note: The data types of the columns involved in the join condition do not have to be identical; however, they must be compatible. The join condition is evaluated the same way as any other search condition, and the same rules for comparisons apply. (These rules are discussed under Using Expressions.)

Joining Tables Using the Database Manager

The system forms all combinations of rows from the indicated tables. For each combination, it tests the join condition. If you do not specify a join condition, all combinations of rows from tables listed in the FROM clause are returned, even though the rows may be completely unrelated.

Performing a Simple Join Query

The join query in Figure 30 finds the project number and the last name of the employees in department D11:

Figure 30. A Simple Join

The WHERE clause above expresses a join condition. If a row from one of the participating tables does not satisfy the join condition, that row does not appear in the result of the join. So, if a EMPNO in the EMPLOYEE table has no matching EMPNO in the EMP_ACT table (or if EMPNO in the EMP_ACT table has no matching EMPNO in the EMPLOYEE table), that row does not appear in your result.

Note: More than one table in a join may have a common column name. To identify exactly which column you are referring to, you must use the table name as a prefix, as in the example above. Unique column names do not require a table name prefix.

Here is the query result (based on the example tables):

PROJNO LASTNAME ______ _____________ MA2111 BROWN MA2111 BROWN MA2111 LUTZ MA2112 ADAMSON MA2112 ADAMSON MA2112 WALKER MA2112 WALKER MA2112 YOSHIMURA MA2112 YOSHIMURA MA2113 JONES MA2113 JONES MA2113 PIANKA MA2113 SCOUTTEN MA2113 YOSHIMURA

Joining Another User's Tables

If you are referring to another user's table, you must prefix the table name with the owner-name. If, for example, the tables in the query above belonged to JONES, you would write:

Analyzing How a Join Works

When writing a join query, it is often helpful to mentally go through the query to see how SQL develops a JOIN.

For example, look at the previous select-statement. It refers to the EMPLOYEE and EMP_ACT tables. Joining the two tables will produce one table that contains all the columns in both tables.

Each EMPNO in the EMPLOYEE table is compared to every EMPNO in the EMP_ACT table. When the EMPNO column of both tables matches, a row is formed that contains the combined columns of the "matching" rows. Notice that the only column name that is common to both tables is EMPNO. If the name of this EMPNO column were different in each table, the EMPNO column of the result could have been called either name. This is because of the equality expressed in the join condition. In fact, the select-list could have specified EMPLOYEE.EMPNO instead of EMP_ACT.EMPNO, and identical results would have been produced.

Now consider what happens when the second part of the WHERE clause (AND WORKDEPT='D11') is applied.

The result is further reduced so that only the rows with a department name of D11 remain. The entire search condition is now satisfied. The system strips off the columns not specified in the select-list. This produces the query result previously shown.

Using VARCHAR and VARGRAPHIC within Join Conditions

If you are joining VARCHAR or VARGRAPHIC columns, trailing blanks are not used. For example, "JONES" and "JONES " match. If they were from two different EMPLOYEE tables joined on the LASTNAME column, they would form one row.

Using Nulls within Join Conditions

Like other predicates, a join condition is never satisfied by a null value. For example, if a row in the EMPLOYEE table and a row in the EMP_ACT table both have a null EMPNO, neither row will appear in the result of the join.

Joining a Table to Itself Using a Correlation Name

You can write a query in which you join a table to itself, by repeating the table name two or more times in the FROM clause. This tells the system that the join consists of combinations of rows from the same table. When you repeat the table name in the FROM clause, it is no longer unique. You must give one or both table names in the FROM clause a unique correlation_name to correctly designate the tables.

You use the correlation names to resolve column name ambiguities in the select-list and the WHERE clause. Rules for table designation are given at the end of this section.

For example, the following query finds the total of the values from the ACSTAFF column (PROJ_ACT table) for activities 60 and 70 for any project that contains both these activities:

DECLARE C1 CURSOR FOR SELECT PA1.PROJNO, PA1.ACSTAFF + PA2.ACSTAFF FROM PROJ_ACT PA1, PROJ_ACT PA2 WHERE PA1.PROJNO = PA2.PROJNO AND PA1.ACTNO = 60 AND PA2.ACTNO = 70 ORDER BY 1 OPEN C1 FETCH C1 INTO :PRONUM, :TOTAL CLOSE C1

This type of join query can also be easily visualized. Each PROJNO in the PROJ_ACT table is compared to every other PROJNO in the PROJ_ACT table. When two rows with the same PROJNO are found, a row is formed. The new row contains the combined columns of the "matching" rows.

Now consider what happens when the second part of the WHERE clause (PA1.ACTNO = 60 AND PA2.ACTNO = 70) is applied.

The result is further reduced to only the rows with an ACTNO of 60 in the first ACTNO column and with an ACTNO of 70 in the second ACTNO column.

Finally, the system sorts the query by PROJNO and strips off the columns not specified in the select-list. This produces:

PROJNO EXPRESSION 1 PROJNO EXPRESSION 1 ------ ------------ ------ ------------ AD3111 2.30 AD3113 2.00 AD3111 1.30 AD3113 1.25 AD3111 2.00 AD3113 1.75 AD3111 1.00 AD3113 1.50 AD3112 1.50 AD3113 1.75 AD3112 1.25 AD3113 2.25 AD3112 1.75 AD3113 1.50 AD3112 1.00 AD3113 2.00 AD3112 1.25 AD3113 1.75 AD3112 1.00 AD3113 2.00 AD3112 1.50 AD3113 1.50 AD3112 0.75 AD3113 0.75 AD3112 1.50 AD3113 1.25 AD3112 1.25 AD3113 1.00 AD3112 1.75 AD3113 1.25 AD3112 1.00 MA2112 3.00 AD3112 1.75 MA2112 3.50 AD3112 1.50 MA2112 3.00 AD3112 2.00 MA2113 3.00 AD3112 1.25 MA2113 3.00

If the table is owned by another user, the table name must be qualified in the usual fashion. For example, here is how to write the above query if the owner of the PROJ_ACT table is SCOTT:

DECLARE C1 CURSOR FOR SELECT PA1.PROJNO, PA1.ACSTAFF + PA2.ACSTAFF FROM SCOTT.PROJ_ACT PA1, SCOTT.PROJ_ACT PA2 WHERE PA1.PROJNO = PA2.PROJNO AND PA1.ACTNO = 60 AND PA2.ACTNO = 70 ORDER BY 1 OPEN C1 FETCH C1 INTO :PRONUM, :TOTAL CLOSE C1

Rules for Table Designation

Only exposed table names and correlation names in the FROM clause can be referenced in other clauses.
An exposed table name is one that is not followed by a correlation_name (for example, PROJECT). A nonexposed table name is a table name which is followed by a correlation_name (for example, PROJECT P). In the latter example, PROJECT has no scope in the query and cannot be referenced; the table designator in this case is P.
Exposed table names in the FROM clause must be different from each other.
Correlation names in the FROM clause must be different from each other and different from any exposed table names.

These rules are illustrated here:

The above query is not allowed. EMPLOYEE is a nonexposed table name and cannot be used to qualify column LASTNAME.

The above query is allowed. The second table in the FROM clause can be designated by the exposed table name EMPLOYEE. There is no ambiguity or conflict with the table name EMPLOYEE in the first table of the FROM clause, because that is a nonexposed table name.

Imposing Limits on Join Queries

The example of a simple join query in Figure 30 had only one join condition relating the values of EMPNO in two tables. |The following limits exist with respect to joins:

|You can join up to 16 tables in a query
|The maximum number of join columns in a query is 40. Note, |however, that this limit is evaluated after the Optimizer does query |transformation internally, and that this transformation may affect the number |of join columns in the query.

For more information on these limits, see the section on 'SQL Limits' in the DB2 Server for VSE & VM SQL Reference manual.

Using SELECT * In a Join

The notation SELECT * in a join query means "select all the columns of the first table, followed by all the columns of the second table, and so on." You can also use the notation SELECT T1.*. to select all the columns of the table T1. However, it is not recommended that you use either SELECT * or SELECT T1.* for join queries written in programs because if someone adds a new column to the first table in the join (by an ALTER TABLE statement), the columns of the second table are no longer delivered into the correct host variables. To avoid this problem, use a select-list in which all the columns are specifically listed.

[ Top of Page | Previous Page | Next Page | Table of Contents | Index ]