IBM Books

SQL Getting Started


Joins

The process of combining data from two or more tables is called joining tables. The database manager forms all combinations of rows from the specified tables. For each combination, it tests the join condition. A join condition is a search condition, with some restrictions. For a list of restrictions refer to the SQL Reference.

Note that the data types of the columns involved in the join condition do not have to be identical; however, they must be compatible. The join condition is evaluated the same way as any other search condition, and the same rules for comparisons apply.

If you do not specify a join condition, all combinations of rows from tables listed in the FROM clause are returned, even though the rows may be completely unrelated. The result is referred to as the cross product of the two tables.

Examples in this section are based on the next two tables. They are simplifications of the tables from the sample database but do not exist in the sample database. They are used to outline interesting points about joins in general. SAMP_STAFF lists the name of employees who are not employed as contractors and their job descriptions, while SAMP_PROJECT lists the name of employees (contract and full-time) and the projects that they are working on.

The tables are as follows:

Figure 5. SAMP_PROJECT TABLE


REQTEXT

Figure 6. SAMP_STAFF TABLE


REQTEXT

The following example produces the cross product of two table. A join condition is not specified, so all combination of rows is present:

     SELECT SAMP_PROJECT.NAME, 
            SAMP_PROJECT.PROJ, SAMP_STAFF.NAME, SAMP_STAFF.JOB
        FROM SAMP_PROJECT, SAMP_STAFF

This statement produces the following result:

     NAME       PROJ   NAME       JOB     
     ---------- ------ ---------- --------
     Haas       AD3100 Haas       PRES    
     Thompson   PL2100 Haas       PRES    
     Walker     MA2112 Haas       PRES    
     Lutz       MA2111 Haas       PRES    
     Haas       AD3100 Thompson   MANAGER 
     Thompson   PL2100 Thompson   MANAGER 
     Walker     MA2112 Thompson   MANAGER 
     Lutz       MA2111 Thompson   MANAGER 
     Haas       AD3100 Lucchessi  SALESREP
     Thompson   PL2100 Lucchessi  SALESREP
     Walker     MA2112 Lucchessi  SALESREP
     Lutz       MA2111 Lucchessi  SALESREP
     Haas       AD3100 Nicholls   ANALYST 
     Thompson   PL2100 Nicholls   ANALYST 
     Walker     MA2112 Nicholls   ANALYST 
     Lutz       MA2111 Nicholls   ANALYST 

The two main types of joins are inner joins and outer joins. So far, in all of our examples we have used the inner join. Inner joins keep only the rows from the cross product that meet the join condition. If a row exists in one table, but not the other, the information is not included in the result table.

The following example produces the inner join of the two tables. The inner join lists the information full-time employees who are assigned to a project :

     SELECT SAMP_PROJECT.NAME, 
            SAMP_PROJECT.PROJ, SAMP_STAFF.NAME, SAMP_STAFF.JOB
        FROM SAMP_PROJECT, SAMP_STAFF
        WHERE SAMP_STAFF.NAME = SAMP_PROJECT.NAME

Alternately, you can specify the inner join as follows:

     SELECT SAMP_PROJECT.NAME, 
            SAMP_PROJECT.PROJ, SAMP_STAFF.NAME, SAMP_STAFF.JOB
        FROM SAMP_PROJECT INNER JOIN SAMP_STAFF
          ON SAMP_STAFF.NAME = SAMP_PROJECT.NAME

The result is:

     NAME       PROJ   NAME       JOB     
     ---------- ------ ---------- --------
     Haas       AD3100 Haas       PRES    
     Thompson   PL2100 Thompson   MANAGER 

Note that the result of the inner join consists of rows that have matching values for the NAME column in the right and the left tables - both 'Haas' and 'Thompson' are included in the SAMP_STAFF table that lists all full-time employee and in the SAMP_PROJECT table that lists full-time and contract employee assigned to a project.

Outer joins are a concatentation of the inner join and rows from the left table, right table, or both tables that are missing from the inner join. When you perform an outer join on two tables, you arbitrarily assign one table as the left table and the other one as the right table. There are three types of outer joins:

  1. left outer join includes the inner join and the rows from the left table that are not included in the inner join.

  2. right outer join includes the inner join and the rows from the right table that are not included in the inner join.

  3. full outer join includes the inner join and the rows from both the left and right tables that are not included in the inner join.

Use the SELECT statement to specify the columns to be displayed. In the FROM clause, list the name of the first table followed by the keywords LEFT OUTER JOIN, RIGHT OUTER JOIN or FULL OUTER JOIN. Next you need to specify the second table followed by the ON keyword. Following the ON keyword, specify the join condition to express a relationship between the tables to be joined.

In the following example, SAMP_STAFF is designated as the right table and SAMP_PROJECT as the left table. By using LEFT OUTER JOIN, we list the name and project number of all employees, full-time and contract, (listed in SAMP_PROJECT) and their job title, if they are a full-time employee (listed in SAMP_STAFF):

     SELECT SAMP_PROJECT.NAME, SAMP_PROJECT.PROJ,
            SAMP_STAFF.NAME, SAMP_STAFF.JOB
        FROM SAMP_PROJECT LEFT OUTER JOIN SAMP_STAFF
          ON SAMP_STAFF.NAME = SAMP_PROJECT.NAME

This statement produces the following result:

     NAME       PROJ                 NAME       JOB                 
     ---------- -------------------- ---------- --------------------
     Haas       AD3100               Haas       PRES                
     Lutz       MA2111               -          -                   
     Thompson   PL2100               Thompson   MANAGER             
     Walker     MA2112               -          -                   

Rows with values in all columns are the result of the inner join. These are rows that satisfy the join condition: 'Haas' and 'Thompson' are listed in both SAMP_PROJECT (left table) and SAMP_STAFF (right table). For rows that the join condition was not satisfied, the null value appears on columns of the right table: 'Lutz' and 'Walker' are contract employees listed in the SAMP_PROJECT table and not in the SAMP_STAFF table. Note that all rows from the left table are included in the result set.

In the next example, SAMP_STAFF is designated as the right table and SAMP_PROJECT as the left table. By using RIGHT OUTER JOIN we list the name and job title of all full-time employees (listed in SAMP_STAFF) and their project number, if they are assigned to one (listed in SAMP_PROJECT):

     SELECT SAMP_PROJECT.NAME, 
            SAMP_PROJECT.PROJ, SAMP_STAFF.NAME, SAMP_STAFF.JOB
        FROM SAMP_PROJECT RIGHT OUTER JOIN SAMP_STAFF
          ON SAMP_STAFF.NAME = SAMP_PROJECT.NAME

The result is:

    NAME       PROJ                 NAME       JOB                 
    ---------- -------------------- ---------- --------------------
    Haas       AD3100               Haas       PRES                
    -          -                    Lucchessi  SALESREP            
    -          -                    Nicholls   ANALYST             
    Thompson   PL2100               Thompson   MANAGER             

As in the left outer join, rows with values in all columns are the result of the inner join. These are rows that satisfy the join condition: 'Haas' and 'Thompson' are listed in both SAMP_PROJECT (left table) and SAMP_STAFF (right table). For rows that the join condition was not satisfied, the null value appears on columns of the right table: 'Lucchessi' and 'Nicholls' are full-time employee that are not assigned to a project. While they are listed in SAMP_STAFF, they are not in SAMP_PROJECT. Note that all rows from the right table are included in the result set.

The next example uses FULL OUTER JOIN with the SAMP_PROJECT and SAMP_STAFF tables. It lists the name of all full-time, including the ones that are not assigned to a project, and contract employees:

     SELECT SAMP_PROJECT.NAME, SAMP_PROJECT.PROJ,
            SAMP_STAFF.NAME, SAMP_STAFF.JOB
        FROM SAMP_PROJECT FULL OUTER JOIN SAMP_STAFF
          ON SAMP_STAFF.NAME = SAMP_PROJECT.NAME

The result is:

     NAME       PROJ                 NAME       JOB                 
     ---------- -------------------- ---------- --------------------
     Haas       AD3100               Haas       PRES                
     -          -                    Lucchessi  SALESREP            
     -          -                    Nicholls   ANALYST             
     Thompson   PL2100               Thompson   MANAGER             
     Lutz       MA2111               -          -                   
     Walker     MA2112               -          -                   

This result includes the left outer join, the right outer join and the inner join. All full-time and contract employees are listed. Just like left outer join and right outer join, for values that the join condition was not satisfied the null value appears in the respective column. Every row from SAMP_STAFF and SAMP_PROJECT is included in the result set.


[ Top of Page | Previous Page | Next Page | Table of Contents | Index ]

[ DB2 List of Books | Search the DB2 Books ]