SQL Series...30

Relational Operators

JOINS

The UNION, INTERSECT, and EXCEPT operators are valuable in multitable databases in which the tables are union-compatible. In many cases, however, you want to draw data from multiple tables that have very little in common. JOINs are powerful relational operators that combine data from multiple tables into a single result table. The source tables may have little (or even nothing) in common with each other.

SQL:2003 supports a number of types of JOINs. The best one to choose in a given situation depends on the result you’re trying to achieve.

Basic JOIN

Any multitable query is a type of JOIN. The source tables are joined in the sense that the result table includes information taken from all the source tables. The simplest JOIN is a two-table SELECT that has no WHERE clause qualifiers. Every row of the first table is joined to every row of the second table. The result table is the Cartesian product of the two source tables. The number of rows in the result table is equal to the number of rows in the first source table multiplied by the number of rows in the second source table.

For example, imagine that you’re the personnel manager for a company and that part of your job is to maintain employee records. Most employee data, such as home address and telephone number, is not particularly sensitive.

But some data, such as current salary, should be available only to authorized personnel. To maintain security of the sensitive information, keep it in a sepa-rate table that is password protected. Consider the following pair of tables:

EMPLOYEE COMPENSATION

-------- ------------

EmpID Employ

FName Salary

LName Bonus

City

Phone

Fill the tables with some sample data:

EmpID	FName	LName	City	Phone
-----	-----	-----	----	-----
1	Whitey	Ford	Orange	555-1001
2	Don	Larson	Newark	555-3221
3	Sal	Maglie	Nutley	555-6905
4	Bob	Turley	Passaic	555-8908
Employ	Salary	Bonus
------	------	-----
1	33000	10000
2	18000	2000
3	24000	5000
4	22000	7000

Create a virtual result table with the following query:

SELECT *

FROM EMPLOYEE, COMPENSATION ;

Producing:

EmpID FName LName City Phone Employ Salary Bonus

----- ----- ----- ---- ----- ------ ------ -----

1 Whitey Ford Orange 555-1001 1 33000 10000

1 Whitey Ford Orange 555-1001 2 18000 2000

1 Whitey Ford Orange 555-1001 3 24000 5000

1 Whitey Ford Orange 555-1001 4 22000 7000

2 Don Larson Newark 555-3221 1 33000 10000

2 Don Larson Newark 555-3221 2 18000 2000

2 Don Larson Newark 555-3221 3 24000 5000

2 Don Larson Newark 555-3221 4 22000 7000

3 Sal Maglie Nutley 555-6905 1 33000 10000

3 Sal Maglie Nutley 555-6905 2 18000 2000

3 Sal Maglie Nutley 555-6905 3 24000 5000

3 Sal Maglie Nutley 555-6905 4 22000 7000

4 Bob Turley Passaic 555-8908 1 33000 10000

4 Bob Turley Passaic 555-8908 2 18000 2000

4 Bob Turley Passaic 555-8908 3 24000 5000

4 Bob Turley Passaic 555-8908 4 22000 7000

The result table, which is the Cartesian product of the EMPLOYEE and COMPENSATION tables, contains considerable redundancy. Furthermore, it doesn’t make much sense. It combines every row of EMPLOYEE with every row of COMPENSATION. The only rows that convey meaningful information are those in which the EmpID number that came from EMPLOYEE matches the Employ number that came from COMPENSATION. In those rows, an employee’s name and address are associated with that same employee’s compensation.

When you’re trying to get useful information out of a multitable database, the Cartesian product produced by a basic JOIN is almost never what you want, but it’s almost always the first step toward what you want. By applying con-straints to the JOIN with a WHERE clause, you can filter out the unwanted rows. The most common JOIN that uses the WHERE clause filter is the equi-join.

Equi-join

An equi-join is a basic join with a WHERE clause containing a condition specify-ing that the value in one column in the first table must be equal to the value of a corresponding column in the second table. Applying an equi-join to the example tables from the previous section brings a more meaningful result:

SELECT *

FROM EMPLOYEE, COMPENSATION

WHERE EMPLOYEE.EmpID = COMPENSATION.Employ ;

This produces:

EmpID	FName	LName	City	Phone	Employ Salary Bonus
-----	------	-----	----	-----	------ ------ -----
1	Whitey	Ford	Orange	555-1001	1	33000	10000
2	Don	Larson	Newark	555-3221	2	18000	2000
3	Sal	Maglie	Nutley	555-6905	3	24000	5000
4	Bob	Turley	Passaic	555-8908	4	22000	7000

In this result table, the salaries and bonuses on the right apply to the employees named on the left. The table still has some redundancy because the EmpID column duplicates the Employ column. You can fix this problem with a slight reformulation of the query:

SELECT EMPLOYEE.*,COMPENSATION.Salary,COMPENSATION.Bonus

FROM EMPLOYEE, COMPENSATION

WHERE EMPLOYEE.EmpID = COMPENSATION.Employ ;

This produces:

EmpID FName LName City Phone Salary Bonus

----- ----- ----- ---- ----- ------ -----

1 Whitey Ford Orange 555-1001 33000 10000

2 Don Larson Newark 555-3221 18000 2000

3 Sal Maglie Nutley 555-6905 24000 5000

4 Bob Turley Passaic 555-8908 22000 7000

This table tells you what you want to know, but doesn’t burden you with any extraneous data. The query is somewhat tedious to write, however. To avoid ambiguity, qualify the column names with the names of the tables they came from. Writing those table names repeatedly provides good exercise for the fingers but has no merit otherwise.

You can cut down on the amount of typing by using aliases (or correlation names). An alias is a short name that stands for a table name. If you use aliases in recasting the preceding query, it comes out like this:

SELECT E.*, C.Salary, C.Bonus

FROM EMPLOYEE E, COMPENSATION C

WHERE E.EmpID = C.Employ ;

In this example, E is the alias for EMPLOYEE, and C is the alias for COMPENSATION. The alias is local to the statement it’s in. After you declare an alias (in the FROM clause), you must use it throughout the statement. You can’t use both the alias and the long form of the table name.

Mixing the long form of table names with aliases creates confusion. Consider the following example, which is confusing:

SELECT T1.C, T2.C

FROM T1 T2, T2 T1

WHERE T1.C > T2.C ;

In this example, the alias for T1 is T2, and the alias for T2 is T1. Admittedly, this isn’t a smart selection of aliases, but it isn’t forbidden by the rules. If you mix aliases with long-form table names, you can’t tell which table is which.

The preceding example with aliases is equivalent to the following SELECT with no aliases:

SELECT T2.C, T1.C

FROM T1 , T2

WHERE T2.C > T1.C ;

SQL:2003 enables you to join more than two tables. The maximum number varies from one implementation to another. The syntax is analogous to the two-table case:

SELECT E.*, C.Salary, C.Bonus, Y.TotalSales

FROM EMPLOYEE E, COMPENSATION C, YTD_SALES Y

WHERE E.EmpID = C.Employ

AND C.Employ = Y.EmpNo ;

This statement performs an equi-join on three tables, pulling data from corresponding rows of each one to produce a result table that shows the sales-people’s names, the amount of sales they are responsible for, and their com-pensation. The sales manager can quickly see whether compensation is in line with production.

Storing a salesperson’s year-to-date sales in a separate YTD_SALES table ensures better performance and reliability than keeping that data in the EMPLOYEE table. The data in the EMPLOYEE table is relatively static. A person’s name, address, and telephone number don’t change very often. In contrast, the year-to-date sales change frequently (you hope). Because YTD_SALES has fewer columns than EMPLOYEE, you may be able to update it more quickly. If, in the course of updating sales totals, you don’t touch the EMPLOYEE table, you decrease the risk of accidentally modifying EMPLOYEE information that should stay the same.

Cross join

CROSS JOIN is the keyword for the basic join without a WHERE clause.

Therefore,

SELECT *

FROM EMPLOYEE, COMPENSATION ;

can also be written as:

SELECT *

FROM EMPLOYEE CROSS JOIN COMPENSATION ;

The result is the Cartesian product (also called the cross product) of the two source tables. CROSS JOIN rarely gives you the final result you want, but it can be useful as the first step in a chain of data manipulation operations that ultimately produce the desired result.

Natural join

The natural join is a special case of an equi-join. In the WHERE clause of an equi-join, a column from one source table is compared with a column of a second source table for equality. The two columns must be the same type and length and must have the same name. In fact, in a natural join, all columns in one table that have the same names, types, and lengths as corresponding columns in the second table are compared for equality.

Imagine that the COMPENSATION table from the preceding example has columns EmpID, Salary, and Bonus rather than Employ, Salary, and Bonus. In that case, you can perform a natural join of the COMPENSATION table with the EMPLOYEE table. The traditional JOIN syntax would look like this:

SELECT E.*, C.Salary, C.Bonus

FROM EMPLOYEE E, COMPENSATION C

WHERE E.EmpID = C.EmpID ;

This query is a natural join. An alternate syntax for the same operation is the following:

SELECT E.*, C.Salary, C.Bonus

FROM EMPLOYEE E NATURAL JOIN COMPENSATION C ;

Condition join

A condition join is like an equi-join, except the condition being tested doesn’t have to be equal (although it can be). It can be any well-formed predicate. If the condition is satisfied, the corresponding row becomes part of the result table. The syntax is a little different from what you have seen so far, in that the condition is contained in an ON clause rather than a WHERE clause.

Say that a baseball statistician wants to know which National League pitchers have pitched the same number of complete games as one or more American League pitchers. This question is a job for an equi-join, which can also be expressed with condition join syntax:

SELECT *

FROM NATIONAL JOIN AMERICAN

ON NATIONAL.CompleteGames = AMERICAN.CompleteGames ;

Column-name join

The column-name join is like a natural join, but it’s more flexible. In a natural join, all the source table columns that have the same name are compared with each other for equality. With the column-name join, you select which same-name columns to compare. You can choose them all if you want, making the column-name join effectively a natural join. Or you may choose fewer than all same-name columns. In this way, you have a great degree of control over which cross product rows qualify to be placed into your result table.

Say that you’re a chess set manufacturer and have one inventory table that keeps track of your stock of white pieces and another that keeps track of black pieces. The tables contain data as follows:

WHITE			BLACK
-----			-----
Piece Quant Wood			Piece	Quant Wood
----- ----- ----			-----	----- ----
King	502	Oak	King	502	Ebony
Queen	398	Oak	Queen	397	Ebony
Rook	1020	Oak	Rook	1020	Ebony
Bishop	985	Oak	Bishop	985	Ebony
Knight	950	Oak	Knight	950	Ebony
Pawn	431	Oak	Pawn	453	Ebony

For each piece type, the number of white pieces should match the number of black pieces. If they don’t match, some chessmen are being lost or stolen, and you need to tighten security measures.

A natural join compares all columns with the same name for equality. In this case, a result table with no rows is produced because no rows in the WOOD column in the WHITE table match any rows in the WOOD column in the BLACK table. This result table doesn’t help you determine whether any merchandise is missing. Instead, do a column-name join that excludes the WOOD column from consideration. It can take the following form:

SELECT *

FROM WHITE JOIN BLACK

USING (Piece, Quant) ;

The result table shows only the rows for which the number of white pieces in stock equals the number of black pieces:

Piece Quant		Wood	Piece	Quant	Wood
----- -----		----	-----	-----	----
King	502	Oak	King	502	Ebony
Rook	1020	Oak	Rook	1020	Ebony
Bishop	985	Oak	Bishop	985	Ebony
Knight	950	Oak	Knight	950	Ebony

The shrewd person can deduce that Queen and Pawn are missing from the list, indicating a shortage somewhere for those piece types.

Inner join

By now, you’re probably getting the idea that joins are pretty esoteric and that it takes an uncommon level of spiritual discernment to deal with them adequately. You may have even heard of the mysterious inner join and specu-lated that it probably represents the core or essence of relational operations.

Well, ha! The joke is on you. There’s nothing mysterious about inner joins. In fact, all the joins covered so far in this article are inner joins. I could have formulated the column-name join in the last example as an inner join by using the following syntax:

SELECT *

FROM WHITE INNER JOIN BLACK

USING (Piece, Quant) ;

The result is the same.

The inner join is so named to distinguish it from the outer join. An inner join discards all rows from the result table that don’t have corresponding rows in both source tables. An outer join preserves unmatched rows. That’s the dif-ference. There is nothing metaphysical about it.

Outer join

When you’re joining two tables, the first one (call it the one on the left) may have rows that don’t have matching counterparts in the second table (the one on the right). Conversely, the table on the right may have rows that don’t have matching counterparts in the table on the left. If you perform an inner join on those tables, all the unmatched rows are excluded from the output. Outer joins, however, don’t exclude the unmatched rows. Outer joins come in three types: the left outer join, the right outer join, and the full outer join.

Left outer join

In a query that includes a join, the left table is the one that precedes the key-word JOIN, and the right table is the one that follows it. The left outer join pre-serves unmatched rows from the left table but discards unmatched rows from the right table.

To understand outer joins, consider a corporate database that maintains records of the company’s employees, departments, and locations. Tables 1, 2, and 3 contain the database’s example data.

Table	LOCATION
*LOCATION_ID*	*CITY*
1	Boston

3	Tampa

5	Chicago

Table		DEPT

*DEPT_ID*	*LOCATION_ID*	*NAME*
21	1	Sales

24	1	Admin

27	5	Repair

29	5	Stock

Table		EMPLOYEE

*EMP_ID*	*DEPT_ID*	*NAME*
61	24	Kirk

63	27	McCoy

Now suppose that you want to see all the data for all employees, including department and location. You get this with an equi-join:

SELECT *

FROM LOCATION L, DEPT D, EMPLOYEE E

WHERE L.LocationID = D.LocationID

AND D.DeptID = E.DeptID ;

This statement produces the following result:

1 Boston 24 1 Admin 61 24 Kirk

5 Chicago 27 5 Repair 63 27 McCoy

This result table gives all the data for all the employees, including their loca-tion and department. The equi-join works because every employee has a location and a department.

Suppose now that you want the data on the locations, with the related department and employee data. This is a different problem because a loca-tion without any associated departments may exist. To get what you want, you have to use an outer join, as in the following example:

SELECT *

FROM LOCATION L LEFT OUTER JOIN DEPT D

ON (L.LocationID = D.LocationID)

LEFT OUTER JOIN EMPLOYEE E

ON (D.DeptID = E.DeptID);

This join pulls data from three tables. First, the LOCATION table is joined to the DEPT table. The resulting table is then joined to the EMPLOYEE table. Rows from the table on the left of the LEFT OUTER JOIN operator that have no corresponding row in the table on the right are included in the result. Thus, in the first join, all locations are included, even if no department associated with them exists. In the second join, all departments are included, even if no employee associated with them exists. The result is as follows:

1	Boston	24	1	Admin	61	24	Kirk
5	Chicago	27	5	Repair	63	27	McCoy
3	Tampa	NULL	NULL	NULL	NULL	NULL	NULL
5	Chicago	29	5	Stock	NULL	NULL	NULL
1	Boston	21	1	Sales	NULL	NULL	NULL

The first two rows are the same as the two result rows in the previous example. The third row (3 Tampa) has nulls in the department and employee columns because no departments are defined for Tampa and no employees are stationed there. The fourth and fifth rows (5 Chicago and 1 Boston) contain data about the Stock and the Sales departments, but the employee columns for these rows contain nulls because these two departments have no employees. This outer join tells you everything that the equi-join told you plus the following:

All the company’s locations, whether they have any departments or not All the company’s departments, whether they have any employees or not

The rows returned in the preceding example aren’t guaranteed to be in the order you want. The order may vary from one implementation to the next. To make sure that the rows returned are in the order you want, add an ORDER BY clause to your SELECT statement, like this:

SELECT *

FROM LOCATION L LEFT OUTER JOIN DEPT D

ON (L.LocationID = D.LocationID)

LEFT OUTER JOIN EMPLOYEE E

ON (D.DeptID = E.DeptID)

ORDER BY L.LocationID, D.DeptID, E.EmpID;

You can abbreviate the left outer join language as LEFT JOIN because there’s no such thing as a left inner join.

Right outer join

I bet you figured out how the right outer join behaves. Right! The right outer join preserves unmatched rows from the right table but discards unmatched rows from the left table. You can use it on the same tables and get the same result by reversing the order in which you present tables to the join:

SELECT *

FROM EMPLOYEE E RIGHT OUTER JOIN DEPT D

ON (D.DeptID = E.DeptID)

RIGHT OUTER JOIN LOCATION L

ON (L.LocationID = D.LocationID) ;

In this formulation, the first join produces a table that contains all depart-ments, whether they have an associated employee or not. The second join produces a table that contains all locations, whether they have an associated department or not.

You can abbreviate the right outer join language as RIGHT JOIN because there’s no such thing as a right inner join.

Full outer join

The full outer join combines the functions of the left outer join and the right outer join. It retains the unmatched rows from both the left and the right tables. Consider the most general case of the company database used in the preceding examples. It could have

Locations with no departments Departments with no locations Departments with no employees Employees with no departments

To show all locations, departments, and employees, regardless of whether they have corresponding rows in the other tables, use a full outer join in the following form:

SELECT *

FROM LOCATION L FULL JOIN DEPT D

ON (L.LocationID = D.LocationID)

FULL JOIN EMPLOYEE E

ON (D.DeptID = E.DeptID) ;

You can abbreviate the full outer join language as FULL JOIN because there’s no such thing as a full inner join.

Union join

Unlike the other kinds of join, the union join makes no attempt to match a row from the left source table with any rows in the right source table. It cre-ates a new virtual table that contains the union of all the columns in both source tables. In the virtual result table, the columns that came from the left source table contain all the rows that were in the left source table. For those rows, the columns that came from the right source table all have the null value. Similarly, the columns that came from the right source table contain all the rows that were in the right source table. For those rows, the columns that came from the left source table all have the null value. Thus, the table result-ing from a union join contains all the columns of both source tables, and the number of rows that it contains is the sum of the number of rows in the two source tables.

The result of a union join by itself is not immediately useful in most cases; it produces a result table with many nulls in it. Look at an example.

Suppose that you work for a company that designs and builds experimental rockets. You have several projects in the works. You also have several design engineers who have skills in multiple areas. As a manager, you want to know which employees, having which skills, have worked on which projects. Currently, this data is scattered among the EMPLOYEE table, the PROJECTS table, and the SKILLS table.

The EMPLOYEE table carries data about employees, and EMPLOYEE.EmpID is its primary key. The PROJECTS table has a row for each project that an employee has worked on. PROJECTS.EmpID is a foreign key that references the EMPLOYEE table. The SKILLS table shows the expertise of each employee. SKILLS.EmpID is a foreign key that references the EMPLOYEE table.

The EMPLOYEE table has one row for each employee; the PROJECTS table and the SKILLS table have zero or more rows.

Tables 1, 2, and 3 show example data in the three tables.

Table	EMPLOYEE Table
*EmpID*	*Name*
1	Ferguson

2	Frost

3	Toyon

Table	PROJECTS Table

*ProjectName*	*EmpID*
X-63 Structure	1

X-64 Structure	1

X-63 Guidance	2

X-64 Guidance	2

X-63 Telemetry	3

X-64 Telemetry	3

Table	SKILLS Table

*Skill*	*EmpID*
Mechanical Design	1

Aerodynamic Loading	1

Analog Design	2

Gyroscope Design	2

Digital Design	3

R/F Design	3

From the tables, you can see that Ferguson has worked on X-63 and X-64 structure design and has expertise in mechanical design and aerodynamic loading.

Now suppose that, as a manager, you want to see all the information about all the employees. You decide to apply an equi-join to the EMPLOYEE, PROJECTS, and SKILLS tables:

SELECT *

FROM EMPLOYEE E, PROJECTS P, SKILLS S

WHERE E.EmpID = P.EmpID

AND E.EmpID = S.EmpID ;

You can express this same operation as an inner join by using the following syntax:

SELECT *

FROM EMPLOYEE E INNER JOIN PROJECTS P

ON (E.EmpID = P.EmpID)

INNER JOIN SKILLS S

ON (E.EmpID = S.EmpID) ;

Both formulations give the same result, as shown in Table

Table			Result of Inner Join

*E.EmpID*	*Name*	*P.EmpID*	*ProjectName*	*S.EmpID*	*Skill*
1	Ferguson	1	X-63 Structure	1	Mechanical
					Design

1	Ferguson	1	X-63 Structure	1	Aerodynamic
					Loading

1	Ferguson	1	X-64 Structure	1	Mechanical
					Design

1	Ferguson	1	X-64 Structure	1	Aerodynamic
					Loading

2	Frost	2	X-63 Guidance	2	Analog Design

2	Frost	2	X-63 Guidance	2	Gyroscope Design

2	Frost	2	X-64 Guidance	2	Analog Design

2	Frost	2	X-64 Guidance	2	Gyroscope Design

3	Toyon	3	X-63 Telemetry	3	Digital Design

3	Toyon	3	X-63 Telemetry	3	R/F Design

3	Toyon	3	X-64 Telemetry	3	Digital Design

3	Toyon	3	X-64 Telemetry	3	R/F Design

This data arrangement is not particularly enlightening. The employee ID numbers appear three times, and the projects and skills are duplicated for each employee. The inner joins are not well suited to answering this type of question. You can put the union join to work here, along with some strategi-cally chosen SELECT statements, to produce a more suitable result. You begin with the basic union join:

SELECT *

FROM EMPLOYEE E UNION JOIN PROJECTS P

UNION JOIN SKILLS S ;

Notice that the union join has no ON clause. It doesn’t filter the data, so an ON clause isn’t needed. This statement produces the result shown in Table

Table			Result of Union Join
*E.EmpID*	*Name*	*P.EmpID*	*ProjectName*	*S.EmpID*	*Skill*
1	Ferguson	NULL	NULL	NULL	NULL

NULL	NULL	1	X-63 Structure	NULL	NULL

NULL	NULL	1	X-64 Structure	NULL	NULL

NULL	NULL	NULL	NULL	1	Mechanical
					Design

NULL	NULL	NULL	NULL	1	Aerodynamic
					Loading

2	Frost	NULL	NULL	NULL	NULL

NULL	NULL	2	X-63 Guidance	NULL	NULL

NULL	NULL	2	X-64 Guidance	NULL	NULL

NULL	NULL	NULL	NULL	2	Analog Design

NULL	NULL	NULL	NULL	2	Gyroscope
					Design

3	Toyon	NULL	NULL	NULL	NULL

NULL	NULL	3	X-63 Telemetry	NULL	NULL

NULL	NULL	3	X-64 Telemetry	NULL	NULL

NULL	NULL	NULL	NULL	3	Digital Design

NULL	NULL	NULL	NULL	3	R/F Design

Each table has been extended to the right or left with nulls, and those null-extended rows have been unioned. The order of the rows is arbitrary and depends on the implementation. Now you can massage the data to put it in a more useful form.

Notice that the table has three ID columns, only one of which is nonnull in any row. You can improve the display by coalescing the ID columns. As I note in previous article, the COALESCE expression takes on the value of the first nonnull value in a list of values. In the present case, it takes on the value of the only nonnull value in a column list:

SELECT COALESCE (E.EmpID, P.EmpID, S.EmpID) AS ID,

E.Name, P.ProjectName, S.Skill

FROM EMPLOYEE E UNION JOIN PROJECTS P

UNION JOIN SKILLS S

ORDER BY ID ;

The FROM clause is the same as in the previous example, but now you’re coa-lescing the three EMP_ID columns into a single column named ID. You’re also ordering the result by ID. Table shows the result.

Table Result of Union Join with COALESCE Expression

ID	*Name*	*ProjectName*	*Skill*
1	Ferguson	X-63 Structure	NULL

1	Ferguson	X-64 Structure	NULL

1	Ferguson	NULL	Mechanical Design

1	Ferguson	NULL	Aerodynamic Loading

2	Frost	X-63 Guidance	NULL

2	Frost	X-64 Guidance	NULL

2	Frost	NULL	Analog Design

2	Frost	NULL	Gyroscope Design

3	Toyon	X-63 Telemetry	NULL

3	Toyon	X-64 Telemetry	NULL

3	Toyon	NULL	Digital Design

3	Toyon	NULL	R/F Design

Each row in this result has data about a project or a skill, but not both. When you read the result, you first have to determine what type of information is in each row (project or skill). If the ProjectName column has a nonnull value, the row names a project the employee has worked on. If the Skill column is non-null, the row names one of the employee’s skills.

You can make the result a little clearer by adding another COALESCE to the

SELECT statement, as follows:

SELECT COALESCE (E.EmpID, P.EmpID, S.EmpID) AS ID,

E.Name, COALESCE (P.Type, S.Type) AS Type,

P.ProjectName, S.Skill

FROM EMPLOYEE E

UNION JOIN (SELECT “Project” AS Type, P.*

FROM PROJECTS) P

UNION JOIN (SELECT “Skill” AS Type, S.*

FROM SKILLS) S

ORDER BY ID, Type ;

In this union join, the PROJECTS table in the previous example has been replaced with a nested SELECT that appends a column named P.Type with a constant value “Project” to the columns coming from the PROJECTS table. Similarly, the SKILLS table has been replaced with a nested SELECT that appends a column named S.Type with a constant value “Skill” to the columns coming from the SKILLS table. In each row, P.Type is either null or “Project”, and S.Type is either null or “Skill”.

The outer SELECT list specifies a COALESCE of those two Type columns into a single column named Type. You then specify Type in the ORDER BY clause, which sorts the rows that all have the same ID so that all projects are first, followed by all the skills. The result is shown in Table

Table		Refined Result of Union Join with
			COALESCE Expressions

ID	*Name*	*Type*	*ProjectName*	*Skill*
1	Ferguson	Project	X-63 Structure	NULL

1	Ferguson	Project	X-64 Structure	NULL

1	Ferguson	Skill	NULL	Mechanical Design

1	Ferguson	Skill	NULL	Aerodynamic Loading

2	Frost	Project	X-63 Guidance	NULL

2	Frost	Project	X-64 Guidance	NULL

2	Frost	Skill	NULL	Analog Design

2	Frost	Skill	NULL	Gyroscope Design

3	Toyon	Project	X-63 Telemetry	NULL

3	Toyon	Project	X-64 Telemetry	NULL

3	Toyon	Skill	NULL	Digital Design

3	Toyon	Skill	NULL	R/F Design

The result table now presents a very readable account of the project experi-ence and the skill sets of all the employees in the EMPLOYEE table.

Considering the number of JOIN operations available, relating data from dif-ferent tables shouldn’t be a problem, regardless of the tables’ structure. Trust that if the raw data exists in your database, SQL:2003 has the means to get it out and display it in a meaningful form.

ON versus WHERE

The function of the ON and WHERE clauses in the various types of joins is potentially confusing. These facts may help you keep things straight:

The ON clause is part of the inner, left, right, and full joins. The cross join and union join don’t have an ON clause because neither of them does any filtering of the data.

The ON clause in an inner join is logically equivalent to a WHERE clause; the same condition could be specified either in the ON clause or a WHERE clause.

The ON clauses in outer joins (left, right, and full joins) are different from WHERE clauses. The WHERE clause simply filters the rows that are returned by the FROM clause. Rows that are rejected by the filter are not included in the result. The ON clause in an outer join first filters the rows of a cross product and then includes the rejected rows, extended with nulls.

IT Skills

Tags

Search This Blog

Sunday, 10 June 2018

SQL Series...30

Relational Operators

JOINS

Basic JOIN

Equi-join

Cross join

Natural join

Condition join

Column-name join

Inner join

Outer join

Left outer join

Right outer join

Full outer join

Union join

ON versus WHERE

No comments:

Post a Comment

Report Abuse

Popular

About Me

Archive

Comments

Tags

Contact Form

Labels

Categories

Trending

Recent