
Delving Deep with Nested Queries
If you normalize a database properly, the data is scattered across multiple tables. Most queries that you want to make need to pull data from two or more tables. One way to do this is to use a JOIN operator or one of the other relational operators (UNION, INTERSECT, or EXCEPT). The relational operators take information from multiple tables and combine it all into a single table. Different operators combine the data in different ways. Another way to pull data from two or more tables is to use a nested query.
In SQL, a nested query is one in which an outer enclosing statement contains within it a subquery. That subquery may serve as an enclosing statement for a lower-level subquery that is nested within it. There are no theoretical limits to the number of nesting levels that a nested query may have, although implementation-dependent practical limits do exist.
Subqueries are invariably SELECT statements, but the outermost enclosing statement may also be an INSERT, UPDATE, or DELETE.
Because a subquery can operate on a different table than the table operated on by its enclosing statement, nested queries give you another way to extract information from multiple tables.
For example, suppose that you want to query your corporate database to find all department managers who are more than 50 years old. With the JOINs I discuss in previous article, you may use a query like this:
SELECT D.Deptno, D.Name, E.Name, E.Age
FROM DEPT D, EMPLOYEE E
WHERE D.ManagerID = E.ID AND E.Age > 50 ;
D is the alias for the DEPT table, and E is the alias for the EMPLOYEE table. The EMPLOYEE table has an ID column that is the primary key, and the DEPT table has a column ManagerID that is the ID value of the employee who is the department’s manager. I use a simple JOIN (the list of tables in the FROM clause) to pair related tables, and a WHERE clause to filter all rows except those that meet the criterion. Note that the SELECT statement’s parameter list includes the Deptno and Name columns from the DEPT table and the Name and Age columns from the EMPLOYEE table.
Next, suppose that you’re interested in the same set of rows but you want only the columns from the DEPT table. In other words, you’re interested in the departments whose managers are 50 or older, but you don’t care who those managers are or exactly how old they are. You could then write the query with a subquery rather than a JOIN:
SELECT D.Deptno, D.Name
FROM DEPT D
WHERE EXISTS (SELECT * FROM EMPLOYEE E
WHERE E.ID = D.ManagerID AND E.Age > 50) ;
This query has two new elements: the EXISTS keyword and the SELECT * in the WHERE clause of the first SELECT. The second SELECT is a subquery (or subselect), and the EXISTS keyword is one of several tools for use with a sub-query that is described in this article.
Why Use a Subquery?
In many instances, you can accomplish the same result with a subquery as you can with a JOIN. In most cases, the complexity of the subquery syntax is comparable to the complexity of the corresponding JOIN operation. It comes down to a matter of personal preference. Some people prefer formulating a retrieval in terms of JOIN operations, whereas others prefer nested queries. Sometimes, obtaining the results that you want isn’t possible by using JOIN. In those cases, you must either use a nested query or break the problem up into multiple SQL statements and execute them one at a time.
What Subqueries Do
Subqueries are located within the WHERE clause of their enclosing statement. Their function is to set the search conditions for the WHERE clause. Different kinds of subqueries produce different results. Some subqueries produce a list of values that is then used as input by the enclosing statement. Other subqueries produce a single value that the enclosing statement then evaluates with a com-parison operator. A third kind of subquery returns a value of True or False.
Nested queries that return sets of rows
To illustrate how a nested query returns a set of rows, suppose that you work for a systems integrator of computer equipment. Your company, Zetec Corporation, assembles systems from components that you buy, and then it sells them to companies and government agencies. You keep track of your business with a relational database. The database consists of many tables, but right now you’re concerned with only three of them: the PRODUCT table, the COMP_USED table, and the COMPONENT table. The PRODUCT table contains a list of all your standard products. The COMPONENT table lists components that go into your products, and the COMP_USED table tracks which components go into each product. The tables are defined as follows:
Table
|
PRODUCT Table
|
|
Column
|
Type
|
Constraints
|
Model
|
Char (6)
|
PRIMARY KEY
|
|
|
|
ProdName
|
Char (35)
|
|
|
|
|
ProdDesc
|
Char (31)
|
|
|
|
|
ListPrice
|
Numeric (9,2)
|
|
|
|
|

Table
|
COMPONENT
Table
|
|
Column
|
Type
|
Constraints
|
CompID
|
CHAR (6)
|
PRIMARY KEY
|
|
|
|
CompType
|
CHAR (10)
|
|
|
|
|
CompDesc
|
CHAR (31)
|
|
Table
|
COMP_USED
Table
|
|
|
|
|
Column
|
Type
|
Constraints
|
Model
|
CHAR (6)
|
FOREIGN KEY (for
|
|
|
PRODUCT)
|
|
|
|
CompID
|
CHAR (6)
|
FOREIGN KEY (for
|
|
|
COMPONENT)
|
A component may be used in multiple products, and a product can contain multiple components (a many-to-many relationship). This situation can cause integrity problems. To circumvent the problems, create the linking table COMP_USED to relate COMPONENT to PRODUCT. A component may appear in many COMP_USED rows, but each COMP_USED row references only one component (a one-to-many relationship). Similarly, a product may appear in many COMP_USED rows, but each COMP_USED row references only one product (another one-to-many relationship). By adding the linking table, a troublesome many-to-many relationship has been transformed into two rela-tively simple one-to-many relationships. This process of reducing the com-plexity of relationships is one example of normalization.
No comments:
Post a Comment