SQL Series...27 - IT Skills

This blog is for IT lovers, accounts and finance executives who want to add value in their professional life.

test

Welcome to our blog. This blog is for IT lovers, Accounts/Finance Executives who wants to add value in their career path.

Search This Blog

Thursday, 7 June 2018

SQL Series...27

test

Relational Operators

Because the data in a relational database is distributed across multiple tables, a query usually draws data from more than one table. SQL:2003 has operators that combine data from multiple sources into a single result table. These are the UNION, INTERSECTION, and EXCEPT operators, as well as a family of JOIN operators. Each operator combines data from multiple tables in a different way.

UNION


The UNION operator is the SQL implementation of relational algebra’s union operator. The UNION operator enables you to draw information from two or more tables that have the same structure. Same structure means

The tables must all have the same number of columns.

Corresponding columns must all have identical data types and lengths.


When these criteria are met, the tables are union compatible. The union of two tables returns all the rows that appear in either table and eliminates duplicates.

Say that you create a baseball statistics database. It contains two union-compatible tables named AMERICAN and NATIONAL. Both tables have three columns, and corresponding columns are all the same type. In fact, corresponding columns have identical column names (although this condition isn’t required for union compatibility).

NATIONAL lists the names and number of complete games pitched by National League pitchers. AMERICAN lists the same information about pitchers in the American League. The UNION of the two tables gives you a virtual result table containing all the rows in the first table plus all the rows in the second table. For this example, I put just a few rows in each table to illustrate the operation:


SELECT * FROM NATIONAL ;
FirstName LastName CompleteGames
--------- -------- -------------
Sal Maglie 11
Don Newcombe 9
Sandy Koufax 13
Don Drysdale 12
SELECT * FROM AMERICAN ;
FirstName LastName CompleteGames
--------- -------- -------------
Whitey Ford 12
Don Larson 10
Bob Turley 8
Allie Reynolds 14

SELECT * FROM NATIONAL

UNION
SELECT * FROM AMERICAN ;

FirstName LastName CompleteGames
--------- -------- -------------
Allie Reynolds 14
Bob Turley 8
Don Drysdale 12
Don Larson 10
Don Newcombe 9
Sal Maglie 11
Sandy Koufax 13
Whitey Ford 12

The UNION DISTINCT operator functions identically to the UNION operator without the DISTINCT keyword. In both cases duplicate rows are eliminated from the result set.


I’ve been using the asterisk (*) as shorthand for all the columns in a table. This shortcut is fine most of the time, but it can get you into trouble when you use relational operators in embedded or module-language SQL. What if you add one or more new columns to one table and not to another, or you
add different columns to the two tables? The two tables are then no longer union-compatible, and your program will be invalid the next time it’s recom-piled. Even if the same new columns are added to both tables so that they are still union-compatible, your program is probably not prepared to deal with this additional data. So, explicitly listing the columns that you want rather than relying on the * shorthand is generally a good idea. When you’re enter-ing ad hoc SQL from the console, the asterisk will probably work fine because you can quickly display table structure to verify union compatibility if your query isn’t successful.


As mentioned previously, the UNION operation normally eliminates any dupli-cate rows that result from its operation, which is the desired result most of the time. Sometimes, however, you may want to preserve duplicate rows. On those occasions, use UNION ALL.

Referring to the example, suppose that “Bullet” Bob Turley had been traded in midseason from the New York Yankees in the American League to the Brooklyn Dodgers in the National League. Now suppose that during the season, he pitched eight complete games for each team. The ordinary UNION displayed in the example throws away one of the two lines containing Turley’s data. Although he seemed to pitch only eight complete games in the season, he actually hurled a remarkable 16 complete games. The following query gives you the true facts:


SELECT * FROM NATIONAL

UNION ALL
SELECT * FROM AMERICAN ;

You can sometimes form the UNION of two tables even if they are not union-compatible. If the columns you want in your result table are present and com-patible in both tables, you can perform a UNION CORRESPONDING operation. Only the specified columns are considered, and they are the only columns displayed in the result table.

Baseball statisticians keep different statistics on pitchers than they keep on outfielders. In both cases, first name, last name, putouts, errors, and fielding percentage are recorded. Outfielders, of course, don’t have a won/lost record, a saves record, or a number of other things that pertain only to pitching. You can still perform a UNION that takes data from the OUTFIELDER table and from the PITCHER table to give you some overall information about defensive skill:


SELECT *

FROM OUTFIELDER
UNION CORRESPONDING
(FirstName, LastName, Putouts, Errors, FieldPct)
SELECT *
FROM PITCHER ;
The result table holds the first and last names of all the outfielders and pitchers, along with the putouts, errors, and fielding percentage of each player. As with the simple UNION, duplicates are eliminated. Thus, if a player spent some time in the outfield and also pitched in one or more games, the UNION CORRESPONDING operation loses some of his statistics. To avoid this problem, use UNION ALL CORRESPONDING.


Each column name in the list following the CORRESPONDING keyword must be a name that exists in both unioned tables. If you omit this list of names, an implicit list of all names that appear in both tables is used. But this implicit list of names may change when new columns are added to one or both tables. Therefore, explicitly listing the column names is better than omitting them.

No comments:

Post a Comment

Popular

Welcome to our blog. If you want to; learn writing skills, preparation for CSS/Corporate laws, IT skills, downloading Business/IT books, and many more; this blog is for you.