SQL Series...12

Building a Multitable Relational Database

A database must do more than merely hold your data. It must also protect the data from becoming corrupted. In the latter part of this article, I discuss how to protect the integrity of your data. Normalization is one of the key meth-ods you can use to protect the integrity of a database. I discuss the various “normal” forms and point out the kinds of problems that normalization solves.

Designing the Database

To design a database, follow these basic steps (I go into detail about each step in the sections that follow this list):

1. Decide what objects you want to include in your database.

2. Determine which of these objects should be tables and which should be columns within those tables.

3. Define tables according to your determination of how you need to organize the objects.

Optionally, you may want to designate a table column or a combination of columns as a key. Keys provide a fast way of locating a row of interest in a table.

The following sections discuss these steps in detail, as well as some other technical issues that arise during database design.

Step 1: Defining objects

The first step in designing a database is deciding which aspects of the system are important enough to include in the model. Treat each aspect as an object and create a list containing the names of all the objects you can think of. At this stage, don’t try to decide how these objects relate to each other. Just try to list them all.

You may find it helpful to gather a team of people who are familiar with the system you’re modeling. These people can brainstorm and respond to each other’s ideas. Working together, you’ll probably develop a more complete and accurate set of objects.

When you have a reasonably complete set of objects, move on to the next step: deciding how these objects relate to each other. Some of the objects are major entities, crucial to giving you the results that you want. Others are sub-sidiary to those major entities. You ultimately may decide that some objects don’t belong in the model at all.

Step 2: Identifying tables and columns

Major entities translate into database tables. Each major entity has a set of associated attributes, which translate into the table columns. Many business databases, for example, have a CUSTOMER table that keeps track of cus-tomers’ names, addresses, and other permanent information. Each attribute of a customer, such as name, street, city, state, zip code, phone number, and e-mail address, becomes a column in the CUSTOMER table.

No rules exist about what to identify as tables and which of the attributes in the system belong to which table. You may have some reasons for assigning a particular attribute to one table and other reasons for assigning the attribute to another table. You must make a judgment based on what information you want to get from the database and how you want to use that information.

In deciding how to structure database tables, involve the future users of the database as well as the people who will make decisions based on database information. If the “reasonable” structure you arrive at isn’t consistent with the way that people will use the information, your system will turn out to be frustrating to use at best — and could even produce wrong information, which is even worse. Don’t let this happen! Put careful effort into deciding how to structure your tables.

Take a look at an example to demonstrate the thought process that goes into creating a multitable database. Say that you just established VetLab, a clinical microbiology laboratory that tests biological specimens sent in by veterinari-ans. You want to track several things, such as the items in the following list:

Clients

Tests that you perform Employees

Orders Results

Each of these entities has associated attributes. Each client has a name, address, and other contact information. Each test has a name and a standard charge. Employees have contact information as well as a job classification and pay rate. For each order, you need to know who ordered it, when it was ordered, and what test was ordered. For each test result, you need to know the outcome of the test, whether the results were preliminary or final, and the test order number.

Step 3: Defining tables

Now you want to define a table for each entity and a column for each attribute. Table 5-1 shows how you may define the VetLab tables.

Table	VetLab Tables
*Table*	*Columns*
CLIENT	Client Name

	Address 1

	Address 2

	City

Table	Columns
	State

	Postal Code

	Phone

	Fax

	Contact Person

TESTS	Test Name

	Standard Charge

EMPLOYEE	Employee Name

	Address 1

	Address 2

	City

	State

	Postal Code

	Home Phone

	Office Extension

	Hire Date

	Job Classification

	Hourly/Salary/Commission

ORDERS	Order Number

	Client Name

	Test Ordered

	Responsible Salesperson

	Order Date

RESULTS	Result Number

	Order Number

	Result

	Date Reported

	Preliminary/Final

You can create the tables defined in Table by using either a rapid applica-tion development (RAD) tool or by using SQL’s Data Definition Language (DDL), as shown in the following code:

CREATE TABLE CLIENT (
ClientName	CHARACTER (30)	NOT NULL,
Address1	CHARACTER (30),
Address2	CHARACTER (30),
City	CHARACTER (25),
State	CHARACTER (2),
PostalCode	CHARACTER (10),
Phone	CHARACTER (13),
Fax	CHARACTER (13),
ContactPerson	CHARACTER (30) ) ;
CREATE TABLE TESTS (
TestName	CHARACTER (30)	NOT NULL,
StandardCharge	CHARACTER (30) ) ;
CREATE TABLE EMPLOYEE (
EmployeeName	CHARACTER (30)	NOT NULL,
Address1	CHARACTER (30),
Address2	CHARACTER (30),
City	CHARACTER (25),
State	CHARACTER (2),
PostalCode	CHARACTER (10),
HomePhone	CHARACTER (13),
OfficeExtension	CHARACTER (4),
HireDate	DATE,
JobClassification	CHARACTER (10),
HourSalComm	CHARACTER (1) ) ;
CREATE TABLE ORDERS (
OrderNumber	INTEGER	NOT NULL,
ClientName	CHARACTER (30),
TestOrdered	CHARACTER (30),
Salesperson	CHARACTER (30),
OrderDate	DATE ) ;
CREATE TABLE RESULTS (
ResultNumber	INTEGER	NOT NULL,
OrderNumber	INTEGER,
Result	CHARACTER(50),
DateReported	DATE,
PrelimFinal	CHARACTER (1) ) ;

These tables relate to each other by the attributes (columns) that they share, as the following list describes:

The CLIENT table links to the ORDERS table by the ClientName column.

The TESTS table links to the ORDERS table by the TestName (TestOrdered) column.

The EMPLOYEE table links to the ORDERS table by the EmployeeName (Salesperson) column.

The RESULTS table links to the ORDERS table by the OrderNumber column.

For a table to serve as an integral part of a relational database, link that table to at least one other table in the database by using a common column. Figure 5-1 illustrates the relationships between the tables.

The links in Figure 5-1 illustrate four different one-to-many relationships. The single arrowhead points to the “one” side of the relationship, and the double arrowhead points to the “many” side.

One client can make many orders, but each order is made by one, and only one, client.

Each test can appear on many orders, but each order calls for one, and only one, test.

Each order is taken by one, and only one, employee (or salesperson), but each salesperson can (and, you hope, does) take multiple orders.

Each order can produce several preliminary test results and a final result, but each result is associated with one, and only one, order.

As you can see in the figure, the attribute that links one table to another can have a different name in each table. Both attributes must, however, have matching data types.

Domains, character sets, collations, and translations

Although tables are the main components of a database, additional elements play a part, too. In previous article, I define the domain of a column in a table as the set of all values that the column may assume. Establishing clear-cut domains for the columns in a table, through the use of constraints, is an important part of designing a database.

People who communicate in standard American English aren’t the only ones who use relational databases. Other languages — even some that use other character sets — work equally well. Even if your data is in English, some appli-cations may still require a specialized character set. SQL:2003 enables you to specify the character set you want to use. In fact, you can use a different char-acter set for each column in a table. This flexibility is generally unavailable in languages other than SQL.

A collation, or collating sequence, is a set of rules that determines how strings in a character set compare with one another. Every character set has a default collation. In the default collation of the ASCII character set, A comes before B, and B comes before C. A comparison, therefore, considers A as less than B and considers C as greater than B. SQL:2003 enables you to apply different collations to a character set. Again, this degree of flexibility isn’t generally available in other languages.

Sometimes, you encode data in a database in one character set, but you want to deal with the data in another character set. Perhaps you have data in the German character set, for example, but your printer doesn’t support German characters that the ASCII character set doesn’t include. A translation is a SQL:2003 facility that enables you to translate character strings from one character set to another. The translation may translate one character into two, such as a German ü to an ASCII ue, or the translation may translate lowercase characters to uppercase. You can even translate one alphabet into another, such as Hebrew into ASCII.

Getting into your database fast with keys

A good rule for database design is to make sure that every row in a database table is distinguishable from every other row; each row should be unique. Sometimes, you may want to extract data from your database for a specific

purpose, such as a statistical analysis, and in doing so, you create tables where rows aren’t necessarily unique. For your limited purpose, this sort of duplication doesn’t matter. Tables that you may use in more than one way, however, should not contain duplicate rows.

A key is an attribute or a combination of attributes that uniquely identifies a row in a table. To access a row in a database, you must have some way of dis-tinguishing that row from all the other rows. Because keys must be unique, they provide such an access mechanism. Furthermore, a key must never con-tain a null value. If you use null keys, two rows that each contain a null key field may not be distinguishable from each other.

In the veterinary lab example, you can designate appropriate columns as keys. In the CLIENT table, ClientName is a good key. This key can distinguish each client from all others. Entering a value in this column becomes manda-tory for every row in the table. TestName and EmployeeName make good keys for the TESTS and EMPLOYEE tables. OrderNumber and ResultNumber make good keys for the ORDERS and RESULTS tables. Make sure that you enter a unique value for every row.

You can have two kinds of keys: primary keys and foreign keys. The keys that I discuss in the preceding paragraph are primary keys. Primary keys guarantee uniqueness. I discuss primary and foreign keys in the next two sections.

Primary keys

To incorporate the idea of keys into the VetLab database, you can specify the primary key of a table as you create the table. In the following example, a single column is sufficient (assuming that all of VetLab’s clients have unique names):

CREATE TABLE CLIENT (
ClientName	CHARACTER (30)	PRIMARY KEY,
Address1	CHARACTER (30),
Address2	CHARACTER (30),
City	CHARACTER (25),
State	CHARACTER (2),
PostalCode	CHARACTER (10),
Phone	CHARACTER (13),
Fax	CHARACTER (13),
ContactPerson	CHARACTER (30)
) ;

The constraint PRIMARY KEY replaces the constraint NOT NULL, given in the earlier definition of the CLIENT table. The PRIMARY KEY constraint implies the NOT NULL constraint, because a primary key can’t have a null value.

Although most DBMSs will allow you to create a table without one, all tables in a database should have a primary key. With that in mind, replace the NOT NULL constraint in the TESTS, EMPLOYEE, ORDERS, and RESULTS tables with the PRIMARY KEY constraint, as in the following example:

CREATE TABLE TESTS (

TestName CHARACTER (30) PRIMARY KEY,

StandardCharge CHARACTER (30) ) ;

Sometimes, no single column in a table can guarantee uniqueness. In such cases, you can use a composite key. A composite key is a combination of columns that, together, guarantee uniqueness. Imagine that some of VetLab’s clients are chains that have offices in several cities. ClientName isn’t suffi-cient to distinguish two branch offices of the same client. To handle this situ-ation, you can define a composite key as follows:

CREATE TABLE CLIENT (

ClientName CHARACTER (30) NOT NULL,

Address1 CHARACTER (30),

Address2 CHARACTER (30),

City CHARACTER (25) NOT NULL,

State CHARACTER (2),

PostalCode CHARACTER (10),

Phone CHARACTER (13),

Fax CHARACTER (13),

ContactPerson CHARACTER (30),

CONSTRAINT BranchPK

PRIMARY KEY

(ClientName, City)

) ;

Foreign keys

A foreign key is a column or group of columns in a table that correspond to or reference a primary key in another table in the database. A foreign key doesn’t have to be unique, but it must uniquely identify the column(s) in the table that the key references.

If the ClientName column is the primary key in the CLIENT table, every

row in the CLIENT table must have a unique value in the ClientName column. ClientName is a foreign key in the ORDERS table. This foreign key corresponds to the primary key of the CLIENT table, but the key doesn’t have to be unique in the ORDERS table. In fact, you hope the foreign key isn’t unique. If each of your clients gave you only one order and then never ordered again, you’d go out of business rather quickly. You hope that many rows in the ORDERS table correspond with each row in the CLIENT table, indicating that nearly all your clients are repeat customers.

The following definition of the ORDERS table shows how you can add the concept of foreign keys to a CREATE statement:

CREATE TABLE ORDERS (

OrderNumber INTEGER

ClientName CHARACTER (30),

TestOrdered CHARACTER (30),

Salesperson CHARACTER (30),

OrderDate DATE,

PRIMARY KEY,

CONSTRAINT BRANCHFK FOREIGN KEY (ClientName)

REFERENCES CLIENT (ClientName),

CONSTRAINT TestFK FOREIGN KEY (TestOrdered)

REFERENCES TESTS (TestName),

CONSTRAINT SalesFK FOREIGN KEY (Salesperson)

REFERENCES EMPLOYEE (EmployeeName)

) ;

Foreign keys in the ORDERS table link that table to the primary keys of the CLIENT, TESTS, and EMPLOYEE tables.

IT Skills

Tags

Search This Blog

Wednesday, 23 May 2018

SQL Series...12

Building a Multitable Relational Database

Designing the Database

Step 1: Defining objects

Step 2: Identifying tables and columns

Clients

Step 3: Defining tables

Domains, character sets, collations, and translations

Getting into your database fast with keys

Primary keys

Foreign keys

PRIMARY KEY,

No comments:

Post a Comment

Report Abuse

Popular

About Me

Archive

Comments

Tags

Contact Form

Labels

Categories

Trending

Recent