SQL Series...13 - IT Skills

This blog is for IT lovers, accounts and finance executives who want to add value in their professional life.

test

Welcome to our blog. This blog is for IT lovers, Accounts/Finance Executives who wants to add value in their career path.

Search This Blog

Thursday, 24 May 2018

SQL Series...13

test

Working with Indexes


The SQL:2003 specification doesn’t address the topic of indexes, but that omission doesn’t mean that indexes are rare or even optional parts of a data-base system. Every SQL implementation supports indexes, but no universal agreement exists on how to support them. In previous article, I show you how to create an index by using Microsoft Access, a rapid application development (RAD) tool. You must refer to the documentation for your particular database management system (DBMS) to see how the system implements indexes.


What’s an index, anyway?


Data generally appears in a table in the order in which you originally entered the information. That order may have nothing to do with the order in which you later want to process the data. Say, for example, that you want to process your CLIENT table in ClientName order. The computer must first sort the table in ClientName order. These sorts take time. The larger the table, the longer the sort takes. What if you have a table with 100,000 rows? Or a table with a million rows? In some applications, such table sizes are not rare. The best sort algorithms would have to make some 20 million comparisons and millions of swaps to put the table in the desired order. Even with a very fast computer, you may not want to wait that long.


Indexes can be a great timesaver. An index is a subsidiary or support table that goes along with a data table. For every row in the data table, you have a corresponding row in the index table. The order of the rows in the index table is different.
Table

CLIENT Table


ClientName
Address1
Address2
City
State
Butternut
5 Butternut Lane

Hudson
NH
Animal Clinic









Amber
470 Kolvir Circle

Amber
MI
Veterinary, Inc.









Vets R Us
2300 Geoffrey Road
Suite 230
Anaheim
CA





Doggie Doctor
32 Terry Terrace

Nutley
NJ





The Equestrian
Veterinary
7890 Paddock
Gallup
NM
Center

Parkway







Dolphin Institute
1002 Marine Drive

Key West
FL





J. C. Campbell,
2500 Main Street

Los Angeles
CA
Credit Vet









Wenger’s
15 Bait Boulevard

Sedona
AZ
Worm Farm










 The rows are not in alphabetical order by ClientName. In fact, they aren’t in any useful order at all. The rows are simply in the order in which somebody entered the data.

An index for this CLIENT table may look like Table 

Table
Client Name Index for the CLIENT Table
ClientName
Pointer to Data Table
Amber Veterinary, Inc.
2


Butternut Animal Clinic
1


Doggie Doctor
4


Dolphin Institute
6


J. C. Campbell, Credit Vet
7


The Equestrian Center
5


Vets R Us
3


Wenger’s Worm Farm
8

The index contains the field that forms the basis of the index (in this case, ClientName) and a pointer into the data table. The pointer in each index row gives the row number of the corresponding row in the data table.

Why would I want an index?


If I want to process a table in ClientName order, and I have an index arranged in ClientName order, I can perform my operation almost as fast as I could if the data table itself was in ClientName order. I can work through the index sequentially, moving immediately to each index row’s corresponding data record by using the pointer in the index.

If you use an index, the table processing time is proportional to N, where N is the number of records in the table. Without an index, the processing time for the same operation is proportional to N lg N, where lg N is the logarithm of N to the base 2. For small tables, the difference is insignificant, but for large tables, the difference is great. On large tables, some operations aren’t practical to perform without the help of indexes.

As an example, say that you have a table containing 1,000,000 records (N = 1,000,000), and processing each record takes one millisecond (one-thousandth of a second). If you have an index, processing the entire table takes only 1,000 seconds — less than 17 minutes. Without an index, you need to go through the table approximately 1,000,000 × 20 times to achieve the same result. This process would take 20,000 seconds — more than five and a half hours. I think you can agree that the difference between 17 minutes and five and a half hours is substantial. That’s the difference that indexing makes on processing records.

Maintaining an index


After you create an index, something must maintain it. Fortunately, your DBMS maintains your indexes for you by updating them every time you update the corresponding data tables. This process takes some extra time, but it’s worth it. After you create an index and your DBMS maintains it, the index is always available to speed up your data processing, no matter how many times you need to call on it.


The best time to create an index is at the same time you create its corre-sponding data table. If you create the index at the start and begin maintaining it at the same time, you don’t need to undergo the pain of building the index later, with the entire operation taking place in a single, long session. Try to anticipate all the ways that you may want to access your data and then create an index for each possibility.
Some DBMS products give you the capability to turn off index maintenance. You may want to do so in some real-time applications where updating indexes takes a great deal of time and you have precious little to spare. You may even elect to update the indexes as a separate operation during off-peak hours.

Don’t fall into the trap of creating an index for retrieval orders that you’re unlikely ever to use. Index maintenance is an extra operation that the com-puter must perform every time it modifies the index field or adds or deletes a data table row, which affects performance. For optimal performance, create only those indexes that you expect to use as retrieval keys — and only for tables containing a large number of rows. Otherwise, indexes can degrade performance.



You may need to compile something such as a monthly or quarterly report that requires the data in an odd order that you don’t ordinarily need. Create an index just before running that periodic report, run the report, and then drop the index so that the DBMS isn’t burdened with maintaining the index during the long period between reports.


No comments:

Post a Comment

Popular

Welcome to our blog. If you want to; learn writing skills, preparation for CSS/Corporate laws, IT skills, downloading Business/IT books, and many more; this blog is for you.