Many people know the importance of creating indexes on SQL Server database tables. Indexes greatly improve the performance of a database. However, while many people create indexes on their SQL Server tables, many don't maintain them properly to ensure queries run as efficiently as possible. I'll begin by giving a quick overview of how SQL Server 2000 stores data and how indexes improve performance. Then, I'll spend quite a bit of time explaining why, when, and how to maintain indexes with DBCC SHOWCONTIG and DBCC INDEXDEFRAG to ensure queries run in the most efficient manner.
Unless a table has a clustered index, SQL Server 2000 stores its data in what is known as a heap. A heap is a collection of data pages containing the rows of a table. The data isn't stored in any particular order, and the data pages themselves aren't in any sequential order. The data is just there with no real form or organization. When SQL Server accesses data in this form, it does a table scan. This means SQL Server starts reading at the beginning of the table and scans every page until it has found all the data that meets the criteria of the query. If a table is very large, this could greatly decrease the performance of queries.
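To see a table scan for yourself, here is a minimal sketch. The table dbo.OrderHistory and its columns are hypothetical names made up for illustration; the table has no indexes, so it is stored as a heap. SET SHOWPLAN_TEXT returns the estimated execution plan instead of running the query.

```sql
-- Hypothetical table with no indexes, so it is stored as a heap.
CREATE TABLE dbo.OrderHistory (
    OrderID    int      NOT NULL,
    CustomerID int      NOT NULL,
    OrderDate  datetime NOT NULL
)
GO

-- Return the estimated plan as text instead of executing the query.
-- SET SHOWPLAN_TEXT must be the only statement in its batch.
SET SHOWPLAN_TEXT ON
GO

-- With no index on CustomerID, the plan should contain a Table Scan operator.
SELECT OrderID, OrderDate
FROM dbo.OrderHistory
WHERE CustomerID = 42
GO

SET SHOWPLAN_TEXT OFF
GO
```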
Indexes speed up the retrieval of data. When planning to create indexes, it is important to understand how the data is used, the types of queries being performed, and the frequency of the queries that are typically performed. An index is far more efficient when the query results return a low percentage of rows and the selectivity is high. High selectivity means a query is written so that it returns the lowest number of rows possible. As a rule, indexes should be created on columns that are commonly searched; this includes primary and foreign keys. It follows that columns that contain few unique values generally should not be indexed, because a search on such a column returns a large share of the table's rows and the index offers little benefit.
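One rough way to judge whether a column is a good index candidate is to compare its number of distinct values to the total number of rows. The sketch below assumes a hypothetical table dbo.OrderHistory with a CustomerID column; the ratio is only a rule-of-thumb indicator, not something SQL Server reports under this name.

```sql
-- Ratio of distinct values to total rows for a candidate column.
-- A ratio near 1.0 suggests high selectivity (few rows per value);
-- a ratio near 0 suggests many rows share each value.
-- NULLIF avoids a divide-by-zero error on an empty table.
SELECT COUNT(DISTINCT CustomerID) * 1.0 / NULLIF(COUNT(*), 0) AS SelectivityRatio
FROM dbo.OrderHistory
GO
```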
There are two types of indexes to consider when planning: Non-Clustered and Clustered Indexes.
A non-clustered index stores data in a way comparable to the index of a textbook. The index is created in a different location than the actual data. The structure holds the index keys along with pointers to the actual location of the data. Non-clustered indexes should be created on columns where the selectivity of the query ranges from highly selective to unique. These indexes are useful when multiple ways to search the data are desired.
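As a minimal sketch, a non-clustered index on a selective column could be created like this. The table dbo.OrderHistory, the column CustomerID, and the index name are hypothetical.

```sql
-- Create a non-clustered index on a commonly searched column.
-- The index is stored separately from the table's data pages.
CREATE NONCLUSTERED INDEX IX_OrderHistory_CustomerID
ON dbo.OrderHistory (CustomerID)
GO

-- A selective lookup like this one can use the index to locate
-- the matching rows instead of scanning the whole table.
SELECT OrderID, OrderDate
FROM dbo.OrderHistory
WHERE CustomerID = 42
GO
```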
A clustered index stores data much like a phone directory, where all people with the same last name are grouped together. SQL Server can quickly search a table with a clustered index because the index itself determines the sequence in which rows are stored in the table. Clustered indexes are useful for columns that are frequently searched for ranges of values or are accessed in sorted order.
Each table can have only one clustered index; however, up to 249 non-clustered indexes can be added per table. For more information on how Clustered and Non-Clustered indexes store data, visit http://www.sql-server-performance.com/gv_index_data_structures.asp
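A clustered index on a column that is searched by range might look like the following sketch. Again, dbo.OrderHistory, OrderDate, and the index name are hypothetical, and the example assumes the table does not already have a clustered index.

```sql
-- Create the table's single clustered index on a column searched by range.
-- The rows themselves are stored in OrderDate order.
CREATE CLUSTERED INDEX IX_OrderHistory_OrderDate
ON dbo.OrderHistory (OrderDate)
GO

-- A range query that benefits from rows being stored in OrderDate order.
SELECT OrderID, CustomerID, OrderDate
FROM dbo.OrderHistory
WHERE OrderDate BETWEEN '20030101' AND '20030131'
ORDER BY OrderDate
GO
```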
While I could go on and on about how SQL Server 2000 stores and accesses data in a heap and in an Index architecture, I will move on to discuss maintaining indexes with DBCC SHOWCONTIG and DBCC INDEXDEFRAG.
Once indexes have been created, it is important to maintain them to ensure the best possible performance. If indexes are not maintained, over time the data will become fragmented. Fragmentation is the inefficient use of pages within an index. There are a number of tools available that help optimize indexes; however, I will only discuss DBCC SHOWCONTIG and DBCC INDEXDEFRAG in this article.
The DBCC SHOWCONTIG command provides fragmentation information on the data and indexes of a specified table, and it also shows how full the data and index pages are. If a page is full, SQL Server must split the page to make room for new rows. This statement should be run on heavily modified tables, tables that contain imported data, or tables that seem to cause poor query performance. When the statement is executed, here is what will be returned:
Statistic: Description
Pages Scanned: Number of pages in the table or index.
Extents Scanned: Number of extents in the table or index.
Extent Switches: Number of times the DBCC statement moved from one extent to another while it traversed the pages of the table or index.
Avg. Pages per Extent: Number of pages per extent in the page chain.
Scan Density [Best Count: Actual Count]: Best count is the ideal number of extent changes if everything is contiguously linked. Actual count is the actual number of extent changes. Scan density is a percentage: it is 100 if everything is contiguous; if it is less than 100, some fragmentation exists.
Logical Scan Fragmentation: Percentage of out-of-order pages returned from scanning the leaf pages of an index. This number is not relevant to heaps and text indexes. An out-of-order page is one for which the next page indicated in an IAM is a different page than the page pointed to by the next page pointer in the leaf page.
Extent Scan Fragmentation: Percentage of out-of-order extents in scanning the leaf pages of an index. This number is not relevant to heaps. An out-of-order extent is one for which the extent containing the current page for an index is not physically the next extent after the extent containing the previous page for an index.
Avg. Bytes free per page: Average number of free bytes on the pages scanned. The higher the number, the less full the pages are. Lower numbers are better. This number is also affected by row size; a large row size can result in a higher number.
Avg. Page density (full): Average page density (as a percentage). This value takes into account row size, so it is a more accurate indication of how full your pages are. The higher the percentage, the better.
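The statistics above come from running the command against a table. Here is a minimal sketch, assuming a hypothetical table named OrderHistory in the current database:

```sql
-- Report fragmentation for the table's data pages (or its clustered index).
DBCC SHOWCONTIG ('OrderHistory')
GO

-- Report on every index of the table, returned as a rowset
-- that is easier to store or filter than the text report.
DBCC SHOWCONTIG ('OrderHistory') WITH TABLERESULTS, ALL_INDEXES
GO
```

Scan density is simply the best count divided by the actual count. With illustrative numbers, a best count of 13 and an actual count of 26 give 13 / 26 = 50 percent, which indicates that the scan crossed twice as many extents as an ideal layout would require.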
The DBCC INDEXDEFRAG command defragments the leaf level of a specified index so that the physical order of the pages matches their logical order. It also compacts the pages according to the fill factor that was specified when the index was created. (Rebuilding an index, or all indexes on a table, with a new fill factor is the job of DBCC DBREINDEX.) The fillfactor option reduces the number of page splits per data or index page, which improves performance on insert and update statements: if a data page is full, SQL Server must split the page to make room for the new rows. The fillfactor specifies what percentage of each page is filled when the index is built, leaving the rest available for inserts and updates.
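The following sketch shows what these maintenance commands look like in practice. The table OrderHistory and the index IX_OrderHistory_CustomerID are hypothetical names. Note that DBCC INDEXDEFRAG itself takes no fill factor argument, so the second statement uses DBCC DBREINDEX to illustrate a fill factor of 80; whether 80 is a sensible value depends on how heavily the table is modified.

```sql
-- Defragment the leaf level of one index.
-- The first argument is the database; 0 means the current database.
DBCC INDEXDEFRAG (0, OrderHistory, IX_OrderHistory_CustomerID)
GO

-- Rebuild the same index, filling each leaf page to 80 percent
-- so that 20 percent stays free for future inserts and updates.
DBCC DBREINDEX ('OrderHistory', 'IX_OrderHistory_CustomerID', 80)
GO
```

Running DBCC SHOWCONTIG before and after either command is a simple way to check whether scan density and logical scan fragmentation actually improved.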