Indexing Guidelines

There's not a single best way to assure optimal performance through indexes. However, there are some guidelines to help.

- Keep indexes narrow. Build indexes on few columns at most. Wide indexes take longer to scan than narrow indexes.

- Create clustered index on every table as possible. Choose the column(s) for the clustered index which is used most frequently for retrieving data, and the indexing column(s) should be seldom being updated.

- Try to create the clustered index on the column with high selectivity (not have many duplicate values, better unique).

- Try to create the clustered index on the column that will never be updated. Every time the clustered index key is updated, SQL Server must maintain not just the clustered index but also the non-clustered indexes since non-clustered indexes contain a pointer to the clustered index. This is yet another reason why you should keep index narrow.

- By default, SQL Server creates the clustered index on the PRIMARY KEY. It is often beneficial to have the clustered index on the primary key, because it's unique and doesn't contain NULL values.

- SQL Server has to perform some maintenance when index keys are updated, so each additional index could impose a slight performance penalty on the performance of DML statements. So if you sure that an index is useless, just remove it.

- Eliminate duplicate indexes, that is, multiple indexes created on the same set of columns. Such indexes provide no additional benefit but use up extra disk space and hinder performance of DML statements.

- Check the default fill factor configured at the server level. If you don't specify the FILLFACTOR option with CREATE INDEX statement, every new index will be created using the default fill factor. This may or may not be what you intend.

- Create non-clustered indexes on foreign keys. SQL Server doesn't create any indexes on foreign keys automatically. Those columns are mostly being used in table join.

- Try to create non-clustered indexes with included columns to increase the possibility of covered query. The non-key columns are stored at the leaf level of the index b-tree. When a query is covered by an index, no further bookmark lookup is required on the table, thus improving query performance.

- Try to create filtered indexes, which offer great benefits in terms of query execution performance and index storage savings. A filtered index is an optimized non-clustered index especially suited to cover queries that select from a well-defined subset of data. It uses a filter predicate to index a portion of rows in the table. The filtered statistics are more accurate than full-table statistics because they cover only the rows in the filtered index. Index maintenance cost in response to DML is reduced because it is smaller and is only maintained when the data in the index is changed, which is only a subset of the full table data.

- Peruse the built-in Database Engine Tuning Advisor (DETA) for analyzing and recommending indexes but do NOT rely on it solely. DETA allows you to choose from a number of tuning options and provides recommendations for improving performance by creating indexes and statistics. Even so you as the DBA still has to decide which queries are most important to optimize, and the suggested indexes provided by DETA may have duplicated indexes that needs your attention before applying them.

- List the equality columns first (leftmost in the column list). List the inequality columns after the equality columns (to the right of equality columns listed). List the include columns in the INCLUDE clause of the CREATE INDEX statement.

- A good rule for the order of columns in an index key is to put the most selective columns first. It's only useful when the columns in the where/join are a left-based subset of the index key columns. (Ref.: Index Selectivity and Column Order)
- If possible, specify the column as NOT NULL, it can have a huge performance difference. (Ref.: Nullable columns and performance)

No comments:

Post a Comment