Structure and Concepts used:
The RID is built from the file identifier (ID), page number, and number of the
row on the page. When a nonclustered index is created on a heap, the index
contains the keys of the index and the RID.
Data Manipulation Language. This is a term that alters data in a relational
table or index. Examples of DML are insert, delete, and update operations.
Partitioning: A new feature in SQL Server that allows data to be
horizontally partitioned across filegroups by using a specified partitioning
scheme. Indexes may also be partitioned and can be broadly categorized in two
An aligned index uses the same
partition scheme and partitioning column as the partitioned table.
A nonaligned index does not use the same partition scheme and
SQL Server uses temporary storage while
creating or rebuilding an index for sorting and other intermediate tasks.
SQL Server uses SQL memory for this work but disk space is used when there
is insufficient memory. The amount of memory requested for an index operation
is controlled by the index create
memory option. For large index operations, it is common that memory
will not be sufficient, and disk space will be used.
When disk space is required for temporary
storage, it can be used from the space allocated to the user database, or it
can be used from the space allocated to the tempdb database. By default, space
allocated to the user database is used. However, when the SORT_IN_TEMPDB option
of the CREATE INDEX or ALTER INDEX statement is set to ON, the tempdb database
is used for temporary storage space. The amount of temporary storage space
required is the same, regardless of its location. Tempdb has more advantages
than using Userdb for index creation.
Phases of Online Index
To get the idea for a use of brilliant feature
in SQL server – online index feature, it is important to understand some of the
internal processes involved in the online operation.
The online index build process can be roughly
divided into three phases:
The following illustration shows the steps
perform during the three phases of online index creation.
The primary purpose of the preparation phase is
to create the index metadata and recompile all DML plans to take the new index
into account. In addition, a snapshot of the table is defined that is row
versioning is used to provide transaction-level read consistency. The preparation
phase is roughly divided into the following actions.
1. Locks are acquired
Transaction acquires a data-share S lock
on the table or indexed view for the duration of this phase. Row versioning is used to provide transaction-level read
consistency. This isolates the index operation from the effects of
modifications that are made by other transactions and removes the need for
requesting share locks on rows that have been read. Concurrent user update and
delete operations during online index operations require space for version
records in tempdb.
The index-operation lock resource is
visible in the sys.dm_tran_locks
dynamic management view as resource_type = ‘Object’ and resource_subtype =
Metadata is created
The logical metadata describing the new
index is created and held only in memory in the metadata cache until the final
phase. This is done to avoid violating ID and name uniqueness constraints in
the system catalog. For example, when an index is rebuilt online, two versions
of the same index with the same name and ID exist during the rebuild operation.
The original index and the new in-build index. The new index remains in an
“in-build” state until the final phase of the online index operation.
From the user’s point of view, the new index is not visible in the sys.indexes catalog view until the
index operation has completed.
The physical metadata of the index, that
is, the definition of the B-tree structure and allocation units, is persisted
in the system catalog in this phase and is visible via the sys.partitions and sys.allocation_units
DML plans are recompiled
After the index metadata is created,
online-version number is incremented before making this new index visible to
concurrent connections. The version number change causes all cached DML plans
for that table or indexed view to recompile so that the new index, as well as
the original index, are maintained going forward. Remember that an S lock was
acquired at the beginning of the preparation phase. While this lock allows read
operations, write operations are blocked, so no DML queries are actually
running at this point. Select operations are not affected by the version change
because these queries access the original index and simply ignore the
Build phase is all about populating the new
index with sorted data from the existing data source while allowing select and
DML operations to continue. Thus, the build phase begins by activating row
versioning and releasing the S lock acquired in the preparation phase.
Select operations access only the original data
source and any preexisting indexes. Concurrent insert, update, and delete
operations are applied to both the original data source, any preexisting
indexes, and any new indexes being built.
During the build phase, empty index structure is
populated by scanning and sorting the source data and bulk inserting the data
into the new index. Numerous execution plans are possible, depending on the
index operation being performed and the data distribution statistics. The query
optimizer chooses the plan that is best suited to the type of index that is
being created or rebuilt. Note that during a rebuild operation of a clustered
or nonclustered index, the existing index is used for the base scan, making a
sort operation unnecessary. This saves disk space and CPU resources.
Index and Clustered Index Operations
When a clustered index is created, dropped, or
rebuilt online, a temporary, nonclustered index, called a mapping index, is
created in the build phase to map old bookmarks to new bookmarks. A bookmark is
simply the row identifier. For example, when creating a clustered index from a
heap, the old bookmark is the row ID (RID) of the heap and the new bookmark is
the clustering key of the index. The mapping index is used by concurrent DML transactions
to determine which records to delete in the new indexes that are being built
when rows in the underlying table are updated or deleted. The mapping index is
discarded when the index build process commits.
The mapping index contains one record for each
row in the table, and it contains the index key columns along with the old and
new record identifiers. The mapping index is not used if the clustered index
operation is performed offline, or when the operation is on a nonclustered
If the index operation creates or drops a
clustered index and there are nonclustered indexes on the table, the
preparation and build phases are repeated twice; once for the new clustered
index and again for the nonclustered indexes.
The last phase of the online index operation is
to declare to the system that the new index is “ready” and also to
clean up any previous states. This phase is roughly divided into the following
The IS lock on the table is
upgraded to either a schema-modify SCH_M or data-share S depending on the type
of online index operation. An S lock is required only when a new nonclustered
index is created. If an SCH_M lock is obtained, conflicting index operations are
prevented from starting by using the special index-operation lock resource.
If a clustered index is
created, all nonclustered indexes are rebuilt in a single operation and the
previous heap is dropped. Similarly, if an index is rebuilt instead of created
new, the old version of the index is dropped. Apart from locking the table, the
new index itself is also locked in SCH_M mode during this phase.
The index lock resource is visible in the sys.dm_tran_locks dynamic
management view as resource_type = ‘METADATA’ and resource_subtype = ‘INDEXSTATS’.
This index lock prevents the new index from being used by others until the
index builder transaction is committed. All these locks in the final phase are
acquired in the index builder transaction and are held until the end of the
transaction. Hence, the table is not available during the final phase until the
index builder transaction is committed and locks are released.
The modified_date of the
table or indexed view is updated to reflect the completion of the index build
operation. This is visible in the modified_date column in the sys.objects
catalog view. The modified_date
also acts internally as the schema-version of the object.
Updating the schema-version
causes another recompile of all cached DML plans that access the table or
indexed view. Unlike the recompile operation in the preparation phase, this
recompile also includes select queries.
The SCH_M or S lock is