Index

SQL provides a dedicated structure to speed up the retrieval of rows from a table called index.

SQL Server has two types of indexes: clustered index and non-clustered index:

This sorted data structure is called a B-tree (balanced tree). B-tree structure enables us to find the queried rows faster to using the sorted key value(s). Table data can be sorted physically in only one direction for this reason we can define only one clustered index per table. The following image illustrates a logical structure of the clustered index.

SQL Server clustered index creates a physical sorted data structure of the table rows according to the defined index key.

Structure of the clustered index

The root and intermediate levels contain the index key values and page pointers. The page pointers point to the previous and subsequent index pages of their own. These two levels don’t store any row data. At the same time, index pages hold information about the ahead and behind index page numbers.B-tree structure based-on three different levels:

  • Root level: The top level of the B-tree is called as root level. The root level is the starting point of the data searching
  • Intermediate level: This level provides a connection between root and leaf levels. SQL Server does not create an intermediate level when the amount of data rows are too small
  • Leaf Level: This level is the lowest level of the clustered index and all records are stored
Clustered Index and Singleton seek:
The clustered index seek operation uses the structure of the B-tree structure very efficiently and easily finds the qualified row(s).
CREATE Table Cars (Id INT, BrandName VARCHAR(100))
GO
INSERT INTO Cars VALUES(1,'Ford')
,INSERT INTO Cars VALUES(2,'Fiat')
INSERT INTO Cars VALUES(3,'Mini')
INSERT INTO Cars VALUES(4,'Jaguar')
INSERT INTO Cars VALUES(5,'Kia')
INSERT INTO Cars VALUES(6,'Nissan')
INSERT INTO Cars VALUES(7,'BMW')
INSERT INTO Cars VALUES(8,'Mercedes')
INSERT INTO Cars VALUES(9,'Mazda')
INSERT INTO Cars VALUES(10,'Volvo')
INSERT INTO Cars VALUES(11,'Lexus')
INSERT INTO Cars VALUES(12,'Buick')
INSERT INTO Cars VALUES(13,'GMC')
INSERT INTO Cars VALUES(14,'Honda')
INSERT INTO Cars VALUES(15,'Lotus')
INSERT INTO Cars VALUES(16,'Opel')
INSERT INTO Cars VALUES(17,'Bentley')
INSERT INTO Cars VALUES(18,'Dodge')
INSERT INTO Cars VALUES(19,'Tesla')
INSERT INTO Cars VALUES(20,'Porche')
INSERT INTO Cars VALUES(21,'Ferrari')
INSERT INTO Cars VALUES(22,'Audi')
GO
CREATE UNIQUE CLUSTERED INDEX IX_001
ON Cars (Id);
When we execute below query sql server, it will perform cluster index seek operation.
SELECT * FROM Cars WHERE Id = 12;


In this execution plan, the seek predicates indicate that the storage engine uses the B-tree structure and finds the leaf level that stores the data rows. For this query, the uniqueness of the clustered index is very important because this constraint guarantees that only one row will return from the query. This data searching concept is called singleton seek.
Clustered Index range Scan: 
SELECT *
FROM Cars
WHERE Id > 12;

the above query will perform singleton seek firstly and reaches the leaf level that contains the data row (Id=12). According to retrieve the index keys, the range scans operation has been performed either in forward or backward directions.
The following image illustrates how the range scan process works.

Range scan working principle and SQL Server clustered index

Direction of see we can see in the property of execution plan:
The singleton seeks principle can also work multiple times for some queries as we can see in the below query.

The following image illustrates how the multiple singleton seek process works:

Multiple singleton seeks and SQL Server clustered index

SQL Server Clustered Index and Primary Key

The primary key ensures that the values of a column in the table are unique so that all rows are identified with a unique key. 

when we create a primary key SQL Server creates a unique clustered index. However, we can create a primary key without a clustered index because the only mandatory requirement is uniqueness for the primary key. So the main differences between the primary key and clustered index are:

  • A primary key is a logical structure and it provides the uniqueness for a table
  • A clustered index is a physical structure and it provides the physical order of the records on the storage
Non-Clustered Index:

In Non Cluster Index the underlying table rows will not be stored and sorted based on the Non-clustered key, The index pages of the Non-clustered index contain Non-clustered index key values with pointers to the storage location of these rows in the underlying heap table or the Clustered index.

If a Non-Clustered index is built over a heap table or view the leaf level nodes of that index hold the index key values and Row ID (RID) pointers to the location of the rows in the heap table. The RID consists of the file identifier, the data page number, and the number of rows on that data page.

 if a Non-clustered index is created over a Clustered table, the leaf level nodes of that index contain Non-clustered index key values and clustering keys for the base table.

 If a Non-clustered index is built over a non-unique Clustered index, the leaf level nodes of the Non-clustered index will hold additional uniqueifier values of the data rows, that is added by the SQL Server Engine to ensure uniqueness of the Clustered index.Non-Clustered index in SQL Server

We can have multiple non-clustered indexes upto 999 in SQL tables because it is a logical index and does not sort data physically as compared to the clustered index.

Some Important point of Non Clustered Index: 

Index added to the table will negatively impact data modification performance on that table. This is due to the fact that, when you modify a key column in the underlying table, the Non-clustered indexes should be adjusted appropriately as well.

When designing a Non-clustered index, you should consider the type of the workload performed on your database or table:

create a minimum of narrow indexes, with a minimum number of columns participating in the index key, on the heavily updated table. 

A table that has a large number of rows with a low data modification requirement can heavily benefit from more Non-clustered indexes with composite index keys, that contain more than one column in the index key, that cover all columns in the query to improve the data retrieval performance.

it is highly recommended to avoid adding too many key or non-key columns to the Non-clustered index that are not required by the queries,Adding too many columns to the index will result with fewer number of rows that can be fit in each data page, increasing the I/O overhead and reducing the caching efficiency. 

Syntax:

create nonclustered index nonclusterdemo on employee (dept)

Non Clustered Index Implementation: When you create a UNIQUE constraint, a unique Non-clustered index will be created automatically to enforce that constraint. 

To be able to create a Non-clustered index, you should be a member of the sysadmin fixed server role or the db_ddladmin and db_owner fixed database roles.

CREATE TABLE NonClusteredDemo (ID INT IDENTITY (1,1) NOT NULL, StudentName NVARCHAR(50)  NULL,

STDBirthDate DATETIME  NULL,STDPhone VARCHAR(20) NULL,STDAddress NVARCHAR(MAX) NULL)


 CREATE NONCLUSTERED INDEX IX_NonClusteredIndexDemo_StudentName ON NonClusteredDemo(StudentName)WITH (ONLINE = ON , FILLFACTOR=90)

The ONLINE option allows concurrent users access to the underlying table or the Clustered index data during the Non-clustered index creation process.

FILLFACTOR option is used to set the percentage of free space that will be left in the leaf level nodes of the Non-clustered index during the index creation, in order to minimize the page split and fragmentation performance issues.

Non Clustered Index with Include Column:

Different Scenario to create Index and to avoid creating index

Example, How index is effective