SQL Server Internal Data Structures


This entry is part 1 of 5 in the series SQL Server Data Structures

As opposed to the logical query processing discussed in the post SQL Server Logical Query, we are now entering the world of physical implementation of SQL Server. The physical layer is not based on a standard, and physical platforms differ in their implementation. To be good at query tuning you need to understand the physical layer of SQL Server.

Query Tuning

In order to tune queries you need to go through the following steps:

  1. analyze query execution plans that show performance measures
  2. understand access methods
  3. understand internal data structures
  4. understand pages, extents, heaps and B-trees

To understand all of this we need to actually start from the bottom and work our way up. All of this discussion is based on disk-based tables, not memory-optimized tables. The first topic is Pages and Extents. After that we look at Table Organization, which is either going to be as a B-tree or as a Heap. We also need to understand what an index is.

Key Points

Here is a list of a few key take-away points that you must know and keep in mind at all times when discussing disk-based table’s internal structures.

  • A page is an 8-KB unit where SQL Server stores data. It is the smallest I/O unit that SQL Server can read and write for disk-based tables (as oppoed to memory-optimized ones).
  • An extent is a unit that contains 8 contiguous pages. An extent can be mixed or uniform.
  • Tables can be organized in one of two ways: either as a heap or a B-tree.
  • The table is organized as a B-tree when it has a clustered index defined on it and a heap when it doesn’t.
  • By default, when you add a primary key to a table, SQL Server adds a clustered index to it.
  • When you add a unique constraint to a table, SQL Server will enforce it using a nonclustered index unless you specify the CLUSTERED keyword.
  • Nonclustered indexes are always organized as B-trees.
  • A heap is a table that has no clustered index.
  • All indexes in SQL Server disk-based tables are structured as B-trees.
  • A clustered index is structured as a B-tree and it maintains the entire table’s data at its leaf level. The clustered index is not a copy of the data, it is the data.
  • A nonclustered index is structured as a B-tree.
  • A leaf row in a nonclustered index contains only the index key columns and a row locator value representing the particular row.
  • Nonclustered indexes created on a B-tree (clustered table) are architected differently than on a heap.
  • When SQL Server needs to find a certain key or range of keys at the leaf level of the index it performs an access method called an index seek.
  • Most of the cost associated with access methods is related to I/O activity.
Series NavigationSQL Server Pages and Extents >>