- SQL Server Window Functions
- SQL Server Aggregate Window Functions
- SQL Server Aggregate Window Functions Raptors
- SQL Server Ranking Window Functions
Ranking calculations are implemented in T-SQL as a window function. When you rank a value you do not rank it alone, rather you rank it with respect to some set of values based on a certain order. The set of values is defined by the OVER clause. If the set is all of the rows in the underlying query you omit the partition clause. With a window partition clause you rank the row against the rows in the same window partition. A mandatory window order clause defines the order for the ranking.
To demonstrate ranking window functions I will use the following table from Microsoft Press’ book by Itzik Ben-Gan called T-SQL Querying. T-SQL support four types of ranking functions called:
- ROW_NUMBER
- RANK
- DENSE_RANK
- NTILE
CREATE TABLE dbo.Orders ( orderid INT NOT NULL, orderdate DATE NOT NULL, empid INT NOT NULL, custid VARCHAR(5) NOT NULL, qty INT NOT NULL, CONSTRAINT PK_Orders PRIMARY KEY (orderid) ); GO INSERT INTO dbo.Orders(orderid, orderdate, empid, custid, qty) VALUES(30001, '20130802', 3, 'B', 10), (10001, '20131224', 1, 'C', 10), (10005, '20131224', 1, 'A', 30), (40001, '20140109', 4, 'A', 40), (10006, '20140118', 1, 'C', 10), (20001, '20140212', 2, 'B', 20), (40005, '20140212', 4, 'A', 10), (20002, '20140216', 2, 'C', 20), (30003, '20140418', 3, 'B', 15), (30004, '20140418', 3, 'B', 20), (30007, '20140907', 3, 'C', 30); GO
Below is a query that demonstrates all four against the Orders table without a window partition clause ordered by the qty column. The ordering value is qty.
-- Ranking SELECT orderid, qty, ROW_NUMBER() OVER(ORDER BY qty) AS rownum, RANK() OVER(ORDER BY qty) AS rnk, DENSE_RANK() OVER(ORDER BY qty) AS densernk, NTILE(4) OVER(ORDER BY qty) AS ntile4 FROM dbo.Orders;
Here is the output from the above query. The output is nondeterministic. What does that mean? If you run the query again you are not guaranteed to get the same results.
ORDER BY at the end
Let’s add an order by at the end of the query. Recall that the ORDER BY is the sixth step in our logical query processing and happens after select. It is only for presentation.
-- Ranking SELECT orderid, qty, ROW_NUMBER() OVER(ORDER BY qty) AS rownum, RANK() OVER(ORDER BY qty) AS rnk, DENSE_RANK() OVER(ORDER BY qty) AS densernk, NTILE(4) OVER(ORDER BY qty) AS ntile4 FROM dbo.Orders ORDER BY qty, orderid;
Force a Deterministic Result
Suppose you need to force a deterministic calculation of row numbers. How do you do that? You need to add a tiebreaker. In our case there are ties (duplicates) of quantity values. If there were no duplicates of quantity then our results would be deterministic. Adding an order by at the very end is just for presentation and does not make the results deterministic. Below is our new query that makes the results deterministic because we have added an ORDER BY tiebreaker, orderid that references a column with unique values.
-- Ranking SELECT orderid, qty, ROW_NUMBER() OVER(ORDER BY qty,orderid) AS rownum, RANK() OVER(ORDER BY qty) AS rnk, DENSE_RANK() OVER(ORDER BY qty) AS densernk, NTILE(4) OVER(ORDER BY qty) AS ntile4 FROM dbo.Orders;
Below is the results of the above query.
Dense Rank
One application of dense rank would be creating a sequence number based on dates. Suppose you were analyzing your favourite sports team’s box scores for all of the games they played. One column has the date of the game and each row represents each player’s statistics. You need game numbers that are a sequence based on the date of the game. Dense Rank will do the job.
Window Partition Clause
Available to all window functions is the optional window partition clause. Let’s compute the row numbers again but we will do it by custid. Each time we encounter a new custid we will restart the numbering at one. The row numbers are independent for each customer.
select custid, orderid, qty, row_number() over(partition by custid order by orderid) as rownum from dbo.orders order by custid, orderid