- SQL Server Window Functions
- SQL Server Aggregate Window Functions
- SQL Server Aggregate Window Functions Raptors
- SQL Server Ranking Window Functions
Aggregate window functions in SQL Server are similar to grouped functions (SUM, AVG and so on) except that you apply them to a window instead of a group. In order to understand window functions, you first need to understand grouping with SQL’s Group By. There is a post at this site called SQL Server Summarizing Data Part 4 that looks at Group By.
If you need to write a query that preserves all of the detail in the source table(s) and also gives you more columns of statistical information, you can use window functions. Grouped queries hide the detail. Window functions do not. A window function’s result is returned in addition to the detail.
Consider a table that has a column called Amount that contains a number. For example, you can get the percent of the current val out of the grand total of all of the Amount.
You can also get the percent of the current val out of a type of value in another column.
Here is an example. A very simple table is shown below. The table name is GroupBySum and the database I created to hold this table is called MikeTest. It is simply a screen shot in SSMS of a select all query.
Below is some SQL code that produces similar results for the window function as the group by function, except for the order. So we can think of window functions as similar to the GROUP BY cluse except that window functions allow you to retain all of those detail rows and simply add aggregation numbers (sum, average and so on) in new columns.
-- Window function SELECT [Type],[SubType],[Amount], SUM(Amount) OVER() as totamt FROM dbo.GroupBySum -- Group By with a subquery SELECT [Type],[SubType],[Amount], (SELECT SUM(Amount) FROM dbo.GroupBySum) as totamt FROM dbo.GroupBySum GROUP BY [Type],[SubType],[Amount]
Below is the result of both select queries. They are almost identical except for the order.
Limitations of Other Options
Here we can start to see the limitations of the Group By way of doing data-analysis calculations. Notice that it is possible to introduce a bug into our code when we use Group By with a WHERE filter that we forget to put in our subquery or subqueries. Observe the code below. The second code set has a bug. The third code set fixes the bug.
-- Window function -- OVER() is in the select clause which is logically processed -- AFTER the where clause and therefore correctly gives 16, not 18. SELECT [Type],[SubType],[Amount], SUM(Amount) OVER() as totamt FROM dbo.GroupBySum WHERE SubType = '001' -- BUG! Group By with a subquery SELECT [Type],[SubType],[Amount], (SELECT SUM(Amount) FROM dbo.GroupBySum) as totamt FROM dbo.GroupBySum WHERE SubType = '001' GROUP BY [Type],[SubType],[Amount] -- Fixed bug with where clause in subquery SELECT [Type],[SubType],[Amount], (SELECT SUM(Amount) FROM dbo.GroupBySum where SubType = '001') as totamt FROM dbo.GroupBySum WHERE SubType = '001' GROUP BY [Type],[SubType],[Amount]
Partition
Let’s change the query a bit.
SELECT [Type],[SubType],[Amount], SUM(Amount) OVER(PARTITION BY [Type]) as Type_amt FROM dbo.GroupBySum ORDER BY [Type], SubType
Below is a screenshot of the results as seen in SSMS.
Unleash the Power!
So already window functions have an advantage over the Group By for data analysis. Have a look at the query below and the results it produces. I think the best way to describe the power of window functions is look at examples. Here we have introduced PARTITION BY. Using an empty specification in the OVER() clause defines a window with the entire underlying query result set. Adding a window partition restricts the window to only certain rows.
SELECT [Type],[SubType],[Amount] ,SUM(Amount) OVER() as totamt ,cast(100. * Amount / SUM(Amount) OVER() as numeric(6,3)) as pctTotAmt ,cast (100. * Amount / SUM(Amount) OVER(PARTITION BY [Type]) as numeric(6,3)) as pctType FROM [dbo].[GroupBySum]
Common Table Expression (CTE) and Aggregate Window Function
Have a look at the following T-SQL code and the result. We can run our window function and then take the results and group them with a group by. Here in our data we have a Type and a SubType. We could also have a SubSubType but it is not illustrated here. With CTEs we have an inner query and an outer query.
SELECT [Type],[SubType],[Amount] ,SUM(Amount) OVER() as totamt ,cast(100. * Amount / SUM(Amount) OVER() as numeric(6,3)) as pctTotAmt ,cast (100. * Amount / SUM(Amount) OVER(PARTITION BY [Type]) as numeric(6,3)) as pctType ,cast (100. * Amount / SUM(Amount) OVER(PARTITION BY [SubType]) as numeric(6,3)) as pctSubType FROM [dbo].[GroupBySum] ORDER BY [Type], SubType; with cte as -- common table expression ( SELECT [Type],Subtype,[Amount] ,SUM(Amount) OVER() as totamt ,cast(100. * Amount / SUM(Amount) OVER() as numeric(6,3)) as pctTotAmt ,cast (100. * Amount / SUM(Amount) OVER(PARTITION BY [Type]) as numeric(6,3)) as pctType ,cast (100. * Amount / SUM(Amount) OVER(PARTITION BY [SubType]) as numeric(6,3)) as pctSubType FROM [dbo].[GroupBySum] ) select [Type],Subtype,sum(Amount) as SumAmt,totamt,sum(pctTotAmt) as pctTotAmt ,sum(pctType) as pctType, sum(pctSubType) as pctSubType from cte group by [Type], SubType, totamt ORDER BY [Type], SubType;
Below are the results in SSMS.
Filter with WHERE
You can filter with the WHERE clause in the query with the Window function.
with cte as -- common table expression ( SELECT [Type],Subtype,[Amount] ,SUM(Amount) OVER() as totamt ,cast(100. * Amount / SUM(Amount) OVER() as numeric(6,3)) as pctTotAmt ,cast (100. * Amount / SUM(Amount) OVER(PARTITION BY [Type]) as numeric(6,3)) as pctType ,cast (100. * Amount / SUM(Amount) OVER(PARTITION BY [SubType]) as numeric(6,3)) as pctSubType FROM [dbo].[GroupBySum] where [Type] = 'AAA' ) select [Type],Subtype,sum(Amount) as SumAmt,totamt,sum(pctTotAmt) as pctTotAmt ,sum(pctType) as pctType, sum(pctSubType) as pctSubType from cte group by [Type], SubType, totamt ORDER BY [Type], SubType;
The results are below.
Window Functions, Nested CTEs and Select Into a New Table
Here we are combining several features of T-SQL to create a new table.
with cte1 as -- common table expression (CTE) ( SELECT [Type],Subtype,[Amount] ,SUM(Amount) OVER() as totamt ,cast(100. * Amount / SUM(Amount) OVER() as numeric(6,3)) as pctTotAmt ,cast (100. * Amount / SUM(Amount) OVER(PARTITION BY [Type]) as numeric(6,3)) as pctType ,cast (100. * Amount / SUM(Amount) OVER(PARTITION BY [SubType]) as numeric(6,3)) as pctSubType FROM [dbo].[GroupBySum] where [Type] = 'AAA' ), cte2 as -- nesting CTE ( select [Type],Subtype,sum(Amount) as SumAmt,totamt,sum(pctTotAmt) as pctTotAmt ,sum(pctType) as pctType, sum(pctSubType) as pctSubType from cte1 group by [Type], SubType, totamt ) select [Type],Subtype,SumAmt,totamt,pctTotAmt,pctType,pctSubType into GroupBySumAAAReport -- a new table (it must not already exist) from cte2 order by [Type],Subtype;
Here is the result after simply running a SELECT on our new table. In the above query, notice the syntax of nesting CTEs. We use multiple CTEs separated by commas. The second CTE can refer to the previous CTE and the SELECT INTO can refer to the previous CTE.
Tableau
Are you working with Tableau for data visualization? In Tableau, have a look at their level of detail (LOD) expressions. They can be compared to SQL’s window functions. Here is an article called Types of Calculations in Tableau. LOC calculations are covered there.
Excel
Are you working is Excel? If you want to sum a column based on a condition, have a look at SUMIFS().