SQL Server Aggregate Window Functions


This entry is part 2 of 4 in the series SQL Server Window Functions

Aggregate window functions in SQL Server are similar to grouped functions (SUM, AVG and so on) except that you apply them to a window instead of a group. In order to understand window functions, you first need to understand grouping with SQL’s Group By. There is a post at this site called SQL Server Summarizing Data Part 4 that looks at Group By.

If you need to write a query that preserves all of the detail in the source table(s) and also gives you more columns of statistical information, you can use window functions. Grouped queries hide the detail. Window functions do not. A window function’s result is returned in addition to the detail.

Consider a table that has a column called Amount that contains a number. For example, you can get the percent of the current val out of the grand total of all of the Amount.

You can also get the percent of the current val out of a type of value in another column.

Here is an example. A very simple table is shown below. The table name is GroupBySum and the database I created to hold this table is called MikeTest. It is simply a screen shot in SSMS of a select all query.

Below is some SQL code that produces similar results for the window function as the group by function, except for the order. So we can think of window functions as similar to the GROUP BY cluse except that window functions allow you to retain all of those detail rows and simply add aggregation numbers (sum, average and so on) in new columns.

-- Window function
SELECT  [Type],[SubType],[Amount],
	  SUM(Amount) OVER() as totamt
  FROM dbo.GroupBySum
-- Group By with a subquery
 SELECT  [Type],[SubType],[Amount],
	  (SELECT SUM(Amount) FROM dbo.GroupBySum) as totamt
  FROM dbo.GroupBySum
  GROUP BY [Type],[SubType],[Amount]

Below is the result of both select queries. They are almost identical except for the order.

Limitations of Other Options

Here we can start to see the limitations of the Group By way of doing data-analysis calculations. Notice that it is possible to introduce a bug into our code when we use Group By with a WHERE filter that we forget to put in our subquery or subqueries. Observe the code below. The second code set has a bug. The third code set fixes the bug.

-- Window function
-- OVER() is in the select clause which is logically processed
-- AFTER the where clause and therefore correctly gives 16, not 18.
SELECT  [Type],[SubType],[Amount],
	  SUM(Amount) OVER() as totamt
  FROM dbo.GroupBySum
  WHERE SubType = '001'
-- BUG! Group By with a subquery
 SELECT  [Type],[SubType],[Amount],
	  (SELECT SUM(Amount) FROM dbo.GroupBySum) as totamt
  FROM dbo.GroupBySum
  WHERE SubType = '001'
  GROUP BY [Type],[SubType],[Amount]
-- Fixed bug with where clause in subquery
 SELECT  [Type],[SubType],[Amount],
	  (SELECT SUM(Amount) FROM dbo.GroupBySum where SubType = '001') as totamt
  FROM dbo.GroupBySum
  WHERE SubType = '001'
  GROUP BY [Type],[SubType],[Amount]

Partition

Let’s change the query a bit.

SELECT  [Type],[SubType],[Amount],
	  SUM(Amount) OVER(PARTITION BY [Type]) as Type_amt
  FROM dbo.GroupBySum
  ORDER BY [Type], SubType

Below is a screenshot of the results as seen in SSMS.

Unleash the Power!

So already window functions have an advantage over the Group By for data analysis. Have a look at the query below and the results it produces. I think the best way to describe the power of window functions is look at examples. Here we have introduced PARTITION BY. Using an empty specification in the OVER() clause defines a window with the entire underlying query result set. Adding a window partition restricts the window to only certain rows.

SELECT  [Type],[SubType],[Amount]
  ,SUM(Amount) OVER() as totamt
  ,cast(100. * Amount / SUM(Amount) OVER() as numeric(6,3)) as pctTotAmt
  ,cast (100. * Amount / SUM(Amount) OVER(PARTITION BY [Type]) as numeric(6,3)) as pctType
  FROM [dbo].[GroupBySum]

Common Table Expression (CTE) and Aggregate Window Function

Have a look at the following T-SQL code and the result. We can run our window function and then take the results and group them with a group by. Here in our data we have a Type and a SubType. We could also have a SubSubType but it is not illustrated here. With CTEs we have an inner query and an outer query.

SELECT  [Type],[SubType],[Amount]
  ,SUM(Amount) OVER() as totamt
  ,cast(100. * Amount / SUM(Amount) OVER() as numeric(6,3)) as pctTotAmt
  ,cast (100. * Amount / SUM(Amount) OVER(PARTITION BY [Type]) as numeric(6,3)) as pctType
  ,cast (100. * Amount / SUM(Amount) OVER(PARTITION BY [SubType]) as numeric(6,3)) as pctSubType
  FROM [dbo].[GroupBySum]
  ORDER BY [Type], SubType;

with cte as -- common table expression
(
SELECT  [Type],Subtype,[Amount]
  ,SUM(Amount) OVER() as totamt
  ,cast(100. * Amount / SUM(Amount) OVER() as numeric(6,3)) as pctTotAmt
  ,cast (100. * Amount / SUM(Amount) OVER(PARTITION BY [Type]) as numeric(6,3)) as pctType
  ,cast (100. * Amount / SUM(Amount) OVER(PARTITION BY [SubType]) as numeric(6,3)) as pctSubType
  FROM [dbo].[GroupBySum]
)
select [Type],Subtype,sum(Amount) as SumAmt,totamt,sum(pctTotAmt) as pctTotAmt ,sum(pctType) as pctType, sum(pctSubType) as pctSubType
from cte
group by [Type], SubType, totamt
ORDER BY [Type], SubType;

Below are the results in SSMS.

Filter with WHERE

You can filter with the WHERE clause in the query with the Window function.

with cte as -- common table expression
(
SELECT  [Type],Subtype,[Amount]
  ,SUM(Amount) OVER() as totamt
  ,cast(100. * Amount / SUM(Amount) OVER() as numeric(6,3)) as pctTotAmt
  ,cast (100. * Amount / SUM(Amount) OVER(PARTITION BY [Type]) as numeric(6,3)) as pctType
  ,cast (100. * Amount / SUM(Amount) OVER(PARTITION BY [SubType]) as numeric(6,3)) as pctSubType
  FROM [dbo].[GroupBySum]
  where [Type] = 'AAA'
)
select [Type],Subtype,sum(Amount) as SumAmt,totamt,sum(pctTotAmt) as pctTotAmt ,sum(pctType) as pctType, sum(pctSubType) as pctSubType
from cte
group by [Type], SubType, totamt
ORDER BY [Type], SubType;

The results are below.

Window Functions, Nested CTEs and Select Into a New Table

Here we are combining several features of T-SQL to create a new table.

with cte1 as -- common table expression (CTE)
(
SELECT  [Type],Subtype,[Amount]
  ,SUM(Amount) OVER() as totamt
  ,cast(100. * Amount / SUM(Amount) OVER() as numeric(6,3)) as pctTotAmt
  ,cast (100. * Amount / SUM(Amount) OVER(PARTITION BY [Type]) as numeric(6,3)) as pctType
  ,cast (100. * Amount / SUM(Amount) OVER(PARTITION BY [SubType]) as numeric(6,3)) as pctSubType
  FROM [dbo].[GroupBySum]
  where [Type] = 'AAA'
),
cte2 as  -- nesting CTE
(
select [Type],Subtype,sum(Amount) as SumAmt,totamt,sum(pctTotAmt) as pctTotAmt ,sum(pctType) as pctType, sum(pctSubType) as pctSubType
from cte1
group by [Type], SubType, totamt
)
select [Type],Subtype,SumAmt,totamt,pctTotAmt,pctType,pctSubType
into GroupBySumAAAReport  -- a new table (it must not already exist)
from cte2
order by [Type],Subtype;

Here is the result after simply running a SELECT on our new table. In the above query, notice the syntax of nesting CTEs. We use multiple CTEs separated by commas. The second CTE can refer to the previous CTE and the SELECT INTO can refer to the previous CTE.

Tableau

Are you working with Tableau for data visualization? In Tableau, have a look at their level of detail (LOD) expressions. They can be compared to SQL’s window functions. Here is an article called Types of Calculations in Tableau. LOC calculations are covered there.

Excel

Are you working is Excel? If you want to sum a column based on a condition, have a look at SUMIFS().

Series Navigation<< SQL Server Window FunctionsSQL Server Aggregate Window Functions Raptors >>