SQL Server Correlated Subqueries


This entry is part 2 of 3 in the series SQL Server Subqueries

Correlated subqueries have references known as correlations to columns in tables from tables in the outer query. They can be trickier to troubleshoot when problems occur because they cannot be run independently. Using Itzik Ben-Gan’s database (in the Microsoft Press book T-SQL Querying published in 2015) as inspiration, we will look at the Orders table. We want to return the orders placed on the last date of activity of the customer.

Simplified Example

First we will look at a simplified example by using a new table called Ratings. Suppose we have a company that has customers that give periodic ratings to on a scale of 1 to 10, where 10 is the best score. They have two customers, customer 1 and customer 2. We have a table called Ratings that stores those scores and it has a unique identification column that identifies the rating itself.

We need a query that returns the unqid, the custid and the rates. For each customer, we want the maximum rate score.

Here below is the source table, Ratings. Our correlated subquery will need to show two rows. Highlighted in blue are the three rows that represent the highest ratings for each customer. Customer 1’s highest rating is 9 and customer 2’s highest rating is 10. Since customer 2 got two scores of 10, we will need to decide how we will break the tie if we don’t want duplicate ratings for each customer.

Our query needs to return the above 3 highlighted rows, assuming we allow duplicates. Below is our correlated subquery.

select unqid, custid, rates
from Ratings as R1
where rates =
(
   select max(R2.rates)
   from Ratings as R2
   where R2.custid = R1.custid
)

Below is the result of the above query. It works. If we were using Itzik’s example, the unqid would be an orderid from an Orders table. The rates would be order dates. The custid would be the same, a customer identification number.

Remove Duplicates

In the query below we are removing the duplicates.

select unqid, custid, rates
from Ratings as R1
where rates =
(
	select max(rates)
	from Ratings as R2
	where R2.custid = R1.custid
)
and unqid = 
(
	select max(unqid)
	from Ratings as R2
	where R2.custid = R1.custid
	and R2.rates = R1.rates
);

Below are the results of the above query in SSMS. We have removed the duplicates.

Series Navigation<< SQL Server SubqueriesSQL Self-Contained Subqueries >>