SQL Server Union


This article is discussing the combining of two or more tables using UNION or UNION ALL into one table. When we use UNION or UNION ALL we end up with one data set. In order to do this we must meed two conditions. This discussion is based on a video series at Udemy.com that Philip Burton provided.

As an aside, if you are working in Excel, check out the VSTACK function. It combines multiple sheets of data into one sheet. If you are working with R language, have a look at the bind_rows() function. It’s similar to union.

  • same number of columns
  • the same or compatible data types

What do we mean by compatible? For example, varchar(2) and varchar(6) are compatible in this case and the resulting table will use the larger of the two. In this case it is varchar(6). This will work fine. Another example would be datetime and datetime2. The winner is the larger of the two, which is datetime2. Another example would be a smllint and a numeric, where the numeric wins. The computer does what it can to accomodate without having an overflow error.

It still however maintains precedence. Consider the following T-SQL code. The SELECT statement does not work. You get an error stating the following: “Conversion failed when converting the varchar value ‘Hello’ to data type int.” The computer tries, unsuccessfully to convert the varchar to the int. So a varchar and an int are not compatible data types.

DECLARE @MyInt INT = 2;
DECLARE @MyString VARCHAR(8) = 'Hello';
SELECT @MyInt + @MyString

In this next example the computer is successful in converting the varchar to the int and gives the answer 5.

declare @MyInt INT = 2;
DECLARE @MyString VARCHAR(8) = '3';
SELECT @MyInt + @MyString

The following code does not work.

DECLARE @MyInt INT = 2;
DECLARE @MyString VARCHAR(8) = 'Hello';
SELECT @MyInt 
UNION
SELECT @MyString

The following code does work. It returns two rows with the 3 under the 2.

DECLARE @MyInt INT = 2;
DECLARE @MyString VARCHAR(8) = '3';
SELECT @MyInt 
UNION
SELECT @MyString

Column Names

Column names are taken from the first set of data. If you try to use “as MyColumn” (or something similar) in the second set of data you still won’t have a column name. You will not get an error either.

select convert(char(5),'hi') as 'A Greeting'  -- the first one wins
union
select convert(char(11),'hello there') 'My Salutation'

Order

The results are not sorted! They may look like they are sorted alphabetically, but UNION does not perform a sort that you can count on. There is no guarantee as to the order of the results.

Order By

If you do want the results sorted you can use an order by at the end. This rule applies to UNION, UNION ALL and EXCEPT.

select convert(char(5),'hi') as 'A Greeting'
union
select convert(char(11),'hello there')
union
select convert(char(6),'Yo')
order by 'A Greeting' DESC  -- order by is at the end

Here are the results in SSMS.

Duplicates

With UNION the computer removes duplicates. With UNION ALL it does not. Consider the following results. If there we many columns in our select statement, all of them would have to be duplicate for union to remove the duplicate rows.

select convert(char(5),'hi') as 'A Greeting'  -- the first one wins
union
select convert(char(11),'hello there') 'My Salutation'
union
select convert(char(5),'hi') 
union
select convert(char(11),'bonjour') 

UNION ALL will give you all of the rows, provided that you surround, on both side the offending duplicate. It is best to probably use union all in all of your union statements to be safe. The following code produces all four rows but if I remove one os the all’s it produces only three rows. Also note that UNION ALL runs faster than UNION because UNION will check for duplicates.

select convert(char(5),'hi') as 'A Greeting'  -- the first one wins
union 
select convert(char(11),'hello there') 'My Salutation'
union all
select convert(char(5),'hi') 
union all
select convert(char(11),'bonjour') 

Data Types Revisited

Let’s revisit data types. Let’s look at the WITH statement also. WITH specifies a temporary named result set, known as a common table expression (CTE). There are several rules involved with CTEs and Microsoft’s online documentation describes them. In this example we wll also be using INTO. The main takeaway right now is that the created data type in the example below is bigint. The number 389 is too large for tinyint but fits nicely into smallint. The result however is bigint.

with cte as (
select convert(tinyint, 45) as MyColumn
union
select convert(bigint, 389)
)
select MyColumn
into tblTemp
from cte

If we changed the number 389 to 45 and used UNION (not UNION ALL) we would have only one row in the result table and the data type would be bigint.