T-SQL Parse a String in a Column


This entry is part 2 of 2 in the series Atomicity

Do you have a column in your SQL Server database that contains a list of things separated by commas? For example, you may have a list of music pieces or films that have a column called Genre. Since forms of art can be part of one or many genres, the list of genres may not be normalized. The first step would be to split that last of genres out into separate columns, with a program so you don’t have to spend hours or days retyping the data. This post will discuss an algorithm and code that will solve this problem.

Below is a T-SQL script you could run in SSMS to test these functions and this algorithm in a non-production database. This code is based on an article at MS SQL Tips website Split Delimited String into Columns in SQL Server with PARSENAME.

/*
USE master;
GO
CREATE DATABASE ParseString;
GO
*/
USE ParseString;
GO

IF OBJECT_ID(N'[dbo].[custAddress]', N'U') IS NOT NULL
DROP TABLE [dbo].[custAddress];
GO

CREATE TABLE dbo.custAddress(
     colID INT IDENTITY PRIMARY KEY
   , myAddress NVARCHAR(200)
   );
GO
INSERT INTO dbo.custAddress(myAddress)
VALUES('7890 – 20th Ave E Apt 2A, Seattle, VA')
    , ('9012 W Capital Way, Tacoma, CA')
    , ('5678 Old Redmond Rd, Fletcher, OK')
    , ('3456 Coventry House Miner Rd, Richmond, TX')
	, ('123 Main Street')
	, ('')
	, ('2468 Pine Street, Anytown, NY, extraData')
GO
SELECT *
FROM dbo.custAddress;
GO
SELECT 
     REVERSE(PARSENAME(REPLACE(REVERSE(myAddress), ',', '.'), 1)) AS [StreetAddress]
   , REVERSE(PARSENAME(REPLACE(REVERSE(myAddress), ',', '.'), 2)) AS [CityAddress]
   , REVERSE(PARSENAME(REPLACE(REVERSE(myAddress), ',', '.'), 3)) AS [StateAddress]
FROM dbo.custAddress;
GO
ALTER TABLE dbo.custAddress ADD StreetAddress NVARCHAR(100) NULL, CityAddress NVARCHAR(100) NULL, StateAddress NVARCHAR(100) NULL;
SELECT * FROM dbo.custAddress;
GO
UPDATE dbo.custAddress SET StreetAddress = REVERSE(PARSENAME(REPLACE(REVERSE(myAddress), ',', '.'), 1));
UPDATE dbo.custAddress SET CityAddress = REVERSE(PARSENAME(REPLACE(REVERSE(myAddress), ',', '.'), 2));
UPDATE dbo.custAddress SET StateAddress = REVERSE(PARSENAME(REPLACE(REVERSE(myAddress), ',', '.'), 3));
GO
SELECT * FROM dbo.custAddress;
GO

Below is the screenshot from SSMS. Click on it to enlarge it.

Series Navigation<< Atomicity of First Normal Form