T-SQL Parsing To Get Frequency


This entry is part 2 of 7 in the series SQL Server Parsing Many Columns

This post is about determining just how many parts there are in a string that is delimited with a comma. Actually, our string could be delimited with a period or a semicolon or some other character, but in our example, we’ll use the comma as a delimiter.

Since this is so important in our algorithm, we’ll create a new column to store the number of parts there are in the string to be parsed. For example, if we count the number of commas and find that there are two of them in the string, we will assume that there are three parts in the string.

Below is a simple statement that creates the column FruitFreq in the table ParseFruit.

ALTER TABLE ParseFruit
    ADD FruitFreq int

Now that we have a frequency column, we need to populate that column. We’ll use a user-defined function. We find out the number of commas by comparing the length of the string (with the LEN function) with the length of that string with the commas removed (with the REPLACE function). Below is the code you can run in your non-production server to create the function.

CREATE FUNCTION [dbo].[ufn_CountChar] ( @pInput VARCHAR(1000), @pSearchChar CHAR(1) )
RETURNS INT
BEGIN
    RETURN (LEN(@pInput) - LEN(REPLACE(@pInput, @pSearchChar, '')))
END
GO

Now we can use the user-defined function we just created.

UPDATE ParseFruit SET FruitFreq = dbo.ufn_CountChar(ListFruit, ',') + 1

Below is a screenshot of our ParseFruit table so far. Notice that we now know how many parts are in each string.

Create Columns for Groups of 4 or Fewer

Looking at our new frequency column we notice that the maximum number of fruits in our string is 9. We need 3 groups of 4 to cover that. We need to create 3 more columns. I will just name them Group1, Group2 and Group3.

ALTER TABLE ParseFruit
    ADD Group1 NVARCHAR(50) NULL, Group2 NVARCHAR(50) NULL, Group3 NVARCHAR(50) NULL

Here is our table now.

Series Navigation<< T-SQL Parsing Into Many ColumnsT-SQL Parsing Groups of 4 >>