SQL Server csv in xml out Project Part 6


This entry is part 6 of 13 in the series SQL CSV XML Project

In this post we will focus on cleaning the incoming data. In our project, the stored procedure that handles this is usp_14OrderClean.

Here is a list of the things that this stored procedure does. Below that is just part of the code we will use.

  • Ensure the validity of the DateandTime
  • Ensure that the SupplierCode is three capital letters and four numerals
  • Ensure that the OrderNumber contains exactly five numerals
  • For quantity of units ensure that it is a positive integer
  • Check for any bad characters. – Yet to be defined…
  • Remove all leading and trailing spaces
  • Remove any double quotes and dollar signs
-- We need to clean up the data in 01HeaderBulkInsert and 02DetailBulkInsert
-- We have not yet copied the data in these two tables to the three tables
DECLARE @RowsReturned INT
DECLARE @ErrMsg VARCHAR(200)
-- Ensure that the DateandTime is numerals, has leading zeros where
-- necessary and actually can represent a date and time (no Feb. 30th and so on),
-- otherwise raise a critical error.
-- To do this we could parse the string, and attempt to put the data into 
-- a datetime2 variable. If not successful, it raises a critical error.
DECLARE @Year CHAR(4), @Month CHAR(2), @Day CHAR(2), @Hour CHAR(2), @Minute CHAR(2), @Second CHAR(2)
DECLARE @DateTime DATETIME2
SELECT @Year = SUBSTRING(DateAndTime, 1, 4) FROM [dbo].[01HeaderBulkInsert]
SELECT @Month = SUBSTRING(DateAndTime, 5, 2) FROM [dbo].[01HeaderBulkInsert]
SELECT @Day = SUBSTRING(DateAndTime, 7, 2) FROM [dbo].[01HeaderBulkInsert]
SELECT @Hour = SUBSTRING(DateAndTime, 7, 2) FROM [dbo].[01HeaderBulkInsert]
SELECT @Minute = SUBSTRING(DateAndTime, 11, 2) FROM [dbo].[01HeaderBulkInsert]
SELECT @Second = SUBSTRING(DateAndTime, 13, 2) FROM [dbo].[01HeaderBulkInsert]
PRINT 'inside usp_14OrderClean before error'
SET @DateTime = @Year + '-' + @Month + '-' + @Day + ' ' + @Hour + ':' + @Minute + ':' + @Second + '.000'
--
-- Ensure that the SupplierCode is three capital letters and four numerals, 
-- otherwise raise a critical error
SELECT SupplierCode FROM [dbo].[01HeaderBulkInsert]
	WHERE SupplierCode LIKE '[A-Z][A-Z][A-Z][0-9][0-9][0-9][0-9]'
	COLLATE Latin1_General_CS_AS;
SET @RowsReturned = @@ROWCOUNT
IF @RowsReturned = 0
BEGIN
	SET @ErrMsg = 'Error in the format of column SupplierCode in table 01HeaderBulkInsert. Expecting 3 capital letters and 4 numerals.'
	RAISERROR(@ErrMsg, 16,1);
END
--
-- Ensure that the OrderNumber contains exactly five numerals
SELECT OrderNumber FROM [dbo].[01HeaderBulkInsert]
	WHERE OrderNumber LIKE '[0-9][0-9][0-9][0-9][0-9]';
SET @RowsReturned = @@ROWCOUNT
IF @RowsReturned = 0
BEGIN
	SET @ErrMsg = 'Error in the format of column OrderNumber in table 01HeaderBulkInsert. Expecting 5 numerals.'
	RAISERROR(@ErrMsg, 16,1);
END
-- For quantity of units ensure that it is a positive integer and if not
-- raise a critical error



-- Check for any bad characters. What defines a bad character? If it has
-- bad characters raise critical error

-- Remove all leading and trailing spaces from FullSupplierName
UPDATE [dbo].[01HeaderBulkInsert] SET FullSupplierName = LTRIM(RTRIM(FullSupplierName));
-- Remove all leading and trailing spaces from BookTitle and Authors
UPDATE [dbo].[02DetailBulkInsert] SET BookTitle = LTRIM(RTRIM(BookTitle));
UPDATE [dbo].[02DetailBulkInsert] SET Authors = LTRIM(RTRIM(Authors));
-- Remove any double quotes; REPLACE " with nothing
UPDATE [dbo].[02DetailBulkInsert] SET BookTitle = REPLACE(BookTitle ,'"','')
-- Remove any currency symbols from the amount; REPLACE $ with nothing
UPDATE [dbo].[02DetailBulkInsert] SET UnitPrice = REPLACE(UnitPrice ,'$','')

Series Navigation<< SQL Server csv in xml out Project Part 5SQL Server csv in xml out Project Part 7 >>