SQL Server Searching for Words


This post discusses the query you will need to be able to search inside a column in a table for any word or phrase that is not allowed. For example, you may have some restricted words or phrases that you cannot allow to be anywhere inside a particular column in a table. Perhaps these are profane words or keywords used in a programming language. This would be part of your data validation.

Let’s work with an example table that we need to search, and we’ll name the table Persons. Also we will have another table that contains a list of bad words or phrases and we’ll name that table BadWordsList. Below is the T-SQL script that will create the tables and the data in your non-production server.

create database [SearchRestrictedWords]
go

USE [SearchRestrictedWords]
GO
/****** Object:  Table [dbo].[BadWordsList]    Script Date: 2017-11-23 4:36:11 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[BadWordsList](
	[BadWord] [varchar](100) NULL
) ON [PRIMARY]
GO
/****** Object:  Table [dbo].[Persons]    Script Date: 2017-11-23 4:36:11 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Persons](
	[FirstName] [varchar](50) NULL,
	[Memo] [varchar](100) NULL
) ON [PRIMARY]
GO
INSERT [dbo].[BadWordsList] ([BadWord]) VALUES (N'javascript')
GO
INSERT [dbo].[BadWordsList] ([BadWord]) VALUES (N'return')
GO
INSERT [dbo].[BadWordsList] ([BadWord]) VALUES (N'select')
GO
INSERT [dbo].[Persons] ([FirstName], [Memo]) VALUES (N'Bob', N'Hi Bob!')
GO
INSERT [dbo].[Persons] ([FirstName], [Memo]) VALUES (N'Sally', N'I love javascript')
GO
INSERT [dbo].[Persons] ([FirstName], [Memo]) VALUES (N'Sam', N'Many happy returns')
GO

Below is the query (and the results) you can use to show all of the rows that contain any of the bad words in the list.

SELECT p.*
  FROM [dbo].[Persons] p
 WHERE EXISTS (SELECT NULL 
     FROM [dbo].[BadWordsList] b 
     WHERE p.Memo LIKE CONCAT('%', b.BadWord, '%'))

Notice that this query uses the key word EXISTS. Notice that “return” is a restricted word, but the table has the word “returns”. The query simply looks a the string of characters, “return” and searches for that and will find that string within the word “returns” and therefore return true. You may want to first get a count of how many rows have words that are part of the restricted list and then display the rows if there are any. To do that we can just get a count first.

declare @NumberBadRows int = 0;
-- first find out how many, if any
SELECT @NumberBadRows = count(*) 
  FROM [dbo].[Persons] p
  WHERE EXISTS (SELECT NULL 
     FROM [dbo].[BadWordsList] b 
     WHERE p.Memo LIKE CONCAT('%', b.BadWord, '%'));
print 'Number of bad rows: ' + cast(@NumberBadRows as varchar(8));
-- if there are any bad rows display them
if (@NumberBadRows > 0) 
	begin
		SELECT p.*
		  FROM [dbo].[Persons] p
		  WHERE EXISTS (SELECT NULL 
			 FROM [dbo].[BadWordsList] b 
			 WHERE p.Memo LIKE CONCAT('%', b.BadWord, '%'));
	end

Two Words or NULL

If a row has two bad words and another row is NULL, the row with two bad words will be displayed and the one with NULL will not be displayed, as you would expect.