SQL Server Searching for Words


This post discusses the query you will need to be able to search inside a column in a table for any word or phrase that is not allowed. For example, you may have some restricted words or phrases that you cannot allow to be anywhere inside a particular column in a table. Perhaps these are profane words or keywords used in a programming language. This would be part of your data validation.

Let’s work with an example table that we need to search, and we’ll name the table Persons. Also we will have another table that contains a list of bad words or phrases and we’ll name that table BadWordsList. Below is the T-SQL script that will create the tables and the data in your non-production server.

1create database [SearchRestrictedWords]
2go
3 
4USE [SearchRestrictedWords]
5GO
6/****** Object:  Table [dbo].[BadWordsList]    Script Date: 2017-11-23 4:36:11 PM ******/
7SET ANSI_NULLS ON
8GO
9SET QUOTED_IDENTIFIER ON
10GO
11CREATE TABLE [dbo].[BadWordsList](
12    [BadWord] [varchar](100) NULL
13) ON [PRIMARY]
14GO
15/****** Object:  Table [dbo].[Persons]    Script Date: 2017-11-23 4:36:11 PM ******/
16SET ANSI_NULLS ON
17GO
18SET QUOTED_IDENTIFIER ON
19GO
20CREATE TABLE [dbo].[Persons](
21    [FirstName] [varchar](50) NULL,
22    [Memo] [varchar](100) NULL
23) ON [PRIMARY]
24GO
25INSERT [dbo].[BadWordsList] ([BadWord]) VALUES (N'javascript')
26GO
27INSERT [dbo].[BadWordsList] ([BadWord]) VALUES (N'return')
28GO
29INSERT [dbo].[BadWordsList] ([BadWord]) VALUES (N'select')
30GO
31INSERT [dbo].[Persons] ([FirstName], [Memo]) VALUES (N'Bob', N'Hi Bob!')
32GO
33INSERT [dbo].[Persons] ([FirstName], [Memo]) VALUES (N'Sally', N'I love javascript')
34GO
35INSERT [dbo].[Persons] ([FirstName], [Memo]) VALUES (N'Sam', N'Many happy returns')
36GO

Below is the query (and the results) you can use to show all of the rows that contain any of the bad words in the list.

1SELECT p.*
2  FROM [dbo].[Persons] p
3 WHERE EXISTS (SELECT NULL
4     FROM [dbo].[BadWordsList] b
5     WHERE p.Memo LIKE CONCAT('%', b.BadWord, '%'))

Notice that this query uses the key word EXISTS. Notice that “return” is a restricted word, but the table has the word “returns”. The query simply looks a the string of characters, “return” and searches for that and will find that string within the word “returns” and therefore return true. You may want to first get a count of how many rows have words that are part of the restricted list and then display the rows if there are any. To do that we can just get a count first.

1declare @NumberBadRows int = 0;
2-- first find out how many, if any
3SELECT @NumberBadRows = count(*)
4  FROM [dbo].[Persons] p
5  WHERE EXISTS (SELECT NULL
6     FROM [dbo].[BadWordsList] b
7     WHERE p.Memo LIKE CONCAT('%', b.BadWord, '%'));
8print 'Number of bad rows: ' + cast(@NumberBadRows as varchar(8));
9-- if there are any bad rows display them
10if (@NumberBadRows > 0)
11    begin
12        SELECT p.*
13          FROM [dbo].[Persons] p
14          WHERE EXISTS (SELECT NULL
15             FROM [dbo].[BadWordsList] b
16             WHERE p.Memo LIKE CONCAT('%', b.BadWord, '%'));
17    end

Two Words or NULL

If a row has two bad words and another row is NULL, the row with two bad words will be displayed and the one with NULL will not be displayed, as you would expect.