Welcome Guest ( Log In | Register )

Outline · [ Standard ] · Linear+

 How to do search that doesnt match perfectly

views
     
TSnarf03
post Nov 16 2018, 06:03 AM, updated 6y ago

Look at all my stars!!
*******
Senior Member
4,544 posts

Joined: Dec 2004
From: Metro Prima, Kuala Lumpur, Malaysia, Earth, Sol


Lets say i want to do some text search that does not perfectly match some words that in the database

ie i have these in my database
HelloWorld
Asteroid

And i want be able to search by
hello world <- additional space
hwlloworld <- typo e
assteriod <- i and o swap

would be nice if the result also tell how close is the match, ie 100% is the max, hello world should be like 90-95% match, any ideas ?
Lord Tiki Mick
post Nov 16 2018, 02:47 PM

Regular
******
Senior Member
1,018 posts

Joined: Jul 2012
QUOTE(narf03 @ Nov 16 2018, 06:03 AM)
Lets say i want to do some text search that does not perfectly match some words that in the database

ie i have these in my database
HelloWorld
Asteroid

And i want be able to search by
hello world <- additional space
hwlloworld <- typo e
assteriod  <- i and o swap

would be nice if the result also tell how close is the match, ie 100% is the max, hello world should be like 90-95% match, any ideas ?
*
Which database are you using?
bumpo
post Nov 16 2018, 03:01 PM

On my way
****
Junior Member
632 posts

Joined: Mar 2013


read up apache solr
cassian948
post Nov 16 2018, 03:04 PM

Enthusiast
*****
Junior Member
889 posts

Joined: Jan 2017
From: Kuala Lumpur
Using wildcard?
RookieDaddy
post Nov 16 2018, 05:00 PM

Getting Started
**
Junior Member
160 posts

Joined: Nov 2008


hints:
  • SOUNDEX()
  • DIFFERENCE()
  • REPLACE()
  • Damerau-Levenshtein

Norlane
post Nov 16 2018, 10:52 PM

Getting Started
**
Junior Member
83 posts

Joined: Nov 2018
Hmmm....Soundalike and lookalike names.

OP may want to research Damerau-Levenshtein algorithm as mentioned by RookieDaddy.


anti-informatic
post Dec 12 2018, 12:30 PM

Enthusiast
*****
Senior Member
902 posts

Joined: Dec 2006
Here is one function that I found recently to get two string and compare them to determine the possibility of matching rate.

CODE
create function CompareText  (@String1 varchar (50), @String2 varchar (50))
returns integer
--Function CompareText
--blindman 4/2005, Adapted from MS Access algorithm developed 1997
--Returns value between 0 and 100 indicating the similarity between two character strings.

--usage: select * from [Table] where dbo.CompareText([ColumnValue], 'SearchString') > 80

begin

declare @Possibles integer
declare @Hits integer
declare @Counter integer

set @Possibles = len(@String1) + len(@String2) - 2
set @Hits = 0

set @Counter = len(@String1)-1
while @Counter > 0
begin
  if charindex(substring(@String1, @Counter, 2), @String2) > 0 set @Hits = @Hits + 1
  set @Counter = @Counter - 1
end

set @Counter = len(@String2)-1
while @Counter > 0
begin
  if charindex(substring(@String2, @Counter, 2), @String1) > 0 set @Hits = @Hits + 1
  set @Counter = @Counter - 1
end

return (100*@Hits)/@Possibles
end


SOS

From the code above, I modify to get list of matching character from my DB table and concat into a full string instead like below:

CODE
// Get percentage of matching rate
SELECT @TableValueComparisonindex  = (100*@Hits)/@Possibles

// If matching rate is more than 20%, concat the current DB record into the full string
IF @TableValueComparisonindex >= 20
 BEGIN
  SET @CompleteMatchingStrings = @CompleteMatchingStrings + @DBTableValue + @StringSeparator
 END


After returning the full string to my application, I will format and show list of matching character by user input. Perhaps you can use this for reference

 

Change to:
| Lo-Fi Version
0.0159sec    0.33    5 queries    GZIP Disabled
Time is now: 29th March 2024 - 12:37 AM