SQL Server Best way to Join 2 Tables with columns containing SIMILAR data

acruukt9  于 2023-06-28  发布在  其他
关注(0)|答案(1)|浏览(209)

I am having trouble joining to tables together, the 2 columns have similar data but not exactly the same data.

Example:

Table 1: Column 1 = "Expect rain for todays weather" Table 2: Column 2 = "Expect rain for todays weather and overcast clouds"

I have tried using the below to join but it is not 100% accurate: ... FROM [Table1] as [one] LEFT JOIN[Table2] as [two] ON [one].[column1] LIKE '%' + [two].[column2] + '%'

Would fuzzy matching be the best way to get the most accurate matches? Or are there other methods of joining Tables with columns that have similar but not exact data?

Any assistance and advise will be greatly appreciated!

u91tlkcl

u91tlkcl1#

It is difficult in the situation you describe if one of the values in not exactly a subset of the other.

I use a Damerau-Levenshtein Distance function but it is really meant to measure typographical differences. I use it find typographical mistakes in street and suburb names so they are quite short strings. I don't know if the link is one I tried at one point but I currently use one written for CLR which is a lot faster. Probably too slow on long stings if you have a lot of records anyway.

Maybe consider using a Full Text Index and a CONTAINS query to find similar patterns of words.

相关问题