You need to ensure that more accurate matches are made by the Fuzzy Lookup transformation without degrading performance

You develop a SQL Server Integration Services (SSIS) package that imports SQL Azure
data into a data warehouse every night.
The SQL Azure data contains many misspellings and variations of abbreviations. To import
the data, a developer used the Fuzzy Lookup transformation to choose the closest-matching
string from a reference table of allowed values. The number of rows in the reference table is
very large.
If no acceptable match is found, the Fuzzy Lookup transformation passes a null value.
The current setting for the Fuzzy Lookup similarity threshold is 0.50.
Many values are incorrectly matched.
You need to ensure that more accurate matches are made by the Fuzzy Lookup
transformation without degrading performance.
What should you do?

A. Change the Exhaustive property to True.

B. Change the similarity threshold to 0.55.

C. Change the similarity threshold to 0.40.

D. Increase the maximum number of matches per lookup.

Explanation:
http://msdn.microsoft.com/en-us/library/ms137786.aspx
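
As a rough illustration of how a similarity threshold behaves, the sketch below returns the closest reference string only when its score reaches the threshold, and otherwise returns None (mirroring the null passed when no acceptable match is found). It is only a sketch: Python's difflib.SequenceMatcher stands in for the scoring, which is not the algorithm the Fuzzy Lookup transformation uses, and REFERENCE, similarity, and best_match are hypothetical names introduced for this example.

```python
from difflib import SequenceMatcher

# Hypothetical reference table of allowed values (illustration only).
REFERENCE = ["Street", "Avenue", "Boulevard"]


def similarity(a, b):
    """Score two strings between 0 and 1, where 1 means an exact match.

    Assumption: difflib's ratio is a stand-in, not the SSIS scoring method.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(value, reference, threshold):
    """Return the closest reference string, or None if it scores below threshold."""
    score, match = max((similarity(value, r), r) for r in reference)
    return match if score >= threshold else None


if __name__ == "__main__":
    for threshold in (0.40, 0.50, 0.55):
        print(threshold, best_match("Strt", REFERENCE, threshold))
```

Raising the threshold can only turn a returned match into None, never the reverse, so weak matches are rejected without comparing against more reference rows.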



moogaloo

C

Gary

C. Not just more matches but more accurate matches.

jml

I think B is correct.

Juan

I think it is B.
A higher threshold gives more accuracy.

“The similarity score is represented by a decimal value between 0 and 1, where a similarity score of 1 means an exact match”