You need to ensure that more accurate matches are made by the Fuzzy Lookup transformation without degrading performance

You develop a SQL Server Integration Services (SSIS) package that imports SQL Azure
data into a data warehouse every night.
The SQL Azure data contains many misspellings and variations of abbreviations. To import
the data, a developer used the Fuzzy Lookup transformation to choose the closest-matching
string from a reference table of allowed values. The number of rows in the reference table is
very large.
If no acceptable match is found, the Fuzzy Lookup transformation passes a null value.
The current setting for the Fuzzy Lookup similarity threshold is 0.50.
Many values are incorrectly matched.
You need to ensure that more accurate matches are made by the Fuzzy Lookup
transformation without degrading performance.
What should you do?

You develop a SQL Server Integration Services (SSIS) package that imports SQL Azure
data into a data warehouse every night.
The SQL Azure data contains many misspellings and variations of abbreviations. To import
the data, a developer used the Fuzzy Lookup transformation to choose the closest-matching
string from a reference table of allowed values. The number of rows in the reference table is
very large.
If no acceptable match is found, the Fuzzy Lookup transformation passes a null value.
The current setting for the Fuzzy Lookup similarity threshold is 0.50.
Many values are incorrectly matched.
You need to ensure that more accurate matches are made by the Fuzzy Lookup
transformation without degrading performance.
What should you do?

A.
Change the Exhaustive property to True.

B.
Decrease the maximum number of matches per lookup.

C.
Change the similarity threshold to 0.85.

D.
Increase the maximum number of matches per lookup.



Leave a Reply 4

Your email address will not be published. Required fields are marked *


Den

Den

Why ‘B’?

The answer should be ‘C’

Donk

Donk

Agreed. Answer is C

Slazenjer_m

Slazenjer_m

Absolutely, option C.

*The number of rows in the reference table is very large, yet many values are incorrectly matched!!

**The current setting for the Fuzzy Lookup similarity threshold is 0.50.

The similarity threshold of 0.50 is definitely too low.

Jass

Jass

answer A(Exhaustive property to True.) will work in this case but it will slow down the performance which will not be good enough according to the question.

Answer C is the correct. Because B does not satisfy the results as per the statement “Specifying a maximum number of matches does not guarantee that the transformation returns the maximum number of matches; it only guarantees that the transformation returns at most that number of matches. If you set the maximum number of matches to a value greater than 1, the output of the transformation may include more than one row per lookup and some of the rows may be duplicates. “