Match Scores Are All Low

If all match scores are consistently low (below 50%), it usually indicates a data quality issue or a misconfiguration in how fields are being compared. Low scores mean the matching engine does not see enough similarity between compared records, even for pairs that should clearly be duplicates.

1. Data Quality Issues

Inconsistent data formatting is the most common cause of low scores. Examples:

  • "John Smith" vs "JOHN SMITH" — case differences
  • "123 Main St" vs "123 Main Street" — abbreviation differences
  • " Jane Doe" vs "Jane Doe" — leading whitespace
  • "555-1234" vs "5551234" — punctuation differences

Fix: Run Data Cleansing before matching. Apply uppercase/lowercase normalization, remove extra whitespace, standardize abbreviations, and strip punctuation from phone numbers. Once both sources follow the same formatting, the engine compares the values themselves rather than their formatting, and true duplicates score much higher.
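
To make the effect of cleansing concrete, here is a minimal Python sketch of the normalization steps listed above. It is an illustration only, not the product's Data Cleansing implementation; the function names and the abbreviation table are hypothetical.

```python
import re

# Hypothetical abbreviation table; extend it to fit your own data.
ABBREVIATIONS = {"st": "street", "ave": "avenue", "rd": "road"}

def normalize_text(value: str) -> str:
    """Lowercase, trim, collapse whitespace, and expand known abbreviations."""
    words = re.sub(r"\s+", " ", value.strip().lower()).split(" ")
    return " ".join(ABBREVIATIONS.get(w.rstrip("."), w) for w in words)

def normalize_phone(value: str) -> str:
    """Keep digits only, dropping dashes, spaces, and other punctuation."""
    return re.sub(r"\D", "", value)

# Each pair from the list above becomes identical after normalization.
print(normalize_text("John Smith") == normalize_text("JOHN SMITH"))
print(normalize_text("123 Main St") == normalize_text("123 Main Street"))
print(normalize_text(" Jane Doe") == normalize_text("Jane Doe"))
print(normalize_phone("555-1234") == normalize_phone("5551234"))
```

Applying the same rules to both sources matters more than the specific rules chosen: the goal is that equivalent values end up character-for-character identical.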

2. Wrong Field Comparison

Comparing fields that do not contain equivalent data will always score poorly. For example, if "Full Name" in Source A contains "John Smith" but you map it to "First Name" in Source B, which contains only "John", the comparison will score around 45–50% at best, because more than half of the longer value has nothing to match against.

Fix: Review your field mappings in Match Definitions. Ensure you are comparing semantically equivalent fields. Use Data Profiling to inspect actual values in each field to confirm they contain the same type of data.
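
The following sketch shows why a mismapped field caps the score. It uses Python's standard difflib as a stand-in similarity measure, which is an assumption: the matching engine uses its own algorithm, so the exact percentages will differ from what you see in the product.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in the range 0.0 to 1.0 (stand-in measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

full_name_a = "John Smith"   # "Full Name" in Source A
first_name_b = "John"        # "First Name" in Source B

# Mismapped: a full name compared against a first name only.
mismapped = similarity(full_name_a, first_name_b)

# Equivalent fields: derive a first name from Source A before comparing.
first_name_a = full_name_a.split()[0]
equivalent = similarity(first_name_a, first_name_b)

print(f"mismapped:  {mismapped:.2f}")
print(f"equivalent: {equivalent:.2f}")
```

The mismapped comparison stays well below a full match no matter how clean the data is, while the equivalent comparison scores a perfect match.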

3. Incorrect Data Types

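If a numeric field (such as a customer ID that one source stores as text) is compared using text similarity, the engine compares characters rather than values. "00123" and "123" are the same number but only partially similar as strings, while "1000" and "1001" are different numbers that look nearly identical as strings.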

Fix: In Match Definitions, check the Criteria Data Type setting for each field. Set it to Number for numeric fields and Text for string fields. Use Phonetic only for name fields where sound-alike matching is appropriate.
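
A short sketch of the difference between comparing IDs as text and as numbers. The difflib ratio is a stand-in for text similarity and the helper names are hypothetical; the product's Number and Text data types may behave differently in detail.

```python
from difflib import SequenceMatcher

def text_similarity(a: str, b: str) -> float:
    """Character-based similarity, 0.0 to 1.0 (stand-in measure)."""
    return SequenceMatcher(None, a, b).ratio()

def numeric_match(a: str, b: str) -> bool:
    """Compare the numeric values instead of the characters."""
    return int(a) == int(b)

# Same customer ID, stored with leading zeros in one source.
print(text_similarity("00123", "123"), numeric_match("00123", "123"))

# Different customer IDs that look alike as text.
print(text_similarity("1000", "1001"), numeric_match("1000", "1001"))
```

With this stand-in measure both pairs get the same text score, even though one pair is a true match and the other is not. Only the numeric comparison tells them apart.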

4. Overly Strict Fuzzy Similarity Level

Counterintuitively, a very high similarity level can lower the scores you see. When a pair falls short of the required similarity, the score may be reported as zero or very low rather than as a partial value.

Fix: Try lowering the fuzzy similarity level to Medium or Low temporarily. If scores improve significantly, your data has more variation than the current threshold allows for.
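
The behavior can be pictured with the sketch below, which assumes that a pair falling short of the level's cutoff is reported as zero. The cutoff values and the reported_score function are hypothetical, chosen only for illustration, and difflib stands in for the engine's fuzzy algorithm.

```python
from difflib import SequenceMatcher

# Hypothetical cutoffs; the product's actual values may differ.
LEVEL_CUTOFFS = {"High": 0.90, "Medium": 0.75, "Low": 0.60}

def reported_score(a: str, b: str, level: str) -> float:
    """Return the raw similarity if it clears the cutoff, otherwise 0.0."""
    raw = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return raw if raw >= LEVEL_CUTOFFS[level] else 0.0

a, b = "123 Main St", "123 Main Street"
for level in ("High", "Medium", "Low"):
    print(level, round(reported_score(a, b, level), 2))
```

Under these assumptions the pair is reported as zero at High but receives its real partial score at Medium and Low, which is the pattern to look for when you lower the level temporarily.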

5. High-Weight Field Always Mismatches

If one field carries a very high weight (e.g., 80 out of 100) and it consistently scores near zero, the overall match score will be pulled down even when all other fields match well.

Fix: Review score breakdowns per field in the Detailed Analysis view of Match Results. Identify which field is dragging down the score and either lower its weight, fix the underlying data quality issue, or remove it from the match criteria if it is not reliable.
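
The arithmetic behind this is easy to check by hand. The sketch below assumes the overall score is a weighted average of the per-field scores, which may not be exactly how the engine combines them; the field names and weights are made up for the example.

```python
def overall_score(field_scores: dict, weights: dict) -> float:
    """Weighted average of per-field scores (each 0.0 to 1.0)."""
    total_weight = sum(weights.values())
    weighted_sum = sum(field_scores[name] * w for name, w in weights.items())
    return weighted_sum / total_weight

# Email never matches; name and city match perfectly.
field_scores = {"email": 0.0, "name": 1.0, "city": 1.0}

# Email carries 80 of 100 points: overall score is 0.2.
heavy = overall_score(field_scores, {"email": 80, "name": 10, "city": 10})

# Email reduced to 20 of 100 points: overall score is 0.8.
rebalanced = overall_score(field_scores, {"email": 20, "name": 40, "city": 40})

print(heavy, rebalanced)
```

Two perfect field matches cannot lift the pair above 20% while the failing field holds 80% of the weight.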

6. Comparing Encrypted or Hashed Values

If a field contains hashed, encrypted, or tokenized values, fuzzy matching is meaningless: two different hash values look completely unrelated even when the underlying values differ by a single character.

Fix: Only use Exact matching for hashed fields. If the same hash algorithm was applied to both sources, exact matching will correctly identify matches. Fuzzy matching on hashes should be disabled.
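
A small demonstration using SHA-256 from Python's standard hashlib. The email addresses are invented, and difflib again stands in for fuzzy similarity, so treat the numbers as indicative only.

```python
import hashlib
from difflib import SequenceMatcher

def sha256_hex(value: str) -> str:
    """Hex digest of the UTF-8 encoded value."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

def similarity(a: str, b: str) -> float:
    """Character-based similarity, 0.0 to 1.0 (stand-in measure)."""
    return SequenceMatcher(None, a, b).ratio()

plain_a, plain_b = "john.smith@example.com", "jon.smith@example.com"
hash_a, hash_b = sha256_hex(plain_a), sha256_hex(plain_b)

# The plain values differ by one character and look almost identical.
plain_similarity = similarity(plain_a, plain_b)

# Their hashes share only coincidental characters.
hash_similarity = similarity(hash_a, hash_b)

print(f"plain:  {plain_similarity:.2f}")
print(f"hashed: {hash_similarity:.2f}")

# Exact matching still works when both sources hash the same value the same way.
print(sha256_hex(plain_a) == hash_a)
```

Any similarity the hashed values show is noise, so a fuzzy score on them says nothing about the underlying data. Exact equality is the only meaningful comparison.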

Quick test: Pick two records you know are the same person and look at their individual field scores in the Detailed Analysis view. This will immediately show you which field is the source of low scores.