Match Type — Similar Text
Similar Text, also known as fuzzy matching, is a comparison method that tolerates minor differences between values. Instead of requiring an exact character-by-character match, it calculates a similarity score that reflects how close two values are to each other.
How Similar Text Works
Similar Text uses edit-distance algorithms to measure how many changes (insertions, deletions, substitutions) are needed to transform one value into the other. The result is a similarity score from 0% to 100%, where 100% means the values are identical and lower scores indicate increasing differences.
Examples:
- "John" vs "Jon": approximately 85% similarity (one character difference)
- "Robert" vs "Robart": approximately 83% similarity (one substitution)
- "123 Main Street" vs "123 Main St": high similarity despite the abbreviation
- "Smith" vs "Johnson": very low similarity (completely different values)
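To make the scoring concrete, here is a minimal Python sketch of one way to turn an edit distance into a percentage. The exact formula the product uses is not documented in this article, so everything here is an assumption: the function names `edit_distance` and `similarity` are hypothetical, a substitution is counted as a deletion plus an insertion (cost 2), and the distance is normalized by the combined length of both values. That particular choice happens to land close to the approximate figures quoted above.

```python
def edit_distance(a: str, b: str, substitution_cost: int = 2) -> int:
    """Weighted edit distance. Insertions and deletions cost 1 each.

    Assumption: a substitution costs 2 (one deletion plus one insertion).
    """
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into an empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else substitution_cost
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # keep or substitute
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Similarity as a percentage from 0.0 (nothing shared) to 100.0 (identical)."""
    total = len(a) + len(b)
    if total == 0:
        return 100.0  # two empty values are treated as identical
    return 100.0 * (1 - edit_distance(a, b) / total)


if __name__ == "__main__":
    pairs = [
        ("John", "Jon"),                    # 85.7%
        ("Robert", "Robart"),               # 83.3%
        ("123 Main Street", "123 Main St"), # 84.6%
        ("Smith", "Johnson"),               # 16.7%
    ]
    for a, b in pairs:
        print(f"{a!r} vs {b!r}: {similarity(a, b):.1f}%")
```

Other normalizations are equally plausible. Dividing a classic Levenshtein distance by the longer value's length, for instance, would score "John" vs "Jon" at 75%, so treat the numbers from this sketch as illustrative only.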
Configuring Match Strictness
The Match Strictness slider controls the minimum similarity score required for two values to be considered a match. This is the most important setting for fuzzy matching:
- Lower strictness (e.g., 60-70%) — more lenient. Catches more potential matches, including ones with significant differences. Higher risk of false positives.
- Higher strictness (e.g., 85-95%) — stricter. Only very similar values match. Fewer results but higher accuracy.
Tip
A good starting point for name fields is 75-85% strictness. For address fields, try 70-80%. Run a match, review the results, and adjust the strictness up or down based on the quality of matches you see.
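The effect of the strictness setting can be sketched as a simple threshold on the similarity score. This is an illustration only: it uses Python's standard-library `difflib.SequenceMatcher` as a stand-in for the product's scoring, the helper names `similarity` and `find_matches` are hypothetical, and it assumes a score exactly equal to the strictness counts as a match.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in similarity score from 0.0 to 100.0 (not the product's formula)."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def find_matches(pairs, strictness: float):
    """Return the pairs whose score meets or exceeds the strictness threshold."""
    return [(a, b) for a, b in pairs if similarity(a, b) >= strictness]


if __name__ == "__main__":
    pairs = [
        ("Micheal", "Michael"),  # typo: two letters transposed
        ("Smith", "Johnson"),    # unrelated values
        ("Jon", "Jon"),          # identical values
    ]
    # Lenient setting: the typo pair is caught along with the identical pair.
    print(find_matches(pairs, strictness=70))
    # Strict setting: only the identical pair remains.
    print(find_matches(pairs, strictness=90))
```

Running the same pairs at several thresholds like this mirrors the advice in the tip: start in the middle of the range, inspect what is caught and what is missed, then adjust.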
Best Use Cases
Similar Text is the right choice for fields that may contain:
- Typos — data entry errors like "Micheal" instead of "Michael"
- Abbreviations — "St" vs "Street", "Corp" vs "Corporation"
- Minor formatting differences — extra spaces, missing punctuation
- Transliterations — slightly different spellings of the same word
When to Use a Different Match Type
If the values sound alike but are spelled differently (such as "Catherine" vs "Katherine"), consider using the Sounds Alike match type (#match-type-sounds-alike) instead. If the values are structured identifiers that must be identical, use Exact Match (https://help.matchlogic.io/article/257-match-type-exact-match).
Important
Very short field values (1-3 characters) can produce misleading similarity scores, because each differing character shifts the score by a large step. For example, "Al" and "Bo" differ in both characters and score 0%, even though both are short names. For very short fields, Exact Match may be more appropriate.
Combining with Other Match Types
Similar Text works well as one criterion among several in a definition. For example, pair a fuzzy name comparison with an exact postal code match to find people with similar names at the same location. See #setting-field-weights to learn how to balance multiple criteria.
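As a rough illustration of the name-plus-postal-code pairing described above, the sketch below requires an exact postal code match before applying a fuzzy name comparison. It is not how the product evaluates a definition: the record layout, the field names `name` and `postal_code`, the function names, and the default strictness of 80 are all hypothetical, the scoring uses Python's `difflib.SequenceMatcher` as a stand-in, and field weights are left out entirely.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Stand-in fuzzy score from 0.0 to 100.0, ignoring letter case."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_probable_match(rec_a: dict, rec_b: dict, name_strictness: float = 80.0) -> bool:
    """Both criteria must hold: exact postal code AND a similar-enough name."""
    # Exact criterion: postal codes must be identical.
    if rec_a["postal_code"] != rec_b["postal_code"]:
        return False
    # Fuzzy criterion: names only need to clear the strictness threshold.
    return name_similarity(rec_a["name"], rec_b["name"]) >= name_strictness


if __name__ == "__main__":
    a = {"name": "Jon Smith", "postal_code": "90210"}
    b = {"name": "John Smith", "postal_code": "90210"}
    c = {"name": "John Smith", "postal_code": "10001"}
    print(is_probable_match(a, b))  # True: similar names, same postal code
    print(is_probable_match(a, c))  # False: postal codes differ
```

Checking the exact criterion first keeps the fuzzy comparison from linking similarly named people at different locations.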