Custom Regex Patterns


MatchLogic uses regex (regular expression) patterns to validate field values during data profiling. While the platform includes built-in patterns for common data types, you can create custom patterns to validate domain-specific formats unique to your organization.

Important

This is an advanced feature. Custom regex patterns require familiarity with regular expression syntax. If you are not comfortable writing regex, the built-in patterns handle most common validation scenarios.

Accessing Pattern Options

To manage regex patterns, open the Pattern Options modal from the Data Profiling page. The modal has three tabs:

  1. Default Patterns -- Built-in patterns that ship with MatchLogic. These are read-only and cover common formats like email addresses, phone numbers, dates, postal codes, and URLs.
  2. Custom Patterns -- Patterns you have created. These are fully editable and can be toggled on or off.
  3. Add New Pattern -- A form for creating new custom patterns.

Built-In Default Patterns

The default patterns include validators for:

  • Email addresses
  • US and international phone numbers
  • US Social Security numbers
  • ZIP codes (5-digit and ZIP+4)
  • Date formats (multiple variations)
  • URLs
  • IP addresses

Default patterns cannot be edited or deleted, but you can toggle them on or off to control which patterns are applied during profiling.

Creating a Custom Pattern

To create a new pattern:

  1. Open the Pattern Options modal.
  2. Select the Add New Pattern tab.
  3. Enter a Pattern Name -- a descriptive label like "Employee ID" or "Product SKU".
  4. Enter the Regex Pattern -- the regular expression that values should match.
  5. Click Save.
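Before you save a pattern, it can help to try it outside MatchLogic. The following is a minimal sketch in Python, not part of the product: the helper name check_pattern and the sample values are hypothetical, and it assumes MatchLogic's regex engine treats this simple syntax the same way Python's re module does, which the documentation does not state.

```python
import re

def check_pattern(pattern, should_match=(), should_not_match=()):
    """Return a list of problems found with a candidate regex (empty if none)."""
    try:
        compiled = re.compile(pattern)
    except re.error as exc:
        # A syntax error means the pattern cannot be used at all.
        return [f"does not compile: {exc}"]
    problems = []
    for value in should_match:
        if not compiled.fullmatch(value):
            problems.append(f"expected a match: {value!r}")
    for value in should_not_match:
        if compiled.fullmatch(value):
            problems.append(f"unexpected match: {value!r}")
    return problems

# Hypothetical "Employee ID" pattern with sample values (illustration only).
print(check_pattern(r"^EMP-\d{6}$", ["EMP-001234"], ["EMP-1234", "emp-001234"]))

# An unbalanced parenthesis is reported as a compile problem.
print(check_pattern(r"^EMP-(\d{6}$", ["EMP-001234"]))
```

An empty list means the pattern compiled and behaved as expected on the samples you supplied. It does not guarantee identical behavior inside MatchLogic.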

Example Custom Patterns

Here are examples of common custom patterns:

  • SSN format: ^\d{3}-\d{2}-\d{4}$   -- Matches Social Security numbers in XXX-XX-XXXX format.
  • Employee ID: ^EMP-\d{6}$   -- Matches IDs like EMP-001234.
  • Product SKU: ^[A-Z]{2,4}-\d{4,8}$   -- Matches codes like ABC-12345.
  • Canadian postal code: ^[A-Za-z]\d[A-Za-z]\s?\d[A-Za-z]\d$   -- Matches formats like K1A 0B1.
  • US state abbreviation: ^[A-Z]{2}$   -- Matches any two uppercase letters, the format used by state codes. It does not check that the code belongs to a real state.
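The example patterns above can be exercised against sample values. This sketch uses Python's re module as a stand-in for MatchLogic's engine, which is an assumption. The "invalid" sample for each pattern is made up for illustration.

```python
import re

# name -> (pattern from the list above, a value that should match, one that should not)
EXAMPLES = {
    "SSN format": (r"^\d{3}-\d{2}-\d{4}$", "123-45-6789", "123456789"),
    "Employee ID": (r"^EMP-\d{6}$", "EMP-001234", "EMP-1234"),
    "Product SKU": (r"^[A-Z]{2,4}-\d{4,8}$", "ABC-12345", "abc-12345"),
    "Canadian postal code": (r"^[A-Za-z]\d[A-Za-z]\s?\d[A-Za-z]\d$", "K1A 0B1", "K1A-0B1"),
    "US state abbreviation": (r"^[A-Z]{2}$", "NY", "ny"),
}

def run_examples():
    """Return {name: (valid sample matched, invalid sample matched)}."""
    results = {}
    for name, (pattern, good, bad) in EXAMPLES.items():
        regex = re.compile(pattern)
        results[name] = (bool(regex.search(good)), bool(regex.search(bad)))
    return results

for name, (good_ok, bad_ok) in run_examples().items():
    print(f"{name}: accepts valid sample={good_ok}, accepts invalid sample={bad_ok}")
```

Each pattern starts with ^ and ends with $, so the whole value has to fit the format. Without those anchors, a value that merely contains a matching substring would also pass.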

Managing Custom Patterns

On the Custom Patterns tab, you can:

  • Toggle patterns on/off -- Disabled patterns are not applied during profiling. This lets you keep patterns for future use without affecting current analysis.
  • Edit patterns -- Update the name or regular expression of any custom pattern.
  • Delete patterns -- Remove patterns you no longer need.
  • Reset to defaults -- Restore the built-in default patterns to their original settings, for example after toggling some of them off.

How Patterns Affect Profiling

When you run a data profile, every enabled pattern (default and custom) is tested against every field. Fields whose values match a pattern are reported as valid for that pattern type. These results drive the validity scores and pattern classification shown in the Validity and Pattern Analysis charts (https://help.matchlogic.io/article/228-validity-and-pattern-analysis).
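As a rough illustration of how enabled patterns could translate into a per-field score, the sketch below computes the share of values in one column that match each enabled pattern. This is an assumption made for illustration, not MatchLogic's documented formula: the function name match_rates, the choice to skip empty values, and the whitespace trimming are all hypothetical.

```python
import re

def match_rates(values, patterns):
    """Share of non-empty values that fully match each enabled pattern.

    patterns maps a pattern name to (regex, enabled). Disabled patterns
    are skipped, mirroring the on/off toggle described above.
    """
    # Assumption: empty or whitespace-only values are left out of the score.
    non_empty = [v for v in values if v and v.strip()]
    rates = {}
    for name, (regex, enabled) in patterns.items():
        if not enabled:
            continue
        compiled = re.compile(regex)
        hits = sum(1 for v in non_empty if compiled.fullmatch(v.strip()))
        rates[name] = hits / len(non_empty) if non_empty else 0.0
    return rates

patterns = {
    "Employee ID": (r"^EMP-\d{6}$", True),
    "US state abbreviation": (r"^[A-Z]{2}$", False),  # toggled off
}
column = ["EMP-001234", "EMP-000001", "emp-12", "", "EMP-999999"]
print(match_rates(column, patterns))  # {'Employee ID': 0.75}
```

Three of the four non-empty values fit the Employee ID format, so the rate is 0.75. The disabled pattern contributes nothing to the result.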

Tip

Start with the default patterns and add custom patterns only when you have domain-specific formats that the defaults do not cover. Every enabled pattern is tested against every field, so a large number of custom patterns can slow down profiling on very large data sources.