N-Datasource Matching

When your project contains three or more datasources, MatchLogic supports multi-source matching, where records from all sources can be compared against each other. This is commonly needed in enterprise scenarios where customer or product data is spread across multiple systems and must be reconciled.

Field mapping table with three or more datasource columns, showing one column per datasource where equivalent fields are mapped across all sources on the same row

How Multi-Source Field Mapping Works

With two datasources, field mapping is straightforward: you map a field in Source A to a field in Source B. With three or more datasources, the field mapping table expands to show one column per datasource. Each row in the mapping table represents an equivalent field across all sources.

For example, with three datasources (CRM, ERP, and Marketing List), a mapping row might look like:

CRM            | ERP           | Marketing List
CustomerName   | AccountName   | FullName
EmailAddr      | Email         | ContactEmail
Phone          | PhoneNumber   | Mobile

This tells MatchLogic that CustomerName, AccountName, and FullName all represent the same concept and should be compared when records from these datasources are evaluated against each other.
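To make the row-per-concept idea concrete, here is a minimal Python sketch of the mapping table above. The structure and the helper name columns_for_pair are assumptions made for illustration; they are not MatchLogic's internal format or API.

```python
# Hypothetical in-memory model of the mapping table shown above.
# Each dict is one mapping row: datasource name -> column name.
FIELD_MAPPING = [
    {"CRM": "CustomerName", "ERP": "AccountName", "Marketing List": "FullName"},
    {"CRM": "EmailAddr", "ERP": "Email", "Marketing List": "ContactEmail"},
    {"CRM": "Phone", "ERP": "PhoneNumber", "Marketing List": "Mobile"},
]


def columns_for_pair(row, source_a, source_b):
    """Return the two column names to compare for one mapping row
    when records from source_a are evaluated against source_b."""
    return row[source_a], row[source_b]


# The name row, resolved for the CRM vs Marketing List comparison.
name_columns = columns_for_pair(FIELD_MAPPING[0], "CRM", "Marketing List")
print(name_columns)
```

Reading one row left to right gives the equivalent columns; picking two datasources from that row gives the columns used for that particular comparison.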

Definitions Apply Across All Pairs

Each match definition you create applies across all configured datasource pairs. If you have three datasources and the "Find All Duplicates" strategy enabled, MatchLogic generates a pair for every combination: three cross-file pairs (CRM vs ERP, CRM vs Marketing List, ERP vs Marketing List) plus one within-file pair for each datasource, six pairs in total. Your definitions are evaluated against every enabled pair.

This means a single definition with criteria for Name and Email is used to compare CRM records against ERP records, CRM against Marketing List, and ERP against Marketing List. The field mapping tells the system which column to use for each datasource in each comparison.
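The pair generation described above can be sketched in a few lines of Python. This is an illustration of the combinatorics only, assuming a within-file pair is written as a datasource paired with itself; the function name all_duplicate_pairs is hypothetical and is not part of MatchLogic.

```python
from itertools import combinations


def all_duplicate_pairs(datasources):
    """Enumerate within-file pairs plus every cross-file pair,
    mirroring what the "Find All Duplicates" strategy is described as doing."""
    within = [(name, name) for name in datasources]  # one per datasource
    cross = list(combinations(datasources, 2))       # every unordered pair
    return within + cross


pairs = all_duplicate_pairs(["CRM", "ERP", "Marketing List"])
for source_a, source_b in pairs:
    print(f"{source_a} vs {source_b}")
```

For three datasources this yields three within-file pairs and three cross-file pairs. Each definition would then be evaluated once per enabled pair from this list.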

Network diagram showing three or more datasource nodes with edges connecting all configured pairs, illustrating the full matching topology of an n-datasource project

Setting Up N-Datasource Matching

  1. Import all datasources into the project via the Data Import module
  2. Configure pairs in Match Configuration. Choose a strategy and enable the pairs you want (see https://help.matchlogic.io/article/250-selecting-a-matching-strategy and https://help.matchlogic.io/article/251-configuring-datasource-pairs)
  3. Map fields across all datasources. Each row should map equivalent fields. Use the Suggest Mapping feature as a starting point (see https://help.matchlogic.io/article/256-field-mapping-between-datasources)
  4. Create definitions using the mapped fields. Each criterion references a mapping row, and MatchLogic resolves the correct column for each datasource pair automatically
  5. Run the match and review results. Groups may contain records from multiple datasources

Tip

Use the network diagram view (https://help.matchlogic.io/article/252-using-the-network-diagram) to visualize your matching topology when you have many datasources. It shows at a glance which sources are being compared and is much easier to read than a long table of pairs.

Considerations for Large Multi-Source Projects

Matching complexity grows with the number of datasources. With N datasources and the "Find All Duplicates" strategy, there are N within-file pairs plus N*(N-1)/2 cross-file pairs, so the number of possible pairs is N + N*(N-1)/2. For five datasources, this is 5 + 10 = 15 pairs. Keep the following in mind:

  • Processing time increases — more pairs means more comparisons. Disable unnecessary pairs to improve performance.
  • Field mapping must be consistent — make sure equivalent fields are mapped on the same row across all datasources. Inconsistent mappings lead to incorrect comparisons.
  • Not every datasource needs every field — if a datasource does not have an equivalent field for a mapping row, leave that cell empty. The criterion will be skipped for pairs involving that datasource.
  • Groups can span multiple sources — with Merge Overlapping Groups enabled (https://help.matchlogic.io/article/264-merge-overlapping-groups), a single group may contain records from all datasources.
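Two of the points above lend themselves to a quick check: the pair-count formula and the rule that a criterion is skipped when a datasource has no mapped field. The Python sketch below is illustrative only. The function names, the use of None for an empty cell, and the TaxId/VatNumber row are assumptions, not MatchLogic behavior or data from this article.

```python
def pair_count(n):
    """Within-file pairs (n) plus cross-file pairs (n * (n - 1) / 2)."""
    return n + n * (n - 1) // 2


def applicable_rows(mapping_rows, source_a, source_b):
    """Keep only the mapping rows where both datasources have a column.
    Rows with an empty cell for either side are skipped for this pair."""
    return [row for row in mapping_rows if row.get(source_a) and row.get(source_b)]


# Hypothetical mapping: Marketing List has no tax identifier field,
# so its cell in the second row is left empty (None here).
rows = [
    {"CRM": "CustomerName", "ERP": "AccountName", "Marketing List": "FullName"},
    {"CRM": "TaxId", "ERP": "VatNumber", "Marketing List": None},
]

print(pair_count(5))
print(len(applicable_rows(rows, "CRM", "ERP")))
print(len(applicable_rows(rows, "CRM", "Marketing List")))
```

In this sketch both rows apply to CRM vs ERP, while only the name row applies to any pair involving Marketing List.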

Important

Ensure every datasource is covered by at least one enabled pair. The coverage validation (https://help.matchlogic.io/article/254-pair-coverage-validation) will prevent you from proceeding if any datasource is orphaned. With many datasources, it is easy to accidentally leave one uncovered.
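The idea behind the coverage check can be sketched as follows. This is not MatchLogic's validation code; it is a minimal Python illustration, assuming that any enabled pair (including a within-file pair) counts as covering the datasources it names. The function name orphaned_datasources and the Billing datasource are hypothetical.

```python
def orphaned_datasources(datasources, enabled_pairs):
    """Return datasources that do not appear in any enabled pair,
    preserving the order in which the datasources were listed."""
    covered = {name for pair in enabled_pairs for name in pair}
    return [name for name in datasources if name not in covered]


sources = ["CRM", "ERP", "Marketing List", "Billing"]
enabled = [("CRM", "ERP"), ("ERP", "Marketing List")]

# Billing is not part of any enabled pair, so it is reported as orphaned.
print(orphaned_datasources(sources, enabled))
```

Running a check like this against your own list of enabled pairs is a quick way to see which datasource was left out before the validation step stops you.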

Match results showing a group containing records from three different datasources, with datasource name labels visible on each record row

Common Use Cases

  • Customer master data management — reconcile customer records across CRM, billing, support, and marketing systems
  • Vendor deduplication — match vendor records across procurement, accounts payable, and supplier portals
  • Product catalog merging — link product entries across multiple inventory and e-commerce systems
  • Data migration — before consolidating systems, identify overlapping records across all legacy sources