Importing Multiple Datasources

Many real-world matching scenarios involve comparing records across two or more datasets. For example, you might need to match a customer list from your CRM against a mailing list from a marketing platform, or deduplicate records that exist in both an internal database and a third-party file. MatchLogic supports importing multiple datasources into a single project to enable these cross-source matching workflows.

[Screenshot: datasource list showing three datasources from different sources (CSV, database, Excel)]

How to Add Additional Datasources

  1. Navigate to the Data Import page for your project.
  2. Click the Add New Data button at the top of the datasource list.
  3. Select the connector type for your next datasource. It does not need to be the same type as your first import. For example, your first datasource can be a CSV file and your second can be a PostgreSQL table.
  4. Follow the standard import workflow: upload or connect, configure column mapping, and import.
  5. The new datasource will appear alongside your existing datasources in the list.

Repeat this process for each additional datasource you need.

Mixing Connector Types

MatchLogic treats all imported datasources the same way internally, regardless of their original source. This means you can freely mix connector types within a single project:

  • One datasource from a CSV file and another from a SQL Server database
  • One from an S3 bucket and another from Google Drive
  • Multiple Excel files from different departments
  • A combination of FTP files and direct database connections

All datasources are stored in MatchLogic's internal database after import, so downstream modules work with them uniformly.
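To illustrate what "uniform" means here, the following is a minimal conceptual sketch, not MatchLogic's actual import code. It reads one datasource from CSV text and another from a SQLite table, then reduces both to the same shape (a list of dictionaries keyed by column name). The function names `rows_from_csv` and `rows_from_sqlite`, the table name, and the sample data are all hypothetical.

```python
import csv
import io
import sqlite3


def rows_from_csv(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def rows_from_sqlite(conn, table):
    """Read every row of a table into a list of dicts keyed by column name.

    The table name is interpolated into the query, so pass trusted names only.
    """
    conn.row_factory = sqlite3.Row
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    return [dict(row) for row in cursor]


# Hypothetical source 1: a CSV export.
csv_rows = rows_from_csv("FirstName,Email\nGrace,grace@example.com\n")

# Hypothetical source 2: a database table (in-memory SQLite for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (FirstName TEXT, Email TEXT)")
conn.execute("INSERT INTO customers VALUES ('Ada', 'ada@example.com')")
db_rows = rows_from_sqlite(conn, "customers")

# Once both sources share one shape, later steps need not know the origin.
combined = csv_rows + db_rows
for row in combined:
    print(row["FirstName"], row["Email"])
```

After this step, nothing downstream needs to know which connector produced a given row, which is the property the paragraph above describes.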

Why Import Multiple Datasources?

Importing multiple datasources enables cross-source matching: comparing records in one datasource against records in another. Common use cases include:

  • Cross-system deduplication — Find records that exist in multiple systems (e.g., the same customer in both Salesforce and SAP).
  • Data consolidation — Merge records from multiple sources into a single golden record.
  • Vendor matching — Match supplier lists from different procurement platforms.
  • Data migration validation — Verify that records migrated correctly by matching source and target datasets.

Tip

When importing multiple datasources that you plan to match against each other, try to use consistent column names for equivalent fields. For example, if both sources have a first name field, name them both "FirstName" during column mapping. This makes the field mapping step in Match Definitions much smoother.
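If you prefer to align column names in the files themselves before importing, a small script can do it. The sketch below is one possible approach under stated assumptions: the `ALIASES` table, the function names, and the sample headers are hypothetical and would need to be adapted to your own data. It is not part of MatchLogic.

```python
import csv
import io

# Hypothetical alias table: source-specific header spellings -> one shared name.
ALIASES = {
    "first_name": "FirstName",
    "fname": "FirstName",
    "firstname": "FirstName",
    "last_name": "LastName",
    "lname": "LastName",
    "lastname": "LastName",
    "email": "Email",
}


def harmonize_header(header):
    """Replace known aliases with their shared name; keep unknown names as-is."""
    result = []
    for name in header:
        key = name.strip().lower().replace(" ", "_")
        result.append(ALIASES.get(key, name.strip()))
    return result


def harmonize_csv(text):
    """Rewrite only the header row of CSV text; data rows pass through unchanged."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return text
    rows[0] = harmonize_header(rows[0])
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


crm = "first_name,last_name,email\nAda,Lovelace,ada@example.com\n"
mailing = "FName,LName,Email\nAda,Lovelace,ada@example.com\n"

# Both files end up with the same header row.
print(harmonize_csv(crm).splitlines()[0])
print(harmonize_csv(mailing).splitlines()[0])
```

Headers that are not in the alias table are left untouched, so the script never silently renames a column it does not recognize.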

Single-Source Deduplication

You do not need multiple datasources to use MatchLogic. A single datasource is sufficient for within-source deduplication, where you are looking for duplicate records within the same file or table. In the Match Configuration stage, you can configure matching within a single datasource.
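The difference between within-source and cross-source matching comes down to which record pairs are candidates for comparison. The sketch below is purely conceptual and says nothing about how MatchLogic selects or scores pairs internally; the function names and record IDs are hypothetical.

```python
from itertools import combinations, product


def within_source_pairs(records):
    """Every unordered pair of distinct records from one datasource."""
    return list(combinations(records, 2))


def cross_source_pairs(left, right):
    """Every pair that takes one record from each of two datasources."""
    return list(product(left, right))


crm = ["crm-1", "crm-2", "crm-3"]
mailing = ["mail-1", "mail-2"]

# Within-source deduplication: n * (n - 1) / 2 candidate pairs.
print(len(within_source_pairs(crm)))

# Cross-source matching: n * m candidate pairs.
print(len(cross_source_pairs(crm, mailing)))
```

With three CRM records and two mailing-list records, the first count is 3 and the second is 6.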

Project Scope

Datasources can only be matched against each other if they belong to the same project; matching across projects is not supported. If the data you need to match currently lives in separate projects, import all of it into a single project.

Important

Adding or removing datasources after you have configured matching may require you to update your Match Configuration and Match Definitions. MatchLogic will guide you if changes are needed.

After importing all your datasources, proceed to the Project Dashboard (https://help.matchlogic.io/article/207-the-project-dashboard) to verify your project statistics, or move directly to the Data Profiling stage to analyze data quality across your sources.