Duplicate Field

The Duplicate Field operation copies an existing column in your datasource into a new column that holds identical values under a new name. This is an important preparatory step when you want to apply destructive transformations (such as removing characters or changing case) while preserving the original values for reference or for use in other matching criteria.

Why Duplicate First

Data cleansing transformations modify field values in place. Once you apply an uppercase conversion or remove special characters, the original values are overwritten. Duplicating a field before transforming it gives you two versions:

The original field with untouched values for display, reporting, or alternative matching strategies.
The duplicate field with cleansed values optimized for matching.

For example, if you have a "FullName" field and want to create a cleaned version for matching:

Duplicate "FullName" to "FullName_Clean".
Apply Trim to "FullName_Clean".
Apply Uppercase to "FullName_Clean".
Apply Remove Special Characters to "FullName_Clean".

Now "FullName" still contains "O'Brien-Smith, Jr." for display, while "FullName_Clean" contains "OBRIENSMITHJR", a value optimized for matching.
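The four steps above can be sketched outside the product in plain Python. This is only an illustration of the duplicate-then-cleanse idea, not how the node is implemented: the helper names duplicate_field and cleanse are hypothetical, rows are modeled as simple dictionaries, and "special characters" is assumed to mean anything other than the letters A-Z and digits (including spaces), which is what the example output implies.

```python
import re

def duplicate_field(rows, source, new_name):
    """Copy `source` into a new column `new_name` (hypothetical helper)."""
    if any(new_name in row for row in rows):
        raise ValueError(f"column already exists: {new_name}")
    return [{**row, new_name: row[source]} for row in rows]

def cleanse(value):
    """Trim, uppercase, then strip special characters from a string."""
    value = value.strip()   # Trim
    value = value.upper()   # Uppercase
    # Assumption: everything except A-Z and 0-9 is "special", spaces included.
    # Note this also drops accented letters such as "É".
    return re.sub(r"[^A-Z0-9]", "", value)

rows = [{"FullName": "O'Brien-Smith, Jr."}]

# Step 1: duplicate first, so the original column is never touched.
rows = duplicate_field(rows, "FullName", "FullName_Clean")

# Steps 2-4: apply the destructive transformations to the duplicate only.
for row in rows:
    row["FullName_Clean"] = cleanse(row["FullName_Clean"])

print(rows[0]["FullName"])        # O'Brien-Smith, Jr.
print(rows[0]["FullName_Clean"])  # OBRIENSMITHJR
```

Because the cleansing only ever writes to "FullName_Clean", the original column remains available for display or for a different matching strategy.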

How to Use the Duplicate Field Node

Drag the Duplicate Field node from the Standardization category onto the canvas.
Connect it into your workflow, typically early in the chain before any destructive operations.
Click the node to open its properties panel.
Select the source field -- the column you want to duplicate.
Enter a new field name for the duplicate. Use a naming convention that makes the relationship clear, such as appending "_Clean", "_Match", or "_Normalized" to the original name.

Best Practices

Place Duplicate Field nodes at the beginning of your workflow, before any transformation nodes. This ensures the duplicate contains the original, unmodified values.
Use consistent naming conventions. If you duplicate multiple fields, use the same suffix pattern (e.g., "FirstName_Clean", "LastName_Clean", "Address_Clean") so duplicated fields are easy to identify.
Duplicate before aggressive transformations. Operations like Remove Non-Alphanumeric, Remove Letters, or Remove Numbers are highly destructive and should almost always be applied to duplicate fields rather than original fields.
Apply subsequent transformations to the copy. After creating the copy, make sure your downstream nodes target the duplicated field name, not the original.
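As a rough illustration of the naming and targeting practices above, the following Python sketch duplicates several fields with one shared suffix and then applies transformations only to the duplicated names. The record, the SUFFIX constant, and the duplicate_fields helper are hypothetical and exist only for this example; in the product these choices are made in each node's properties panel.

```python
SUFFIX = "_Clean"  # one suffix for every duplicate keeps them easy to spot

def duplicate_fields(record, fields, suffix=SUFFIX):
    """Return a copy of `record` with each field in `fields` duplicated."""
    result = dict(record)
    for name in fields:
        result[name + suffix] = record[name]
    return result

record = {
    "FirstName": " ada ",
    "LastName": "Lovelace",
    "Address": "12 St. James's Sq.",
}
fields_to_clean = ["FirstName", "LastName", "Address"]

# Duplicate at the start, before any transformation runs.
record = duplicate_fields(record, fields_to_clean)

# Downstream steps target the duplicated names, never the originals.
targets = [name + SUFFIX for name in fields_to_clean]
for name in targets:
    record[name] = record[name].strip().upper()

print(record["FirstName"])        # original, still " ada "
print(record["FirstName_Clean"])  # ADA
```

Deriving the target names from the same suffix used for duplication makes it harder to transform an original field by accident.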

When Duplicating Is Not Necessary

Some transformations are non-destructive or minimally destructive, and copying may not be needed:

Trim whitespace -- Rarely changes the meaningful content of a value.
Remove Extra Whitespace -- Only collapses multiple spaces to one.
Case conversion -- Changes letter casing but not the characters themselves. The original casing cannot be restored afterward, so whether you need a duplicate depends on whether you need the original casing for display.
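To see why these three transformations differ in how much they lose, here is a small Python sketch using standard string operations as stand-ins for the corresponding nodes. The sample value is made up, and the product's own whitespace and case rules may differ in detail.

```python
import re

value = "  McDonald   LTD "

trimmed = value.strip()                     # Trim whitespace
collapsed = re.sub(r" {2,}", " ", trimmed)  # Remove Extra Whitespace
uppercased = collapsed.upper()              # Case conversion

print(repr(trimmed))     # 'McDonald   LTD'
print(repr(collapsed))   # 'McDonald LTD'
print(repr(uppercased))  # 'MCDONALD LTD'

# Re-casing cannot bring back the original mixed case:
print(repr(uppercased.title()))  # 'Mcdonald Ltd'
```

Trimming and collapsing leave the visible words intact, while the case conversion discards information ("McDonald" versus "Mcdonald") that matters if the value is shown to users.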

Tip

When configuring match definitions later, you can choose to match on the cleansed duplicate fields rather than the originals. This gives you the best of both worlds -- clean matching with original data preserved for review and export.

Important

Each Duplicate Field operation adds a new column to your datasource schema, so duplicating many fields noticeably widens the datasource. Keep this in mind when configuring field mappings in a cleansing workflow (see https://help.matchlogic.io/article/236-building-a-cleansing-workflow) and later in match definitions.
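The effect on the schema can be pictured with a short Python sketch. It models the schema as a plain list of column names; the column names and the add_duplicate_columns helper are hypothetical and are not part of the product.

```python
def add_duplicate_columns(schema, duplicates):
    """Return a new schema with one extra column per duplicated field."""
    new_schema = list(schema)
    for source, new_name in duplicates.items():
        if source not in schema:
            raise KeyError(f"unknown source field: {source}")
        if new_name in new_schema:
            raise ValueError(f"column already exists: {new_name}")
        new_schema.append(new_name)
    return new_schema

schema = ["FullName", "Email", "Phone", "Address"]
duplicates = {
    "FullName": "FullName_Clean",
    "Phone": "Phone_Clean",
    "Address": "Address_Clean",
}

wider = add_duplicate_columns(schema, duplicates)
print(len(schema), "->", len(wider))  # 4 -> 7
print(wider)
```

Every entry in the duplicates mapping becomes one more column that later field mappings and match definitions have to account for.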