Multi-column deduplication means finding duplicate records by comparing several fields at the same time.

Instead of matching only on one column, such as email, you can match on a combination:

  • First name + last name + company domain
  • Company name + city + country
  • Product name + SKU + supplier
  • LinkedIn URL + company domain
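One way to picture a combination match is as a composite key: rows are grouped by the values of all selected columns together. The sketch below is a minimal illustration, not how any particular tool implements it. The helper names (`normalize`, `group_by_columns`) and the sample column names are assumptions made for the example.

```python
from collections import defaultdict

def normalize(value):
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(str(value or "").lower().split())

def group_by_columns(rows, columns):
    """Return lists of row indexes that share the same values in every listed column."""
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        # The composite key is the tuple of normalized values for the chosen columns.
        key = tuple(normalize(row.get(col)) for col in columns)
        groups[key].append(index)
    # Only keys shared by two or more rows are duplicate groups.
    return [indexes for indexes in groups.values() if len(indexes) > 1]

# Hypothetical sample data
rows = [
    {"company": "Acme", "city": "Paris", "country": "France"},
    {"company": "Acme", "city": "Lyon", "country": "France"},
    {"company": "ACME ", "city": "paris", "country": "France"},
]

by_name_only = group_by_columns(rows, ["company"])
by_combination = group_by_columns(rows, ["company", "city", "country"])
```

With these sample rows, matching on the company name alone groups all three records, while the three-column combination keeps the Lyon office separate.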

Why use several columns

One column is not always enough.

Two contacts can share a first name. Two companies can have similar names. Emails can be missing. Multi-column deduplication reduces false matches by requiring several fields to agree.

📌 Short version

Multi-column deduplication is useful when no single field is a reliable unique identifier.

Example

These two rows might be the same contact:

| First name | Last name | Company domain |
|---|---|---|
| John | Smith | example.com |
| Jon | Smith | example.com |

The first names are not identical, but the last name and company domain match. A multi-column check can detect the duplicate if the first name column uses a fuzzy or smart algorithm.
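The John / Jon case can be sketched with a similarity ratio from Python's standard `difflib` module. This is only an approximation of what a fuzzy comparison does. The 0.8 threshold, the function names, and the column keys are assumptions chosen for the illustration.

```python
from difflib import SequenceMatcher

def similar(a, b, threshold=0.8):
    """Fuzzy comparison: True when the similarity ratio reaches the threshold."""
    a, b = a.strip().lower(), b.strip().lower()
    return SequenceMatcher(None, a, b).ratio() >= threshold

def same_text(a, b):
    """Exact comparison, ignoring case and surrounding spaces."""
    return a.strip().lower() == b.strip().lower()

def is_duplicate(row_a, row_b):
    """Fuzzy on first name, exact on last name and company domain."""
    return (
        similar(row_a["first_name"], row_b["first_name"])
        and same_text(row_a["last_name"], row_b["last_name"])
        and same_text(row_a["company_domain"], row_b["company_domain"])
    )

john = {"first_name": "John", "last_name": "Smith", "company_domain": "example.com"}
jon = {"first_name": "Jon", "last_name": "Smith", "company_domain": "example.com"}
mary = {"first_name": "Mary", "last_name": "Smith", "company_domain": "example.com"}

print(is_duplicate(john, jon))   # True: "john" and "jon" are close enough
print(is_duplicate(john, mary))  # False: the first names are too different
```

An exact comparison on all three columns would miss this pair, because "John" and "Jon" differ by one letter.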

Datablist workflow

Datablist lets you select several columns and choose a comparison algorithm for each one.

For example:

  • Exact match on company domain
  • Smart match on first name
  • Smart match on last name
  • URL processor on LinkedIn profile URL
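The idea of one algorithm per column can be sketched as a mapping from column name to comparison function. The code below is a rough stand-in and does not reproduce Datablist's actual algorithms: `smart` is approximated with a `difflib` similarity ratio, `same_url` with a simple URL normalization, and all names and column keys are hypothetical.

```python
from difflib import SequenceMatcher
from urllib.parse import urlsplit

def exact(a, b):
    """Exact match, ignoring case and surrounding spaces."""
    return a.strip().lower() == b.strip().lower()

def smart(a, b, threshold=0.8):
    """Stand-in for a smart match: tolerate small spelling differences."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio() >= threshold

def normalize_url(url):
    """Keep host and path only: drop scheme, 'www.', query string, trailing slash."""
    url = url.strip()
    if "://" not in url:
        url = "https://" + url  # urlsplit needs a scheme to find the host
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/").lower()

def same_url(a, b):
    return normalize_url(a) == normalize_url(b)

# One comparison function per column (column keys are assumed for this sketch).
RULES = {
    "company_domain": exact,
    "first_name": smart,
    "last_name": smart,
    "linkedin_url": same_url,
}

def rows_match(row_a, row_b, rules=RULES):
    """Two rows match only when every configured column agrees."""
    return all(compare(row_a[col], row_b[col]) for col, compare in rules.items())

a = {"first_name": "John", "last_name": "Smith", "company_domain": "example.com",
     "linkedin_url": "https://www.linkedin.com/in/johnsmith/"}
b = {"first_name": "Jon", "last_name": "Smith", "company_domain": "Example.com",
     "linkedin_url": "linkedin.com/in/johnsmith?trk=abc"}

print(rows_match(a, b))  # True
```

Keeping the rules in a dictionary makes it easy to tighten or loosen a single column without touching the others.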

After the check, use the Duplicates Remover to review groups, auto-merge simple duplicates, or handle conflicts manually.
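The difference between a simple duplicate and a conflict can be sketched as follows. The sketch assumes a group is "simple" when no column holds two different non-empty values, so blanks can be filled in safely. That definition and the `merge_group` helper are assumptions for illustration, not a description of the Duplicates Remover's actual rules.

```python
def merge_group(rows):
    """Merge a duplicate group.

    Returns (merged, conflicts). `conflicts` maps a column name to the list of
    distinct non-empty values found for it; conflicting columns are left out of
    `merged` so they can be resolved by hand.
    """
    # Collect column names in first-seen order.
    columns = []
    for row in rows:
        for col in row:
            if col not in columns:
                columns.append(col)

    merged, conflicts = {}, {}
    for col in columns:
        values = []
        for row in rows:
            value = (row.get(col) or "").strip()
            if value and value not in values:
                values.append(value)
        if len(values) > 1:
            conflicts[col] = values  # needs manual review
        else:
            merged[col] = values[0] if values else ""
    return merged, conflicts

# Hypothetical duplicate groups
simple_group = [
    {"name": "John Smith", "phone": "", "email": "john@example.com"},
    {"name": "John Smith", "phone": "555-0100", "email": "john@example.com"},
]
conflict_group = [
    {"name": "John Smith", "phone": "555-0100"},
    {"name": "John Smith", "phone": "555-0199"},
]

simple_result = merge_group(simple_group)
conflict_result = merge_group(conflict_group)
```

In the first group the empty phone is filled from the other row and nothing conflicts. In the second group the two phone numbers disagree, so the phone column is reported as a conflict instead of being merged.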

For two files or collections, read cross-list deduplication. For fields that contain several emails, phone numbers, tags, or URLs in one cell, read multi-value deduplication. If blank fields should change matching behavior, read empty value rules for deduplication.

For deeper examples, read the CRM cleanup guide and the data matching guide.