Multi-column deduplication means finding duplicate records by comparing several fields at the same time.
Instead of matching only on one column, such as email, you can match on a combination:
- First name + last name + company domain
- Company name + city + country
- Product name + SKU + supplier
- LinkedIn URL + company domain
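A minimal sketch of the idea in Python, using exact matching on a combination of columns. The function name `find_duplicates`, the sample rows, and the normalization (trim and lowercase) are assumptions made for illustration. This is not how Datablist implements it.

```python
from collections import defaultdict

def find_duplicates(rows, key_columns):
    """Group row indexes whose normalized values agree on every key column."""
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        # Build one composite key from all selected columns (trimmed, lowercased).
        key = tuple(str(row.get(col, "")).strip().lower() for col in key_columns)
        groups[key].append(index)
    # Keep only keys shared by more than one row.
    return [indexes for indexes in groups.values() if len(indexes) > 1]

# Hypothetical sample data.
rows = [
    {"company": "Acme", "city": "Paris", "country": "FR"},
    {"company": "acme ", "city": "paris", "country": "FR"},
    {"company": "Acme", "city": "Lyon", "country": "FR"},
]

duplicate_groups = find_duplicates(rows, ["company", "city", "country"])
print(duplicate_groups)  # [[0, 1]]
```

The third row shares the company name but not the city, so it is not grouped with the first two.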
Why use several columns
One column is not always enough.
Two contacts can share a first name. Two companies can have similar names. Emails can be missing. Multi-column deduplication reduces false matches by requiring several fields to agree.
📌 Short version
Multi-column deduplication is useful when no single field is a reliable unique identifier.
Example
These two rows might be the same contact:
| First name | Last name | Company domain |
|---|---|---|
| John | Smith | example.com |
| Jon | Smith | example.com |
The first names are not identical, but the last name and company domain match. A multi-column check can detect the duplicate if the first name column uses a fuzzy or smart algorithm.
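The check on these two rows can be sketched in Python. The fuzzy comparison here is the similarity ratio from the standard `difflib` module, used as a stand-in: it is not Datablist's smart algorithm, and the 0.8 threshold, the column keys, and the function names are assumptions.

```python
from difflib import SequenceMatcher

def similar(a, b, threshold=0.8):
    """Return True when two strings are close enough (assumed threshold)."""
    a, b = a.strip().lower(), b.strip().lower()
    return SequenceMatcher(None, a, b).ratio() >= threshold

def same_contact(row_a, row_b):
    """Exact match on last name and domain, fuzzy match on first name."""
    return (
        row_a["domain"].strip().lower() == row_b["domain"].strip().lower()
        and row_a["last_name"].strip().lower() == row_b["last_name"].strip().lower()
        and similar(row_a["first_name"], row_b["first_name"])
    )

john = {"first_name": "John", "last_name": "Smith", "domain": "example.com"}
jon = {"first_name": "Jon", "last_name": "Smith", "domain": "example.com"}
mary = {"first_name": "Mary", "last_name": "Smith", "domain": "example.com"}

john_vs_jon = same_contact(john, jon)    # True: "john" and "jon" are close
john_vs_mary = same_contact(john, mary)  # False: first names differ too much
```

An exact match on all three columns would miss the John/Jon pair. A fuzzy match on first name alone would pair people from different companies. Combining the two gives a narrower result.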
Datablist workflow
Datablist lets you select several columns and choose a comparison algorithm for each one.
For example:
- Exact match on company domain
- Smart match on first name
- Smart match on last name
- URL processor on LinkedIn profile URL
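Outside the tool, the same setup can be pictured as one comparison function per column. The sketch below is only an illustration: the `rules` mapping, the function names, and the URL normalization (drop scheme, `www.`, query string, and trailing slash, then lowercase) are assumptions, and the fuzzy ratio stands in for the smart algorithm.

```python
from difflib import SequenceMatcher
from urllib.parse import urlsplit

def exact(a, b):
    return a.strip().lower() == b.strip().lower()

def fuzzy(a, b, threshold=0.8):
    # Stand-in for a "smart" comparison; the threshold is an assumption.
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio() >= threshold

def normalize_url(url):
    """Reduce a URL to host + path so cosmetic differences are ignored."""
    url = url.strip().lower()
    if "://" not in url:
        url = "https://" + url  # lets urlsplit find the host
    parts = urlsplit(url)
    host = parts.netloc[4:] if parts.netloc.startswith("www.") else parts.netloc
    return host + parts.path.rstrip("/")

def url_match(a, b):
    return normalize_url(a) == normalize_url(b)

# One comparison rule per column (hypothetical column names).
rules = {
    "domain": exact,
    "first_name": fuzzy,
    "last_name": fuzzy,
    "linkedin": url_match,
}

def is_duplicate(row_a, row_b, rules):
    """Two rows match only if every configured column agrees."""
    return all(compare(row_a[col], row_b[col]) for col, compare in rules.items())

a = {"domain": "example.com", "first_name": "John", "last_name": "Smith",
     "linkedin": "https://www.linkedin.com/in/john-smith/?trk=abc"}
b = {"domain": "Example.com", "first_name": "Jon", "last_name": "Smith",
     "linkedin": "http://linkedin.com/in/John-Smith"}

match = is_duplicate(a, b, rules)  # True
```

Keeping the rules in a mapping makes it easy to change the algorithm for one column without touching the others.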
After the check, use the Duplicates Remover to review groups, auto-merge simple duplicates, or handle conflicts manually.
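To show the difference between a simple duplicate and a conflict, here is a rough Python sketch. It assumes a group is "simple" when no column holds two different non-empty values. That definition, the `merge_group` name, and the sample rows are assumptions for illustration and do not describe the Duplicates Remover's actual rules.

```python
def merge_group(rows):
    """Merge a group of duplicate rows; report columns with conflicting values."""
    merged, conflicts = {}, {}
    for row in rows:
        for column, value in row.items():
            value = value.strip()
            if not value:
                continue  # empty cells never override or conflict
            current = merged.get(column)
            if current is None:
                merged[column] = value
            elif current.lower() != value.lower():
                # Two different non-empty values: leave it for manual review.
                conflicts.setdefault(column, {current}).add(value)
    return merged, conflicts

# Simple duplicate: the rows complete each other.
simple = [
    {"email": "john@example.com", "phone": ""},
    {"email": "john@example.com", "phone": "555-0100"},
]
merged, conflicts = merge_group(simple)
# merged == {"email": "john@example.com", "phone": "555-0100"}, conflicts == {}

# Conflict: two different phone numbers.
conflicting = [
    {"email": "john@example.com", "phone": "555-0100"},
    {"email": "john@example.com", "phone": "555-0199"},
]
_, phone_conflicts = merge_group(conflicting)
# phone_conflicts == {"phone": {"555-0100", "555-0199"}}
```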
For two files or collections, read cross-list deduplication. For fields that contain several emails, phone numbers, tags, or URLs in one cell, read multi-value deduplication. If blank fields should change matching behavior, read empty value rules for deduplication.
For deeper examples, read the CRM cleanup guide and the data matching guide.