What is multi-column deduplication?

Question

Florian Poullin · Accepted Answer

Multi-column deduplication finds duplicate records by comparing several properties in the same check.

Useful combinations include:

First name + last name + company domain
Company name + city + country
Product name + SKU + supplier
LinkedIn URL + company domain

Why compare several columns

One property is not always enough. Two contacts can share a name, and two companies can have similar names. Adding a stable identifier, such as a company domain, reduces false matches.

Take two example records, John Smith and Jon Smith, both at example.com. The first names differ, but the last name and domain agree. A multi-column check can return the pair for review.
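The John/Jon comparison can be sketched in a few lines of Python. This is a minimal illustration, not Datablist's implementation: the function name `is_candidate_pair`, the field names, the 0.8 threshold, and the use of `difflib` for the fuzzy first-name comparison are all assumptions made for the example, and the Jane record is made-up data.

```python
from difflib import SequenceMatcher


def normalize(value):
    """Trim whitespace and lowercase so cosmetic differences do not block a match."""
    return value.strip().lower()


def is_candidate_pair(a, b, name_threshold=0.8):
    """Flag two records for review when last name and domain match exactly
    and the first names are similar enough (threshold is an assumption)."""
    if normalize(a["last_name"]) != normalize(b["last_name"]):
        return False
    if normalize(a["domain"]) != normalize(b["domain"]):
        return False
    ratio = SequenceMatcher(
        None, normalize(a["first_name"]), normalize(b["first_name"])
    ).ratio()
    return ratio >= name_threshold


john = {"first_name": "John", "last_name": "Smith", "domain": "example.com"}
jon = {"first_name": "Jon", "last_name": "Smith", "domain": "example.com"}
jane = {"first_name": "Jane", "last_name": "Smith", "domain": "example.org"}  # hypothetical

print(is_candidate_pair(john, jon))   # True: same last name and domain, close first names
print(is_candidate_pair(john, jane))  # False: the domains differ
```

A name-only check would treat all three Smiths as related. Requiring the domain to agree is what keeps Jane out of the review queue.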

📌 Tradeoff

More properties can reduce false matches, but they can also hide real duplicates when one property is empty or inconsistent.
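The tradeoff is easy to reproduce with exact matching on a composite key. The sketch below uses made-up company records and a hypothetical helper, `group_by_key`. It shows how an empty city hides a likely duplicate under a three-column key, while a two-column key finds it but also pulls in a record from another city.

```python
from collections import defaultdict


def group_by_key(records, columns):
    """Group record ids by the normalized values of the chosen columns.
    Only groups with more than one record are returned."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[column].strip().lower() for column in columns)
        groups[key].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical sample data: record 2 is missing its city.
companies = [
    {"id": 1, "name": "Acme", "city": "Paris", "country": "France"},
    {"id": 2, "name": "Acme", "city": "", "country": "France"},
    {"id": 3, "name": "Acme", "city": "Lyon", "country": "France"},
]

strict = group_by_key(companies, ["name", "city", "country"])
loose = group_by_key(companies, ["name", "country"])

print(strict)  # []: the empty city keeps records 1 and 2 apart
print(loose)   # [[1, 2, 3]]: the duplicate is found, but so is the Lyon record
```

Neither key is right on its own, which is why the results are best treated as candidates for review rather than automatic merges.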

Datablist workflow

Select several properties and configure each comparison independently. For example:

Exact on company domain
Smart on first name
Smart on last name
URL processor on LinkedIn profile URL

All selected properties contribute to the final similarity score. Test the configuration on known duplicate and non-duplicate records before bulk processing.
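To make the idea of per-property comparisons feeding one score concrete, here is a rough Python sketch. It does not reproduce Datablist's algorithms or scoring: `difflib` stands in for the Smart comparison, a simple URL normalizer stands in for the URL processor, and the equal-weight average and the 0.85 threshold are assumptions. The records and profile URLs are invented test data.

```python
from difflib import SequenceMatcher
from urllib.parse import urlsplit


def exact(a, b):
    """1.0 when the normalized values are identical, else 0.0."""
    return 1.0 if a.strip().lower() == b.strip().lower() else 0.0


def fuzzy(a, b):
    """Similarity ratio between 0.0 and 1.0 (stand-in for a 'Smart' comparison)."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def url_key(url):
    """Reduce a URL to host + path, ignoring scheme, 'www.', and trailing slash."""
    parts = urlsplit(url.strip().lower())
    host = parts.netloc
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def url_match(a, b):
    return 1.0 if url_key(a) == url_key(b) else 0.0


# One comparator per property, mirroring the configuration listed above.
COMPARATORS = {
    "domain": exact,
    "first_name": fuzzy,
    "last_name": fuzzy,
    "linkedin_url": url_match,
}


def similarity(a, b, comparators=COMPARATORS):
    """Average the per-property scores (equal weights are an assumption)."""
    scores = [compare(a[col], b[col]) for col, compare in comparators.items()]
    return sum(scores) / len(scores)


# Known duplicate and known non-duplicate, used to sanity-check the setup.
a = {"first_name": "John", "last_name": "Smith", "domain": "example.com",
     "linkedin_url": "https://www.linkedin.com/in/john-smith/"}
b = {"first_name": "Jon", "last_name": "Smith", "domain": "example.com",
     "linkedin_url": "http://linkedin.com/in/john-smith"}
c = {"first_name": "Mary", "last_name": "Jones", "domain": "example.com",
     "linkedin_url": "https://www.linkedin.com/in/mary-jones"}

THRESHOLD = 0.85  # assumed cut-off for "send to review"
print(similarity(a, b) >= THRESHOLD)  # known duplicate should pass
print(similarity(a, c) >= THRESHOLD)  # known non-duplicate should not
```

Running a handful of labelled pairs like this before a bulk run shows quickly whether one property dominates the score or whether the threshold is too loose.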

After the check, use the Duplicates Finder to review groups, merge Ready groups, or resolve conflicts manually.

For separate files, read cross-list deduplication. For several emails or URLs inside one cell, read multi-value deduplication.

Follow the list deduplication guide for a tested workflow and the data matching guide for algorithm selection.

First name	Last name	Company domain
John	Smith	example.com
Jon	Smith	example.com