What is smart matching?

Question

Florian Poullin · Accepted Answer

Smart matching is a data matching method that cleans values before comparing them.

It sits between exact matching and fuzzy matching. It is stricter than fuzzy matching, but more flexible than raw exact matching.

Example

Raw exact matching can treat these values as different:

https://www.example.com/
http://www.example.com

Smart URL matching ignores the protocol and trailing slash, so these values match. Subdomains, paths, and query parameters remain significant unless their separate URL options are enabled.
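
As a rough illustration of the URL example above, here is a minimal Python sketch. It is not Datablist's implementation: `normalize_url` and `smart_url_match` are hypothetical helper names, and the sketch assumes that only the protocol, the trailing slash, and hostname case are ignored.

```python
from urllib.parse import urlsplit


def normalize_url(url: str) -> str:
    """Hypothetical helper: drop the protocol and trailing slash, keep the rest."""
    url = url.strip()
    # Without a scheme, urlsplit would treat the host as part of the path,
    # so prefix "//" to make it parse as a network location.
    if "://" not in url:
        url = "//" + url
    parts = urlsplit(url)
    host = parts.netloc.lower()      # hostnames are case-insensitive
    path = parts.path.rstrip("/")    # ignore the trailing slash
    query = f"?{parts.query}" if parts.query else ""
    # Subdomain, path, and query string stay significant in this sketch.
    return f"{host}{path}{query}"


def smart_url_match(a: str, b: str) -> bool:
    """Compare two URLs after cleanup instead of comparing the raw strings."""
    return normalize_url(a) == normalize_url(b)


print(smart_url_match("https://www.example.com/", "http://www.example.com"))
print(smart_url_match("https://www.example.com", "https://blog.example.com"))
```

The first call compares the two values from the example and reports a match. The second differs by subdomain, which this sketch keeps significant, so it does not match.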

What smart matching can clean

The result depends on the selected processor:

Text ignores accents, spaces, punctuation, symbols, case, and word order.
Email handles case, spaces, plus aliases, wrappers, and Gmail-specific variants.
URL handles protocol and trailing-slash differences, with optional rules for subdomains, paths, and query parameters.
Phone treats common formatting variants of the same number as equal.
Company Name removes common legal, business, and geographical terms. It is available on paid plans only.
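
To make the processor idea concrete, here is a hedged Python sketch of what a Text and an Email cleanup step could look like. The function names are hypothetical and the rules are assumptions, not Datablist's exact behavior. In particular, the sketch treats "wrappers" as a display name with angle brackets around the address, treats "spaces" in text as differences in whitespace between words, and limits the Gmail-specific handling to dots in the local part and the googlemail.com domain.

```python
import re
import unicodedata


def normalize_text(value: str) -> str:
    """Hypothetical Text cleanup: accents, case, punctuation, spacing, word order."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", value)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Keep only runs of letters and digits; punctuation and symbols act as separators.
    words = re.findall(r"[^\W_]+", no_accents.casefold())
    # Sorting the words makes the comparison independent of word order.
    return " ".join(sorted(words))


def normalize_email(value: str) -> str:
    """Hypothetical Email cleanup: wrappers, case, spaces, plus aliases, Gmail dots."""
    # Assumed wrapper format: "Jane Doe <jane@example.com>".
    wrapped = re.search(r"<([^<>]+)>", value)
    if wrapped:
        value = wrapped.group(1)
    value = re.sub(r"\s+", "", value).lower()
    local, _, domain = value.rpartition("@")
    local = local.split("+", 1)[0]           # drop the plus alias
    if domain in ("gmail.com", "googlemail.com"):
        local = local.replace(".", "")       # Gmail ignores dots in the local part
        domain = "gmail.com"
    return f"{local}@{domain}"


print(normalize_text("Café, Le  Grand!") == normalize_text("le grand cafe"))
print(normalize_email("Jane Doe <Jane.Doe+news@Gmail.com>"))
```

Two values match when their cleaned forms are identical, so each processor only has to decide what counts as formatting noise for its field type.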

📌 Short version

Smart matching is useful when values should match after cleanup, but you do not want loose fuzzy matches.

Smart matching vs fuzzy matching

Fuzzy matching uses similarity scores to find values that are close, even with typos.

Smart matching focuses on normalization. It removes formatting noise and compares the cleaned values.

Use smart matching for domains, URLs, emails, company names, and other fields where formatting changes are common.

Use distance-based fuzzy matching when you need threshold-based similarity with algorithms such as Levenshtein or Jaro-Winkler.

Use phonetic matching when names sound alike but are spelled differently.
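
The difference between normalization and threshold-based similarity can be sketched in Python as follows. This is an illustrative example only: the function names, the simplified cleanup step (trim and lowercase), and the 0.8 threshold are assumptions, and the Levenshtein distance is a plain textbook implementation rather than any tool's built-in one.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def clean(value: str) -> str:
    """Deliberately simplified cleanup step (assumption): trim and lowercase."""
    return value.strip().lower()


def smart_match(a: str, b: str) -> bool:
    """Normalization approach: cleaned values must be identical."""
    return clean(a) == clean(b)


def distance_match(a: str, b: str, threshold: float = 0.8) -> bool:
    """Distance approach: cleaned values must be similar enough."""
    a, b = clean(a), clean(b)
    longest = max(len(a), len(b))
    similarity = 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest
    return similarity >= threshold


print(smart_match(" Example.com", "example.COM"))    # formatting noise only
print(smart_match("example.com", "exampel.com"))     # typo
print(distance_match("example.com", "exampel.com"))  # typo, within threshold
```

The typo pair is rejected by the normalization approach but accepted by the distance approach, which is the trade-off between the two: fewer false positives versus tolerance for misspellings.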

Datablist workflow

Datablist supports smart matching in the Duplicates Finder. Smart matching is available on the Free plan once you create an account. The Company Name processor remains a paid feature.

See the processors and their exact behavior in the data matching guide, or follow the complete deduplication workflow.