Dealing with duplicates is an exhausting task with spreadsheet tools. Datablist provides a built-in deduplication feature to find duplicate values in your collections. Duplicates can then be removed or merged (automatically or using a merging assistant).

To run a duplicates analysis in a collection, click on the "Duplicates" button in your collection header.

Duplicates Finder

Select duplicate check method

The first step is to select how to compare items to find duplicate values. Two methods are available:

  • All Properties - Look for items with matching values on every property. Two items are considered duplicates only when all of their properties match.
  • Selected Properties - Select the properties to use for the similarity check. Two items are considered duplicates when they have similar values on all the selected properties.

Most of the time, Selected Properties is the best option. It works well for deduplicating contacts based on their email addresses and phone numbers, or companies based on their website URLs. Use Selected Properties when one or several properties identify a product, person, company, etc.
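To make the difference between the two methods concrete, here is a minimal Python sketch. It is not Datablist's implementation: the function name `find_duplicate_groups`, the dict-based item format, and the normalization (trimming spaces and ignoring case) are assumptions made for illustration only.

```python
from collections import defaultdict


def normalize(value):
    """Trim and lowercase a value. This normalization is an assumption of the sketch."""
    return "" if value is None else str(value).strip().lower()


def find_duplicate_groups(items, properties=None):
    """Group items that match on the given properties.

    items: list of dicts, one dict per item.
    properties: property names to compare. None compares every property
    (the "All Properties" method).
    """
    groups = defaultdict(list)
    for item in items:
        keys = sorted(item) if properties is None else properties
        signature = tuple((key, normalize(item.get(key))) for key in keys)
        # Items with no value at all on the compared properties are ignored
        if not any(value for _, value in signature):
            continue
        groups[signature].append(item)
    # Only groups with at least two items are duplicates
    return [group for group in groups.values() if len(group) > 1]


contacts = [
    {"Name": "John Doe", "Email": "johndoe@company.com"},
    {"Name": "J. Doe", "Email": "JohnDoe@company.com "},
    {"Name": "Jane Roe", "Email": "jane@company.com"},
]

by_email = find_duplicate_groups(contacts, ["Email"])  # Selected Properties
by_all = find_duplicate_groups(contacts)               # All Properties
print(len(by_email), len(by_all))
```

In this sample data, comparing on Email alone groups the two John Doe records, while comparing all properties finds nothing because the names differ.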

Configure Properties

Next, if you chose the Selected Properties method, select the properties to check.

For example, in a collection with people items, you would select the Email property.

Example for Selected Properties

Review and run duplicates check

A review screen then shows the number of items in your collection and the selected duplicate check method. Click "Run duplicates check" to continue.

Note: The duplicates analysis is a read-only operation. It does not change your collection items until you decide to merge them.

Example for Selected Properties

Duplicates Listing

After scanning all your items, Datablist lists all the duplicates found.

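From this list, several actions are available:

  • Automatically merge non-conflicting items
  • Manually merge conflicting items
  • Edit or delete items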

The recommended way to deal with duplicates is to merge non-conflicting duplicates first (1 in the screenshot below), and then continue with the Datablist Merging Assistant for conflicting items (2 in the screenshot below).

Duplicate Item listing

Automatically merge non-conflicting items

Datablist can merge non-conflicting items automatically without losing any information. It works as follows:

  • If all the duplicate items have the same property values, only one item is kept and the others are deleted.
  • If the duplicate items are complementary, the item with the most information is selected as the primary item and its empty properties are filled using values from the other items. Then all items except the primary item are deleted.
  • If the duplicate items have conflicting property values, they are skipped and left for manual merging.

To run this action, click on the "Merge non-conflicting duplicates" button.
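The three rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not Datablist's actual code: the function name `merge_non_conflicting` is hypothetical, items are plain dicts, and "the most information" is interpreted as the highest number of non-empty properties.

```python
def merge_non_conflicting(group):
    """Return one merged item, or None when any property has conflicting values."""

    def filled(item):
        # Properties that actually hold a value
        return {k: v for k, v in item.items() if v not in (None, "")}

    # Primary item: the one with the most filled-in properties (assumption)
    primary = max(group, key=lambda item: len(filled(item)))
    merged = dict(primary)

    for item in group:
        for key, value in filled(item).items():
            current = merged.get(key)
            if current in (None, ""):
                merged[key] = value  # complementary: fill the gap
            elif current != value:
                return None          # conflict: leave for manual merging
    return merged


complementary = [
    {"Name": "John Doe", "Email": "johndoe@company.com", "Phone": ""},
    {"Name": "John Doe", "Email": "johndoe@company.com", "Phone": "555-0100"},
]
conflicting = [
    {"Name": "John Doe", "Job Title": "Marketing Manager"},
    {"Name": "John Doe", "Job Title": "Business Development"},
]

print(merge_non_conflicting(complementary))
print(merge_non_conflicting(conflicting))
```

The first group merges into a single item that keeps the phone number. The second returns `None`, which corresponds to a group that is skipped for manual merging.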

Manually merge conflicting items

For some duplicate items, conflicting values prevent them from being merged without losing data. For example, imagine two items that store contact information with the following data:

# Contact A
Name: John Doe
Job Title: Marketing Manager
Email: johndoe@company.com

# Contact B
Name: John Doe
Job Title: Business Development
Email: johndoe@company.com

We know they represent the same person because they have the same email address. Unfortunately, they have two different job titles, and merging them into a single item means choosing which job title to keep.
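As a rough illustration of what "conflicting" means here, the following Python sketch lists the properties on which a group of duplicates disagrees. The function name `conflicting_properties` and the dict-based item format are hypothetical and do not describe how Datablist works internally.

```python
def conflicting_properties(group):
    """Map each property to its distinct non-empty values when there is more than one."""
    conflicts = {}
    keys = {key for item in group for key in item}
    for key in sorted(keys):
        values = []
        for item in group:
            value = item.get(key)
            # Empty values never conflict: they can be filled from another item
            if value not in (None, "") and value not in values:
                values.append(value)
        if len(values) > 1:
            conflicts[key] = values
    return conflicts


contact_a = {
    "Name": "John Doe",
    "Job Title": "Marketing Manager",
    "Email": "johndoe@company.com",
}
contact_b = {
    "Name": "John Doe",
    "Job Title": "Business Development",
    "Email": "johndoe@company.com",
}

print(conflicting_properties([contact_a, contact_b]))
```

For the two contacts above, only Job Title is reported, since Name and Email are identical in both items.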

For cases like this, use the Datablist Merging Assistant. A "Merge" button is available on the left of every group of duplicates.

Manual Merging

It opens the merging assistant for the items. In this modal, you can select which value to keep for each conflicting property. After the merge, all items except one (the master item) are deleted.

Merging Assistant

Please check our Datablist Merging Assistant documentation page to learn more.

Edit or delete items

Another way to deal with duplicate values is to edit an item's data or delete unnecessary items directly. Both actions are available through the buttons shown on each item.

Edit and delete buttons

To learn more

Read our guides for step-by-step tutorials on how to deal with duplicates for CSV and Excel files: