Email lists are the starting point of any email campaign or newsletter. However, a list can get messy after multiple merges or because of spammy user behavior.
The benefits of cleaning your email list are:
- Improve deliverability - Every email provider uses your sender score to determine where to route your emails. To avoid being labeled as SPAM and to successfully reach your users, the first step is to avoid sending emails to non-existent addresses. A bad sender score is hard to repair, so it is better to spend some time up front and clear your list of unreachable email addresses.
- Save money - Usually, you pay for each email you send. First, pruning your email list to remove duplicates and invalid addresses will save you money. Then, you need to remove all disposable email addresses that will never be read to keep only real emails.
- Detect and fix typos - After the cleaning process, wrong email addresses will be flagged. You can then manually find simple typos in names or domains and fix them.
Email list cleaning is an important part of any digital business and must be done regularly. Datablist is a perfect data tool to perform this cleaning process. Using this step-by-step guide, you will learn:
- How to remove duplicate emails
- How to verify email address syntax
- How to check if emails are from disposable providers
- How to ensure email domains exist
Datablist can be used without registration to view and edit CSV files. However, the email verification service that we'll use later requires an account.
How does it compare to paid email cleaning services?
If you search for email verification services on Google, you'll find hundreds (thousands?) of them. Almost all of them charge a fee per email address. Datablist includes an email verification service and it is free. It is great and enough for simple email verifications. However, if you need deeper analysis or if you have to perform verifications on hundreds of thousands of emails, please use a paid email cleaning service.
Step 1: Import email addresses
Create a collection
The first step in the email cleaning process is to create a collection on Datablist in which you will pour your email addresses.
In Datablist, click the + to create a new collection. Give it a name (and an icon 😍).
Import your lists of email addresses
Now that you have a collection, it's time to import your email lists, whether you have only one list or several that you want to merge.
Datablist offers two options to import your data:
- With CSV files
- Using copy/pasting from a spreadsheet
Option 1: Import from CSV files
The CSV format is a simple standard to transfer tabular data between software applications. Every newsletter tool and digital marketing solution offers exports of your contacts in CSV files. CSV files are first-class citizens in Datablist.
In this example, we will use a demo contact CSV file with three columns:
To import your CSV file, click the "Import CSV" button and select your file.
Datablist reads CSV files and Excel files. Your first rows will be read to detect the encoding used. If you see weird characters on the detected headers or later when importing, create another collection and try to import your CSV using another encoding.
Datablist will read the columns and show you a mapping page. If your email addresses are valid, the data type will be Email. This adds some validation when you edit your data later on Datablist.
Here is a video of the full process:
Option 2: Import with Copy/Pasting
Datablist is compatible with copy/pasting from any spreadsheet. Just select the cells from your spreadsheet, go to your Datablist collection and use Edit -> Paste from your browser, or directly the Ctrl + v keyboard shortcut.
On pasting, Datablist will show you the columns and rows it has detected. To import a column, map it to an existing property or create a new property.
Warning: Only mapped columns will be imported!
Import other contact lists if needed
If you are building an email collection from several sources, just import all your lists into the same collection.
When importing another file, a mapping step will be shown. In this step, you will map your collection properties with your CSV columns. With this mapping, your new data will be added to the existing properties.
Step 2: Find and merge duplicates
It's common to have an email list built over time. The result is several email listings merged into one, which leads to duplicates! If your list stores contact information like First Name, Last Name, etc., you might have contact information spread over several duplicate rows.
With all your email addresses in one place, the second step is to remove or merge duplicate entries.
Use the Datablist "Duplicates Finder" feature by clicking on the "Duplicates Finder" button in the clean menu.
In the configuration screen, you need to select how duplicates will be checked:
- All Properties - Look for full items similarity: Two items would be considered similar when all of their properties match.
- Selected Properties - Two items would be considered similar when all of the selected properties match.
In our example, two contacts are duplicates if they share the same email address. So, pick the Selected Properties mode and select the Email property.
Run the duplicates check to see a preview of all duplicates found in the collection.
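To make the "Selected Properties" mode concrete, here is a minimal Python sketch of the idea outside Datablist. It is not Datablist's actual algorithm: the `find_duplicates` function, the sample contacts, and the choice to compare values case-insensitively after trimming spaces are all assumptions made for illustration.

```python
from collections import defaultdict


def find_duplicates(items, selected_properties):
    """Group items whose selected properties all match."""
    groups = defaultdict(list)
    for item in items:
        # Build the comparison key from the selected properties only.
        # Assumption: values are compared trimmed and case-insensitively.
        key = tuple(
            str(item.get(prop, "")).strip().lower() for prop in selected_properties
        )
        if not any(key):
            continue  # items with no value at all are not compared
        groups[key].append(item)
    # Groups holding more than one item are the duplicates.
    return [group for group in groups.values() if len(group) > 1]


contacts = [
    {"Email": "ana@example.com", "First Name": "Ana", "Last Name": ""},
    {"Email": "ANA@example.com ", "First Name": "", "Last Name": "Lopez"},
    {"Email": "bob@example.com", "First Name": "Bob", "Last Name": "Ray"},
]

# Selected Properties mode: only the Email property is compared.
duplicate_groups = find_duplicates(contacts, ["Email"])

# All Properties mode: every property must match.
strict_groups = find_duplicates(contacts, ["Email", "First Name", "Last Name"])
```

With these sample contacts, comparing on Email alone groups the two "ana" rows together, while comparing on all properties finds no duplicates because their names differ.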
If duplicates are found, several actions are available:
- Merge or consolidate duplicate items
- Remove duplicate items
- Edit them
Datablist has an automatic algorithm to dedupe your data. Please visit our documentation to learn more about deduplication.
After the automatic merge, finish the cleaning with the manual merging assistant. To merge duplicate contacts, click on the "Merge Items" button on the left of each group of duplicate contacts.
Datablist has a merging tool. On the right, a "Primary Item" is shown and on the left the remaining duplicate contacts are called "Secondary Items". Datablist elects the contact with the most data as "Primary item".
When possible, property values from secondary items are auto selected to be merged into the primary items. If several values conflict, you will have to make a decision and select which value to keep.
If the resulting "Primary item" suits you, click the Merge button to confirm the merge process. All the secondary items will be deleted to keep only one combined item.
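The merge logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Datablist's implementation: `merge_group` is a hypothetical helper, "most data" is interpreted as "most non-empty values", and conflicting values are simply collected so that a person can decide which one to keep.

```python
def merge_group(group):
    """Merge duplicate items into one; return (merged_item, conflicts)."""
    # Elect the item with the most non-empty values as the primary item.
    primary = max(group, key=lambda item: sum(1 for v in item.values() if v))
    merged = dict(primary)
    conflicts = {}
    for secondary in group:
        if secondary is primary:
            continue
        for prop, value in secondary.items():
            if not value:
                continue  # nothing to merge from an empty value
            if not merged.get(prop):
                merged[prop] = value  # fill a gap in the primary item
            elif merged[prop] != value:
                # Both items hold different data: leave the choice to the user.
                conflicts.setdefault(prop, {merged[prop]}).add(value)
    return merged, conflicts


group = [
    {"Email": "ana@example.com", "First Name": "Ana", "Last Name": ""},
    {"Email": "ana@example.com", "First Name": "", "Last Name": "Lopez"},
]
merged, conflicts = merge_group(group)
```

In this example the two items complete each other, so the merged item holds the email, "Ana", and "Lopez" with no conflict to resolve.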
Once all the duplicates have been processed, go back to the collection.
Check this video for the whole process:
Step 3: Free email list cleaning
Now that you have a collection on Datablist with all the email addresses and no duplicates, it's time to clean it!
You must be registered to use the email verification service. Sign up (it's free). If you already have an anonymous collection, import it into your workspace.
What does the service check?
Datablist has a built-in free email verification service. This free service does 3 verifications:
- Email syntax analysis
- Disposable providers check
- Domain MX records check
Email syntax analysis
The first check performs a complete syntactical analysis to ensure the email address conforms to the IETF standard.
This analysis will flag addresses without the at sign (@), with invalid domains, etc.
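As a rough illustration of what a syntax check looks for, here is a simplified Python sketch. It is not the analysis Datablist runs: the pattern below is deliberately basic and does not implement the full IETF address grammar (such as RFC 5322), and the `check_syntax` function and its status strings are hypothetical.

```python
import re

# Simplified pattern: a local part without "@" or spaces, one "@",
# and a domain made of at least two dot-separated labels.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$")


def check_syntax(email):
    """Return (is_valid, error_status) for a single address."""
    email = email.strip()
    if "@" not in email:
        return False, "missing at sign"
    if not EMAIL_PATTERN.match(email):
        return False, "invalid syntax"
    return True, ""


results = {
    address: check_syntax(address)
    for address in ["john.doe@example.com", "john.doe.example.com", "john@example"]
}
```

A pattern like this catches the obvious problems (no at sign, no dotted domain) but will reject some unusual addresses that are technically valid, which is why a dedicated verification service is preferable for real lists.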
Check disposable providers
The second check is to detect temporary emails. The service looks for domains belonging to Disposable Email Address (DEA) providers such as Mailinator, Temp-Mail, YopMail, etc.
The current database lists about 3000 disposable provider domains and is updated regularly using this disposable domains list.
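Conceptually, the disposable check is a lookup of the address domain in a list of known providers. The Python sketch below shows the idea with a tiny illustrative sample; the `DISPOSABLE_DOMAINS` set and the `is_disposable` function are assumptions for this example, and a real check relies on a much larger, regularly updated list.

```python
# Illustrative sample only; a real list holds thousands of domains.
DISPOSABLE_DOMAINS = {"mailinator.com", "yopmail.com"}


def is_disposable(email, disposable_domains=DISPOSABLE_DOMAINS):
    """Return True when the address domain is a known disposable provider."""
    # Take what follows the last "@" and normalize it before the lookup.
    domain = email.rsplit("@", 1)[-1].strip().lower()
    return domain in disposable_domains


flagged = [
    address
    for address in ["test@Mailinator.com", "ana@example.com"]
    if is_disposable(address)
]
```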
Check domain MX records
A valid email address must have a corresponding domain name with configured MX records. Those MX records specify the mail server accepting the email messages for the domain. Missing MX records indicate an invalid email address.
For every email address domain, the service checks the DNS records and looks for the MX ones. If the domain doesn't exist, the email will be flagged as invalid. If the domain exists and doesn't have a valid MX record, it will also be flagged as invalid.
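The decision logic of this check can be sketched as follows. To keep the example runnable offline, the DNS query itself is replaced by a stand-in: `lookup_mx`, `DomainNotFound`, `FAKE_DNS`, and the status strings are all hypothetical names, and a real version would query DNS instead (for instance with a DNS library such as dnspython).

```python
class DomainNotFound(Exception):
    """Raised by the lookup function when the domain does not exist."""


def check_mx(email, lookup_mx):
    """Return (is_valid, error_status) based on the domain MX records.

    lookup_mx(domain) must return a list of MX host names and raise
    DomainNotFound when the domain does not exist.
    """
    domain = email.rsplit("@", 1)[-1].strip().lower()
    try:
        mx_hosts = lookup_mx(domain)
    except DomainNotFound:
        return False, "domain does not exist"
    if not mx_hosts:
        return False, "no MX record"
    return True, ""


# Stand-in resolver backed by a dictionary, so the sketch needs no network.
FAKE_DNS = {"example.com": ["mx1.example.com"], "no-mail.example": []}


def fake_lookup(domain):
    if domain not in FAKE_DNS:
        raise DomainNotFound(domain)
    return FAKE_DNS[domain]


status = check_mx("ana@example.com", fake_lookup)
```

Passing the lookup function as a parameter keeps the two failure cases from the text (missing domain, missing MX record) easy to test without sending real DNS queries.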
Perform cleaning on your collection
Performing an email list cleaning on Datablist is simple. Just click on the "Enrich" menu and select the "Email Address Validation" action.
Once you have selected "Email Address Validation", a drawer opens on the right with 3 sections: the settings, the input properties, and the output properties.
Select "Check for MX-records in email domain" in the settings to analyze the MX records.
Select the property from your collection that contains the email address. In this example, the collection has an "Email" property that will be matched.
The "Email Address Validation" service returns 2 values:
- Valid Email - A boolean (true or false) to indicate whether the email address is valid.
- Error status - A text explaining why the email address is invalid when "Valid Email" is false.
It's important to map the output properties to the collection to store the results!
Click the + on the right of each output property to add the result properties to the collection.
When the outputs are mapped with properties to store the results, click "Run action".
Here is a video of the process (this is an old Datablist version, the email verification service is now accessible from the "Enrich" button):
Once the service is done, analyze all invalid emails to detect easy fix typos.
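If you export the invalid addresses, a small script can help spot likely domain typos before you fix them by hand. The sketch below uses Python's standard `difflib` module; the `KNOWN_DOMAINS` list, the `suggest_fix` function, and the 0.8 similarity cutoff are assumptions chosen for illustration, and every suggestion should still be reviewed manually.

```python
from difflib import get_close_matches

# Hypothetical reference list of common domains; extend it for your audience.
KNOWN_DOMAINS = ["gmail.com", "yahoo.com", "hotmail.com", "outlook.com"]


def suggest_fix(email, known_domains=KNOWN_DOMAINS, cutoff=0.8):
    """Return a corrected address if the domain looks like a typo, else None."""
    local, sep, domain = email.strip().rpartition("@")
    domain = domain.lower()
    if not sep or not local or domain in known_domains:
        return None  # no at sign, empty local part, or domain already known
    # Keep only the closest known domain above the similarity cutoff.
    matches = get_close_matches(domain, known_domains, n=1, cutoff=cutoff)
    return f"{local}@{matches[0]}" if matches else None


suggestion = suggest_fix("john@gmial.com")
```

Here the misspelled domain "gmial.com" is close enough to "gmail.com" to produce a suggestion, while unrelated domains return no suggestion at all.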
Step 4: Remove unsubscribed emails
This last step is optional. You might have a dedicated list containing all unsubscribed emails that you want to remove from your main list. If your list with unsubscribed emails has a specific column like:
email | Unsubscribed
email@example.com | yes
firstname.lastname@example.org | yes
email@example.com | yes
It is possible to perform a join operation and add the Unsubscribed information to your contact list.
To do so, import the second CSV file into the same collection and perform a join operation.
The common property is the Email property.
During the import of the CSV file with the unsubscribed email addresses, create a new property to store the "Unsubscribed" column (click on the "+" during the mapping).
Map the "Email" column with the "Email" property if not already done. Then enable "Join on property" on it.
In the next step, you have to configure how Datablist will perform the join operation on the CSV file.
- Import all rows and match when possible - If an email address from the Unsubscribed file doesn't exist in your collection, a new item will be created.
- Import only matching rows - This is the recommended option here. If an email address from the Unsubscribed file doesn't exist in your collection, the email address will be skipped during the import.
The "Merging Mode" defines how to merge conflicting data. This is useful when you update your collection and already have an "Unsubscribed" property that you want to update.
After the import, the new property "Unsubscribed" will appear. You can filter on the property and remove the contact items.
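For readers who want to see the logic behind this step, here is a minimal Python sketch of a join on the email address followed by the removal of unsubscribed contacts. It only illustrates the two import options described above and is not how Datablist implements them; `join_unsubscribed`, its `only_matching` parameter, and the sample data are hypothetical.

```python
def join_unsubscribed(contacts, unsubscribed_rows, only_matching=True):
    """Join rows on the email address and copy the Unsubscribed value."""
    joined = [dict(contact) for contact in contacts]  # leave the input untouched
    by_email = {c["Email"].strip().lower(): c for c in joined}
    for row in unsubscribed_rows:
        key = row["email"].strip().lower()
        if key in by_email:
            by_email[key]["Unsubscribed"] = row["Unsubscribed"]
        elif not only_matching:
            # "Import all rows" mode: unknown addresses become new items.
            joined.append({"Email": row["email"], "Unsubscribed": row["Unsubscribed"]})
    return joined


contacts = [{"Email": "ana@example.com"}, {"Email": "bob@example.com"}]
unsubscribed = [
    {"email": "bob@example.com", "Unsubscribed": "yes"},
    {"email": "zoe@example.com", "Unsubscribed": "yes"},
]

# "Import only matching rows": zoe@example.com is skipped.
joined = join_unsubscribed(contacts, unsubscribed)

# Filter on the new property and keep only contacts still subscribed.
kept = [c for c in joined if c.get("Unsubscribed") != "yes"]
```

With "only matching rows", the unknown address is ignored and the final list keeps a single contact; with `only_matching=False`, it would be added as a new item first, which is rarely useful for an unsubscribe list.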
Extra Step: Automatically merge duplicates during import
To prevent duplicates and merge them automatically during the import process, check the "do not allow duplicate values" option on the Email property. With this property option, Datablist will automatically deduplicate and merge your contacts on data import.
During the import stepper, a merging option will be shown to configure how Datablist must deal with duplicates.
The merging option is important when your listing contains contact information in addition to the email address.
- Soft Merge - If a contact already exists with the same email, properties that already hold data are not updated (the existing data comes from contacts already in the collection or from the first matching contact found in the CSV). This is the default setting.
- Hard Merge - If a contact already exists with the same email, its properties are updated with the imported data.
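The difference between the two modes is easiest to see on a single contact. The Python sketch below is an interpretation of the descriptions above, not Datablist's code: `merge_on_import` is a hypothetical function, and it assumes that empty incoming values never erase existing data in either mode.

```python
def merge_on_import(existing, incoming, mode="soft"):
    """Merge an imported row into an existing contact with the same email."""
    merged = dict(existing)
    for prop, value in incoming.items():
        if not value:
            continue  # assumption: empty incoming values never erase data
        # Soft merge only fills empty properties; hard merge overwrites.
        if mode == "hard" or not merged.get(prop):
            merged[prop] = value
    return merged


existing = {"Email": "ana@example.com", "City": "Paris", "Phone": ""}
incoming = {"Email": "ana@example.com", "City": "Lyon", "Phone": "555-0100"}

soft = merge_on_import(existing, incoming, mode="soft")
hard = merge_on_import(existing, incoming, mode="hard")
```

In both modes the missing phone number is filled in. The modes differ on the city: the soft merge keeps "Paris", the hard merge replaces it with "Lyon".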
If you have any feedback on this guide or if you have questions, please contact us.