The Complete Guide To Cleaning Up Your Customer Database

Yassine Hamou Tahra


Yassine is one of Salesdorado's co-founders. He's also the founder and CEO of Octolis, a new age Customer Data Platform

Is the quality of your customer base deteriorating? Do you know you need to clean it up but haven’t got round to it yet? Or have you inherited a poorly maintained customer base and don’t know where to start, how to proceed, which tools to use or what action plan to put in place?

In this comprehensive guide, we will detail how to successfully clean up your customer base.

What are the problems with your customer base?

If you do not clean up your customer base, its quality will inevitably deteriorate. Customer data is a living thing: it has a life cycle and degrades over time. You cannot expect good results from your marketing and CRM actions if you work from a poor quality customer base, because the quality of your campaigns, the work of your sales people and your customer service all depend on it. Cleaning up your customer database, and doing so regularly, is therefore a major issue. Customer data cleansing is at the heart of any Data Quality Management strategy.

Read our comprehensive guide to data quality.

What is a poor quality customer base? It is a database that contains one or more of the following: duplicates, incorrect data, incomplete data, non-standard data and obsolete data.

Duplicates

A duplicate is a customer or contact that appears more than once in a customer/contact database. Why do duplicates exist? There are several possible reasons:

  • Syntactic differences: spelling variations, often caused by human error during data entry or by a lack of standardisation (“Firstname Lastname” vs “Lastname Firstname” vs “Lastname” alone, etc.).
  • Different identifiers: departments and sources do not always use the same identifier. For customer service it is the account number, for marketing it is the email, etc. Without a unique identifier acting as a reconciliation key, the data is difficult to reconcile.
  • Different information: the same customer may give different information from one collection source to another, for example using email A to fill in a form and giving email B to customer service. If email is the reconciliation key, a duplicate is created.

In the presence of duplicates, the source contact (the reference data) must be identified and the data merged, taking care not to merge separate contacts incorrectly, which can be a tricky exercise.

Incorrect data

This applies in particular to email addresses. For example, if a contact mistypes his or her email address in a form, or spells it out wrongly to a customer service advisor, we are dealing with incorrect data. Surname, first name, title, date of birth, city, telephone number: all data can be subject to input errors.

Data entry errors are the main cause of erroneous data. Another cause is format incompatibility between the collection source and the database, typically a character-encoding mismatch that turns “Gérard” into “GÃ©rard”.
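A format incompatibility of this kind is very often a character-encoding mismatch: text written as UTF-8 is read back as Latin-1. Here is a minimal sketch of how the damage arises and how it can sometimes be reversed. It assumes exactly that UTF-8/Latin-1 mix-up, and the function name `repair_mojibake` is hypothetical, chosen for illustration only.

```python
def repair_mojibake(text: str) -> str:
    """Try to undo UTF-8 text that was wrongly decoded as Latin-1.

    Assumption: the damage is a single UTF-8 -> Latin-1 mix-up.
    If the round trip fails, the text is returned unchanged.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# Reproduce the damage: UTF-8 bytes read back with the wrong codec.
damaged = "Gérard".encode("utf-8").decode("latin-1")

# Reverse it, and check that a correct value is left alone.
repaired = repair_mojibake(damaged)
untouched = repair_mojibake("Gérard")
```

Other encoding mix-ups need a different pair of codecs, so it is worth checking a sample of values by hand before applying a repair like this to a whole column.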

Incomplete data

Databases often contain empty fields. Some databases look like Swiss cheese, with a significant proportion (sometimes a majority) of the fields left blank.
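To see how much of a database is actually empty, you can measure the fill rate of each field. The sketch below assumes the contacts have been exported as a list of dictionaries; the field names and sample values are invented for illustration.

```python
def fill_rates(records: list[dict]) -> dict[str, float]:
    """Return, for each field, the share of records where it is filled."""
    fields = sorted({name for record in records for name in record})
    total = len(records)
    rates = {}
    for name in fields:
        filled = 0
        for record in records:
            value = record.get(name)
            # A field counts as empty if it is missing, None or only whitespace.
            if value is not None and str(value).strip() != "":
                filled += 1
        rates[name] = filled / total if total else 0.0
    return rates


# Hypothetical sample data.
contacts = [
    {"email": "a@example.com", "phone": "", "city": "Paris"},
    {"email": "b@example.com", "phone": None, "city": ""},
    {"email": "", "phone": "0102030405", "city": "Lyon"},
    {"email": "d@example.com", "phone": " ", "city": None},
]

rates = fill_rates(contacts)
```

Fields with a very low fill rate are the first candidates either for enrichment or for removal.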

Non-standard data

You have to choose between “bd.” and “boulevard”, between “75” and “Paris”, between “Mr.” and “Mister”. A given piece of information (age, title, postal address, etc.) must be presented in the same way for all contacts. In technical language: the data must be consistently formatted. Here again, having multiple collection sources is the main cause of non-standard data.

See our comparison of RNVP (postal address restructuring, standardisation and validation) processing software.

Obsolete data

As we said earlier, customer data is a living thing. Some data that was true at time t becomes false at time t+1: people move, change their phone number, change jobs, etc. Most data is perishable, even if there are some exceptions: first name, title, etc.

Data obsolescence is a particularly significant problem in B2B.

How do you go about cleaning up your customer base?

We are now going to give you some advice on how to successfully clean up your data. We will first look at how to define an action plan before presenting you with some tools/solutions.

Define an action plan

You don’t just jump into a customer database cleaning project. Before you start, you must clearly identify the objectives of your database and its use cases; they will serve as reference points when decisions have to be made. Agree on what you mean by “customer” and choose a unique key for identifying contacts/customers: email, customer account number, etc. You can also use a key built from a combination of several fields, for example email + name. Here are the steps we recommend you follow to clean up your customer database.

Discover our complete guide to email enrichment (use cases and how it works).

Step #1 – Involve all users of the customer base

Marketing, Sales, Legal, Accounting, Customer Service: involve everyone who uses or will use the customer base. This will allow you to:

  • Raise awareness, among all those who handle the customer base, of the importance of maintaining its quality.
  • Listen to the expectations of those who use the database on a daily basis, in order to formulate target use cases.
  • Identify the fields they need and, conversely, those that exist in the current database but serve no purpose.

Step #2 – Make a copy of the customer base

Make a copy of the customer base so that you can go back if you make a wrong decision or run into problems during the cleaning process. This is a safety measure.
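If your customer base can be exported to a file (a CSV export, for example), the copy can be as simple as a timestamped duplicate. This is only a sketch under that assumption; the function name `backup_file` and the file layout are hypothetical, and a database hosted in a CRM or on a server would rely on that tool's own export or backup feature instead.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_file(source: Path, backup_dir: Path) -> Path:
    """Copy `source` into `backup_dir` under a timestamped name."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"{source.stem}-{stamp}{source.suffix}"
    # copy2 also preserves file metadata such as the modification time.
    shutil.copy2(source, target)
    return target
```

For example, `backup_file(Path("customers.csv"), Path("backups"))` would leave the original untouched and return the path of the new copy.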

Step #3 – Rethink the fields in your customer base

You need to remove data fields that serve no purpose, that are of no use to your marketing team, your sales people, your customer service. There is absolutely no point in maintaining fields that will never be filled in by end users.

Conversely, add fields that are not currently used but which have real added value and will help you to achieve your objectives or to set up new use cases.

Step #4 – Delete inactive contacts

Customers or contacts that have been inactive for a long period of time should be considered lost and therefore removed from the database. You can try to reactivate them first if you think they are recoverable, but in any case a customer base should contain as few inactive customers as possible.

To identify inactive contacts, it is enough to analyse the date of the last interaction (purchase, telephone contact, web visit, etc.).
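The check on the last interaction date can be sketched in a few lines. The 365-day threshold, the field names and the sample contacts below are assumptions made for the example; the right threshold depends on your sales cycle. Contacts with no recorded interaction are treated as inactive here, which is also a choice you may want to revisit.

```python
from datetime import date, timedelta


def find_inactive(contacts: list[dict], today: date, max_idle_days: int = 365) -> list[dict]:
    """Return contacts whose last interaction is older than the cutoff."""
    cutoff = today - timedelta(days=max_idle_days)
    return [
        contact for contact in contacts
        # No recorded interaction at all is treated as inactive (assumption).
        if contact["last_interaction"] is None
        or contact["last_interaction"] < cutoff
    ]


# Hypothetical sample data.
contacts = [
    {"id": 1, "last_interaction": date(2023, 11, 15)},
    {"id": 2, "last_interaction": date(2021, 6, 1)},
    {"id": 3, "last_interaction": None},
]

inactive = find_inactive(contacts, today=date(2024, 1, 1))
inactive_ids = [contact["id"] for contact in inactive]
```

Keeping the list of inactive contacts separate, rather than deleting them straight away, leaves room for a reactivation campaign before removal.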

Step #5 – Deal with erroneous data

Your customer base certainly contains erroneous data. We advise you to correct it where possible, using data quality tools (we will introduce you to some in a moment). When an erroneous value cannot be corrected, delete it.

Step #6 – Use the reconciliation key to identify and remove duplicates

Have you decided to use email as the identification key? Apply this key to your database to identify duplicates, i.e. records that share the same key value. Then define a prioritisation rule that determines which data to keep when duplicates conflict. For example, you may decide to prioritise data from source A over data from other sources. Once you have defined your prioritisation rule, merge the duplicate records.

Note that some CRMs offer duplicate identification systems and alert users when a duplicate is generated. All the tools we are going to present to you shortly offer deduplication modules.
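To make the logic of this step concrete, here is a minimal sketch of deduplication on an email key with a source-based prioritisation rule. The source names, their ranking and the field names are all hypothetical; dedicated tools apply far more sophisticated matching than this exact comparison on a lower-cased email.

```python
# Lower number = higher priority. Source names are invented for the example.
SOURCE_PRIORITY = {"crm": 0, "webform": 1, "support": 2}


def merge_duplicates(records: list[dict]) -> list[dict]:
    """Group records by normalised email and merge each group into one record."""
    groups: dict[str, list[dict]] = {}
    for record in records:
        key = record["email"].strip().lower()
        groups.setdefault(key, []).append(record)

    merged = []
    for key, group in groups.items():
        # Unknown sources are ranked after all known ones.
        ranked = sorted(
            group,
            key=lambda r: SOURCE_PRIORITY.get(r["source"], len(SOURCE_PRIORITY)),
        )
        result = {"email": key}
        for record in ranked:
            for field, value in record.items():
                if field in ("email", "source"):
                    continue
                # Keep the first non-empty value, i.e. the highest-priority one.
                if value not in (None, "") and field not in result:
                    result[field] = value
        merged.append(result)
    return merged


# Hypothetical sample data.
records = [
    {"email": "Jane@Example.com", "source": "webform", "phone": "0102030405", "city": "Lyon"},
    {"email": "jane@example.com", "source": "crm", "phone": "", "city": "Paris"},
    {"email": "paul@example.com", "source": "support", "phone": "0607080910", "city": ""},
]

merged = merge_duplicates(records)
```

In this example the two records for Jane are merged: the city comes from the higher-priority source, while the phone number, missing there, is filled from the other one.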

Step #7 – Standardise the fields in the customer database

Postal code, title, telephone number, age: all fields must be standardised to ensure the uniformity of your database. This means configuring your collection tools/sources correctly (so that they send data to the customer database in the right form) and writing down data-entry conventions for users who enter data manually.
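Standardisation rules of this kind can be expressed as simple mapping tables. The sketch below is illustrative only: the mappings are small invented samples, the function names are hypothetical, and the phone rule simply keeps digits, which discards a leading “+” and is therefore not suitable as-is for international numbers.

```python
import re

# Illustrative mapping tables; a real project would maintain fuller ones.
TITLE_MAP = {
    "mr": "Mr.", "mr.": "Mr.", "mister": "Mr.",
    "mrs": "Mrs.", "mrs.": "Mrs.",
    "ms": "Ms.", "ms.": "Ms.",
}
STREET_ABBREVIATIONS = {
    "bd": "boulevard", "bd.": "boulevard",
    "av": "avenue", "av.": "avenue",
}


def standardise_title(raw: str) -> str:
    """Map known title variants to one form; leave unknown values as they are."""
    cleaned = raw.strip()
    return TITLE_MAP.get(cleaned.lower(), cleaned)


def standardise_street(raw: str) -> str:
    """Expand known street-type abbreviations word by word."""
    words = raw.strip().split()
    return " ".join(STREET_ABBREVIATIONS.get(word.lower(), word) for word in words)


def standardise_phone(raw: str) -> str:
    """Keep digits only (simplification: drops '+' and any extension markers)."""
    return re.sub(r"\D", "", raw)
```

With these rules, `standardise_street("12 bd. Voltaire")` gives `"12 boulevard Voltaire"` and `standardise_phone("01 02 03 04 05")` gives `"0102030405"`.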

Step #8 – Enrich your customer base

We recommend that you enrich your customer base by filling in as many empty fields as possible, so that it no longer looks like Swiss cheese. How to proceed? There are several possibilities:

  • Interview your customers by creating questionnaire campaigns.
  • Import data from your other databases & tools.
  • Use data enrichment tools / data providers.

Here are two bonus tips for cleaning up your customer base:

  • If you are cleaning up a lead base, remove any odd or obviously non-serious email addresses such as “[email protected]”, “[email protected]”, etc. These are fake leads (typically people who have created an address to download resources for free without giving their real email address).
  • Analyse the performance of your marketing campaigns (especially your emailing campaigns) to identify erroneous email addresses as well as individuals who are inactive or no longer wish to receive your communications (opt-out, spamming, etc.).
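The first tip can be partly automated with a simple filter. This is a rough sketch, not a validation service: the pattern only checks the general shape of an address, and the lists of suspicious local parts and disposable domains are short, illustrative samples that you would need to extend for your own base.

```python
import re

# Very loose shape check: something@something.something, no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Illustrative samples only.
SUSPICIOUS_LOCAL_PARTS = {"test", "asdf", "azerty", "qwerty", "none", "noemail"}
DISPOSABLE_DOMAINS = {"mailinator.com", "yopmail.com"}


def is_suspicious(email: str) -> bool:
    """Flag addresses that are malformed or look like throwaway entries."""
    candidate = email.strip().lower()
    if not EMAIL_PATTERN.match(candidate):
        return True
    local, _, domain = candidate.rpartition("@")
    return local in SUSPICIOUS_LOCAL_PARTS or domain in DISPOSABLE_DOMAINS


# Hypothetical sample data.
leads = ["jane.doe@example.com", "test@example.com", "foo@yopmail.com", "not-an-email"]
flagged = [email for email in leads if is_suspicious(email)]
```

A filter like this is best used to flag leads for review rather than to delete them automatically, since a legitimate contact can occasionally match one of the rules.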

We are now going to present four tools / solutions / providers that can help you clean up your customer base.

What solutions / service providers can help me clean up my customer base?

We have selected four tools for you: Octolis, DQE Software, Winpure and Amabis.

Octolis

Octolis is a lightweight Customer Data Platform designed to connect your different data sources, prepare your data for business use cases and synchronise the prepared data (segments, scores, aggregates) in your Sales and Marketing tools. Octolis offers advanced features to cleanse, de-duplicate, merge, unify, enrich, score and segment customer data.

One of the great advantages of Octolis lies in its very broad functional coverage. More than a Data Quality tool, Octolis is a new generation Customer Data Platform, independent of your database, over which you retain full control.

DQE Software

DQE Software is a software company specialising in data quality. Founded in 2008, it offers several tools to verify and normalise postal addresses, email addresses, telephone numbers, gender, city, etc. DQE has also developed the “DQE DEDUP” product to deduplicate your customer databases. All software is available under licence or in SaaS mode (cloud). DQE is a fast-growing player in the data quality market.

Winpure

Winpure is one of the most popular data cleaning tools on the market worldwide, if not the most popular. It automates a large part of the cleaning operations: deduplication, correction of erroneous data, data normalisation, email/phone verification, etc.

Winpure can be used to clean up your CRM, Excel spreadsheets, or SQL Server, Access or dBase databases. It also handles text files (.txt) and offers an API. It is a relatively affordable solution.

Amabis

Launched in 1996, Amabis is a service provider and software publisher specialising in data quality and CRM/Marketing database management. From postal standardisation and deduplication to email/phone validation, Amabis offers a range of high quality solutions to help you clean up your customer base and, more broadly, support your data quality strategy. All Amabis tools are available under licence or in SaaS mode.

Amabis also offers operational support services. Note that Amabis offers a CRM (AmaCRM, targeting SMEs) with native data quality management.

For more tools, check out our Top 10 email list cleaning tools.

We are coming to the end of this article. We hope you have found it useful. Would you like to receive advice on your data quality approach or the choice of tools/solutions? Do not hesitate to contact us! We will be happy to discuss your issues.
