Five steps in the data cleansing process

Data cleansing is hard to do and hard to maintain, and it is hard to know where to start. There always seem to be errors, dupes, or format inconsistencies. One of the most challenging aspects of data cleansing has got to be maintaining a clean list of data, whether it’s sourced from multiple vendors, manually entered by your hard-working interns, or a combination of both.

One mistype could create a whole host of problems within your database and lead to hours upon hours of manual cleansing that could so easily have been avoided. So, what is the solution to these frustrating, time-consuming problems?

The answer is a simple, five-step data cleansing process that can help you target the areas where your data is weak and needs more attention. From the first planning stage to the last step of monitoring your cleansed data, the process will help your team zero in on dupes and other problems within your data.

What’s important to remember about the five-step process is that it’s a continuous cycle, so you can start small and make incremental changes, repeating the process several times to keep improving data quality.


Plan

Firstly, you want to identify the set of data that is critical for making your marketing efforts the best they can possibly be. When looking at data, you should focus on high-priority data and start small. The fields you will want to identify will be unique to your business and the information you are specifically looking for, but they may include job title, role, email address, phone, industry, and revenue, among others.

It would be beneficial to create and put into place specific validation rules at this point, both to standardise and cleanse the existing data and to automate this process for the future. For example, make sure your postal codes and state codes agree, make sure the addresses are all standardised in the same way, and so on. Seek out your IT team members for help with setting these up! They are good for more than just deleting a virus!
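To make the idea of a validation rule concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the field names (postal_code, state, email), the validate_record helper, and the tiny ZIP3_TO_STATE lookup, which covers only four US ZIP prefixes and would need to be replaced by a full postal reference table in practice.

```python
import re

# Illustrative sample only: four US ZIP prefixes, not a complete table.
ZIP3_TO_STATE = {"100": "NY", "606": "IL", "941": "CA", "021": "MA"}

# Deliberately loose email shape check: something@something.something
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_record(record):
    """Return a list of rule violations for one contact record (a dict)."""
    problems = []

    zip_code = (record.get("postal_code") or "").strip()
    state = (record.get("state") or "").strip().upper()

    if not re.fullmatch(r"\d{5}", zip_code):
        problems.append("postal_code is not five digits")
    else:
        # Only flag a mismatch when the prefix is in our sample table.
        expected = ZIP3_TO_STATE.get(zip_code[:3])
        if expected is not None and expected != state:
            problems.append(f"postal_code {zip_code} does not match state {state}")

    email = (record.get("email") or "").strip()
    if not EMAIL_RE.match(email):
        problems.append("email is malformed")

    return problems
```

A record that passes every rule returns an empty list, so anything with a non-empty result can be routed to whoever handles exceptions by hand.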

Analyse to cleanse

After you have an idea of the priority data your company wants, it’s important to go through the data you already have in order to see what is missing, what can be thrown out, and where the gaps lie between what you have and what you need.
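One rough way to see what is missing is to measure how often each priority field is actually filled in. The sketch below assumes your records are plain dictionaries; the PRIORITY_FIELDS list and the completeness function are hypothetical names chosen for this example.

```python
# Hypothetical priority fields; swap in whatever matters to your business.
PRIORITY_FIELDS = ["job_title", "email", "phone", "industry"]


def completeness(records, fields=PRIORITY_FIELDS):
    """Return the share of records (0.0 to 1.0) with a non-blank value per field."""
    total = len(records)
    report = {}
    for field in fields:
        # Treat None, missing keys, and whitespace-only strings as blank.
        filled = sum(1 for r in records if str(r.get(field) or "").strip())
        report[field] = filled / total if total else 0.0
    return report
```

Fields with a low score are the obvious place to start, which fits the advice to begin small and focus on high-priority data.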

You will also need to identify a set of resources to handle and manually cleanse exceptions to your rules. The amount of manual intervention depends directly on the level of data quality you are willing to accept. Once you build out a list of rules or standards, it’ll be much easier to actually begin cleansing.

Implement automation

Once you have a clean starting point, you should begin to standardise and cleanse the flow of new data as it enters the system by creating scripts or workflows. These can be run in real time or in batches (daily, weekly, monthly), depending on how much data you’re working with. These routines can be applied to new data or to previously keyed-in data.
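A batch routine of this kind might look something like the sketch below. It assumes records are dictionaries with email, phone, and name fields, and that a matching email address is enough to call two records duplicates; both the field names and that matching rule are assumptions you would adapt to your own data.

```python
def normalise(record):
    """Return a copy of the record with email, phone, and name standardised."""
    cleaned = dict(record)
    # Lower-case and trim the email so "Ann@X.com " and "ann@x.com" match.
    cleaned["email"] = (record.get("email") or "").strip().lower()
    # Keep only the digits of the phone number.
    cleaned["phone"] = "".join(ch for ch in (record.get("phone") or "") if ch.isdigit())
    # Collapse repeated whitespace in the name.
    cleaned["name"] = " ".join((record.get("name") or "").split())
    return cleaned


def clean_batch(records):
    """Normalise a batch and drop later records that repeat an earlier email."""
    seen = set()
    unique = []
    for record in map(normalise, records):
        key = record["email"]
        if key and key in seen:
            continue  # duplicate of a record already kept
        seen.add(key)
        unique.append(record)
    return unique
```

Records with no email at all are kept rather than merged, since there is nothing reliable to match them on; those are candidates for manual review.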

Append missing data

Step four is important, especially for records that cannot be automatically corrected. Examples include email addresses, phone numbers, industry, and company size, among others.

It is important to identify the right way of getting hold of the missing data, whether it’s from third-party append sites, reaching out to the contacts, or just good old-fashioned Google.
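Whichever source you use, it helps to fill in only the blanks rather than overwrite what you already have. Here is a small sketch of that idea; the append_missing function, the field names, and the shape of the enrichment dictionary are all assumptions for illustration and do not reflect any particular append provider.

```python
def append_missing(record, enrichment, fields=("phone", "industry", "company_size")):
    """Fill blank fields in record from enrichment without overwriting existing values.

    Returns the merged record and the list of fields that were filled.
    """
    merged = dict(record)
    filled = []
    for field in fields:
        current = str(merged.get(field) or "").strip()
        candidate = str(enrichment.get(field) or "").strip()
        # Only append when we have nothing and the source has something.
        if not current and candidate:
            merged[field] = candidate
            filled.append(field)
    return merged, filled
```

Returning the list of filled fields gives you a simple audit trail of what came from outside your own database.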


Monitor

You will want to set up a periodic review so that you can catch issues before they become a major problem. You should be monitoring your database as a whole as well as in individual units: contacts, accounts, and so on. You should also be aware of bounce rates, and keep track of bounced emails as well as response rates.

It is important to keep up to date with who is working at each company. If a customer has not replied to any campaign in more than six months, it’s a good idea to dig a little deeper and make sure that the person still holds that position, is still at that company, or, quite frankly, depending on how well you’ve maintained the database, hasn’t already kicked the bucket.
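Both checks are easy to script as part of a periodic review. The sketch below is one possible approach; the function names, the last_response field, and the 183-day cut-off used to approximate six months are assumptions for the sake of the example.

```python
from datetime import date, timedelta


def bounce_rate(sent, bounced):
    """Share of sent emails that bounced (0.0 when nothing was sent)."""
    return bounced / sent if sent else 0.0


def stale_contacts(contacts, today, max_silence_days=183):
    """Return emails of contacts with no response in roughly six months.

    Contacts that have never responded (last_response is None) count as stale.
    """
    cutoff = today - timedelta(days=max_silence_days)
    return [
        c["email"]
        for c in contacts
        if c.get("last_response") is None or c["last_response"] < cutoff
    ]
```

The resulting list is a starting point for the digging described above, not a list of records to delete.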

The end of this cycle, or step six if you will, is to bring the whole process full circle. Revisit your plans from the first step and re-evaluate. Can your priorities be changed? Do the rules you implemented still fit into your overall business strategy?

Pinpointing these necessary changes will equip you to work through the cycle again: make changes that benefit your process, and conduct periodic reviews to make sure that your data cleansing is running smoothly and accurately.

Follow this cycle and you’ll be well on your way to having the cleanest and thus most effective data.

