/ How To Guides

Removing Duplicate Records in your CRM

Your database isn’t perfect. It might be a mess. It might just need a small dose of spring cleaning. The most common issue? Duplicate records.

As a former database adminstrator and freelance data “cleaner”, I know how easy it is to find yourself with a bit of database clutter. Hold tight, because the first time hurts a little, but if you come up with a regular data cleansing plan, it will only get easier from here.

via GIPHY

Take it from my experience, it is better to sit down and face the mess, rather than ignore it. With duplicates in your CRM, you may inadvertently find yourself mailing an invoice to an outdated business address (oops! the right one was on the record we added last week!) or presenting the wrong numbers to your boss (I didn’t realize that we actually won three Deals with this business, the record only showed two…).

So, embrace the fear of embarrassment and let it motivate you to take a look at the integrity of your database.

We’ll use Pipedrive CRM as an example here. This CRM does a great job of catching duplicates during the import process, and it also have a nifty “merge” feature that you can use on individual records that you’ve already spotted as duplicates. These both work great, but for mass amounts of data, I want a quick and easy way to catch those duplicate records that managed to sneak their way into my database before I proceed with my clean-up efforts. First up, identification:

I recommend that if your CRM doesn’t have a built-in data management tool, you get your data out into Excel so that you can more easily manipulate and review it.

Before we get started with the fun stuff, I highly recommend taking some time at this early stage to:

  1. Fill in any gaps in the data that you can use to determine if a record is a duplicate. For example, if there should only be one Organization record associated to a company website, make sure as many website fields are filled in. This way you won’t end up with a bunch of empty fields and no way to effecitvely check for duplicates!
  2. Normalize these data fields. This means your fields (i.e. website) should be formatted the same way. Right now, chances are that some of your websites start with “www.” and others start with “https://”. Checking for duplicates will work best if your data is formatted consistently.

Bonus: at the end of this exercise, your database will be oh-so-much prettier with fewer empty fields and consistently formatted field values. Handy for extracting information like mailing lists, proposals, invoices — the list goes on and on and on…

If you’re ready to get moving with the deduplication, start by exporting the records of choice — all of ‘em! I’m going to use Organization records here…
1-4o4l3iFMmqx1lgiFLKYh2A

Take a look at your Excel file, and take a deep breath…

First, decide which field you want to run the duplicate check on. For me, I want to check for duplicate addresses, since I know I should only have one Organization record per unique address. Sort your column alphabetically so that you will end up seeing the duplicate rows next to one another (trust me, this will make your life easier!)

Now you are going to use Excel’s conditional formatting to automatically highlight the duplicate field values:

  1. Go to Home>Styles>Conditional Formatting (this is true for most versions of Excel. If yours is a little different, don’t fret! Google is your best friend for finding these small variances)
  2. Select Highlight Cells rules, and find “duplicate” in the options
  3. Select the format for which you want to highlight your cells. I’m a classic yellow kind of gal, but you can go crazy here if you like
  4. Select “Ok” and let the magic happen!

1-OkpKR8oyC4ZprDiRggrj8A

Now you will easily be able to identify the records which have duplicate info. Do these steps as many times as you need for as many columns as you need to feel confident that you spotted all the dupes.

As you can guess, the process of reviewing these highlighted records is critical. Sometimes you may find that the records are unique and can be left alone. Most of the time you will find multiples of the same record, each housing some unique strands of information that you wish to keep.

When you run across these records with duplicate info, head on over to our trusty Pipedrive interface, and search for the first occurrence of the duplicate.

Handy tip: you can use the unique record ID from your spreadsheet and insert it into the Pipedrive record URL to quickly locate the exact record.

1-ZwijjV5vjTKrDiVzJCPiAQ

Once you are on the record, use the ellipses to locate the additional record options, and select Merge.

From here, select which record will remain primary — that is, which record you wish to retain key details like created date, owner, etc. You can preview what the merge will look like before proceeding. Once you are ready, select Merge, and Pipedrive will move over all related items (Notes, Tasks, Deals, etc.), update empty fields and add new fields where there may be a new incoming value (like phone number).

1-cr6Qme1xXJHQwDUZXUKlSg

Voila! You have now merged the records. Do this as many times as necessary to merge all of your identified duplicates and rest assured you have retained all of the pertinent fields and related items. No data loss and a clean database! Do this regularly and the process will become quick and painless. Elicit the help of an intern and you just became an expert delegator and database cleaner!

via GIPHY

Removing Duplicate Records in your CRM
Share this

Subscribe to Import2 Blog