A New Way to Clean Up Duplicate Records

In this post, we’ll talk about a new way to clean up duplicate records using a modern enterprise master patient index (EMPI), research automation, and robots to automate 80% of the work involved.  First, we’ll explain what a duplicate is, what causes duplicates, and why they need to be cleaned up.  Next, we’ll outline the steps of a traditional duplicate cleanup and then show how much of that work can be automated.

What Is a Duplicate and Why Do Duplicates Need to Be Cleaned Up?

In this case, we’re using the term ‘duplicate’ to refer to cases where there are two or more records for the same person in a SINGLE data system.  Duplicates negatively impact a healthcare organization in a couple of ways:

  • They affect patient safety because a patient’s medical history is split across multiple records.
  • The quality of reporting and analytics is degraded because the same patient may be counted multiple times.

What Causes Duplicates?

There are two reasons that duplicate records get created, and both result from missing or changed demographic information:

  1. An existing record cannot be located
  2. An existing record is located, but it cannot be confirmed as belonging to the same person

These issues affect manual data entry as well as automated imports.

How Do Traditional Cleanups Work?

There are four steps to cleaning up duplicates:

  1. Identification: finding the duplicate pairs.  This step is often reactive, with duplicates reported to the data integrity team as they are discovered by the business.
  2. Verification: confirming that the records in a pair really do belong to the same patient.  If the demographic information on the records doesn’t match closely, additional research is needed, which can take the form of comparing photo IDs, signatures, etc.
  3. Merge Planning: deciding which record to keep.  To make this decision, a data integrity specialist follows a set of rules, or decision tree, based on their organization’s survivorship policy.  The rules might consider key factors such as which record contains the most history, which one has open items, or which has the most current information.
  4. Merge Execution: executing the merge in the transactional system.  This step can be tedious and often has to be replicated in other downstream systems.

For a single duplicate, this whole process can take anywhere from 5 to 20 minutes.
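
To make the merge planning step more concrete, here is a minimal Python sketch of a survivorship rule.  The field names, the `PatientRecord` type, and the priority order (open items first, then most history, then most current information) are all assumptions for illustration; every organization’s survivorship policy will define its own factors and order.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PatientRecord:
    # Hypothetical fields; a real system will have its own schema.
    record_id: str
    encounter_count: int   # stand-in for "most history"
    open_items: int        # stand-in for "has open items"
    last_updated: date     # stand-in for "most current information"


def choose_survivor(a: PatientRecord, b: PatientRecord) -> PatientRecord:
    """Pick the record to keep by comparing factors in an assumed priority order."""
    def rank(r: PatientRecord):
        # Tuples compare left to right, so a later factor only matters
        # when every earlier factor is tied.
        return (r.open_items > 0, r.encounter_count, r.last_updated)

    # On a complete tie, max() returns the first argument.
    return max(a, b, key=rank)
```

Writing the policy as an ordered list of factors keeps the decision repeatable, which is what makes this step a candidate for automation.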

How Does Occam Make Duplicate Cleanups More Efficient?

Cleaning up duplicates in a data source is no small undertaking.  Cleanup projects often face time and resource constraints: the data integrity team is usually already inundated with work and doesn’t have extra capacity to devote to the cleanup.  As a result, it’s hard for them to keep up with the new duplicates being created, let alone make progress on the backlog.  Cleaning up duplicates is also tedious work, and doing repetitive work for long stretches of time leads to human error.  Our experience has shown that without a thorough process, you can expect 12 mistakes per 1,000 decisions.
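
As a rough back-of-the-envelope illustration, the figures quoted in this post (5 to 20 minutes per duplicate and 12 mistakes per 1,000 decisions) can be turned into an estimate for a backlog.  The function name and the example backlog size are hypothetical, and the sketch assumes one decision per duplicate.

```python
def estimate_manual_cleanup(duplicates: int,
                            min_minutes: float = 5,
                            max_minutes: float = 20,
                            mistakes_per_1000: float = 12) -> dict:
    """Estimate effort and expected mistakes for a fully manual cleanup.

    Assumes one decision per duplicate, which is a simplification.
    """
    return {
        "min_hours": duplicates * min_minutes / 60,
        "max_hours": duplicates * max_minutes / 60,
        "expected_mistakes": duplicates * mistakes_per_1000 / 1000,
    }


# Hypothetical backlog of 6,000 duplicates.
print(estimate_manual_cleanup(6000))
# {'min_hours': 500.0, 'max_hours': 2000.0, 'expected_mistakes': 72.0}
```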

Proactively Identify Duplicates in Real-time

Occam EMPI automates the Identification step and supports a proactive approach to finding duplicates, rather than waiting for someone to submit a help desk ticket or send an email.  With interfaces from the source system feeding updates to the EMPI, duplicates are detected in real time.  This means there are no more “hidden” duplicates.
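
To give a feel for what automated identification involves, here is a deliberately simple Python sketch that groups records by a normalized name and date of birth and reports every pair that shares a key.  This is not Occam’s matching algorithm, which this post does not describe; the record fields and function names are hypothetical, and exact-key grouping like this would miss the changed-demographics cases discussed earlier.

```python
from collections import defaultdict
from itertools import combinations


def normalize(value: str) -> str:
    """Lowercase and drop punctuation and spaces so "O'Brien" matches "obrien"."""
    return "".join(ch for ch in value.lower() if ch.isalnum())


def candidate_pairs(records: list[dict]) -> list[tuple[str, str]]:
    """Return pairs of record ids that share a normalized name and date of birth."""
    buckets = defaultdict(list)
    for rec in records:
        key = (normalize(rec["last_name"]), normalize(rec["first_name"]), rec["dob"])
        buckets[key].append(rec["id"])

    pairs = []
    for ids in buckets.values():
        # Every combination of two ids in the same bucket is a candidate pair.
        pairs.extend(combinations(sorted(ids), 2))
    return pairs
```

Running a check like this on every incoming update, rather than in an occasional batch, is the general idea behind detecting duplicates as they are created.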

Shrink the Number of Manual Identity Decisions by Up to 90%

Occam’s Research Automation module uses LexisNexis’ semi-public data to automate between 80% and 90% of the identity decisions that would otherwise be made manually in the Verification step of a traditional cleanup.  LexisNexis can verify that records belong to the same person even when much of the demographic data differs between them, such as when a woman has married, changed her last name, and moved to a new address.


Another way in which Occam can help is by providing the ability to prioritize cleanup tasks.  One way to prioritize is “just-in-time”, meaning you work the duplicates whose records have recent activity before the tasks for patients you may never see again.  This allows the data integrity team to narrow their focus and work the most important tasks first.  Other common ways to prioritize are by data source or by patient classification, such as high-risk.
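
A prioritization rule like this can be expressed as a sort key.  The Python sketch below puts high-risk patients first and then orders tasks by how recently the patient had activity; the task fields, the function name, and the choice to combine the two criteria are assumptions for illustration, not a description of how Occam orders its work queue.

```python
from datetime import date


def prioritize(tasks: list[dict], today: date) -> list[dict]:
    """Order cleanup tasks: high-risk first, then most recent activity first."""
    def sort_key(task: dict):
        days_since_activity = (today - task["last_activity"]).days
        # False sorts before True, so "not high_risk" puts high-risk tasks first.
        return (not task["high_risk"], days_since_activity)

    return sorted(tasks, key=sort_key)
```

Because the rule lives in one small function, changing the policy (for example, prioritizing by data source instead) means changing only the sort key.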

Robots (Say what?!)

The merge planning and merge execution steps require the least skill but consume up to 80% of the time a data integrity specialist spends resolving a duplicate.  The rules for these steps are generally well defined, which makes them good candidates for automation using robots.  Think of the robot as an assistant that goes off and performs the specific steps in your decision tree.  This frees up staff to focus on the verification phase of the cleanup, which is where their expertise is needed.  In addition, since robots are software, they can run 24/7, churning through a large number of merges in a short amount of time.  To see a duplicate cleanup robot in action, check out the video “A New Way To Clean Up Duplicates”.

What’s Next?

If you’d like to see a demonstration of how Occam EMPI can help with a duplicate cleanup project, contact our sales department.

If you’d like to learn about other ways in which an EMPI can help your particular specialty, click any of the links below.
