What is Master Patient Index?


Many of the people we talk to for the first time are unclear on exactly what a master patient index or enterprise master patient index is, let alone why you would need one.  The goal of this post is to provide a jargon-free introduction to the technology.  It’s a monster topic, so we’ll only scratch the surface.  We hope you find it helpful!

Master Patient Index vs. Enterprise Master Patient Index

The terms “Master Patient Index” and “Enterprise Master Patient Index” are often used interchangeably, but they are not the same thing.  Master Patient Index (MPI) technically refers to a single source system and all its patients.  An Enterprise Master Patient Index (EMPI) is a database that brings together, or “links”, patient records from multiple source systems.  In practice, MPI is also used as shorthand for EMPI.

What Does a Modern EMPI Do?

A modern EMPI should support four objectives:

  1. Identifying that records from different source systems belong to the same patient and linking them together even when key information such as social security number is missing or wrong. To link the records together, an EMPI will assign the same enterprise identifier to each record belonging to the same patient in each source system.
  2. Making use of outside semi-public data to reduce or eliminate the manual effort involved in patient matching.
  3. Generating a golden record that contains the best-known information for a person whose information is spread across many different sources.
  4. “De-duping” a transactional system by identifying cases where there’s more than one record for the same patient and providing workflow for resolving them.
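
To make objective 3 concrete, here is a minimal sketch of building a golden record from records that have already been linked. The rule used here (keep the most recently updated non-empty value for each field) is our assumption for illustration, as are the field names and the `updated` timestamp; real EMPIs typically let you configure this kind of rule.

```python
def golden_record(records, fields):
    """Build one 'best-known' record from several linked source records.

    Assumed rule (not a standard): for each field, keep the most
    recently updated non-empty value.
    """
    # Newest first; "updated" is a hypothetical ISO-8601 date string,
    # so plain string sorting puts the latest date first.
    newest_first = sorted(records, key=lambda r: r["updated"], reverse=True)
    return {
        f: next((r[f] for r in newest_first if r.get(f)), None)
        for f in fields
    }


linked = [
    {"source": "lab", "updated": "2021-03-01", "phone": "555-0100", "email": ""},
    {"source": "billing", "updated": "2023-07-15", "phone": "", "email": "pat@example.com"},
]
best = golden_record(linked, ["phone", "email"])
```

In this example the phone number comes from the older lab record, because the newer billing record has none.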

How Does an EMPI Work?

An EMPI constantly looks for potential matches across all data sources that are loaded into the EMPI.  The matching is handled by a match algorithm that locates and assesses the quality of matches even when demographic information is missing or wrong.  The algorithm follows the same sort of logic that a well-trained data integrity specialist would when comparing two records to see if they belong to the same person.  The algorithm should account for:

  • common data entry errors, like a transposed month and day in a date of birth, misspelled names, and inconsistent addresses.
  • common false positives, like twins or junior/senior.
  • corrupt records, where one person’s information is overlaid with another person’s information.
  • default values, such as all ones or nines for a social security or phone number, or ‘baby girl’ or ‘John Doe’ for a name.
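
A few of the checks in the list above can be sketched in a handful of lines. This is only an illustration, not how any particular EMPI does it: the function names and the placeholder-name list are hypothetical, and the name comparison uses Python's general-purpose `difflib.SequenceMatcher` as a stand-in for a purpose-built name matcher.

```python
from datetime import date
from difflib import SequenceMatcher

# Hypothetical list of placeholder names; a real system would maintain its own.
PLACEHOLDER_NAMES = {"baby girl", "baby boy", "john doe", "jane doe"}


def is_placeholder_name(name):
    """True for default names such as 'Baby Girl' (case and spacing ignored)."""
    return " ".join(name.lower().split()) in PLACEHOLDER_NAMES


def is_placeholder_number(value):
    """True when every digit is 1 or every digit is 9, e.g. 999-99-9999."""
    digits = {c for c in value if c.isdigit()}
    return digits in ({"1"}, {"9"})


def compare_dob(a, b):
    """Classify two dates of birth as exact, transposed (month/day swapped), or different."""
    if a == b:
        return "exact"
    if a.year == b.year and a.month == b.day and a.day == b.month:
        return "transposed"
    return "different"


def name_similarity(a, b):
    """Similarity between 0.0 and 1.0; tolerant of small misspellings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()
```

A record whose social security number is a placeholder would simply have that field ignored during matching rather than counted as agreement or disagreement.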

When looking for potential matches, an EMPI will start by looking for records that have similar values in key demographic fields. Then it narrows those results by looking at the overall quality of the matches and assigns a score and classification to each potential match. 
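
The two-step approach described above (find rough candidates first, then score them) might look something like the sketch below. Everything specific in it is an assumption made for illustration: the grouping key, the field names, the `YYYY-MM-DD` date format, and the weights. It also uses exact field comparison to stay short, where a real match algorithm would use fuzzy comparison.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical weights (out of 100) for agreement on each field.
WEIGHTS = {"first_name": 30, "last_name": 30, "dob": 30, "zip": 10}


def rough_key(rec):
    """Cheap grouping key: first three letters of last name plus birth year."""
    return (rec["last_name"][:3].lower(), rec["dob"][:4])


def candidate_pairs(records):
    """Step 1: only pair up records that share a rough key."""
    groups = defaultdict(list)
    for rec in records:
        groups[rough_key(rec)].append(rec)
    for group in groups.values():
        yield from combinations(group, 2)


def score(a, b):
    """Step 2: add up the weights of fields that are present and agree."""
    return sum(w for f, w in WEIGHTS.items() if a.get(f) and a.get(f) == b.get(f))


records = [
    {"id": "lab:1", "first_name": "Maria", "last_name": "Lopez", "dob": "1975-06-02", "zip": "78701"},
    {"id": "billing:7", "first_name": "Maria", "last_name": "Lopez", "dob": "1975-06-02", "zip": "78745"},
    {"id": "billing:9", "first_name": "Mark", "last_name": "Smith", "dob": "1990-01-15", "zip": "78701"},
]
scored = [(a["id"], b["id"], score(a, b)) for a, b in candidate_pairs(records)]
```

The first step keeps the EMPI from comparing every record against every other record; the second step is where the quality of each potential match is assessed.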

How Are Matches Classified?

Potential matches are given either a ‘Yes’ or ‘Maybe’ classification.  Records that are given a ‘Yes’ classification, meaning that there is no doubt these records belong to the same person, will be automatically linked together by assigning them the same enterprise identifier.  No human interaction is required.  When two records look like they could be the same person but there are enough differences that it is not safe to auto-link them, they will be given a ‘Maybe’ classification. 
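
As a rough sketch, the classification step can be thought of as two cut-off scores. The threshold values and the 0 to 100 scale below are made up for illustration; in practice they are tuned for each organization's data.

```python
# Hypothetical thresholds on a 0-100 match score.
YES_THRESHOLD = 90
MAYBE_THRESHOLD = 60


def classify(score):
    """Return 'Yes' (auto-link), 'Maybe' (needs review), or None (not a potential match)."""
    if score >= YES_THRESHOLD:
        return "Yes"
    if score >= MAYBE_THRESHOLD:
        return "Maybe"
    return None
```

Raising the ‘Yes’ threshold makes auto-linking safer but sends more matches to manual review; lowering it does the opposite.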

Many EMPIs can use semi-public information from third-party data aggregators like LexisNexis to help identify people.  You may hear this functionality referred to as Research Automation or Referential Matching.  The result is that many of the ‘Maybe’ matches will be turned into ‘Yes’ matches, shrinking the amount of manual work required.  The ‘Maybe’ matches that are left become tasks that need to be reviewed by a data integrity specialist.

How Are Tasks Handled?

As mentioned above, not every match can be automated.  Those that can’t are turned into tasks for a person to research and determine whether the records do belong to the same person.  When working tasks, there are usually three types of decisions the EMPI user can make: “Yes”, “No” and “Insufficient Info”.

A “Yes” decision means that you’ve done the research and can definitively say that the records are for the same patient.  When a “Yes” decision is made, the EMPI will assign both records the same enterprise ID, thereby linking them together.  A “No” decision means that you’ve done the research and can say without a doubt that the records belong to two different people.  The EMPI will then block those two records from future matching.  An “Insufficient Info” decision means that even after doing the research, you can’t say whether the records belong to the same person.

A modern EMPI can make the decision easier by flagging common types of false positives.  A false positive is where it looks like the records might be for the same person, but they aren’t.  Examples would be twins with similar first names or fathers and sons with the same first name.
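
Here is a simplified sketch of what the EMPI does with each of the three decisions. The data structures are hypothetical: a plain dictionary mapping record IDs to enterprise IDs, and a set of blocked pairs. A real EMPI would also record who made the decision and why.

```python
def apply_decision(enterprise_ids, blocked_pairs, rec_a, rec_b, decision):
    """Apply a reviewer's decision about two records.

    enterprise_ids: dict of record id -> enterprise id (hypothetical structure)
    blocked_pairs:  set of frozenset({rec_a, rec_b}) never to be matched again
    """
    if decision == "Yes":
        keep, retire = enterprise_ids[rec_a], enterprise_ids[rec_b]
        # Move every record under the retired enterprise ID, not just rec_b,
        # so records previously linked to rec_b stay linked.
        for rec, eid in list(enterprise_ids.items()):
            if eid == retire:
                enterprise_ids[rec] = keep
    elif decision == "No":
        blocked_pairs.add(frozenset((rec_a, rec_b)))
    elif decision != "Insufficient Info":
        raise ValueError(f"unknown decision: {decision!r}")
    # "Insufficient Info" changes nothing; the records stay unlinked and unblocked.


ids = {"lab:1": "E1", "billing:9": "E2", "clinic:4": "E2", "lab:7": "E3"}
blocked = set()
apply_decision(ids, blocked, "lab:1", "billing:9", "Yes")
apply_decision(ids, blocked, "lab:1", "lab:7", "No")
```

After these two calls, the first three records share enterprise ID E1, and the pair lab:1 / lab:7 will not be proposed as a match again.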

When working tasks, it is important that the user document the research that was done.  That way, an auditor performing a routine review of the decisions does not have to repeat the research to know that a decision was correct.

Task Prioritization

It is normal to have a lot of tasks when an EMPI first goes live, so the ability to prioritize them is important.  The result is that tasks can be completed just-in-time.  It is common to prioritize active patients or those in high-risk groups.  Some organizations may prioritize duplicate tasks (more than one record for the same patient in a single source system) over link tasks.  For more information on how to “de-dupe” a source system, see the article “Using an EMPI to Cleanup Duplicates.”
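
As an illustration, prioritization can be as simple as sorting the task list. The ordering below (active patients first, then high-risk groups, then duplicate tasks ahead of link tasks) is just one possible policy, and the task fields are hypothetical.

```python
def priority(task):
    """Sort key: False sorts before True, so each 'not' puts the preferred tasks first."""
    return (
        not task["active_patient"],   # active patients first
        not task["high_risk"],        # then high-risk groups
        task["kind"] != "duplicate",  # then duplicate tasks before link tasks
    )


def prioritize(tasks):
    """Return tasks in the order they should be worked."""
    return sorted(tasks, key=priority)


tasks = [
    {"id": 1, "active_patient": False, "high_risk": False, "kind": "link"},
    {"id": 2, "active_patient": True, "high_risk": False, "kind": "link"},
    {"id": 3, "active_patient": True, "high_risk": True, "kind": "link"},
    {"id": 4, "active_patient": True, "high_risk": True, "kind": "duplicate"},
]
work_order = [t["id"] for t in prioritize(tasks)]
```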

What Is Auditing and Why Is It Important?

Mistakes happen.  When they do, it is important to be able to catch them and coach the person who made the mistake.  Auditing provides the ability to selectively review manual task decisions made by the data integrity team and correct any that might be wrong.  A good EMPI should make it easy for an auditor to zero in on questionable decisions.  This gives supervisors the chance to verify that their team understands the organization’s data integrity policies and to provide coaching where improvement may be needed.

How Is Match Quality Measured?

‘Recall’ and ‘Precision’ are the measures most often used to describe the accuracy of a match algorithm.  Recall is the measure of how well the algorithm finds all potential matches.  Once potential matches have been located, the algorithm narrows those results by assessing how well they stack up against the defined criteria.  Precision is the measure of how accurate that assessment is.

To use a car analogy, let’s say you want to find all the red 1985 convertible Ford Mustangs with black leather seats in your local junkyard.  You don’t want to look at every car in the junkyard, so you start by looking for all red convertible Mustangs.  If all the cars meeting your criteria were located, your search would get a perfect recall score.  If any cars that meet the search criteria were missed, the recall score goes down.

If your search returned only the cars that meet your criteria exactly, it would receive a perfect precision score.  If your search results included a Mustang with white seats or no rag top (AKA false positives), the precision score would suffer.

The tricky part about measuring recall and precision is that you need something to compare your results to.
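
Assuming you do have a trusted reference set of true matches to compare against, both measures are simple to calculate. In the sketch below, each match is a pair of record IDs; the IDs and the reference set are invented for the example.

```python
def precision_recall(found, truth):
    """Compare the matches an algorithm found against a trusted reference set.

    precision = correct matches found / all matches found
    recall    = correct matches found / all true matches
    """
    correct = len(found & truth)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(truth) if truth else 0.0
    return precision, recall


def pair(a, b):
    """Order-independent pair of record ids."""
    return frozenset((a, b))


found = {pair("A", "B"), pair("A", "C"), pair("D", "E")}
truth = {pair("A", "B"), pair("D", "E"), pair("F", "G"), pair("H", "I")}
precision, recall = precision_recall(found, truth)
```

Here the algorithm found three matches, two of which are correct, so precision is 2/3. It found two of the four true matches, so recall is 1/2.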

What’s Next?

To learn more about the different ways in which an EMPI can be used to help your particular specialty, click any of the links below.
