Enterprise Data Warehouse (EDW) Use Case


Big data initiatives are bringing patient and member data together across clinical, payer, and even CRM data sources.  The challenge is that people do not come with unique identifiers that are consistent across data sources, so demographic information is used to match them.  The problem with demographic information is that it changes and is often incomplete.  As a result, important data doesn’t get matched to a patient record, which leaves that data invisible.

Many organizations develop their own custom matching logic for their data warehouses, but these custom solutions come with many challenges, the biggest of which is accuracy.  They miss 70 matches per thousand and link data to the wrong person at a rate of 6 or more records per ten thousand.  They also usually lack any workflow for handling ‘Maybe’ matches (matches that can’t be automated).

Getting person matching and mastering right is key to assembling complete data for every patient or member.  Occam can identify common records across data sources, even when demographic information is missing or wrong, and link them together by assigning an enterprise identifier.  That previously invisible data is now usable, which makes it possible to build a complete picture.  In addition, Occam’s Master Data Management (MDM) can resolve conflicts in the data and provide your data warehouse with a Golden Record containing the best and most current demographic information for the patient across all data sources.

Inputs and Outputs

Occam takes in and maintains a copy of the list of patients or members contained in each source system.  Only data that is useful for identification needs to be loaded.  In healthcare, the core data elements are name, date of birth, home address, and phone numbers.  Social security numbers are helpful when available, as are email addresses, payer identifiers, and driver’s license numbers.
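As an illustration of the data elements listed above, here is a minimal Python sketch of what one identifying record might look like.  The class name, field names, and the helper method are all hypothetical; they are not Occam’s actual schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class PersonRecord:
    """Hypothetical shape of one identifying record from a source system."""
    source_system: str                      # e.g. "EHR", "CLAIMS", "CRM"
    source_id: str                          # the record's ID within that source
    # Core data elements
    first_name: str = ""
    last_name: str = ""
    date_of_birth: Optional[date] = None
    home_address: Optional[str] = None
    phone_numbers: List[str] = field(default_factory=list)
    # Helpful when available
    ssn: Optional[str] = None
    email: Optional[str] = None
    payer_id: Optional[str] = None
    drivers_license: Optional[str] = None

    def missing_core_fields(self) -> List[str]:
        """Return the names of core elements that are empty on this record."""
        core = ["first_name", "last_name", "date_of_birth",
                "home_address", "phone_numbers"]
        return [name for name in core if not getattr(self, name)]
```

A record with only a name and date of birth would report `home_address` and `phone_numbers` as missing, which is the kind of incomplete demographic data that matching has to tolerate.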

Any EMPI links that have already been established can also be imported.  This saves the data integrity team from having to re-work ‘Maybe’ matches that were previously verified, and it saves Occam’s Research Automation module from making unnecessary calls to LexisNexis.

As person records are saved to or updated in Occam, it continuously looks for matches and assigns an enterprise ID to each record.  If two records match, they are given the same Occam ID, thereby linking them together.  The Occam ID is one of the outputs provided.  One important thing to note is that the Occam ID assigned to a record can change.  This can happen for several reasons:  two records are merged, the link between two records is broken, or a work queue task is resolved.
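To see why an enterprise ID can change when records are linked, consider the following sketch.  It uses a union-find structure, which is a swapped-in technique chosen for illustration and not a description of how Occam works internally.  The class name, the record keys, and the `E-` ID format are hypothetical, and the sketch covers linking only, not breaking a link.

```python
class EnterpriseIdIndex:
    """Illustrative union-find index: linked records share one enterprise ID."""

    def __init__(self):
        self._parent = {}

    def add(self, record_key):
        # A new record starts as its own group.
        self._parent.setdefault(record_key, record_key)

    def _find(self, key):
        # Walk up to the group's root, shortening the path as we go.
        while self._parent[key] != key:
            self._parent[key] = self._parent[self._parent[key]]
            key = self._parent[key]
        return key

    def link(self, a, b):
        # Put two matching records in the same group.
        self.add(a)
        self.add(b)
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            # Keep the alphabetically smaller root so results are deterministic.
            keep, drop = sorted((root_a, root_b))
            self._parent[drop] = keep

    def enterprise_id(self, record_key):
        # The ID is derived from the group's root, so it changes on a merge.
        return "E-" + self._find(record_key)
```

In this sketch, linking `"EHR:1001"` to `"CLAIMS:77"` gives both records the same ID, and the ID previously reported for one of them is replaced.  Any downstream system that stores enterprise IDs therefore needs a way to pick up such changes.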

Another output that can be generated is a Golden Record.  The Golden Record is an amalgamation of the best and most current demographic information for the patient across all data sources and is part of Occam’s MDM functionality.  Gatekeeper is another component of Occam’s MDM functionality.  It tracks registration changes on a field-by-field basis for each data source.  When it is time to push a change from one system to another, missing and outdated information is updated in the target system while current information is protected.
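The idea of a Golden Record built from field-level change tracking can be sketched as follows.  This assumes a simple "most recently updated non-empty value wins" rule for each field; Occam’s actual conflict-resolution rules are not described here, and the function name and data layout are hypothetical.

```python
from datetime import date
from typing import Dict, Optional, Tuple

# field name -> (value, date the field was last updated in that source)
FieldValue = Tuple[Optional[str], date]
SourceRecord = Dict[str, FieldValue]


def build_golden_record(records: Dict[str, SourceRecord]) -> Dict[str, str]:
    """Pick, for each field, the newest non-empty value across all sources."""
    golden: Dict[str, str] = {}
    newest: Dict[str, date] = {}
    for fields in records.values():
        for name, (value, updated) in fields.items():
            if not value:
                continue  # missing values never overwrite real ones
            if name not in newest or updated > newest[name]:
                golden[name] = value
                newest[name] = updated
    return golden
```

With one source holding a newer phone number and another holding a newer address, the resulting record takes the phone number from the first and the address from the second, which is the field-by-field behavior described above.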

Finally, the last output that Occam produces is work queue tasks for matches that couldn’t be automated.  These are referred to as ‘Maybe’ matches, and they need a human to do additional research to decide whether the records in the match belong to the same person.
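One common way to separate automated matches from ‘Maybe’ matches is to compare a match score against two thresholds.  The sketch below shows that general pattern only.  The threshold values, the score scale, and the function names are assumptions made for illustration and do not represent Occam’s matching logic.

```python
from collections import deque

# Hypothetical thresholds on a 0.0 to 1.0 match score.
AUTO_LINK = 0.90     # at or above: link automatically
AUTO_REJECT = 0.50   # below: treat as different people


def classify(score: float) -> str:
    """Return 'match', 'no_match', or 'maybe' for a candidate pair's score."""
    if score >= AUTO_LINK:
        return "match"
    if score < AUTO_REJECT:
        return "no_match"
    return "maybe"


def route_pairs(scored_pairs):
    """Split scored candidate pairs into automatic links and a work queue."""
    links, work_queue = [], deque()
    for pair, score in scored_pairs:
        decision = classify(score)
        if decision == "match":
            links.append(pair)
        elif decision == "maybe":
            work_queue.append(pair)  # needs human research
    return links, work_queue
```

Pairs that score between the two thresholds land in the work queue instead of being linked or discarded, so a person can make the final call.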


There are two methods for integrating Occam into your data flow, and the choice depends on when the matching needs to occur.  These methods are referred to as Post-Integration and Pre-Integration.

In Post-Integration, the matching and enterprise ID assignment occurs AFTER the data has landed in your transactional systems.  The data can be loaded into Occam via flat file, API, or HL7.  Loading can be done in either real time or batch, but it is generally done in batch.
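For a batch flat-file load, the extract only needs to carry identifying fields.  The following sketch writes such an extract as CSV using Python’s standard library.  The column names and file layout are assumptions for illustration; the actual layout Occam expects is not specified here.

```python
import csv

# Hypothetical column layout: identifying fields only.
COLUMNS = ["source_system", "source_id", "first_name", "last_name",
           "date_of_birth", "home_address", "phone"]


def write_batch_extract(rows, out):
    """Write identifying fields from each row to `out` as CSV; return row count."""
    writer = csv.writer(out)
    writer.writerow(COLUMNS)
    count = 0
    for row in rows:
        # Fields outside COLUMNS (e.g. clinical data) are dropped;
        # missing identifying fields are written as empty strings.
        writer.writerow([row.get(col) or "" for col in COLUMNS])
        count += 1
    return count
```

Passing an open file, or an `io.StringIO` buffer for testing, as `out` produces a header line followed by one line per person record.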

In Pre-Integration, the matching and enterprise ID assignment occurs BEFORE the data has landed in your transactional systems.  Here the data is generally loaded in real time via API or HL7.

Post-Integration is the method typically used with a data warehouse, but a combination of both may be appropriate depending on your needs.

