ListMatchGenie

Choosing a profile

A decision tree for picking the right match profile for your data. When to use Person, Company, Identifier, Contact dedupe, Healthcare provider, or Product — and when to build custom.

Picking the right match profile is the single most impactful decision in setting up a match. The wrong profile can produce a 30% match rate where a 75% rate was available. The right profile often needs zero other tuning.

This page is a decision tree. Start at the top and follow the first "yes".

The decision tree

Do both files share a unique identifier (email, account number, NPI, etc.)?

If yes → Identifier profile. Fastest, most accurate. Often 95%+ precision.

If no → keep reading.

Are you finding duplicates inside a single list?

If yes → Contact dedupe profile. Runs the file against itself.

If no → keep reading.

Are the records people (customers, contacts, leads, patients)?

If yes → Person profile. Optimized for name + address + email + phone.

If no → keep reading.

Are the records organizations (companies, suppliers, accounts)?

If yes → Company profile. Optimized for company name + domain + address.

If no → keep reading.

Are the records medical providers (doctors, clinics, pharmacies)?

If yes → Healthcare provider profile. Handles NPI, specialty, credentials, and provider-specific quirks.

If no → keep reading.

Are the records products or SKUs?

If yes → Product / SKU profile. Token-based name matching, numeric tolerance, SKU prioritization.

If no → Custom profile. Start from the closest built-in and adjust.
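The tree above can be written down as a short function, which makes the "first yes wins" ordering explicit. This is only an illustrative sketch: `MatchJob`, its fields, and `choose_profile` are hypothetical names invented for this example and are not part of ListMatchGenie.

```python
from dataclasses import dataclass


@dataclass
class MatchJob:
    """Hypothetical answers to the questions in the decision tree."""
    shared_unique_id: bool = False  # both files share email, account number, NPI, ...
    single_list: bool = False       # finding duplicates inside one list
    record_kind: str = "other"      # "person", "company", "provider", "product", or "other"


def choose_profile(job: MatchJob) -> str:
    """Return the profile for the first question answered 'yes', in page order."""
    if job.shared_unique_id:
        return "Identifier"
    if job.single_list:
        return "Contact dedupe"
    by_kind = {
        "person": "Person",
        "company": "Company",
        "provider": "Healthcare provider",
        "product": "Product / SKU",
    }
    # Anything the built-ins do not cover falls through to a custom profile.
    return by_kind.get(job.record_kind, "Custom")
```

Because the identifier check comes first, a file of people that also shares a unique ID with the other file resolves to Identifier, not Person.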

Common edge cases

"My file has people, but they're all at the same company."

Use Person profile. The company name becomes just another field that contributes to matching; it doesn't need its own profile.

"My file has households, not individuals."

Start from Person profile and save it as a custom profile with address weight raised and first name weight lowered. Household matches care more about address than first name.
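To see why shifting weight from first name to address helps, here is a rough sketch of a weighted score. The weight values and function names are made up for illustration and are not ListMatchGenie's defaults; similarity uses Python's standard `difflib`, which may differ from whatever the product uses internally.

```python
from difflib import SequenceMatcher

# Hypothetical weights, for illustration only; not the product's real defaults.
PERSON_WEIGHTS = {"first_name": 0.3, "last_name": 0.3, "address": 0.4}
HOUSEHOLD_WEIGHTS = {"first_name": 0.05, "last_name": 0.35, "address": 0.6}


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def weighted_score(rec_a: dict, rec_b: dict, weights: dict) -> float:
    """Weighted average of per-field similarities."""
    total = sum(weights.values())
    return sum(w * similarity(rec_a[f], rec_b[f]) for f, w in weights.items()) / total


# Two members of the same household: same surname and address, different first names.
a = {"first_name": "Maria", "last_name": "Lopez", "address": "12 Oak St"}
b = {"first_name": "Daniel", "last_name": "Lopez", "address": "12 Oak St"}

person_score = weighted_score(a, b, PERSON_WEIGHTS)
household_score = weighted_score(a, b, HOUSEHOLD_WEIGHTS)
```

With these example weights the household score comes out higher than the person score, because the first-name mismatch is penalized less.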

"My file has a mix of people and companies."

Hard case. Two options:

  1. Split the file by type and run separate matches. Cleanest.
  2. Use Custom profile with moderate weights on every field; precision suffers but you get one match run.
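Option 1 can be done before upload with a small script. The sketch below assumes hypothetical column names (`first_name`, `last_name`, `company_name`) and a simple rule: a row with both person-name fields filled counts as a person, even if it also carries a company name. Adjust the rule to your own data.

```python
def split_by_type(rows):
    """Partition rows into (people, companies, unknown) by which name fields are filled."""
    people, companies, unknown = [], [], []
    for row in rows:
        first = (row.get("first_name") or "").strip()
        last = (row.get("last_name") or "").strip()
        company = (row.get("company_name") or "").strip()
        if first and last:
            people.append(row)      # company name, if any, is just another field
        elif company:
            companies.append(row)
        else:
            unknown.append(row)     # review these by hand before matching
    return people, companies, unknown
```

Each resulting list can then be saved as its own file and matched with Person or Company profile respectively.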

"My identifier column is not actually unique."

If the ID column has duplicates in either file, Identifier profile can no longer pair records one-to-one, because a single ID value may point at several rows. Options:

  1. Use Person or Company profile with the ID as a tie-breaker (low weight, but highly reliable).
  2. Dedupe the master first on the ID column.
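Before choosing between these options, it helps to check whether the ID column really is unique. The following is a minimal sketch using only the Python standard library; the function name and the choice to trim and lowercase IDs before counting are assumptions for this example.

```python
from collections import Counter


def duplicate_ids(rows, id_field):
    """Return {id: count} for every non-blank ID that appears more than once."""
    counts = Counter(
        # Assumption: IDs are compared after trimming whitespace and lowercasing.
        str(row.get(id_field) or "").strip().lower()
        for row in rows
    )
    counts.pop("", None)  # blank IDs are a separate problem; ignore them here
    return {value: n for value, n in counts.items() if n > 1}
```

An empty result for both files suggests the column is safe to treat as an identifier; any entries in the result are the IDs to resolve first.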

"Both records have email, but also a lot of name data."

Use Person profile. Email gets high weight by default in Person; you don't need Identifier. Person profile benefits from using all the signal available.

"I'm matching on legal company names (due diligence, compliance)."

Use Company profile but tune suffix stripping. Legal name matching usually wants stricter suffix handling: turn off "normalize Inc./Incorporated" if the suffix is legally meaningful.
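The effect of suffix handling can be illustrated with a small comparison-key function. This is a sketch under stated assumptions: the suffix table, the `suffix_mode` parameter, and its three values are hypothetical and do not describe how ListMatchGenie implements the setting.

```python
import re

# Hypothetical suffix table for illustration; a real list would be longer.
SUFFIX_CANONICAL = {
    "inc": "inc", "incorporated": "inc",
    "llc": "llc",
    "ltd": "ltd", "limited": "ltd",
    "corp": "corp", "corporation": "corp",
}


def company_key(name: str, suffix_mode: str = "strip") -> str:
    """Build a comparison key. suffix_mode is 'strip', 'normalize', or 'keep'."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    if tokens and tokens[-1] in SUFFIX_CANONICAL:
        if suffix_mode == "strip":
            tokens = tokens[:-1]                           # "Acme Inc." -> "acme"
        elif suffix_mode == "normalize":
            tokens[-1] = SUFFIX_CANONICAL[tokens[-1]]      # "Incorporated" -> "inc"
        # "keep" leaves the suffix exactly as written
    return " ".join(tokens)
```

With "strip", "Acme Inc." and "Acme LLC" collapse to the same key, which is usually fine for sales data but wrong when the two are distinct legal entities. With "keep", even "Acme Inc." and "Acme Incorporated" stay distinct, which is the strictest behavior.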

The rule of thumb

If you're uncertain between two profiles, the Genie's recommendation on the Configure step is usually right. It's based on what the column profiles of both files contain — it can tell whether you have email, NPI, company name, etc.

Override the recommendation only if you have context the Genie doesn't:

  • You know the data is healthcare providers but the NPI column is named something non-obvious
  • You know the "First Name" column is actually full-name (First Last) in disguise
  • You know two similar-looking IDs are actually from different system generations

When to build custom

Build a custom profile when:

  • Your data has a unique structure not covered by the built-ins (e.g. academic researchers matched by publication IDs)
  • You need per-field threshold overrides (e.g. email must be exact, name can be fuzzy)
  • You run this match on a recurring basis and the defaults consistently leave room for improvement
  • You're integrating with a specific downstream system that has strict requirements
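As an illustration of per-field threshold overrides ("email must be exact, name can be fuzzy"), here is a minimal sketch. The threshold values and function names are hypothetical, and similarity is computed with Python's standard `difflib`, which is an assumption and not necessarily what the product uses.

```python
from difflib import SequenceMatcher

# Hypothetical per-field rules: 1.0 means an exact match is required.
FIELD_THRESHOLDS = {"email": 1.0, "name": 0.85}


def field_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def passes_all_fields(rec_a: dict, rec_b: dict, thresholds: dict = FIELD_THRESHOLDS) -> bool:
    """True only if every field meets its own threshold."""
    return all(
        field_similarity(rec_a[field], rec_b[field]) >= minimum
        for field, minimum in thresholds.items()
    )


a = {"email": "jon@example.com", "name": "Jonathan Smith"}
b = {"email": "jon@example.com", "name": "Jonathon Smith"}  # name typo, same email
c = {"email": "jon@example.org", "name": "Jonathan Smith"}  # same name, different email
```

Here `a` and `b` pass because the name is close enough and the email is identical, while `a` and `c` fail on the exact-email rule even though the names match perfectly.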

Custom profiles are a deep knob. Start with one of the built-ins and tune it; don't build from scratch unless you have a reason.