Glossary

Every term ListMatchGenie uses, defined in one place. Entries cross-link to the full explanation where one exists.

A

Admin — A team role with permission to invite, remove, and change roles of team members, manage shared resources, and delete any content. Cannot access billing. See Managing your team.

Audit columns — Optional per-field score columns in exports (e.g. _lmg_score_first_name). Enable via the "Include audit columns" toggle on the Export step. See _lmg_ columns.

Auto-match — Shorthand for a row classified as match — the top candidate scored at or above the match threshold and was accepted without manual review.

B

Blocking — A matching-engine technique that groups candidate rows by a cheap discriminating key (ZIP, phonetic code) to avoid comparing every source row against every master row. See How matching works.
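
The idea behind blocking can be sketched in a few lines of Python. This is an illustration only, not the engine's implementation: the function name candidate_pairs, the row layout, and the choice of ZIP as the key are all assumptions made for the example.

```python
from collections import defaultdict

def candidate_pairs(source_rows, master_rows, key="zip"):
    """Yield (source, master) pairs that share the same blocking key."""
    # Index master rows by the cheap key once (hypothetical "zip" column).
    blocks = defaultdict(list)
    for m in master_rows:
        blocks[m[key]].append(m)
    # Each source row is compared only against rows in its own block.
    for s in source_rows:
        for m in blocks.get(s[key], []):
            yield s, m

source = [{"name": "Ann Lee", "zip": "10001"}, {"name": "Bo Chan", "zip": "94105"}]
master = [
    {"name": "Anne Lee", "zip": "10001"},
    {"name": "Raj Patel", "zip": "10001"},
    {"name": "Bo Chen", "zip": "94105"},
]
pairs = list(candidate_pairs(source, master))
print(len(pairs))  # 3 pairs, versus 6 for a full cross-comparison
```

With two source rows and three master rows, a full comparison would score six pairs; grouping by ZIP leaves three.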

C

Candidate — A potential match between a source row and a master row. Every candidate gets a composite score; the best candidate per source row drives classification.

Classification — The label assigned to a source row based on its best candidate's score: match, review, or unmatched. See Confidence scores.

Cleanse — Stage 1 of the three-stage pipeline. Profiles, standardizes, and deduplicates data before matching. See Three-stage pipeline.

Cleansing report — The structured record of what was detected and fixed during the cleanse stage. See Cleansing report.

Cluster — In contact-dedupe mode, a group of rows the engine believes are duplicates of each other. Identified by _lmg_cluster_id in exports.
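
One common way to turn pairwise duplicate decisions into clusters is to treat them as links and group connected rows. The sketch below uses a small union-find for that. How ListMatchGenie actually assigns _lmg_cluster_id is not described here, so the function cluster_ids and its numbering scheme are assumptions.

```python
def cluster_ids(n_rows, duplicate_pairs):
    """Return one cluster id per row; linked rows share an id."""
    parent = list(range(n_rows))

    def find(i):
        # Walk up to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in duplicate_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller row index as the cluster's representative.
            parent[max(ra, rb)] = min(ra, rb)
    return [find(i) for i in range(n_rows)]

# Rows 0, 2 and 4 are linked through two pairwise duplicate decisions.
ids = cluster_ids(5, [(0, 2), (2, 4)])
print(ids)  # [0, 1, 0, 3, 0]
```

Rows 0 and 4 were never compared directly, but they land in the same cluster because both are linked to row 2.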

Column profile — Per-column metadata: detected type, null rate, distinct values, sample values. Generated by the Genie at upload time.

Composite score — The 0–100 number representing the engine's confidence that two records match. Computed from per-field scores weighted by the profile. See Confidence scores.
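
As a rough illustration of "per-field scores weighted by the profile", the sketch below uses a plain weighted average. The exact formula the engine uses is not stated here, so the weighted average, the function name composite_score, and the example weights are all assumptions.

```python
def composite_score(field_scores, weights):
    """Weighted average of per-field scores, each on a 0-100 scale."""
    total_weight = sum(weights[f] for f in field_scores)
    if total_weight == 0:
        return 0.0
    weighted = sum(field_scores[f] * weights[f] for f in field_scores)
    return weighted / total_weight

# Hypothetical profile weights: last name counts most, ZIP least.
weights = {"first_name": 2, "last_name": 3, "zip": 1}
scores = {"first_name": 80, "last_name": 100, "zip": 50}
print(composite_score(scores, weights))  # 85.0
```

Here (80×2 + 100×3 + 50×1) / 6 = 85, so a weak ZIP score is outweighed by strong name agreement.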

Confidence threshold — The score cut-off that separates match from review. Default 70. See Setting the confidence threshold.

Contact dedupe — A match profile that runs a file against itself to find near-duplicates. See Match profiles.

Combined column directive — The explicit mapping option "first + last (combined)" shown on the Configure step when your source has a single full-name column and your master stores first and last in separate columns. Sub-pickers let you name the exact columns to concatenate; the engine builds the combined name on the master side at match time, with no name-search heuristics. A middle name is included automatically when source names average more than 2.4 tokens. See Field mapping.
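
A minimal sketch of the two behaviours described above: concatenating named master columns, and the 2.4-token rule for including a middle name. The column names (first_name, middle_name, last_name) and both function names are hypothetical, and whitespace splitting is assumed as the tokenizer.

```python
def include_middle(source_names, threshold=2.4):
    """True when source full names average more than `threshold` tokens."""
    average = sum(len(n.split()) for n in source_names) / len(source_names)
    return average > threshold

def combined_name(master_row, columns):
    """Join the chosen master columns into one full-name string."""
    parts = [str(master_row.get(c) or "").strip() for c in columns]
    return " ".join(p for p in parts if p)

source_names = ["Ann Marie Lee", "Bo James Chen", "Raj Patel"]  # averages about 2.67 tokens
columns = ["first_name", "last_name"]
if include_middle(source_names):
    columns = ["first_name", "middle_name", "last_name"]

master_row = {"first_name": "Ann", "middle_name": "Marie", "last_name": "Lee"}
print(combined_name(master_row, columns))  # Ann Marie Lee
```

Empty or missing parts are skipped, so a master row without a middle name still produces a clean "first last" string.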

Customer 360 — A unified view of a single customer assembled from every system that touches them: CRM, support tickets, billing, web behavior, marketing campaigns, loyalty program. Entity resolution is the prerequisite — you can't have a 360 view if you don't know which records describe the same customer.

Customer Data Platform (CDP) — A category of marketing software that aggregates customer data from multiple channels and resolves it into a single profile per customer. CDPs use entity resolution under the hood; ListMatchGenie is a focused tool for the resolution step itself, not a full CDP.

D

Data residency — The guarantee that your data is stored and processed in a specific geographic region (US, EU, or UK) and doesn't cross borders. See Data residency and regions.

Dedup report — The report generated alongside cleansing, listing exact, near-exact, and fuzzy duplicate rows detected within each file. See Dedup report.

Deterministic matching — A matching strategy that requires exact agreement on a specified column or combination (email, NPI, account number) before declaring a match. Fast and transparent but brittle to typos, format drift, or missing values. In ListMatchGenie's match engine, the first stage (exact identifier match) is deterministic and the fuzzy stage that follows is probabilistic. Contrast with probabilistic matching.

Duplicate setup — An action on the match history row that re-opens the wizard with a past job's files, column mapping, profile type, match mode, confidence, and ZIP radius all pre-filled. Lets you iterate on a match (swap best_match → all_candidates, change a column role, enable radius) without re-uploading or re-mapping from scratch.

DSAR — Data Subject Access Request. A GDPR-mandated right for individuals to request all data you hold on them. ListMatchGenie provides tooling to fulfill DSARs across your uploaded files. See GDPR.

E

Entity resolution — The discipline of figuring out when two records describe the same real-world entity (person, business, product) across different systems, files, or formats. The umbrella term that covers deduplication, identity resolution, and record linkage as special cases. See Entity resolution.

Exact identifier match — The first stage of the match engine: rows that agree on a shared identifier column (email, NPI, account number) are matched directly with a score of 100.
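
The sketch below shows one way such a stage could work: index the master by the identifier, then look each source row up directly. The function name, the email column, and the trim-and-lowercase normalization are assumptions for the example; only the fixed score of 100 comes from the definition above.

```python
def exact_identifier_matches(source_rows, master_rows, id_column="email"):
    """Pair rows that agree exactly on the identifier; score is always 100."""
    index = {}
    for m in master_rows:
        value = (m.get(id_column) or "").strip().lower()
        if value:
            index.setdefault(value, m)  # keep the first master row per identifier
    matches = []
    for s in source_rows:
        value = (s.get(id_column) or "").strip().lower()
        if value in index:
            matches.append((s, index[value], 100))
    return matches

master = [{"id": 1, "email": "ann.lee@example.com"}, {"id": 2, "email": ""}]
source = [{"email": " Ann.Lee@Example.com "}, {"email": ""}, {"email": "bo@example.com"}]
matches = exact_identifier_matches(source, master)
print(len(matches))  # 1
```

Blank identifiers are never indexed, so two rows that both lack an email are not treated as agreeing.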

Export — Stage 6 of the match wizard and the downloadable output of a match. Available in CSV, XLSX, PDF, and PPTX. See Exports explained.

F

Field mapping — Explicit configuration that tells the engine which source column corresponds to which master column. Usually auto-detected; manual mapping needed when column names diverge. See Field mapping.

Fuzzy match — A match method where candidate pairs are scored by similarity (allowing for spelling, formatting, and near-miss variation) rather than requiring exact agreement. Per-field scores combine into a composite.
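
To make "scored by similarity" concrete, the sketch below scores one field with Python's standard-library difflib. The similarity measure the engine actually uses is not specified here; SequenceMatcher and the function name field_similarity are stand-ins chosen for the example.

```python
from difflib import SequenceMatcher

def field_similarity(a, b):
    """Similarity of two field values on a 0-100 scale (illustrative only)."""
    a, b = a.strip().lower(), b.strip().lower()
    return round(100 * SequenceMatcher(None, a, b).ratio())

print(field_similarity("Acme Corp", "acme corp "))   # identical after cleanup
print(field_similarity("Jon Smith", "John Smith"))   # near-miss spelling
print(field_similarity("Jon Smith", "xyz"))          # nothing in common
```

Per-field numbers like these are what a composite score would then combine.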

Fuzzy (Category) — A match role for classification text — specialty, department, genre, cuisine, discipline. Agreement boosts the match score; disagreement is a mild penalty rather than a veto because category labels drift across datasets ("Hematology" ↔ "Hematology-oncology", "Cardiology" ↔ "Cardiologist", multi-language variants). Pick it on any column that represents a bounded vocabulary.

G

GDPR — The EU's General Data Protection Regulation. ListMatchGenie acts as a data processor under GDPR. See GDPR.

The Genie — The AI persona that profiles, cleanses, narrates, and reports across every stage of the product. See The Genie.

Golden record — The "best" canonical version of a real-world entity, assembled from multiple input records via survivorship rules (e.g. "newest email wins, longest address wins"). Often the output of an entity resolution + master data management stack. ListMatchGenie produces the matched + scored evidence; you can build golden records from the export by applying your own survivorship rules.
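
The two example survivorship rules quoted above ("newest email wins, longest address wins") could be applied to exported matches along these lines. The record layout, the updated field, and the function name golden_record are hypothetical; your own rules and columns will differ.

```python
from datetime import date

def golden_record(records):
    """Merge matched records using two example survivorship rules."""
    newest = max(records, key=lambda r: r["updated"])        # newest email wins
    longest = max(records, key=lambda r: len(r["address"]))  # longest address wins
    return {"email": newest["email"], "address": longest["address"]}

matched = [
    {"email": "a.lee@old.example", "address": "5 Main St, Apt 2, Springfield",
     "updated": date(2021, 3, 1)},
    {"email": "ann.lee@new.example", "address": "5 Main St",
     "updated": date(2024, 6, 9)},
]
golden = golden_record(matched)
print(golden)
```

The result takes the email from the 2024 record and the address from the 2021 record, which is the point of per-field survivorship: no single input row has to win outright.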

I

Identifier — A column that uniquely identifies a record (email, account number, NPI, SSN). Used by the Identifier match profile for exact, fast matching.

Identity resolution — The marketing-tech version of entity resolution: stitching the same person across email, mobile ad ID, web cookie, CRM, loyalty program, and offline channels into a single profile. Same underlying math as record linkage and entity resolution, different vocabulary. See Entity resolution.

Insights (legacy) — The original free-form AI chat feature, superseded by Reports. See Insights (legacy).

L

_lmg_ columns — Metadata columns added to every export, prefixed with _lmg_. Include match status, score, master row ID, and pass. See _lmg_ columns.

M

Master Data Management (MDM) — Enterprise software that owns the system of record for resolved entities — usually with survivorship rules, governance workflows, and integrations into every downstream system. Six-figure licenses, multi-year implementations, and a dedicated data team are typical. Entity resolution is a capability MDM platforms provide; you don't need MDM to do entity resolution. ListMatchGenie covers the resolution step without the platform overhead.

Master file — Your canonical reference data (CRM export, registry, curated list). Source files are matched against the master. See Master vs source files.

Match — Either a verb (the process of comparing two records) or a noun (a row classified as match, i.e. confidently matched).

Match job — A single match run — one source file vs one master file (or source alone for dedupe). Every job produces results accessible from the Jobs page.

Match profile — A preset bundle of settings (fields, weights, toggles) optimized for a specific entity type. See Match profiles.

Match rate — Percentage of source rows classified as match. A dashboard metric, not a quality grade.

Match-impact badge — A plain-English translation of the cleanse-stage flag counts, shown next to each mapping on the Configure step: "1,800 rows won't match on this field" (blocking), "3,400 flagged — fuzzy still works" (advisory), "flags in this column — no impact (ignored)". Tells you at a glance what your mapping choices will cost before the match runs.

N

Near-duplicate — A row that shares all identity-column values with another row but differs on supplementary columns. Auto-merged by default during cleansing.

Nickname lookup — Matching first names via a canonical-form table (Bill → William, Liz → Elizabeth). See Handling nicknames and abbreviations.
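
A canonical-form table can be as simple as a dictionary. The sketch below contains only the two pairs named above; the table contents beyond those, the function names, and the lowercase normalization are assumptions.

```python
# Tiny illustrative table; a real one would hold many more entries.
NICKNAMES = {"bill": "william", "liz": "elizabeth"}

def canonical_first_name(name):
    """Map a first name to its canonical form, if the table knows it."""
    key = name.strip().lower()
    return NICKNAMES.get(key, key)

def first_names_agree(a, b):
    """True when both names reduce to the same canonical form."""
    return canonical_first_name(a) == canonical_first_name(b)

print(first_names_agree("Bill", "William"))  # True
print(first_names_agree("Bill", "Liz"))      # False
```

Names missing from the table fall through unchanged, so the lookup never makes an exact match worse.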

O

One-to-many — The default matching mode, where a master row can be the best match for multiple source rows.

One-to-one — An opt-in matching mode where each master row can only be claimed by one source row, enforced via a globally optimal assignment. See One-to-one vs one-to-many.
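
To see why a globally optimal assignment differs from picking the best candidate row by row, consider the brute-force sketch below. It tries every possible pairing, which is only practical for tiny inputs; the algorithm the engine uses is not stated here, and the function name and score matrix are assumptions.

```python
from itertools import permutations

def best_one_to_one(scores):
    """scores[i][j] is the score of source row i against master row j.

    Assumes at least one source row and at least as many master rows
    as source rows. Returns (master index per source row, total score).
    """
    n_source, n_master = len(scores), len(scores[0])
    best_total, best_assignment = -1, None
    for masters in permutations(range(n_master), n_source):
        total = sum(scores[i][j] for i, j in enumerate(masters))
        if total > best_total:
            best_total, best_assignment = total, masters
    return list(best_assignment), best_total

scores = [
    [90, 85],  # source 0 against master 0 and master 1
    [88, 40],  # source 1 against master 0 and master 1
]
assignment, total = best_one_to_one(scores)
print(assignment, total)  # [1, 0] 173
```

A greedy pass would give master 0 to source 0 (score 90) and leave source 1 with a weak 40. The optimal assignment gives master 0 to source 1 instead, for a higher total of 173.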

Owner — The account-holder role with full billing, delete, and admin capabilities. One per account.

P

Pass — A discrete stage in the match engine's pipeline. Stages run in a fixed order from identifier-based matches through fuzzy comparison to final classification.

PII — Personally Identifiable Information. Names, emails, phone numbers, addresses, etc. ListMatchGenie stores PII only in encrypted regional storage, never in its primary database. See PII and security.

Probabilistic matching — A matching strategy that scores per-field similarity (allowing for typos, format drift, abbreviation, transliteration) and combines those scores into a single confidence value. Captures the 70% of true matches that deterministic matching misses. ListMatchGenie's Stage 2 is probabilistic, built on the same mathematical foundation that academic record-linkage research has refined for decades. Contrast with deterministic matching.

Profile — See Match profile.

R

Record linkage — The academic / healthcare / government statistics term for entity resolution. Same problem, same math, different vocabulary. Decades of academic research — going back to foundational 1960s statistics papers — underpin how modern probabilistic matching engines, ListMatchGenie's included, score the evidence.

Region — See Data residency.

Report — A structured analytical document generated from a completed match job. Includes executive summary, pivots, charts, key findings. See Reports.

Review — A classification for source rows whose top candidate scored between the review and match thresholds. The engine is uncertain; you decide.

Review queue — The list of review cases awaiting your decision. Side-by-side comparison UI with approve/reject actions.

S

Score — See Composite score.

Shareable link — A tokenized URL to a report that grants read-only access. Can be password-protected. See Sharing reports.

SKU — Stock Keeping Unit — a product identifier. The Product profile treats SKU as near-definitive for matching.

Source file — The list you look up against the master. Usually new or incoming data. See Master vs source files.

SSE — Server-Sent Events. The one-way-streaming technology used to push live match-progress updates from the server to the browser.

T

Threshold — The score cut-off for classification. Two thresholds exist: match (default 70) and review (default 55).
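
The two thresholds map a score to a classification as sketched below, using the defaults given above. The function name classify is hypothetical, and treating a score exactly at the review threshold as review is an assumption inferred from the definitions of Auto-match and Unmatched.

```python
def classify(score, match_threshold=70, review_threshold=55):
    """Map a composite score to match, review, or unmatched."""
    if score >= match_threshold:
        return "match"
    if score >= review_threshold:
        return "review"
    return "unmatched"

for score in (92, 70, 63, 55, 40):
    print(score, classify(score))
```

Raising the match threshold moves borderline rows from match into review; it does not change which rows end up unmatched.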

Transliteration — Converting a name written in a non-Latin script, or with diacritics, into plain Latin characters for matching (García → Garcia). Applied automatically based on the script detected in the data. See Handling international names.
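
The García → Garcia example can be reproduced with Python's standard library by decomposing characters and dropping the accent marks. This sketch covers diacritics only. Converting genuinely non-Latin scripts (Cyrillic, Greek, CJK) needs script-specific mapping tables that are not shown, and the function name fold_diacritics is an assumption.

```python
import unicodedata

def fold_diacritics(name):
    """Strip accent marks, leaving the base Latin letters."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(fold_diacritics("García"))  # Garcia
print(fold_diacritics("Müller"))  # Muller
```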

U

Unmatched — Classification for source rows whose best candidate scored below the review threshold, or that had no candidates at all. The engine treats these rows as not present in the master.

Concepts — the narrative explanations
FAQ — common questions
Troubleshooting — specific errors and fixes