Match two CSV files — fuzzy join without writing code.
You exported two CSVs from two different systems. Now you need to know which rows correspond. The join keys are the same data but the formatting drifted — names spelled differently, emails with aliases, phones formatted six ways. Drag both files in. The Genie does the fuzzy join and exports a single CSV with both sides' columns and a match status per row.

The problem
The two CSVs aren't perfectly aligned. Standard joins miss most matches.
Two CRM exports — same contacts, but the export from System A wrote 'Sarah Patel' and System B wrote 'sarah patel'. Standard CSV merge misses it.
Two ERP exports of the same vendors, one with 'LLC' suffixes and one without. SQL JOIN drops half the rows.
Email addresses with aliases — 'sarah.patel@globex.com' in one file, 'spatel@globex.com' in the other. Same person, different join key.
Doing this in Python (pandas merge + fuzzywuzzy) requires writing code, debugging the fuzzy threshold, handling Unicode edge cases, and re-doing it next quarter.
Doing it in Excel with the Fuzzy Lookup add-in works on Windows only and breaks past ~50K rows.
Doing it in SQL requires loading both files into a database, writing fuzzy-match SQL with custom UDFs (or using Postgres trigram extensions), and a DBA's afternoon.
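To make the gap concrete, here is a minimal sketch of an exact join next to a trigram-style fuzzy join, using the sample values above. It is loosely modeled on the trigram idea and is not the exact algorithm of any Postgres extension or of the Genie; the function names and the 0.5 threshold are illustrative assumptions.

```python
def trigrams(text):
    """Return the set of 3-character windows of a lowercased, padded string."""
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a, b):
    """Jaccard overlap of the two trigram sets, from 0.0 to 1.0."""
    if not a.strip() or not b.strip():
        return 0.0  # an empty value is no evidence of a match
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


system_a = ["Sarah Patel", "Globex LLC"]
system_b = ["sarah patel", "Globex"]

# Exact join: only identical keys pair up, so nothing matches here.
keys_b = set(system_b)
exact_matches = [a for a in system_a if a in keys_b]

# Fuzzy join: keep the best-scoring candidate per row if it clears the threshold.
THRESHOLD = 0.5  # illustrative value, not a recommendation
fuzzy_matches = []
for a in system_a:
    best = max(system_b, key=lambda b: trigram_similarity(a, b))
    if trigram_similarity(a, best) >= THRESHOLD:
        fuzzy_matches.append((a, best))
```

The exact join finds no pairs, while the fuzzy join pairs both rows. The threshold is the part that needs tuning and explaining: it is simply the share of three-character windows the two strings have in common.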
How the Genie solves it
Two CSVs in. One merged CSV out. Five-minute workflow.
Drag-and-drop CSV-to-CSV match
Source CSV + master CSV. The Genie auto-detects the schema on each, suggests which columns to join on, and lets you tune the match profile (which fields, how heavily weighted) before running.
Fuzzy matching on multiple fields together
Match on name + email + company combined. Each field contributes calibrated, weighted evidence to a single score, so agreement on three fields outweighs disagreement on one. Nicknames, aliases, format variants, and company suffix variations are all handled.
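One way to picture weighted multi-field matching is the sketch below. It is an assumption-laden illustration, not the Genie's scoring model: the weights, the field names, and the use of Python's difflib as the per-field similarity are all hypothetical choices.

```python
from difflib import SequenceMatcher

# Hypothetical weights; in practice these would be tuned per dataset.
WEIGHTS = {"name": 0.4, "email": 0.4, "company": 0.2}


def field_similarity(a, b):
    """Similarity of two field values in [0, 1], or None if either is missing."""
    a, b = a.strip().lower(), b.strip().lower()
    if not a or not b:
        return None  # a missing value is treated as no evidence either way
    return SequenceMatcher(None, a, b).ratio()


def composite_score(left, right, weights=WEIGHTS):
    """Weighted average of per-field similarities over the fields present on both sides."""
    total = 0.0
    weight_used = 0.0
    for field, weight in weights.items():
        sim = field_similarity(left.get(field, ""), right.get(field, ""))
        if sim is None:
            continue
        total += weight * sim
        weight_used += weight
    return total / weight_used if weight_used else 0.0


crm = {"name": "Sarah Patel", "email": "sarah.patel@globex.com", "company": "Globex Inc"}
event = {"name": "S. Patel", "email": "spatel@globex.com", "company": "Globex"}
other = {"name": "Mike Johnson", "email": "m.johnson@acme.org", "company": "ACME Foundation"}

same_person = composite_score(crm, event)
different_person = composite_score(crm, other)
```

Skipping missing fields and renormalizing the weights is one design choice among several; it keeps a blank phone or company from dragging down an otherwise strong match.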
Review queue with confidence + 'why' per cluster
Every match comes back with a confidence score and an explanation of what aligned and what didn't. Cluster by pattern, bulk-accept high-confidence groups, eyeball the borderline cases.
Output: one merged CSV with both sides' columns
Export the matched file with all columns from both inputs side-by-side, plus match status (matched / review / unmatched), confidence, and master ID per row. Open in Excel, load to your DB, push to BI — same way you'd consume any CSV.
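For readers who want to picture the export, the sketch below writes a merged file of that shape with Python's csv module. The cut-off values, the source_/master_ column prefixes, and the assumption that each master row carries an "id" field are hypothetical; the actual export layout may differ.

```python
import csv
import io

# Hypothetical cut-offs for the three statuses.
ACCEPT_AT = 0.90
REVIEW_AT = 0.70


def match_status(confidence):
    """Map a confidence score to matched / review / unmatched."""
    if confidence >= ACCEPT_AT:
        return "matched"
    if confidence >= REVIEW_AT:
        return "review"
    return "unmatched"


def write_merged_csv(pairs, source_fields, master_fields, out):
    """Write one row per source record.

    pairs: iterable of (source_row, master_row_or_None, confidence).
    """
    header = ([f"source_{f}" for f in source_fields]
              + [f"master_{f}" for f in master_fields]
              + ["match_status", "confidence", "master_id"])
    writer = csv.DictWriter(out, fieldnames=header, lineterminator="\n")
    writer.writeheader()
    for source, master, confidence in pairs:
        status = match_status(confidence)
        # Unmatched rows keep their source columns and leave the master side blank.
        linked = master if (master and status != "unmatched") else {}
        row = {f"source_{f}": source.get(f, "") for f in source_fields}
        row.update({f"master_{f}": linked.get(f, "") for f in master_fields})
        row["match_status"] = status
        row["confidence"] = f"{confidence:.2f}"
        row["master_id"] = linked.get("id", "")
        writer.writerow(row)


buffer = io.StringIO()
write_merged_csv(
    [
        ({"email": "sarah.patel@globex.com"}, {"id": "M1", "work_email": "spatel@globex.com"}, 0.94),
        ({"email": "m.johnson@acme.org"}, None, 0.0),
    ],
    source_fields=["email"],
    master_fields=["work_email"],
    out=buffer,
)
merged_lines = buffer.getvalue().splitlines()
```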
Save the match profile for re-runs
Configure once — match by email + name with company as a tiebreaker, fuzzy thresholds set — save as a profile. Next quarter's pair of files: drag, click, export.
No code, no Python, no DB
Browser workflow. No fuzzywuzzy / rapidfuzz / recordlinkage Python install, no Postgres trigram setup, no Apps Script. Useful when the analyst doesn't want to code or the engineer doesn't want to maintain a one-off match script.
Real example
Matching a CRM export against an event-attendance export
Same workflow whether your two CSVs come from CRMs, marketing tools, ERPs, or custom systems.
Source file
crm_contacts.csv · firstname, lastname, email, phone, company
Master file
event_attendees.csv · first, last, work_email, mobile, organization
Source: Sarah Patel · sarah.patel@globex.com · 415-555-0188 · Globex Inc
Master: S. Patel · spatel@globex.com · 4155550188 · Globex
Result: matched. Surname exact, first-name initial matches, email alias on shared domain, phone same digits in a different format, company variant. Composite confidence 0.94.
Source: Bob Tan · bob.tan@initech.co · 617-555-1234 · Initech
Master: Robert Tan · robert.tan@initech.co · (no phone) · Initech Inc
Result: matched. Nickname pair, email alias, company variant. Confidence 0.91 even with the phone missing in the master file.
Source: Mike Johnson · m.johnson@acme.org · 312-555-7700 · ACME Foundation
Master: (no match)
Result: unmatched. No similar contact in the event attendance file; a contact who had registered for the event would have surfaced.
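Several of the signals in this example (same phone digits in a different format, company variant, shared email domain) come down to normalizing values before comparing them. The sketch below shows one plausible way to do that; the helper names and the suffix list are illustrative assumptions, not the Genie's actual rules.

```python
import re

# Illustrative suffix list, not exhaustive.
COMPANY_SUFFIXES = {"inc", "llc", "ltd", "corp", "co"}


def normalize_phone(raw):
    """Keep digits only, so '415-555-0188' and '4155550188' compare equal."""
    return re.sub(r"\D", "", raw)


def normalize_company(raw):
    """Lowercase, drop punctuation, and strip trailing legal suffixes."""
    words = re.sub(r"[^\w\s]", " ", raw.lower()).split()
    while words and words[-1] in COMPANY_SUFFIXES:
        words.pop()
    return " ".join(words)


def email_domain(raw):
    """Return the lowercased domain part of an address, or '' if there is no '@'."""
    _local, at, domain = raw.strip().lower().rpartition("@")
    return domain if at else ""


same_phone = normalize_phone("415-555-0188") == normalize_phone("4155550188")
same_company = normalize_company("Globex Inc") == normalize_company("Globex")
same_domain = email_domain("sarah.patel@globex.com") == email_domain("spatel@globex.com")
```

Nickname pairs such as Bob and Robert cannot be recovered by normalization alone; they generally need a lookup table, which this sketch leaves out.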
Before and after
What changes when you use ListMatchGenie
Without ListMatchGenie
- Write a Python script with pandas + fuzzywuzzy (or rapidfuzz). Tune the threshold. Handle Unicode, empty strings, missing fields. Debug the false positives. Re-run when next quarter's data arrives.
- Or: load both CSVs into Postgres with the trigram extension; write a JOIN ... ON similarity(...) > 0.7 query; explain what 0.7 means to your stakeholder.
- Or: install Excel's Fuzzy Lookup add-in on a Windows machine; hope your CSV fits under 50K rows.
- In every case, no review queue, no confidence-per-cluster, no 'why this matched' explanation. You get matches and the burden of trusting them.
With ListMatchGenie
- Drag both CSVs in. The Genie profiles each, suggests join columns, and lets you tune the match profile.
- Run the match. Get a clustered review queue with confidence scores and per-cluster explanations.
- Bulk-accept high-confidence patterns; eyeball the borderline ones.
- Export the merged CSV with both sides' columns + match metadata. Total time: 5–15 minutes for a typical mid-size match.
Let the Genie handle the grunt work.
Free tier is real. No card. No forms. Just upload your first list and see the Genie clean and match it in under a minute.

