ListMatchGenie

PII and security

The data boundary between customer files and every other system — especially the AI provider. What's stored where, what's encrypted, and what's guaranteed never to leave your region.

ListMatchGenie handles lists of people, companies, and other entities — often with PII (personally identifiable information) like names, emails, phone numbers, and addresses. Protecting this data is the product's most important responsibility.

This page is a factual description of where every kind of data lives, what touches it, and what's guaranteed never to happen.

The PII-free database invariant

ListMatchGenie's primary database contains zero PII from the files you upload; the only personal data in it belongs to our own account holders. This is a structural guarantee, not a policy.

PostgreSQL stores:

  • User accounts (email, hashed password, profile name). This is PII, but it covers only our direct users, never your customers.
  • Subscription state and billing info (payments themselves are handled by Stripe; see below)
  • Match job metadata (timestamps, status, configuration, summary stats)
  • Report structure (stat counts, narrative text, chart configs)
  • Shared-link tokens and configuration

PostgreSQL never stores:

  • Any row from any file you upload
  • Any cell value from any file you upload
  • Any result of any match (matched pair, score, classification)
  • Any review-queue content

Where does your data actually live? Regional S3 buckets. See Data residency and regions for the full architecture.
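
One way to picture a structural guarantee of this kind is an allowlist applied before anything is written to the database. The sketch below is illustrative only: the field names, the `storage_key` pointer, and the `build_job_record` helper are hypothetical and are not taken from ListMatchGenie's codebase.

```python
# Hypothetical allowlist of fields a match-job metadata record may carry.
# Uploaded rows, cell values, and match results have no field to live in.
ALLOWED_JOB_FIELDS = frozenset(
    {
        "job_id",
        "status",
        "started_at",
        "finished_at",
        "config",
        "summary_stats",
        "storage_key",  # assumed pointer to the object in regional storage
    }
)


def build_job_record(candidate: dict) -> dict:
    """Return a copy of candidate, refusing any field outside the allowlist."""
    unexpected = set(candidate) - ALLOWED_JOB_FIELDS
    if unexpected:
        raise ValueError(f"fields not allowed in job metadata: {sorted(unexpected)}")
    return dict(candidate)
```

With a guard like this, an attempt to attach a `rows` field to a job record fails before it reaches the database, rather than relying on reviewers to notice it.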

The AI boundary

The Genie — the AI narrator — generates text from aggregate statistics. It does not receive raw rows, raw cells, or raw identifiers.

What the AI model sees:

  • Column names and types (email: string, phone: identifier, etc.)
  • Aggregate stats per column (null rate, distinct count, and the top-5 most common values; top values are included only when the column is low-cardinality categorical, such as state or industry)
  • Match outcome counts (N matched, N review, N unmatched)
  • Per-pass breakdowns (exact_id: 1200, fuzzy: 800, phonetic: 240)
  • Score distribution bins (how many rows scored 90–100, 80–90, etc.)
  • Pivot table values (counts and percentages per group)

What the AI model never sees:

  • Individual row contents
  • Individual cell values from identity columns (names, emails, phones, addresses)
  • Any specific customer's identity data

This is why the Genie's narrative can safely describe your matches — it's operating on statistics about your data, not your data itself.
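
As a rough illustration of what "statistics about your data" can mean in practice, the sketch below builds an aggregate-only column summary and a score histogram. It is a minimal example under stated assumptions, not the product's implementation: the function names, the `"category"` type label, and the cardinality threshold of 50 are all hypothetical.

```python
from collections import Counter

# Hypothetical threshold: a column counts as low-cardinality at or below this many distinct values.
LOW_CARDINALITY_MAX = 50
# Hypothetical set of column types treated as categorical.
CATEGORICAL_TYPES = {"category"}


def summarize_column(name, col_type, values):
    """Build an aggregate-only summary of one column; no per-row data is returned."""
    present = [v for v in values if v is not None and v != ""]
    distinct = len(set(present))
    summary = {
        "name": name,
        "type": col_type,
        "null_rate": 1 - len(present) / len(values) if values else 0.0,
        "distinct_count": distinct,
    }
    # Top values are included only for low-cardinality categorical columns,
    # so identity columns (names, emails, phones) never contribute cell values.
    if col_type in CATEGORICAL_TYPES and distinct <= LOW_CARDINALITY_MAX:
        summary["top_values"] = Counter(present).most_common(5)
    return summary


def score_bins(scores, width=10):
    """Count scores per bin of `width` points, for example 80-90 and 90-100."""
    bins = Counter()
    for s in scores:
        # A score of exactly 100 is folded into the top bin.
        lower = min(int(s // width) * width, 100 - width)
        bins[f"{lower}-{lower + width}"] += 1
    return dict(bins)
```

Calling `summarize_column("email", "string", emails)` returns only the null rate and distinct count, while a `state` column typed as `"category"` would also carry its five most common values.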

Encryption

  • In transit: TLS 1.2+ for every connection — customer traffic, internal service-to-service, and outbound calls to processors.
  • At rest: AES-256 encryption on object storage and database volumes.
  • Backups: encrypted with the same keys as live storage.
  • Secrets management: credentials stored in a managed secrets service, rotated on a regular schedule.

Authentication and access

  • Customer sign-in: email+password (passwords bcrypt-hashed with cost factor 12) or Google OAuth.
  • Session tokens: issued by Auth.js v5, stored in an HTTP-only secure cookie.
  • No long-lived API tokens exist in the product today (planned for a future release, with key scoping).
  • Admin access to your data: none by default. Admin accounts exist for billing operations but cannot access your uploaded files or match contents without an explicit data access grant made via a support ticket.

Third-party subprocessors

Every system that touches your data is listed in the Data Processing Agreement. The short list:

  • Cloud infrastructure provider — regional storage, compute for matching workers, managed database, and AI model hosting.
  • Vercel — static hosting and edge functions for the marketing site and app UI. No customer data touches Vercel.
  • Stripe — payments. Card details never reach our systems; Stripe is our SOC-2 Type II payments processor.

No analytics provider has access to customer-data pages. Analytics is limited to marketing pages.

Audit logging

Every significant action is logged:

  • Authentication events (sign-in, sign-out, password change, failed attempts)
  • Match runs (started, completed, failed)
  • Export and share-link creation
  • Team member invite, role change, removal
  • Subscription changes

Logs are retained per tier:

  • Free, Starter: 30 days
  • Pro: 90 days
  • Business: 1 year
  • Enterprise: configurable

Logs are accessible via the admin audit view on the Business and Enterprise tiers; lower tiers can request log extracts via support.
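
The retention tiers above can be read as a simple lookup. The following sketch assumes that "1 year" means 365 days and that an Enterprise window is supplied explicitly; the `is_retained` helper and its parameters are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Retention windows from the tier list; "1 year" is assumed to be 365 days.
# Enterprise is configurable, so it has no fixed entry here.
RETENTION_DAYS = {"free": 30, "starter": 30, "pro": 90, "business": 365}


def is_retained(tier, logged_at, now, enterprise_days=None):
    """Return True if a log entry written at logged_at is still inside the tier's window."""
    days = enterprise_days if tier == "enterprise" else RETENTION_DAYS[tier]
    if days is None:
        raise ValueError("enterprise retention must be configured explicitly")
    return now - logged_at <= timedelta(days=days)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(is_retained("pro", now - timedelta(days=45), now))
```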

What breaks the boundary

Only two things move data outside the boundary, and both require explicit customer action:

  1. Shared links — you voluntarily share a tokenized URL to a report. Anyone with the URL can see what you shared, subject to the share configuration (password, expiration, export permission). See Sharing reports.
  2. Exports — you download your data to your own device. What happens to it from there is outside our control.

Both are deliberate user actions, and neither weakens the PII-free database invariant or the AI boundary.
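
To make the share configuration concrete, here is a minimal sketch of how a tokenized link with an optional password, expiration, and export permission might be checked. Everything in it is an assumption for illustration: the `ShareConfig` fields, the helper names, and the use of PBKDF2 from the standard library (chosen only to keep the example dependency-free) are not drawn from the product.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ShareConfig:
    """Hypothetical share-link settings: password, expiration, export permission."""
    token: str
    password_hash: Optional[bytes] = None  # None means no password is required
    salt: bytes = b""
    expires_at: Optional[datetime] = None  # None means the link does not expire
    allow_export: bool = False


def hash_password(password: str, salt: bytes) -> bytes:
    # PBKDF2-HMAC-SHA256 with an illustrative iteration count.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def new_share(password=None, expires_at=None, allow_export=False) -> ShareConfig:
    salt = secrets.token_bytes(16)
    pw_hash = hash_password(password, salt) if password else None
    return ShareConfig(secrets.token_urlsafe(32), pw_hash, salt, expires_at, allow_export)


def can_view(share, presented_token, now, password=None) -> bool:
    # Constant-time comparison avoids leaking how much of the token matched.
    if not hmac.compare_digest(share.token.encode(), presented_token.encode()):
        return False
    if share.expires_at is not None and now >= share.expires_at:
        return False
    if share.password_hash is not None:
        if password is None:
            return False
        return hmac.compare_digest(share.password_hash, hash_password(password, share.salt))
    return True


def can_export(share, presented_token, now, password=None) -> bool:
    # Export requires everything viewing requires, plus the export permission.
    return share.allow_export and can_view(share, presented_token, now, password)
```

The checks run in order from cheapest to most expensive, so an invalid or expired link is rejected before any password hashing happens.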

Incident response

If a security incident affects customer data:

  • We notify affected customers within 72 hours (the same window GDPR Article 33 sets for notifying supervisory authorities)
  • We provide an incident report detailing what happened, what was affected, and remediation
  • We cooperate with any required regulatory disclosures

See security.txt for vulnerability reporting.