Find Fuzzy Duplicates - Similar Row Detection Tool

Find near-duplicate rows using similarity matching. Catches typos and variations. Set your own threshold. Free and 100% private.

Fuzzy Duplicate Finder

Find near-duplicate rows using similarity matching. Perfect for names with typos.

How to Use the Fuzzy Duplicate Finder

  1. Upload your CSV file
  2. Select the column to compare for duplicates
  3. Set the similarity threshold (default 80%)
  4. Download results with duplicate groups marked

Why Use This Tool?

  • Catches Typos - "Jon Smith" matches "John Smith"
  • Adjustable Threshold - Control how strict matching is
  • Group Detection - See which rows are similar
  • Similarity Score - Shows how close each match is
  • Privacy First - All processing in your browser

Frequently Asked Questions

How does fuzzy matching work?

The tool uses Levenshtein distance to calculate how similar two strings are. A threshold of 80% means strings that are at least 80% identical will be grouped as duplicates.
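As a rough illustration of the idea, the sketch below computes Levenshtein distance and turns it into a similarity score. The normalization used here (1 minus distance divided by the longer string's length) is an assumption made for the example; the tool does not document its exact formula, and the function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalization: 1.0 means identical, 0.0 means nothing in common."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest


# "Jon Smith" and "John Smith" differ by one inserted character.
score = similarity("Jon Smith", "John Smith")
print(f"{score:.0%}")
```

Under this assumed formula, the pair is one edit apart across ten characters, so it clears an 80% threshold.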

What threshold should I use?

80% is recommended for names with typos. Use 90%+ for stricter matching. Use 60-70% if you expect significant variations.

What columns are added to the output?

Two columns are added: _DUPLICATE_GROUP (a group number shared by rows that match each other) and _SIMILARITY (a score from 0 to 1). Only rows with at least one match are included in the output.
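To make the output format concrete, here is a minimal sketch of how such columns could be produced. Only the column names _DUPLICATE_GROUP and _SIMILARITY come from the description above. The grouping strategy (linking any two rows that meet the threshold), the choice to report each row's best match score, and the function name find_fuzzy_duplicates are assumptions for illustration, not the tool's documented behavior.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalization of edit distance to a 0-1 score."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def find_fuzzy_duplicates(rows, column, threshold=0.8):
    """Return only the rows that have a match, with the two extra columns."""
    n = len(rows)
    parent = list(range(n))  # union-find forest: one set per group
    best = [None] * n        # best score seen per row (None = no match)

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Compare every pair of rows; merge the groups of pairs that match.
    for i in range(n):
        for j in range(i + 1, n):
            score = similarity(rows[i][column], rows[j][column])
            if score >= threshold:
                parent[find(j)] = find(i)
                best[i] = score if best[i] is None else max(best[i], score)
                best[j] = score if best[j] is None else max(best[j], score)

    group_numbers = {}  # union-find root -> 1-based group number
    output = []
    for i, row in enumerate(rows):
        if best[i] is None:
            continue  # unmatched rows are left out
        group = group_numbers.setdefault(find(i), len(group_numbers) + 1)
        output.append({**row, "_DUPLICATE_GROUP": group, "_SIMILARITY": best[i]})
    return output


rows = [
    {"name": "John Smith"},
    {"name": "Jane Doe"},
    {"name": "Jon Smith"},
    {"name": "Alice Wong"},
]
for row in find_fuzzy_duplicates(rows, "name", threshold=0.8):
    print(row)
```

With this sample data, only the two "Smith" rows appear in the result, sharing one group number. Comparing every pair is quadratic in the number of rows, which is acceptable for a sketch but would be slow on very large files.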