How to Use the Fuzzy Duplicate Finder
1. Upload your CSV file
2. Select the column to compare for duplicates
3. Set the similarity threshold (default 80%)
4. Download the results with duplicate groups marked
Find near-duplicate rows using similarity matching. Catches typos and variations. Set your own threshold. Free and 100% private.
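The tool uses Levenshtein distance, the number of single-character insertions, deletions, and substitutions needed to turn one string into another, to calculate how similar two strings are. With a threshold of 80%, strings whose similarity score is at least 80% are grouped as duplicates.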
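As a rough illustration of the scoring idea, here is a minimal Python sketch. The tool does not state how it turns an edit distance into a percentage, so the normalization used here (one minus the distance divided by the longer string's length) is an assumption, and the function names `levenshtein`, `similarity`, and `is_duplicate` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Score from 0 to 1. ASSUMPTION: normalized by the longer length;
    the tool's exact formula is not documented."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest


def is_duplicate(a: str, b: str, threshold: float = 0.80) -> bool:
    """True when the score meets the threshold (0.80 corresponds to 80%)."""
    return similarity(a, b) >= threshold
```

Under this assumed normalization, "Jonathan Smith" and "Jonathon Smith" differ by one substitution across 14 characters and score about 0.93, so they match at both 80% and 90%. "Jon Smith" and "Jonathan Smith" differ by five insertions and score about 0.64, so they only match once the threshold is lowered to the 60% range.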
80% is recommended for names with typos. Use 90%+ for stricter matching. Use 60-70% if you expect significant variations.
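Two columns are added to the results: _DUPLICATE_GROUP (a number shared by all rows in the same duplicate group) and _SIMILARITY (a score from 0 to 1, so an 80% threshold corresponds to 0.8). Only rows with at least one match are included.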
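The shape of that output can be sketched in Python as follows. This is not the tool's actual implementation: the grouping strategy (the first unmatched row seeds a group and later rows join it), the meaning of each row's _SIMILARITY value (its best score within the group), and the names `mark_duplicates` and `default_score` are all assumptions. To keep the sketch short, the default scorer is `difflib.SequenceMatcher` from the standard library, which is a stand-in and not Levenshtein distance; any function returning a score from 0 to 1 can be passed instead.

```python
import csv
import io
from difflib import SequenceMatcher


def default_score(a: str, b: str) -> float:
    # Stand-in scorer from the standard library; NOT Levenshtein distance.
    return SequenceMatcher(None, a, b).ratio()


def mark_duplicates(rows, column, threshold=0.80, score=default_score):
    """Return only the rows that have a near-duplicate, with
    _DUPLICATE_GROUP and _SIMILARITY columns added.
    ASSUMPTION: greedy grouping around the first row of each group."""
    group_of = {}  # row index -> group number
    best = {}      # row index -> highest score seen within its group
    next_group = 1
    for i in range(len(rows)):
        if i in group_of:
            continue  # already attached to an earlier group
        for j in range(i + 1, len(rows)):
            if j in group_of:
                continue
            s = score(rows[i][column], rows[j][column])
            if s >= threshold:
                if i not in group_of:  # first match: open a new group
                    group_of[i] = next_group
                    next_group += 1
                group_of[j] = group_of[i]
                best[i] = max(best.get(i, 0.0), s)
                best[j] = s
    return [
        {**rows[i],
         "_DUPLICATE_GROUP": group_of[i],
         "_SIMILARITY": round(best[i], 2)}
        for i in sorted(group_of)
    ]


# Hypothetical sample data: two spellings of one name plus an unrelated row.
data = (
    "name,city\n"
    "Jonathan Smith,Leeds\n"
    "Maria Garcia,Madrid\n"
    "Jonathon Smith,Leeds\n"
)
rows = list(csv.DictReader(io.StringIO(data)))
result = mark_duplicates(rows, "name")
for row in result:
    print(row)
```

With this sample, the two "Smith" rows come back sharing group 1, and the "Maria Garcia" row is left out because nothing matches it.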