Question 1

How does fuzzy matching work?

Accepted Answer

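The tool uses Levenshtein distance to score how similar two strings are. Levenshtein distance is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. With a threshold of 80%, strings whose similarity score is at least 80% are grouped as duplicates.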
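To make the scoring concrete, here is a minimal sketch of a Levenshtein-based similarity. The answer above does not say how the tool turns an edit distance into a percentage, so the normalization used here (one minus the distance divided by the length of the longer string) is an assumption, and the function names `levenshtein` and `similarity` are illustrative rather than the tool's own.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest


# One substituted letter in a 14-character name.
print(similarity("Jonathan Smith", "Jonathon Smith"))
```

Under this assumption, a single typo in a 14-character name scores about 0.93, which clears an 80% threshold.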

Question 2

What threshold should I use?

Accepted Answer

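80% is recommended for names with typos. Use 90% or higher for stricter matching. Use 60-70% if you expect significant variations, keeping in mind that a lower threshold also groups more rows that are not true duplicates.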
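One way to reason about a threshold is to ask how many character edits it tolerates for a value of a given length. The sketch below assumes similarity is computed as one minus the Levenshtein distance divided by the length of the longer string, which the tool does not confirm. The helper name `max_edits` is hypothetical. Integer arithmetic is used so that floating-point rounding cannot shave off an edit.

```python
def max_edits(length: int, threshold_pct: int) -> int:
    """Largest edit distance that still meets the threshold for a string
    of the given length (assumed normalization: 1 - distance / length)."""
    return length * (100 - threshold_pct) // 100


# Edits tolerated for a 10-character value at several thresholds.
for pct in (90, 80, 70, 60):
    print(f"{pct}% -> up to {max_edits(10, pct)} edit(s)")
```

Under this assumption, short values are matched much more strictly than long ones: at 80%, a 4-character value tolerates no edits at all, while a 14-character name tolerates two.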

Question 3

What columns are added to the output?

Accepted Answer

Two columns are added: _DUPLICATE_GROUP (a group number shared by rows that match each other) and _SIMILARITY (a score from 0 to 1). Only rows that have at least one match are included in the output.
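The sketch below illustrates what such an output could look like. It is not the tool's implementation. To stay short it uses `difflib.SequenceMatcher.ratio()` from the Python standard library as a stand-in for the Levenshtein score. It also assumes a simple greedy grouping in which each row is compared against the first row of every existing group, and that _SIMILARITY is the score against that first row (which itself gets 1.0). The function name `find_fuzzy_duplicates` and the `name` column are hypothetical.

```python
from difflib import SequenceMatcher


def score(a: str, b: str) -> float:
    # Stand-in similarity in the range 0-1 (not Levenshtein-based).
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_fuzzy_duplicates(rows, column, threshold=0.8):
    groups = []  # each group: list of (row, similarity to the group's first row)
    for row in rows:
        for group in groups:
            first_row = group[0][0]
            s = score(row[column], first_row[column])
            if s >= threshold:
                group.append((row, s))
                break
        else:
            # No existing group matched, so this row starts a new one.
            groups.append([(row, 1.0)])

    output = []
    group_number = 0
    for group in groups:
        if len(group) < 2:
            continue  # rows without a match are left out of the output
        group_number += 1
        for row, s in group:
            output.append({
                **row,
                "_DUPLICATE_GROUP": group_number,
                "_SIMILARITY": round(s, 2),
            })
    return output


rows = [
    {"name": "Jonathan Smith"},
    {"name": "Maria Garcia"},
    {"name": "Jonathon Smith"},
    {"name": "Wei Chen"},
    {"name": "Maria Garcai"},
]
for out_row in find_fuzzy_duplicates(rows, "name"):
    print(out_row)
```

In this example the two Smith rows share one group number, the two Garcia rows share another, and the unmatched "Wei Chen" row does not appear in the output.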

Find Fuzzy Duplicates - Similar Row Detection Tool

Fuzzy Duplicate Finder

How to Use the Fuzzy Duplicate Finder

Why Use This Tool?

Frequently Asked Questions
