Compare Two CSV Files (Find Differences)
How to compare two CSV files to find added, removed, or changed rows. Methods using Excel, command line, and diff tools.
Compare Two CSV Files (Find Differences)
You have data_v1.csv and data_v2.csv.
You need to know:
- What rows were added?
- What rows were deleted?
- Which specific cells changed?
Staring at them side-by-side is impossible. Here is how to automate the comparison.
Method 1: Online Diff Tool (Easiest)
For quick visual comparison.
- Upload File A (Old).
- Upload File B (New).
- The tool highlights changes: Green (Added), Red (Deleted), Yellow (Changed).
Method 2: Excel "Inquire" or Conditional Formatting
Quick Visual Check:
- Open both files.
- Copy data from File B into a new sheet in File A.
- Select all data in both sheets.
- Home > Conditional Formatting > Highlight Cells Rules > Duplicate Values.
- Switch to "Unique" values.
- Unique values (changes) will be highlighted.
Formula Method (True/False):
If Sheet1 and Sheet2 are sorted identically:
In Sheet3 cell A1: =IF(Sheet1!A1=Sheet2!A1, "Match", "CHANGE")
Drag this formula across all rows/cols.
Pros: Built-in. Cons: Fails if rows are inserted/deleted (misaligns everything).
Method 3: Command Line (diff)
If you are on Mac or Linux, the diff command is built-in.
diff file1.csv file2.csv
Output:
< 1,John,Manager
---
> 1,John,Senior Manager
< means line from file 1. > means line from file 2.
Better Command: vimdiff
vimdiff file1.csv file2.csv
Opens a side-by-side view in the terminal.
Pros: Fast, free. Cons: Hard to read for non-developers; sensitive to row sorting.
Method 4: Python (Pandas)
For detailed analysis (e.g., "Give me a list of all new IDs").
import pandas as pd
df1 = pd.read_csv('old.csv')
df2 = pd.read_csv('new.csv')
# Find rows in df2 that are not in df1 (Added)
new_rows = df2[~df2['id'].isin(df1['id'])]
# Find rows in df1 that are not in df2 (Deleted)
deleted_rows = df1[~df1['id'].isin(df2['id'])]
print("New Rows:", len(new_rows))
print("Deleted Rows:", len(deleted_rows))
Tips for Accurate Comparison
1. Sort Your Data First
If File A is sorted by ID and File B is sorted by Name, every line will look different to a simple diff tool. Always sort both files by a unique key (ID, Email) before comparing.
2. Ignore Headers
Ensure you aren't comparing the header row as data.
3. Watch for Formatting
$100 vs 100.
2024-01-01 vs 1/1/24.
These are "changes" to a computer, even if the value is the same. Clean/standardize data before comparing.
4. Floating Point Numbers
10.000001 vs 10.0.
Computers might see these as different. Use a tool that allows "fuzzy" numeric comparison (tolerance).
Summary
- Visual/Small files: Use Excel Conditional Formatting or an Online Tool.
- Quick Check: Use
diffcommand. - Data Analysis: Use Python/Pandas.
- Crucial Step: Sort both files before starting!
Need to see the difference? HappyCSV's Compare Tool visualizes changes instantly in your browser.
Related Articles
Anonymize CSV Data (GDPR/Testing)
How to mask sensitive data in CSV files. Anonymize names, emails, and phones for testing or GDPR compliance.
Batch Convert Multiple Excel Files to CSV
How to convert 100 Excel files to CSV at once. Use VBA macros, Python scripts, or batch converters to save hours of manual work.
Best Free CSV Viewers for Mac & Windows
Excel isn't the only way to open CSVs. Check out the best free CSV viewers like Tad, Miller, and online tools for large files.
Need to handle CSV files?
HappyCSV is the free, secure way to merge, split, and clean your data — all in your browser.