How to Split Large CSV Files (3 Easy Methods)
Split large CSV files that are too big for Excel. Learn how to divide CSV files by row count or column values. Handle files over 1 million rows.
How to Split Large CSV Files (3 Easy Methods)
Your CSV file is 2 million rows. Excel won't even open it. Your database import times out. Your upload fails because the file is too big.
Time to split that monster into manageable chunks.
Let me show you three ways to split CSV files, when to use each method, and how to avoid common pitfalls.
Why Split CSV Files?
Excel's row limit: 1,048,576 rows maximum. Anything larger just won't open.
File size restrictions: Many systems have upload limits (25MB, 50MB, 100MB). Your 500MB CSV file doesn't fit.
Processing performance: Smaller files process faster. Split into chunks, process in parallel.
Easier management: Would you rather wrangle one 10GB file or ten 1GB files?
Sharing portions: Send someone just the rows they need, not the entire dataset.
Method 1: Split by Number of Rows
This is the most common approach - break your file into chunks of X rows each.
Example:
- Original: 1,000,000 rows
- Split into: 10 files of 100,000 rows each
- Each file gets: Headers + 100,000 data rows
How to Do This
Option A: Use a Split Tool
- Upload your large CSV
- Specify rows per file (e.g., 100,000)
- Download ZIP containing all split files
- Each file includes headers automatically
Option B: Command Line (Linux/Mac)
# Split into files of 100,000 lines each
split -l 100000 largefile.csv chunk_
# Add headers to each chunk (requires a bit more work)
head -1 largefile.csv > headers.txt
for file in chunk_*; do
cat headers.txt $file > with_headers_$file.csv
done
Option C: Python Script
import pandas as pd
chunk_size = 100000
df = pd.read_csv('largefile.csv', chunksize=chunk_size)
for i, chunk in enumerate(df):
chunk.to_csv(f'split_part_{i+1}.csv', index=False)
When to Use Row Splitting
Perfect for:
- Overcoming Excel's row limit
- Working around upload size restrictions
- Parallel processing workflows
- Splitting chronological data by time period
Example: You have a year's worth of transaction logs (12M rows). Split into 12 files of 1M rows each (one per month).
Method 2: Split by Column Value
Instead of arbitrary row counts, split based on actual data in a column.
Example:
- Column: "Country"
- Split result:
USA.csv,Canada.csv,UK.csv,Mexico.csv
Each file contains only rows where Country matches that value.
How to Do This
Using a Tool:
- Upload CSV
- Select column to split by (e.g., "Country")
- Tool creates one file per unique value
- Download ZIP with all files
Using Excel (Small Files Only):
- Apply Auto-Filter to your column
- Filter by first value
- Copy visible rows to new sheet
- Save as CSV
- Repeat for each value
Tedious, but works for occasional use.
Using SQL (If Data in Database):
-- Export each region to separate file
SELECT * FROM table WHERE region = 'North' INTO OUTFILE 'north.csv';
SELECT * FROM table WHERE region = 'South' INTO OUTFILE 'south.csv';
When to Use Column Splitting
Perfect for:
- Distributing data by region/territory
- Separating by category/type
- Creating department-specific files
- Grouping by customer/ account
Example: You have a customer list of 100K rows. Split by "Sales Rep" to create separate files for each sales person to manage their own leads.
Watch Out for High Cardinality
High cardinality = lots of unique values in a column.
If you split by "Customer ID" and have 50,000 customers, you'll get 50,000 files. That's not useful.
Better choices for splitting:
- Country (maybe 20-50 values)
- Region (5-10 values)
- Category (10-30 values)
- Year/Month (12-24 values for yearly data)
Bad choices:
- Email (unique per row)
- Transaction ID (unique per row)
- Timestamp (thousands of unique values)
Method 3: Manual Split in Excel (Last Resort)
If you're desperate and have no other tools:
- Open CSV in Excel (if it fits)
- Select first 100,000 rows
- Cut (Ctrl+X)
- Paste into new workbook
- Save as CSV (part_1.csv)
- Repeat with next 100,000 rows
Why this is terrible:
- Time-consuming
- Error-prone
- Excel might crash with large files
- You might miss rows or duplicate some
When to use this: Never, unless you have literally no other option.
File Naming Best Practices
For Row Splits
Good:
data_part_001.csv
data_part_002.csv
data_part_003.csv
Better:
transactions_2024_part_01_rows_1-100000.csv
transactions_2024_part_02_rows_100001-200000.csv
Why: Numbered parts, clear ranges, sortable.
Bad:
split1.csv
split2.csv
newfile.csv
Why: Not descriptive, hard to track order.
For Column Splits
Good:
customers_USA.csv
customers_Canada.csv
customers_UK.csv
Better:
2024_customers_USA.csv
2024_customers_Canada.csv
Why: Includes context (year), clear meaning.
Rules:
- Use underscores, not spaces
- Include date if relevant
- Be descriptive but concise
- Use consistent format across all files
Don't Forget Headers!
The #1 mistake when splitting: forgetting to include headers in each file.
Bad split:
File 1: John,john@email.com,555-1234
File 2: Sarah,sarah@email.com,555-5678
Good split:
File 1: Name,Email,Phone
John,john@email.com,555-1234
File 2: Name,Email,Phone
Sarah,sarah@email.com,555-5678
Each file needs headers for:
- Excel/Sheets to display column names
- Imports to map columns correctly
- Humans to understand what each column is
Most split tools do this automatically. If you're using command-line or manual methods, double-check.
After Splitting: Verification
Before you archive your original file or send split files to others, verify:
Row count adds up:
Original: 1,000,000 rows
Part 1: 100,000 rows
Part 2: 100,000 rows
...
Part 10: 100,000 rows
Total: 1,000,000 ✓
No data loss:
- Spot-check: Is customer #500 in the right file?
- Check first and last rows of each part
- Verify unique identifiers (IDs) are all present
All files are valid CSVs:
- Open each in text editor - does it look right?
- Try opening in Excel - does it parse correctly?
- Check for weird characters or corruption
Headers present:
- Every file should have the same header row
- Column order should match original
Common Pitfalls
Pitfall 1: Splitting Mid-Row
If your CSV has multi-line fields (text with line breaks inside), splitting at a specific line count might break rows apart.
Example:
Name,Address
"John Smith","123 Main St
Apartment 5"
If you split after line 2, you'll break John's address.
Solution: Use a tool that understands CSV structure, not just line counts.
Pitfall 2: Inconsistent Encoding
Original file is UTF-8. You split it. Some files save as ANSI. Now you have encoding mismatches.
Solution: Verify all split files have the same encoding as the original (usually UTF-8).
Pitfall 3: Losing Track of Order
You split into 50 files. Now you can't remember which one has what data.
Solution:
- Use descriptive filenames
- Keep a manifest file listing what each contains
- Put all split files in one folder with README
Pitfall 4: Not Keeping Original
You split the file, delete the original to save space, then realize you need to re-split differently.
Solution: Always keep the original master file. Archive it if needed, but don't delete.
Use Cases
Use Case 1: Excel Row Limit
Problem: CSV has 2M rows, Excel maxes at 1M.
Solution: Split into 2 files of 1M rows each. Process separately in Excel.
Use Case 2: Upload Size Limit
Problem: Database import tool has 50MB limit, your CSV is 200MB.
Solution: Split into 4 files of 50MB each. Import sequentially.
Use Case 3: Parallel Processing
Problem: Processing 10M rows takes 5 hours.
Solution: Split into 10 files of 1M rows. Process in parallel on multiple machines. Done in 30 minutes.
Use Case 4: Regional Distribution
Problem: You have global customer data. Each region needs their own file.
Solution: Split by "Region" column. Send each regional file to respective team.
Merging Split Files Back Together
Eventually you might need to recombine split files.
Quick method:
- Put all split files in one folder
- Use merge tool
- Download combined file
Command line:
# Combine all split files (headers only from first file)
cat split_part_*.csv > combined.csv
Note: This includes headers from every file. You'll need to remove duplicates manually or use a tool that handles this.
The Bottom Line
Choose your split method based on need:
- Row count: Overcome size limits, parallel processing
- Column value: Distribute by category/region
- Manual: Never (use a tool instead)
Before splitting:
- Backup original file
- Decide on split criteria
- Plan file naming convention
After splitting:
- Verify row counts
- Check for data loss
- Confirm headers in all files
- Test that files are valid CSVs
Large CSV files don't have to be scary. Split them into manageable pieces and suddenly that "impossible" task becomes totally doable.
Split your CSV files easily: HappyCSV's split tools handle both row-based and column-based splitting, preserve headers automatically, and package everything in a ZIP for easy download.
Related Articles
Anonymize CSV Data (GDPR/Testing)
How to mask sensitive data in CSV files. Anonymize names, emails, and phones for testing or GDPR compliance.
Batch Convert Multiple Excel Files to CSV
How to convert 100 Excel files to CSV at once. Use VBA macros, Python scripts, or batch converters to save hours of manual work.
Best Free CSV Viewers for Mac & Windows
Excel isn't the only way to open CSVs. Check out the best free CSV viewers like Tad, Miller, and online tools for large files.
Need to handle CSV files?
HappyCSV is the free, secure way to merge, split, and clean your data — all in your browser.