CSV file has become a generic format to store structured data. The CSV format relies on text files and most applications export data into CSV files.
Spreadsheet tools open CSV files but will crash on big files. Google Sheets shows an error when uploading a big CSV file and the user experience is terrible once you reach a few hundred thousand lines.
Then, how to edit big CSV files when you are not a technical person? Data scientists and developers use databases or programming languages to interact with big data sets. But they are out of reach and complex for day-to-day CSV operations.
In this guide, you will learn:
View vs Edit CSV files
This guide focuses on editing CSV files. If you want to open and visualize (or filter) your CSV file, please use big data solutions (such as Microsoft PowerBi).
A free alternative to big data tools is to use the Terminal on Linux and macOS. Check our guide: How to search in a big CSV file using Linux or macOS Terminal.
Up to 1.5 million lines, edit CSV files online with Datablist
Datablist is an online data editor. It opens CSV files directly in your browser without third-party software. Datablist works by using the database feature provided by modern web browsers (check our documentation to learn more).
In the free plan, Datablist let you load CSV files up to 1 million lines. The limit is set to 1.5 million lines with the Standard Plan. See our pricing page for a detailed comparison between plans.
Step 1: Create a collection and import your CSV file
Datablist CSV editor requires signing up to open big CSV files. Please create an account before continuing.
To import your CSV file into Datablist, create an empty collection and use the "Import CSV/Excel" assistant to load your file.
Select the CSV columns you want to import and confirm the detected data types. Datablist reads the first lines of your file and detects the data type for each column. Data types are important for sorting and filtering your collection later on.
Check the preview data, and click "Import" to launch the import process.
Loading a big CSV file takes time. Expect 5 minutes to load a 1 million lines CSV file. To cancel the loading process before the end, click the "Stop Import" button. It will stop the import process.
Step 2: Edit your big CSV file online
During import, your CSV file is parsed. And the data is stored inside a local database in your browser. The data is persisted on page reload and won't be flush until you say so (or if you reach your hard drive limit).
With Datablist, you can filter your CSV file with one or several conditions and use "full-text" search.
Filter and search response times are correlated with the number of items and the number of available results. The more matching items, the faster the results. If Datablist has to read all the items to find one or zero matching items, the response time will be slower.
Edit your items with a spreadsheet-like interface. Press "Enter" on a cell to edit its content. For datetime values, a calendar is available. And big texts are editable in a large textarea.
An item detail drawer opens when you click on the expand button on the left of your item line.
Merge duplicate items
Datablist comes with some advanced editing features to manage your data. One of them is the duplicate items finder.
The deduplication algorithm compares the values from one or many properties and lists all duplicate items. On the duplicates listing, you can merge all non-conflicting duplicate items automatically. A merging assistant is available to merge the remaining conflicting items. Read our guides to learn how to deduplicate a CSV file or how to merge duplicate rows from an Excel file.
Items merging is available directly in the data listing view. Select the items you want to merge and click on the "Merge Items" button.
All selected items are compared and the item with the most data is elected as the "Primary Item". Then, non-conflicting values from other items are set in the primary item empty properties. When confirming the merging, all but the primary item are deleted. The primary item is kept with the combined values.
Step 3: Export your edited CSV data
Once you have finished your modifications, you can keep your data on Datablist or export it into a CSV or Excel file.
Options are available to configure your export:
- Properties to export - Collection Properties can be hidden from the data listing. When at least one property is hidden, the exporter let you decide if the export must include all or only the visible properties.
- The export format - CSV or Excel.
- Column Delimiter - For CSV format, an option to select between "Comma" or "Semicolon" is available for the column delimiter.
Excel forbids cell text to be larger than 32k characters. And Excel won't be able to import big Excel files (see Excel limits). We recommend using the CSV format for big collections.
For bigger CSV files
Split your CSV files and merge results
Datablist has an import limit of 1.5 million lines for CSV (1 million lines for the free plan). After this limit, your CSV file is truncated. For bigger CSV files, with several million items, split your big CSV file into several small CSV files. Edit them with Datablist, export them as CSV and merge them back into a single big CSV file.
To split a big CSV file, just search on Google "Online CSV splitter". I use the free CSV splitter from ExtendsClass.
To import your smaller CSV files into Datablist, they must have a header row. It's important you configure your CSV splitter to copy the header row into each split file.
Follow the tutorial above to edit your small CSV files.
Then use another online tool to merge back all your CSV files into one. ExtendsClass offers a good tool to merge CSV online.
For Mac and Linux users
The last option for Mac and Linux users is Visidata.
Visidata is a free tool and uses the terminal to show and interact with your CSV file. It can be scary for non-technical users... But if you want to impress yourself, give it a go!
What is Google Sheets row limit?
In march 2022, Google Sheets increased its cell limit to 10 million cells (from a previous limit of 5 million cells). It translates to a row limit by dividing the limit by the number of CSV columns you have.
In theory, Google Sheets could be used to open and edit large CSV files (close to one million rows).
But Google Sheets fetches the whole spreadsheet data from its server into your web browser memory every time you open the webpage. For an accounting use case, it let you create formulas that use data from any cell from your spreadsheet. But for data listing and manipulation, this is inefficient! I don't recommend Google Sheets for data manipulation once you reach a few hundred thousand rows.
What is Airtable row limit?
According to Airtable support page, the limits are:
- Free Plan - 1,200 records per base
- Plus Plan - 5,000 records per base
- Pro Plan - 50,000 records per base
- Enterprise Plan - 100k records per table (250,000 records per base)
In Airtable, a row/line is a record. And a base is composed of several tables that can be linked together (similar to a workbook with several sheets in a spreadsheet).
With such limits, Airtable is not a tool to open big CSV files.
What is Microsoft Excel row limit?
According to Microsoft Support, Excel rows limit is 1,048,576 (as of 2022).
Also, each cell can't contain more than 32,767 characters.
What is Apple Numbers row limit?
Like other spreadsheet tools, Apple Numbers is not suited to deal with big CSV files. Apple Numbers has a 1,000,000 rows limit per sheet. This apply for imported CSV files.
But the UI is unusable with that many rows and editing a cell freezes the application for several seconds.
Do you have sample CSV files to test Datablist?
Sure! Visit our sample CSV Files page to download CSV files with data up to 2 million records.