July 2022

Managing collections of up to 1.5 million items

Last year, I focused on building the foundations for Datablist: user management, the data table, and the basics for dealing with data. Until January 2022, Datablist could only import CSV files with 10k rows or fewer. This is the limit you currently find on Airtable, Coda, etc.

For 2022, I wanted Datablist to handle lists of at least 1 million items. This is a comfortable limit for logs, product data sets, users, and prospect lists. Spreadsheet tools break when dealing with a few hundred thousand items.

In July, I finally unlocked import for CSV files of up to 1.5 million rows (1 million for free users)!

Going higher is not on the roadmap.

I can't find business use cases that need more than 1.5 million items. Bigger CSV files belong to data science and are used for analytics, in read-only mode. Read-only analytics on big CSV files is already possible with tools like Microsoft Power BI, and Datablist doesn't have any advantage there.

Datablist shines at data consolidation, enrichment from external files or APIs, and cleaning (deduplication, merging).

Stoppable import process

With big CSV files allowed, the import process can take a few minutes. We want Datablist to stay responsive, so we added a "stop import" button to cancel an import before it finishes.
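As an illustration only, here is a minimal sketch of how a stoppable import can work: rows are saved in batches, and a stop flag is checked between batches. The names `importRows`, `saveBatch`, and `shouldStop` are hypothetical and are not Datablist's actual code. In a browser, the batches would be scheduled asynchronously so that a click on the stop button can flip the flag.

```typescript
// Hypothetical sketch: not Datablist's actual import code.
interface ImportResult {
  imported: number; // rows saved before the import ended
  stopped: boolean; // true when the user cancelled the import
}

function importRows<T>(
  rows: T[],
  batchSize: number,
  saveBatch: (batch: T[]) => void,
  shouldStop: () => boolean,
): ImportResult {
  let imported = 0;
  for (let start = 0; start < rows.length; start += batchSize) {
    // Check the stop flag between batches so a cancel takes effect quickly.
    if (shouldStop()) {
      return { imported: imported, stopped: true };
    }
    const batch = rows.slice(start, start + batchSize);
    saveBatch(batch);
    imported += batch.length;
  }
  return { imported: imported, stopped: false };
}

// Usage: simulate a user clicking "stop import" after two batches.
const demoRows: { id: number }[] = [];
for (let i = 0; i < 10; i += 1) {
  demoRows.push({ id: i });
}
let batchesDone = 0;
const importDemo = importRows(
  demoRows,
  3,
  () => { batchesDone += 1; },
  () => batchesDone >= 2,
);
```

Rows saved before the stop stay imported in this sketch. Whether a cancelled import should keep or discard them is a product decision the sketch does not settle.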

Improve search and filtering for large collections

The time to process searches and filtering on your data in Datablist is proportional to the number of items. If your collection has 1 million items, filtering it takes one thousand times longer than filtering a collection of 1,000 items.

To scale, Datablist's filtering engine stops once it has found enough results to fill your list view. When you scroll, it resumes the search to find more results.

With this behavior, searching and filtering hundreds of thousands of items feels the same as searching a small dataset.
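The stop-and-resume behavior can be sketched with a cursor that remembers where the scan stopped. This is a simplified illustration under my own assumptions (an in-memory array and a hypothetical `createSearch` helper), not the real engine.

```typescript
// Hypothetical sketch of a resumable filter; not Datablist's actual engine.
interface ResumableSearch<T> {
  nextPage(pageSize: number): T[];
  scanned(): number;
}

function createSearch<T>(items: T[], matches: (item: T) => boolean): ResumableSearch<T> {
  let cursor = 0; // index of the next item to examine
  return {
    nextPage(pageSize: number): T[] {
      const page: T[] = [];
      // Stop as soon as the page is full or the collection is exhausted.
      while (page.length < pageSize && cursor < items.length) {
        const item = items[cursor];
        cursor += 1;
        if (matches(item)) {
          page.push(item);
        }
      }
      return page;
    },
    scanned(): number {
      return cursor; // how many items have been examined so far
    },
  };
}

// Usage: 100,000 items, one in ten matches the filter.
const demoNumbers: number[] = [];
for (let i = 0; i < 100000; i += 1) {
  demoNumbers.push(i);
}
const itemSearch = createSearch(demoNumbers, (n) => n % 10 === 0);
const firstPage = itemSearch.nextPage(5); // fills the visible list
const scannedAfterFirstPage = itemSearch.scanned();
const secondPage = itemSearch.nextPage(5); // triggered by scrolling
```

In this example, the first page is filled after examining 41 items out of 100,000, which is why the perceived speed depends on how quickly matches appear rather than on the size of the collection.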


On top of that, any search in progress is canceled when the search or filtering parameters change. When you type in the search box, a search runs each time you pause typing for a moment. When you resume typing to add a keyword, the previous search request is canceled.
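A minimal sketch of this debounce-and-cancel pattern follows. The `createDebouncedSearch` helper, the `CancelToken` shape, and the 300 ms delay are assumptions made for the example, not Datablist's API. The timer functions are injectable so the usage part can run with a fake timer. In a browser, the standard `AbortController` could play the role of the token.

```typescript
// Hypothetical sketch; names are illustrative, not Datablist's API.
interface CancelToken {
  cancelled: boolean;
}

type Schedule = (callback: () => void, delayMs: number) => unknown;
type Unschedule = (handle: unknown) => void;

function createDebouncedSearch(
  runSearch: (query: string, token: CancelToken) => void,
  delayMs: number,
  schedule: Schedule = (cb, ms) => setTimeout(cb, ms),
  unschedule: Unschedule = (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
): (query: string) => void {
  let pendingTimer: unknown = undefined;
  let activeToken: CancelToken | undefined = undefined;

  return function onInput(query: string): void {
    // New keystroke: drop the timer that was waiting for a typing pause.
    if (pendingTimer !== undefined) {
      unschedule(pendingTimer);
    }
    // Cancel the search that is still running with the old parameters.
    if (activeToken !== undefined) {
      activeToken.cancelled = true;
    }
    pendingTimer = schedule(() => {
      pendingTimer = undefined;
      const token: CancelToken = { cancelled: false };
      activeToken = token;
      runSearch(query, token); // a long search checks token.cancelled as it goes
    }, delayMs);
  };
}

// Usage with a fake timer so the example runs synchronously.
const fakeTimers: { callback: () => void; active: boolean }[] = [];
const startedSearches: { query: string; token: CancelToken }[] = [];
const onSearchInput = createDebouncedSearch(
  (query, token) => { startedSearches.push({ query: query, token: token }); },
  300,
  (callback) => {
    const timer = { callback: callback, active: true };
    fakeTimers.push(timer);
    return timer;
  },
  (handle) => { (handle as { active: boolean }).active = false; },
);

onSearchInput("acme");       // the user types, then pauses
fakeTimers[0].callback();    // the pause elapsed: the "acme" search starts
onSearchInput("acme paris"); // the user resumes typing: the "acme" search is cancelled
```

The search function is expected to check `token.cancelled` between chunks of work and stop early when it is set.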

Persistent item drawer

I've added a persistent URL for every item. When you open an item in the drawer, the browser URL changes to the item's persistent URL.

Opening this URL in a new tab or in another browser loads the collection and opens your item directly.
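To make the idea concrete, here is a small sketch of building and reading back such a URL. The URL shape (`/collections/<collectionId>?item=<itemId>`), the host `app.example.com`, and the helper names are all invented for this example. Datablist's real URL format may differ.

```typescript
// Hypothetical URL scheme for illustration; not Datablist's real format.
interface ItemLocation {
  collectionId: string;
  itemId: string | null; // null when the URL points at the collection only
}

function buildItemUrl(baseUrl: string, collectionId: string, itemId: string): string {
  const url = new URL("/collections/" + encodeURIComponent(collectionId), baseUrl);
  url.searchParams.set("item", itemId);
  return url.toString();
}

function parseItemUrl(href: string): ItemLocation | null {
  const url = new URL(href);
  const match = /^\/collections\/([^/]+)\/?$/.exec(url.pathname);
  if (match === null) {
    return null; // not a collection URL
  }
  return {
    collectionId: decodeURIComponent(match[1]),
    itemId: url.searchParams.get("item"),
  };
}

// Usage: the URL written when the drawer opens can be read back on page load.
const itemUrl = buildItemUrl("https://app.example.com", "c42", "i7");
const itemLocation = parseItemUrl(itemUrl);
```

On page load, an application following this approach would load the collection named in the path, then open the drawer if an item identifier is present.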

Improvements

See how many items a search or filter returns

The way Datablist processes data during a search (see above) means the engine doesn't know how many results a query can return. It just stops when it has enough results to show. That is why I don't always show a counter with the number of matching items: counting every item that matches a query is an intensive operation.

But this information is important when managing a dataset.

The total number of items matching a query is now available through the "select all" feature.

Toggle the master checkbox and click on "All items selected". Datablist will count all the results matching your query and replace the text with the count.
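The sketch below shows why the count is computed only on request: unlike filling one page of results, counting has to examine every item. The function names and the label wording after counting are assumptions for the example, not Datablist's actual code or text.

```typescript
// Hypothetical sketch; the label wording after counting is an assumption.
function countMatches<T>(items: T[], matches: (item: T) => boolean): number {
  let count = 0;
  // Unlike filling one page, counting must examine every item.
  for (let i = 0; i < items.length; i += 1) {
    if (matches(items[i])) {
      count += 1;
    }
  }
  return count;
}

function selectAllLabel(count: number | null): string {
  // Before the user asks for it, no count is known.
  return count === null ? "All items selected" : count + " items selected";
}

// Usage: the count is computed only after the user clicks the label.
const countDemoItems: number[] = [];
for (let i = 0; i < 100000; i += 1) {
  countDemoItems.push(i);
}
const labelBeforeClick = selectAllLabel(null);
const matchingCount = countMatches(countDemoItems, (n) => n % 10 === 0);
const labelAfterClick = selectAllLabel(matchingCount);
```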