Changelog

New features, improvements and fixes to Datablist.

August 2022

Run JavaScript code

Data transformation will be a focus over the next months. Splitting or joining properties, find and replace, and similar operations are part of your day-to-day data-cleaning tasks.

I wanted to start with a dev-friendly feature: running JavaScript code directly on your collection items. You can clean and transform any of your properties' data by writing a JavaScript function. Check our guide to scraping and enriching Facebook Group members to see how it can be used.
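
To give an idea of what such a function can look like, here is a minimal sketch that splits a full name into two properties. The function signature and the FullName/FirstName/LastName property names are illustrative assumptions, not Datablist's documented API.

    // Illustrative sketch: the runner is assumed to call this with each item
    // and to use the returned object as the updated property values.
    function transform(item) {
      const fullName = (item.FullName || '').trim();
      const [firstName, ...rest] = fullName.split(/\s+/);
      return {
        FirstName: firstName || '',
        LastName: rest.join(' '),
      };
    }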

Credit system

Datablist's goal is to be the perfect mix of a productivity tool for data management and business software that helps you grow your company. Data management alone is not enough to make an impact. Native data enrichment services and third-party API integrations will be at the core.

In marketing, SaaS APIs offer email validation, business and people enrichment, scoring, and more. Instead of moving your data from one tool to the next, Datablist will consolidate your data so you can trigger each service directly from it.

Every service charges a per-use fee, and this cost has to be passed on to Datablist customers. The first step toward this vision is a new credit system. Every month, customers receive 5,000 credits to use during the month, and top-ups are available to buy extra credits. Free users receive 500 credits on sign-up.

With this system, new third-party integrations will be possible. Feel free to reach out to me if you want a service to be integrated.

Improvements

Export filtered items

When triggering an export, Datablist checks whether you have active filters. If your collection is filtered, two options are available: export only the filtered items, or export the complete collection.

Prevent the browser from loading the previous URL on horizontal scroll when a drawer is open

Web browsers natively map horizontal scrolling to URL history navigation: scroll left to load the previous URL, and right to move forward.

This behavior is counterintuitive with Single Page Applications such as Datablist. In the data listing, you have to scroll right and left to see all your properties. Scroll too much and your browser moves you to another page.

It has happened to me many times: I open an item in the drawer, scroll horizontally to check some data, the scroll goes too far, and I leave the current page. The drawer disappears with my data unsaved.

I don't like to override native browser features, so I haven't disabled this behavior on all Datablist pages. But it is now disabled when you have the drawer open.

This should prevent most of the data loss when creating a new item or running an action.
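
For reference, one common way to block swipe-to-navigate in browsers that support it (not necessarily how Datablist implements it) is to contain horizontal overscroll on the root element while the drawer is open:

    // Hypothetical drawer hooks; overscroll-behavior-x: contain disables
    // horizontal swipe navigation in supporting browsers.
    function onDrawerOpen() {
      document.documentElement.style.overscrollBehaviorX = 'contain';
    }

    function onDrawerClose() {
      document.documentElement.style.overscrollBehaviorX = '';
    }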

Fixes

  • Fix export on collections with more than 500k items
  • Fix export on collections with a single item

July 2022

Managing collections up to 1.5 million items

Last year, I focused on building the foundations for Datablist: user management, the data table, and the basics for dealing with data. Until January 2022, Datablist could only import CSV files with 10k rows or less. This is the current limit you find on Airtable, Coda, etc.

For 2022, I wanted Datablist to handle listings of at least 1 million items. This is a comfortable limit for dealing with logs, product data sets, users, and prospect lists. Spreadsheet tools break when dealing with a few hundred thousand items.

In July I finally unlocked import for CSV files up to 1.5 million rows! (1 million for free users).

Going higher is not on the roadmap.

I can't find business use cases that need more than 1.5 million items. Bigger CSV files are for data science and are used for analytics, in read-only mode. Read-only analytics on big CSV files is possible with tools like Microsoft Power BI, and Datablist has no advantage there.

Datablist shines on data consolidation, enrichment from external files or API, and cleaning (deduplication, merging).

Stoppable import process

With big CSV files allowed, the import process can take a few minutes. We want users to have a responsive experience when using Datablist, so we added a "stop import" button to cancel an import before it finishes.
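
Conceptually, a stoppable import boils down to checking a cancellation flag between chunks. A minimal sketch, where parseNextChunk and saveChunk are hypothetical helpers rather than Datablist's actual API:

    let importCancelled = false;

    function stopImport() {
      importCancelled = true; // wired to the "stop import" button
    }

    async function runImport(parseNextChunk, saveChunk) {
      let chunk;
      // parseNextChunk is assumed to return an empty array when the file is done
      while (!importCancelled && (chunk = await parseNextChunk()).length > 0) {
        await saveChunk(chunk);
      }
    }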

Improve search and filtering for large collections

The time to process searches and filtering on your data in Datablist is proportional to the number of items. If your collection has 1 million items, it takes a thousand times longer to filter your data than with a 1k-item collection.

To scale, Datablist's filtering engine stops once it has found enough results to fill your list view. When you scroll, it resumes the search to find more results.

With this behavior, searching and filtering on hundreds of thousands of items feels the same as searching a small dataset.
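
A minimal sketch of this "stop early, resume on scroll" idea, assuming items can be read page by page from the local database (readPage and predicate are hypothetical names):

    // The caller pulls matches from the generator until the view is full,
    // then pulls again when the user scrolls.
    async function* filterItems(readPage, predicate) {
      let pageIndex = 0;
      let page;
      while ((page = await readPage(pageIndex++)).length > 0) {
        for (const item of page) {
          if (predicate(item)) yield item;
        }
      }
    }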


On top of that, any in-progress search is canceled when the search and filtering parameters change. When typing in the search box, a search runs whenever you stop typing for a moment. If you resume typing to add a keyword, the previous search request is canceled.
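
A sketch of that debounce-and-cancel behavior; runSearch and renderResults are hypothetical helpers, and the 300 ms pause is an arbitrary value for illustration:

    let debounceTimer = null;
    let currentSearchId = 0;

    function onSearchInput(query, runSearch, renderResults) {
      clearTimeout(debounceTimer);
      debounceTimer = setTimeout(async () => {
        const searchId = ++currentSearchId;
        const results = await runSearch(query);
        if (searchId !== currentSearchId) return; // a newer search superseded this one
        renderResults(results);
      }, 300); // run only after typing pauses
    }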

Persistent item drawer

I've added a persistent URL for every item. When you open an item in the drawer, the URL changes to the item's persistent URL.

Opening this URL in a new tab or in another browser will load the collection and open your item directly.

Improvements

See how many items are returned on a search or filter listing

The way Datablist processes data during a search (see above) means the engine doesn't know how many results a query will return. It simply stops when it has enough results to show. That is why I don't show a permanent counter with the number of matching items: counting how many items match a query is an intensive operation.

But this information is important when managing a dataset.

The total number of items matching a query is now available when using the select all feature.

Toggle the master checkbox, and click on "All items selected". Datablist will count all the results matching your query and replace the text with the count.
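
Counting then amounts to scanning the whole collection with the current filter, which is why it only runs on demand. A minimal sketch, reusing a hypothetical page-by-page reader:

    async function countMatches(readPage, predicate) {
      let count = 0;
      let pageIndex = 0;
      let page;
      while ((page = await readPage(pageIndex++)).length > 0) {
        count += page.filter(predicate).length;
      }
      return count;
    }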

10 June 2022

Import files up to 500k items

Another milestone in Datablist's handling of large collections. After moving the limit from 10k to 50k in April, I've been able to increase it ten-fold in May to 500k. From 10k to 500k in 4 months is a big step forward.

When importing a data file (like a CSV file), the data is parsed and stored in a local database. Datablist uses this database to filter and sort the data, and to save edits. In my tests, importing a 500k-row file takes less than 2 minutes. My goal is to import and edit files up to 1 million items.
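
As an illustration, a streaming import could look like the sketch below. It assumes Papa Parse for CSV parsing and a hypothetical saveRows helper that writes each batch to the local database; Datablist's actual stack isn't documented here.

    import Papa from 'papaparse';

    function importCsv(file, saveRows) {
      return new Promise((resolve, reject) => {
        Papa.parse(file, {
          header: true, // first row becomes the property names
          chunk: async (results, parser) => {
            parser.pause();            // wait for the batch to be stored
            await saveRows(results.data);
            parser.resume();
          },
          complete: resolve,
          error: reject,
        });
      });
    }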

Join big CSV files

In the process of increasing the number of items in a collection, I've rewritten some features. The algorithm that joins several CSV files on a unique key has been improved to handle bigger collections and edge cases.

Joining two CSV files with hundreds of thousands of items is now possible.
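
Conceptually, the join indexes one file on the unique key and then merges matching rows, so each file is scanned only once. A minimal sketch with illustrative names:

    function joinOnKey(leftRows, rightRows, key) {
      const rightByKey = new Map();
      for (const row of rightRows) {
        rightByKey.set(row[key], row);
      }
      // rows without a match keep their original values
      return leftRows.map((row) => ({
        ...row,
        ...(rightByKey.get(row[key]) || {}),
      }));
    }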

Collection Filters

Filtering data is finally available on Datablist. Select one or several filter conditions to show a subset of your collection items. Filtering conditions depend on your data types: Number properties can be filtered with numerical operators, and DateTime values are filtered by timestamp. To export your filtered view, select all items and click on "Export selected items".

Auto create properties during first import

Datablist is used both by regular users and by users coming from Google who want to perform a single task (like deduplicating a CSV file). The import process must be as straightforward as possible. On an empty collection, when importing a file, properties are auto-created using column names and detected data types. If the collection already has properties, the mapping process is shown.
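
Data type detection can be sketched as looking at a sample of each column's values; the rules below are illustrative, not Datablist's actual detection logic:

    function detectType(values) {
      const samples = values.filter((v) => v !== null && v !== undefined && String(v).trim() !== '');
      if (samples.length === 0) return 'text';
      if (samples.every((v) => !Number.isNaN(Number(v)))) return 'number';
      if (samples.every((v) => !Number.isNaN(Date.parse(v)))) return 'datetime';
      return 'text';
    }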

Clone collection

Instead of exporting a collection to a CSV file to re-import it into another collection, you can use the "Clone collection" shortcut. All the properties and the collection items are duplicated in a new collection.

Select CSV Export separator

When exporting your data to a CSV file, a new option to select the separator character is available. Choices are "comma" (default) and "semicolon".

April 2022

Import files up to 50k items

In April, I released a first step toward managing a higher number of items with Datablist. From 10k items per collection, the limit has been increased to 50k items.

A lot has changed just to multiply the limit by five. Instead of loading all the data in memory (like all spreadsheet tools), Datablist now loads it on demand from a local database. The user interface is still responsive and easy to use. And this stack will scale well to millions of items (at least it does in my head 🤞).

Performance Improvements

Faster file import

Importing files (CSV/Excel) into a collection has been improved for registered users. Previously, items were saved in the cloud during the file import process. For big files with thousands of items, this led to frustrating seconds or minutes of waiting for cloud sync before accessing the data.

Cloud sync is now asynchronous. Access and manage your data instantly after importing a file while the data is being synced to the cloud in the background.
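
The idea is "local first, cloud later": writes land in the local database immediately and are pushed to the cloud in batches afterwards. A minimal sketch with hypothetical saveLocally and pushBatch helpers:

    const pendingSync = [];

    async function saveItems(items, saveLocally) {
      await saveLocally(items);   // data is usable right away
      pendingSync.push(...items); // cloud sync happens in the background
    }

    async function flushToCloud(pushBatch) {
      while (pendingSync.length > 0) {
        const batch = pendingSync.splice(0, 500); // sync in batches
        await pushBatch(batch);
      }
    }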

Faster duplicates finder

The duplicates finder algorithm has been improved. It is now faster on big collections.

Also:

  • Duplicate comparison is now case-insensitive.
  • DateTime values are now compared.
  • Duplicates on empty values are skipped.
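
To illustrate those rules, here is a sketch of how a comparison key could be built for each value (illustrative, not Datablist's actual implementation). Two values are duplicate candidates when their keys are equal and not null.

    function duplicateKey(value) {
      if (value === null || value === undefined || value === '') return null; // skip empty values
      if (value instanceof Date) return value.getTime();                      // compare timestamps
      return String(value).trim().toLowerCase();                              // case-insensitive
    }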

Action Runner for big collections

Running actions (verify email addresses, find a LinkedIn profile for an email address, etc.) on thousands of items was challenging. The action runner now splits the collection into small parts (chunks) and sends them in sequence. A stop button is available to stop the action before all items are processed.
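
A minimal sketch of that chunked, sequential runner; runAction stands in for the actual call to the action's service:

    let actionStopped = false; // set to true by the stop button

    async function runOnCollection(items, runAction, chunkSize = 100) {
      for (let i = 0; i < items.length && !actionStopped; i += chunkSize) {
        const chunk = items.slice(i, i + chunkSize);
        await runAction(chunk); // the next chunk starts only after this one finishes
      }
    }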

Better error handling

A lot can happen in a web application. The internet connection can drop, servers might have intermittent issues, etc. Shit happens 🤷‍♂️

I continue to improve how the Datablist web client deals with errors: retries, showing feedback, and so on.
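
For example, retrying with exponential backoff is a common pattern for intermittent errors; this sketch is illustrative, not the Datablist client's actual code:

    async function withRetries(request, maxAttempts = 3) {
      for (let attempt = 1; ; attempt++) {
        try {
          return await request();
        } catch (error) {
          if (attempt >= maxAttempts) throw error;  // give up and surface the error
          const delayMs = 500 * 2 ** (attempt - 1); // 500 ms, 1 s, 2 s, ...
          await new Promise((resolve) => setTimeout(resolve, delayMs));
        }
      }
    }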

February 2022

New features

Datablist Help Center

Learn to use Datablist and discover how to get the most out of it with our new Help Center: https://www.datablist.com/docs

New action: LinkedIn Profile Finder

This action takes a name and keyword properties and returns a LinkedIn Profile URL when found. Read our new guide: How to scrape Facebook group members and find their LinkedIn Profile.

Notifications

Long-running tasks are being moved to background jobs to improve UI reactivity. For example, when a collection is deleted, the task takes several seconds to complete, but it doesn't prevent the user from navigating the Datablist app.

Notifications have been implemented for: collection delete, item edit from the drawer, and undo and redo operations.

Lost network connections and API errors while editing items now trigger visible error notifications.

Improvements

Improve selected items export

Select the export format: CSV or Excel files.

Improve history (Undo/Redo)

History actions (Undo and Redo) are now shown directly in the collection header.

Also, it's now possible to undo collection name and icon changes. And, after creating a new item, calling "undo" will delete it.

Keep CSV row order on import

During import, file rows are split into several chunks and saved using parallel calls. Previously, this could reorder items depending on which call was saved first. File import has been improved to keep the original row order.
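
A sketch of the idea: each row gets its position from the file before any call is made, so the final order no longer depends on which parallel call finishes first (the _order field is a hypothetical name):

    async function saveInOrder(rows, saveChunk, chunkSize = 1000) {
      const calls = [];
      for (let i = 0; i < rows.length; i += chunkSize) {
        const chunk = rows.slice(i, i + chunkSize).map((row, j) => ({
          ...row,
          _order: i + j, // position in the original file
        }));
        calls.push(saveChunk(chunk)); // chunks are saved in parallel
      }
      await Promise.all(calls);
    }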

Create a collection fast with a keyboard shortcut

Press "n" to create a new Datablist collection. See Keyboard Shortcuts documentation https://www.datablist.com/docs/keyboard-shortcuts