I can scrape hundreds of case studies in minutes and you can do it too.

In this guide, I'll show you exactly how to scrape case studies efficiently, helping you build a valuable database for sales, marketing, or competitive analysis.

By the end of this tutorial, you'll be able to automatically extract not just the case study links, but also specific information like customer details, industry data, and other key metrics - all organized neatly in a structured format.

This is a two-part workflow: Part 1 collects the links to all case studies, and Part 2 extracts the information from each case study page.

Note: This guide is for scraping dozens or hundreds of case studies from one website. If you want to scrape one or two case studies from many company websites, read this instead: How to Scrape Case Studies at Scale with AI.

Part 1, Step 1 of Scraping All Case Studies From a Website

Go to Datablist.com and sign up.

Datablist’s home page

Create a collection

Datablist’s starting page

Click on “See all sources”

Datablist has over 12 sources, and the number keeps growing

Choose the “AI Agent - Site Scraper”

Datablist has multiple AI agents to choose from

Part 1, Step 2 of Scraping All Case Studies From a Website

In this step we will configure our AI agent to extract all links from the page that stores all case studies.

Start by giving it the link to the page with the case studies.

Datablist’s AI agent can scrape almost any website

Now write a prompt to extract the links or use our template below.

Prompt configuration to scrape case studies with Datablist

Here is my prompt:

Prompt to scrape case studies

I want you to extract all links to the case studies on this page

===

Extract only the links that have this structure "https://www.mazak-customers.com/story/story/......"

===

No Introductions
No Explanations
No Thoughts
Only the links that lead to the case study

Make sure to give the AI a sample of the link structure you want to target, such as www.mazak-customers.com/story/ or www.salesforce.com/customer-stories/. Without it, the agent sometimes returns links to PDF case studies, which are not as useful for this use case.
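
To make the filtering rule concrete, here is a minimal Python sketch of what "only links with this structure, and no PDFs" means. It is not how Datablist's AI agent works internally; the function name filter_case_study_links and the sample links are hypothetical and only illustrate the rule.

```python
from urllib.parse import urlparse


def filter_case_study_links(links, prefix):
    """Keep links that start with `prefix` and do not point to a PDF file."""
    kept = []
    seen = set()
    for link in links:
        if not link.startswith(prefix):
            continue  # link belongs to another section of the site
        if urlparse(link).path.lower().endswith(".pdf"):
            continue  # PDF case studies are not useful for this workflow
        if link in seen:
            continue  # skip duplicates
        seen.add(link)
        kept.append(link)
    return kept


# Hypothetical sample links, for illustration only
sample_links = [
    "https://www.mazak-customers.com/story/story/example-a",
    "https://www.mazak-customers.com/story/story/example-b.pdf",
    "https://www.mazak-customers.com/contact",
    "https://www.mazak-customers.com/story/story/example-a",
]
case_study_links = filter_case_study_links(
    sample_links, "https://www.mazak-customers.com/story/story/"
)
print(case_study_links)
```

The more specific the prefix you put in your prompt, the fewer unrelated links end up in your collection.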

Now check the box to the left of "Enable Pagination" and set a limit for the number of pages the AI agent should be able to visit.

AI agent settings for scraping case studies
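
If you are wondering what the page limit does, the sketch below shows the general idea of paginating with a cap, assuming each listing page points to the next one. It is an illustration only, not Datablist's implementation: collect_links, get_page, and the in-memory FAKE_SITE are hypothetical names that stand in for a real website.

```python
def collect_links(start_url, get_page, max_pages):
    """Follow "next page" links from start_url, visiting at most max_pages pages.

    get_page(url) must return a tuple: (links_on_page, next_url_or_None).
    """
    collected = []
    visited = set()
    url = start_url
    while url is not None and url not in visited and len(visited) < max_pages:
        visited.add(url)
        links, url = get_page(url)  # url now holds the next page, or None
        collected.extend(links)
    return collected


# Hypothetical three-page listing held in memory instead of a real website
FAKE_SITE = {
    "/stories?page=1": (["/story/a", "/story/b"], "/stories?page=2"),
    "/stories?page=2": (["/story/c"], "/stories?page=3"),
    "/stories?page=3": (["/story/d"], None),
}

limited = collect_links("/stories?page=1", FAKE_SITE.__getitem__, max_pages=2)
print(limited)
```

With a limit of 2, the third page is never visited, so set the limit at least as high as the number of listing pages you expect.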

Then configure your outputs as needed, or copy and paste the values below:

  • Output Name: Case Study Link
  • Output Description: The link found on the page
  • Output Type: URL
Output configuration for Datablist’s AI agent

Now, check the box to the left of "Advanced Settings" and enable "Website Scraper Option: Render HTML".

Once you've done this, click on "Continue" to start scraping.

Advanced settings for Datablist’s AI agent

Once the AI agent has finished scraping the case studies, your collection should look like this.

The results display the case study link in the column we named "Case Study Link" and the source page in the column "Page Scraped".

The case study links we scraped with Datablist’s AI agent

Now that we have scraped all the case study links from the first page, let's scrape the case study contents from each case study page.

Part 2 of Scraping All Case Studies from a Website — Extracting Information

This part of the workflow is a bit more involved, but it will save you a lot of time compared to doing it manually. Just follow the instructions below and you'll be on the safe side!

Here are the steps this workflow consists of:

  1. Visiting one or two pages to scan and analyze the structure of pages
  2. Creating tags for each piece of information you want to have
  3. Writing a prompt to provide the AI with clear instructions and examples
  4. Configuring the outputs you want to get
  5. Running the AI agent to scrape the case study content

Part 2, Step 1 of Scraping All Case Studies from a Website

First, you need to visit one or two of the pages that you just scraped, define which pieces of information you want to have, and look for any patterns in the structure of the case studies.

Second, create a tag for each piece of information you want, give the AI examples, and tell it where to find the information. The AI will give you much better outputs that way.

Sometimes you can hover over text to see if the link has specifications you can use to better define your output formats. In my case, for example, "VERSATECH" would be a machine series.

This is one of the case study pages I am about to scrape

💡 Quick Tip

Providing examples can improve your outputs by up to 3x compared with a prompt that has none.

Part 2, Step 2 of Scraping All Case Studies From a Website

In this step, we will configure the AI agent to scrape the information from the case study page — let's go!

First, open your collection with the links to the case study pages again.

Since the "Scraped Page" column is not needed for this workflow, we'll hide it and then click on "Enrich".

Datablist collection with case study links

Now go to “AI” and select the “AI Agent”.

Datablist’s AI enrichments selection

Now copy the prompt template below and modify it according to the information you need from the case study page.

Prompt configuration for scraping case studies with Datablist
Prompt to extract information from a case study page

Context: I need some information related to the case study on the web page

===

What I want you to do: Visit the page I am going to give you and extract the requested data points. I'll tell you more about the information shortly.

===

The data points you have to look for (with examples):
[Information Tag 1] e.g., [Example 1, Example 2, Example 3]
[Information Tag 2] e.g., [Example 1, Example 2, Example 3]
[Information Tag 3] e.g., [Example 1, Example 2, Example 3]

===

You can access the case study with this link: /Your column

Here is the template prompt filled in with example data:

Context: I need some information related to the case study on the web page

===

What I want you to do: Visit the page I am going to give you and extract the requested data points. I'll tell you more about the information shortly.

===

The data points you have to look for (with examples):

Machine Information:

- Machine Series e.g., VERSATECH, Dual Turn, CV5-500
- Machine Name e.g., VERSATECH V-140N/280, OPTIPLEX 4020 DDL, INTEGREX j-200

Customer’s Information:

- Customer's Industry e.g., Manufacturing, Aerospace, Construction
- Customer's Location e.g., Germany, France, Baltics
- Customer’s Name e.g.,

===

You can access the case study with this link: /Case Study Link
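
If you reuse this template across several websites, it can help to generate the prompt text from a list of tags and examples rather than editing it by hand. The sketch below is one possible way to do that in Python. The build_prompt helper is hypothetical and not part of Datablist, and the column placeholder is copied as written in the template above.

```python
def build_prompt(data_points, link_column):
    """Fill the prompt template with information tags and their examples.

    data_points maps a group name to a dict of {tag: [examples]}.
    """
    lines = [
        "Context: I need some information related to the case study on the web page",
        "",
        "===",
        "",
        "What I want you to do: Visit the page I am going to give you and "
        "extract the requested data points.",
        "",
        "===",
        "",
        "The data points you have to look for (with examples):",
    ]
    for group, tags in data_points.items():
        lines += ["", f"{group}:", ""]
        for tag, examples in tags.items():
            lines.append(f"- {tag} e.g., {', '.join(examples)}")
    lines += ["", "===", ""]
    lines.append(f"You can access the case study with this link: /{link_column}")
    return "\n".join(lines)


# Tags and examples taken from the example prompt above
data_points = {
    "Machine Information": {
        "Machine Series": ["VERSATECH", "Dual Turn", "CV5-500"],
        "Machine Name": ["VERSATECH V-140N/280", "OPTIPLEX 4020 DDL", "INTEGREX j-200"],
    },
    "Customer's Information": {
        "Customer's Industry": ["Manufacturing", "Aerospace", "Construction"],
        "Customer's Location": ["Germany", "France", "Baltics"],
    },
}
prompt = build_prompt(data_points, "Case Study Link")
print(prompt)
```

Keeping tags and examples in one place also means the same dictionary can drive your output configuration later on.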

💡 Quick Fact About the AI Agent

The AI agent is incredibly good at following instructions, but if you don’t provide it with clear examples, it won’t give you good results.

After configuring your prompt using our template, you have to configure the outputs. Here’s how:

For each piece of information you want to extract:

  • Use the information tag name as your "Output Name"
  • Add a clear description in the "Output Description" field or include examples
  • Choose the appropriate "Output Type" for the data you want to have
  • Click "More" to add additional outputs and do the same there
Output configuration for scraping case studies with Datablist
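
The mapping from information tags to outputs described above can be sketched as a small data structure. This is only a way to plan your outputs before typing them into the form; OutputField and outputs_from_tags are hypothetical names, and the default type "Text" is an assumption, so pick whichever type Datablist offers that fits your data.

```python
from dataclasses import dataclass


@dataclass
class OutputField:
    name: str          # goes into "Output Name"
    description: str   # goes into "Output Description"
    output_type: str   # goes into "Output Type" (assumed label)


def outputs_from_tags(tags, default_type="Text"):
    """Create one output per information tag, reusing the examples as description."""
    return [
        OutputField(
            name=tag,
            description="Examples: " + ", ".join(examples),
            output_type=default_type,
        )
        for tag, examples in tags.items()
    ]


# Tags and examples taken from the example prompt in this guide
machine_tags = {
    "Machine Series": ["VERSATECH", "Dual Turn", "CV5-500"],
    "Machine Name": ["VERSATECH V-140N/280", "OPTIPLEX 4020 DDL", "INTEGREX j-200"],
}
outputs = outputs_from_tags(machine_tags)
for field in outputs:
    print(field)
```

Using the exact tag name as the output name keeps the prompt and the resulting columns aligned.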

After you've configured all your outputs, click on "Continue to outputs configuration".

Last step before the columns configuration

Now click on all the plus (+) icons to add a new column for each output, then click on "Instant Run".

Datablist columns configuration for scraping case studies

These are the results of the scraped case studies:

Datablist collection with scraped case studies

Frequently Asked Questions About Scraping Case Studies

How Do I Scrape Case Studies From a Website Legally?

Website scraping is generally legal when you scrape publicly available data, respect copyright restrictions, and follow the website's terms of service.

What Tools Do I Need to Scrape Case Studies From Websites?

You can use web scraping tools like Datablist for no-code solutions.

How Long Does It Take to Scrape Case Studies From a Website?

With tools like Datablist, you can scrape hundreds of case studies within minutes to hours. The setup time for automation is typically 15-30 minutes once you understand the website's structure.

Can I Scrape Case Studies From Any Website?

Not all websites allow scraping. Some websites use anti-scraping measures or explicitly forbid it in their terms of service.
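
One quick signal you can check yourself is the site's robots.txt file. The sketch below uses Python's standard urllib.robotparser module on a hypothetical robots.txt text (SAMPLE_ROBOTS and the example.com URLs are made up for illustration). It covers only robots.txt rules, so the terms of service still need to be read separately.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content, for illustration only
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

stories_ok = is_allowed(SAMPLE_ROBOTS, "https://example.com/customer-stories/acme")
private_ok = is_allowed(SAMPLE_ROBOTS, "https://example.com/private/report")
print(stories_ok, private_ok)
```

For a real site, you would paste the contents of its robots.txt file in place of the sample text.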

What Information Can I Extract From Case Studies?

You can extract various data points including company names, industries, challenges, solutions, results, testimonials, dates, and metrics. The key is identifying consistent patterns in how the case studies are structured on the website to ensure accurate data extraction.