Use Datablist's Smart Scraper to extract website text, email addresses, phone numbers, LinkedIn profiles, LinkedIn company pages, and Instagram profile links from a list of webpages.
Upload a CSV or Excel file with URLs, run the enrichment, and get clean columns back in your Datablist collection. No code, no browser extension, and no manual copy-paste.
The main strength is the price. A simple scrape costs 0.10 credits per URL. Since 1,000 credits cost around $1, scraping 10,000 simple webpages uses 1,000 credits, or around $1. If a website blocks a normal request, you can enable proxy fallback. Proxy scraping costs 0.50 credits per URL, and Datablist only uses it when needed.
What the Smart Scraper extracts
The Smart Scraper reads each webpage and returns structured fields you can use for lead generation, research, recruiting, or AI workflows.
It can extract:
- Website text from the page body
- Email addresses from links and page HTML
- Phone numbers
- LinkedIn profile links
- LinkedIn company page links
- Instagram profile links
- The scraping status for each URL
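To give a feel for what this kind of extraction involves, here is a minimal Python sketch that pulls emails and social links out of raw HTML with regular expressions. It is an illustration only, not Datablist's implementation: the patterns, the function names, and the output keys are all assumptions. Phone numbers are left out because their formats vary too much for a short pattern.

```python
import re

# Illustrative patterns only (assumptions, not Datablist's actual rules).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
LINKEDIN_PROFILE_RE = re.compile(
    r"https?://(?:[a-z]+\.)?linkedin\.com/in/[A-Za-z0-9_%-]+", re.I
)
LINKEDIN_COMPANY_RE = re.compile(
    r"https?://(?:[a-z]+\.)?linkedin\.com/company/[A-Za-z0-9_%-]+", re.I
)
INSTAGRAM_RE = re.compile(r"https?://(?:www\.)?instagram\.com/[A-Za-z0-9_.]+", re.I)


def unique(matches):
    """Remove duplicates while keeping first-seen order."""
    return list(dict.fromkeys(matches))


def extract_contacts(html):
    """Return the emails and social links found in an HTML string."""
    return {
        "emails": unique(EMAIL_RE.findall(html)),
        "linkedin_profiles": unique(LINKEDIN_PROFILE_RE.findall(html)),
        "linkedin_companies": unique(LINKEDIN_COMPANY_RE.findall(html)),
        "instagram_profiles": unique(INSTAGRAM_RE.findall(html)),
    }
```

Running `extract_contacts` on a page that contains a `mailto:` link and a few social links returns one list per field, which mirrors the one-column-per-output layout described above.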
When you enable the website text output, Datablist removes low-value layout text such as headers, menus, and footers. The result is easier to use in ChatGPT, classification enrichments, or manual review.
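The idea of dropping layout text can be sketched with Python's standard `html.parser` module. This is a simplified assumption about how such cleaning could work, not the way Datablist does it: the sketch skips everything inside a hypothetical set of layout tags (`header`, `nav`, `footer`, and a few others) and keeps the remaining text.

```python
from html.parser import HTMLParser

# Tags treated as low-value layout in this sketch (an assumption).
SKIPPED_TAGS = {"header", "nav", "footer", "aside", "script", "style"}


class BodyTextExtractor(HTMLParser):
    """Collect visible text that is not nested inside a skipped tag."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # how many skipped tags are currently open
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self._chunks.append(text)

    def text(self):
        return " ".join(self._chunks)


def extract_body_text(html):
    """Return page text with header, menu, and footer content removed."""
    parser = BodyTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Real pages often mark menus and footers with `div` elements and class names instead of semantic tags, so a production cleaner needs more heuristics than this.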
Why use this bulk webpage scraper?
Most scraping tools are built for custom projects. This enrichment is built for spreadsheet work. If you already have a CSV with company websites, personal websites, product pages, directories, or landing pages, you can enrich the list directly in Datablist.
Key strengths:
- Low cost at scale: 0.10 credits per URL without proxy, about $1 for 10,000 simple URLs.
- Several outputs in one run: text, emails, phone numbers, LinkedIn links, and Instagram profiles.
- Bulk workflow: process hundreds or thousands of URLs from a CSV or Excel file.
- Optional proxy fallback: retry protected websites with a proxy only when normal scraping fails.
- About and contact page discovery: follow common "About us" and "Contact" links to find more useful data.
- AI-ready text: extract page text and send it to ChatGPT for classification, summarization, or lead scoring.
Step-by-step guide
Step 1: Load your CSV or Excel file on Datablist
Create a free account. Datablist is a CSV editor built for large lists. You can open CSV files, Excel files, and lead lists with thousands of rows.
Then create a new collection and import the file with the URLs you want to scrape.
Step 2: Select the "Smart Scraper" enrichment
Click on the "Enrich" button and search for "Smart Scraper".
Step 3: Configure options and enable proxy if needed
Choose what you want to extract: website text, emails, phone numbers, LinkedIn links, and Instagram profiles.
You can also ask the scraper to follow "About us" and "Contact" links. When enabled, Datablist scans the first page for links pointing to those pages. It uses a list of common About Us paths and link text analysis to find the right page.
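The link discovery described above can be approximated in a few lines of Python. The sketch below is a guess at the general approach, not Datablist's code: the hint lists are hypothetical, and matching is a simple substring check on the link path and the link text.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Hypothetical hints; the real list of common paths is not documented here.
ABOUT_HINTS = ("about", "who-we-are", "our-story")
CONTACT_HINTS = ("contact",)


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs for every <a> tag with an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, " ".join(self._text).strip()))
            self._href = None


def find_page(base_url, html, hints):
    """Return the first link whose path or text matches a hint, else None."""
    collector = LinkCollector()
    collector.feed(html)
    for href, text in collector.links:
        parsed = urlparse(href)
        if parsed.scheme not in ("", "http", "https"):
            continue  # skip mailto:, tel:, javascript: and similar links
        path = parsed.path.lower()
        label = text.lower().replace(" ", "-")
        if any(hint in path or hint in label for hint in hints):
            return urljoin(base_url, href)
    return None
```

Checking the link text as well as the path catches pages such as `/get-in-touch` that are labeled "Contact" in the menu.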
Some websites block scraping or rate-limit requests. With proxy fallback enabled, the Smart Scraper retries through a proxy automatically when a normal request returns an error. This is useful for ecommerce websites, directories, and websites protected by anti-bot rules.
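The fallback logic follows a common pattern: try the cheap request first, and retry through a proxy only if it fails. Here is a minimal Python sketch of that pattern. The function name, the result fields, and the two fetch callables are hypothetical; they stand in for whatever HTTP client you would use, and none of this is Datablist's internal code.

```python
def scrape_with_fallback(url, fetch_direct, fetch_via_proxy, use_proxy_fallback=True):
    """Try a direct fetch first; retry through a proxy only after a failure.

    fetch_direct and fetch_via_proxy are caller-supplied functions that take
    a URL and return HTML, raising an exception on error.
    """
    try:
        html = fetch_direct(url)
        return {"url": url, "html": html, "used_proxy": False, "status": "ok"}
    except Exception as error:
        direct_error = str(error)

    if not use_proxy_fallback:
        return {
            "url": url,
            "html": None,
            "used_proxy": False,
            "status": f"failed: {direct_error}",
        }

    try:
        html = fetch_via_proxy(url)
        return {"url": url, "html": html, "used_proxy": True, "status": "ok"}
    except Exception as error:
        return {
            "url": url,
            "html": None,
            "used_proxy": True,
            "status": f"failed: {error}",
        }
```

The `used_proxy` flag in the result is what makes it possible to bill or count proxy requests separately from normal ones.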
Pricing note: A normal scrape costs 0.10 credits per URL. A proxy scrape costs 0.50 credits per URL. When proxy fallback is enabled, you only pay the proxy price for URLs that need it.
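To budget a run, the pricing above can be turned into a small estimator. The sketch below uses the credit prices from this page and treats "around $1 per 1,000 credits" as an exact rate, which is an approximation. It also assumes a URL that needs the proxy is billed 0.50 credits in total, not 0.50 on top of the normal 0.10; check your credit usage if that detail matters.

```python
from decimal import Decimal

NORMAL_CREDITS = Decimal("0.10")   # per URL, from the pricing note
PROXY_CREDITS = Decimal("0.50")    # per URL that needs the proxy
USD_PER_CREDIT = Decimal("0.001")  # assumes exactly $1 per 1,000 credits


def estimate_cost(total_urls, proxied_urls=0):
    """Return (credits, approximate USD) for a scraping run."""
    if not 0 <= proxied_urls <= total_urls:
        raise ValueError("proxied_urls must be between 0 and total_urls")
    normal_urls = total_urls - proxied_urls
    credits = normal_urls * NORMAL_CREDITS + proxied_urls * PROXY_CREDITS
    return credits, credits * USD_PER_CREDIT


credits, usd = estimate_cost(10_000, proxied_urls=1_000)
print(f"{credits} credits, about ${usd:.2f}")
```

With 10,000 URLs and no proxy use, the estimate is 1,000 credits, or about $1. If 1,000 of those URLs need the proxy, it rises to 1,400 credits, or about $1.40.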
In the advanced settings, you can add ignore terms to exclude unwanted emails, LinkedIn links, or Instagram profiles.
Step 4: Select the column with the webpage URLs
Move to the "Input Property" section and select the column that contains your website or webpage URLs.
Step 5: Define where to store extracted texts, links, and email addresses
Create new properties or map the outputs to existing columns. You can store each result in its own column: emails, phone numbers, LinkedIn profiles, LinkedIn company pages, Instagram profiles, website text, and scraping status.
When Datablist finds several phone numbers, links, or emails, it returns them as a comma-separated list.
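If you later export the collection and process it in a script, the comma-separated cells need to be split back into lists. Here is a short Python sketch of that step. The column names `Website` and `Emails` and the sample export are hypothetical; use the property names from your own collection.

```python
import csv
import io


def split_cell(cell):
    """Split a comma-separated cell into a list of trimmed, non-empty values."""
    return [item.strip() for item in (cell or "").split(",") if item.strip()]


def emails_by_website(csv_text):
    """Map each website in an exported CSV to its list of extracted emails."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["Website"]: split_cell(row["Emails"]) for row in reader}


# Hypothetical export with one multi-value cell and one empty cell.
EXPORT = (
    "Website,Emails\n"
    'https://acme.com,"hello@acme.com, sales@acme.com"\n'
    "https://example.org,\n"
)
```

Stripping whitespace around each item means the code works whether the separator is a bare comma or a comma followed by a space.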
How to use the extracted texts with ChatGPT?
The Smart Scraper can return useful text from the scraped page and from followed pages such as "About us" or "Contact".
Datablist removes repeated layout text and keeps content that gives context about the company, product, service, or person behind the page.
You can use this text as input for ChatGPT prompts. For example, classify websites as B2B or B2C, summarize a company description, detect the industry, score leads, or write a short sales note.
For example, a prompt can segment websites into B2B and B2C companies. Datablist writes the classification back to your collection.
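To make the B2B/B2C example concrete, here is a Python sketch of how such a prompt could be assembled from the extracted text and how the answer could be normalized. The prompt wording, the 4,000-character cap, and the function names are illustrative assumptions; inside Datablist you would type the prompt into the ChatGPT enrichment instead of writing code.

```python
MAX_CHARS = 4000  # hypothetical cap to keep the prompt short


def build_b2b_b2c_prompt(website_text, max_chars=MAX_CHARS):
    """Build a classification prompt from extracted website text."""
    # Collapse whitespace runs, then truncate to the cap.
    snippet = " ".join(website_text.split())[:max_chars]
    return (
        "Classify the company behind this website as B2B or B2C.\n"
        "Answer with exactly one label: B2B or B2C.\n\n"
        f"Website text:\n{snippet}"
    )


def parse_label(answer):
    """Normalize a model answer to B2B, B2C, or Unknown."""
    label = answer.strip().upper()
    return label if label in {"B2B", "B2C"} else "Unknown"
```

Asking for exactly one label keeps the output column clean, and the `Unknown` fallback catches answers that do not follow the instruction.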
Use Cases
Lead generation
Start with a list of company websites and extract contact emails, phone numbers, LinkedIn company pages, LinkedIn profiles, and Instagram profiles. Sales teams can use the results to enrich accounts before outreach.
Recruiting and talent sourcing
Scrape personal websites, agency pages, portfolio pages, or company team pages. Extract LinkedIn profile links and emails, then review the results in Datablist.
Company research
Extract website text from a list of domains, then use ChatGPT to classify each company by industry, audience, location, or business model.
Social profile discovery
Find Instagram profiles and LinkedIn pages from a list of websites. This is useful for influencer research, ecommerce prospecting, local business lists, and brand monitoring.
FAQ
How much does it cost to scrape webpages in bulk?
A normal scrape costs 0.10 credits per URL. 1,000 credits cost around $1, so 10,000 simple URL scrapes cost around $1 in credits. Proxy scraping costs 0.50 credits per URL and is useful when a website blocks normal scraping.
Can it scrape an About page or Contact page?
Yes. Enable the options to follow About and Contact links. Datablist will inspect the first page, find likely About or Contact URLs, scrape those pages, and extract data from them too.
Can I extract text and use it with ChatGPT?
Yes. Enable the website text output, then use the extracted text in a ChatGPT enrichment. This works well for company classification, lead scoring, summaries, and data cleaning.
Does it work with a CSV file?
Yes. Import your CSV or Excel file into Datablist, select the URL column, and run the Smart Scraper enrichment in bulk.
Can I use this for cold email personalization?
Yes. Extract website text from each company website, then send that text to a ChatGPT or classification enrichment. This helps create industry tags, company summaries, lead scores, or first-line personalization at scale.




