An AI scraping prompt is an instruction that tells an AI scraper what to extract from a web page.

Instead of selecting HTML elements, you describe the fields you want:

Extract the company name, address, phone number, website URL, and business category from this directory listing.

The AI reads the page and returns the requested fields.

What a good AI scraping prompt includes

A good prompt defines:

  • The page type
  • The fields to extract
  • The output format
  • What to do when data is missing
  • What not to guess
  • Examples when the page is ambiguous

For example:

Extract data from this product page.

Return:
- Product name
- Current price
- Original price
- Rating
- Number of reviews
- Availability

If a value is not visible on the page, leave it empty. Do not infer prices or ratings.
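
The checklist above can be turned into a reusable template. The sketch below is one possible way to do that in Python. `PromptSpec` and `build_prompt` are hypothetical names invented for this illustration, and the default wording for the output format and rules is an assumption, not a required phrasing.

```python
from dataclasses import dataclass


@dataclass
class PromptSpec:
    """Hypothetical container for the parts of a scraping prompt."""
    page_type: str
    fields: list[str]
    # The defaults below are illustrative wording, not a standard.
    output_format: str = "a JSON object"
    missing_rule: str = "If a value is not visible on the page, leave it empty."
    no_guess_rule: str = "Do not infer values or use external knowledge."


def build_prompt(spec: PromptSpec) -> str:
    """Assemble the prompt: page type, field list, then the rules."""
    field_lines = "\n".join(f"- {name}" for name in spec.fields)
    return (
        f"Extract data from this {spec.page_type}.\n\n"
        f"Return {spec.output_format} with these fields:\n"
        f"{field_lines}\n\n"
        f"{spec.missing_rule} {spec.no_guess_rule}"
    )


product_spec = PromptSpec(
    page_type="product page",
    fields=["Product name", "Current price", "Original price",
            "Rating", "Number of reviews", "Availability"],
)
print(build_prompt(product_spec))
```

Keeping the rules as separate attributes makes it easy to reuse the same missing-data wording across page types and change only the field list.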

Missing data rules

Missing data rules tell the model what to return when a field is not on the page.

Without them, an AI model may guess. In scraping, guessed values are worse than empty values because they look real and are hard to spot later.

Use instructions such as:

  • Leave the field empty if not visible
  • Return "Not found" if the page does not mention it
  • Do not infer from the company name
  • Do not use external knowledge
  • Use only the current page

📌 Empty is better than invented

In a scraping workflow, a blank field is easy to filter out. A guessed value looks valid and can pollute the dataset unnoticed.
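
The same idea can be enforced after extraction. The sketch below is a minimal illustration with two parts: it maps common "missing" markers to an empty string, and it flags values that do not appear verbatim in the page text. The function names and the marker list are assumptions made for this example. The verbatim check is only a rough heuristic: it also flags values the model reformatted, such as dates or prices.

```python
from typing import Optional

# Assumed set of markers a model might return for a missing value.
MISSING_MARKERS = {"", "not found", "n/a", "none", "null", "unknown"}


def normalize_missing(record: dict[str, Optional[str]]) -> dict[str, str]:
    """Map every missing-value marker to an empty string."""
    cleaned = {}
    for key, value in record.items():
        text = (value or "").strip()
        cleaned[key] = "" if text.lower() in MISSING_MARKERS else text
    return cleaned


def ungrounded_fields(record: dict[str, str], page_text: str) -> list[str]:
    """List fields whose non-empty value is not found verbatim in the page text.

    Comparison ignores case and collapses runs of whitespace.
    """
    haystack = " ".join(page_text.lower().split())
    return [
        key
        for key, value in record.items()
        if value and " ".join(value.lower().split()) not in haystack
    ]


page = "Acme Bakery, 12 Main Street. Open daily."
row = normalize_missing(
    {"name": "Acme Bakery", "phone": "Not found", "rating": "4.8"}
)
print(row)                           # phone becomes an empty string
print(ungrounded_fields(row, page))  # rating is flagged: it is not on the page
```

Flagged fields are candidates for review or blanking, not proof of an invented value.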

AI scraping prompt examples

Directory listing:

Extract the business name, category, full address, phone number, website, and opening hours from this listing.

Case study page:

Extract the customer company, industry, product used, problem solved, and measurable result from this case study.

Job post:

Extract the job title, company, location, salary, remote policy, required skills, and seniority level.

Review page:

Extract the reviewer name, rating, review text, review date, and mentioned product issue.
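
One way to keep prompts like these consistent is to store each page type's field list as data, then use it both to write the prompt and to shape the reply into a fixed set of columns. The sketch below assumes the model was asked for a JSON object and replied with one. `FIELDS_BY_PAGE_TYPE`, `prompt_for`, and `row_from_reply` are hypothetical names, and the added output and missing-data sentences are illustrative wording.

```python
import json

# Field lists copied from the example prompts above.
FIELDS_BY_PAGE_TYPE = {
    "directory listing": ["business name", "category", "full address",
                          "phone number", "website", "opening hours"],
    "case study": ["customer company", "industry", "product used",
                   "problem solved", "measurable result"],
    "job post": ["job title", "company", "location", "salary",
                 "remote policy", "required skills", "seniority level"],
    "review page": ["reviewer name", "rating", "review text",
                    "review date", "mentioned product issue"],
}


def prompt_for(page_type: str) -> str:
    """Build a one-paragraph prompt from the stored field list."""
    fields = FIELDS_BY_PAGE_TYPE[page_type]
    return (
        f"Extract the {', '.join(fields[:-1])}, and {fields[-1]} "
        f"from this {page_type}. "
        "Return a JSON object with one key per field. "
        "Leave a field empty if it is not visible on the page."
    )


def _to_text(value) -> str:
    """Turn a JSON value into cell text; null becomes an empty string."""
    if value is None:
        return ""
    if isinstance(value, list):
        return ", ".join(str(item) for item in value)
    return str(value)


def row_from_reply(page_type: str, reply: str) -> dict[str, str]:
    """Parse a JSON object reply into a row with exactly the expected columns.

    Missing keys become empty strings; unexpected keys are dropped.
    Raises json.JSONDecodeError if the reply is not valid JSON.
    """
    data = json.loads(reply)
    return {name: _to_text(data.get(name))
            for name in FIELDS_BY_PAGE_TYPE[page_type]}


print(prompt_for("review page"))
print(row_from_reply("review page", '{"reviewer name": "Dana", "rating": 4}'))
```

Dropping unexpected keys and filling missing ones keeps every row in the same shape, whatever the model adds or omits.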

AI scraping prompts in Datablist

Datablist includes a Website AI Scraper source and an AI Agent enrichment for no-code AI scraping.

Use AI prompts when page layouts vary from site to site or when the extraction depends on meaning, such as identifying the problem solved in a case study. Use selector-based scraping when the page layout is stable.
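
Outside a no-code tool, the two approaches can also be combined. The sketch below is a generic Python illustration, not how Datablist works internally: it tries a fixed selector first (the first `<h1>` on the page, standing in for a CSS selector) and falls back to a prompt only when the selector finds nothing. `ai_extract` is a hypothetical callable that you would supply; it takes a prompt and the page HTML and returns the extracted text.

```python
from html.parser import HTMLParser
from typing import Callable, Optional


class FirstH1(HTMLParser):
    """Collect the text of the first <h1> element on the page."""

    def __init__(self) -> None:
        super().__init__()
        self._inside = False
        self.text: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only the first <h1> is captured; later ones are ignored.
        if tag == "h1" and self.text is None:
            self._inside = True
            self.text = ""

    def handle_data(self, data):
        if self._inside:
            self.text += data

    def handle_endtag(self, tag):
        if tag == "h1":
            self._inside = False


def extract_name(html: str, ai_extract: Callable[[str, str], str]) -> str:
    """Use the selector when it matches; otherwise fall back to the prompt."""
    parser = FirstH1()
    parser.feed(html)
    name = (parser.text or "").strip()
    if name:
        return name
    prompt = ("Extract the product name from this page. "
              "Leave it empty if it is not visible.")
    return ai_extract(prompt, html)
```

On stable layouts the selector path handles every page and the model is never called. The fallback only runs on pages where the expected element is missing.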

For examples, read the guides on AI web scraping, scraping a directory, and AI data extraction.