Quick Start

Introduction

trueparse is a modern web scraping API that extracts clean, structured data from any website. Unlike traditional scrapers that break when websites change, trueparse uses AI-powered parsing to reliably extract content in multiple formats: HTML, Markdown, images, and custom JSON schemas.

Scrape

Scrape any website and parse the result into HTML, Markdown, or images.

Crawl

Crawl a website and extract all links, images, and other assets.

Extract data

Extract custom, guaranteed JSON data using natural language prompts or schemas.

Schedules

Schedule regular scrapes and crawls and receive data updates via webhooks.

Capabilities

Dynamic Content Rendering: Fully managed virtual browsers that render dynamic JavaScript content
Universal Parsing: Effortlessly parse websites, PDFs, images and more into clean structured data
AI-Powered Data Extraction: Extract JSON data using natural language or custom schemas, with no manual HTML selectors to maintain
Advanced Stealth Mode: Stay undetected with world-class anti-bot software and automatic proxy management
Parallel Web Crawling: Crawl entire websites and scrape multiple pages in parallel for maximum performance
Schedules & Webhooks: Automate your scraping tasks with schedules and deliver data directly via webhooks

Getting Started

1. Get Your API Key

First, you'll need an API key to authenticate your requests:

Sign up for a trueparse account
Navigate to the API Keys page in your dashboard
Click "Create New Key" and give it a descriptive name
Copy your API key and keep it secure; do not share it with anyone
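One common way to keep the key out of source code is to read it from an environment variable. The sketch below assumes a variable named TRUEPARSE_API_KEY; that name and both helper functions are illustrative and not part of trueparse. The Bearer header format matches the request examples on this page.

```python
import os


def load_api_key(env_var="TRUEPARSE_API_KEY"):
    """Return the API key stored in an environment variable.

    The variable name is an assumption made for this sketch, not an
    official trueparse convention.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return key


def auth_headers(api_key):
    """Build the request headers used by the examples on this page."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

With the variable exported in your shell, auth_headers(load_api_key()) produces the headers for a request without the key ever appearing in a file you might commit.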

2. Test in the Playground

Before writing code, try the API playground to:

Test different websites and see the extracted data
Experiment with output formats (HTML, Markdown, images)
Generate code snippets for your preferred language
Test data extraction with prompts or custom schemas

Your First API Call

Here's how to make your first request to extract content from a webpage. Replace <API_KEY> with your actual API key from the dashboard.

curl -X POST https://api.trueparse.com/v0/scrape \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "outputs": ["markdown", "html"]
  }'

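import requests

# Replace <API_KEY> with your key from the dashboard.
url = "https://api.trueparse.com/v0/scrape"
headers = {
    "Authorization": "Bearer <API_KEY>",
    "Content-Type": "application/json"
}
# Page to scrape and the output formats to return.
data = {
    "url": "https://example.com",
    "outputs": ["markdown", "html"]
}

# json= serializes the payload; the timeout keeps a stalled request from hanging.
response = requests.post(url, headers=headers, json=data, timeout=60)
response.raise_for_status()  # raise on 4xx/5xx responses
result = response.json()
print(result)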

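const url = "https://api.trueparse.com/v0/scrape";
// Page to scrape and the output formats to return.
const data = {
    url: "https://example.com",
    outputs: ["markdown", "html"]
};

fetch(url, {
    method: "POST",
    headers: {
        "Authorization": "Bearer <API_KEY>",
        "Content-Type": "application/json"
    },
    body: JSON.stringify(data)
})
.then(response => {
    // fetch resolves on HTTP errors too, so check the status explicitly.
    if (!response.ok) {
        throw new Error(`Request failed with status ${response.status}`);
    }
    return response.json();
})
.then(result => console.log(result))
.catch(error => console.error(error));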
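If you expect to call the endpoint more than once, the same request can be wrapped in a small helper. The following is a minimal sketch using only the Python standard library; the function names build_scrape_request and scrape, and the 60-second default timeout, are choices made for this example rather than part of any trueparse SDK. The endpoint, headers, and payload fields are the ones shown in the examples above.

```python
import json
import urllib.request

# Endpoint taken from the examples above.
SCRAPE_URL = "https://api.trueparse.com/v0/scrape"


def build_scrape_request(api_key, page_url, outputs=("markdown", "html")):
    """Build, but do not send, the POST request from the examples above."""
    payload = {"url": page_url, "outputs": list(outputs)}
    return urllib.request.Request(
        SCRAPE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def scrape(api_key, page_url, outputs=("markdown", "html"), timeout=60):
    """Send the request and return the decoded JSON body.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    The timeout value is an arbitrary choice for this sketch.
    """
    request = build_scrape_request(api_key, page_url, outputs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling scrape("<API_KEY>", "https://example.com") sends the same request as the examples above. Building the request separately from sending it lets you inspect or log the request without making a network call.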

Next Steps

Ready to dive deeper? Check out these guides:

Scraping - Scrape and extract data
Crawling - Crawl entire websites
Scheduling - Set up automated, recurring scrapes