Getting Started with trueparse

Introduction

trueparse is a web scraping API that extracts clean, structured data from websites. Unlike traditional scrapers that break when a site's markup changes, trueparse uses AI-powered parsing to extract content in several formats: HTML, Markdown, images, and JSON that follows a custom schema.

Scrape

Scrape data from any website and parse it into HTML, Markdown, and images.

Crawl

Crawl a website and extract all links, images, and other assets.

Extract data

Extract custom JSON data with a guaranteed structure using natural language prompts or schemas.

Schedules

Schedule regular scrapes and crawls and receive data updates via webhooks.

Capabilities

  • Dynamic Content Rendering: Fully managed virtual browsers that render dynamic JavaScript content
  • Universal Parsing: Effortlessly parse websites, PDFs, images and more into clean structured data
  • AI-Powered Data Extraction: Extract perfect JSON data using natural language or custom schemas. Ditch manual HTML selectors
  • Advanced Stealth Mode: Stay undetected with anti-bot evasion and automatic proxy management
  • Parallel Web Crawling: Crawl entire websites and scrape multiple pages in parallel for maximum performance
  • Schedules & Webhooks: Automate your scraping tasks with schedules and deliver data directly via webhooks

Getting Started

1. Get Your API Key

First, you'll need an API key to authenticate your requests:

  1. Sign up for a trueparse account
  2. Navigate to the API Keys page in your dashboard
  3. Click "Create New Key" and give it a descriptive name
  4. Copy your API key and store it securely. Do not share it with anyone.
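
One common way to keep the key out of source code is to read it from an environment variable. The sketch below assumes a variable named TRUEPARSE_API_KEY (a hypothetical name chosen for this example, not one the API requires) and a hypothetical helper, build_headers, that produces the same headers used in the request examples in this guide.

```python
import os


def build_headers(env_var: str = "TRUEPARSE_API_KEY") -> dict:
    """Build request headers from an API key stored in the environment.

    The variable name is an assumption for this example; any name works.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        # Fail early with a clear message instead of sending an empty token.
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Keeping the key in the environment means it never lands in version control, and the same code can run unchanged with different keys in development and production.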

2. Test in the Playground

Before writing code, try the API playground to:

  • Test different websites and see the extracted data
  • Experiment with output formats (HTML, Markdown, images)
  • Generate code snippets for your preferred language
  • Test data extraction with prompts or custom schemas

Your First API Call

Here's how to make your first request to extract content from a webpage. Replace <API_KEY> with your actual API key from the dashboard.

cURL

curl -X POST https://api.trueparse.com/v0/scrape \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "outputs": ["markdown", "html"]
  }'

Python

import requests

url = "https://api.trueparse.com/v0/scrape"
headers = {
    "Authorization": "Bearer <API_KEY>",
    "Content-Type": "application/json"
}
data = {
    "url": "https://example.com",
    "outputs": ["markdown", "html"]
}

# Send the request and raise an exception on 4xx/5xx responses
response = requests.post(url, headers=headers, json=data, timeout=30)
response.raise_for_status()
result = response.json()
print(result)

JavaScript

const url = "https://api.trueparse.com/v0/scrape";
const data = {
    url: "https://example.com",
    outputs: ["markdown", "html"]
};

fetch(url, {
    method: "POST",
    headers: {
        "Authorization": "Bearer <API_KEY>",
        "Content-Type": "application/json"
    },
    body: JSON.stringify(data)
})
    .then(response => {
        // fetch only rejects on network errors, so check the HTTP status explicitly
        if (!response.ok) {
            throw new Error(`Request failed with status ${response.status}`);
        }
        return response.json();
    })
    .then(result => console.log(result))
    .catch(error => console.error(error));
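
For anything beyond a quick test, it can help to wrap the call in a small function with a timeout and explicit error handling. The following is a minimal sketch using only the Python standard library. The endpoint, headers, and request body are taken from the examples above; the helper names (build_scrape_request, scrape) and the 30-second timeout are assumptions made for illustration. The shape of the response is not described on this page, so the sketch returns the decoded JSON as-is.

```python
import json
import urllib.error
import urllib.request

SCRAPE_ENDPOINT = "https://api.trueparse.com/v0/scrape"


def build_scrape_request(api_key, target_url, outputs=("markdown", "html")):
    """Build (but do not send) the POST request shown in the examples above."""
    body = json.dumps({"url": target_url, "outputs": list(outputs)}).encode("utf-8")
    return urllib.request.Request(
        SCRAPE_ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def scrape(api_key, target_url, outputs=("markdown", "html"), timeout=30):
    """Send the request and return the decoded JSON response.

    Raises RuntimeError on HTTP errors or network failures.
    """
    request = build_scrape_request(api_key, target_url, outputs)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as exc:
        # HTTPError must be caught before URLError, which is its parent class.
        detail = exc.read().decode("utf-8", errors="replace")
        raise RuntimeError(f"Scrape failed with HTTP {exc.code}: {detail}") from exc
    except urllib.error.URLError as exc:
        raise RuntimeError(f"Could not reach the API: {exc.reason}") from exc
```

Calling scrape("<API_KEY>", "https://example.com") would send the same request as the examples above. Splitting request construction from sending keeps the payload easy to inspect or test without making a network call.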

Next Steps

Ready to dive deeper? Check out these guides: