Extract structured data from any website automatically with ZenRows’ Autoparse feature. Instead of parsing HTML manually, Autoparse uses advanced algorithms to identify and extract key data points in clean JSON format.

How Autoparse works

Autoparse analyzes the HTML structure and content of web pages to identify meaningful data patterns. The algorithms recognize common website elements such as titles, prices, descriptions, images, dates, and other structured information using semantic markup, CSS classes, and content positioning. This process extracts content such as:
  • Product details (title, price, description, images, ratings)
  • Article content (headline, author, publication date, body text)
  • Job listings (title, company, location, salary, requirements)
  • Contact information (names, addresses, phone numbers, emails)
  • Event details (title, date, location, description)
The feature automatically adapts to different website structures and layouts, making it particularly useful for scraping multiple sites with varying designs or for quickly prototyping data extraction workflows.
Remember that Autoparse is an automatic feature designed for general-purpose extraction. It may not capture all fields on every website. Always run test requests to verify that all required data is present before deploying to production.

Basic usage

Enable Autoparse by adding the autoparse=true parameter to your ZenRows request:
# pip install requests
import requests

url = 'https://www.scrapingcourse.com/ecommerce/'
apikey = 'YOUR_ZENROWS_API_KEY'
params = {
    'url': url,
    'apikey': apikey,
    'autoparse': 'true',
}
response = requests.get('https://api.zenrows.com/v1/', params=params)
print(response.text)
This example extracts structured data from the Scraping Course eCommerce page. Instead of receiving raw HTML, you get organized JSON data containing key information automatically identified by Autoparse.
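Because the Autoparse output depends on the page, it can help to decode the response defensively and check for the fields you rely on. The following is a minimal sketch rather than part of the ZenRows API: the helper names `parse_autoparse_body` and `missing_fields` are hypothetical, and the field names used with them (such as `title` and `price`) are placeholders, since the actual keys vary by site.

```python
import json


def parse_autoparse_body(body):
    """Decode an Autoparse response body, returning None if it is not valid JSON."""
    try:
        return json.loads(body)
    except (TypeError, ValueError):
        # Not JSON: for example an HTML error page or an empty body.
        return None


def missing_fields(data, required):
    """Return the required keys that are absent or empty in the parsed data.

    Accepts a single JSON object or a list of objects. For a list, a key
    counts as missing if any item lacks it.
    """
    if isinstance(data, dict):
        items = [data]
    elif isinstance(data, list):
        items = [item for item in data if isinstance(item, dict)]
    else:
        items = []
    if not items:
        return list(required)
    return [
        key for key in required
        if any(item.get(key) in (None, '', [], {}) for item in items)
    ]
```

With the request above, this could be used as `data = parse_autoparse_body(response.text)` followed by `missing_fields(data, ['title', 'price'])`; an empty list means every placeholder field was found.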

When to use Autoparse

Content extraction needs:
  • E-commerce scraping - Product catalogs, pricing data, reviews, and specifications
  • News and media - Article content, headlines, author information, and publication dates
  • Job board aggregation - Job listings, company details, requirements, and salary information
  • Real estate data - Property listings, prices, descriptions, and location details
  • Event information - Event names, dates, venues, and descriptions
Development scenarios:
  • Rapid prototyping - Quick data extraction without writing custom parsers
  • Multi-site scraping - Extracting similar data from different website layouts
  • Unknown site structures - When you need to explore what data is available
  • Proof of concept projects - Testing data availability before building custom solutions
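For the multi-site scenario, the same Autoparse parameters can be reused across several URLs. The sketch below assumes the endpoint and parameters from the basic usage example; `build_params`, `scrape_sites`, and the `fetch` callable are hypothetical names introduced here, and `fetch` is passed in so the loop can be exercised without network access.

```python
API_ENDPOINT = 'https://api.zenrows.com/v1/'  # endpoint from the basic usage example


def build_params(url, apikey, **extra):
    """Build the query parameters for one Autoparse request."""
    params = {'url': url, 'apikey': apikey, 'autoparse': 'true'}
    params.update(extra)  # optional extras, e.g. js_render='true'
    return params


def scrape_sites(urls, apikey, fetch, **extra):
    """Fetch each URL with Autoparse enabled.

    `fetch(endpoint, params)` must return the response body as text.
    Returns (results, errors), both keyed by target URL, so a failure
    on one site does not stop the others.
    """
    results, errors = {}, {}
    for url in urls:
        try:
            results[url] = fetch(API_ENDPOINT, build_params(url, apikey, **extra))
        except Exception as exc:  # record the failure and continue
            errors[url] = exc
    return results, errors
```

With the `requests` library, `fetch` could be as simple as `lambda endpoint, params: requests.get(endpoint, params=params, timeout=60).text`.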
For dynamic content that loads via JavaScript, combine Autoparse with JavaScript Rendering:
# pip install requests
import requests

url = 'https://www.scrapingcourse.com/ecommerce/'
apikey = 'YOUR_ZENROWS_API_KEY'
params = {
    'url': url,
    'apikey': apikey,
    'autoparse': 'true',
    'js_render': 'true',
}
response = requests.get('https://api.zenrows.com/v1/', params=params)
print(response.text)
This combination ensures that dynamically loaded content is available for parsing while still providing structured JSON output. For more information about JavaScript Rendering, see the JavaScript Rendering documentation.

Comparing extraction methods

| Method | Best for | Pros | Cons |
| --- | --- | --- | --- |
| Autoparse | Quick extraction, multiple sites, prototyping | No coding required, works across sites, JSON output | Less control, may miss specific fields |
| CSS Extractor | Specific data, single site, custom requirements | Full control, precise targeting, efficient | Requires HTML knowledge, site-specific |
| Custom Parsing | Complex logic, data transformation | Maximum flexibility, custom processing | Time-intensive, maintenance overhead |

Troubleshooting

Common issues and solutions

| Issue | Cause | Solution |
| --- | --- | --- |
| Missing expected data | Content not in standard format | Contact support for analysis or switch to custom parsing |
| Empty or incomplete extraction | JavaScript-loaded content | Add js_render=true and wait parameters |
| Page blocked or captcha | Site protection systems | Combine js_render=true + premium_proxy=true |
| Unexpected data structure | Site uses non-standard markup | Test with manual CSS Extractor instead of Autoparse |

Improving extraction accuracy

When Autoparse doesn’t capture all the data you need:
1. Test if the content loads dynamically

# For empty or missing content, enable JavaScript rendering
params = {
    'autoparse': 'true',
    'js_render': 'true',
    'wait': '3000',  # Wait for content to load
}
2. Bypass protection if the site is blocked

# For blocked pages or captchas
params = {
    'autoparse': 'true',
    'js_render': 'true',
    'premium_proxy': 'true',
}
3. Wait for specific elements to appear

# Wait for specific elements to appear
params = {
    'autoparse': 'true',
    'js_render': 'true',
    'wait_for': '.product-price',
}
4. Contact support or switch to manual parsing

If Autoparse consistently misses the specific fields you need, contact ZenRows support for analysis, or consider switching to the manual CSS Extractor for precise control.
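The parameter combinations in this section can also be tried in order of increasing cost. The sketch below is one hypothetical way to automate that: `ESCALATION_STEPS`, `fetch_with_escalation`, and the injected `fetch` callable are names invented for this example, and a result is treated as "empty" when the body is not valid JSON or decodes to an empty value. Only parameters already shown in this section are used.

```python
import json

# Parameter sets in escalation order, taken from the steps in this section.
ESCALATION_STEPS = [
    {'autoparse': 'true'},
    {'autoparse': 'true', 'js_render': 'true', 'wait': '3000'},
    {'autoparse': 'true', 'js_render': 'true', 'premium_proxy': 'true'},
]


def fetch_with_escalation(url, apikey, fetch, steps=ESCALATION_STEPS):
    """Try each parameter set until one yields non-empty JSON.

    `fetch(params)` must return the response body as text.
    Returns (data, params_used), or (None, None) if every step fails.
    """
    for step in steps:
        params = {'url': url, 'apikey': apikey, **step}
        try:
            data = json.loads(fetch(params))
        except (TypeError, ValueError):
            continue  # body was not JSON, for example a block page
        if data:  # skip empty objects and lists
            return data, params
    return None, None
```

Starting with the plain Autoparse request keeps the cheaper configuration as the default, since JavaScript rendering and premium proxies are billed as extras.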

Pricing

The autoparse=true parameter is included at no additional cost with all ZenRows requests. You only pay extra for JavaScript Rendering and Premium Proxy when you use them.
You can monitor your ZenRows usage in multiple ways to stay informed about your account activity and prevent unexpected overages.
Dashboard monitoring: View real-time usage statistics, remaining requests, success rates, and request history on your Analytics Page. You can also set up usage alerts in your notification settings to receive notifications when you approach your limits.
Programmatic monitoring: For automated monitoring in your applications, call the /v1/subscriptions/self/details endpoint with your API key in the X-API-Key header. This returns real-time usage data that you can integrate into your monitoring systems. Learn more about the usage endpoint.
Response header monitoring: Track your concurrency usage through response headers included with each request:
  • Concurrency-Limit: Your maximum concurrent requests
  • Concurrency-Remaining: Available concurrent request slots
  • X-Request-Cost: Cost of the current request
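The programmatic and header-based options might look like the following sketch. The host `https://api.zenrows.com` for the usage endpoint is an assumption carried over from the scraping endpoint used earlier, and the helper names are hypothetical. Header values are converted to numbers on the assumption that they are numeric; anything else is returned unchanged.

```python
# Host is an assumption; the path comes from the text above.
USAGE_URL = 'https://api.zenrows.com/v1/subscriptions/self/details'


def usage_request(apikey):
    """Return the URL and headers for a call to the usage endpoint."""
    return USAGE_URL, {'X-API-Key': apikey}


def _to_number(value):
    """Convert a header value to int or float when possible."""
    for cast in (int, float):
        try:
            return cast(value)
        except (TypeError, ValueError):
            continue
    return value


def read_usage_headers(headers):
    """Extract the concurrency and cost headers from a response.

    Lookup is case-insensitive; headers that are absent map to None.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        'concurrency_limit': _to_number(lowered.get('concurrency-limit')),
        'concurrency_remaining': _to_number(lowered.get('concurrency-remaining')),
        'request_cost': _to_number(lowered.get('x-request-cost')),
    }
```

With `requests`, the first helper can be used as `url, headers = usage_request(apikey)` followed by `requests.get(url, headers=headers)`, and the second as `read_usage_headers(response.headers)` on any scraping response.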

Frequently Asked Questions (FAQ)

Autoparse works best with structured content sites like e-commerce stores, news websites, job boards, real estate listings, and social media platforms. Sites with clear content hierarchy and semantic markup provide the most accurate results.
Autoparse works with all other ZenRows features except other output features such as JSON Response or Markdown Response.
If Autoparse misses specific data points, contact ZenRows support for analysis or consider switching to the manual CSS Extractor for precise control.
Autoparse processes whatever HTML is available. For JavaScript-heavy sites, combine it with js_render=true to ensure dynamic content is loaded before parsing. This combination provides comprehensive extraction for modern web applications.