Learn how to extract data from any website using ZenRows’ Universal Scraper API. This guide walks you through creating your first scraping request that can handle sites at any scale.
ZenRows’ Universal Scraper API is designed to simplify web scraping. Whether you’re dealing with static content or dynamic JavaScript-heavy sites, you can get started in minutes with any programming language that supports HTTP requests.
Before diving in, ensure you have the proper development environment and required HTTP client libraries for your preferred programming language. ZenRows works with any language that can make HTTP requests.
Python 3 is recommended, preferably the latest version. Consider using an IDE like PyCharm or Visual Studio Code with the Python extension.
If you need help setting up your environment, check out our detailed Python web scraping setup guide.
Sign up for a free ZenRows account and get your API key from the Builder dashboard. You’ll need this key to authenticate your requests.
Start with a simple request to understand how ZenRows works. We’ll use the HTTPBin.io/get endpoint to demonstrate how ZenRows processes requests and returns data.
Replace YOUR_ZENROWS_API_KEY with your actual API key and run the script:
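A minimal Python sketch of this first request, assuming the requests library and the api.zenrows.com/v1/ endpoint with url and apikey query parameters (the guard around the network call is an illustrative addition so the script is safe to run before you add a real key):

```python
import requests

# Placeholder: substitute your real ZenRows API key before running.
APIKEY = "YOUR_ZENROWS_API_KEY"
API_URL = "https://api.zenrows.com/v1/"

def build_params(target_url: str, apikey: str) -> dict:
    """Assemble the query parameters: the target URL and your API key."""
    return {"url": target_url, "apikey": apikey}

params = build_params("https://httpbin.io/get", APIKEY)

# Only fire the request once a real key is in place.
if APIKEY != "YOUR_ZENROWS_API_KEY":
    response = requests.get(API_URL, params=params)
    print(response.text)
```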
The script will print the contents of the website; for HTTPBin.io/get, it's something similar to this:
Perfect! You’ve just made your first web scraping request with ZenRows.
Modern websites often use JavaScript to load content dynamically and employ sophisticated anti-bot protection. ZenRows provides powerful features to handle these challenges automatically.
Use the Request Builder in your ZenRows dashboard to easily configure and test different parameters. Enter the target URL (for this demonstration, the Anti-bot Challenge page) in the URL to Scrape field to get started.
Premium Proxies provide access to over 55 million residential IP addresses across 190+ countries with 99.9% uptime, allowing you to bypass sophisticated anti-bot protection.
JavaScript Rendering uses a real browser to execute JavaScript and capture the fully rendered page. This is essential for modern web applications, single-page applications (SPAs), and sites that load content dynamically.
For the most protected sites, enable both JavaScript Rendering and Premium Proxies. This provides the highest success rate for challenging targets.
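As a sketch in Python, enabling both features adds two query parameters to the request (the target URL below is an assumed example of the Anti-bot Challenge page; the boolean flags are passed as strings in the query):

```python
import requests

APIKEY = "YOUR_ZENROWS_API_KEY"  # placeholder: your real key
# Assumed example target for the Anti-bot Challenge page.
TARGET_URL = "https://www.scrapingcourse.com/antibot-challenge"

def build_params(target_url: str, apikey: str) -> dict:
    """Query parameters with JavaScript rendering and premium proxies enabled."""
    return {
        "url": target_url,
        "apikey": apikey,
        "js_render": "true",      # render the page in a real browser
        "premium_proxy": "true",  # route through residential IPs
    }

params = build_params(TARGET_URL, APIKEY)

# Only send the request once a real key is in place.
if APIKEY != "YOUR_ZENROWS_API_KEY":
    response = requests.get("https://api.zenrows.com/v1/", params=params)
    print(response.text)
```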
This code sends a GET request to the ZenRows API endpoint with your target URL and authentication. The js_render parameter enables JavaScript processing, while premium_proxy routes your request through residential IP addresses.
Execute your script to test the scraping functionality and verify that your setup works correctly.
Run the script, and ZenRows will handle the heavy lifting by rendering the page’s JavaScript and routing your request through premium residential proxies. The response will contain the entire HTML content of the page:
Congratulations! You now have a ZenRows integration that can scrape websites at any scale while bypassing anti-bot protection. You’re ready to tackle more advanced scenarios and customize the API to fit your scraping needs.
Request failures can happen for various reasons. While some issues can be resolved by adjusting ZenRows parameters, others are beyond your control, such as the target server being temporarily down.
Below are some quick troubleshooting steps you can take:
Check the Error Code and Error Message
When faced with an error, first check the error code and message for clues about the cause. The most common error codes are:
401 Unauthorized
Your API key is missing, incorrect, or improperly formatted. Double-check that you are sending the correct API key in your request headers.
429 Too Many Requests
You have exceeded your concurrency limit. Wait for ongoing requests to finish before sending new ones, or consider upgrading your plan for higher limits.
413 Content Too Large
The response size exceeds your plan’s limit. Use CSS selectors to extract only the needed data, reducing the response size.
422 Unprocessable Entity
Your request contains invalid parameter values, or anti-bot protection is blocking access. Review the API documentation to ensure all parameters are correct and supported.
Check if the Site is Publicly Accessible
Some websites require a session, so it's a good idea to verify whether the site can be accessed without logging in. Open the target page in an incognito browser window to check this.
You must handle session management in your requests if login credentials are required. You can learn how to scrape a website that requires login in our guide: Web scraping with login in Python.
Verify the Site is Accessible in Your Country
Sometimes, the target site may be region-restricted and only accessible from specific locations. ZenRows automatically selects the best proxy, but if the site is only available in certain regions, specify a geolocation using proxy_country.
Here’s how to choose a proxy in the US:
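A Python sketch, assuming proxy_country takes a two-letter country code and is used together with premium_proxy (the target URL is a placeholder):

```python
import requests

APIKEY = "YOUR_ZENROWS_API_KEY"  # placeholder: your real key

params = {
    "url": "https://httpbin.io/get",  # placeholder target
    "apikey": APIKEY,
    "premium_proxy": "true",   # geotargeting uses premium proxies
    "proxy_country": "us",     # two-letter country code
}

# Only send the request once a real key is in place.
if APIKEY != "YOUR_ZENROWS_API_KEY":
    response = requests.get("https://api.zenrows.com/v1/", params=params)
    print(response.text)
```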
If the target site requires access from a specific region, adding the proxy_country parameter will help.
Add Pauses to Your Request
You can also enhance your request by adding options like wait or wait_for to ensure the page fully loads before extracting data, improving accuracy.
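As a sketch, wait takes a fixed delay in milliseconds and wait_for waits for a CSS selector to appear; both assume js_render is enabled, and the selector below is hypothetical:

```python
import requests

APIKEY = "YOUR_ZENROWS_API_KEY"  # placeholder: your real key

params = {
    "url": "https://example.com",  # placeholder target
    "apikey": APIKEY,
    "js_render": "true",           # required for the wait options
    "wait": "3000",                # fixed pause of 3000 ms after load
    "wait_for": ".product-list",   # hypothetical selector to wait for
}

# Only send the request once a real key is in place.
if APIKEY != "YOUR_ZENROWS_API_KEY":
    response = requests.get("https://api.zenrows.com/v1/", params=params)
    print(response.text)
```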
Retry the Request
Network issues or temporary failures can cause your request to fail. Implementing retry logic can solve this by automatically repeating the request.
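A minimal retry sketch in Python with exponential backoff — a generic pattern, not a ZenRows-specific feature:

```python
import time

import requests

APIKEY = "YOUR_ZENROWS_API_KEY"  # placeholder: your real key
API_URL = "https://api.zenrows.com/v1/"

def backoff_delay(attempt: int, base: float = 2.0) -> float:
    """Seconds to sleep before retry number `attempt` (2s, 4s, 8s, ...)."""
    return base ** attempt

def fetch_with_retries(target_url: str, max_retries: int = 3) -> str:
    """Repeat the request on transient failures, raising after the last attempt."""
    params = {"url": target_url, "apikey": APIKEY}
    for attempt in range(1, max_retries + 1):
        try:
            response = requests.get(API_URL, params=params, timeout=60)
            response.raise_for_status()  # treat 4xx/5xx as a retryable failure
            return response.text
        except requests.RequestException:
            if attempt == max_retries:
                raise  # out of attempts: surface the last error
            time.sleep(backoff_delay(attempt))
```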
Get Help From ZenRows Experts
Our support team can assist you if the issue persists despite following these tips. Use the Builder page or contact us via email to get personalized help from ZenRows experts.
For more solutions and detailed troubleshooting steps, see our Troubleshooting Guides.
You now have a solid foundation for web scraping with ZenRows. Here are some recommended next steps to take your scraping to the next level:
How can I bypass Cloudflare and other protections?
To successfully bypass Cloudflare or similar security mechanisms, you'll need to enable both js_render and premium_proxy in your requests. These features simulate a full browser environment and use high-quality residential proxies to avoid detection.
You can also enhance your request by adding options like wait or wait_for to ensure the page fully loads before extracting data, improving accuracy.
How can I ensure my requests don't fail?
You can configure retry logic to handle failed HTTP requests. Learn more in our guide on retrying requests.
How do I extract specific content from a page?
You can use the css_extractor parameter to directly extract content from a page using CSS selectors. Find out more in our tutorial on data parsing.
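A sketch assuming css_extractor accepts a JSON mapping of output names to CSS selectors; the selectors here are hypothetical examples:

```python
import json

import requests

APIKEY = "YOUR_ZENROWS_API_KEY"  # placeholder: your real key

# Hypothetical selectors: extract the page title and all link elements.
extractor = json.dumps({"title": "h1", "links": "a"})

params = {
    "url": "https://example.com",  # placeholder target
    "apikey": APIKEY,
    "css_extractor": extractor,
}

# Only send the request once a real key is in place.
if APIKEY != "YOUR_ZENROWS_API_KEY":
    response = requests.get("https://api.zenrows.com/v1/", params=params)
    # The response is expected to contain the extracted fields rather than raw HTML.
    print(response.text)
```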
Can I integrate ZenRows with Python's Requests and BeautifulSoup?
Yes! You can use ZenRows alongside Python Requests and BeautifulSoup for HTML parsing. Learn how in our guide on Python Requests and BeautifulSoup integration.
Can I integrate ZenRows with Node.js and Cheerio?
Yes! You can integrate ZenRows with Node.js and Cheerio for efficient HTML parsing and web scraping. Check out our guide to learn how to combine these tools: Node.js and Cheerio integration.
How can I simulate user interactions on the target page?
Use the js_render and js_instructions features to simulate actions such as clicking buttons or filling out forms. Discover more about interacting with web pages in our JavaScript instructions guide.
How can I scrape faster using ZenRows?
You can scrape multiple URLs simultaneously by making concurrent API calls. Check out our guide on using concurrency to boost your scraping speed.
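For example, a small thread pool can issue several API calls in parallel; the URLs below are placeholders, and the worker count should stay within your plan's concurrency limit to avoid 429 errors:

```python
from concurrent.futures import ThreadPoolExecutor

import requests

APIKEY = "YOUR_ZENROWS_API_KEY"  # placeholder: your real key
API_URL = "https://api.zenrows.com/v1/"

# Placeholder target URLs.
URLS = [f"https://httpbin.io/get?page={n}" for n in range(1, 4)]

def build_params(target_url: str) -> dict:
    """Query parameters for a single ZenRows request."""
    return {"url": target_url, "apikey": APIKEY}

def scrape(target_url: str) -> str:
    response = requests.get(API_URL, params=build_params(target_url), timeout=60)
    return response.text

# Only send the requests once a real key is in place.
if APIKEY != "YOUR_ZENROWS_API_KEY":
    # max_workers should not exceed your plan's concurrency limit.
    with ThreadPoolExecutor(max_workers=3) as pool:
        for body in pool.map(scrape, URLS):
            print(len(body))
```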