Want to scrape the web effortlessly using Python? With ZenRows, you can handle complex scraping tasks, bypass anti-bot measures, and extract data from almost any website. This tutorial will guide you step-by-step in setting up a ZenRows scraper with Python.

How to Use ZenRows with Python

Before we dive in, ensure you have Python 3 installed. If you’re new to Python or web scraping, using an IDE such as PyCharm or Visual Studio Code with the Python extension is recommended for a smoother experience.

We’ll create a Python script named scraper.py inside a /scraper directory. If you need help setting up your environment, check out our Python web scraping guide for detailed instructions on preparing everything.

Install Python’s requests Library

To interact with the ZenRows API, you can use Python’s requests library. It’s a widely used library for making HTTP requests that simplifies sending requests and handling responses, making it a great fit for integrating with web services like ZenRows.

This approach lets you manage your API requests directly, giving you greater control over the web scraping process.
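For instance, when you pass a params dict to requests, it URL-encodes the values into the query string for you. Conceptually, a call to the ZenRows API is just a GET to a URL like the one below (a sketch using only the standard library; the API key is a placeholder):

```python
from urllib.parse import urlencode

# Placeholder values for illustration; requests builds this
# query string for you when you pass a params dict.
params = {
    'url': 'https://httpbin.io/get',
    'apikey': 'YOUR_ZENROWS_API_KEY',
}
api_url = 'https://api.zenrows.com/v1/?' + urlencode(params)
print(api_url)
# https://api.zenrows.com/v1/?url=https%3A%2F%2Fhttpbin.io%2Fget&apikey=YOUR_ZENROWS_API_KEY
```

Note how the target URL is percent-encoded so it can safely travel as a query parameter.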

To install the requests library, use the following command in your terminal:

pip install requests

This installs the library you’ll use to send HTTP requests from your Python scripts.

Make Your First Request

In this step, you will send your first request to ZenRows using the requests library to scrape content from a simple URL. We’ll use the HTTPBin.io/get endpoint to see how ZenRows processes the request and returns the data.

Here’s an example:

scraper.py
# pip install requests
import requests

url = 'https://httpbin.io/get'
apikey = 'YOUR_ZENROWS_API_KEY'
params = {
    'url': url,
    'apikey': apikey,
}
response = requests.get('https://api.zenrows.com/v1/', params=params)
print(response.text)

Replace YOUR_ZENROWS_API_KEY with your actual API key and run the script:

python scraper.py

It’ll print something similar to this:

{
    "args": {},
    "headers": {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36",
        // additional headers omitted for brevity...
    },
    "origin": "38.154.5.224:6693",
    "url": "http://httpbin.io/get"
}

The response includes useful information, like the origin, which shows the IP address from which the request was made. ZenRows automatically rotates your IP address and changes the User-Agent for each request, helping to maintain anonymity and avoid blocks.
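Since the endpoint returns JSON, you can also inspect the rotated IP and User-Agent programmatically. In a real script you’d call `response.json()`; the sketch below runs the same logic on a sample payload shaped like the response above:

```python
import json

# Sample body shaped like the HTTPBin response above
body = '''{
    "args": {},
    "headers": {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"},
    "origin": "38.154.5.224:6693",
    "url": "http://httpbin.io/get"
}'''

data = json.loads(body)       # with requests, use data = response.json()
ip = data['origin'].split(':')[0]  # strip the port from "ip:port"
print(ip)                          # 38.154.5.224
print(data['headers']['User-Agent'])
```

Logging the origin across several runs is an easy way to confirm that ZenRows is rotating IPs for you.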

Perfect, you just learned how to make scraping requests with Python!

Scrape More Complex Web Pages

While scraping simple sites like HTTPBin is straightforward, many websites, especially those with dynamic content or strict anti-scraping measures, require additional features. ZenRows allows you to bypass these defenses by enabling JavaScript Rendering and using Premium Proxies.

For example, if you try to scrape a page like G2’s Asana reviews without any extra configurations, you’ll encounter an error:

{
    "code":"REQS002",
    "detail":"The requested URL domain needs JavaScript rendering and/or Premium Proxies due to its high-level security defenses. Please retry by adding 'js_render' and/or 'premium_proxy' parameters to your request.",
    "instance":"/v1",
    "status":400,
    "title":"Www.g2.com requires javascript rendering and premium proxies enabled (REQS002)",
    "type":"https://docs.zenrows.com/api-error-codes#REQS002"
}

This error happens because G2 employs advanced security measures that block basic scraping attempts.

Here’s how you can modify the request to enable both:

scraper.py
# pip install requests
import requests

url = 'https://www.g2.com/products/asana/reviews'
apikey = 'YOUR_ZENROWS_API_KEY'
params = {
    'url': url,
    'apikey': apikey,
    'js_render': 'true',
    'premium_proxy': 'true',
}
response = requests.get('https://api.zenrows.com/v1/', params=params)
print(response.text)

Run the script, and this time, ZenRows will handle the heavy lifting by rendering the page’s JavaScript and routing your request through premium residential proxies. The response will contain the entire HTML content of the page:

<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8">
    <title>Asana Reviews 2024: Details, Pricing, & Features | G2</title>
    <!-- omitted for brevity -->
</head>
<body>
    <!-- page content -->
</body>
</html>

This demonstrates how you can scrape more advanced websites that rely on JavaScript or have stringent anti-bot mechanisms in place.
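Once you have the raw HTML, the next step is usually to parse it and extract specific data. Libraries like BeautifulSoup are the common choice; as a dependency-free sketch, here is how you could grab the page title with the standard library’s html.parser, run against a sample snippet rather than a live response:

```python
from html.parser import HTMLParser

class TitleExtractor(HTMLParser):
    """Collects the text inside the <title> tag."""
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ''

    def handle_starttag(self, tag, attrs):
        if tag == 'title':
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == 'title':
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data

# Sample HTML shaped like the G2 response above
html = '<html><head><title>Asana Reviews 2024 | G2</title></head><body></body></html>'
parser = TitleExtractor()
parser.feed(html)
print(parser.title)  # Asana Reviews 2024 | G2
```

In practice, you’d feed `response.text` into the parser instead of the sample string.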

Troubleshooting

Request failures can happen for various reasons. While some issues can be resolved by adjusting ZenRows parameters, others are beyond your control, such as the target server being temporarily down.

Below are some quick troubleshooting steps you can take:

Step 1: Retry the Request

Network issues or temporary failures can cause your request to fail. Implementing retry logic can solve this by automatically repeating the request. Learn how to add retries in our Python requests retry guide.

Example of retry logic using requests:

scraper.py
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Define the retry strategy
retry_strategy = Retry(
    total=4,  # Maximum number of retries
    status_forcelist=[429, 500, 502, 503, 504],  # HTTP status codes to retry on
)
# Create an HTTP adapter with the retry strategy and mount it to session
adapter = HTTPAdapter(max_retries=retry_strategy)

# Create a new session object
session = requests.Session()
session.mount('http://', adapter)
session.mount('https://', adapter)

# Make a request using the session object
response = session.get('https://scrapingcourse.com/ecommerce/')

if response.status_code == 200:
    print(f'SUCCESS: {response.text}')
else:
    print("FAILED")
Step 2: Verify the Site is Accessible in Your Country

Sometimes, the target site might be region-restricted, so not every proxy can reach it. ZenRows automatically selects the best proxy, but if the site is only available in specific regions, specify a geolocation using the proxy_country parameter.

Here’s how to choose a proxy in the US:

params = {
    'url': url,
    'apikey': apikey,
    'premium_proxy': 'true',
    'proxy_country': 'us',  # <- choose a premium proxy in the US
    # other configs...
}
response = requests.get('https://api.zenrows.com/v1/', params=params)

If the target site requires access from a specific region, adding the proxy_country parameter will help.

Check out more about it on our Geolocation Documentation Page.
Step 3: Check if the Site is Publicly Accessible

Some websites require a logged-in session, so it’s a good idea to verify whether the page can be accessed without logging in. Open the target page in an incognito window to check this.

If login credentials are required, you’ll need to handle session management in your requests. You can learn how to scrape a website that requires login in our guide: Web scraping with login in Python.
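As a rough sketch of that pattern with requests, you can log in once with a Session object and reuse its cookies for subsequent requests. The login URL and form field names below are placeholders; inspect the target site’s login form to find the real ones:

```python
import requests

# Placeholder endpoint and form fields; every site's login form differs.
LOGIN_URL = 'https://example.com/login'
credentials = {'username': 'your_user', 'password': 'your_pass'}

def fetch_behind_login(target_url):
    """Log in once, then reuse the session's auth cookies for later requests."""
    with requests.Session() as session:
        session.post(LOGIN_URL, data=credentials)  # session stores the cookies
        return session.get(target_url).text

# Example call (requires real credentials and URLs):
# html = fetch_behind_login('https://example.com/account')
```

The key point is that a Session persists cookies across requests, so the login only has to happen once.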

Step 4: Get Help From ZenRows Experts

If the issue persists despite following these tips, our support team is available to assist you. Use the Builder page or contact us via email to get personalized help from ZenRows experts.

Frequently Asked Questions (FAQ)