Looking to build a ZenRows Python scraper? You’re in the right place!

In this step-by-step tutorial, you’ll see how to use ZenRows’ web scraping API with Python to scrape any site on the internet. You’ll learn how to:

  1. Get your API key.
  2. Install ZenRows’ Python SDK.
  3. Make your first request.
  4. Get the HTML content from any site.

Let’s dive in!

How to Use ZenRows with Python

Before going through the steps below, make sure you have Python 3 installed on your machine. A Python IDE such as PyCharm or Visual Studio Code with the Python extension will be useful as well.

For this tutorial, we’ll assume you have a file named scraper.py inside a project folder named /scraper. You can check out our Python scraping guide for a step-by-step tutorial on how to set up the environment.

Step 1: Sign up for Your API Key

Create a ZenRows account to get your free API key and scrape up to 1,000 URLs at no cost. You’ll also gain access to a 24/7 support chat staffed by experienced developers.

If you already have an account, use your credentials to log in.

After logging in, you’ll get to the Request Builder page, where you can find your API key:

ZenRows Request Builder Page

Step 2: Install ZenRows’ Python SDK

ZenRows’ Python SDK is the best way to get started. As a library, it exposes all the ZenRows features through easy-to-adopt utility functions.

That being said, the tool also offers two other connection modes:

  • API: Endpoints you can call in the code to get site data.
  • Proxy: Servers you can configure your HTTP requests to go through for anonymity reasons.

Run the command below in the /scraper project folder to install ZenRows’ Python SDK:

pip install zenrows

Great, the ZenRows pip package has just been added to your project’s dependencies.

Step 3: Make Your First Request

In scraper.py, import ZenRowsClient from zenrows to make your first request. Use the /get endpoint from HTTPBin.io as the target site to test that the script works as expected:

scraper.py
from zenrows import ZenRowsClient

client = ZenRowsClient("<YOUR_ZENROWS_API_KEY>")
url = "https://httpbin.io/get"

response = client.get(url)

print(response.text)

Replace <YOUR_ZENROWS_API_KEY> with the API key retrieved in step 1.
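If you’d rather not hard-code the key in scraper.py, one option is to read it from an environment variable. The snippet below is a minimal sketch, not part of the SDK: the variable name ZENROWS_API_KEY and the helper load_api_key are assumptions made for illustration.

```python
import os


def load_api_key(env_var: str = "ZENROWS_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    The variable name is a convention chosen for this sketch,
    not something the ZenRows SDK reads on its own.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your ZenRows API key."
        )
    return key


# Usage in scraper.py (requires the zenrows package):
# from zenrows import ZenRowsClient
# client = ZenRowsClient(load_api_key())
```

This keeps the key out of version control and lets you swap keys without touching the script.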

Now, run your script:

python scraper.py

It’ll print something similar to this:

{
    "args": {},
    "headers": {
        // omitted for brevity...
        "User-Agent": [
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
        ]
    },
    "origin": "38.154.5.224:6693",
    "url": "http://httpbin.io/get"
}

Note that origin contains the IP your request comes from. ZenRows automatically rotates the IP and User-Agent for you. So, you’ll get different values every time you run the script.
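To compare the rotating values between runs, you can parse the JSON body instead of printing it whole. Here is a small sketch using only the standard library. The helper name summarize_httpbin is hypothetical, and the sample string is a trimmed copy of the output above; in scraper.py you would pass response.text instead.

```python
import json


def summarize_httpbin(body: str) -> dict:
    """Extract the origin IP and User-Agent from an HTTPBin /get response body."""
    data = json.loads(body)
    user_agent = data.get("headers", {}).get("User-Agent", "")
    # In the output above, header values arrive as lists; also accept a plain string.
    if isinstance(user_agent, list):
        user_agent = user_agent[0] if user_agent else ""
    return {"origin": data.get("origin", ""), "user_agent": user_agent}


# Trimmed copy of the sample output shown above.
sample = (
    '{"args": {}, "headers": {"User-Agent": ["Mozilla/5.0"]}, '
    '"origin": "38.154.5.224:6693", "url": "http://httpbin.io/get"}'
)
print(summarize_httpbin(sample))
```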

Perfect, you just learned how to configure ZenRowsClient correctly and make your first scraping request!

Step 4: Scrape Any Web Page

The ZenRows SDK’s default configuration is enough to overcome common blocks. However, it isn’t enough for tougher sites with advanced anti-bot measures in place, such as G2.

Execute the scraper from step 3 using https://www.g2.com/products/asana/reviews as a target URL, and it’ll fail with the following error:

{
    "code":"REQS002",
    "detail":"The requested URL domain needs JavaScript rendering and/or Premium Proxies due to its high-level security defenses. Please retry by adding 'js_render' and/or 'premium_proxy' parameters to your request.",
    "instance":"/v1",
    "status":400,
    "title":"Www.g2.com requires javascript rendering and premium proxies enabled (REQS002)",
    "type":"https://www.zenrows.com/documentation/api-error-codes#REQS002"
}
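Since the error arrives as a JSON body rather than HTML, a script can detect it before trying to parse the page. The following is a sketch under the assumption that error bodies always carry the "code" and "status" fields seen above; the helper name zenrows_error_code is made up for this example.

```python
import json


def zenrows_error_code(body: str):
    """Return the error code from a JSON error body like the one above, or None."""
    try:
        data = json.loads(body)
    except ValueError:
        return None  # not JSON, so most likely the page's HTML
    # Assumption: error bodies are JSON objects with both "code" and "status".
    if isinstance(data, dict) and "code" in data and "status" in data:
        return data["code"]
    return None


# Usage in scraper.py:
# if zenrows_error_code(response.text) == "REQS002":
#     print("Retry with js_render and/or premium_proxy enabled.")
```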

To deal with more protected sites, you need to add scraping superpowers: JavaScript rendering and premium proxies. Do that by adding their parameters to your code:

scraper.py
from zenrows import ZenRowsClient

client = ZenRowsClient("<YOUR_ZENROWS_API_KEY>")
url = "https://www.g2.com/products/asana/reviews"
params = {
    # enable JavaScript rendering in a headless browser
    "js_render": "true",
    # enable a premium residential proxy
    "premium_proxy": "true"
}

response = client.get(url, params=params)

print(response.text)

Run the script, and it’ll print the page’s HTML:

<!DOCTYPE html>
<head>
    <meta charset="utf-8" />
    <link href="https://www.g2.com/assets/favicon-fdacc4208a68e8ae57a80bf869d155829f2400fa7dd128b9c9e60f07795c4915.ico" rel="shortcut icon" type="image/x-icon" />
    <title>Asana Reviews 2023: Details, Pricing, &amp; Features | G2</title>
    <!-- omitted for brevity ... -->

The request now returns the HTML content associated with the target page as desired. Congrats, mission complete! You’re now able to bypass anti-scraping systems and scrape data from any website.

Look at our documentation for more information on what optional parameters ZenRowsClient accepts. For example, wait_for is a useful one.

All that remains is to pass the HTML to a parser and extract the specific data you’re interested in, as introduced in the What’s Next section.
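As a quick illustration of that parsing step, the sketch below pulls the page title out of the returned HTML with Python’s built-in html.parser module. The TitleParser class and extract_title helper are hypothetical names for this example; for real projects, a dedicated parsing library is usually more convenient.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found inside the <title> tag."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Character references such as &amp; are already decoded here.
        if self._in_title:
            self.title += data


def extract_title(html: str) -> str:
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip()


# Usage in scraper.py:
# print(extract_title(response.text))
```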

Troubleshooting

The request can fail for several reasons. Sometimes, it is just a matter of finding the right ZenRows parameters. Yet, in some cases, the failure is due to something you don’t have control over, such as the target servers being temporarily down.

When dealing with request failures, there are a few quick solutions you should try:

A. Try the request again: Implement retry logic to make the ZenRows Python script repeat requests if an error occurs. Find out more in our Python requests retry guide.

B. Verify the site is available in your country: The premium proxy picked by ZenRows may be in a different country than yours. If the target site is only accessible from certain countries or regions, that’s a problem. Avoid the issue by specifying a country for your proxy with proxy_country:

params = {
    "premium_proxy": "true",
    "proxy_country": "us",  # <- choose a premium proxy in the US
    # other configs...
}
response = client.get(url, params=params)

C. Check if the site is publicly accessible: Open the destination page in your browser’s incognito mode to verify that it doesn’t require a session. Read more in our tutorial on how to scrape a website that requires a login with Python.
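The retry logic from tip A can be sketched as follows. This is a generic helper with exponential backoff, not a ZenRows feature: get_with_retries and its parameters are names chosen for this example, and it assumes the object returned by fetch exposes a status_code attribute, as a requests-style response does.

```python
import time


def get_with_retries(fetch, url, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url) until it returns a 2xx response or attempts run out.

    Waits base_delay, then 2x, 4x, ... seconds between attempts.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            response = fetch(url)
            if 200 <= response.status_code < 300:
                return response
            last_error = RuntimeError(f"HTTP {response.status_code}")
        except Exception as exc:  # network errors, timeouts, etc.
            last_error = exc
        if attempt < max_attempts:
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"Request failed after {max_attempts} attempts") from last_error


# Usage in scraper.py, reusing client, url, and params from above:
# response = get_with_retries(lambda u: client.get(u, params=params), url)
```

Retrying only helps with transient failures. An error such as REQS002 will keep failing until the parameters change.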

If none of those quick tips work, use the chat on the Request Builder page to ask our experts for technical support:

ZenRows Request Builder Page

Pricing

When you sign up for ZenRows, you can scrape up to 1,000 URLs for free. A URL counts against that quota only if the request delivers data, because ZenRows charges for successful requests only.

ZenRows offers several plans starting as low as $49 per month. With basic requests, this entry-level plan lets you scrape up to 250,000 URLs successfully.

To maximize success, enable our recommended parameters js_render and premium_proxy. With this setup, each request costs 25 times as much as a basic one, so the same plan covers 10,000 scraped URLs (250,000 / 25).
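The arithmetic behind those figures is a single division, shown here as a short sketch. The function and argument names are made up for illustration; the numbers come from the plan described above.

```python
def urls_per_month(basic_requests: int, cost_multiplier: int) -> int:
    """How many URLs a plan covers when each request costs `cost_multiplier` basic requests."""
    return basic_requests // cost_multiplier


print(urls_per_month(250_000, 1))   # basic requests: 250000
print(urls_per_month(250_000, 25))  # js_render + premium_proxy: 10000
```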

Check out our pricing page for detailed information.

What’s Next

When building a ZenRows Python scraper, you may prefer to test your requests before using them in your code. That’s what the Request Builder is all about.

Log into ZenRows to reach the Request Builder page:

ZenRows Request Builder Page

Here, you can:

  1. Paste your target URL.
  2. Enable the different parameters offered by ZenRows (e.g., block specific resources on the page or specify particular JS instructions).
  3. Choose a programming language and specific connection mode. For general use, select “cURL” to get the full API endpoint you can call with any HTTP client.
  4. Click “Try It” to test the request in the browser.
  5. Copy the auto-generated code and paste it into your script.
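For the API connection mode mentioned in step 3, the request boils down to one URL with your key, the target URL, and any extra parameters in the query string. The sketch below shows how such a URL could be assembled. The endpoint address and the apikey/url parameter names are assumptions here; copy the exact values from the code the Request Builder generates.

```python
from urllib.parse import urlencode

# Assumption: confirm the endpoint in the Request Builder's generated cURL command.
API_ENDPOINT = "https://api.zenrows.com/v1/"


def build_api_url(api_key: str, target_url: str, **params: str) -> str:
    """Build a full API URL; the target URL is percent-encoded by urlencode."""
    query = {"apikey": api_key, "url": target_url, **params}
    return f"{API_ENDPOINT}?{urlencode(query)}"


print(build_api_url("<YOUR_ZENROWS_API_KEY>", "https://httpbin.io/get", js_render="true"))
```

Any HTTP client can then send a GET request to the resulting URL.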

Keep learning by exploring our resources to dig into ZenRows functionality and possibilities: