# Scrape Property Listings

> Scrape Zillow property listings with ZenRows to extract URLs from search results while bypassing anti-bot protection and rendering JavaScript.

Extract property URLs and data from Zillow listing pages using ZenRows' Universal Scraper API. This tutorial covers setting up your scraper to handle Zillow's anti-bot protection, extracting property links from search results, and processing dynamic content.

## What you'll learn

* Set up scraping requests with anti-bot bypass and JavaScript rendering
* Extract property URLs from Zillow listing pages using CSS selectors
* Handle dynamic content loading and page scrolling
* Configure custom JavaScript instructions for reliable data extraction

## Why Scrape Real Estate Data

Real estate professionals need timely, comprehensive market data to make informed decisions. Web scraping enables automated data collection that provides several advantages:

### Market intelligence

* Monitor new listings as they appear on property websites
* Track pricing trends across different neighborhoods and property types
* Identify properties that match specific investment criteria

### Investment analysis

* Access comprehensive property data for market research
* Compare property specifications across multiple listings
* Analyze market conditions in target areas

### Lead generation

* Identify potential investment opportunities from listing data
* Build databases of properties that meet client requirements
* Monitor competitor listings and market activity

## Step 1: Test With a Basic Scraping Setup

Start by creating a scraping function that handles Zillow's anti-bot measures. The site requires [JavaScript Rendering](/universal-scraper-api/features/js-rendering) and [Premium Proxies](/universal-scraper-api/features/premium-proxy) for reliable access.

```python Python theme={null}
# pip install requests
import requests

def scraper(url):
    apikey = "YOUR_ZENROWS_API_KEY"

    params = {
        "url": url,
        "apikey": apikey,
        "js_render": "true",
        "premium_proxy": "true",
        "proxy_country": "us",
    }
    response = requests.get("https://api.zenrows.com/v1/", params=params)
    return response.text
```

This function returns the page's HTML content. The `js_render` parameter enables JavaScript processing, while `premium_proxy` provides residential IP addresses to avoid blocking. Setting `proxy_country` to "us" routes requests through IP addresses based in the US.

<Tip>
  The `proxy_country` parameter is optional. If not specified, ZenRows will use a random IP address worldwide. See more about geolocation [here](/universal-scraper-api/features/proxy-country).
</Tip>

## Step 2: Handle Dynamic Content

Parameters such as `wait`, `wait_for`, and `js_instructions` let you customize requests to handle dynamic rendering.

Add a 2-second generic delay (`wait`) to the request parameters so that elements have time to load.

```python Python theme={null}
#...
params = {
    "url": url,
    "apikey": apikey,
    "js_render": "true",
    "premium_proxy": "true",
    "proxy_country": "us",
    "wait": "2000",
}
#...
```
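If a fixed delay proves too short or needlessly long, the `wait_for` parameter mentioned above is an alternative: it waits for an element matching a CSS selector instead of a set time. The sketch below builds both variants. `build_params` is a hypothetical helper and the `article` selector is only a placeholder, so replace it with a selector you have confirmed on the target page.

```python
def build_params(url, apikey, wait_ms=None, wait_for=None):
    """Build ZenRows query parameters with an optional delay strategy."""
    params = {
        "url": url,
        "apikey": apikey,
        "js_render": "true",
        "premium_proxy": "true",
        "proxy_country": "us",
    }
    # Add only the delay options that were requested
    if wait_ms is not None:
        params["wait"] = str(wait_ms)
    if wait_for is not None:
        params["wait_for"] = wait_for  # placeholder selector, an assumption
    return params


fixed_delay = build_params(
    "https://www.zillow.com/", "YOUR_ZENROWS_API_KEY", wait_ms=2000
)
element_wait = build_params(
    "https://www.zillow.com/", "YOUR_ZENROWS_API_KEY", wait_for="article"
)
```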

Add scrolling logic with the `js_instructions` parameter. This helps capture property listings that extend beyond the viewport.

Define the instructions separately as a JSON string. They include an extra `wait` command, which adds a delay after the scroll so that newly loaded elements have time to render.

```python Python theme={null}
# ...
import json


#... scraper function

# custom JS instructions
listing_js_instructions = json.dumps(
    [
        {"evaluate": "window.scrollTo(0, document.body.scrollHeight);"},
        {"wait": 2000},
    ]
)
```
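A single scroll may not reach every listing on a long page. As a rough sketch, the hypothetical helper below repeats the same scroll-and-wait pair several times. The number of scrolls and the pause length are assumptions to tune against the page you're scraping, not recommended values.

```python
import json


def build_scroll_instructions(scrolls=3, pause_ms=2000):
    """Return a JSON string that scrolls to the page bottom `scrolls` times."""
    steps = []
    for _ in range(scrolls):
        # Same two instructions as above, repeated once per scroll
        steps.append({"evaluate": "window.scrollTo(0, document.body.scrollHeight);"})
        steps.append({"wait": pause_ms})
    return json.dumps(steps)


multi_scroll_instructions = build_scroll_instructions(scrolls=2, pause_ms=1500)
```

The result is a plain string, so you can pass it wherever `listing_js_instructions` is used.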

Use the same scraper function to scrape the listings and individual property pages. To make the function more customizable for each scenario, update it to accept an optional `js_instructions` parameter.

Here's the updated code:

```python Python theme={null}
# pip install requests
import requests
import json

def scraper(
    url,
    js_instructions=None,
):
    apikey = "YOUR_ZENROWS_API_KEY"
    params = {
        "url": url,
        "apikey": apikey,
        "js_render": "true",
        "premium_proxy": "true",
        "proxy_country": "us",
        "js_instructions": js_instructions,
        "wait": "2000",
    }
    response = requests.get("https://api.zenrows.com/v1/", params=params)

    return response.text

# custom JS instructions
listing_js_instructions = json.dumps(
    [
        {"evaluate": "window.scrollTo(0, document.body.scrollHeight);"},
        {"wait": 2000},
    ]
)
```

## Step 3: Extract Property URLs from Listing Pages

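Now you'll extract individual property page links from the Zillow listing. Use ZenRows' `css_extractor` feature to automatically pull these URLs from the page.

The selector below matches anchor tags whose `data-testid` attribute contains `carousel`, and the `@href` suffix returns each link's `href` value instead of its text. Here's the `css_extractor` as a JSON string: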

```python Python theme={null}
# listing page CSS extractor
listing_css_extractor = json.dumps(
    { "Links": "a[data-testid*=carousel][href]@href" }
)
```

<Warning>CSS selectors can change when websites update their code. To maintain a reliable scraper, monitor your selectors regularly and update them as needed. Learn more about CSS selectors [here](/universal-scraper-api/features/css-extractor).</Warning>

Add an optional `css_extractor` parameter to the scraper function and include it in the ZenRows `params`. Because the extractor returns structured data, update the scraper function to return parsed JSON rather than plain text. Then call the function with `listing_url`, `listing_css_extractor`, and `listing_js_instructions` as arguments.

Here's the updated code:

```python Python expandable theme={null}
# pip install requests
import requests
import json

def scraper(
    url,
    js_instructions=None,
    css_extractor=None,
):
    apikey = "YOUR_ZENROWS_API_KEY"
    params = {
        "url": url,
        "apikey": apikey,
        "js_render": "true",
        "premium_proxy": "true",
        "proxy_country": "us",
        "js_instructions": js_instructions,
        "wait": "2000",
        "css_extractor": css_extractor,
    }
    response = requests.get("https://api.zenrows.com/v1/", params=params)

    return response.json()

# custom JS instructions
listing_js_instructions = json.dumps(
    [
        {"evaluate": "window.scrollTo(0, document.body.scrollHeight);"},
        {"wait": 2000},
    ]
)

# listing page CSS extractor
listing_css_extractor = json.dumps(
    { "Links": "a[data-testid*=carousel][href]@href" }
)

listing_url = "https://www.zillow.com/districts/8494/california-area-school-district/"

property_urls = scraper(
    listing_url,
    js_instructions=listing_js_instructions,
    css_extractor=listing_css_extractor,
)["Links"]

# use a set to avoid duplicate URLs
property_urls = set(property_urls)
print(property_urls)
```

The code above prints the set of property URLs found on the listing page. See a sample output below:

```python Output theme={null}
{
    "https://www.zillow.com/homedetails/130-3rd-St-California-PA-15419/49745481_zpid/",
    "https://www.zillow.com/homedetails/882-Highpoint-Dr-Coal-Center-PA-15423/2087529083_zpid/",
    # ...,
    "https://www.zillow.com/homedetails/508-5th-St-California-PA-15419/49745256_zpid/",
    "https://www.zillow.com/homedetails/721-Spring-St-Roscoe-PA-15477/49794732_zpid/",
}
```
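A set removes duplicates but loses the original order, and it fails if the extractor hands back something other than a list. The sketch below is one way to normalize the links first. It assumes a single match may arrive as a plain string and that some links may be relative, and the `_zpid` pattern is inferred only from the sample URLs above. `normalize_links` and `extract_zpid` are hypothetical helpers.

```python
import re
from urllib.parse import urljoin

# Pattern inferred from the sample URLs; treat it as an assumption
ZPID_PATTERN = re.compile(r"/(\d+)_zpid/?")


def normalize_links(links, base_url="https://www.zillow.com"):
    """Return unique absolute URLs, preserving first-seen order."""
    if links is None:
        return []
    if isinstance(links, str):  # assumption: one match may come back as a string
        links = [links]
    seen = {}
    for link in links:
        # urljoin leaves absolute URLs untouched and resolves relative ones
        seen.setdefault(urljoin(base_url, link.strip()), None)
    return list(seen)


def extract_zpid(url):
    """Return the numeric ID before `_zpid`, or None if it is absent."""
    match = ZPID_PATTERN.search(url)
    return match.group(1) if match else None


sample_links = [
    "https://www.zillow.com/homedetails/130-3rd-St-California-PA-15419/49745481_zpid/",
    "/homedetails/130-3rd-St-California-PA-15419/49745481_zpid/",
    "https://www.zillow.com/homedetails/508-5th-St-California-PA-15419/49745256_zpid/",
]
unique_urls = normalize_links(sample_links)
zpids = [extract_zpid(url) for url in unique_urls]
```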

You now have a foundation for scraping Zillow property listings. Your scraper handles anti-bot protection, processes dynamic content, and extracts property URLs from Zillow listing pages.

## Next Steps

Once you have the property URLs, you can use them to extract detailed property data from individual listing pages.

Learn how to do this in our [Extract Property Data](/zenrows-academy/extract-property-data) tutorial.

## Data management best practices

<Steps>
  <Step title="Structure for your use case">
    Design your data structure to match your specific business needs, rather than using a generic approach. For example, if you need to analyze price trends, structure price history as date/price pairs, not raw HTML. Map scraped fields directly to your business entities (e.g., property, agent, transaction) and use consistent, clear field names that work for your team and future use.
  </Step>

  <Step title="Validate before storage">
    Check for missing or malformed fields, unexpected data types, duplicate entries, or values outside the expected range. Use validation scripts or schema checks (e.g., using Python’s pydantic or JSON Schema). Automate this process to maintain consistency throughout your data extraction workflow.
  </Step>

  <Step title="Preserve raw data">
    Store both cleaned and raw data separately. Raw data serves as a backup for debugging and reprocessing when requirements change.
  </Step>

  <Step title="Determine the storage format">
    Choose the storage format that best fits your data and use case. Common options include:

    * **JSON**: Best for nested, hierarchical, or complex data structures. Human-readable and widely supported across platforms and protocols.
    * **CSV**: Ideal for flat, tabular data. Easy to use in spreadsheets and many analytics tools.
    * **Databases (e.g., MongoDB, PostgreSQL)**: Suitable for large datasets that require frequent updates and querying.
    * **Vector databases**: Designed for storing vectorized data, such as embeddings for LLM (Large Language Model) consumption.

    Select the format that aligns with your workflow and future data needs.
  </Step>
</Steps>
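As a rough illustration of the validation and storage steps above, the sketch below checks scraped URL records, keeps an untouched raw copy as JSON, and writes the cleaned rows as CSV. It uses only the standard library rather than pydantic. The `PropertyLink` fields and the `validate` helper are hypothetical, so adapt them to your own entities.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class PropertyLink:
    url: str
    source: str


def validate(records):
    """Split records into valid PropertyLink items and rejected raw records."""
    valid, rejected, seen = [], [], set()
    for record in records:
        url = record.get("url")
        parsed = urlparse(url) if isinstance(url, str) else None
        if parsed is None or parsed.scheme != "https" or not parsed.netloc:
            rejected.append(record)  # missing or malformed URL
        elif url in seen:
            rejected.append(record)  # duplicate entry
        else:
            seen.add(url)
            valid.append(PropertyLink(url=url, source=record.get("source", "")))
    return valid, rejected


raw = [
    {"url": "https://www.zillow.com/homedetails/130-3rd-St-California-PA-15419/49745481_zpid/", "source": "listing"},
    {"url": "https://www.zillow.com/homedetails/130-3rd-St-California-PA-15419/49745481_zpid/", "source": "listing"},
    {"url": None, "source": "listing"},
]
valid, rejected = validate(raw)

# Keep the raw records separately from the cleaned output
raw_json = json.dumps(raw)

# Write cleaned rows as CSV (in memory here; use a file in practice)
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["url", "source"])
writer.writeheader()
writer.writerows(asdict(item) for item in valid)
cleaned_csv = buffer.getvalue()
```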

## Troubleshooting

### Missing data

**Solution 1**: Employ adequate delay strategies to allow dynamic content to load. These include generic waits, waiting for specific elements, or pausing after scrolling or navigation.

**Solution 2**: If using `css_extractor`, verify that your CSS selectors are correct. Test each selector using the ZenRows Request Playground before integrating it into your codebase. Create a DOM monitoring strategy to spot structural changes on the site. Keep selectors separate from the rest of your codebase to simplify debugging and troubleshooting.
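One way to keep selectors isolated and notice when they stop matching is sketched below. The `SELECTORS` registry and both helpers are hypothetical. The only selector shown is the one from this tutorial.

```python
import json

# Single place to update when the site's markup changes
SELECTORS = {
    "listing": {"Links": "a[data-testid*=carousel][href]@href"},
}


def css_extractor_for(page_type):
    """Return the css_extractor JSON string for a page type."""
    return json.dumps(SELECTORS[page_type])


def missing_fields(page_type, extracted):
    """Return expected field names that came back empty or absent."""
    return [name for name in SELECTORS[page_type] if not extracted.get(name)]


# An empty result is a hint that the selector may be outdated
stale = missing_fields("listing", {"Links": []})
healthy = missing_fields(
    "listing",
    {"Links": ["https://www.zillow.com/homedetails/130-3rd-St-California-PA-15419/49745481_zpid/"]},
)
```

Logging or alerting whenever `missing_fields` returns a non-empty list gives you an early signal that the page structure has changed.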

### CAPTCHA/anti-bot challenges

**Solution 1**: Ensure you've applied anti-bot bypass parameters like `js_render` and `premium_proxy`.

**Solution 2**: If in-page CAPTCHAs or CAPTCHAs attached to form fields continue to block you, integrate a CAPTCHA-solving service such as 2Captcha from our solver integration options. Check our 2Captcha integration guide for more information.

**Solution 3**: Use fallbacks or alternative pathways to avoid abrupt scraping failures.

### Rate limiting/geo-blocking

**Solution 1**: Ensure you use the `premium_proxy` parameter to automatically switch IPs.

**Solution 2**: Retry failed requests, using an exponential backoff delay between attempts.
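A minimal sketch of exponential backoff with jitter follows. `with_retries` is a hypothetical helper, and the attempt count and delays are assumptions to adjust for your workload. It retries on any exception, so narrow that to the errors you actually expect.

```python
import random
import time


def with_retries(fetch, max_attempts=4, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call fetch() until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts, surface the last error
            # Delay doubles each attempt: 1s, 2s, 4s, ... capped at max_delay
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads out retries from parallel workers
            sleep(delay + random.uniform(0, delay / 2))


# Demo with a function that fails twice before succeeding
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


delays = []
result = with_retries(flaky, sleep=delays.append)  # record delays instead of sleeping
```

To use it with the tutorial's function, wrap the call in a lambda, for example `with_retries(lambda: scraper(listing_url))`. Note that `scraper` only raises on network errors as written. Calling `response.raise_for_status()` inside it would make HTTP error responses trigger a retry as well.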
