Who Is This For?
This guide is designed for developers who want to build a price monitoring system from the ground up using web scraping. No existing price monitoring infrastructure is required.
What You’ll Learn
- Extract product price data from an Amazon product page using ZenRows.
- Clean the extracted raw data and structure it for storage and price monitoring.
- Store price history with timestamps.
- Schedule automated scraping at regular intervals.
- Set up price change notifications.
- Optimize performance and manage costs.
Step 1: Set Up Data Extraction
Extract specific product data related to pricing, availability, and social cues using ZenRows’ `css_extractor`. Define the extractor, then stringify it so it can be passed as a request parameter:
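A sketch of the extractor; the field names and CSS selectors below are illustrative assumptions and should be checked against the live page, since Amazon’s markup changes frequently:

```python
import json

# Illustrative selectors for the target fields; verify them before use
css_extractor = {
    "productName": "#productTitle",
    "price": "span.a-price > span.a-offscreen",
    "availability": "#availability",
    "reviewCount": "#acrCustomerReviewText",
    "average_rating": "span.a-icon-alt",
    "demandHistory": "#social-proofing-faceout-title-tk_bought",
}

# The API expects the extractor as a JSON string rather than a dict
css_extractor_str = json.dumps(css_extractor)
```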
The `js_render` parameter handles the site’s dynamic rendering, while `premium_proxy` routes the request through premium proxies to avoid blocking. `proxy_country` is set to `us` to send requests from US-based IP addresses:
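A sketch of the request, assuming ZenRows’ standard `https://api.zenrows.com/v1/` endpoint, a placeholder API key and product URL, and the stringified extractor from the previous snippet:

```python
import requests

API_KEY = "<YOUR_ZENROWS_API_KEY>"
product_url = "https://www.amazon.com/dp/<ASIN>"  # placeholder product URL

params = {
    "url": product_url,
    "apikey": API_KEY,
    "js_render": "true",
    "premium_proxy": "true",
    "proxy_country": "us",
    "css_extractor": css_extractor_str,
}

response = requests.get("https://api.zenrows.com/v1/", params=params)
raw_data = response.json()
print(raw_data)
```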
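With the selectors above, the response is a JSON object keyed by the extractor’s field names. The values below are illustrative and still contain the undesired strings that the next step cleans up:

```json
{
  "productName": "Example Product Name",
  "price": ["$29.99", "$29.99"],
  "availability": ["   In Stock   "],
  "reviewCount": ["1,234 ratings"],
  "average_rating": ["4.5 out of 5 stars"],
  "demandHistory": ["500+ bought in past month"]
}
```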
Step 2: Clean and Structure the Data
The returned data is unsuitable for price monitoring in its current state since it contains undesired strings. Since some fields come back as lists and others as single strings, define a `consistency_handler` that normalizes every field to a list before cleaning:
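A sketch of the handler, plus a hypothetical `clean_price` helper, since the extracted price arrives as a string like "$29.99":

```python
import re

def consistency_handler(value):
    # Normalize a field to a list: the extractor may return a string or a list
    if value is None:
        return []
    return value if isinstance(value, list) else [value]

def clean_price(raw_data):
    # e.g., ["$29.99"] -> 29.99
    for item in consistency_handler(raw_data.get("price")):
        match = re.search(r"[\d,]+\.?\d*", item)
        if match:
            return float(match.group().replace(",", ""))
    return None
```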
Clean the `reviewCount` field by extracting the count as an integer from the list of strings:
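A sketch building on `consistency_handler`, assuming strings like "1,234 ratings":

```python
def clean_review_count(raw_data):
    # e.g., ["1,234 ratings"] -> 1234
    for item in consistency_handler(raw_data.get("reviewCount")):
        match = re.search(r"[\d,]+", item)
        if match:
            return int(match.group().replace(",", ""))
    return None
```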
Parse the `average_rating` as a float from the raw data:
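A sketch, assuming strings like "4.5 out of 5 stars":

```python
def clean_average_rating(raw_data):
    # e.g., ["4.5 out of 5 stars"] -> 4.5
    for item in consistency_handler(raw_data.get("average_rating")):
        match = re.search(r"\d+(?:\.\d+)?", item)
        if match:
            return float(match.group())
    return None
```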
Next, clean the `demandHistory` field:
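A sketch, assuming the field holds Amazon’s "bought in past month" badge text:

```python
def clean_demand_history(raw_data):
    # e.g., ["  500+ bought in past month  "] -> "500+ bought in past month"
    items = consistency_handler(raw_data.get("demandHistory"))
    return items[0].strip() if items else None
```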
Clean the `availability` field by extracting the "In Stock" string from the extracted strings:
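A sketch that scans the extracted strings for "In Stock":

```python
def clean_availability(raw_data):
    # Return "In Stock" when any extracted string contains it
    for item in consistency_handler(raw_data.get("availability")):
        if "In Stock" in item:
            return "In Stock"
    return None  # assumption: treat anything else as unknown
```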
Use Python’s `datetime` to record the timestamp of each scraping operation so you can track historical data:
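A small helper using the standard `datetime` module:

```python
from datetime import datetime

def get_timestamp():
    # ISO-formatted timestamp of the scraping operation
    return datetime.now().isoformat()
```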
Finally, apply `parse_cleaned_data` to the `raw_data` to get a cleaned version with a recorded timestamp:
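A sketch that assembles the helpers above into `parse_cleaned_data` and applies it to the `raw_data` from Step 1:

```python
def parse_cleaned_data(raw_data):
    names = consistency_handler(raw_data.get("productName"))
    return {
        "product_name": names[0].strip() if names else None,
        "price": clean_price(raw_data),
        "availability": clean_availability(raw_data),
        "review_count": clean_review_count(raw_data),
        "average_rating": clean_average_rating(raw_data),
        "demand_history": clean_demand_history(raw_data),
        "timestamp": get_timestamp(),
    }

cleaned_data = parse_cleaned_data(raw_data)
print(cleaned_data)
```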
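The cleaned output then looks roughly like this (values illustrative):

```json
{
  "product_name": "Example Product Name",
  "price": 29.99,
  "availability": "In Stock",
  "review_count": 1234,
  "average_rating": 4.5,
  "demand_history": "500+ bought in past month",
  "timestamp": "2025-01-01T10:00:00"
}
```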
Step 3: Store the Price History Data
Since the scraping will be scheduled, start the storage procedure by setting up `logging` to track the extraction process in real time:
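A minimal logging setup:

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
)
logger = logging.getLogger(__name__)
```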
Next, set up helpers to read and write the `product_history.json` file. Index the stored data by product name:
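A sketch of the load/save helpers, using a hypothetical `HISTORY_FILE` constant:

```python
import json
import os

HISTORY_FILE = "product_history.json"

def load_price_history():
    # Return the existing history, or an empty dict on the first run
    if os.path.exists(HISTORY_FILE):
        with open(HISTORY_FILE) as f:
            return json.load(f)
    return {}

def save_price_history(history):
    with open(HISTORY_FILE, "w") as f:
        json.dump(history, f, indent=2)
```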
Initialize an alert flag (`alert_triggered = False`) and check for price changes by comparing the previously scraped price to the current one. Set `alert_triggered` to `True` if a price change is detected, then log the price difference. Append new price records to the existing ones in the JSON file; if a product doesn’t exist in the JSON file yet, create a new record for it. Here’s the full `update_and_save_price_history` function:
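A sketch of the function, building on the `load_price_history` and `save_price_history` helpers above:

```python
def update_and_save_price_history(cleaned_data):
    history = load_price_history()
    product_name = cleaned_data["product_name"]
    alert_triggered = False

    if product_name in history:
        records = history[product_name]
        previous_price = records[-1]["price"]
        current_price = cleaned_data["price"]
        # Compare the previously scraped price to the current one
        if None not in (previous_price, current_price) and current_price != previous_price:
            alert_triggered = True
            logger.info(
                "Price change for %s: %+.2f (%.2f -> %.2f)",
                product_name, current_price - previous_price,
                previous_price, current_price,
            )
        # Append the new record to the existing ones
        records.append(cleaned_data)
    else:
        # First time seeing this product: create a new record
        history[product_name] = [cleaned_data]

    save_price_history(history)
    return alert_triggered
```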
Step 4: Create a Scraping Job
Create a `job` function to execute the extraction, data cleaning, and storage logic. This function logs the scraping timestamps and price change notifications:
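A sketch that wires together the request, `parse_cleaned_data`, and `update_and_save_price_history`, wrapped in a try-except so one failed run doesn’t crash the scheduler:

```python
def job():
    logger.info("Starting scraping job...")
    try:
        response = requests.get("https://api.zenrows.com/v1/", params=params)
        response.raise_for_status()
        cleaned_data = parse_cleaned_data(response.json())
        alert_triggered = update_and_save_price_history(cleaned_data)
        if alert_triggered:
            logger.info("Price change alert for %s", cleaned_data["product_name"])
    except Exception as exc:
        logger.error("Scraping job failed: %s", exc)
```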
Step 5: Schedule the Scraping Job
Schedule the extraction process using Python’s `schedule` module. The code below runs the scraping job every 30 seconds and checks for pending jobs every second:
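A sketch using `schedule` (install it with `pip install schedule`); in production you’d likely use a longer interval such as hours or days:

```python
import time

import schedule

# Run the scraping job every 30 seconds...
schedule.every(30).seconds.do(job)

# ...and check for pending jobs every second
while True:
    schedule.run_pending()
    time.sleep(1)
```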
Step 6: Put Everything Together
Here’s the complete code:
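A condensed sketch combining Steps 1 through 5; the API key, product URL, and selectors are placeholders to adapt:

```python
import json, logging, os, re, time
from datetime import datetime

import requests
import schedule

API_KEY = "<YOUR_ZENROWS_API_KEY>"
PRODUCT_URL = "https://www.amazon.com/dp/<ASIN>"  # placeholder product URL
HISTORY_FILE = "product_history.json"

logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
logger = logging.getLogger(__name__)

# Illustrative selectors; verify them against the live page before use
css_extractor = json.dumps({
    "productName": "#productTitle",
    "price": "span.a-price > span.a-offscreen",
    "availability": "#availability",
    "reviewCount": "#acrCustomerReviewText",
    "average_rating": "span.a-icon-alt",
    "demandHistory": "#social-proofing-faceout-title-tk_bought",
})

params = {
    "url": PRODUCT_URL, "apikey": API_KEY, "js_render": "true",
    "premium_proxy": "true", "proxy_country": "us", "css_extractor": css_extractor,
}

def consistency_handler(value):
    # Normalize a field to a list: the extractor may return a string or a list
    if value is None:
        return []
    return value if isinstance(value, list) else [value]

def first_match(values, pattern):
    # First regex match across a list of strings, or None
    for item in values:
        match = re.search(pattern, item)
        if match:
            return match.group()
    return None

def parse_cleaned_data(raw_data):
    price = first_match(consistency_handler(raw_data.get("price")), r"[\d,]+\.?\d*")
    count = first_match(consistency_handler(raw_data.get("reviewCount")), r"[\d,]+")
    rating = first_match(consistency_handler(raw_data.get("average_rating")), r"\d+(?:\.\d+)?")
    names = consistency_handler(raw_data.get("productName"))
    demand = consistency_handler(raw_data.get("demandHistory"))
    in_stock = any("In Stock" in s for s in consistency_handler(raw_data.get("availability")))
    return {
        "product_name": names[0].strip() if names else None,
        "price": float(price.replace(",", "")) if price else None,
        "availability": "In Stock" if in_stock else None,
        "review_count": int(count.replace(",", "")) if count else None,
        "average_rating": float(rating) if rating else None,
        "demand_history": demand[0].strip() if demand else None,
        "timestamp": datetime.now().isoformat(),
    }

def load_price_history():
    if os.path.exists(HISTORY_FILE):
        with open(HISTORY_FILE) as f:
            return json.load(f)
    return {}

def update_and_save_price_history(cleaned_data):
    history = load_price_history()
    name, alert_triggered = cleaned_data["product_name"], False
    if name in history:
        previous, current = history[name][-1]["price"], cleaned_data["price"]
        if None not in (previous, current) and current != previous:
            alert_triggered = True
            logger.info("Price change for %s: %+.2f", name, current - previous)
        history[name].append(cleaned_data)
    else:
        history[name] = [cleaned_data]  # first record for this product
    with open(HISTORY_FILE, "w") as f:
        json.dump(history, f, indent=2)
    return alert_triggered

def job():
    logger.info("Starting scraping job...")
    try:
        response = requests.get("https://api.zenrows.com/v1/", params=params)
        response.raise_for_status()
        cleaned_data = parse_cleaned_data(response.json())
        if update_and_save_price_history(cleaned_data):
            logger.info("Price change alert for %s", cleaned_data["product_name"])
    except Exception as exc:
        logger.error("Scraping job failed: %s", exc)

schedule.every(30).seconds.do(job)  # scrape every 30 seconds

while True:
    schedule.run_pending()
    time.sleep(1)
```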
After a few runs, here’s what the price history JSON file looks like:
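A sketch of the structure, with illustrative values; each product name maps to a list of timestamped price records:

```json
{
  "Example Product Name": [
    {
      "product_name": "Example Product Name",
      "price": 29.99,
      "availability": "In Stock",
      "review_count": 1234,
      "average_rating": 4.5,
      "demand_history": "500+ bought in past month",
      "timestamp": "2025-01-01T10:00:00"
    },
    {
      "product_name": "Example Product Name",
      "price": 27.99,
      "availability": "In Stock",
      "review_count": 1236,
      "average_rating": 4.5,
      "demand_history": "500+ bought in past month",
      "timestamp": "2025-01-01T10:00:30"
    }
  ]
}
```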
Next Steps: Enhance Your System
Add Email Notifications
Set up email alerts when prices change:
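A minimal sketch using Python’s built-in `smtplib`; the SMTP host, credentials, and addresses are placeholders to replace with your provider’s values:

```python
import smtplib
from email.message import EmailMessage

def send_price_alert(product_name, old_price, new_price):
    msg = EmailMessage()
    msg["Subject"] = f"Price alert: {product_name}"
    msg["From"] = "alerts@example.com"
    msg["To"] = "you@example.com"
    msg.set_content(f"{product_name} changed from ${old_price:.2f} to ${new_price:.2f}")

    # Placeholder SMTP settings
    with smtplib.SMTP("smtp.example.com", 587) as server:
        server.starttls()
        server.login("alerts@example.com", "<APP_PASSWORD>")
        server.send_message(msg)
```

You could call this helper from `update_and_save_price_history` whenever `alert_triggered` flips to `True`.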
Monitor Multiple Products
Track multiple products by modifying the monitoring function:
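One way to do it: loop over a list of product URLs inside the job, reusing the request `params` and helpers from the main script (the URLs below are placeholders):

```python
PRODUCT_URLS = [
    "https://www.amazon.com/dp/<ASIN_1>",
    "https://www.amazon.com/dp/<ASIN_2>",
]

def job():
    for product_url in PRODUCT_URLS:
        try:
            response = requests.get(
                "https://api.zenrows.com/v1/",
                params={**params, "url": product_url},
            )
            response.raise_for_status()
            cleaned_data = parse_cleaned_data(response.json())
            update_and_save_price_history(cleaned_data)
        except Exception as exc:
            logger.error("Failed to scrape %s: %s", product_url, exc)
        time.sleep(2)  # small delay between requests
```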
Database Integration
Replace JSON storage with a database for better performance:
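A sketch using the standard-library `sqlite3` module; the schema mirrors the cleaned fields stored in the JSON file:

```python
import sqlite3

def init_db(path="price_history.db"):
    conn = sqlite3.connect(path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS price_history (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            product_name TEXT NOT NULL,
            price REAL,
            availability TEXT,
            review_count INTEGER,
            average_rating REAL,
            timestamp TEXT NOT NULL
        )
    """)
    conn.commit()
    return conn

def save_record(conn, cleaned_data):
    conn.execute(
        "INSERT INTO price_history "
        "(product_name, price, availability, review_count, average_rating, timestamp) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (cleaned_data["product_name"], cleaned_data["price"], cleaned_data["availability"],
         cleaned_data["review_count"], cleaned_data["average_rating"], cleaned_data["timestamp"]),
    )
    conn.commit()
```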
Troubleshooting
Missing Data
- Enable `js_render` for dynamic content
- Increase `wait` time for slow-loading pages, or consider switching to `wait_for`
- Verify that the CSS selectors are correct
Rate Limiting
- Use `premium_proxy` for residential IPs
- Add delays between requests
- Monitor your API usage
Selector Changes
- Test selectors regularly
- Use more stable selectors when possible
- Implement fallback selectors
Best Practices
- Monitor selector stability: Check monthly that your CSS selectors still work
- Handle errors gracefully: Always wrap network requests in try-except blocks
- Log everything: Comprehensive logging helps debug issues
- Start small: Begin with one product before scaling to multiple products
- Respect rate limits: Don’t overwhelm target websites with too many requests