Automate data extraction in your Node-RED workflows using ZenRows’ web scraping capabilities. This guide shows you how to integrate ZenRows with Node-RED and build a continuous data retrieval process.

What Is Node-RED?

Node-RED is a low-code programming tool that connects different services and APIs using a visual, drag-and-drop interface. It provides built-in nodes for automating repetitive tasks, such as action triggers, scheduling, and data storage.

Use Cases

  • Price comparison: Use Node-RED’s schedule node to run periodic price scraping on competitors’ sites with ZenRows. Connect your flow to an LLM node to analyze price disparities.
  • Property analysis: Use ZenRows to scrape property listing pages across popular real-estate sites, and use a Node-RED workflow to automate continuous data extraction, cleaning, storage, and analysis.
  • Web page monitoring: Periodically monitor a web page by combining Node-RED’s scheduler with ZenRows’ scraping capabilities.
  • Sentiment analysis: Use a ZenRows scraping node to collect data from various platforms, including social media, review sites, and Google review pages. Pass the scraped data to an LLM node for sentiment extraction.
  • Lead generation: Use Node-RED’s automation features to scrape quality leads on a schedule with ZenRows.

Integration Steps

In this guide, we’ll create a scheduled Node-RED workflow that extracts product data from an Amazon product page using ZenRows’ css_extractor.

Step 1: Install and Launch Node-RED

  1. Install the node-red package globally:
    npm install -g --unsafe-perm node-red
    
  2. Launch the Node-RED server:
    node-red
    
    This command starts Node-RED on http://localhost:1880.
  3. Open the localhost URL in your browser to load the Node-RED editor.

Step 2: Get the request URL from ZenRows

  1. Open the ZenRows Universal Scraper API Builder.
  2. Paste the target URL in the link box, then activate JS Rendering and Premium Proxies.
  3. Under Output, click Specific data and select Parsers.
  4. Enter the CSS selector array in the Parsers field. We’ll use the following CSS extractors in this guide:
    JSON
    {
        "name":"span#productTitle",
        "price":"span.a-price.aok-align-center.reinventPricePriceToPayMargin.priceToPay",
        "ratings":"#acrPopover",
        "description":"ul.a-unordered-list.a-vertical.a-spacing-mini",
        "reviewSummary":"#product-summary p span",
        "reviewCount":"#acrCustomerReviewLink"
    }
    
  5. Select cURL as your language. Then, copy the URL generated after the cURL command.
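
For reference, the URL the builder generates is essentially a query string on the ZenRows API endpoint. The sketch below rebuilds one by hand, assuming the `https://api.zenrows.com/v1/` endpoint and the `apikey`, `url`, `js_render`, `premium_proxy`, and `css_extractor` query parameters. The API key, the target URL, and the `buildZenRowsUrl` helper are placeholders for illustration; the URL copied from the builder remains the source of truth.

```javascript
// Sketch: rebuild a ZenRows request URL by hand (endpoint and parameter names are assumptions).
const ZENROWS_ENDPOINT = "https://api.zenrows.com/v1/";

// A subset of the CSS selectors from this guide, keyed by the field name wanted in the response.
const cssExtractor = {
  name: "span#productTitle",
  ratings: "#acrPopover",
};

// buildZenRowsUrl is a hypothetical helper, not part of any ZenRows SDK.
function buildZenRowsUrl(apiKey, targetUrl, extractor) {
  const params = new URLSearchParams({
    apikey: apiKey,
    url: targetUrl, // URLSearchParams percent-encodes the target URL
    js_render: "true",
    premium_proxy: "true",
    css_extractor: JSON.stringify(extractor),
  });
  return `${ZENROWS_ENDPOINT}?${params.toString()}`;
}

const requestUrl = buildZenRowsUrl(
  "YOUR_ZENROWS_API_KEY", // placeholder
  "https://www.example.com/product-page", // placeholder target
  cssExtractor
);
console.log(requestUrl);
```

Building the query with `URLSearchParams` keeps the target URL and the JSON selector map correctly encoded, which is the part most likely to break when the URL is edited by hand.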

Step 3: Create your scraping workflow

  1. Drag the Inject Node into the canvas.
  2. Double-click the timestamp node and enter a name in the Name field to rename it (e.g., “Scheduler”).
  3. Click the Repeat dropdown and select a schedule. We’ll use a 10-second Interval in this guide.
  4. Click Done.
  5. From the search bar at the top left, search for http request. Then, drag the http request node into the canvas.
  6. Link both nodes by dragging a line to connect them at either end.
  7. Double-click the http request node.
  8. Paste the ZenRows connection URL you copied previously in the URL field.
  9. Set the Return field to “a parsed JSON object” so the returned string is parsed into a JSON object.
  10. Enter a name for your HTTP node in the Name field (e.g., Scraper).
  11. Click Done.

Step 4: Add storage and output nodes

  1. Search for write file in the search bar and drag the write file node into the canvas.
  2. Link the write file node with the Scraper node.
  3. Double-click the write file node to configure it.
  4. In the Filename field, enter the JSON file path.
  5. Under Action, select a suitable action. We’ll choose the overwrite file option in this guide.
  6. Enter a name in the Name field to rename the node (e.g., JSON Writer).
  7. Click Done.
  8. Drag the debug node into the canvas and connect it to the file writer.
  9. Double-click the debug node and rename it as desired (e.g., Output Tracker).
  10. Leave the Output set to msg.payload.
  11. Click Done.
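
The flow built in Steps 3 and 4 can also be expressed as JSON and loaded through Node-RED’s Import dialog. The sketch below generates such a flow. The node IDs, the request URL, and the file path are placeholders, and the property names are assumed from what Node-RED typically writes when exporting these core nodes, so compare the result against an export from your own instance before relying on it.

```javascript
// Sketch: the Scheduler -> Scraper -> JSON Writer -> Output Tracker flow as importable JSON.
// IDs, URL, and file path are placeholders; property names are assumed from typical exports.
const flow = [
  {
    id: "scheduler",
    type: "inject",
    name: "Scheduler",
    props: [{ p: "payload" }],
    repeat: "10", // interval in seconds
    crontab: "",
    once: false,
    topic: "",
    payload: "",
    payloadType: "date",
    wires: [["scraper"]],
  },
  {
    id: "scraper",
    type: "http request",
    name: "Scraper",
    method: "GET",
    ret: "obj", // return a parsed JSON object
    url: "PASTE_THE_ZENROWS_URL_HERE",
    wires: [["json-writer"]],
  },
  {
    id: "json-writer",
    type: "file",
    name: "JSON Writer",
    filename: "D:/Node-RED-result/result.json",
    appendNewline: true,
    createDir: false,
    overwriteFile: "true", // overwrite the file on every run
    wires: [["output-tracker"]],
  },
  {
    id: "output-tracker",
    type: "debug",
    name: "Output Tracker",
    active: true,
    tosidebar: true,
    complete: "payload", // show msg.payload in the debug sidebar
    wires: [],
  },
];

// Print JSON that can be pasted into the Import dialog.
console.log(JSON.stringify(flow, null, 2));
```

Keeping the flow as JSON makes it easier to version or share than rebuilding it by hand in the editor.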

Step 5: Deploy and run the flow

Click Deploy at the top right to deploy the flow. Once deployed, the schedule starts running. Click the bug icon at the top right to open the debug sidebar, which displays the JSON output of each run.

The workflow also creates a JSON file at the specified path and updates it at the chosen interval. Here’s a sample JSON response:
JSON
{
    "name": "Gamrombo LED Wireless Controller for PS5, Compatible with PS5 Pro/Slim/PC, Dual Vibration, Marco/Turbo Function, 3.5mm Audio Jack, 6-Axis Motion Contro Gamepad with Speaker",
    "price": "$59.99",
    "ratings": [
        "4.4",
        "4.4"
    ],
    "reviewCount": [
        "455 ratings",
        "455 ratings"
    ],
    "reviewSummary": [
        "Customers find the PS5 controller smooth and responsive, with buttons that provide crisp feedback and moderate rebound strength. Moreover, the controller is comfortable during long gaming sessions and features wear-resistant materials. Additionally, they appreciate its compatibility with both PS5 and PC, its vibrant lighting design, and consider it good value for money.",
        "",
        "AI Generated from the text of customer reviews",
        ""
    ]
}
Congratulations! 🎉 You’ve now integrated ZenRows with Node-RED.
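
The sample response contains duplicated values in `ratings` and `reviewCount` and empty strings in `reviewSummary`. One way to tidy this is a function node placed between the Scraper and the file writer. The sketch below is a plain JavaScript helper; `cleanProduct` is a hypothetical name, and the sample object is a shortened excerpt of the response above.

```javascript
// Sketch: tidy the scraped payload before it is written to disk.
// cleanProduct is a hypothetical helper, not part of Node-RED or ZenRows.
function cleanProduct(payload) {
  const cleaned = {};
  for (const [key, value] of Object.entries(payload)) {
    if (Array.isArray(value)) {
      // Trim entries, drop empty strings, and remove duplicates.
      const unique = [...new Set(value.map((v) => String(v).trim()).filter(Boolean))];
      // Collapse single-item arrays to a plain string.
      cleaned[key] = unique.length === 1 ? unique[0] : unique;
    } else {
      cleaned[key] = typeof value === "string" ? value.trim() : value;
    }
  }
  return cleaned;
}

// Shortened excerpt of the sample response.
const sample = {
  price: "$59.99",
  ratings: ["4.4", "4.4"],
  reviewCount: ["455 ratings", "455 ratings"],
  reviewSummary: [
    "Customers find the PS5 controller smooth and responsive.",
    "",
    "AI Generated from the text of customer reviews",
    "",
  ],
};

console.log(cleanProduct(sample));
```

Inside a function node, the same helper could be applied with `msg.payload = cleanProduct(msg.payload); return msg;`.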

Troubleshooting

Invalid URL/flow fails to run

  • Solution 1: Ensure you copied only the URL from the cURL command generated by the builder.
  • Solution 2: Check the URL for extra characters, such as the curl command prepended to it (e.g., the URL must be in the form of http://… and not curl "http…").
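
Both checks can be scripted before pasting the value into the http request node. The sketch below strips a leading `curl` and surrounding quotes and then validates the result with the standard `URL` constructor. `normalizeRequestUrl` is a hypothetical helper, the API key is a placeholder, and cURL flags such as `-L` are not handled.

```javascript
// Sketch: recover a usable request URL from text copied out of a cURL snippet.
// normalizeRequestUrl is a hypothetical helper name.
function normalizeRequestUrl(copied) {
  const cleaned = copied
    .trim()
    .replace(/^curl\s+/i, "") // drop a leading "curl" command
    .replace(/^["']|["']$/g, ""); // drop surrounding quotes
  const parsed = new URL(cleaned); // throws a TypeError if the text is not a valid URL
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`Unexpected protocol: ${parsed.protocol}`);
  }
  return cleaned;
}

const copiedText =
  'curl "https://api.zenrows.com/v1/?apikey=YOUR_ZENROWS_API_KEY&url=https%3A%2F%2Fwww.example.com"';
console.log(normalizeRequestUrl(copiedText));
```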

File write access unavailable or denied

  • Solution 1: Ensure the Node-RED process has write permission for the storage location.
  • Solution 2: When specifying the file path in the write file node, use the file’s absolute path, including the file name. For instance, D:/Node-RED-result is wrong. The correct file path format is D:/Node-RED-result/result.json.
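
The path rule in Solution 2 can be expressed as a quick heuristic check. The sketch below treats a path as usable when it is absolute (a drive letter or a leading slash) and its last segment has a file extension. `isUsableFilePath` is a hypothetical helper, and the extension test is an assumption made for illustration, since a file name without an extension is technically valid.

```javascript
// Sketch: heuristic check that a write-file path is absolute and ends in a file name.
// isUsableFilePath is a hypothetical helper name.
function isUsableFilePath(filePath) {
  // Absolute: Windows drive letter ("D:/" or "D:\") or a leading "/".
  const isAbsolute = /^([A-Za-z]:[\\/]|\/)/.test(filePath);
  // File name: the last path segment carries an extension such as ".json".
  const lastSegment = filePath.split(/[\\/]/).pop();
  const hasFileName = /\.[^.]+$/.test(lastSegment);
  return isAbsolute && hasFileName;
}

console.log(isUsableFilePath("D:/Node-RED-result")); // false
console.log(isUsableFilePath("D:/Node-RED-result/result.json")); // true
```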

Node-RED server jammed or localhost address unavailable

  • Solution: Stop the running server and restart it. Then, re-open the server address in your browser.

node-red is not recognized as an internal or external command

  • Solution: Ensure you’ve installed the node-red package globally, not locally inside a project directory that has its own package.json.

Frequently Asked Questions (FAQ)