Once you’ve extracted data using ZenRows, you might want to store it in CSV format. For simplicity, we’ll focus on a single URL and save the data to one file. In real-world scenarios, you might need to handle multiple URLs and aggregate the results.

To start, we’ll explore how to export data to CSV using both Python and JavaScript.

From JSON using Python

If you’ve obtained JSON output from ZenRows with the autoparse feature enabled, you can use Python to convert this data into a CSV file.

Autoparse works for many popular websites, but not every site is supported by this feature.

The Pandas library will help us flatten nested JSON attributes and save the data as a CSV file.

Here’s a sample Python script:

scraper.py
# pip install requests pandas
import requests
import json
import pandas as pd

url = "https://www.zillow.com/san-francisco-ca/"
apikey = "YOUR_ZENROWS_API_KEY"
params = {"autoparse": True, "url": url, "apikey": apikey}
response = requests.get("https://api.zenrows.com/v1/", params=params)

content = json.loads(response.text)

data = pd.json_normalize(content)
data.to_csv("result.csv", index=False)

You can also adjust the json_normalize call to control how many nested levels are flattened and to rename fields. For instance, to flatten only one inner level and strip the latLong. prefix from the latitude and longitude fields:

data = pd.json_normalize(content, max_level=1).rename(
	columns=lambda x: x.replace("latLong.", ""))
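
To see what these options change, here's a small illustration with a hypothetical record shaped like the nested output above (the values are made up for the example):

import pandas as pd

# Hypothetical nested record, for illustration only
record = {"price": 850000, "latLong": {"latitude": 37.77, "longitude": -122.42}}

# Default flattening produces prefixed column names
print(pd.json_normalize(record).columns.tolist())
# ['price', 'latLong.latitude', 'latLong.longitude']

# Flattening one level and renaming strips the latLong. prefix
df = pd.json_normalize(record, max_level=1).rename(
    columns=lambda x: x.replace("latLong.", ""))
print(df.columns.tolist())
# ['price', 'latitude', 'longitude']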

From HTML using Python

When dealing with HTML output (i.e., without the autoparse feature), you can use BeautifulSoup to parse the HTML and extract the data. We'll use an eCommerce site from Scraping Course as the example: build a dictionary with the essential details of each product, then use Pandas to convert that list of dictionaries into a DataFrame and save it as a CSV file.

Here’s how to do it:

scraper.py
# pip install requests beautifulsoup4 pandas
import requests
from bs4 import BeautifulSoup
import pandas as pd

url = "https://www.scrapingcourse.com/ecommerce/"
apikey = "YOUR_ZENROWS_API_KEY"
params = {"url": url, "apikey": apikey}
response = requests.get("https://api.zenrows.com/v1/", params=params)
# Parse the returned HTML
soup = BeautifulSoup(response.content, "html.parser")

# Build a dictionary with the essential details of each product card
content = [{
    "product_name": product.select_one(".product-name").text.strip(),
    "price": product.select_one(".price").text.strip(),
    "rating": product.select_one(".rating").text.strip() if product.select_one(".rating") else "N/A",
    "link": product.select_one(".product-link").get("href"),
} for product in soup.select(".product")]

data = pd.DataFrame(content)
data.to_csv("result.csv", index=False)
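
Because content is already a flat list of dictionaries here, Pandas is optional: Python's built-in csv module can write the same file. A minimal sketch, reusing the content list from the script above:

import csv

# Write the list of product dictionaries built above to a CSV file
with open("result.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["product_name", "price", "rating", "link"])
    writer.writeheader()       # header row with the column names
    writer.writerows(content)  # one row per product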

From JSON using JavaScript

For JavaScript and Node.js, you can use the json2csv library to handle the JSON to CSV conversion.

After getting the data, we parse it with the flatten transform which, as the name implies, flattens the nested structures inside the JSON. Then we save the file using writeFileSync.

Here’s an example using the ZenRows Scraper API with Node.js:

scraper.js
// npm install zenrows json2csv
const fs = require("fs");
const {
	Parser,
	transforms: { flatten },
} = require("json2csv");
const { ZenRows } = require("zenrows");

(async () => {
	const client = new ZenRows("YOUR_ZENROWS_API_KEY");
	const url = "https://www.scrapingcourse.com/ecommerce/";

	// Request the page with autoparse enabled to get structured JSON
	const { data } = await client.get(url, { autoparse: "true" });

	// Flatten nested structures and convert the JSON to CSV
	const parser = new Parser({ transforms: [flatten()] });
	const csv = parser.parse(data);

	fs.writeFileSync("result.csv", csv);
})();
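
If you only need some of the columns, or want to rename them, json2csv's Parser also accepts a fields option. In the sketch below, the value paths are assumptions about the flattened output; match them to the keys your response actually contains:

// Keep only selected columns and give them custom labels;
// the value paths are assumed keys, so adjust them to your output
const parser = new Parser({
	transforms: [flatten()],
	fields: [
		{ label: "name", value: "name" },
		{ label: "price", value: "price" },
	],
});
const csv = parser.parse(data);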

From HTML using JavaScript

For extracting data from HTML without autoparse, you can use the cheerio library to parse the HTML and extract the relevant information. We'll use the Scraping Course eCommerce example again for this task.

As with the Python example, we'll get the plain HTML result and load it into cheerio, which lets us query elements as we would in the browser or with jQuery. For each product in the list, we return an object with its essential data. We then parse that list into CSV using json2csv (no flatten transform is needed this time) and, lastly, store the result. These last two steps work the same as in the autoparse case.

scraper.js
// npm install zenrows json2csv cheerio
const fs = require("fs");
const cheerio = require("cheerio");
const { Parser } = require("json2csv");
const { ZenRows } = require("zenrows");

(async () => {
    const client = new ZenRows("YOUR_ZENROWS_API_KEY");
    const url = "https://www.scrapingcourse.com/ecommerce/";

    // Request the plain HTML and load it into cheerio
    const { data } = await client.get(url);
    const $ = cheerio.load(data);

    // Collect the essential data for each product entry
    const content = $(".product").map((_, product) => ({
        product_name: $(product).find(".product-name").text().trim(),
        price: $(product).find(".price").text().trim(),
        rating: $(product).find(".rating").text().trim() || "N/A",
        link: $(product).find(".product-link").attr("href"),
    }))
    .toArray();

    const parser = new Parser();
    const csv = parser.parse(content);

    fs.writeFileSync("result.csv", csv);
})();
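
Real scraping runs can fail on network errors or on a page whose layout has changed, so it's worth adding basic error handling. Here's a minimal sketch of the same flow wrapped in try/catch:

// npm install zenrows json2csv cheerio
const fs = require("fs");
const cheerio = require("cheerio");
const { Parser } = require("json2csv");
const { ZenRows } = require("zenrows");

(async () => {
    const client = new ZenRows("YOUR_ZENROWS_API_KEY");
    const url = "https://www.scrapingcourse.com/ecommerce/";

    try {
        const { data } = await client.get(url);
        const $ = cheerio.load(data);

        // Same per-product extraction as above
        const content = $(".product").map((_, product) => ({
            product_name: $(product).find(".product-name").text().trim(),
            price: $(product).find(".price").text().trim(),
        })).toArray();

        if (content.length === 0) {
            // Selectors matched nothing; the page layout may have changed
            throw new Error("No products found on the page");
        }

        fs.writeFileSync("result.csv", new Parser().parse(content));
    } catch (error) {
        // Log the failure instead of crashing mid-run
        console.error("Scrape failed:", error.message);
    }
})();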

If you encounter any issues or need further assistance with your scraper setup, please contact us, and we’ll be happy to help!