The response_type=plaintext parameter tells ZenRows to strip all HTML tags and formatting from the scraped page and return only the raw text content. You get the words on the page, nothing else. This is useful when you don’t need structure or markup, just the text itself. It’s a good fit for keyword analysis, sentiment analysis, full-text search indexing, or any pipeline where clean, unformatted content is easier to work with than HTML or Markdown.
response_type=plaintext cannot be combined with the outputs parameter. Use one or the other depending on whether you need targeted data extraction or a plain text version of the full page.

How it works

ZenRows fetches the target page, renders it if needed, and strips all HTML tags from the resulting content. The output preserves line breaks between block-level elements but removes all formatting: no headings, no bold, no links, no list syntax. For example, given the following HTML:
HTML
<h1>Product Title</h1>
<p>This is a great product that does many things.</p>
<ul>
    <li>Feature 1</li>
    <li>Feature 2</li>
    <li>Feature 3</li>
</ul>
The plain text response returns:
Plain text
Product Title

This is a great product that does many things.

Feature 1
Feature 2
Feature 3

Basic usage

Add response_type=plaintext to your request parameters:
import requests

url = "https://www.scrapingcourse.com/ecommerce/"
apikey = "YOUR_ZENROWS_API_KEY"

params = {
    "url": url,
    "apikey": apikey,
    "response_type": "plaintext",
}

response = requests.get("https://api.zenrows.com/v1/", params=params)
print(response.text)

When to use plain text response

Plain text is the right choice when structure doesn’t matter and you just need the words. It works well for:
  • NLP and text analysis: run sentiment analysis, keyword extraction, or topic modeling directly on the output without pre-processing HTML.
  • Full-text search indexing: feed clean text into search engines like Elasticsearch or Typesense without stripping tags in your pipeline.
  • LLM pipelines with minimal context: when you want to reduce token usage further than Markdown allows and don’t need headings or list structure preserved.
  • Content comparison and diffing: compare page text across time or across URLs without formatting noise affecting the diff.
  • Readability and summarization tools: pass clean prose to summarization models or readability scoring systems.
If you need the page structure preserved (headings, lists, links), use response_type=markdown instead. If you only need specific data types like emails or tables, use the outputs parameter.
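As an illustration of the content comparison use case, here is a minimal sketch that diffs two plain text snapshots of the same page using Python's standard difflib module. The function name diff_snapshots and the sample strings are hypothetical; in practice the inputs would be two response.text values saved at different times.

```python
import difflib


def diff_snapshots(old_text, new_text):
    """Return unified-diff lines between two plain text snapshots.

    Hypothetical helper: the inputs are assumed to be plain text
    responses saved from two separate requests to the same URL.
    """
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="previous",
            tofile="current",
            lineterm="",  # input lines have no trailing newline
        )
    )


# Made-up sample data standing in for two saved responses.
previous = "Product Title\nPrice: 10\nIn stock"
current = "Product Title\nPrice: 12\nIn stock"

for line in diff_snapshots(previous, current):
    print(line)
```

Because the output carries no markup, a changed price or sentence shows up as a single changed line rather than being buried in tag or attribute differences.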

Best practices

Enable js_render for dynamic pages:
If the page loads content via JavaScript, combine response_type=plaintext with js_render=true. Without it, the output will only reflect the initial HTML and may be missing key content.
Use wait or wait_for when content loads after a delay:
For pages where content appears after a delay or user interaction, add wait (milliseconds) or wait_for (CSS selector) to ensure the full content is present before the text is extracted.
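The two practices above can be combined in one small helper. This is a sketch, not an official client: build_plaintext_params is a hypothetical function, the ".product" selector is a placeholder, and the helper assumes that waiting is only meaningful when rendering is enabled, so it turns js_render on whenever a wait option is supplied.

```python
def build_plaintext_params(url, apikey, js_render=False, wait_ms=None, wait_for=None):
    """Build query parameters for a plain text request (hypothetical helper).

    Assumption: wait and wait_for are only useful with JavaScript
    rendering, so js_render is enabled whenever either one is set.
    """
    params = {
        "url": url,
        "apikey": apikey,
        "response_type": "plaintext",
    }
    if js_render or wait_ms is not None or wait_for is not None:
        params["js_render"] = "true"
    if wait_ms is not None:
        params["wait"] = str(wait_ms)  # delay in milliseconds
    if wait_for is not None:
        params["wait_for"] = wait_for  # CSS selector to wait for
    return params


# ".product" is a placeholder selector; use one that only appears
# once the main content of your target page has loaded.
params = build_plaintext_params(
    "https://www.scrapingcourse.com/ecommerce/",
    "YOUR_ZENROWS_API_KEY",
    wait_for=".product",
)
print(params)
```

Pass the resulting dictionary as params to the same requests.get call shown in Basic usage.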
Expect navigation and boilerplate in the output:
Plain text response strips tags but converts the full page, including headers, footers, navigation menus, and sidebars. If you need only the main body content, post-process the output to remove boilerplate, or evaluate whether css_extractor with a targeted selector fits your use case better.
Find more details on our CSS Extractor Documentation.
Credits are charged on successful responses:
A request using response_type=plaintext is charged when the API returns a 200 status code, regardless of whether the output contains the content you expected. Test on a small set of URLs before running at scale.
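One way to run that small-scale test is sketched below. The check_sample function and its fetch argument are hypothetical: fetch stands for any callable you write that performs the request from Basic usage and returns a (status_code, text) pair. The 200-character threshold is an arbitrary assumption for flagging suspiciously short output, not a ZenRows rule.

```python
def check_sample(urls, fetch, min_chars=200):
    """Fetch a small sample of URLs and flag suspiciously short results.

    fetch is a hypothetical callable: fetch(url) -> (status_code, text).
    min_chars is an arbitrary threshold; tune it for your own pages.
    """
    report = []
    for url in urls:
        status, text = fetch(url)
        length = len(text.strip())
        report.append(
            {
                "url": url,
                "status": status,
                "chars": length,
                # A 200 status alone does not prove the content is complete.
                "looks_ok": status == 200 and length >= min_chars,
            }
        )
    return report


# Stand-in fetcher with canned responses so the sketch runs offline.
def fake_fetch(url):
    canned = {
        "https://example.com/full": (200, "word " * 100),
        "https://example.com/empty": (200, "Loading..."),
    }
    return canned[url]


for row in check_sample(["https://example.com/full", "https://example.com/empty"], fake_fetch):
    print(row)
```

A row with status 200 but looks_ok set to False is a request that was charged without returning useful text, which usually points back to the js_render and wait_for advice above.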

Troubleshooting

If the output is missing content you can see in a browser, the page likely renders that content via JavaScript. Add js_render=true to your request so ZenRows renders the page fully before extracting the text. If content still appears incomplete, add wait_for with a CSS selector that is only present once the main content has loaded.
response_type=plaintext and outputs are mutually exclusive. response_type=plaintext converts the entire page to plain text, while outputs performs targeted data extraction. You can only use one per request. Choose outputs when you need specific data types, and response_type=plaintext when you need clean, unformatted text from the full page.
If the output contains excessive blank lines, that is a side effect of how the text is extracted: plain text conversion preserves line breaks between block-level elements, which can leave multiple blank lines between sections depending on how the page is structured. Post-process the output with a simple whitespace normalization step in your code before further processing.
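A whitespace normalization step can be as small as the following sketch. The function name normalize_whitespace is hypothetical, and collapsing every run of blank lines to a single blank line is one reasonable policy among several; adjust it if your pipeline needs paragraph spacing preserved differently.

```python
import re


def normalize_whitespace(text):
    """Collapse runs of blank lines and trim stray spaces (hypothetical helper)."""
    # Use "\n" as the only line ending.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Remove spaces and tabs at the end of each line, including
    # lines that contain nothing but whitespace.
    text = re.sub(r"[ \t]+\n", "\n", text)
    # Collapse three or more newlines into two (one blank line).
    text = re.sub(r"\n{3,}", "\n\n", text)
    # Drop leading and trailing whitespace on the whole document.
    return text.strip()


raw = "Product Title\n\n\n\nThis is a great product.  \n\n\n\n\nFeature 1\nFeature 2\n"
print(normalize_whitespace(raw))
```

Apply it to response.text before indexing, diffing, or sending the content to a model.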

Pricing

The response_type=plaintext parameter is included at no additional cost with all ZenRows requests. You only pay extra for JavaScript Render and Premium Proxy when used.
You can monitor your ZenRows usage in multiple ways to stay informed about your account activity and prevent unexpected overages.
Dashboard monitoring: View real-time usage statistics, remaining requests, success rates, and request history on your Analytics Page. You can also set up usage alerts in your notification settings to receive notifications when you approach your limits.
Programmatic monitoring: For automated monitoring in your applications, call the /v1/subscriptions/self/details endpoint with your API key in the X-API-Key header. This returns real-time usage data that you can integrate into your monitoring systems. Learn more about the usage endpoint.
Response header monitoring: Track your concurrency usage through response headers included with each request:
  • Concurrency-Limit: Your maximum concurrent requests
  • Concurrency-Remaining: Available concurrent request slots
  • X-Request-Cost: Cost of the current request
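The headers listed above can be read from any response object. The sketch below uses a hypothetical read_usage_headers helper and assumes the header values are numeric strings; anything missing or unparseable is returned as None rather than raising. The sample values are made up for illustration.

```python
def read_usage_headers(headers):
    """Extract usage-related headers from a response (hypothetical helper).

    headers can be any mapping with a .get method, such as the
    response.headers object returned by requests.
    Assumption: the values are numeric strings.
    """

    def to_number(value, cast):
        try:
            return cast(value)
        except (TypeError, ValueError):
            return None  # header missing or not numeric

    return {
        "concurrency_limit": to_number(headers.get("Concurrency-Limit"), int),
        "concurrency_remaining": to_number(headers.get("Concurrency-Remaining"), int),
        "request_cost": to_number(headers.get("X-Request-Cost"), float),
    }


# Made-up header values standing in for response.headers.
sample_headers = {
    "Concurrency-Limit": "20",
    "Concurrency-Remaining": "19",
    "X-Request-Cost": "0.001",
}
print(read_usage_headers(sample_headers))
```

With a live request, call read_usage_headers(response.headers) right after requests.get and log the result alongside the URL.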

FAQ (Frequently Asked Questions)

Yes, you can combine response_type=plaintext with js_render=true to render JavaScript before the text extraction. This is recommended for any page that loads content dynamically.
Both response_type=markdown and response_type=plaintext strip HTML tags, but response_type=markdown preserves the semantic structure of the page by converting headings, lists, links, and tables into Markdown syntax. response_type=plaintext removes all formatting entirely and returns only the raw text. Use Markdown when structure matters, and plain text when you just need the words.
No, the response_type=plaintext parameter does not add extra credit cost on its own. Credits are calculated based on the other parameters you use, such as js_render or premium_proxy, exactly as they would be for a standard request.
No, response_type=plaintext is a parameter of the Universal Scraper API. It is not available in Scraping Browser (CDP/Playwright) sessions. In a Scraping Browser session, you control the page directly and can extract text content using standard browser APIs like document.body.innerText.
Yes. Plain text is the most token-efficient format for LLM input since it contains no markup at all. It works well when the structure of the page is irrelevant to your task. If your LLM prompt benefits from knowing which text was a heading or a list item, use response_type=markdown instead.