Delv
Browser · Active · 6d · by Firecrawl · 4.1

Firecrawl

Web scraping API and agent (/agent) that crawls, extracts and navigates complex sites to feed AI systems with structured data.

Safety & Trust

Delv Safety Grade: B

Score 72/100 · assessed 2026-04-18

  • Maintainer: 75
  • Permissions: 60
  • Supply chain: 75
  • Transparency: 85
  • Incidents: 100

Firecrawl is a commercial web scraping service with an autonomous agent mode that navigates complex sites. The company (Firecrawl) maintains active open-source repositories and offers both API and CLI access. The service requires API keys and operates via cloud infrastructure, meaning scraped data passes through their systems. Permissions are moderately broad: outbound network access to arbitrary URLs, potential for executing navigation logic on target sites, and filesystem writes for caching. The freemium model and active GitHub presence (2.8k+ stars) suggest legitimate operation, but users should understand they're delegating scraping to a third-party service. No known security incidents, though the autonomous agent mode (/agent) implies more complex execution paths than simple HTTP scraping. Supply chain is standard (npm/pypi packages available) with reasonable transparency. Suitable for non-sensitive scraping tasks where third-party processing is acceptable.

Green flags

  • Active open-source repository with 2.8k+ GitHub stars
  • Clear documentation and API reference available
  • Standard package distribution via npm and Python
  • Commercial entity with identifiable team and support channels
  • No known security incidents or credential leaks

Red flags

  • Scraped data passes through third-party cloud infrastructure
  • Autonomous agent mode implies complex execution and navigation logic
  • Requires API key with usage tracking and potential data retention
  • Broad network access to arbitrary user-specified URLs

Permissions requested

  • Outbound network
  • Browser control
  • Write files
  • Access secrets

Assessed by Delv Editorial using public metadata. Grades are advisory and update as the ecosystem changes. They do not replace your own review of permissions and code before granting an agent access to sensitive systems.

Pricing

Freemium: free tier, usage-based

Platforms

  • API
  • CLI

Review

Firecrawl sits in the awkward middle ground between a simple scraping library and a full research agent. The core product is a REST API that crawls websites and returns clean Markdown or structured JSON. The /agent endpoint adds a thin layer of autonomy: you describe what you want extracted, and it navigates multi-page sites, follows links, and assembles results without you writing selectors.

I've used it to pull product catalogues from e-commerce sites where pagination and lazy-loading would break simpler tools. The agent mode saved me from manually chaining requests or writing Puppeteer scripts. You send a prompt like "extract all product names, prices, and SKUs from this category", and it figures out the crawl path. It works well on sites with consistent structure. The Markdown output is genuinely clean, which matters when you're feeding it straight into an LLM context window.

The autonomy is real but narrow. It navigates and extracts, but it won't reason about what to do with the data or iterate on failures. If a site changes its layout mid-crawl, you're debugging manually. The free tier is generous enough for prototyping, but usage-based pricing climbs fast if you're scraping at scale. I hit rate limits sooner than expected on a 500-page crawl.

Compared to Apify, Firecrawl is faster to set up and better at Markdown conversion. Apify gives you more control and a marketplace of pre-built scrapers, but you're writing more code. Compared to Jina Reader, Firecrawl handles multi-page crawls where Jina is single-URL focused. For one-off research tasks, I'd still reach for a general-purpose agent like Skyvern, which can handle form submission and authentication.

Firecrawl shines when you need repeatable, structured extraction from public sites with predictable navigation. The CLI is minimal, mostly a wrapper around the API. The real value is the /agent endpoint saving you from selector hell.

If your workflow is "crawl site, get clean data, feed to LLM", this does exactly that. If you need deep customisation or want to scrape authenticated content, you'll hit the ceiling quickly.
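
The prompt-driven flow described above can be sketched as a single HTTP request. This is a minimal sketch, not Firecrawl's documented API: the base URL is a placeholder, and the JSON field names (`url`, `prompt`) and the Bearer-token header are assumptions. Only the /agent path and the example prompt come from this review. The function builds the request without sending it, so check the official API reference for the real request shape before wiring it up.

```python
import json
import os
import urllib.request

# Placeholder base URL (assumption). Only the "/agent" path is taken from the review.
API_BASE = "https://api.example.invalid"
AGENT_PATH = "/agent"


def build_agent_request(start_url: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a prompt-driven extraction.

    The "url" and "prompt" field names and the Bearer auth scheme are
    assumptions made for illustration, not confirmed API details.
    """
    body = json.dumps({"url": start_url, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        API_BASE + AGENT_PATH,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    req = build_agent_request(
        "https://example.com/category/widgets",
        "extract all product names, prices, and SKUs from this category",
        # Read the key from the environment rather than hard-coding it.
        os.environ.get("FIRECRAWL_API_KEY", "placeholder-key"),
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any scraped data leaves your machine, which matters given that everything passes through a third-party service.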

Verdict

Pay for Firecrawl if you're building pipelines that ingest public web data into AI systems and you'd rather not maintain scraping infrastructure. Skip it if you need fine-grained control, authentication support, or you're scraping fewer than a dozen pages a month.

Good at

  • Markdown output is consistently clean, ideal for LLM ingestion
  • Agent mode handles pagination and multi-page navigation without custom code
  • Free tier sufficient for prototyping and small-scale research
  • Faster setup than Apify or Puppeteer for straightforward crawls
  • API-first design makes it easy to integrate into existing pipelines

Watch out

  • Rate limits and pricing scale quickly for large crawls
  • No authentication or session management for gated content
  • Agent reasoning is shallow, won't adapt to unexpected site changes
  • Limited error recovery if a crawl hits a dead end mid-navigation
  • CLI is barebones, mostly just wraps the API
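
The rate-limit and error-recovery caveats above usually end up handled on the client side. Below is a minimal sketch of retrying with exponential backoff. `RateLimited` and `with_backoff` are hypothetical names, and it assumes your own request code raises `RateLimited` when the service signals throttling (HTTP 429 is the standard status for that, but how Firecrawl reports it is not covered in this review).

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error raised by the caller's own code on a throttled response."""


def with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on RateLimited with delays of base_delay * 2**attempt."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

Wrap each page request as `with_backoff(lambda: fetch_page(url))`, where `fetch_page` stands in for your own function. The `sleep` parameter is injectable so the retry logic can be tested without waiting.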

Use cases

  • data extraction
  • crawl-to-markdown
  • agent research