Firecrawl
Web scraping API and agent (/agent) that crawls, extracts and navigates complex sites to feed AI systems with structured data.
Delv Safety Grade: B
Score 72/100 · assessed 2026-04-18
Firecrawl is a commercial web scraping service with an autonomous agent mode (/agent) that navigates complex sites. The company maintains active open-source repositories and offers both API and CLI access. The service requires an API key and runs on Firecrawl's cloud infrastructure, so scraped data passes through its systems. Permissions are moderately broad: outbound network access to arbitrary URLs, potential execution of navigation logic on target sites, and filesystem writes for caching. The freemium model and active GitHub presence (2.8k+ stars) suggest a legitimate operation, but users should understand that they are delegating scraping to a third-party service. There are no known security incidents, though agent mode implies more complex execution paths than simple HTTP scraping. The supply chain is standard (packages available on npm and PyPI) with reasonable transparency. Suitable for non-sensitive scraping tasks where third-party processing is acceptable.
Green flags
- Active open-source repository with 2.8k+ GitHub stars
- Clear documentation and API reference available
- Standard package distribution via npm and PyPI
- Commercial entity with identifiable team and support channels
- No known security incidents or credential leaks
Red flags
- Scraped data passes through third-party cloud infrastructure
- Autonomous agent mode implies complex execution and navigation logic
- Requires an API key, which enables usage tracking and potential data retention
- Broad network access to arbitrary user-specified URLs
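
Because the service will fetch any URL it is handed and the results pass through third-party infrastructure, one way to limit exposure is to vet URLs on your side before delegating them. The following is a minimal sketch using only the Python standard library; the `ALLOWED_DOMAINS` set and the `is_safe_to_delegate` helper are hypothetical names introduced for illustration and are not part of Firecrawl.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical allowlist: replace with the public domains you intend to scrape.
ALLOWED_DOMAINS = {"example.com", "docs.example.org"}


def is_safe_to_delegate(url: str) -> bool:
    """Return True only for http(s) URLs on an allowlisted public domain."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    if not host:
        return False
    # Reject literal IP addresses outright (covers private and loopback ranges).
    try:
        ipaddress.ip_address(host)
        return False
    except ValueError:
        pass
    # Accept the domain itself or any subdomain of an allowlisted entry.
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)
```

An allowlist is stricter than a blocklist of internal ranges, which suits the "non-sensitive scraping only" recommendation: anything not explicitly approved never leaves your network.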
Review
Pay for Firecrawl if you're building pipelines that ingest public web data into AI systems and you'd rather not maintain scraping infrastructure. Skip it if you need fine-grained control or authentication support, or if you're scraping fewer than a dozen pages a month.
Good at
- Markdown output is consistently clean, ideal for LLM ingestion
- Agent mode handles pagination and multi-page navigation without custom code
- Free tier sufficient for prototyping and small-scale research
- Faster setup than Apify or Puppeteer for straightforward crawls
- API-first design makes it easy to integrate into existing pipelines
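
To give a sense of what an API-first integration might look like, here is a rough sketch that builds (but does not send) a scrape request asking for Markdown output. The endpoint URL is a placeholder, and the payload field names (`url`, `formats`) and Bearer-token header are assumptions about the general shape of the request; check the official API reference for the actual values. `build_scrape_request` is a hypothetical helper, not an SDK function.

```python
import json
import urllib.request

# Hypothetical placeholder endpoint: substitute the URL from the official API reference.
SCRAPE_ENDPOINT = "https://api.example.com/v1/scrape"


def build_scrape_request(target_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for Markdown output."""
    # Field names are assumptions for illustration.
    payload = {"url": target_url, "formats": ["markdown"]}
    return urllib.request.Request(
        SCRAPE_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

In a pipeline you would pass the result to `urllib.request.urlopen` and read the API key from an environment variable rather than hard-coding it.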
Watch out
- Rate limits and pricing scale quickly for large crawls
- No authentication or session management for gated content
- Agent reasoning is shallow, won't adapt to unexpected site changes
- Limited error recovery if a crawl hits a dead end mid-navigation
- CLI is bare-bones, mostly just wraps the API
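
Given the rate limits and limited error recovery noted above, large crawls usually need client-side retry logic. Below is a generic exponential-backoff sketch, not anything Firecrawl ships; `RateLimited` is a hypothetical exception that your own calling code would raise when the service reports it is throttling you (for example, on an HTTP 429 response).

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error the caller raises when the service throttles a request."""


def with_backoff(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on RateLimited, doubling the wait after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        try:
            return call()
        except RateLimited:
            attempt += 1
            if attempt >= max_attempts:
                raise  # Out of attempts: surface the error to the caller.
            # Waits base_delay, then 2x, then 4x, and so on.
            sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` parameter is injectable so the retry schedule can be checked in tests without actually waiting.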
Use cases
- data extraction
- crawl-to-markdown
- agent research