Delv
Research · by Elicit · 4.3

Elicit

Research agent for academic literature. Decomposes a question, finds papers, extracts methods + findings into a comparison table.

Safety & Trust

Delv Safety Grade: B

Score 72/100 · assessed 2026-04-18

  • Maintainer: 75
  • Permissions: 65
  • Supply chain: 60
  • Transparency: 55
  • Incidents: 100

Elicit is a commercial research agent from a venture-backed company (Elicit, which spun out of the research lab Ought) focused on AI-assisted literature review. The maintainer is a legitimate mid-size organisation with academic roots and transparent funding. However, it operates as a closed-source web service with no public repository, which makes independent supply-chain verification impossible. The agent autonomously queries academic databases, extracts structured data from papers, and synthesises findings without human review of each step. Permissions are moderately scoped: it reads from external academic APIs and likely uses external LLMs for extraction, but does not touch your filesystem or execute code locally. Transparency is limited by the closed-source model and the lack of a public incident history. There are no known security incidents, but the autonomous nature and opaque processing pipeline mean you are trusting Elicit's infrastructure and model behaviour without independent audit.

Green flags

  • Legitimate organisation (Ought/Elicit) with known academic AI research roots
  • Scoped to academic literature: no filesystem, shell, or payment access
  • Web-only deployment reduces local supply-chain risk
  • Active product with regular feature updates and user community
  • No known security incidents or credential leaks to date

Red flags

  • Closed source with no public repository or code audit trail
  • Autonomous extraction and synthesis without step-by-step human review
  • Opaque LLM usage: unclear which models process your research queries
  • Freemium model may create incentive to upsell or limit free-tier scrutiny
  • No public changelog or incident disclosure mechanism visible

Permissions requested

  • Outbound network
  • External LLM call
  • DB read

Assessed by Delv Editorial using public metadata. Grades are advisory and update as the ecosystem changes. They do not replace your own review of permissions and code before granting an agent access to sensitive systems.

Pricing

FREEMIUM

Platforms

web

Review

Elicit treats literature review like a structured workflow rather than a search engine. You give it a research question, it breaks that question into sub-queries, then hunts for papers and extracts specific claims into a comparison table. The autonomy here is real: instead of reading twenty abstracts yourself, you get a grid showing how different studies measured the same outcome, what their sample sizes were, whether they found an effect.

I used it to prep a grant section on sleep interventions for adolescents. Elicit pulled eight RCTs I'd missed, extracted their primary endpoints, and flagged two that used wrist actigraphy instead of self-report. That specificity saved me three hours of skimming PDFs.

The extraction quality varies. For well-structured papers with clear methods sections, it nails the details. For older scans or papers with unusual formatting, you get blanks or vague summaries. It also leans heavily on abstracts and conclusions, so if a paper buries a key limitation in the discussion, Elicit might miss it.

The comparison table is the killer feature: seeing five studies side-by-side, with extracted sample sizes and effect directions, makes patterns obvious. But you still need to read the original papers for anything you plan to cite. Elicit is a triage tool, not a replacement for critical reading.

Compared to Consensus, which focuses on yes/no synthesis across many papers, Elicit is better when you need granular method-level detail. Consensus tells you the weight of evidence; Elicit tells you how each study got there. For systematic reviews or grant writing, that distinction matters.

The free tier gives you a handful of queries per month, enough to evaluate it. The paid tier is worth it if you write literature reviews regularly, less so if you only need this twice a year.

Failure mode: it struggles with interdisciplinary questions where terminology shifts between fields. Ask about 'cognitive load' in education versus HCI and you'll get papers talking past each other in the same table. Also, no citation export to Zotero or Mendeley yet, which is baffling for a research tool.

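
To make the triage idea concrete, here is a minimal Python sketch of a comparison grid like the one described above. It is not Elicit's API or data model, which is not public. The `StudyRow` fields, the helper functions, and the placeholder rows are all invented for illustration. The sketch shows two things the review relies on: grouping studies by how they measured the outcome, and spotting blank extractions that need a manual read.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class StudyRow:
    """One row of a hypothetical comparison table (not Elicit's schema)."""
    title: str
    sample_size: Optional[int]  # None models a blank extraction
    measure: Optional[str]      # e.g. "actigraphy" or "self-report"
    effect: Optional[str]       # e.g. "positive", "none"


def needs_manual_check(row: StudyRow) -> bool:
    """A row with any blank field should be read in the original paper."""
    return row.sample_size is None or row.measure is None or row.effect is None


def group_by_measure(rows: List[StudyRow]) -> Dict[str, List[str]]:
    """Group study titles by measurement method; blanks go under 'unknown'."""
    groups: Dict[str, List[str]] = {}
    for row in rows:
        groups.setdefault(row.measure or "unknown", []).append(row.title)
    return groups


# Placeholder rows, invented for illustration only.
rows = [
    StudyRow("Placeholder study A", 120, "actigraphy", "positive"),
    StudyRow("Placeholder study B", 85, "self-report", "none"),
    StudyRow("Placeholder study C", None, None, "positive"),
]

for measure, titles in group_by_measure(rows).items():
    print(measure, titles)

print("Read in full:", [r.title for r in rows if needs_manual_check(r)])
```

The same check works on any exported table: rows with blanks are the ones where the extraction probably failed, so they go to the top of the reading pile.
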
Verdict

Pay for Elicit if you write grant proposals, systematic reviews, or need to compare study methods at scale. Skip it if you mostly need broad consensus answers or only review literature occasionally. The free tier is generous enough to test whether the extraction quality meets your standards.

Good at

  • Extracts specific methods and findings into comparison tables, not just summaries
  • Breaks complex questions into sub-queries autonomously, surfaces papers you'd miss
  • Side-by-side study comparison makes methodological patterns obvious
  • Free tier gives enough queries to evaluate properly
  • Faster triage than reading twenty abstracts manually

Watch out

  • Extraction quality drops for older papers or unusual formatting
  • Leans on abstracts, can miss limitations buried in discussion sections
  • Struggles with interdisciplinary questions where terminology shifts
  • No citation export to Zotero or Mendeley
  • Still requires reading original papers for anything you'll cite

Use cases

  • Lit reviews for grant applications
  • Side-by-side comparisons of studies
  • Finding the strongest counter-evidence
  • Academic writing prep