TIGER-Lab at the University of Waterloo released OpenResearcher, a fully open pipeline for training deep research agents, the kind that can search, read, and reason across dozens of web pages to answer hard questions. The model is a 30B-parameter mixture-of-experts architecture with only 3B active parameters. What makes it unusual: the entire training loop runs offline. The team bootstrapped a 15M-document corpus from FineWeb, then used three browser primitives (search, open, find) to synthesize 97K training trajectories without ever calling a live web API. Some trajectories stretch to 100+ tool calls. After supervised fine-tuning, the model scores 54.8% on BrowseComp-Plus, ranking #1 among open-source models and beating GPT-4.1, Claude Opus 4, Gemini 2.5 Pro, DeepSeek-R1, and Alibaba’s Tongyi DeepResearch.
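To make the offline loop concrete, here is a minimal sketch of the three browser primitives (search, open, find) running over an in-memory corpus. The tool names come from the post above; the corpus format, keyword scoring, and snippet extraction are illustrative assumptions, not the paper's implementation (the primitive is called `open` in the paper, renamed `open_doc` here to avoid shadowing Python's builtin).

```python
# Toy offline corpus standing in for the 15M FineWeb-derived documents.
CORPUS = {
    "doc1": "FineWeb is a large pretraining corpus derived from Common Crawl.",
    "doc2": "Mixture-of-experts models activate only a subset of parameters per token.",
}

def search(query: str, k: int = 5) -> list[str]:
    """Rank doc ids by naive keyword overlap with the query."""
    terms = set(query.lower().split())
    scored = [
        (sum(t in text.lower() for t in terms), doc_id)
        for doc_id, text in CORPUS.items()
    ]
    scored.sort(reverse=True)
    return [doc_id for score, doc_id in scored[:k] if score > 0]

def open_doc(doc_id: str) -> str:
    """Fetch the full text of a document by id."""
    return CORPUS.get(doc_id, "")

def find(doc_id: str, keyword: str) -> list[str]:
    """Return sentences in a document that mention the keyword."""
    text = open_doc(doc_id)
    return [s.strip() for s in text.split(".")
            if keyword.lower() in s.lower() and s.strip()]
```

Because every tool call resolves against a fixed local index rather than a live API, the same trajectory can be replayed deterministically and synthesized at scale — which is the property the offline design exploits.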
Training deep research agents has been gated by expensive, rate-limited web APIs. Every search query and page fetch costs money, and the results are hard to reproduce. OpenResearcher removes that dependency entirely: the offline corpus and trajectory synthesis pipeline can scale to arbitrary dataset sizes at near-zero marginal cost. Any research lab can now train a competitive deep research agent on standard hardware. NVIDIA has already adopted the approach for its Nemotron model family.
Two weeks ago, OpenSeeker showed that 11,700 high-quality samples could match proprietary search agents. Now OpenResearcher proves you don’t even need live web access to collect those samples. The bottleneck for training research agents has shifted from data access to trajectory design.
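If trajectory design is now the bottleneck, it helps to picture what one of those 97K records might hold. The schema below is a hypothetical sketch — the post does not show OpenResearcher's actual data format, so every field name here is an assumption — but it captures the shape implied above: a synthesized question, a long sequence of tool calls (potentially 100+), and a final answer for supervised fine-tuning.

```python
from dataclasses import dataclass, field

@dataclass
class ToolCall:
    tool: str    # one of the three primitives: "search", "open", "find"
    args: dict   # e.g. {"query": "..."} or {"doc_id": "...", "keyword": "..."}
    result: str  # tool output fed back into the model's context

@dataclass
class Trajectory:
    question: str  # synthesized multi-hop question
    steps: list[ToolCall] = field(default_factory=list)  # can exceed 100 calls
    answer: str = ""  # final grounded answer used as the SFT target

# Example record (contents invented for illustration):
traj = Trajectory(question="Which corpus was the offline index built from?")
traj.steps.append(ToolCall("search", {"query": "offline index corpus"}, "doc1: FineWeb ..."))
traj.answer = "FineWeb"
```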
Sources:
- OpenResearcher Paper (arXiv)
- OpenResearcher GitHub (585 stars)
- OpenResearcher-30B-A3B Model (HuggingFace)
- OpenResearcher Dataset: 97K Trajectories (HuggingFace)
- HuggingFace Daily Papers, March 24 (#2 Paper of the Day)
Citation
@misc{kabui2026,
  author = {Kabui, Charles},
  title  = {OpenResearcher: An Open-Source Agent That Beats {GPT-4.1} at Deep Research, Trained Entirely Offline},
  date   = {2026-03-29},
  url    = {https://toknow.ai/posts/openresearcher-offline-deep-research-agent-tiger-lab/},
  langid = {en-GB}
}
