
MiroThinker-1.7 mini activates 3B parameters, matches GPT-5 on reasoning benchmarks | UncensoredHub



MiroMind released open-weight deep research agents built on Qwen3 MoE. The mini variant activates only 3B of its 30B total parameters and scores above GPT-5 on the GAIA benchmark (80.3 vs 76.4).

May 16, 2026


On May 17, MiroMind released MiroThinker-1.7-deepresearch and MiroThinker-1.7-mini-deepresearch, two open-weight research agents built on Qwen3 MoE, with weights available on Hugging Face. The mini model has 30B total parameters, of which only 3B are active at inference, and is designed to fit mid-range consumer GPUs while handling multi-step research tasks. On the GAIA benchmark, the full 1.7 model scores 82.7 and the mini scores 80.3, both ahead of GPT-5's 76.4. On BrowseComp, the full model reaches 74.0 versus GPT-5's 54.9.
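As a rough guide to what "fits a consumer GPU" might mean, the sketch below estimates the memory needed for the weights alone at a few precisions. It assumes that all 30B parameters stay resident in memory, as is typical for MoE inference without expert offloading, so the 3B active figure reduces compute per token but not the weight footprint. The function name and the chosen bit widths are illustrative, and the estimate ignores the KV cache, activations, and quantization overhead.

```python
def weight_memory_gb(total_params_billion: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    return total_params_billion * bits_per_param / 8


TOTAL_B = 30.0   # total parameters quoted in the article, in billions
ACTIVE_B = 3.0   # active parameters quoted in the article, in billions

# Share of the parameters used at inference.
active_fraction = ACTIVE_B / TOTAL_B

# Weight footprint at 16-bit, 8-bit, and 4-bit precision (illustrative choices).
estimates = {bits: weight_memory_gb(TOTAL_B, bits) for bits in (16, 8, 4)}

if __name__ == "__main__":
    print(f"active fraction: {active_fraction:.0%}")
    for bits, gb in estimates.items():
        print(f"{bits:>2}-bit weights: ~{gb:.0f} GB")
```

Under these assumptions, whether the model fits a given card depends mainly on the quantization level, since the weight footprint scales with total rather than active parameters.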

The architecture uses a sliding-window context manager (K=5 with episode restarts) rather than full-context retention, trading long-context coherence for memory efficiency. Real-world throughput on consumer hardware depends on quantization, context length, and whether the episode-restart overhead impacts tokens-per-second. The team is actively seeking local inference reports and feedback on the context-management trade-offs from practitioners running long-context agents.
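The sliding-window idea can be illustrated with a minimal sketch. It assumes that K=5 means only the five most recent observations are kept alongside the task prompt, and that an episode restart clears the window after a fixed number of steps. The class name `SlidingWindowContext`, the `max_steps_per_episode` threshold, and the restart rule are all hypothetical and are not taken from the MiroThinker code or paper.

```python
from collections import deque


class SlidingWindowContext:
    """Hypothetical context manager: task prompt plus the K most recent steps."""

    def __init__(self, task, k=5, max_steps_per_episode=20):
        self.task = task
        self.k = k
        # Assumed restart threshold; the article does not say what triggers a restart.
        self.max_steps_per_episode = max_steps_per_episode
        self.window = deque(maxlen=k)  # older observations are dropped automatically
        self.episode = 0
        self.steps_in_episode = 0

    def add_step(self, observation):
        # Start a fresh episode once the step budget is used up.
        if self.steps_in_episode >= self.max_steps_per_episode:
            self.restart()
        self.window.append(observation)
        self.steps_in_episode += 1

    def restart(self):
        # Drop all retained observations and begin a new episode.
        self.window.clear()
        self.episode += 1
        self.steps_in_episode = 0

    def prompt(self):
        # The model only ever sees the task plus at most K observations.
        return "\n".join([self.task, *self.window])


ctx = SlidingWindowContext(task="Find the release date of X", k=5, max_steps_per_episode=8)
for i in range(10):
    ctx.add_step(f"observation {i}")
```

In this sketch the prompt length is bounded by K regardless of how many steps the agent takes, which is the memory saving. The cost is that anything outside the window, or from before a restart, is no longer visible to the model.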

Benchmark comparison

The arXiv preprint (2603.15726) shows MiroThinker-1.7 trailing Qwen3.5-397B on BrowseComp (74.0 vs 78.6) and HLE-Text (42.9 vs 48.3) but leading DeepSeek-V3.2 on both the Chinese and English BrowseComp variants. The mini model's 80.3 GAIA score is 3.9 points higher than GPT-5's 76.4, suggesting that MoE sparsity preserves multi-step reasoning performance. On xbench-DS and SEAL-0, the mini trails the full variant by 4.8 points in each case (57.2 vs 62.0 and 48.2 vs 53.0 respectively).
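The gaps between the quoted scores can be checked with a few lines of arithmetic. The sketch below uses only the score pairs stated in the article text; the dictionary labels are illustrative, and a positive gap means the first-named model scores higher.

```python
# Score pairs quoted in the article: (first model, second model).
comparisons = {
    "GAIA: full vs GPT-5": (82.7, 76.4),
    "GAIA: mini vs GPT-5": (80.3, 76.4),
    "BrowseComp: full vs GPT-5": (74.0, 54.9),
    "BrowseComp: full vs Qwen3.5-397B": (74.0, 78.6),
    "HLE-Text: full vs Qwen3.5-397B": (42.9, 48.3),
    "xbench-DS: mini vs full": (57.2, 62.0),
    "SEAL-0: mini vs full": (48.2, 53.0),
}

# Round to one decimal place to match the precision of the reported scores.
gaps = {name: round(a - b, 1) for name, (a, b) in comparisons.items()}

if __name__ == "__main__":
    for name, gap in gaps.items():
        print(f"{name}: {gap:+.1f}")
```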

Model	BrowseComp	BrowseComp-ZH	HLE-Text	GAIA	xbench-DS	SEAL-0

