Claude agents now self-improve during idle time with Anthropic's dreaming feature
Anthropic launched a research preview of dreaming, a capability that lets Claude agents practice and improve autonomously between user sessions.
Anthropic rolled out a research preview of dreaming, a feature that allows Claude agents to self-train during idle periods between user sessions. The capability, announced this week on the company's official blog, activates automatically when an agent isn't actively handling tasks and enables the model to rehearse scenarios, refine workflows, and strengthen its reasoning without human intervention.
The dreaming mode is part of Anthropic's broader push to make managed agents more autonomous and capable over time. Instead of relying solely on in-session feedback, agents can now use downtime to simulate edge cases, test alternative approaches, and consolidate learning from prior interactions.
What stands out
- 01. Idle-time learning. Dreaming activates only when the agent is not serving user requests, turning compute downtime into a self-improvement window. Anthropic positions this as a way to reduce the manual tuning burden on developers.
- 02. Research preview status. The feature is not yet generally available. Early-access users can opt in via the Claude managed-agents dashboard, but Anthropic has not disclosed rollout timelines for broader tiers.
- 03. No explicit cost model yet. The blog post does not specify whether dreaming is billed as token usage or runs as a background service included in existing agent pricing. That ambiguity will matter to teams running high-volume agent fleets.
- 04. Self-supervised rehearsal. Dreaming is described as a self-supervised process: agents generate synthetic scenarios and evaluate their own responses rather than waiting for new labeled data. This mirrors techniques from reinforcement learning and model distillation, but applied at the agent level.
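
The announcement does not document an API or describe the internals of dreaming, so the sketch below is purely illustrative and not Anthropic's implementation. It shows only the general shape of the loop described above: skip rehearsal while the agent is busy, run synthetic scenarios while it is idle, and flag the ones whose self-assessed score falls below a threshold. Every name in it (`RehearsalAgent`, `respond`, `score`, `rehearse`, `threshold`) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RehearsalAgent:
    """Toy model of idle-time rehearsal. All names are hypothetical."""

    respond: Callable[[str], str]        # assumed: produces an answer for a scenario
    score: Callable[[str, str], float]   # assumed: self-evaluation in the range 0.0-1.0
    busy: bool = False                   # True while serving a user request
    notes: List[str] = field(default_factory=list)  # scenarios flagged for review

    def rehearse(self, scenarios: List[str], threshold: float = 0.5) -> int:
        """Evaluate synthetic scenarios, but only while the agent is idle.

        Returns the number of scenarios reviewed (0 if the agent is busy).
        """
        if self.busy:
            return 0
        reviewed = 0
        for scenario in scenarios:
            answer = self.respond(scenario)
            # Record scenarios the agent judges it handled poorly.
            if self.score(scenario, answer) < threshold:
                self.notes.append(scenario)
            reviewed += 1
        return reviewed


# Stand-in functions so the sketch runs without any model behind it.
def toy_respond(scenario: str) -> str:
    return scenario.upper()


def toy_score(scenario: str, answer: str) -> float:
    # Pretend that longer scenarios are handled worse.
    return 1.0 if len(scenario) <= 12 else 0.0


agent = RehearsalAgent(respond=toy_respond, score=toy_score)
reviewed = agent.rehearse(["empty input", "very long malformed request"])
```

In this toy run both scenarios are reviewed and only the longer one ends up in `agent.notes`. A real system would replace the stand-in functions with model calls and would need some way to turn the flagged scenarios into actual improvements, which the announcement does not describe.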
