Hugging Face ml-intern lets Qwen 35B orchestrate model training on a laptop
Hugging Face's ml-intern framework now supports local models via llama.cpp and Ollama, letting open-weight LLMs orchestrate supervised fine-tuning jobs end-to-end without cloud API costs or token limits.
An AI research agent can now run continuously on a laptop, orchestrating model training jobs without hitting token caps or racking up cloud API bills. Hugging Face released ml-intern, a lightweight agent framework that integrates with the open-source stack—transformers, datasets, trl—and the Hub's compute infrastructure. Originally built for Claude Opus, the harness now supports local models via llama.cpp and Ollama, letting any GGUF-quantized model act as the orchestrator.
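To make the local-orchestrator setup concrete, here is a minimal sketch of how an agent loop might talk to a model served by Ollama. It assumes only Ollama's standard local REST endpoint (`/api/chat` on port 11434); the model tag and the prompts are placeholders, and none of this reflects ml-intern's actual internal API.

```python
import json
from urllib import request

# Ollama's default local chat endpoint (assumption: stock Ollama install).
OLLAMA_URL = "http://localhost:11434/api/chat"

def build_chat_request(model: str, messages: list[dict]) -> request.Request:
    """Build (but do not send) a chat request for a local Ollama server."""
    payload = {"model": model, "messages": messages, "stream": False}
    return request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical orchestrator turn: ask the local model to plan the next
# fine-tuning step. "qwen2.5:32b" is an illustrative model tag, not
# something ml-intern prescribes.
req = build_chat_request(
    "qwen2.5:32b",
    [
        {"role": "system", "content": "You orchestrate SFT jobs using trl."},
        {"role": "user", "content": "The dataset is ready; propose the next training step."},
    ],
)
# Sending it requires a running Ollama server:
# with request.urlopen(req) as resp:
#     reply = json.load(resp)["message"]["content"]
```

Because the model runs locally, each planning turn like this costs nothing per token, which is what removes the cap on long-running orchestration loops.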
