ORBIT preserves language reasoning during generative retrieval fine-tuning
A new weight-averaging technique tracks parameter drift during generative retrieval fine-tuning and merges checkpoints when that drift exceeds a threshold, preserving general reasoning while beating continual learning baselines.

Large language models fine-tuned for generative retrieval — a task where the model directly generates document identifiers instead of scoring candidates — suffer rapid degradation of their foundational language reasoning. Researchers have now proposed ORBIT, a parameter-space regularization method that monitors how far fine-tuned weights drift from the original pre-trained checkpoint and applies periodic weight-averaging merges to keep that drift in check.
The core observation, detailed in a preprint posted to arXiv on May 13, is that catastrophic forgetting during retrieval fine-tuning correlates directly with the Euclidean distance between the tuned and original model parameters. By capping that distance through dynamic merges with the origin checkpoint, ORBIT keeps the model anchored to its pre-trained knowledge while still acquiring retrieval-specific skills.
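The drift quantity being tracked is straightforward to state concretely. A minimal PyTorch sketch of that measurement, assuming the pre-trained weights are snapshotted before fine-tuning begins (the function names here are illustrative, not taken from the preprint):

```python
# Minimal sketch, not the authors' code: parameter drift measured as the
# Euclidean (L2) distance between the current model and the pre-trained
# checkpoint, with all parameters treated as a single flattened vector.
import torch

def snapshot_origin(model: torch.nn.Module) -> dict:
    """Clone the pre-trained weights once, before fine-tuning begins."""
    return {name: p.detach().clone() for name, p in model.named_parameters()}

def parameter_drift(model: torch.nn.Module, origin: dict) -> float:
    """L2 norm of (current weights - pre-trained weights)."""
    squared = 0.0
    for name, p in model.named_parameters():
        squared += (p.detach() - origin[name]).pow(2).sum().item()
    return squared ** 0.5
```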
The mechanism
During training, ORBIT computes the L2 distance between the current weights and the saved pre-trained checkpoint at each step. When that distance exceeds a predefined threshold (a hyperparameter tuned per task), the method takes a weighted average of the two checkpoints, blending the fine-tuned state back toward the origin. This process repeats throughout training, creating a dynamic tether that prevents runaway specialization.
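A hedged sketch of that step in PyTorch; the threshold value, the merge weight `alpha`, and the function name are hypothetical stand-ins rather than the paper's actual hyperparameters or implementation:

```python
# Sketch of the threshold-triggered merge, not the authors' released code.
# `threshold` and `alpha` are hypothetical hyperparameters: the preprint tunes
# the threshold per task, and the exact merge weight is not reproduced here.
import torch

@torch.no_grad()
def merge_if_drifted(model: torch.nn.Module, origin: dict,
                     threshold: float, alpha: float = 0.5) -> bool:
    """If L2 drift from the pre-trained checkpoint exceeds `threshold`,
    replace each parameter with alpha * current + (1 - alpha) * origin.
    Returns True when a merge was performed."""
    drift = sum((p - origin[n]).pow(2).sum().item()
                for n, p in model.named_parameters()) ** 0.5
    if drift <= threshold:
        return False
    for n, p in model.named_parameters():
        p.copy_(alpha * p + (1.0 - alpha) * origin[n])
    return True

# Typical placement: once per training step, after the optimizer update.
#   loss.backward()
#   optimizer.step()
#   optimizer.zero_grad()
#   merge_if_drifted(model, origin, threshold=5.0, alpha=0.5)
```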
On both retrieval metrics and held-out language benchmarks, ORBIT outperformed elastic weight consolidation (EWC), Learning without Forgetting (LwF), and other continual learning baselines. The authors frame the work as a step toward fine-tuning regimes that preserve general intelligence while acquiring task-specific skills. The full preprint, by Neha Verma, Nikhil Mehta, Shao-Chuan Wang, Naijing Zhang, Alicia Tsai, and Li Wei, is available at arXiv:2605.12419.