DoRA matches LoRA accuracy while IA³ cuts training memory by 40 percent
HuggingFace's new PEFT benchmark compares LoRA against six parameter-efficient fine-tuning alternatives across language and vision tasks, revealing trade-offs in accuracy, memory, and speed.

HuggingFace released a comprehensive benchmark this week comparing LoRA—the dominant parameter-efficient fine-tuning method—against six alternatives: IA³, Prompt Tuning, Prefix Tuning, P-Tuning, AdaLoRA, and DoRA. The evaluation ran across language modeling tasks on BERT and GPT-2, plus image classification on Vision Transformer, measuring final accuracy, training memory footprint, and inference speed.
LoRA has become the default choice for fine-tuning large models on consumer hardware because it freezes the base weights and trains only low-rank adapter matrices, which keeps the number of trainable parameters, and the gradients and optimizer state stored for them, small. The benchmark tests whether that dominance holds across different model sizes and task types.
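To make the frozen-weights idea concrete, here is a minimal sketch of a LoRA-style forward pass in plain Python. The toy dimensions, the rank `r`, the scaling factor `alpha`, and the helper names (`matvec`, `lora_forward`, `lora_param_count`) are illustrative assumptions, not settings or code from the benchmark.

```python
import random

def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, alpha, r):
    """Compute y = W x + (alpha / r) * B (A x).

    W is the frozen base weight; only A (r x d_in) and B (d_out x r) would train.
    """
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + (alpha / r) * u for b, u in zip(base, update)]

def lora_param_count(d_out, d_in, r):
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    return r * (d_in + d_out)

random.seed(0)
d_out, d_in, r, alpha = 6, 4, 2, 4  # toy sizes, assumed for illustration

W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]   # trainable
B = [[0.0] * r for _ in range(d_out)]                                  # trainable, zero-initialised

x = [1.0, -2.0, 0.5, 3.0]
y = lora_forward(W, A, B, x, alpha, r)

# Because B starts at zero, the adapter is a no-op before training:
# the adapted layer initially reproduces the base layer exactly.
print(y == matvec(W, x))

# For a 768 x 768 projection (the hidden size of BERT-base and GPT-2 small)
# at an assumed rank of 8, compare full and LoRA trainable parameter counts.
print(768 * 768, lora_param_count(768, 768, 8))
```

At an assumed rank of 8, the adapter for a 768 x 768 matrix trains roughly 2% as many parameters as the matrix itself. The rank is a tunable choice, and the article does not state which rank the benchmark used.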
What stands out
- DoRA matched or beat LoRA on accuracy across all three model families tested, with gains of 0.3–1.2 percentage points on downstream tasks. DoRA decomposes each pretrained weight into magnitude and direction components and applies a LoRA-style low-rank update to the direction, adding minimal overhead.
- IA³ used 40% less memory than LoRA during training on GPT-2 fine-tuning runs, because it rescales activations with learned vectors rather than training low-rank adapter matrices. Its accuracy trailed LoRA's by 0.5–0.8 points.
- Prompt Tuning and Prefix Tuning underperformed on small models (BERT-base, GPT-2-small) but closed the gap on larger checkpoints, suggesting they need scale to compete. On ViT-large, Prefix Tuning came within 0.2 points of LoRA.
- AdaLoRA's adaptive rank allocation delivered no consistent accuracy advantage over fixed-rank LoRA in these tests, despite higher training cost. The benchmark used default hyperparameters, so hand-tuning might change the outcome.
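The DoRA and IA³ mechanisms described in the list above can be sketched in a few lines of plain Python. This is a rough illustration under stated assumptions: the toy weight values, the helper names (`column_norms`, `dora_weight`, `ia3_scale`, `added_params`), the rank of 8, and the convention of one DoRA magnitude per weight column are chosen for the example and are not taken from the benchmark or from the PEFT library's internals.

```python
import math

def column_norms(W):
    """L2 norm of each column of W, where W is a list of rows."""
    return [math.sqrt(sum(row[j] ** 2 for row in W)) for j in range(len(W[0]))]

def dora_weight(W0, delta, m):
    """DoRA-style weight: m * (W0 + delta) / ||W0 + delta||, column by column.

    delta stands in for the low-rank product B @ A; m is the learned magnitude.
    """
    V = [[w + d for w, d in zip(row_w, row_d)] for row_w, row_d in zip(W0, delta)]
    norms = column_norms(V)
    return [[m[j] * v / norms[j] for j, v in enumerate(row)] for row in V]

def ia3_scale(activation, l):
    """IA3-style rescaling: multiply an activation elementwise by a learned vector."""
    return [a * s for a, s in zip(activation, l)]

# Toy 2x2 frozen weight (values assumed for illustration).
W0 = [[3.0, 0.0], [4.0, 2.0]]
m = column_norms(W0)  # magnitude starts at the column norms of W0

# With a zero update, the DoRA weight reproduces the frozen weight.
W_init = dora_weight(W0, [[0.0, 0.0], [0.0, 0.0]], m)

# A non-zero update changes column directions; column norms stay equal to m.
W_step = dora_weight(W0, [[1.0, 1.0], [0.0, 0.0]], m)

# IA3 vectors start at ones (identity) and are the only trained parameters.
h = [0.5, -1.0, 2.0]
h_init = ia3_scale(h, [1.0, 1.0, 1.0])
h_tuned = ia3_scale(h, [2.0, 0.5, 1.0])

def added_params(d_out, d_in, r):
    """Trainable parameters added to one d_out x d_in layer under each scheme."""
    lora = r * (d_in + d_out)
    # DoRA: LoRA matrices plus one magnitude per column (convention assumed here).
    # IA3: one scaling factor per output activation dimension.
    return {"lora": lora, "dora": lora + d_in, "ia3": d_out}

print(added_params(768, 768, 8))
```

In this sketch DoRA adds only a single vector on top of the LoRA matrices, while IA³ trains a single vector and nothing else. Parameter counts are only one contributor to training memory, so the sketch illustrates the mechanism rather than reproducing the benchmark's measured 40% figure.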




