Distilled task instructions cut B2B call classification token use by 99%
A new approach to in-context learning replaces verbose few-shot examples with compact classification criteria, improving macro-averaged AUC by up to 7% on real-world sales conversations while slashing token budgets.

Researchers have shown that replacing traditional few-shot examples with distilled task instructions can cut token consumption by 99% while improving classification performance on complex business-to-business (B2B) sales conversations.
The paper, published on Hugging Face by Guy Rotman, Adi Kopilov, Danit Berger Zalmanson, and Omri Allouche, introduces the Call Playbook dataset: five classification tasks drawn from actual B2B sales calls. Traditional in-context learning (ICL) struggles with these multi-party conversations because each example is a long transcript, so concatenating several of them balloons context length and dilutes the signal. The team's solution extracts structured classification criteria and a precise task description from the verbose examples, then feeds those compact representations to the model in place of the examples themselves.
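To make the contrast concrete, here is a minimal sketch of the two prompt styles, not the authors' implementation. Every name in it (`Example`, `build_few_shot_prompt`, `build_distilled_prompt`, `rough_token_count`) is hypothetical, the transcripts and criteria are invented for illustration rather than taken from Call Playbook, and the whitespace word count is only a crude stand-in for a real tokenizer. The distillation step itself, in which a model turns examples into criteria, is assumed to have already happened.

```python
from dataclasses import dataclass


@dataclass
class Example:
    transcript: str  # full call transcript (long in practice)
    label: str       # gold label for the task


def build_few_shot_prompt(task: str, examples: list[Example], query: str) -> str:
    """Standard ICL: concatenate every labeled transcript before the query."""
    parts = [f"Task: {task}"]
    for ex in examples:
        parts.append(f"Call transcript:\n{ex.transcript}\nLabel: {ex.label}")
    parts.append(f"Call transcript:\n{query}\nLabel:")
    return "\n\n".join(parts)


def build_distilled_prompt(task_description: str, criteria: list[str], query: str) -> str:
    """Distilled instructions: a task description plus short criteria, no examples."""
    bullets = "\n".join(f"- {c}" for c in criteria)
    return (
        f"Task: {task_description}\n\n"
        f"Classification criteria:\n{bullets}\n\n"
        f"Call transcript:\n{query}\nLabel:"
    )


def rough_token_count(text: str) -> int:
    """Crude proxy for token count (assumption: whitespace-separated words)."""
    return len(text.split())


# Invented data; repeated text simulates long multi-party transcripts.
examples = [
    Example("Rep: Thanks for joining. Buyer: We need pricing by Friday. " * 50, "yes"),
    Example("Rep: Any update? Buyer: Still evaluating other vendors. " * 50, "no"),
]
query = "Rep: Can we schedule a follow-up? Buyer: Yes, send me the proposal."

few_shot = build_few_shot_prompt("Did the buyer commit to a next step?", examples, query)
distilled = build_distilled_prompt(
    "Decide whether the buyer commits to a concrete next step.",
    [
        "Label 'yes' if the buyer names an action with a date or deliverable.",
        "Label 'no' if the buyer only expresses general interest.",
    ],
    query,
)

reduction = 1 - rough_token_count(distilled) / rough_token_count(few_shot)
print(f"Approximate token reduction: {reduction:.0%}")
```

The size of the saving in this toy case depends entirely on how long the made-up transcripts are; the 99% figure is the paper's result, not something this sketch reproduces.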
On the Call Playbook tasks, the distillation method improved macro-averaged AUC by up to 7 percentage points over standard ICL. Advanced token-compression baselines, which prune or summarize context, degraded by more than 9 F1 points as context grew, while the distilled-instruction approach held steady. The 99% token reduction means practitioners can fit far more tasks into a single prompt or run inference at a fraction of the cost.
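For readers unfamiliar with the headline metric, the sketch below shows one common way to compute a macro-averaged AUC, assuming it means the unweighted mean of per-task binary AUCs; the article does not spell out the paper's exact averaging scheme. The function names and the toy scores are hypothetical and are not results from the paper.

```python
def binary_auc(labels: list[int], scores: list[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. Quadratic in the number of items, which is fine
    for illustration but slow for large evaluation sets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_auc(per_task: dict[str, tuple[list[int], list[float]]]) -> float:
    """Unweighted mean of per-task AUCs, so every task counts equally."""
    aucs = [binary_auc(labels, scores) for labels, scores in per_task.values()]
    return sum(aucs) / len(aucs)


# Toy data for two hypothetical tasks (labels, model scores).
toy = {
    "task_a": ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    "task_b": ([0, 1], [0.2, 0.9]),
}
print(macro_auc(toy))
```

Because each task contributes equally, a method cannot score well on this average by excelling only on the tasks with the most calls.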
The framework also surfaces the classification logic in human-readable form, letting domain experts refine criteria without retraining. That transparency matters in regulated industries where black-box decisions carry compliance risk.

