Model stitching measures representation compatibility across frozen networks
A 2015 technique for splicing frozen neural networks through a thin stitching layer has evolved into a practical tool for measuring representation compatibility across architectures.

Model stitching, introduced in 2015 as a way to test whether two networks learn equivalent features, has matured into a lightweight method for measuring knowledge transfer between frozen models. The core idea: freeze two trained networks A and B, extract the lower layers of B (the "front model"), add a single trainable stitching layer, and connect that to the upper layers of A (the "end model"). Only the stitching layer learns; the rest stays frozen. For convolutional nets, that layer is typically a 1×1 convolution with batch normalization; for transformers, a token-wise linear projection.
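The wiring described above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not code from any of the papers: the frozen networks are replaced by hypothetical random-weight stand-ins (`front_model` for the lower layers of B, `end_model` for the upper layers of A), the layer widths are made up, and `LinearStitch` shows only the transformer-style token-wise projection, with no training loop.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes and random weights standing in for two trained, frozen networks.
D_IN, D_B, D_A, N_CLASSES = 16, 32, 24, 5
W_front = rng.normal(size=(D_IN, D_B))     # lower layers of B (frozen)
W_end = rng.normal(size=(D_A, N_CLASSES))  # upper layers of A (frozen)


def front_model(x):
    """Lower layers of B: map inputs into B's activation space."""
    return np.maximum(x @ W_front, 0.0)


def end_model(h):
    """Upper layers of A: map A's activation space to class logits."""
    return h @ W_end


class LinearStitch:
    """Token-wise linear projection from B's width to A's width.

    These two arrays are the only parameters a stitching run would update.
    """

    def __init__(self, d_in, d_out, rng):
        self.weight = rng.normal(scale=d_in ** -0.5, size=(d_in, d_out))
        self.bias = np.zeros(d_out)

    def __call__(self, h):
        # The last axis is the feature axis, so every token is projected independently.
        return h @ self.weight + self.bias


def stitched_forward(x, stitch):
    """Front of B, then the stitching layer, then the end of A."""
    return end_model(stitch(front_model(x)))


stitch = LinearStitch(D_B, D_A, rng)
tokens = rng.normal(size=(2, 7, D_IN))  # (batch, tokens, features)
logits = stitched_forward(tokens, stitch)
print(logits.shape)  # (2, 7, 5)
```

Because B's activations are 32 wide and A expects 24 in this toy setup, the stitching layer also handles the width mismatch, which is what lets the same recipe connect networks of different sizes.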
Two 2021 NeurIPS papers, "Similarity and Matching of Neural Network Representations" and "Revisiting Model Stitching to Compare Neural Representations", formalized the approach and defined the stitching penalty: the stitched model's error minus the baseline error of network A. A penalty near zero signals that the representations are compatible. Researchers train the stitching layer using one of four strategies: hard label matching (minimize error against ground truth), soft label matching (minimize distance to the end model's predictions), direct matching (minimize activation distance at the stitch point), or functional latent alignment (a 2026 method that imitates the end model's internal layer-by-layer behavior).
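To make the penalty and the first three training objectives concrete, here is a rough NumPy sketch. The specific distances are assumptions chosen for illustration: cross-entropy for the hard and soft label objectives, and squared Euclidean distance for direct matching. The papers may use other formulations, and functional latent alignment is left out because its details are not given here. All function names are hypothetical.

```python
import numpy as np


def stitching_penalty(stitched_error, baseline_error):
    """Stitched model's error minus network A's baseline error."""
    return stitched_error - baseline_error


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def hard_label_loss(stitched_logits, labels):
    """Hard label matching: cross-entropy against ground-truth class indices."""
    log_probs = log_softmax(stitched_logits)
    return -log_probs[np.arange(len(labels)), labels].mean()


def soft_label_loss(stitched_logits, reference_logits):
    """Soft label matching: cross-entropy against the reference predictions."""
    target = np.exp(log_softmax(reference_logits))
    return -(target * log_softmax(stitched_logits)).sum(axis=-1).mean()


def direct_matching_loss(stitched_acts, reference_acts):
    """Direct matching: mean squared distance between activations at the stitch point."""
    return ((stitched_acts - reference_acts) ** 2).sum(axis=-1).mean()


# Made-up error rates: a stitched model at 27% error against a 25% baseline.
print(round(stitching_penalty(0.27, 0.25), 4))  # 0.02
```

The three losses differ only in what supervises the stitching layer: ground-truth labels, the reference network's output distribution, or the reference network's activations at the stitch point.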
The technique has moved beyond its original AlexNet experiments on image classification. It now serves as a diagnostic for comparing representations across model families, architectures, and training regimes—offering a way to measure how much two networks "speak the same language" without retraining either one from scratch.




