SPADE diffusion model tackles offline design optimization without new experiments
A new arXiv preprint introduces SPADE, a diffusion-based surrogate model that uses support-proximity regularization and calibrated moment matching to optimize designs from static datasets without running new experiments.
SPADE (Support-Proximity Augmented Diffusion Estimation) is a forward surrogate modeling framework addressing offline black-box optimization—the challenge of finding high-scoring designs using only a fixed dataset, with no ability to test new candidates. The method models the forward likelihood p(y|x) as a diffusion process, then adds two components tailored for optimization: a Calibrated Diffusion Estimation module that enforces global consistency in statistical moments and pairwise rankings, and a Support-Proximity Regularization mechanism that uses k-nearest-neighbor density estimation to keep proposed designs close to the data manifold. The authors prove their regularization is first-order equivalent to maximizing a Bayesian posterior with a valid design prior, bridging the gap between generative modeling and constrained optimization.
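The support-proximity idea can be illustrated with a minimal sketch: score each candidate design by its average distance to the k nearest points in the offline dataset, so candidates far from the data manifold incur a larger penalty. This is an assumption-laden toy version, not the paper's exact regularizer; the function name and penalty form are invented for illustration.

```python
import numpy as np

def knn_support_penalty(candidates, dataset, k=5):
    """Penalize candidate designs that sit far from the offline dataset.

    For each candidate, average its Euclidean distance to the k nearest
    dataset points; larger values indicate weaker data-manifold support.
    """
    # Pairwise distances, shape (n_candidates, n_data)
    diffs = candidates[:, None, :] - dataset[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # Mean distance to the k closest dataset points per candidate
    knn = np.sort(dists, axis=1)[:, :k]
    return knn.mean(axis=1)

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 4))   # stand-in offline design dataset
near = data[:3] + 0.01             # candidates close to the support
far = near + 10.0                  # candidates far off-manifold
assert knn_support_penalty(far, data).min() > knn_support_penalty(near, data).max()
```

In an optimization loop, a penalty like this would be subtracted (with some weight) from the surrogate's predicted score, discouraging the optimizer from exploiting the surrogate outside the region where it was trained.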
Existing methods split into inverse approaches, which map scores back to designs but struggle with ill-posed inversions, and forward approaches, which model p(y|x) directly but often lack the distributional expressivity to quantify uncertainty. SPADE sidesteps both pitfalls by treating the forward model as a conditional generative process, inheriting diffusion models' ability to capture complex distributions while adding explicit constraints to prevent out-of-distribution extrapolation. The preprint, posted to arXiv on May 13, 2026, reports state-of-the-art results across Design-Bench tasks and an LLM data-mixture optimization benchmark.
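One practical payoff of treating the forward model as a conditional generative process is uncertainty quantification by sampling: draw many scores y for a fixed design x and summarize their spread. The sketch below assumes a generic stochastic surrogate sampler; the toy surrogate and all names are hypothetical stand-ins, not SPADE's actual model.

```python
import numpy as np

def predictive_stats(sample_fn, x, n_samples=256, rng=None):
    """Monte Carlo mean and std of scores drawn from a generative
    forward surrogate sample_fn(x, rng) -> float."""
    rng = rng if rng is not None else np.random.default_rng(0)
    ys = np.array([sample_fn(x, rng) for _ in range(n_samples)])
    return ys.mean(), ys.std()

def toy_surrogate(x, rng):
    # Hypothetical heteroscedastic score model: noise grows with |x[0]|
    return float(np.sum(x**2) + rng.normal(scale=0.1 + abs(x[0])))

mean, std = predictive_stats(toy_surrogate, np.array([0.5, -0.3]))
```

A point-estimate regressor collapses this distribution to a single number; sampling from a conditional generative surrogate keeps the spread, which downstream optimizers can use to avoid confidently wrong extrapolations.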
