Sber and Yandex veterans teach systematic LLM testing on May 28
School of Higher Mathematics webinar walks developers through structured evaluation: from raw logs and user feedback to automated regression checks and measurable improvement cycles for AI features in production.

School of Higher Mathematics is hosting a webinar on May 28 at 19:30 Moscow time focused on moving LLM product teams from ad-hoc testing to systematic quality evaluation. The session targets developers, ML engineers, product managers, and team leads shipping AI features to production who currently assess model improvements by feel rather than measurement.
The session is led by Andrey Kiselev, head of product at an AI company and a former Revolut and Yandex engineer, and Fedor Azarov, head of data research at Sber CIB. The format is a live demo built around a reusable framework that attendees can apply to commercial or side projects.
What stands out
- 01 Raw log collection and feedback loops. Capture interaction data and convert subjective user feedback into measurable signals.
- 02 Metric design for LLM outputs. Identify metrics that predict user satisfaction rather than ones that merely track token count or prompt length.
- 03 Automated regression suites. Automatically flag, without manual review, when a new prompt or model version breaks existing scenarios.
- 04 Structured before-and-after testing. Use A/B or holdout methods to confirm whether a change genuinely improved the feature or only redistributed errors.
- 05 End-to-end improvement cycle. Link logs → metrics → automation → deployment into a repeatable process.
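
To make the automated regression idea concrete, here is a minimal Python sketch. It is not the framework shown in the webinar. The `Scenario` class, the `run_suite` and `find_regressions` helpers, and the two stub models are all hypothetical names introduced for illustration, and the stubs stand in for real LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical type: a model is any function from prompt text to output text.
Model = Callable[[str], str]


@dataclass
class Scenario:
    name: str
    prompt: str
    check: Callable[[str], bool]  # returns True if the output is acceptable


def run_suite(model: Model, scenarios: List[Scenario]) -> Dict[str, bool]:
    """Run every scenario against one model and record pass/fail."""
    return {s.name: s.check(model(s.prompt)) for s in scenarios}


def find_regressions(baseline: Model, candidate: Model,
                     scenarios: List[Scenario]) -> List[str]:
    """Return scenarios that passed on the baseline but fail on the candidate."""
    before = run_suite(baseline, scenarios)
    after = run_suite(candidate, scenarios)
    return [name for name in before if before[name] and not after[name]]


# Stub models (assumptions): replace with calls to the real prompt or model versions.
def baseline_model(prompt: str) -> str:
    if "refund" in prompt.lower():
        return "Refunds are available within 30 days."
    return "Hello! How can I help?"


def candidate_model(prompt: str) -> str:
    # Simulates a new prompt version that lost the refund answer.
    return "Hello! How can I help?"


SCENARIOS = [
    Scenario("refund_policy", "What is your refund policy?",
             lambda out: "30 days" in out),
    Scenario("greeting", "Hi", lambda out: out.startswith("Hello")),
]

regressions = find_regressions(baseline_model, candidate_model, SCENARIOS)
print(regressions)  # ['refund_policy']
```

In a real project the scenarios would likely come from collected production logs, and the checks would be the metrics the team has validated against user feedback rather than simple substring tests.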

