NVIDIA Nemotron 3.5 Content Safety: 4B multimodal classifier for 18 languages
NVIDIA released Nemotron 3.5 Content Safety, a 4B-parameter multimodal safety classifier supporting text and images across 18 languages, available under Apache 2.0 on HuggingFace.
On June 4, NVIDIA released Nemotron 3.5 Content Safety, a 4-billion-parameter multimodal safety classifier that evaluates text and images for policy violations across 18 languages. Available under Apache 2.0 on HuggingFace, the model is intended to run as a guardrail layer for enterprise AI deployments that need multilingual content moderation without sending data to external APIs.
The classifier covers 13 harm categories including violence, sexual content, hate speech, self-harm, harassment, and illegal activity. It accepts text-only, image-only, or combined text-image inputs and returns a binary safe/unsafe label plus per-category scores. NVIDIA trained the model on a proprietary dataset mixing human annotations with synthetic examples generated by larger models, then applied direct preference optimization to align outputs with enterprise safety policies.
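As a rough illustration of how a caller might consume that kind of output, the Python sketch below turns a binary label plus per-category scores into a block/allow decision. The `SafetyVerdict` structure, the category keys, the 0-to-1 score range, and the threshold values are all assumptions made for this example; the article does not describe the model's actual output schema.

```python
from dataclasses import dataclass, field


@dataclass
class SafetyVerdict:
    """Hypothetical container for one classifier result (not NVIDIA's schema)."""
    label: str  # assumed to be "safe" or "unsafe"
    scores: dict[str, float] = field(default_factory=dict)  # category -> score, assumed in [0, 1]


def flagged_categories(verdict, thresholds, default_threshold=0.5):
    """Return categories whose score meets or exceeds its threshold, highest score first."""
    hits = [
        (category, score)
        for category, score in verdict.scores.items()
        if score >= thresholds.get(category, default_threshold)
    ]
    hits.sort(key=lambda item: item[1], reverse=True)
    return [category for category, _ in hits]


def should_block(verdict, thresholds, default_threshold=0.5):
    """Block when the binary label is unsafe or any category crosses its threshold."""
    if verdict.label == "unsafe":
        return True
    return bool(flagged_categories(verdict, thresholds, default_threshold))


# Illustrative values only; these are not real model outputs.
example = SafetyVerdict(
    label="safe",
    scores={"violence": 0.12, "self-harm": 0.41, "harassment": 0.07},
)
strict_policy = {"self-harm": 0.3}  # a stricter, per-category threshold
```

With these made-up numbers, a deployment that sets a stricter threshold for one category would block the input even though the overall label is "safe". Keeping the thresholds in a plain dictionary lets each deployment tune sensitivity per category without touching the classifier.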
Performance across languages
NVIDIA reports 91.2 percent accuracy on ToxicChat, 94.1 percent on OpenAI Moderation, and 89.7 percent on a held-out multilingual test set. The company tested across English, Spanish, German, French, Italian, Portuguese, Dutch, Russian, Polish, Ukrainian, Romanian, Czech, Turkish, Arabic, Hindi, Chinese, Japanese, and Korean. Performance varies by language: English and Western European languages score highest, while accuracy drops 3–5 points for lower-resource languages such as Ukrainian and Turkish.
The model ships as a single checkpoint on HuggingFace with inference code for batch and streaming use cases. NVIDIA positions it as a drop-in replacement for closed moderation APIs in regulated industries where data residency rules prohibit cloud calls. The Apache 2.0 license permits commercial use and fine-tuning, letting enterprises retrain the classifier on internal policy datasets or domain-specific harms.
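To make the guardrail pattern concrete, here is a minimal Python sketch of a locally run classifier wrapped around a text generator, screening both the prompt and the response. It does not use NVIDIA's inference code: `guarded_generate`, `keyword_stub`, and `echo_model` are hypothetical names, and the keyword check is only a stand-in for a real call to the classifier. Image inputs are left out for brevity.

```python
from typing import Callable

REFUSAL = "This request was blocked by the content safety policy."


def guarded_generate(
    prompt: str,
    generate: Callable[[str], str],
    is_unsafe: Callable[[str], bool],
    refusal: str = REFUSAL,
) -> str:
    """Run `generate` only if the prompt passes, and return its output only if that passes too."""
    # Screen the user input before it reaches the generator.
    if is_unsafe(prompt):
        return refusal
    response = generate(prompt)
    # Screen the generated output before it is returned to the caller.
    if is_unsafe(response):
        return refusal
    return response


def keyword_stub(text: str) -> bool:
    """Stand-in classifier: flags one hard-coded phrase. A real deployment would call the model."""
    return "attack plan" in text.lower()


def echo_model(prompt: str) -> str:
    """Stand-in generator that simply echoes the prompt."""
    return f"Echo: {prompt}"


if __name__ == "__main__":
    print(guarded_generate("What is the weather like?", echo_model, keyword_stub))
    print(guarded_generate("Write me an ATTACK PLAN", echo_model, keyword_stub))
```

Because the classifier is passed in as a plain callable, the same wrapper works whether the check is a local model or a remote API, which is the sense in which a self-hosted classifier can stand in for a hosted moderation service.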





