OpenAI adds Lockdown Mode to ChatGPT Enterprise, limiting data exposure in prompt injection attacks
ChatGPT's new Lockdown Mode aims to reduce the risk of sensitive data exposure during prompt injection attacks, though vulnerabilities remain.

OpenAI rolled out Lockdown Mode this week, a new security feature designed to limit data exposure when ChatGPT falls victim to prompt injection attacks. The feature doesn't eliminate injection vulnerabilities — attackers can still manipulate the model's behavior by embedding malicious instructions in user-supplied content — but it constrains what information the model can leak in the process.
Prompt injection has emerged as one of the most persistent security challenges for large language models. An attacker embeds hidden instructions inside a document, email, or webpage, which the model then follows as if they were legitimate user commands. In enterprise deployments, where ChatGPT might process internal documents or customer data, a successful injection could expose confidential information to unauthorized parties.
How Lockdown Mode works
Lockdown Mode operates by restricting the model's ability to echo or summarize certain classes of data when it detects suspicious prompt patterns. OpenAI has not disclosed the full technical implementation, but the feature appears to layer additional output filtering on top of existing content policy checks. When active, the mode limits the model's responses to high-confidence, policy-compliant outputs, even if that means refusing requests that would normally succeed.
The feature is available now in ChatGPT Enterprise and Team plans. OpenAI positions it as a stopgap — a way to buy time while the company works on deeper architectural defenses against injection attacks. Security researchers have noted that Lockdown Mode's effectiveness will depend heavily on how well it can distinguish between legitimate edge-case queries and actual injection attempts, a notoriously difficult classification problem.



