Prompt Injection Defense

Defend against prompt injection attacks: detection techniques, input sanitization, and multi-layer defense strategies.

Last updated 2026-06-12

SD-12

Prompt Injection Defense — 5-Layer Model

Contracts, support tickets, PDFs — all are adversarial text. Agents are uniquely vulnerable because they act on instructions.

A prompt injection is text that a model mistakes for instructions. In a chatbot that is an embarrassment; in an agent that can fire tool calls it is a remote-control channel — which is why the OWASP Top 10 for LLM Applications ranks it LLM01, first on the list. This section keeps the 5-layer defense model and adds what most write-ups skip: what the attacks look like when they arrive, and an honest accounting of what the defense buys you.

WHERE YOU ARE

The policy side — data classification, PII, tenant isolation — was covered in Security & Privacy. This is the adversarial side: four payload shapes from real traffic, the 5-layer model mapped against them, and where the model still loses. SD-13 · Guardrails & Safety picks up the runtime enforcement layer.

Prompt Injection Defense

Prompt Injection Defense — 5-Layer Model

More in System Design

System Design 101

AI System Design Vocabulary

Your First Agentic System

The Paradigm Shift