Block prompt injection at the gateway, not the post-mortem.
Detection after the fact is a blog post. ShadowIQ blocks, redacts, or escalates suspect prompts inline — and signs every decision for the record.
Summary
Prompt injection defense in ShadowIQ uses a multi-classifier ensemble (rule-based, small classifier, and LLM-judge quorum) tuned on 2,400+ labeled adversarial samples. Decisions are enforced inline at the AI gateway with p99 latency under 75 ms, and every decision is cryptographically signed.
The before / after, in one picture.
You've heard this one before.
- Classic detections miss indirect prompt injection via RAG content.
- Guardrails that add 300 ms of latency that nobody will ship.
- False positives that block safe traffic and exhaust ops.
- No record of what the attacker actually tried.
Three moves.
- 1. Multi-classifier ensemble.
Rule match + small classifier + LLM-judge quorum. 2,400+ labeled adversarial samples, updated weekly.
- 2. Sub-75 ms enforcement.
WASM-compiled, parallel-evaluated policies. Latency budget preserved even under burst.
- 3. Full attacker timeline.
Every suspected injection attempt is signed, chained, and replayable — for your red-team, your IR team, and your auditor.
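The quorum in move 1 can be sketched in a few lines. This is an illustrative Python sketch, not ShadowIQ's actual pipeline: the patterns, keyword score, and judge function below are stand-ins for the real rule set, trained classifier, and LLM-judge call.

```python
import re

# Known injection phrasings; a real rule set would be far larger
# and updated continuously (the post says weekly).
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.I),
    re.compile(r"disregard .* system prompt", re.I),
]

def rule_vote(prompt: str) -> bool:
    """Cheap first pass: pattern match against known injection phrasings."""
    return any(p.search(prompt) for p in INJECTION_PATTERNS)

def classifier_vote(prompt: str) -> bool:
    """Stand-in for a small trained classifier (here, a toy keyword score)."""
    suspicious = ("system prompt", "override", "jailbreak")
    return sum(w in prompt.lower() for w in suspicious) >= 2

def judge_vote(prompt: str) -> bool:
    """Stand-in for an LLM-judge call; a real system would query a model."""
    lowered = prompt.lower()
    return "reveal" in lowered and "instructions" in lowered

def quorum_decision(prompt: str, threshold: int = 2) -> str:
    """Block only when at least `threshold` of the three voters agree."""
    votes = [rule_vote(prompt), classifier_vote(prompt), judge_vote(prompt)]
    return "block" if sum(votes) >= threshold else "allow"
```

The point of the quorum is that no single voter can block traffic alone, which is how an ensemble keeps false positives down while still catching prompts that trip two independent signals.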
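The signed, chained record in move 3 can be illustrated with a simple HMAC hash chain, where each decision record commits to the signature of the previous one, so altering any entry breaks verification from that point on. ShadowIQ's actual signing scheme is not public; the key handling and record fields below are assumptions for illustration.

```python
import hashlib
import hmac
import json

KEY = b"demo-signing-key"  # a real deployment would use a managed signing key

def sign_decision(prev_sig: str, decision: dict) -> str:
    """Sign a decision record, binding it to the previous record's signature."""
    payload = json.dumps(decision, sort_keys=True).encode()
    return hmac.new(KEY, prev_sig.encode() + payload, hashlib.sha256).hexdigest()

def verify_chain(records: list[tuple[dict, str]]) -> bool:
    """Re-derive every signature in order; any tampered record fails."""
    prev = "genesis"
    for decision, sig in records:
        if sign_decision(prev, decision) != sig:
            return False
        prev = sig
    return True
```

Because each signature covers its predecessor, an attacker (or an insider) cannot quietly rewrite one decision: the replayable timeline either verifies end to end or pinpoints where it was broken.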
Numbers, not adjectives.
Asked, answered, sourced.
Does it catch indirect injection hidden in RAG content?
Yes. RAG content flows through the same classifier pipeline with context-aware rules. Suspicious retrieved content is flagged or stripped before it reaches the model.
How often is the adversarial sample set updated?
Weekly, with major sample releases for new attack families (e.g., ASCII smuggling, prompt leaks via tool descriptions). Customers can subscribe to an RSS feed of additions.
Can we test against our own adversarial samples?
Yes. Upload a dataset and it becomes a versioned eval you can schedule and share. Your samples stay in your tenant and are never used to train our models.
Keep going.
Your 30-minute demo. A signed audit trail by the end of it.
We'll wire ShadowIQ into one live workload, block a prompt injection in real time, and hand you a cryptographic receipt — before the meeting ends.