Block prompt injection at the gateway, not the post-mortem.
Detection after the fact is a blog post. ShadowIQ blocks, redacts, or escalates suspect prompts inline — and signs every decision for the record.
Summary
Prompt injection defense in ShadowIQ uses a multi-classifier ensemble (rule-based, small classifier, and LLM-judge quorum) tuned on 2,400+ labeled adversarial samples. Decisions are enforced inline at the AI gateway with p99 latency under 75 ms, and every decision is cryptographically signed.
The before / after, in one picture.
You've heard this one before.
- Classic detections miss indirect prompt injection via RAG content.
- Guardrails that add 300 ms of latency that nobody will ship.
- False positives that block safe traffic and exhaust ops.
- No record of what the attacker actually tried.
Three moves.
- 1. Multi-classifier ensemble.
Rule match + small classifier + LLM-judge quorum. 2,400+ labeled adversarial samples, updated weekly.
- 2. Sub-75 ms enforcement.
WASM-compiled, parallel-evaluated policies. Latency budget preserved even under burst.
- 3. Full attacker timeline.
Every suspected injection attempt is signed, chained, and replayable — for your red-team, your IR team, and your auditor.
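The quorum in move 1 can be sketched in a few lines. This is an illustrative Python sketch, not ShadowIQ's actual pipeline: the patterns, keyword score, and judge function below are stand-ins for the real rule set, trained classifier, and LLM-judge call.

```python
import re

# Known injection phrasings; a real rule set would be far larger
# and updated continuously (the post says weekly).
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.I),
    re.compile(r"disregard .* system prompt", re.I),
]

def rule_vote(prompt: str) -> bool:
    """Cheap first pass: pattern match against known injection phrasings."""
    return any(p.search(prompt) for p in INJECTION_PATTERNS)

def classifier_vote(prompt: str) -> bool:
    """Stand-in for a small trained classifier (here, a toy keyword score)."""
    suspicious = ("system prompt", "override", "jailbreak")
    return sum(w in prompt.lower() for w in suspicious) >= 2

def judge_vote(prompt: str) -> bool:
    """Stand-in for an LLM-judge call; a real system would query a model."""
    lowered = prompt.lower()
    return "reveal" in lowered and "instructions" in lowered

def quorum_decision(prompt: str, threshold: int = 2) -> str:
    """Block only when at least `threshold` of the three voters agree."""
    votes = [rule_vote(prompt), classifier_vote(prompt), judge_vote(prompt)]
    return "block" if sum(votes) >= threshold else "allow"
```

The point of the quorum is that no single voter can block traffic alone, which is how an ensemble keeps false positives down while still catching prompts that trip two independent signals.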
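The signed, chained record in move 3 can be illustrated with a simple HMAC hash chain, where each decision record commits to the signature of the previous one, so altering any entry breaks verification from that point on. ShadowIQ's actual signing scheme is not public; the key handling and record fields below are assumptions for illustration.

```python
import hashlib
import hmac
import json

KEY = b"demo-signing-key"  # a real deployment would use a managed signing key

def sign_decision(prev_sig: str, decision: dict) -> str:
    """Sign a decision record, binding it to the previous record's signature."""
    payload = json.dumps(decision, sort_keys=True).encode()
    return hmac.new(KEY, prev_sig.encode() + payload, hashlib.sha256).hexdigest()

def verify_chain(records: list[tuple[dict, str]]) -> bool:
    """Re-derive every signature in order; any tampered record fails."""
    prev = "genesis"
    for decision, sig in records:
        if sign_decision(prev, decision) != sig:
            return False
        prev = sig
    return True
```

Because each signature covers its predecessor, an attacker (or an insider) cannot quietly rewrite one decision: the replayable timeline either verifies end to end or pinpoints where it was broken.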
Numbers, not adjectives.
Asked, answered, sourced.
Does it catch indirect injection hidden in RAG content?
Yes. RAG content flows through the same classifier pipeline with context-aware rules. Suspicious retrieved content is flagged or stripped before it reaches the model.
How often is the adversarial sample set updated?
Weekly, with major sample releases for new attack families (e.g., ASCII smuggling, prompt leaks via tool descriptions). Customers can subscribe to an RSS feed of additions.
Can we test against our own adversarial samples?
Yes. Upload a dataset and it becomes a versioned eval you can schedule and share. Your samples stay in your tenant and are never used to train our models.
Keep going.
Your 30-minute demo. A signed audit trail by the end of it.
We'll wire ShadowIQ into one live workload, block a prompt injection in real time, and hand you a cryptographic receipt — before the meeting ends.