Getting Started with the Claude Code Security Plugin: Check 25 High-Risk Vulnerability Types While You Code, and How the Three-Layer Review Works
AI can now write code much faster, but speed cuts both ways: unsafe patterns such as hardcoded secrets, SQL injection, and unsafe deserialization can land in a codebase just as quickly when the model is not careful. Anthropic has released a free security-guidance plugin for Claude Code that reviews changes and feeds vulnerabilities back to Claude for fixing, in real time and inside the same coding session. This article explains how to install it, what each of its three review layers handles, what types of issues it can detect, and where its limits are.

1. Install It with One Command
Run this inside a Claude Code session:
/plugin install security-guidance@claude-plugins-official
During installation, Claude Code will ask you to choose a scope. Choose user scope. The plugin will be written to your user settings, so every new local session on this machine will load it automatically. You can also browse and install it from the marketplace with /plugins. After installation, run /reload-plugins.
Prerequisites:
- Claude Code version 2.1.144 or later
- Python 3.8+
- On first run, the plugin creates a virtual environment under ~/.claude/security/ and installs the Claude Agent SDK, so it needs pip and network access
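A quick preflight check for the Python side of these prerequisites might look like the sketch below. It is not part of the plugin; the function name `check_prerequisites` is made up for illustration, and it does not verify the Claude Code version or network access.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)  # minimum version listed in the prerequisites above


def check_prerequisites(version_info=None, pip_available=None):
    """Return a list of human-readable problems; an empty list means OK.

    Both arguments are optional overrides, mainly useful for testing.
    """
    version = tuple(version_info or sys.version_info)[:2]
    if pip_available is None:
        # pip counts as available if this interpreter can import it
        pip_available = importlib.util.find_spec("pip") is not None

    problems = []
    if version < MIN_PYTHON:
        problems.append(f"Python {version[0]}.{version[1]} found, 3.8+ required")
    if not pip_available:
        problems.append("pip is not importable from this interpreter")
    return problems
```

Calling `check_prerequisites()` with no arguments inspects the interpreter that runs the script, so run it with the same Python that Claude Code will find on your PATH.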
Once installed, you do not need to call it manually. There is no command to remember. The plugin runs automatically.
2. The Three-Layer Review Mechanism
The plugin works at three checkpoints, with increasing depth:
| Layer | Trigger | Technique | Cost | What it catches |
|---|---|---|---|---|
| 1. Per-edit pattern warning | Every Edit/Write | Regex matching for ~25 risky patterns | Zero: no AI inference | Known dangerous constructs |
| 2. End-of-turn diff review | When Claude finishes a turn | Fast LLM call, defaulting to Opus 4.7 | One model call | Logic-level vulnerabilities |
| 3. Agentic commit review | On git commit | SDK-driven review that reads related files and traces data flow | Multiple tool calls | Cross-file vulnerabilities |
Layer 1: Per-Edit Regex (~25 Classes, Zero Cost)
Every time Claude uses Edit/Write, the plugin gives immediate warnings by regex-matching around 25 known dangerous patterns. No AI inference is needed, so there is no extra usage cost. Covered risky constructs include:
eval(), new Function() # Arbitrary code execution
os.system(), child_process.exec() # Command injection
pickle.load (untrusted data) # Unsafe deserialization
yaml.load (instead of safe_load)
torch.load(weights_only=False)
Frontend-side DOM injection vectors:
element.innerHTML = userInput // XSS
dangerouslySetInnerHTML={{__html}} // React XSS
It also catches hardcoded secrets, API keys, and similar risks.
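To make the idea of a zero-inference pattern check concrete, here is a minimal sketch of line-by-line regex scanning. The rule names and regular expressions are illustrative assumptions covering a few of the constructs listed above; they are not the plugin's actual rule set, which this article does not reproduce.

```python
import re
from typing import List, NamedTuple


class Finding(NamedTuple):
    line_no: int
    rule: str
    text: str


# Illustrative rules only (assumed for this sketch, not taken from the plugin).
RULES = {
    "arbitrary-code-execution": re.compile(r"\beval\s*\(|\bnew\s+Function\s*\("),
    "command-injection": re.compile(r"\bos\.system\s*\(|\bchild_process\.exec\s*\("),
    "unsafe-deserialization": re.compile(r"\bpickle\.loads?\s*\(|\byaml\.load\s*\("),
    "dom-xss": re.compile(r"\.innerHTML\s*=|dangerouslySetInnerHTML"),
}


def scan(source: str) -> List[Finding]:
    """Check every line against every rule and collect the matches."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for rule, pattern in RULES.items():
            if pattern.search(line):
                findings.append(Finding(line_no, rule, line.strip()))
    return findings
```

A check like this is cheap and deterministic, but also coarse: it has no notion of where the data comes from, and this sketch would flag a harmless `model.eval()` call as readily as `eval(user_input)`. That gap is what the model-backed layers are meant to cover.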
Layer 2: End-of-Turn LLM Diff Review
When Claude finishes a turn, the plugin sends the diff to a fast LLM call, defaulting to Opus 4.7. High-risk findings are fed back to Claude so it can fix them before you even see the response.
The key detail: this reviewer starts from a fresh context. It has no “attachment” to the original implementation plan, so it can catch logic-level vulnerabilities that string matching cannot: authorization bypass, insecure direct object references (IDOR), server-side request forgery (SSRF), and weak cryptography.
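As an illustration of why these issues escape string matching, consider the hypothetical IDOR below. The data store and function names are invented for this example. Neither function contains a construct a regex would flag; the only difference is whether ownership is checked.

```python
# Hypothetical in-memory data store, used only for illustration.
INVOICES = {
    101: {"owner_id": 1, "total": 250},
    102: {"owner_id": 2, "total": 990},
}


def get_invoice_vulnerable(current_user_id: int, invoice_id: int) -> dict:
    # IDOR: the caller's identity is never compared with the invoice owner,
    # so any authenticated user can read any invoice by guessing its ID.
    return INVOICES[invoice_id]


def get_invoice_fixed(current_user_id: int, invoice_id: int) -> dict:
    invoice = INVOICES.get(invoice_id)
    # Raise the same error for "missing" and "not yours" so IDs cannot be probed.
    if invoice is None or invoice["owner_id"] != current_user_id:
        raise PermissionError("invoice not found or not accessible")
    return invoice
```

Spotting the missing check requires understanding what the function is supposed to enforce, which is the kind of reasoning a diff-level LLM review can attempt and a pattern list cannot.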
Layer 3: Agentic Commit Review
When a git commit happens, an SDK-driven reviewer uses Read/Grep/Glob to inspect related files and trace cross-file data flow. This catches multi-file vulnerabilities that simple pattern matching can miss, such as IDOR, auth bypass, and cross-file SSRF.
Overall, the plugin covers common web vulnerability classes: injection, XSS, SSRF, hardcoded secrets, IDOR, authorization bypass, unsafe deserialization, path traversal, and more.
3. Important Limits
The plugin sends violations to Claude as findings for Claude to fix, but:
- ❌ It does not block writes, and it does not guarantee every violation will be caught
- ✅ It is a best-effort assistant, not a guarantee
Do not treat it as a replacement for human code review, SAST/DAST, dependency scanning, or penetration testing. Treat findings as recommendations, not final approval.
4. Customize Rules for Your Team
You can extend the built-in rules, but not disable them, using two repository-level files:
.claude/claude-security-guidance.md # Describe your threat model and review checklist in natural language
# Loaded as additional context during model-backed reviews
.claude/security-patterns.yaml # Add regex/subString rules to the per-edit check
# Runs as deterministic string matching
Teams can declare the plugin in .claude/settings.json to require all members to enable it. Administrators can also push it organization-wide through managed settings.
5. Pair It with the Claude API for Batch Pre-Commit Security Reviews
The plugin focuses on “the moment code is being written.” If you also want an independent security review of the full PR diff during CI, you can build a lightweight check with the Claude API. claudeapi.com is compatible with the Anthropic SDK, so you only need to replace base_url:
from anthropic import Anthropic

client = Anthropic(
    api_key="sk-...",  # Get this from the claudeapi.com console
    base_url="https://gw.claudeapi.com",
)

def review_diff(diff_text: str) -> str:
    resp = client.messages.create(
        model="claude-opus-4-7",  # Use the strongest reasoning model for security review
        max_tokens=2000,
        system=(
            "You are a senior security reviewer. Review the following diff and report only high-risk issues: "
            "injection, XSS, SSRF, hardcoded secrets, IDOR, authorization bypass, unsafe deserialization, path traversal. "
            "For each item, provide: file:line, vulnerability type, and recommended fix. If there are no high-risk issues, return 'PASS'."
        ),
        messages=[{"role": "user", "content": diff_text}],
    )
    return resp.content[0].text
Connect it to CI, and you get a two-stage defense: real-time checks while coding, plus an independent review before merge.
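One way to wire this into a CI step is sketched below. It assumes the job runs inside a git checkout where the base branch is available as `origin/main`. The helper names `get_pr_diff` and `gate` are invented for this example, and the reviewer is passed in as a function so that `review_diff` from above (or a stub in tests) can be plugged in.

```python
import subprocess
from typing import Callable


def get_pr_diff(base_ref: str = "origin/main") -> str:
    """Return the changes on HEAD since it diverged from base_ref."""
    result = subprocess.run(
        ["git", "diff", f"{base_ref}...HEAD"],  # three-dot form: diff from the merge base
        capture_output=True,
        text=True,
        check=True,  # raise if git itself fails
    )
    return result.stdout


def gate(diff_text: str, reviewer: Callable[[str], str]) -> int:
    """Return a process exit code: 0 to pass the CI step, 1 to fail it."""
    if not diff_text.strip():
        return 0  # nothing to review
    report = reviewer(diff_text)
    if report.strip() == "PASS":
        return 0
    print(report)  # surface the findings in the CI log
    return 1
```

A CI script would then end with something like `sys.exit(gate(get_pr_diff(), review_diff))`. Comparing against the exact string `PASS` means any other reply fails the step. That is a deliberately conservative choice for this sketch, and teams that prefer advisory findings could log the report and always return 0 instead.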
6. Summary
- Install it with one command: /plugin install security-guidance@claude-plugins-official. It runs automatically with no manual command needed.
- Three-layer review: per-edit regex checks (~25 classes, zero cost) + end-of-turn LLM diff review (Opus 4.7, catches logic flaws) + agentic cross-file review at commit time.
- It is a best-effort assistant. It does not block writes and does not replace human review or professional security tools.
- For an extra CI-stage review, build your own check with the Claude API and pair it with the IDE plugin for a two-stage defense.
For model pricing and integration docs, see claudeapi.com. The console is available at console.claudeapi.com.



