MCP Gateways for Coding Agents: Security and Code Review Controls

Model Context Protocol, or MCP, is quickly becoming the default way to connect coding agents to repositories, ticketing systems, CI, browsers, secrets, and internal APIs. That is good news for interoperability. It is also a new review problem. Once an agent can reach real tools, a pull request is no longer just a code diff. It is a record of what the agent was allowed to touch, what it actually touched, and what evidence proves the run stayed inside the intended boundary.
Key Takeaways
- MCP standardizes tool access for coding agents, but standardized access still needs policy.
- Every new MCP server should be reviewed like a privileged dependency, not a harmless plugin.
- The safest architecture routes agents through a gateway that enforces allowlists, scopes, and approvals.
- Per-tool authorization matters more than coarse server-level trust when agents can write, deploy, or access secrets.
- High-signal AI code review now needs capability diffs, action traces, and merge evidence for tool use.
TL;DR
MCP makes it easier to wire coding agents into real systems. That convenience can quietly expand blast radius unless teams review tool access as rigorously as they review code. Treat every MCP server as a privileged integration, enforce least-privilege authorization through a gateway, and require pull request evidence that shows tools exposed, actions executed, approvals triggered, and runtime outcomes observed.
Why this topic is breaking out right now
Several engineering feeds in early 2026 are pointing at the same change in operating model: coding agents are no longer interesting only because they write code. They are interesting because they can reach tools, execute workflows, and interact with production-adjacent systems.
- On March 3, 2026, Pragmatic Engineer's AI Tooling for Software Engineers in 2026 reported that 55% of respondents regularly use AI agents, and code review and code validation are among the most common agent use cases.
- On February 14, 2026, ByteByteGo's MCP vs RAG vs AI Agents made the stack separation explicit: MCP is the tool access layer, not the retrieval layer and not the agent loop itself.
- On January 27, 2026, Latent Space's roundup on the MCP Apps open spec showed the interface moving from raw JSON tool calls toward richer in-chat app surfaces. That means more agent power, not less.
- On March 6, 2026, Simon Willison's guide to agentic manual testing made the practical point: once agents can execute software, proof and validation become a core part of the workflow.
- On February 10, 2026, Simon pushed the idea further in Showboat and Rodney, which focused on producing demo artifacts that make agent work visible and reviewable.
- Meanwhile, the official MCP documentation now includes enterprise-managed authorization, which is exactly the kind of control plane enterprise teams need once agents start using internal tools at scale.
Put those signals together and the conclusion is clear: engineering teams now need an MCP review playbook. Tool access is becoming part of code review.
MCP changes what code review has to cover
Traditional pull request review asks whether the proposed code is correct, maintainable, and safe enough to merge. MCP adds a second dimension: what external capabilities the agent was able to use while producing that code. A clean diff can still hide an unsafe workflow if the agent had broad tool access, modified external state, or relied on unreviewed server behavior during execution.
1. Code diff
Files changed, tests updated, dependencies added, and merge risk inside the repository.
2. Capability diff
Which MCP servers and tools were available, added, removed, or newly allowed for this run.
3. Authorization diff
Scopes, token lifetime, approval requirements, and whether access was granted at the server level or the tool level.
4. Runtime evidence
Tool calls executed, side effects observed, and artifacts proving that the changed behavior actually worked.
This is the same shift we described in our guides to evidence-first AI code review and session provenance. MCP simply makes the hidden part bigger. If a reviewer can only see the diff, they are no longer seeing the whole change.
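The capability diff in particular can be computed mechanically from session records. Here is a minimal sketch in Python; the tool names and the set-based inventory format are illustrative assumptions, not part of any specific MCP client:

```python
def capability_diff(baseline: set[str], session: set[str]) -> dict[str, list[str]]:
    """Compare the tools available in a session against an approved baseline.

    Tool names like "repo.write" are hypothetical examples; real inventories
    would come from the gateway's session log.
    """
    return {
        "added": sorted(session - baseline),      # newly exposed tools
        "removed": sorted(baseline - session),    # tools no longer available
        "unchanged": sorted(baseline & session),  # stable surface area
    }

baseline = {"repo.search", "docs.lookup", "issues.read"}
session = {"repo.search", "docs.lookup", "repo.write", "secrets.get"}

diff = capability_diff(baseline, session)
# "repo.write" and "secrets.get" are new capabilities a reviewer should see
# before they read a single line of the code diff.
```

Surfacing the `added` list at the top of a pull request is often enough to change how carefully a reviewer reads the rest.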
Treat every MCP server like a privileged dependency
Teams often install MCP servers with the same mindset they use for editor extensions: this looks useful, so wire it in and keep moving. That framing is too casual. An MCP server can expose file system access, repo writes, browser automation, issue tracker actions, package publishing, secrets lookup, or deployment triggers. In other words, it can expand an agent's blast radius faster than any code generation feature ever could.
Questions to ask before approving a new MCP server
- What systems can this server read, write, or trigger?
- Which tools are safe by default, and which require approval every time?
- Does authorization happen per server or per tool?
- Can the server surface secrets, tokens, or customer data into agent context?
- What logs, transcripts, or traces exist if something goes wrong?
- How do you revoke access quickly without breaking the whole workflow?
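One way to make that checklist enforceable rather than aspirational is to capture the answers as structured data and refuse approval while blockers remain. A sketch with hypothetical field names and deliberately pessimistic defaults:

```python
from dataclasses import dataclass, field

@dataclass
class ServerReview:
    """Answers to the pre-approval checklist for one MCP server.

    Field names are illustrative; adapt them to your own review template.
    """
    name: str
    systems_reached: list[str] = field(default_factory=list)
    per_tool_auth: bool = False        # authorization per tool, not per server
    can_surface_secrets: bool = True   # assume the worst until proven otherwise
    has_audit_trace: bool = False
    revocation_plan: str = ""

    def blockers(self) -> list[str]:
        """Checklist items that should block approval until resolved."""
        issues = []
        if not self.per_tool_auth:
            issues.append("authorization is server-level, not per-tool")
        if self.can_surface_secrets:
            issues.append("server may surface secrets into agent context")
        if not self.has_audit_trace:
            issues.append("no logs or traces available for incident review")
        if not self.revocation_plan:
            issues.append("no documented way to revoke access quickly")
        return issues

review = ServerReview(name="internal-deploy-server", systems_reached=["ci", "prod"])
# With the defaults above this review reports four blockers, which is the
# point: a new server should have to earn its way to an empty list.
```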
If that checklist feels familiar, it should. It is the same least-privilege reasoning we recommended in our post on coding agent API keys. MCP does not remove that problem. It packages it into a more portable interface.
The security risk is not theoretical either. If you want a concrete example of why tool integration deserves scrutiny, our analysis of the Cursor MCP vulnerability shows how quickly prompt injection and configuration trust can become execution risk.
Build a gateway, not a direct mesh
The most robust operating model is not agent to every server directly. It is agent to gateway to approved servers. A gateway creates one place to manage discovery, policy, approval, rate limits, auditing, and revocation. That turns MCP from a loose collection of integrations into a reviewable platform surface.
| Pattern | Benefit | Main weakness |
|---|---|---|
| Agent to server directly | Fast setup and low coordination overhead | Policy drifts across clients and revocation becomes messy |
| Agent to local allowlist proxy | Basic control over discovery and blocking | Weak enterprise visibility and limited audit depth |
| Agent to enterprise gateway to servers | Central policy, approvals, logs, and scope management | Needs platform work and clear ownership |
This is also why internal developer platforms are suddenly interested in MCP. Once agents become normal, the gateway becomes the enforcement layer between human intent and external tool execution. That is a natural extension of the principles in our agent-first CLI design playbook.
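The core of the gateway pattern fits in a few lines. This sketch assumes a simple per-tool policy table and a stubbed forward call; a real gateway would also handle authentication, rate limits, and audit log shipping:

```python
class GatewayPolicyError(Exception):
    """Raised when a tool call violates gateway policy."""

class Gateway:
    """Minimal allowlist gateway between an agent and MCP servers."""

    def __init__(self, policy: dict[str, str]):
        self.policy = policy          # tool name -> "allow" | "approve" | "deny"
        self.audit_log: list[tuple[str, str]] = []

    def call(self, tool: str, approved: bool = False) -> str:
        decision = self.policy.get(tool, "deny")  # deny unknown tools by default
        self.audit_log.append((tool, decision))   # every attempt is logged
        if decision == "deny":
            raise GatewayPolicyError(f"{tool} is not allowlisted")
        if decision == "approve" and not approved:
            raise GatewayPolicyError(f"{tool} requires explicit approval")
        return self.forward(tool)

    def forward(self, tool: str) -> str:
        return f"forwarded {tool}"    # stand-in for the real server call

gw = Gateway({"repo.search": "allow", "prod.deploy": "approve"})
gw.call("repo.search")                  # allowed and logged
gw.call("prod.deploy", approved=True)   # allowed only with an approval flag
```

Note the deny-by-default lookup: a tool that nobody classified is treated as a tool nobody approved.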
Use per-tool authorization and risk tiers
Coarse trust is not enough. Saying a server is approved tells you very little if one tool is harmless read-only search and another can rotate credentials or deploy to production. Whenever the platform supports it, authorization should happen per tool or per capability bundle, not only per server.
| Tool class | Examples | Default policy |
|---|---|---|
| Low risk | Repo search, docs lookup, read-only issue queries | Allow with logging |
| Medium risk | Branch write, pull request comments, test reruns | Allow with session provenance |
| High risk | Dependency updates, package publishing, browser actions with side effects | Require explicit approval |
| Critical risk | Secrets access, IAM changes, production deploys, data deletion | Deny by default or dual approval |
Good review systems then map those risk tiers into merge policy. A documentation edit backed by read-only tools can move fast. A schema migration that relied on database tools and deployment helpers needs stronger evidence and likely another human. This is the same routing logic we recommended in our guide to coding agent guardrails and review gates.
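Routing each pull request to the policy of the riskiest tool it used can be this mechanical. The tool names, tier labels, and policy strings below are illustrative placeholders for your own classification:

```python
RISK_TIERS = {
    # Tool-to-tier mapping mirrors the table above; names are examples only.
    "repo.search": "low",
    "pr.comment": "medium",
    "pkg.publish": "high",
    "secrets.get": "critical",
}

MERGE_POLICY = {
    "low": "auto-mergeable with logging",
    "medium": "needs session provenance attached",
    "high": "needs explicit approval record",
    "critical": "needs dual approval or is denied outright",
}

def merge_requirement(tools_used: list[str]) -> str:
    """Route a pull request to the policy of its riskiest tool.

    Unknown tools default to critical, matching deny-by-default thinking.
    """
    order = ["low", "medium", "high", "critical"]
    worst = max((RISK_TIERS.get(t, "critical") for t in tools_used),
                key=order.index, default="low")
    return MERGE_POLICY[worst]

merge_requirement(["repo.search", "pkg.publish"])
# -> "needs explicit approval record"
```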
The pull request evidence pack for MCP activity
Reviewers should not have to reconstruct tool behavior from raw chat logs. The pull request needs a compact evidence pack that explains what capabilities were exposed and what actually happened during execution.
Minimum MCP evidence pack
- Server inventory: names, versions, owners, and whether each server is centrally managed
- Tool inventory: the exact tools exposed to the session and which were newly enabled
- Authorization summary: scope, token lifetime, approval policy, and who approved what
- Action trace: executed tool calls, timestamps, targets, and side effects
- Validation artifacts: tests, screenshots, demo traces, or browser recordings for changed behavior
- Policy result: pass, fail, or waived, with a reason reviewers can evaluate
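A gateway or CI step can reject pull requests whose evidence pack is incomplete before a human ever opens them. A sketch, assuming the pack is a plain dictionary keyed by the sections above (the key names are our own, not a standard):

```python
REQUIRED_SECTIONS = [
    # Mirrors the minimum evidence pack listed above.
    "server_inventory",
    "tool_inventory",
    "authorization_summary",
    "action_trace",
    "validation_artifacts",
    "policy_result",
]

def missing_evidence(pack: dict) -> list[str]:
    """Return the sections an evidence pack is missing or left empty."""
    return [s for s in REQUIRED_SECTIONS if not pack.get(s)]

pack = {
    "server_inventory": [{"name": "repo-server", "owner": "platform-team"}],
    "tool_inventory": ["repo.search", "repo.write"],
    "authorization_summary": {"scope": "repo:write", "approved_by": "alice"},
    "action_trace": [("repo.write", "2026-03-01T10:02:00Z", "branch: fix-123")],
}
missing_evidence(pack)  # -> ["validation_artifacts", "policy_result"]
```

The check is trivial on purpose: the hard work is producing the artifacts, and a loud, early failure is what keeps that work from being skipped.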
Notice how similar this is to our guidance on AI rewrite review artifacts. The difference is that MCP adds a capability layer. Large AI changes are hard to review because reviewers cannot hold the whole system in their head. Evidence packs solve that by summarizing both code intent and tool behavior.
What to measure once the gateway is live
If you do not measure the control plane, you will not know whether the policy is reducing risk or just adding ceremony. The right metrics are not vanity counts of how many tools were exposed. They are outcome metrics tied to reviewer confidence and incident prevention.
Unauthorized tool attempts
Shows where agents or prompts keep pushing against risky boundaries.
Approval latency
Shows whether policy routing is practical or quietly blocking delivery.
Evidence completeness
Shows how often pull requests arrive with enough tool context to review fast.
Resolved findings per risk tier
Shows whether the review loop is actually catching and fixing meaningful issues.
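The first two metrics above fall out of gateway audit events with almost no machinery. A sketch with an assumed event shape (timestamps as epoch seconds, a `decided_at` field only once an approval lands):

```python
from statistics import median

def approval_latency_p50(events: list[dict]) -> float:
    """Median seconds between an approval request and its decision.

    Events without a decision yet are excluded rather than counted as zero.
    """
    waits = [e["decided_at"] - e["requested_at"]
             for e in events if "decided_at" in e]
    return median(waits) if waits else 0.0

def evidence_completeness(prs: list[dict]) -> float:
    """Fraction of pull requests that arrived with a complete evidence pack."""
    if not prs:
        return 0.0
    return sum(1 for pr in prs if pr.get("evidence_complete")) / len(prs)

events = [
    {"requested_at": 0, "decided_at": 300},
    {"requested_at": 0, "decided_at": 900},
    {"requested_at": 0},                      # still pending, excluded
]
approval_latency_p50(events)  # -> 600.0
```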
For broader workflow health, combine these with the operational metrics in our posts on code review queue health and verification layer resolution rate. The objective is not more policy. The objective is safer merges with less reviewer guesswork.
A 30-day rollout plan
Most teams do not need a six-month platform program to get started. The practical sequence is visibility first, then policy, then blocking enforcement for the most dangerous actions.
Rollout sequence
- Week 1: inventory current MCP servers, tools, owners, and trust assumptions
- Week 2: route all agent sessions through a gateway in observe-only mode
- Week 3: require approvals for high-risk tools and attach evidence packs to pull requests
- Week 4: block direct access to critical tools and review every capability diff before merge
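The week-by-week progression maps naturally onto an enforcement mode in the gateway, so the same tool call is handled differently as the rollout matures. The mode names and tier labels in this sketch are illustrative:

```python
from enum import Enum

class Mode(Enum):
    OBSERVE = "observe"   # week 2: log everything, block nothing
    APPROVE = "approve"   # week 3: high-risk tools need sign-off
    ENFORCE = "enforce"   # week 4: critical tools blocked at the gateway

def decision(mode: Mode, tier: str, approved: bool) -> str:
    """What the gateway does with a tool call at each rollout stage."""
    if mode is Mode.OBSERVE:
        return "log"
    if tier == "critical" and mode is Mode.ENFORCE:
        return "block"
    if tier in {"high", "critical"} and not approved:
        return "hold for approval"
    return "log"

decision(Mode.OBSERVE, "critical", approved=False)  # -> "log"
decision(Mode.ENFORCE, "critical", approved=True)   # -> "block"
```

Starting in observe-only mode matters: it gives you real traffic data to classify tools against before any blocking policy can break a workflow.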
If your team is already dealing with parallel agent output, combine this rollout with the branch and ownership rules in our parallel coding agents guide. Tool governance and branch governance reinforce each other.
How Propel fits this workflow
Propel is designed for the part of the workflow where raw agent output becomes merge decisions. That means surfacing risk, organizing evidence, and helping reviewers focus on the few signals that actually change the outcome. In an MCP-heavy environment, that includes more than code comments. It includes capability changes, tool traces, validation artifacts, and policy status in one review surface.
The broader lesson is simple: if tool access expands, review scope must expand with it. A good MCP rollout is not just an integration project. It is a code review design problem.
FAQ
Is MCP itself unsafe?
No. MCP is a protocol, not a policy. The risk comes from how much authority you attach to servers and tools, how access is granted, and how much evidence you require before merge.
Do we always need an enterprise gateway?
Small teams can start with a narrow allowlist and local controls. As soon as multiple clients, shared servers, or high-risk tools enter the picture, a gateway becomes the cleanest way to keep policy consistent.
What should we enforce first?
Start with server inventory, per-tool risk classification, and action logging. Those three changes create the visibility you need before you turn on blocking approvals.
Closing perspective
MCP is making coding agents more useful because it makes the world around the repository more reachable. That same reach is why engineering leaders should care. The winning pattern is not to slow agents down. It is to make their tool access legible, scoped, and reviewable. Teams that do that will get the upside of agentic software delivery without handing every prompt a hidden production control plane.
Make agent tool access reviewable before it reaches production
Propel helps teams route risky AI changes, verify tool activity, and turn agent evidence into merge-ready code review.


