Why Agentic Pentesting Can’t Fix the False Positive Problem

Agentic pentesting promises smarter orchestration of tools, but it does not magically eliminate false positives. At its core, an agent still leans on the same scanners, payload generators, and detection heuristics that produced noisy results in the first place. If the underlying tools misclassify behavior or lack application context, the agent simply becomes a faster, more automated way to generate and route those misclassifications. In other words, you risk “scaling the noise” as much as scaling the signal.

Another limitation is that most agentic systems still struggle with business context and intent, which is where many false positives are born. A finding that looks critical in HTTP traces might be benign in the real-world workflow because of compensating controls, domain‑specific logic, or risk acceptance decisions that only humans understand. Agents can replay exploits and correlate signals, but they cannot reliably answer questions like “Is this test user data or real PII?” or “Would exploiting this actually harm the business?” Without that judgment, they often cannot confidently close the loop on whether something is truly a vulnerability or just an academic issue.

Finally, agentic pentesting introduces its own new sources of error that can masquerade as false positives. Misconfigured prompts, overly broad goals, or aggressive automation can lead agents to test unsupported flows, mishandle authentication, or misinterpret application responses. These mistakes can create “findings” that look real on paper but collapse under minimal human scrutiny. So while agentic approaches can help prioritize, group, and sometimes auto‑retest issues, they do not remove the need for human validation; they merely change where you spend your validation effort—from sifting through raw scanner output to scrutinizing AI‑curated results.

SSRF in Azure MCP Server Tools

Microsoft's March 2026 Patch Tuesday release, published March 10, disclosed CVE-2026-26118, a high-severity vulnerability in Azure Model Context Protocol (MCP) Server Tools. This server-side request forgery (SSRF) flaw, scored at CVSS 8.8, allows low-privileged attackers to manipulate user-supplied inputs and force the server into making unauthorized outbound requests to attacker-controlled endpoints. MCP, designed to standardize AI model integrations with external data sources, unexpectedly became a vector for privilege escalation in AI-driven Azure environments, highlighting the growing risks in agentic AI architectures.

At its core, exploitation involves crafting malicious payloads that trick the MCP server—running versions prior to 2.0.0-beta.17—into leaking its managed identity token. Attackers can then impersonate the server's identity to access sensitive Azure resources like storage accounts, virtual machines, or databases, all without needing admin rights or user interaction. Public proof-of-concept exploits, such as those on GitHub, amplify the threat, enabling rapid weaponization in targeted attacks against organizations leveraging MCP for AI workflows. This vulnerability underscores a classic SSRF pattern (CWE-918) tailored to cloud-native AI tools, where broad service principals often grant excessive permissions.
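To make the token-theft pattern concrete, the sketch below shows the kind of URL an attacker would smuggle into a user-controlled "fetch this URL" parameter. The tool and parameter shapes are hypothetical (the advisory does not publish exploit details), but the target is real: Azure's Instance Metadata Service (IMDS) identity endpoint, the classic destination for managed-identity token theft via SSRF. A server that fetches this URL from inside its own network context, with the `Metadata: true` header, and echoes the response back hands the attacker a bearer token for its identity.

```python
# Illustrative sketch of the CWE-918 pattern described above.
# The "tool parameter" framing is hypothetical; the endpoint below is
# Azure's real Instance Metadata Service (IMDS) identity endpoint.

IMDS_HOST = "169.254.169.254"  # link-local; reachable only from inside Azure

def build_ssrf_payload(resource: str) -> str:
    """Build the URL an attacker would place in a user-controlled
    'fetch this URL' style input (hypothetical tool shape). If the
    vulnerable server fetches it and returns the body, the response
    contains an access token for the server's managed identity."""
    return (
        f"http://{IMDS_HOST}/metadata/identity/oauth2/token"
        "?api-version=2018-02-01"
        f"&resource={resource}"
    )

# Example: request a token scoped to Azure Resource Manager.
payload = build_ssrf_payload("https://management.azure.com/")
print(payload)
```

Note that legitimate IMDS clients must send the `Metadata: true` request header; many SSRF primitives let attackers control headers too, or the vulnerable fetcher may add it itself.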

Organizations should prioritize patching via Microsoft's Security Update Guide, audit MCP deployments for over-privileged identities, and implement outbound request filtering to contain risks. As AI security evolves, this incident signals the need for runtime protections in MCP-based systems, including token rotation and anomaly detection for AI agent traffic. Application security teams, especially those testing AI integrations, can use tools like Burp Suite to validate fixes against SSRF payloads. Staying vigilant ensures AI innovation doesn't outpace defense in the cloud.
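The "outbound request filtering" recommendation above can be sketched as an egress allowlist check: resolve the destination and refuse anything that is not a globally routable address, which blocks the IMDS link-local address, RFC 1918 ranges, and loopback in one test. This is a minimal illustration, not a production filter — real deployments must also pin DNS answers (to defeat rebinding), handle redirects, and cover IPv6.

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_outbound_allowed(url: str, resolver=socket.gethostbyname) -> bool:
    """Return True only if the URL's host is (or resolves to) a
    globally routable IP. Blocks 169.254.169.254 (IMDS), RFC 1918
    ranges, loopback, etc. Minimal sketch: a production filter must
    also pin the resolved IP for the actual connection to prevent
    DNS-rebinding, and re-check every redirect hop."""
    host = urlparse(url).hostname
    if host is None:
        return False  # unparseable URL: fail closed
    try:
        ip = ipaddress.ip_address(host)       # host is an IP literal
    except ValueError:
        try:
            ip = ipaddress.ip_address(resolver(host))  # resolve name
        except (socket.gaierror, OSError, ValueError):
            return False                      # resolution failed: fail closed
    return ip.is_global

print(is_outbound_allowed("http://169.254.169.254/metadata"))  # False
print(is_outbound_allowed("http://10.0.0.5/internal"))         # False
print(is_outbound_allowed("http://8.8.8.8/"))                  # True
```

The `resolver` parameter exists so tests can inject a stub instead of doing live DNS; failing closed on unknown hosts is deliberate.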

Reference - https://www.tenable.com/cve/CVE-2026-26118

Supply Chains and AI: Decoding OWASP Top 10 2026 Changes

OWASP’s 2026 Top 10 reflects how quickly modern application threats are evolving, especially with AI-heavy and highly distributed architectures. The list continues to emphasize long-standing problems like Broken Access Control and Cryptographic Failures, but the new edition elevates security misconfigurations and software supply chain issues to first-class risks. This shift acknowledges that complex CI/CD pipelines, third‑party services, and AI-powered components have dramatically expanded the attack surface beyond just your own code.

A key change in 2026 is the explicit spotlight on software supply chain failures and the mishandling of exceptional conditions. These categories capture real‑world issues such as compromised libraries, poisoned models, insecure infrastructure-as-code templates, and fragile error handling that leads to data leakage or privilege escalation. Rather than treating these as edge cases, OWASP now frames them as systemic risks that can undermine even well‑written business logic. For teams shipping fast, this is a wake‑up call that “secure by default” must include dependencies, pipelines, and runtime behavior—not just input validation and authentication.
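One concrete, fail-closed defense against the compromised-library scenario above is hash pinning: every dependency artifact is verified against a recorded digest before use. The sketch below illustrates the pattern; the package name and the pinned-hash dictionary are hypothetical stand-ins for what would normally come from a lockfile or SBOM.

```python
import hashlib

# Hypothetical pinned-hash ledger (artifact name -> expected sha256).
# In practice this comes from a lockfile or SBOM, not a literal dict.
# The hash below is sha256 of empty bytes, used here purely for illustration.
PINNED = {
    "example-lib-1.0.0.tar.gz":
        "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
}

def verify_artifact(name: str, data: bytes) -> bool:
    """Return True iff the downloaded artifact's sha256 matches its
    pinned digest. Unknown artifacts fail closed: if it isn't in the
    ledger, it doesn't get installed."""
    expected = PINNED.get(name)
    if expected is None:
        return False  # unpinned dependency: reject
    return hashlib.sha256(data).hexdigest() == expected
```

The same idea underlies `pip install --require-hashes`, Go's `go.sum`, and npm lockfile integrity fields: a tampered or substituted package changes the digest and is rejected before it ever runs.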

The importance of the 2026 Top 10 lies in how it guides priorities for engineering, security architecture, and governance. It gives product and security leaders a shared vocabulary to justify investments in SBOMs, dependency scanning, secure AI integration patterns, and runtime protection. For practitioners, it acts as a practical roadmap: threat modeling features around these categories, aligning test cases and code reviews with them, and measuring progress over time. In a world where AI agents, APIs, and microservices are deeply interwoven, using the updated OWASP Top 10 as a baseline can be the difference between a resilient platform and one supply‑chain incident away from a major breach.