Microsoft Open-Sources RAMPART and Clarity for AI Agent Security Testing

Microsoft has released two open-source tools — RAMPART and Clarity — that bring security testing directly into the AI agent development process. The announcement came on May 20, 2026, on the Microsoft Security Blog, and the tools were made available to developers immediately. RAMPART automates red-team testing in CI pipelines, while Clarity verifies design assumptions before a single line of code is written.

TL;DR: Microsoft published RAMPART (Risk Assessment and Measurement Platform for Agentic Red Teaming) and Clarity as open-source projects. RAMPART is a pytest-native framework for repeatable agent security testing in CI, built on top of the earlier PyRIT tool. Clarity is a tool for verifying product assumptions before implementation. Both tools were created to shift security testing from a final-stage review into developers’ daily workflow. They were released on May 20, 2026.

What is RAMPART and how does it work in practice?

RAMPART, or Risk Assessment and Measurement Platform for Agentic Red Teaming, is a testing framework natively integrated with pytest, designed for writing and running security tests for AI agents. It was built so that developers can encode adversarial and benign scenarios as repeatable tests that run in CI. The tool builds on Microsoft’s earlier project, PyRIT, which was covered in the context of exploiting key AI agent benchmarks.

The key advantage of RAMPART is its ability to turn red-team findings and AI incidents into permanent regression coverage. When testers discover a vulnerability, a developer writes a test that checks for that vector on every subsequent build. What’s more, the framework can automatically generate attack variants from a single base vector.

During a presentation for The Register, Microsoft engineer Kumar shared concrete numbers: RAMPART generated nearly 100 different variants from a single attack vector and then ran close to 300 tests on a single asset. This approach goes far beyond what manual testing can achieve.

Microsoft’s approach is worth examining because it combines automation with repeatability. The framework requires no special infrastructure — it runs in the standard Python/pytest ecosystem, lowering the barrier to entry for development teams.

What is Clarity and why was it created?

Clarity is a tool described as a “structured sounding board” — helping teams verify that they are building the right thing before writing any code. It was designed as a response to a specific problem: many AI agent security vulnerabilities stem from flawed design assumptions, not from implementation issues.

Above all, Clarity forces product and engineering teams to discuss agent behavior in a structured format. Instead of building an agent and testing it after the fact, the team walks through defined questions and scenarios at the design stage. This approach is described in detail on the Microsoft Security Blog.

The tool does not generate code or run tests — its purpose is to catch flawed assumptions early. For example, a team can define how an agent should respond to manipulation attempts, what its permission boundaries are, and how it handles sensitive data before anyone starts writing business logic.

While Clarity may seem simpler than RAMPART, its value lies in preventing problems at the source. The cost of fixing a design flaw grows exponentially over time — which is why verifying assumptions before coding makes economic sense.

How do RAMPART and Clarity work together in the development workflow?

Microsoft designed both tools as complementary stages of a single workflow. Clarity operates at the pre-code stage — before implementation. RAMPART steps in after code is written, in CI pipelines. Together, they cover the full AI agent development lifecycle, from concept to production deployment.

According to DevOps.com, Microsoft’s goal is to shift security testing from the final review stage directly into the daily workflow. Traditionally, security entered the picture at the end — now tests are part of the build pipeline, just like unit or integration tests.

The workflow looks like this: first, the team goes through a Clarity exercise, defining agent behaviors, security boundaries, and edge-case scenarios. Then they implement the agent, and RAMPART runs adversarial tests in CI on every commit. If a security incident occurs in production, a new regression test is added.

The following table breaks down the roles of both tools:

Feature	Clarity	RAMPART
Workflow stage	Pre-code (design)	Post-code (CI/CD)
Activity type	Assumption verification	Adversarial test automation
Format	Structured discussion	pytest-native framework
Output	Assumptions document	Regression coverage in CI
Integration	Analytical tool	Native pytest integration

This approach makes sense because it combines prevention with automation.

What AI agent security problems do these tools solve?

AI agents differ from traditional applications in that their behavior is non-deterministic — the same input can generate different responses. Furthermore, agents have access to external tools: databases, APIs, file systems. The combination of these characteristics creates an attack surface that is difficult to test with conventional methods.

That is why Microsoft created tools covering attack vectors specific to agents. RAMPART enables testing of adversarial scenarios, such as prompt manipulation attempts, privilege escalation, and data exfiltration. The framework automatically generates attack variants, which would be time-consuming with manual testing.

Clarity, in turn, addresses the problem at an earlier stage — bad agent design. If an agent has overly broad permissions, lacks proper guardrails, or has undefined behavioral boundaries, no post-code testing will fix that. As described in the context of RCE vulnerabilities in AI agent frameworks, many vulnerabilities stem from fundamental architectural mistakes.

The main categories of problems addressed by both tools:

Prompt injection — RAMPART tests attack variants, Clarity forces teams to define agent responses
Privilege escalation — RAMPART checks whether an agent exceeds its granted permissions
Data exfiltration — tests verify whether the agent discloses sensitive information
Unintended actions — Clarity helps define what an agent should NOT do
Supply chain attacks — similar to those described in the npm Axios compromise
Missing guardrails — Clarity forces discussion about security boundaries
Inconsistent behaviors — RAMPART runs tests multiple times, catching non-determinism
Lack of regression coverage — RAMPART turns incidents into permanent tests

I recommend both tools to teams building AI agents, as they cover different stages of the development cycle and together provide a more complete security picture.

Why did Microsoft release these tools as open source?

The decision to publish RAMPART and Clarity under an open-source license fits into Microsoft’s broader AI security strategy. The company has been releasing security tools to the community for years — similar to the Open Source security at Astral discussed earlier on this blog, where the community co-creates security standards.

According to CyberScoop, Microsoft treats these tools as aids for incident responders and developers facing a growing number of attacks on agentic systems. Releasing the code as open source allows the community to audit, modify, and extend functionality.

Moreover, open source means transparency — security teams can thoroughly examine how the tools work under the hood, adapt them to their needs, and report bugs. This is particularly important for security tools, where trust in closed-source code can be problematic.

Microsoft is also building an ecosystem around AI agent security in this way. RAMPART is based on the earlier PyRIT project, so open source enables integration with existing tools and workflows. A similar approach was described in Open Source AI in March 2026, where code openness was becoming an industry standard.

It’s worth checking both tools’ repositories, as the documentation includes usage examples and integration instructions for existing CI pipelines. Publishing as open source is also a signal to the market — AI agent security is a priority, not an afterthought tacked onto the end of the process.

What are the technical requirements for running RAMPART?

RAMPART was designed as a framework natively integrated with pytest, which means it requires a standard Python environment and the pytest library. According to the Microsoft Security Blog, the tool builds on the earlier PyRIT project, so familiarity with that ecosystem eases adoption. The framework runs in standard CI pipelines without special infrastructure.

The key requirement is a Python environment with pytest installed. RAMPART uses pytest’s native mechanisms, so agent security tests run just like regular unit tests. This lowers the barrier to entry for teams already using pytest in their projects.

According to Let’s Data Science, RAMPART allows writing and running security tests as repeatable scenarios in CI. The framework automatically generates attack variants from a single base vector — during a presentation for The Register, engineer Kumar demonstrated generating nearly 100 variants from a single vector.

Furthermore, CI integration requires only adding a pytest call to the pipeline. There’s no need to install additional agents or servers — the framework operates as part of the standard application build process.

What does a typical test scenario look like in RAMPART?

A typical test scenario in RAMPART starts with defining an attack vector as a pytest test. According to The Hacker News, the framework allows encoding adversarial and benign scenarios as repeatable tests that run in CI. The developer defines the expected agent behavior and verifies that the agent does not fall for manipulation.

For example, a team can write a test checking whether an agent refuses to execute instructions outside its permissions. RAMPART automatically generates variants of this manipulation attempt — different phrasings, contexts, and languages. As Kumar told The Register, nearly 100 variants were generated from a single vector, and then close to 300 tests were run on a single asset.

When a test detects a vulnerability, the developer adds it to the regression coverage. On every subsequent build, the same scenario runs again, preventing the fixed bug from reappearing. This approach is analogous to unit testing, but addresses security behaviors.

Define an attack vector as a pytest test
Automatically generate variants from the base vector
Run tests multiple times on the same asset
Add production incidents as new regression tests
Integrate with CI pipelines on every commit
Verify adversarial and benign behaviors simultaneously
Report results in standard pytest format
Build permanent security regression coverage

How does Clarity support teams at the design stage?

Clarity functions as a structured sounding board that helps teams verify product assumptions before implementation. According to the Microsoft Security Blog, the tool forces product and engineering teams to discuss agent behavior through defined questions and scenarios. Its purpose is to catch flawed assumptions before anyone starts writing code.

Importantly, Clarity does not generate code or run technical tests. Instead, it guides the team through a set of questions about the agent’s security boundaries, responses to manipulation, sensitive data handling, and permissions. As described on Cryptonomist, Microsoft wants to move security checks closer to the daily workflow of building agentic software, rather than leaving them solely for the final review stage.

The tool is particularly valuable for teams building their first AI agents. It forces discussion of edge-case scenarios that are easy to overlook during design. For example: what does the agent do when a user requests access to data outside their role? How does it respond to contradictory instructions? Clarity helps structure these questions.

While the tool may seem simple, its value lies in preventing problems at the source. The cost of fixing a design flaw grows exponentially over time — verifying assumptions before coding makes economic sense.

Which organizations can benefit most from these tools?

RAMPART and Clarity are primarily aimed at teams building AI agents with access to external tools and data. According to CyberScoop, Microsoft created them as aids for incident responders and developers facing a growing number of attacks on agentic systems. Organizations deploying agents in production environments will see the greatest benefits.

Teams with pytest experience will be able to adopt RAMPART faster, since the framework is natively integrated with that ecosystem. Meanwhile, organizations in the early stages of building agents will benefit from Clarity, which helps avoid the fundamental architectural mistakes described in the context of RCE vulnerabilities in AI agent frameworks.

According to DevOps.com, Microsoft’s goal is to shift security testing from the final review stage into developers’ daily workflow. Organizations with mature CI/CD processes will be able to integrate RAMPART into existing build pipelines.

Teams building AI agents with access to APIs and databases
Organizations with mature CI/CD pipelines using pytest
Incident responders analyzing attacks on agentic systems
Companies in the early stages of designing agent architecture
Red teams testing the security of production agents

How does RAMPART build on the PyRIT foundation?

RAMPART was built on Microsoft’s earlier project, PyRIT, which was covered in the context of exploiting key AI agent benchmarks. According to Let’s Data Science, The Hacker News confirms that RAMPART builds on PyRIT, extending its capabilities with pytest integration and CI automation.

PyRIT was a tool for red-teaming generative AI, but it lacked native CI pipeline integration. RAMPART extends this foundation by adding the ability to encode scenarios as repeatable pytest tests. Developers can now turn red-team findings into permanent regression coverage that runs on every build.

The relationship between both tools is complementary. PyRIT provides the methodology and adversarial testing techniques, while RAMPART adds a layer of automation and developer workflow integration. A similar approach was described in the context of Open Source security at Astral, where the community co-creates security standards iteratively.

Frequently Asked Questions

Does RAMPART replace manual red-team testing?

No, RAMPART automates repeatable scenarios but does not eliminate the need for creative red-teaming. As Kumar demonstrated for The Register, nearly 100 variants were generated from a single vector — a scale impossible to achieve manually. The tool complements red-teamers’ work by turning their findings into regression tests.

Does Clarity require a local installation?

Clarity is an analytical tool for verifying assumptions, not a testing framework. According to the Microsoft Security Blog, it works as a structured discussion before coding. It does not require CI infrastructure or a Python environment — it guides the team through questions about agent behavior.

How quickly can RAMPART be adopted in an existing project?

Adoption requires adding pytest dependencies and writing test scenarios. Since RAMPART is natively integrated with pytest, teams familiar with that framework can get started quickly. According to WinBuzzer, the tool was designed for CI integration from the ground up, which minimizes configuration overhead.

Do the tools support agents built outside the Microsoft ecosystem?

Yes, both tools are open source and independent of any cloud platform. According to CyberScoop, Microsoft released them as aids for incident responders and developers in general, without restrictions to Azure. RAMPART requires Python and pytest, while Clarity is a technologically independent analytical tool.

Summary

RAMPART and Clarity are two complementary tools covering different stages of the AI agent lifecycle. Key takeaways:

Clarity verifies design assumptions before a single line of code is written, preventing architectural mistakes
RAMPART automates security testing in CI, generating nearly 100 variants from a single attack vector
Both tools are used together — Clarity at the design stage, RAMPART after implementation
Open source means transparency, auditability, and the ability to adapt to your own needs
The framework builds on the PyRIT foundation, adding pytest integration and CI automation

Check out the RAMPART and Clarity repositories on GitHub and read the full article on the Microsoft Security Blog. If your team is building AI agents with access to external tools, these tools can help shift security from a final-stage review directly into your daily workflow.