Implementing GenAI Red Teaming - The OWASP Way

Feb 20, 2025

Red Teaming is a time-tested approach to testing and bolstering cyber system security. That said, it often needs to evolve as technology does. The latest example of this is the exponential growth of GenAI and LLMs. We’ve seen red teaming emphasized in sources such as the EU’s AI Act and the NIST AI Risk Management Framework (RMF).

However, given the nascent and emerging technology, organizations may wonder where and how to get started with red teaming for GenAI. The OWASP publication “GenAI Red Teaming Guide: A Practical Approach to Evaluating AI Vulnerabilities” is so timely.

This article will look at key recommendations and takeaways from the guide.

Interested in sponsoring an issue of Resilient Cyber?

This includes reaching over 30,000 subscribers, ranging from Developers, Engineers, Architects, CISO’s/Security Leaders and Business Executives

Reach out below!

-> Contact Us <-

What is it?

It helps to first define red teaming, especially in the context of GenAI. OWASP defines it as a “structured approach to identify vulnerabilities and mitigate risks across AI systems” that combines traditional adversarial testing with AI-specific methodologies and risks. This includes aspects of GenAI systems such as models, deployment pipelines, and the various interactions within the broader system context.

OWASP emphasizes the roles of tools, technical methodologies, and cross-functional collaboration, including threat modeling, scenarios, and automation, all underpinned by human expertise.

Some key risks include prompt injection, bias and toxicity, data leakage, data poisoning, and supply chain risks, several of which you likely will recognize from the OWASP LLM Top 10 Risks.

To effectively implement any sort of red teaming engagement, some key steps are required, such as:

Defining objectives and scope
Assembling the team
Threat Modeling
Addressing the entire application stack
Debriefing, Post-Engagement Analysis, and Continuous Improvement

In short, GenAI Red Teaming complements traditional red teaming by focusing on AI-driven systems' nuanced and complex aspects. This includes accounting for new testing dimensions such as AI-specific threat modeling, model reconnaissance, prompt injection, guardrail bypass, and more.

Scope

As discussed above, GenAI Red Teaming builds on traditional red teaming by covering unique aspects of GenAI, such as models, model output, and the output and responses from models. GenAI Red Teams should examine how models can be manipulated to produce misleading and false outputs or “jailbroken,” allowing them to operate in ways that weren’t intended. Teams should also determine if data leakage can occur, all of which are key risks consumers of GenAI should be concerned with. OWASP recommends that testing consider both the adversarial perspective and that of the impacted user.

Leveraging NIST’s AI RMF GenAI Profile, OWASP’s guide recommends structuring AI Red Teaming to consider the lifecycle phases (e.g., design, development, etc.), the scope of risks such as model, infrastructure, and ecosystem, and the source of the risks.

Risks

As we have discussed, GenAI presents some unique risks, including model manipulation and poisoning, bias, and hallucinations, among many others, as depicted in the image above. For these reasons, OWASP recommends a comprehensive approach that has four key aspects:

Model evaluation
Implementation testing
System evaluation
Runtime analysis

These risks are looked at from three perspectives as well: security (operator), safety (users), and trust (users). OWASP categorizes these risks into three key areas:

Security, privacy, and robustness risk
Toxicity, harmful context, and interaction risk
Bias, content integrity, and misinformation risk

Agentic AI, in particular, has received tremendous attention from the industry, with leading investment firms such as Sequoia calling 2025 “the year of Agentic AI.” OWASP specifically points out multi-agent risks such as multi-step attack chains across agents, exploitation of tool integrations, and permission bypass through agent interactions.

OWASP recently produced v1 of their “Agentic AI—Threats and Mitigations” publication, including a multi-agent system threat model summary. Below are the numerous potential interaction points and activities that represent attack vectors, and you can read their full details in the agentic AI publication.

Threat Modeling for GenAI/LLM Systems

OWASP recommends threat modeling as a key activity for GenAI Red Teaming and cites MITRE ATLAS as a great resource to reference. Threat modeling is done to systematically analyze the system's attack surface and identify potential risks and attack vectors.

I have covered MITRE ATLAS previously, including:

Key considerations include the model's architecture, data flows, and how the system interacts with the broader environment, external systems, data, and sociotechnical aspects such as users and behavior. OWASP, however, points out that AI and ML present unique challenges because models may behave unpredictably because they are non-deterministic and probabilistic.

GenAI Red Teaming Strategy

Each organization's GenAI red teaming strategy may look different. OWASP explains that the strategy must be aligned with the organization's objectives, which may include unique aspects such as responsible AI goals and technical considerations.

GenAI red teaming strategies should consider various aspects as laid out in the above image, such as risk-based scoping, engaging cross-functional teams, setting clear objectives, and producing both informative and actionable reporting.

Blueprint for GenAI Red Teaming

Once a strategy is in place, organizations can create a blueprint for conducting GenAI red teaming. This blueprint provides a structured approach and the exercise's specific steps, techniques, and objectives.

OWASP recommends evaluating GenAI systems in phases, including models, implementation, systems, and runtime, as seen below.

Each of these phases has key considerations, such as the model's provenance and data pipelines, testing guardrails that are in place for implementation, examining the deployed systems for exploitable components, and targeting runtime business processes for potential failures or vulnerabilities in how multiple AI components interact at runtime in production.

This phased approach allows for efficient risk identification, implementing a multi-layered defense, optimizing resources, and pursuing continuous improvement. Tools should also be used for model evaluation to support speed of evaluation, efficient risk detection, consistency, and comprehensive analysis. The complete OWASP GenAI Red Teaming guide provides a detailed checklist for each blueprint phase, which can be referenced.

Essential Techniques

While there are many possible techniques for GenAI Red Teaming, it can feel overwhelming to determine what to include or where to begin. OWASP does, however provide what they deem to be “essential” techniques.

These include examples such as:

Adversarial Prompt Engineering
Dataset Generation Manipulation
Tracking Multi-Turn Attacks
Security Boundary Testing
Agentic Tooling/Plugin Analysis
Organizational Detection & Response Capabilities

This is just a subset of the essential techniques, and the list they provide represents a combination of technical considerations and operational organizational activities.

Maturing

As with traditional Red Teaming, GenAI Red Teaming is an evolving and iterative process in which teams and organizations can and should mature their approach both in tooling and in practice.

Due to AI's complex nature and its ability to integrate with several areas of the organization, users, data, and more, OWASP stresses the need to collaborate with multiple stakeholder groups across the organization, conduct regular synchronization meetings, have clearly defined processes for sharing findings, and integrate existing organizational risk frameworks and controls.

The team conducting GenAI Red Teaming should also evolve to add additional expertise as needed to ensure relevant skills evolve alongside the rapidly changing nature of the GenAI technology landscape.

Best Practices

The OWASP GenAI Red Teaming Guide closes out by listing some key best practices organizations should consider more broadly. These include examples such as establishing GennAI policies, standards, and procedures and establishing clear objectives for each red teaming session. It is also essential for organizations to have clearly defined and meaningful success criteria to maintain detailed documentation of test procedures, findings, and mitigations and to curate a knowledge base for future GenAI Red Teaming activities.

Resilient Cyber