David Brumley is the Chief AI and Science Officer at Bugcrowd and the former CEO of Mayhem Security, which was acquired as part of Bugcrowd's expansion into AI-driven application security. Brumley is an offensive security expert and academic researcher with over 20 years of experience.
A Professor at Carnegie Mellon University, Brumley teaches the next generation of cybersecurity talent.
In this conversation, we explore the pros and cons of AI agents in cybersecurity, the importance of enhanced guardrails, and the value of keeping a human in the loop to monitor agent actions.
Brumley explains that AI is only as smart as its training, that creativity comes from humans, and why proof-of-concept exploits remain essential for validating real issues.
Vishwa: Mayhem Security was built on autonomous systems that detect and patch vulnerabilities in real time. How does that technology evolve inside Bugcrowd’s human-in-the-loop environment?
David: AI is smart -- but it's only able to be smart about things it's been trained on. Humans bring that spark of creativity to challenge known problems in new ways and to adapt to unknown problems no one anticipates.
Mayhem (autonomy) and Bugcrowd (human creativity) give us the power to bring these two things together in a truly unique platform.
Vishwa: What practical lessons from the DARPA Cyber Grand Challenge still guide your work on AI models that find and fix flaws autonomously?
David: Don't break stuff. It's boring, but true.
We won the DARPA Cyber Grand Challenge (CGC) because we built into Mayhem the capability to check its own work, while other competitors fielded broken patches or churned code just to make a change.
We've taken that lesson to heart, and made the ability to test a decision as important as the ability to make a decision.
I believe that if you just add AI without the ability to test, you risk running as fast as you can off the deep end.
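To make that lesson concrete, here is a minimal sketch of a self-checking patch loop in Python. The callables (apply_patch, run_tests, triggers_poc) are hypothetical placeholders for illustration, not Mayhem's actual interface.

```python
from typing import Callable, Optional

def field_patch(
    apply_patch: Callable[[], Optional[bytes]],  # build the patched binary; None if the build fails
    run_tests: Callable[[bytes], bool],          # regression suite: does existing behavior survive?
    triggers_poc: Callable[[bytes], bool],       # replay the original proof-of-concept input
) -> Optional[bytes]:
    """Field a patch only after it has checked its own work."""
    binary = apply_patch()
    if binary is None:
        return None          # doesn't build: reject rather than churn code
    if not run_tests(binary):
        return None          # breaks existing functionality: reject
    if triggers_poc(binary):
        return None          # doesn't actually fix the vulnerability: reject
    return binary            # validated on all three axes; safe to field
```

The point is that validation failures at any step cause a rejection rather than a "best-effort" deployment -- the ability to test the decision gates the decision itself.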
Vishwa: Agents in cybersecurity are relatively new. How comfortable are you using AI agents in cybersecurity?
David: AI agents are technology that takes action on behalf of humans. They can be huge productivity boosters while massively decreasing time to remediation. But we also know AI agents hallucinate, and will confidently take incorrect or harmful action.
That's why it's important to have built-in guardrails, and a human on the loop making sure agents' actions are safe.
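As a rough illustration of what such guardrails might look like, here is a minimal Python sketch of a human-on-the-loop gate. The action names and scoping model are hypothetical assumptions for the example, not Bugcrowd's implementation.

```python
from dataclasses import dataclass

# Hypothetical set of actions that always require human sign-off.
CRITICAL_ACTIONS = {"deploy_patch", "delete_asset", "modify_firewall_rule"}

@dataclass
class AgentAction:
    name: str
    target: str

def execute(action: AgentAction, in_scope: set[str]) -> str:
    # Guardrail 1: agents may only touch assets a human has scoped in.
    if action.target not in in_scope:
        return f"blocked: {action.target} is out of scope"
    # Guardrail 2: critical actions pause for explicit human approval.
    if action.name in CRITICAL_ACTIONS:
        answer = input(f"Approve {action.name} on {action.target}? [y/N] ")
        if answer.strip().lower() != "y":
            return "denied by human reviewer"
    return f"executed {action.name} on {action.target}"
```

Routine actions flow through unattended; scoping violations are blocked outright; only the critical middle ground interrupts a human.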
Vishwa: What are the advantages of using them? Where are they most beneficial for security teams?
David: The big advantage is easy to overlook -- it's natural language interaction. In theory, we've always been able to orchestrate security at the speed and scale of AI agents.
The problem is the amount of time it takes to write that orchestration and get it deployed. AI agents solve that problem precisely because they don't need to be programmed -- they just need to be directed and monitored.
Vishwa: How does combining crowdsourced hacker ingenuity with AI reasoning systems change the signal-to-noise ratio in vulnerability discovery?
David: We've had automated vulnerability analysis for years, like SAST and SBOM analysis, but their core problem has been noise. The problem is that these legacy technologies never prove a problem with a proof-of-concept (POC) exploit. When you have a POC, you know an issue is real; without one, you're just guessing.
Bugcrowd and Mayhem both believe in the value of a POC, but how they get there is different. Humans have leaps of intuition that can skip steps, while Mayhem reasons mathematically. The combination of the two is where the magic happens: let humans make leaps while AI reasons down paths to see what's next and provides options.
I'd go a step further and say you need ingenuity *and* accountability. Ultimately people are responsible for outcomes, so we're putting in the right guardrails to make sure humans scope autonomous actions and are asked before performing critical actions.
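To illustrate the difference a POC makes, here is a minimal Python sketch of POC-backed triage, assuming the target is a local binary and a crash shows up as a signal-terminated process; this is a simplified stand-in, not Mayhem's engine.

```python
import subprocess

def confirm_finding(target: str, poc_input: bytes, timeout: float = 5.0) -> bool:
    """A report is filed only if the POC input actually reproduces the issue."""
    try:
        result = subprocess.run(
            [target], input=poc_input, capture_output=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return True  # reproducible hang: a demonstrable denial of service
    # On POSIX, a negative return code means the process was killed by a
    # signal (e.g., SIGSEGV) -- evidence of a real memory-safety issue.
    return result.returncode < 0
```

A static finding with no reproducing input never makes it through this gate, which is exactly the noise problem described above.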
Vishwa: Where do you see the biggest engineering challenges in merging Bugcrowd’s crowdsourced platform telemetry with Mayhem’s reinforcement-learning engines?
David: We don't really talk about future work, but… I can tell you that we'll have something quite special to show you in the near future.
Vishwa: As AI systems grow more capable of generating and testing code, how do you think researchers should balance innovation with the risk of unintentionally creating exploitable outputs?
David: AIs will get smarter, but we shouldn't lose sight that ultimately people are responsible. We believe responsible AI is about building in guardrails, and ultimately making sure a human signs off on the task.
Vishwa: As AI testing expands to APIs, dependencies, and SBOM analysis, what data-integration or model-validation hurdles must be solved to keep accuracy near 100 percent?
David: I think "app-intent" vulnerabilities are the hardest. These come up when a vulnerability looks fine programmatically and only becomes obvious once you understand the human intent of the app. For example, imagine an app where you can buy a gift, which generates a "share as gift" URL you send to the recipient. If that link is posted on social media, it allows anyone to redeem the gift. That's the sort of problem you can only find by understanding the app's intended semantics.
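Brumley's gift-link example can be sketched in a few lines of Python (using Flask for brevity; the app, its route, and the token format are hypothetical). Note that nothing here is programmatically wrong, which is why scanners miss it:

```python
from flask import Flask, request

app = Flask(__name__)
gifts = {"a1b2c3": {"item": "gift card", "redeemed": False}}  # token -> purchased gift

@app.route("/redeem")
def redeem():
    gift = gifts.get(request.args.get("token", ""))
    if gift is None or gift["redeemed"]:
        return "invalid or already redeemed", 404
    # App-intent flaw: possession of the token alone grants redemption, so
    # anyone who sees the shared URL -- say, on social media -- can claim
    # the gift. The code is "correct"; the intent (only the intended
    # recipient redeems) is what's violated.
    gift["redeemed"] = True
    return f"redeemed: {gift['item']}"
```

No memory error, no injection, no broken auth check -- the bug only exists relative to what the feature was supposed to mean.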
Vishwa: As offensive and defensive AI continue to converge, what benchmarks or review layers help ensure machine actions stay aligned with security goals?
David: For businesses, the biggest piece of advice I can give is don't start with the tech, start with your business purpose. Then, when you look at AI, ask how it helps with that purpose. In the broader community, we need to keep focusing on how AI proves vulnerabilities, not just reports vulnerabilities.
Part of what we're doing is setting up training that helps LLMs do exactly that -- validate that findings are real and not just hallucinations.
Vishwa: When it comes to AI security in 2026, what will people be talking about and what should organizations be preparing for?
David: What we used to think of as a nation-state capability 10 years ago is going to become much more accessible as criminals adopt AI. That means businesses need to double down on owning security.
For example, today we see too many businesses only retroactively patching known vulnerabilities, and spending zero time looking for unknown zero-days. That will need to change for them to survive.