When Detection Fails Quietly, What Are Teams Really Chasing?
Question: In offensive work and malware analysis, what are the things you’ve learned to look for over time, and which of those do machines still struggle to pick up?
Eric Hulse, Director of Security Research, Command Zero
For the first few years in offensive security and malware analysis, you hunt indicators: IOCs, signatures, behavioral rules. You match against known-bad.
What changes with experience is that you stop asking "does this match something I've seen before?" and start asking "does this make sense?" That shift is enormous. And it's almost entirely absent from current detection tooling.
The Offensive Reality: Discipline Over Dominance
Skilled attackers aren't trying to maximize access. They're trying to minimize exposure while achieving the objective. Those are very different goals, and almost all detection logic is built to catch the first one.
When you're operating with real discipline, you don't climb the privilege ladder for the sake of it. If a compromised EA account has broad read permissions on a file server due to a misconfiguration, and the data you need is sitting right there, you don't escalate further.
Domain admin might trip a wire. The EA account reading files it's technically permitted to read doesn't trip anything, because nothing about the action is technically wrong.
The attack surface isn't the most privileged path. It's the path of least resistance that still accomplishes the goal.
Every unnecessary action is a detection opportunity you're handing to the defender. Good operators don't do that. And detection logic built around the assumption that attackers reach for the highest available privilege can simply be opted out of.
Where the Machines Fall Short
The central failure of current detection models is context collapse. A model trained on global telemetry scores a PowerShell execution with base64 encoding as medium-high risk. That same execution in an environment where IT runs base64-encoded scripts in every backup job is noise.
The model knows what normal looks like across its training dataset. It doesn't know what normal looks like for this org. Alert fatigue isn't a tuning problem. It's an architecture problem.
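To make that concrete, here's a toy sketch of what org-baseline scoring could look like. Every name, number, and threshold below is an illustrative assumption, not any vendor's API:

```python
# Toy sketch of org-baseline scoring. All names, numbers, and thresholds
# are illustrative assumptions, not any vendor's API.
from collections import Counter

def org_adjusted_score(global_score: float, technique: str,
                       org_history: Counter, total_events: int) -> float:
    """Discount a globally trained risk score by how routine the
    technique already is in *this* environment."""
    if total_events == 0:
        return global_score  # no local baseline yet; fall back to global
    local_prevalence = org_history[technique] / total_events
    # If a technique shows up in >=5% of local events, it's routine here,
    # whatever the global model thinks. (5% is an arbitrary example.)
    discount = min(local_prevalence / 0.05, 1.0)
    return global_score * (1.0 - 0.9 * discount)

# The backup-job scenario above: base64 PowerShell is everywhere locally.
history = Counter({"powershell_base64": 4_200, "everything_else": 37_000})
print(org_adjusted_score(0.7, "powershell_base64", history, 41_200))  # ~0.07
```

The point isn't the math. It's that the discount comes from this environment's history, not the global training set.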
Machines also cannot infer intentionality.
- An insider slowly staging data across thirty days in folders that look like project work,
- a contractor whose access patterns drift quietly toward sensitive systems over a month,
- a threat actor operating entirely within permitted access because a misconfiguration handed them what they needed
None of these generates alerts, because none of them is technically wrong.
The detection gap for disciplined actors isn't a visibility problem. It's a judgment problem, and judgment doesn't come from telemetry alone.
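The staging example is worth making concrete. Here's a toy illustration of why per-day volume rules miss it; the rule, the threshold, and the event stream are all hypothetical:

```python
# Why per-day volume rules miss disciplined staging. The rule, threshold,
# and event stream are hypothetical.
DAILY_ALERT_THRESHOLD_MB = 500  # a typical DLP-style per-day volume rule

# An insider moving ~200 MB/day into folders named like project work:
days = [(f"2024-03-{d:02d}", 200) for d in range(1, 31)]

alerts = [day for day, mb in days if mb > DAILY_ALERT_THRESHOLD_MB]
staged_total_mb = sum(mb for _, mb in days)

print(f"alerts fired: {len(alerts)}")         # 0
print(f"data staged:  {staged_total_mb} MB")  # 6000 MB, silently
```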
What Is Shifting and Where It's Heading
Two structural changes have widened the gap in the last two years.
- First, the identity surface has outrun the detection stack. Modern attacks live in token theft, OAuth abuse, and conditional access bypass, and most EDR and SIEM platforms weren't built to reason about any of it (a sketch of the kind of signal they miss follows this list).
- Second, LLMs are accelerating attacker capability faster than they're improving defender capability. Reconnaissance, phishing personalization, and code generation are all getting cheaper and faster on the offensive side. Detection synthesis is improving, but it's trailing.
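On the first point, here's a minimal sketch of the identity reasoning most stacks skip: correlating where a token was issued with where it's later presented. The event shape and field names are assumptions for illustration:

```python
# Minimal sketch of identity reasoning most stacks skip: correlating where
# a token was issued against where it is later presented. The event shape
# and field names are assumptions for illustration.
from dataclasses import dataclass

@dataclass
class TokenEvent:
    token_id: str
    action: str     # "issued" or "used"
    device_id: str
    source_ip: str

def token_replay_findings(events: list[TokenEvent]) -> list[str]:
    issued_on: dict[str, str] = {}  # token_id -> device it was issued to
    findings = []
    for e in events:
        if e.action == "issued":
            issued_on[e.token_id] = e.device_id
        elif e.action == "used" and e.token_id in issued_on:
            if e.device_id != issued_on[e.token_id]:
                findings.append(f"{e.token_id}: issued on {issued_on[e.token_id]}, "
                                f"replayed from {e.device_id} ({e.source_ip})")
    return findings

events = [
    TokenEvent("tok1", "issued", "laptop-ada", "10.0.4.7"),
    TokenEvent("tok1", "used",   "laptop-ada", "10.0.4.7"),    # normal
    TokenEvent("tok1", "used",   "vps-unknown", "185.0.0.9"),  # theft signature
]
print(token_replay_findings(events))
```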
The defenders who close this gap first won't do it by buying better alerting. They'll do it by building better investigative judgment into their workflows, human and automated.
Near-term, graph-based detection that reasons from attacker objectives backward through the kill chain is the most promising architectural direction. It won't replace analyst judgment. It will finally give analysts something worth judging.
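A toy version of objective-backward reasoning, to show the shape of it. The nodes and edges are invented; a real implementation would build the graph from IAM and directory data:

```python
# Toy objective-backward walk over an access graph. Nodes and edges are
# invented; a real graph would come from IAM and directory data.
# (Cycle handling is omitted for brevity.)
from collections import deque

# Edge (subject, target) means "subject can reach or read target".
edges = [
    ("contractor", "jump-host"),
    ("jump-host", "file-server"),
    ("ea-account", "file-server"),   # the quiet misconfiguration
    ("file-server", "crown-jewels"),
]

def paths_to(objective: str, edges: list[tuple[str, str]]) -> list[list[str]]:
    """Walk backward from the objective, enumerating every chain of
    permitted access that ends at it."""
    rev: dict[str, list[str]] = {}
    for src, dst in edges:
        rev.setdefault(dst, []).append(src)
    paths, queue = [], deque([[objective]])
    while queue:
        path = queue.popleft()
        preds = rev.get(path[-1], [])
        if not preds:                       # reached an entry point
            paths.append(list(reversed(path)))
        for p in preds:
            queue.append(path + [p])
    return paths

for p in paths_to("crown-jewels", edges):
    print(" -> ".join(p))
# ea-account -> file-server -> crown-jewels
# contractor -> jump-host -> file-server -> crown-jewels
```

Note the output: the quiet EA-account path surfaces right next to the noisy multi-hop one, which is exactly the comparison an analyst needs.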
What Should Change Right Now
Stop measuring detection by alert count. Know what percentage of actual compromise events generated a signal you acted on in time.
Almost no organization tracks this.
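Computing it is trivial once you keep the records; what's rare is keeping the records. A sketch, with invented incident data:

```python
# The metric: of confirmed compromises, how many produced a signal that
# was acted on in time. The incident records are invented.
incidents = [
    # (incident, signal_fired, acted_within_sla)
    ("oauth-consent-abuse", True,  True),
    ("insider-staging",     False, False),  # no signal at all
    ("token-replay",        True,  False),  # fired, sat in the queue
]

caught = sum(1 for _, fired, acted in incidents if fired and acted)
print(f"actioned-detection coverage: {caught / len(incidents):.0%}")  # 33%
```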
Run red team exercises that model disciplined attackers. Most engagements are scoped to generate detectable activity by design, which means they systematically test for the wrong attacker profile.
If your detections only catch operators who make noise, you're only protected against the least sophisticated ones.
Audit permission sprawl before the attacker finds it.
Start with service accounts, EA-level roles, and shared mailboxes. These carry ambient over-permission that accumulated quietly for years and requires no exploitation to abuse.
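A first-pass audit can be as simple as comparing grants held against grants exercised. The export format and the 20% threshold below are assumptions; substitute whatever your IAM system emits:

```python
import csv, io

# First-pass sprawl audit over a hypothetical permissions export. Column
# names are assumptions; substitute whatever your IAM system emits.
EXPORT = """account,type,grants,grants_used_90d
svc-backup,service,214,9
ea-shared,ea-role,88,61
mbx-legal,shared-mailbox,45,2
"""

for row in csv.DictReader(io.StringIO(EXPORT)):
    granted, used = int(row["grants"]), int(row["grants_used_90d"])
    if used < granted * 0.2:  # <20% of grants exercised: review candidate
        print(f"{row['account']} ({row['type']}): {granted} granted, {used} used")
```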
Organizations are bracing for AI-scale attacks while still lacking a reliable inventory of their own assets, services, and exposure. The foundational work hasn't been done.
If you don't have visibility into what you're running, what's misconfigured, and where your gaps are, AI-powered attacks don't create a new problem. They just move through the existing ones faster, and you still won't catch them.
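The inventory gap itself is a two-line set difference once you have both sides; the hard part is having both sides. An invented example:

```python
# Declared inventory versus what telemetry actually sees. Both sets are
# invented examples.
declared = {"web-01", "db-01", "vpn-gw"}
observed = {"web-01", "db-01", "vpn-gw", "jenkins-old", "dev-share"}

print("shadow assets:", sorted(observed - declared))  # running, unmanaged
print("stale records:", sorted(declared - observed))  # in the CMDB only
```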
The Honest Close
The machines are good at pattern matching at scale and will get better. What they cannot do is reason about what an attacker is deliberately not doing, or whether what they're seeing is consistent with a human making careful decisions inside the environment.
Detection logic built around escalation and noise will keep catching the operators who don't know what they're doing. The ones who do are already through.
The gap isn't compute. It's context, continuity, and craft.