The Detection Rebuild, Part 2: Automating Detection Engineering Without Breaking the SOC


Hot on the heels of Part 1, where we focused on fixing the signal problem, Part 2 is all about scale. Because once you’ve cleaned up your alerts and improved your detection quality, the next question is: how do you keep it that way without burning your team out?

This post is a practical look at how to automate your detection engineering pipeline—safely, reliably, and without flooding your environment with untested rules or brittle logic. We’re not here to add DevOps jargon for the sake of it. We’re here to build sustainable detection systems that evolve without collapsing under their own complexity.

Let’s go.


What Detection Engineering Automation Shouldn’t Be

First, let’s set some boundaries. Automation doesn’t mean:

  • Generating 500 rules from every new threat report
  • Enabling every Sigma rule in the repo “just in case”
  • Bulk-deploying untested detection logic into production

Automation without validation is just accelerating failure.


The Goal: Detection-as-Code, Built Like Software

To scale detection engineering, you need to treat rules like code. That means:

  • Version control: Every rule lives in Git.
  • Peer review: No rule gets deployed without review.
  • CI/CD pipeline: Automated tests run before anything hits production.
  • Rollback: If something breaks, you can revert instantly.

This isn’t just clean—it’s survival.


Building a Detection CI/CD Pipeline

1. Rule Linting & Syntax Validation

Run static analysis on detection rules to catch formatting or logic errors. Example tools:

  • sigmac (or its successor, sigma-cli) for Sigma rules
  • Custom YAML/JSON schema validators

Example:

In one org, broken YAML indentation silently caused several detection rules to fail during a weekly sync—but no one noticed because there was no validation step. After adding a pre-commit linter, rule failures dropped to zero, and several key rules that had silently stopped alerting came back online.
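
A minimal version of that pre-commit check can be a few lines of Python. The sketch below assumes rules live as Sigma-style YAML under a rules/ directory and only checks for parse errors plus a handful of required fields; the paths and field list are placeholders for your own layout.

```python
#!/usr/bin/env python3
"""Minimal rule linter: fail the commit if any detection rule is malformed.

Assumes rules live under rules/ as Sigma-style YAML; adjust the directory
and required fields to match your own repo layout.
"""
import sys
from pathlib import Path

import yaml  # pip install pyyaml

REQUIRED_FIELDS = {"title", "id", "logsource", "detection"}  # adjust as needed


def lint_rule(path: Path) -> list[str]:
    """Return a list of human-readable problems for one rule file."""
    try:
        rule = yaml.safe_load(path.read_text())
    except yaml.YAMLError as exc:
        return [f"{path}: YAML parse error: {exc}"]
    if not isinstance(rule, dict):
        return [f"{path}: expected a mapping at the top level"]
    missing = REQUIRED_FIELDS - rule.keys()
    if missing:
        return [f"{path}: missing required fields: {sorted(missing)}"]
    return []


def main() -> int:
    errors = []
    for path in sorted(Path("rules").rglob("*.yml")):
        errors.extend(lint_rule(path))
    for err in errors:
        print(err, file=sys.stderr)
    return 1 if errors else 0


if __name__ == "__main__":
    sys.exit(main())
```

Wire it into a pre-commit hook or a CI job so a malformed rule can never reach the deploy step.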

2. Test Against Historical Data

Don’t ship rules blind. Run them in dry-run mode against the last 30 days of data:

  • Count how many matches would have occurred
  • Compare against known alerts
  • Identify noise before it hits production

Example:

A detection for “rare login location” matched 1,200 events over 30 days. A quick review showed 90% were due to employees traveling for conferences. After tuning with known event calendars and tagging approved destinations, the final rule averaged just 3 alerts a week—and caught a real stolen token incident the next quarter.
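
If your backend is Elasticsearch, the dry run can be as simple as counting how often the translated query would have matched. A rough sketch, assuming the 8.x elasticsearch-py client and a query already translated from the rule (for example via sigma-cli); the host, index pattern, and sample query are placeholders.

```python
"""Dry-run a translated detection query against the last 30 days of data.

Assumes an Elasticsearch backend, the 8.x elasticsearch-py client, and a
query string already translated from the rule; index and host are placeholders.
"""
from elasticsearch import Elasticsearch  # pip install elasticsearch

es = Elasticsearch("http://localhost:9200")


def dry_run(query_string: str, index: str = "logs-*") -> int:
    """Return how many events the rule would have matched in the last 30 days."""
    query = {
        "bool": {
            "must": [{"query_string": {"query": query_string}}],
            "filter": [{"range": {"@timestamp": {"gte": "now-30d/d"}}}],
        }
    }
    resp = es.count(index=index, query=query)
    return resp["count"]


if __name__ == "__main__":
    # Hypothetical "rare login location" style query for illustration only.
    hits = dry_run('event.action:"logon" AND NOT source.geo.country_iso_code:US')
    print(f"Would have fired on {hits} events in the last 30 days")
```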

3. Simulate Alerts with Controlled Inputs

For behavioral detections:

  • Replay simulated attack data into your test environment
  • Validate that your rule fires under known conditions

Example:

After deploying a lateral movement rule, one team replayed Atomic Red Team modules through a sandbox. Half the expected alerts didn’t trigger. Root cause? Their rule was too tightly scoped to specific process names. With broader parent-child logic and command-line keyword matching, they got full coverage without adding noise.
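
One way to make that check repeatable: after each replay, query your alert store for the detections you expected and report the gaps. A sketch along those lines, assuming alerts land in an 'alerts-*' index with a 'rule.name' field; both are stand-ins for your own alert schema.

```python
"""Check that expected detections fired after a simulated attack replay.

Run this after replaying test data (e.g. Atomic Red Team) into the sandbox.
The 'alerts-*' index and 'rule.name' field are assumptions; adjust to your
own alert schema (keyword mappings may differ).
"""
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

EXPECTED_RULES = [
    "Lateral Movement - Remote Service Creation",   # hypothetical rule names
    "Lateral Movement - Admin Share Access",
]


def rule_fired(rule_name: str, window: str = "now-1h") -> bool:
    """True if at least one alert for the rule exists within the test window."""
    resp = es.count(
        index="alerts-*",
        query={
            "bool": {
                "must": [{"term": {"rule.name": rule_name}}],
                "filter": [{"range": {"@timestamp": {"gte": window}}}],
            }
        },
    )
    return resp["count"] > 0


if __name__ == "__main__":
    missing = [name for name in EXPECTED_RULES if not rule_fired(name)]
    if missing:
        print("Coverage gaps, rules that did not fire:")
        for name in missing:
            print(f"  - {name}")
    else:
        print("All expected detections fired.")
```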

4. Tag & Route by Environment

Tag rules for:

  • Production
  • Audit-only / test mode
  • High-risk / watchlist

Route alerts accordingly. Not all rules need to page the on-call engineer.

Example:

An overzealous detection for suspicious registry changes triggered hundreds of tickets in prod before the team realized it was related to an internal update script. Now, any new rule starts in “audit-only” mode for a week, with tags for environment, priority, and ownership. Tickets are only created after review.
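
Routing can be a small function that reads the rule’s tags and decides whether an alert pages, opens a ticket, or just gets logged for weekly review. The tag names and destinations below are illustrative, not prescriptive.

```python
"""Route alerts based on rule tags instead of paging on everything.

A minimal sketch: the tag names ('audit-only', 'high-risk', 'production')
and the destinations are placeholders for your own integrations.
"""
from dataclasses import dataclass, field


@dataclass
class Alert:
    rule_name: str
    tags: set[str] = field(default_factory=set)


def route(alert: Alert) -> str:
    """Decide where an alert goes based on its rule's tags."""
    if "audit-only" in alert.tags:
        return "log-for-weekly-review"   # new rules start here for a week
    if "high-risk" in alert.tags:
        return "page-on-call"            # only watchlist rules wake anyone up
    if "production" in alert.tags:
        return "create-ticket"           # normal triage queue
    return "log-for-weekly-review"       # untagged rules never page anyone


if __name__ == "__main__":
    new_rule_alert = Alert("Suspicious Registry Change", {"audit-only", "windows"})
    print(route(new_rule_alert))  # -> log-for-weekly-review
```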


Automating the Boring, Not the Broken

Focus your automation on the high-friction, low-risk parts of the workflow:

  • Rule deployment: Auto-publish validated rules to your SIEM
  • Alert suppression: Auto-suppress known false positives based on analyst feedback (see the sketch after this list)
  • Metadata enrichment: Auto-tag alerts with threat intel, asset context, or prior alert history
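
A suppression layer doesn’t have to be fancy. Here’s a sketch of what analyst-approved suppressions might look like if you kept them as structured entries alongside the rules; the entry format and field names are assumptions.

```python
"""Suppress alerts that match analyst-approved false-positive entries.

The suppression entry format (rule name plus one field/value match) and the
idea of keeping the list in the same Git repo as the rules are assumptions.
"""
from dataclasses import dataclass


@dataclass
class Suppression:
    rule_name: str
    field: str
    value: str
    reason: str  # why an analyst approved this, for the audit trail


SUPPRESSIONS = [
    Suppression(
        "Suspicious Registry Change",
        "process.name",
        "patch_runner.exe",              # hypothetical internal update script
        "internal update script, approved during FP review",
    ),
]


def is_suppressed(rule_name: str, event: dict) -> bool:
    """True if an analyst-approved suppression covers this alert."""
    return any(
        s.rule_name == rule_name and event.get(s.field) == s.value
        for s in SUPPRESSIONS
    )


if __name__ == "__main__":
    event = {"process.name": "patch_runner.exe", "host.name": "wks-042"}
    print(is_suppressed("Suspicious Registry Change", event))  # -> True
```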

Avoid:

  • Blindly generating rules from threat feeds
  • Auto-enabling rules without signal testing

Example:

One company added auto-enrichment for all file hash alerts using the VirusTotal API. Within a week, analysts could triage commodity malware in seconds instead of minutes. No extra alerts were added—just more context per event.
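
The enrichment itself is a single API call. A sketch using the VirusTotal v3 files endpoint; verify the response fields against your API version, and treat the error handling here as a starting point.

```python
"""Enrich a file-hash alert with VirusTotal context before it reaches an analyst.

A minimal sketch against the VirusTotal v3 files endpoint; the response
parsing follows the documented last_analysis_stats structure, but confirm
the exact fields for your API plan and version.
"""
import os

import requests  # pip install requests

VT_API_KEY = os.environ["VT_API_KEY"]  # raises if the key isn't configured


def enrich_hash(sha256: str) -> dict:
    """Return a small context dict to attach to the alert."""
    resp = requests.get(
        f"https://www.virustotal.com/api/v3/files/{sha256}",
        headers={"x-apikey": VT_API_KEY},
        timeout=10,
    )
    if resp.status_code == 404:
        return {"vt_known": False}       # hash never seen by VirusTotal
    resp.raise_for_status()
    stats = resp.json()["data"]["attributes"]["last_analysis_stats"]
    return {
        "vt_known": True,
        "vt_malicious": stats.get("malicious", 0),
        "vt_suspicious": stats.get("suspicious", 0),
    }
```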


Detection Rule Lifecycle: A Healthy Flow

To keep your detection stack healthy and scalable, every rule should follow a lifecycle:

  1. Draft → created and documented in Git
  2. Reviewed → peer-reviewed with threat context and test plan
  3. Test Mode → dry-run alerts captured and evaluated
  4. Production → alerting enabled, metrics tracked
  5. Monitored → rule performance reviewed regularly
  6. Tuned/Retired → based on feedback and activity

This helps avoid stale rules, redundant logic, and alert fatigue from the inside out.
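
If you want the lifecycle enforced rather than just documented, encode the stages and allowed transitions explicitly so a rule can’t jump straight from draft to production. In this sketch, monitoring and tuning are folded into the production stage for brevity, and how the stage is stored (for example, a status field in the rule’s YAML) is an assumption.

```python
"""Encode the rule lifecycle as explicit stages with allowed transitions.

Stage names mirror the lifecycle above; monitoring/tuning are treated as
part of production here, and storage of the stage is left to you.
"""
from enum import Enum


class Stage(Enum):
    DRAFT = "draft"
    REVIEWED = "reviewed"
    TEST_MODE = "test_mode"
    PRODUCTION = "production"
    RETIRED = "retired"


ALLOWED = {
    Stage.DRAFT: {Stage.REVIEWED},
    Stage.REVIEWED: {Stage.TEST_MODE, Stage.DRAFT},     # review can bounce back
    Stage.TEST_MODE: {Stage.PRODUCTION, Stage.DRAFT},   # noisy rules go back to draft
    Stage.PRODUCTION: {Stage.TEST_MODE, Stage.RETIRED}, # tuning or retirement
    Stage.RETIRED: set(),
}


def transition(current: Stage, new: Stage) -> Stage:
    """Refuse lifecycle jumps like draft -> production."""
    if new not in ALLOWED[current]:
        raise ValueError(f"Illegal transition: {current.value} -> {new.value}")
    return new


if __name__ == "__main__":
    stage = transition(Stage.DRAFT, Stage.REVIEWED)
    stage = transition(stage, Stage.TEST_MODE)
    print(stage.value)  # -> test_mode
```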


Metrics That Matter

You can’t improve what you don’t measure. Track:

  • Alerts per rule, per week
  • % of alerts leading to investigation
  • % of alerts closed as false positives
  • Time-to-triage per rule category
  • Rules that haven’t fired in X days

Example:

One team used a dashboard to track “high-volume, low-action” rules: anything with more than 100 alerts per week and a triage rate under 1% got flagged for review.
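
Computing that flag doesn’t require a BI platform. A sketch that works off a week of alert records, assuming each record carries the rule name and an analyst disposition; the thresholds match the example above.

```python
"""Flag high-volume, low-action rules from a week of alert records.

The input shape (one dict per alert with 'rule' and 'disposition') is an
assumption; thresholds follow the example above: more than 100 alerts per
week with less than 1% of them leading to investigation.
"""
from collections import defaultdict


def flag_noisy_rules(alerts: list[dict]) -> list[str]:
    """Return rule names that are high-volume but rarely investigated."""
    totals: dict[str, int] = defaultdict(int)
    investigated: dict[str, int] = defaultdict(int)

    for alert in alerts:
        rule = alert["rule"]
        totals[rule] += 1
        if alert.get("disposition") == "investigated":
            investigated[rule] += 1

    flagged = []
    for rule, count in totals.items():
        triage_rate = investigated[rule] / count
        if count > 100 and triage_rate < 0.01:
            flagged.append(rule)
    return flagged


if __name__ == "__main__":
    week = [{"rule": "Rare Login Location", "disposition": "closed_fp"}] * 150
    week += [{"rule": "Stolen Token Use", "disposition": "investigated"}] * 3
    print(flag_noisy_rules(week))  # -> ['Rare Login Location']
```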


Common Pitfalls to Avoid

  • No testing before deployment: Don’t learn the hard way.
  • Too much abstraction: Building a “platform” before proving the workflow.
  • Unclear rule ownership: Nobody wants to fix what nobody owns.
  • CI/CD without rollback: If your deploy fails, can you recover instantly?
  • Duplication across data sources: Alert storms from the same event via EDR, SIEM, and cloud logs.

Trust in automation is fragile. One false-positive firestorm can undo months of good work.


Tooling Recommendations (Real & Practical)

  • Rule Management:
      • Sigma for standardized rule formats
      • detection-rules (Elastic’s public repo)
      • Git + pull requests for audits
  • CI/CD:
      • GitHub Actions / GitLab Pipelines
      • Pre-commit hooks for schema validation
      • Slack/Jira integrations for peer review flow
  • Testing/Replay:
      • Custom Python scripts with Elastic queries
      • Atomic Red Team or custom replay tools
      • Test harnesses using ECS data snapshots
  • Feedback:
      • Analyst tagging in SIEM
      • Jira or ServiceNow auto-linking
      • Slack bots for in-line feedback capture
Final Thought

You can’t scale detection engineering by working harder. You scale it by working smarter—building pipelines that catch issues before they cause pain, and automating workflows with safety rails.

Think of automation like detection plumbing: invisible when it works, catastrophic when it leaks.

Build for trust. Build for clarity. Build for scale.

That’s the rebuild.

