---
title: "10x Faster Change Management. Compliance Included."
date: 2026-05-12
tag: infrastructure
authors:
  - ross
  - nick
url: https://eng.miragesecurity.ai/posts/10x-change-management/
---

In the last year we increased the throughput of code changes by 146%, adjusted for headcount. You don't break that kind of sound barrier without a few things breaking. We had to revamp our architecture, development process, release tooling, and more.

After doing all of that, one problem remained: human review.

As part of SOC 2 CC8.1, we need an auditable change management process. Changes must be authorized, tested, approved, and deployed in a controlled way. In practice, that means human review, or at least controls that prove no single person can unilaterally ship risky changes to production.

> **SOC 2 CC8.1**
>
> "The entity **authorizes**, designs, develops or acquires, configures, documents, **tests**, **approves**, and **implements** changes to infrastructure, data, software, and procedures to meet its objectives."

Our default pre-AI policy was simple: an engineer signs off on every pull request. As our throughput more than doubled, that became a real bottleneck. We were shipping faster than ever, but the approval queue meant PRs sat waiting for a human to context-switch, review, and click the green button.

So we asked ourselves: is there a better way? One that maintains compliance, that we actually feel good about, and that doesn't slow us down?

After working through it with our auditors at Richey May, we landed on a policy and technical plan:

1. Define which parts of our codebase are high-risk and which are low-risk.
2. Low-risk changes get reviewed by an AI code reviewer and can merge when there are no remaining blockers.
3. High-risk changes still require a human reviewer who is not the PR author.
4. Every AI-approved merge gets retroactively reviewed by a human on a weekly basis, covering the full population (not a sample). This acts as a catch-all for gaps in the automation, and anything it surfaces becomes feedback for iterating on our tooling.
5. All of this is codified in CI/CD scripts so the process is auditable and not prone to human error.

Below is a redacted version of our change management policy, the classification logic, and the GitHub Actions workflows that enforce it.

## How Risk Classification Works

The core idea: not all code changes carry the same risk. A copy change to a dashboard label is different from modifying authentication middleware or network infrastructure.

We use the `dorny/paths-filter` action to classify every PR at open time and as the PR evolves. If the changed files touch any of the paths on our high-risk list, the entire PR is classified as high-risk and requires human approval. Everything else is low-risk.

Here's a condensed version of our classification criteria:

```yaml
high-risk:
  - '**/*.tf'                                    # Infrastructure
  - '.github/workflows/**'                       # CI/CD pipelines
  - '.github/CODEOWNERS'                         # Access control
  - '**/Dockerfile'                              # Container definitions
  - '**/env.ts'                                  # Environment config
  - 'packages/tokens/**'                         # Auth tokens
  - 'packages/contract/src/**'                   # API contracts
  - 'packages/shared/drizzle/**'                 # Database migrations
  - 'services/api/src/http/middleware/**'        # Request middleware
  - 'services/api/src/modules/Organizations.ts'  # Tenant boundaries
  - 'services/api/src/modules/CustomerEncryption/**'
  - 'services/identity/**'                       # Identity service
  - '**/package.json'                            # Dependencies
  - 'pnpm-lock.yaml'                             # Supply chain
```
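
For context, here's roughly how that filter plugs into the gate workflow. This is a minimal sketch, and the criteria filename is illustrative:

```yaml
jobs:
  classify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Classify changed files against the high-risk list
        uses: dorny/paths-filter@v3
        id: classify
        with:
          filters: .github/sdlc-risk-filters.yml  # hypothetical path to the criteria above
      # Each top-level filter key becomes a 'true'/'false' step output
      - name: Flag high-risk PRs
        if: steps.classify.outputs.high-risk == 'true'
        run: echo "High-risk paths touched; routing to human review"
```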

The classification criteria, along with the script and workflow that apply them, are themselves classified as high-risk. Changing what counts as high-risk requires human approval.

## The SDLC Gate

Once classification happens, a CI workflow called `sdlc-gate` enforces the approval policy before merge.

The logic is straightforward:

**For high-risk PRs:**

- Require an approving review from a human who is not the PR author, which is enforced via branch protection
- Label the PR `human-review`
- Remove any `ai-approved` label from previous runs

**For low-risk PRs:**

- Require at least one approving review (human or AI reviewer)
- If the AI reviewer approved, label the PR `ai-approved`

The labels matter. They create the audit trail that feeds into the weekly retrospective.
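
Condensed, the enforcement step looks something like the sketch below. It assumes the classification step shown earlier, and the reviewer bot's login is illustrative:

```yaml
- name: Enforce approval policy
  uses: actions/github-script@v7
  with:
    script: |
      const pr = context.issue.number;
      const highRisk = '${{ steps.classify.outputs.high-risk }}' === 'true';
      const addLabel = (name) => github.rest.issues.addLabels({
        ...context.repo, issue_number: pr, labels: [name],
      });
      if (highRisk) {
        await addLabel('human-review');
        // Drop any stale ai-approved label; a 404 just means it wasn't there
        await github.rest.issues
          .removeLabel({ ...context.repo, issue_number: pr, name: 'ai-approved' })
          .catch(() => {});
        return; // the non-author human approval itself is enforced by branch protection
      }
      const { data: reviews } = await github.rest.pulls.listReviews({
        ...context.repo, pull_number: pr,
      });
      const approvals = reviews.filter((r) => r.state === 'APPROVED');
      if (approvals.length === 0) {
        core.setFailed('Low-risk PRs need at least one approving review (human or AI).');
        return;
      }
      if (approvals.some((r) => r.user.login === 'mirage-reviewer-bot')) { // hypothetical login
        await addLabel('ai-approved');
      }
```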

## The AI Code Review

We use [Claude Code Action](https://github.com/anthropics/claude-code-action) running Claude Opus as our AI reviewer. When it runs on a PR, it reads the diff, leaves inline comments on anything it flags, and posts a summary comment with a recommendation: ✅ Approve or 🚫 Request Changes.

A separate step then reads that summary and submits a formal GitHub review on behalf of a dedicated reviewer bot. This is important for SOC 2: the approval needs to be a proper GitHub review event, not just a comment. The bot's review is what the SDLC gate checks.

If the AI's recommendation is ambiguous, "Conditionally Approve," or anything other than a clear ✅ Approve, the system defaults to `REQUEST_CHANGES`. We'd rather require human review of a clean PR than let a questionable one through.
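
In sketch form, the submission step reads the latest summary and fails closed. The summary marker, bot token secret, and review body are all illustrative:

```yaml
- name: Submit formal review as the reviewer bot
  uses: actions/github-script@v7
  with:
    github-token: ${{ secrets.REVIEWER_BOT_TOKEN }}  # hypothetical secret for the dedicated bot
    script: |
      const { data: comments } = await github.rest.issues.listComments({
        ...context.repo, issue_number: context.issue.number, per_page: 100,
      });
      // Find the most recent AI summary comment (marker is hypothetical)
      const summary = [...comments].reverse()
        .find((c) => c.body.includes('## AI Review Summary'));
      // Fail closed: anything short of an unambiguous approval becomes REQUEST_CHANGES
      const approve = summary
        && summary.body.includes('✅ Approve')
        && !summary.body.includes('Conditionally');
      await github.rest.pulls.createReview({
        ...context.repo, pull_number: context.issue.number,
        event: approve ? 'APPROVE' : 'REQUEST_CHANGES',
        body: 'Formal review submitted from the AI review summary.',
      });
```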

## The Weekly Retrospective

This is the control that validates everything else.

Every Friday, a scheduled GitHub Action queries all merged PRs with the `ai-approved` label from the past week and generates a GitHub issue listing every single one. A human reviewer then goes through the list, checks whether any of those changes should have been classified as high-risk, notes any exceptions, and signs off by closing the issue.

The review window anchors to the previous retrospective issue's creation timestamp, so there are no gaps between review periods. If no prior issue exists (e.g. on its first run), it falls back to a 7-day lookback.

Each generated issue includes a link to the exact commit of the script that produced it. If someone asks "how was this report generated?", the answer is pinned to a specific version of the code, not "we ran a script."
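
Condensed, the collection step might look like this. The retrospective label is illustrative:

```yaml
- name: Collect AI-approved merges since the last retrospective
  uses: actions/github-script@v7
  with:
    script: |
      // Anchor the window to the previous retrospective issue's creation time
      const { data: prior } = await github.rest.issues.listForRepo({
        ...context.repo, labels: 'sdlc-retrospective', state: 'all', per_page: 1, // hypothetical label
      });
      const since = prior[0]?.created_at
        ?? new Date(Date.now() - 7 * 24 * 60 * 60 * 1000).toISOString(); // first-run fallback
      // Full population: every merged PR carrying the ai-approved label in the window
      const q = `repo:${context.repo.owner}/${context.repo.repo} `
        + `is:pr is:merged label:ai-approved merged:>=${since.slice(0, 10)}`;
      const { data: { items } } = await github.rest.search.issuesAndPullRequests({ q });
      const merges = items.filter((i) => i.pull_request.merged_at >= since); // exact window
      // merges feeds the issue body; context.sha pins the generating commit for the audit trail
```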

The retrospective issue looks something like this:

```markdown
## SDLC Weekly Retrospective

**Period:** 2026-04-28T17:30:00.000Z to 2026-05-05T17:30:00.000Z
**Total AI-approved merges:** 12

### Generation Method
- **Query:** All merged PRs with label `ai-approved`
- **Window source:** anchored to previous retrospective #10
- **Population:** Full — every PR matching the criteria is included (not a sample)

### AI-Approved Merges

| PR    | Title                  | Author | Merged At            |
|-------|------------------------|--------|----------------------|
| #3401 | Fix dashboard tooltip  | @dev1  | 2026-05-01T10:00:00Z |
| #3405 | Update onboarding copy | @dev2  | 2026-05-02T14:30:00Z |
| ...   | ...                    | ...    | ...                  |

### Reviewer Sign-off
1. Verify the query results represent the full population
2. Review each PR for changes that should have been classified as high-risk
3. Note any exceptions or concerns
4. Add a sign-off comment and close this issue
```

## The GitHub Actions

Three workflows tie it all together:

**`claude-pr-review.yml`** runs Claude Code Action on every PR, then submits a formal review via the reviewer bot.

**`sdlc-gate.yml`** runs on every PR event (open, sync, review submitted). It classifies the changed files, then enforces the approval policy based on the classification.

**`sdlc-weekly-review.yml`** runs on a cron schedule every Friday at 10:30 AM PT. It generates the retrospective issue.
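
One scheduling detail: GitHub cron triggers run in UTC, so Friday 10:30 AM PT is pinned as 17:30 UTC (accurate during daylight saving time, an hour off under PST):

```yaml
on:
  schedule:
    - cron: '30 17 * * 5'  # Fridays, 17:30 UTC = 10:30 AM PDT
  workflow_dispatch: {}    # manual fallback if a scheduled run is missed
```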

All three workflows, the classification criteria, and the enforcement scripts are protected by code ownership rules requiring CTO approval -- and the CODEOWNERS file that encodes those rules protects itself the same way.

```
# CODEOWNERS (redacted)
.github/workflows/sdlc-gate.yml            @cto
.github/workflows/sdlc-weekly-review.yml   @cto
.github/workflows/claude-pr-review.yml     @cto
.github/CODEOWNERS                         @cto
packages/soc2/                             @cto
```

To keep these workflows legible to humans and not just AI, we've factored the logic out into an internal `@mirage/soc2` package. Any changes to this package or to the workflows require human review and validation.

## The Change Management Policy

Here's the relevant section from our updated change management policy that authorizes this system:

**Change Risk Classification**

All code and infrastructure changes are classified as either high-risk or low-risk prior to merge.

**High-risk changes** are changes that could reasonably be expected to impact security controls, processing integrity, availability, or confidentiality. This includes changes that introduce or materially alter authentication, authorization decisions, data access rules, tenant boundary enforcement, encryption, infrastructure, CI/CD pipelines, or supply chain dependencies.

High-risk changes require review and approval by an authorized human reviewer who is not the change author prior to deployment to production.

**Low-risk changes** are changes that could not reasonably be expected to impact security controls, processing integrity, availability, or confidentiality. Low-risk changes may be reviewed and approved by an automated AI code review system, with the retrospective review functioning as the control that validates this determination. The following conditions must be met:

1. The classification criteria are formally documented, version-controlled, and subject to change management (any modification to the classification criteria is itself classified as high-risk).
2. Access to modify the classification criteria and enforcement script is restricted to authorized personnel via code ownership rules.
3. A documented weekly retrospective review is conducted by an authorized human reviewer covering all AI-approved merges from the prior period (full population, not a sample).

## Can We Trust the Process?

AI reviewing code sounds scary until you admit the obvious: humans are not perfect either.

Having a human review a steady stream of low-risk changes -- copy edits, widget tweaks, tests, small refactors -- leads to review fatigue, which makes humans more likely to miss the stuff that matters. We now have amazing tools to handle those reviews, and there is growing evidence that AI models are as good as, if not better than, most engineers at finding security vulnerabilities.

For low-risk changes, AI review is a reasonable gate. For anything that impacts our security posture and supply chain -- infra, auth, tenant boundaries, encryption, CI/CD, dependencies, critical workflows -- we still require a human.

The point is not to blindly trust AI, but to use it where it makes the most sense and keep humans where they matter most. And who knows: maybe there will come a day when humans aren't even allowed to write code without an AI review.