Casey Labs

A secure delivery platform needs one place where the intended design is written down. Without that, nobody can tell whether the platform is enforcing policy or merely hoping policy exists.

If some rules live in Terraform, other rules live in the GitLab UI, runner settings live in Kubernetes manifests, and exceptions live in chat history, the organization has no reliable control model. People may still be working hard, but the system is too scattered to review, test, or improve safely.

The first design rule for this platform is simple: the repository is authoritative.

Source of truth means the repository contains the desired state of the platform. It does not mean every API response or every runtime log is committed to Git. It means the important decisions are represented in reviewable files before they affect production workflows.

Those decisions include:

  • how GitLab groups and projects should be configured
  • which branches are protected
  • which pipeline stages are required
  • which security policies apply to which projects
  • which Terraform workspaces own which infrastructure
  • which runner tiers exist
  • which policies block risky changes
  • which SLOs and runbooks define platform operation

This is the difference between a platform and a collection of administrator habits. A platform can be reviewed by someone who was not present when the original decision was made.

Many teams first learn DevOps through automation: write a script, run a pipeline, deploy faster. That is useful, but automation can also hide risk. A script that changes GitLab settings directly may be faster than a manual UI change, but it is still risky if nobody can review what it intends to change.

Infrastructure-as-code and policy-as-code solve that problem by making intent visible. A merge request can show the exact Terraform change, policy update, runner setting, or CI component revision. Reviewers can ask whether the change is correct before it becomes real.

That visibility is especially important for security. Many delivery incidents do not start with one dramatic failure. They start with drift: a protected branch changed by hand, an unreviewed project variable, a runner registered in the wrong tier, or an exception that nobody remembers approving.
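Detecting drift comes down to comparing the desired state in the repository with what the live system reports. The following is a minimal Python sketch, assuming both sides have already been flattened into dictionaries. The setting keys are hypothetical and are not real GitLab API field names.

```python
from typing import Any

UNSET = "<unset>"  # marker for a setting that exists on only one side


def find_drift(desired: dict[str, Any], actual: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return {setting: (desired_value, actual_value)} for every mismatch."""
    drift = {}
    for key in sorted(desired.keys() | actual.keys()):
        want = desired.get(key, UNSET)
        have = actual.get(key, UNSET)
        if want != have:
            drift[key] = (want, have)
    return drift


# Hypothetical flattened settings for one project.
desired = {"branch.main.protected": True, "branch.main.allow_force_push": False}
actual = {
    "branch.main.protected": True,
    "branch.main.allow_force_push": True,  # changed by hand in the UI
    "variable.DEPLOY_MODE": "fast",        # unreviewed project variable
}

for setting, (want, have) in find_drift(desired, actual).items():
    print(f"{setting}: repository says {want!r}, live system has {have!r}")
```

The comparison only works because the repository supplies the `desired` side. Without a committed baseline there is nothing to diff against.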

The reference repository keeps each platform concern in a predictable place:

docs/       architecture, governance, operating model, runbooks
terraform/  GitLab, HCP Terraform, and EKS runner infrastructure
policies/   GitLab security policies and OPA/Rego controls
ci/         reusable paved-road CI components
k8s/        runner Helm values and network policy baselines
sre/        SLO and dashboard seeds
examples/   onboarding catalog examples

The exact directory names are less important than the boundary they create. A future maintainer should be able to answer basic questions quickly:

  • Where is GitLab project state modeled?
  • Where are mandatory pipeline controls defined?
  • Where are runner trust tiers configured?
  • Where are infrastructure policies tested?
  • Where are operating expectations documented?

If the answer is “ask the person who set it up,” the platform is not mature enough yet.
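One way to make those answers mechanical is to encode the directory boundary as data. The sketch below maps each top-level directory from the layout to the concern it owns and looks up the owner of any repository-relative path. The function name and the example paths are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional

# Top-level directories from the reference layout, mapped to their concern.
CONCERNS = {
    "docs": "architecture, governance, operating model, runbooks",
    "terraform": "GitLab, HCP Terraform, and EKS runner infrastructure",
    "policies": "GitLab security policies and OPA/Rego controls",
    "ci": "reusable paved-road CI components",
    "k8s": "runner Helm values and network policy baselines",
    "sre": "SLO and dashboard seeds",
    "examples": "onboarding catalog examples",
}


def owning_concern(path: str) -> Optional[str]:
    """Return the concern for a repository-relative path, or None if unowned."""
    parts = PurePosixPath(path).parts
    if len(parts) < 2:  # a file at the repository root belongs to no concern
        return None
    return CONCERNS.get(parts[0])


# Hypothetical paths, for illustration only.
for changed in ["policies/opa/deny.rego", "k8s/values.yaml", "scripts/fix.sh"]:
    print(changed, "->", owning_concern(changed) or "no owning concern")
```

A path that maps to no concern, such as the `scripts/fix.sh` example, is a signal that something is living outside the agreed boundary.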

The basic workflow is deliberately staged:

  1. Change the desired state in the repository.
  2. Run local validation.
  3. Review the diff.
  4. Generate Terraform or policy plans in the relevant control plane.
  5. Apply only after approval.

This is slower than clicking directly in a UI, but it is safer and more repeatable. It also creates useful history. When a setting changes, the organization can see who proposed it, who reviewed it, which tests ran, and why the change was made.
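The staged workflow can be sketched as a small state machine in which a change moves forward one stage at a time and cannot reach the final stage without an approval. This is an illustrative model only. The stage names and the `Change` class are assumptions, and a real platform would enforce the same gate through merge request approvals and Terraform run settings rather than application code.

```python
from dataclasses import dataclass, field

# One stage per step of the workflow, in order.
STAGES = ["changed", "validated", "reviewed", "planned", "applied"]


@dataclass
class Change:
    title: str
    stage_index: int = 0
    approved_by: list[str] = field(default_factory=list)

    @property
    def stage(self) -> str:
        return STAGES[self.stage_index]

    def advance(self) -> str:
        """Move to the next stage; refuse to apply without an approval."""
        if self.stage_index == len(STAGES) - 1:
            raise RuntimeError("change is already applied")
        next_stage = STAGES[self.stage_index + 1]
        if next_stage == "applied" and not self.approved_by:
            raise PermissionError("apply requires at least one approval")
        self.stage_index += 1
        return self.stage


change = Change("protect the main branch")
for _ in range(3):
    print(change.advance())        # validated, reviewed, planned

try:
    change.advance()               # no approval recorded yet
except PermissionError as err:
    print("blocked:", err)

change.approved_by.append("platform-reviewer")  # hypothetical approver
print(change.advance())            # applied
```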

The same pattern applies to security policy. A policy change should be reviewed like application code. If a new rule blocks deployments, teams should be able to inspect the rule, see when it was introduced, and understand the exception path.
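To show what "reviewed like application code" can look like, here is a small policy check written in Python. The reference repository keeps its controls in OPA/Rego, so this is only a stand-in for the shape of such a rule, not the repository's implementation. It assumes the JSON plan representation produced by `terraform show -json`, where each entry under `resource_changes` carries a list of `actions`. The choice of `gitlab_branch_protection` as the protected type is illustrative.

```python
def denied_changes(plan: dict, protected_types: set[str]) -> list[str]:
    """Return addresses of protected resources that the plan would delete.

    A replacement appears in the plan as both "delete" and "create",
    so it is caught as well.
    """
    denied = []
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        if rc.get("type") in protected_types and "delete" in actions:
            denied.append(rc.get("address", "<unknown>"))
    return denied


# Hand-written plan fragment for illustration, not real tool output.
plan = {
    "resource_changes": [
        {
            "address": "gitlab_branch_protection.main",
            "type": "gitlab_branch_protection",
            "change": {"actions": ["delete"]},
        },
        {
            "address": "gitlab_project.docs",
            "type": "gitlab_project",
            "change": {"actions": ["update"]},
        },
    ]
}

print(denied_changes(plan, {"gitlab_branch_protection"}))
# ['gitlab_branch_protection.main']
```

Because the rule is an ordinary function over plan data, it can be tested against sample plans before it is allowed to block anyone's deployment.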

A source-of-truth repository should not become a dumping ground. Some information does not belong in Git:

  • secrets and tokens
  • private keys
  • raw production data
  • one-time credentials
  • unfiltered incident evidence
  • generated files that can be reproduced

The repository should contain references, configuration, policy, and documentation. Sensitive values should live in systems designed for secrets and access control, such as HCP Terraform variables, cloud secret managers, or identity-backed deployment systems.
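A lightweight guard against committing secrets is to scan text before it reaches the repository. The sketch below checks each line against a few well-known token shapes. The three patterns are illustrative and far from exhaustive, and the rule names are made up for this example. A dedicated secret scanner is the better choice in practice.

```python
import re

# Illustrative patterns only; a real scanner needs a much broader rule set.
RULES = {
    "private-key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "gitlab-personal-access-token": re.compile(r"\bglpat-[0-9A-Za-z_-]{20,}"),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for each suspicious line, 1-indexed."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in RULES.items():
            if pattern.search(line):
                findings.append((line_number, rule_name))
    return findings


# Fake values assembled for the demo; none of these are real credentials.
sample = "\n".join([
    'region = "us-east-1"',
    'access_key = "AKIA' + "X" * 16 + '"',
    "-----BEGIN RSA PRIVATE KEY-----",
])

print(scan_text(sample))
# [(2, 'aws-access-key-id'), (3, 'private-key')]
```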

Treating the repository as authoritative supports several security properties:

  • Changes are inspectable before they affect production workflows.
  • Platform policies can be tested like code.
  • Drift has a baseline to compare against.
  • Access can be narrowed because fewer people need direct administrative access to make changes.
  • Exceptions can be modeled explicitly instead of being hidden in UI state.
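As an illustration of modeling exceptions explicitly, the sketch below represents each exception as a record with an approver, a reason, and an expiry date, so that a waiver lapses on its own instead of lingering in UI state. The field names, the policy name, and the project path are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PolicyException:
    policy: str        # which rule is being waived
    project: str       # where the waiver applies
    approved_by: str   # who accepted the risk
    reason: str        # why it was granted
    expires: date      # last day the waiver is valid


def is_exempt(project: str, policy: str,
              exceptions: list[PolicyException], today: date) -> bool:
    """True if an unexpired exception covers this project and policy."""
    return any(
        e.project == project and e.policy == policy and e.expires >= today
        for e in exceptions
    )


# Hypothetical exception record, committed alongside the policies.
EXCEPTIONS = [
    PolicyException(
        policy="require-sast",
        project="legacy/billing",
        approved_by="security-lead",
        reason="scanner does not support the build toolchain yet",
        expires=date(2025, 6, 30),
    ),
]

print(is_exempt("legacy/billing", "require-sast", EXCEPTIONS, date(2025, 6, 1)))  # True
print(is_exempt("legacy/billing", "require-sast", EXCEPTIONS, date(2025, 7, 1)))  # False
```

Keeping the approver and reason on the record means the answer to "who approved this?" comes from the repository history rather than from memory.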

The same pattern also improves operations. During an incident, responders can answer basic questions quickly: which policies should be active, which runner tiers exist, which Terraform workspaces own which layer, and what SLOs define platform health.

Once the repository is the design record, the next question is architecture. Which system owns which kind of decision? GitLab, HCP Terraform, EKS, and policy engines all have a role, but they should not all own the same control.

The architecture starts there: define the control planes, then show how delivery work flows through them.