6. Runner Isolation on EKS
CI runners execute code the platform does not fully trust. That includes feature branch code, build scripts, third-party dependencies, package manager hooks, generated files, and test fixtures. Runner design is an isolation problem as much as a capacity problem.
The reference platform runs GitLab Runner on AWS EKS with separate runner tiers. Terraform creates namespaces, applies restrictive labels, installs the GitLab Runner Helm chart, and attaches network policies for each tier.
Runners in plain language
A GitLab runner is the worker that executes CI jobs. GitLab schedules the job, but the runner performs the work. If a pipeline says “build this container image” or “run these tests,” a runner is where those commands run.
That means a runner is exposed to whatever the job does. A job can run application code, install dependencies, execute package manager scripts, open network connections, and handle artifacts. Some jobs also need credentials. That combination makes runner design a security decision.
The design needs to answer:
- Which jobs can run on which runners?
- Which runners can access secrets?
- Which runners can reach the internet?
- Which runners can deploy?
- Which runners are allowed to run privileged workloads?
- How do we monitor runner health and queue time?
The safest default is unprivileged, ephemeral pods with restricted egress. A job should start clean, run with the least privilege it needs, publish its output, and disappear.
Why CI isolation belongs in the platform
If every repository chooses its own runner model, the organization cannot reason about which jobs can reach which secrets or networks. If all jobs share the same runner pool, a low-trust feature branch can end up too close to high-trust release automation.
The platform response is to make trust tiers explicit. Most jobs should run in constrained, non-privileged environments. Protected release jobs should run on runners with narrower registration, stronger network controls, and clearer monitoring. Exceptional privileged builds should be rare, documented, and separated from normal CI capacity.
Namespaces and pod security
The EKS runner stack labels each runner namespace with restricted pod security settings:
    locals {
      namespace_labels = {
        "pod-security.kubernetes.io/enforce" = "restricted"
        "pod-security.kubernetes.io/audit"   = "restricted"
        "pod-security.kubernetes.io/warn"    = "restricted"
      }
    }

    resource "kubernetes_namespace_v1" "runner" {
      for_each = var.runner_tiers

      metadata {
        name   = each.value.namespace
        labels = merge(local.namespace_labels, each.value.labels)
      }
    }

That does not make CI safe by itself, but it establishes the default: runner pods should not start from a privileged posture.
Helm releases by tier
Each runner tier becomes its own Helm release:
    resource "helm_release" "gitlab_runner" {
      for_each = var.runner_tiers

      name       = "gitlab-runner-${each.key}"
      repository = "https://charts.gitlab.io"
      chart      = "gitlab-runner"
      version    = each.value.chart_version
      namespace  = kubernetes_namespace_v1.runner[each.key].metadata[0].name

      atomic          = true
      cleanup_on_fail = true
      wait            = true
      values          = [file("${path.module}/${each.value.values_file}")]
    }

Separate releases make rollout and rollback cleaner. A sandbox runner change should not be coupled to a protected release runner change.
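Both the namespace and the Helm release iterate over `var.runner_tiers`, whose declaration is not shown on this page. The following is a minimal sketch of what that variable could look like, inferred only from the attributes the two resources read (`namespace`, `labels`, `chart_version`, `values_file`). The description text and the validation rule are assumptions added for illustration, not part of the reference platform.

```hcl
# Sketch only: the object shape is inferred from how the resources above
# use each.value; the real module may declare more (or different) fields.
variable "runner_tiers" {
  description = "Runner tiers keyed by tier name, for example standard or protected."

  type = map(object({
    namespace     = string      # Kubernetes namespace for this tier
    labels        = map(string) # extra namespace labels merged with the pod security labels
    chart_version = string      # pinned gitlab-runner chart version
    values_file   = string      # Helm values file path, relative to the module
  }))

  # Assumed guardrail: two tiers sharing a namespace would defeat the isolation.
  validation {
    condition = (
      length(distinct([for tier in values(var.runner_tiers) : tier.namespace]))
      == length(var.runner_tiers)
    )
    error_message = "Each runner tier must use its own namespace."
  }
}
```

Keying the map by tier name is what lets `each.key` produce release names such as `gitlab-runner-standard` while each tier keeps its own chart version and values file.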
Standard runner baseline
The standard tier uses the Kubernetes executor, a pinned base image, locked runner registration, non-privileged execution, node selectors, and pod security context:
    runners:
      name: standard
      tags: standard
      protected: false
      locked: true
      requestConcurrency: 25
      config: |
        [[runners]]
          executor = "kubernetes"
          [runners.kubernetes]
            namespace = "gitlab-runner-standard"
            image = "alpine:3.22.1"
            privileged = false
            service_account = "gitlab-runner-standard"
          [runners.kubernetes.node_selector]
            "platform.gitlab.com/runner-tier" = "standard"

Protected, privileged, deployment, and sandbox tiers can then vary from that baseline. Trust is explicit. Jobs do not accidentally inherit production deploy access because all runners share a registration.
Cloud access
Deployment jobs should avoid static cloud keys. GitLab OIDC ID tokens let a job assume AWS IAM roles without storing long-lived AWS credentials in project variables.
Use separate roles for separate environments:
- feature and sandbox jobs receive no deployment role
- non-production deployment jobs assume non-production roles
- production deployment jobs assume production roles only from protected refs
- production environments are protected in GitLab so only approved users or groups can deploy
That split keeps cloud access tied to both GitLab policy and AWS IAM policy. A compromised feature branch should not be able to reach a production role just because the runner can reach AWS.
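On the AWS side, that split can be expressed in the trust policy of each role. The sketch below is one possible shape for a production role, not the reference platform's actual configuration. Every variable name, the role name, and the choice of `main` as the protected branch are hypothetical. It assumes an IAM OIDC identity provider for the GitLab instance already exists and that jobs request an ID token whose `aud` matches `var.gitlab_audience`.

```hcl
# Hypothetical inputs, named for illustration only.
variable "gitlab_oidc_provider_arn" { type = string } # ARN of the existing IAM OIDC provider
variable "gitlab_host" { type = string }              # provider host without scheme, e.g. gitlab.com
variable "gitlab_audience" { type = string }          # aud value requested by the CI job
variable "project_path" { type = string }             # group/project allowed to deploy

data "aws_iam_policy_document" "prod_deploy_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [var.gitlab_oidc_provider_arn]
    }

    # Only tokens minted for the expected audience are accepted.
    condition {
      test     = "StringEquals"
      variable = "${var.gitlab_host}:aud"
      values   = [var.gitlab_audience]
    }

    # Only jobs from one project on one branch may assume the role.
    # Assumption: main is configured as a protected branch in GitLab.
    condition {
      test     = "StringEquals"
      variable = "${var.gitlab_host}:sub"
      values   = ["project_path:${var.project_path}:ref_type:branch:ref:main"]
    }
  }
}

resource "aws_iam_role" "prod_deploy" {
  name               = "gitlab-prod-deploy" # hypothetical role name
  assume_role_policy = data.aws_iam_policy_document.prod_deploy_trust.json
}
```

Pinning the `sub` claim to a single project and ref means a token issued to a feature branch fails the trust policy even if the job runs on a runner that can reach AWS. The branch protection itself still has to be enforced in GitLab.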
Network policy
The default network posture is deny first:
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: default-deny
    spec:
      podSelector: {}
      policyTypes:
        - Ingress
        - Egress

The baseline allow policy opens DNS and TCP 443. Production environments should usually make that more specific through egress gateways, proxies, or approved registry endpoints. The principle is the same: CI jobs should not have arbitrary network reach just because they run inside the platform cluster.
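The baseline allow policy itself is not shown on this page. A minimal sketch of how it might be attached per tier with the Terraform Kubernetes provider follows. The resource name and policy name are hypothetical, and the rule deliberately leaves destinations unrestricted, which is the part a production setup would tighten.

```hcl
# Sketch only: one baseline egress policy per runner tier namespace.
resource "kubernetes_network_policy_v1" "baseline_egress" {
  for_each = var.runner_tiers

  metadata {
    name      = "baseline-egress" # hypothetical policy name
    namespace = kubernetes_namespace_v1.runner[each.key].metadata[0].name
  }

  spec {
    pod_selector {} # empty selector: applies to every pod in the namespace
    policy_types = ["Egress"]

    # DNS lookups over UDP and TCP.
    egress {
      ports {
        port     = "53"
        protocol = "UDP"
      }
      ports {
        port     = "53"
        protocol = "TCP"
      }
    }

    # HTTPS to any destination; narrow with "to" blocks or a proxy in production.
    egress {
      ports {
        port     = "443"
        protocol = "TCP"
      }
    }
  }
}
```

Network policies are additive, so this rule widens the default-deny policy only for the listed ports. Neither policy has any effect unless the cluster runs a network plugin that enforces NetworkPolicy.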
Operational considerations
Runner isolation is also an SRE concern. Queue latency, autoscaling behavior, failed pod scheduling, registry access errors, and runner token rotation all affect developer experience. Monitor runner tiers separately because a saturated standard tier and a failing deployment tier require different responses.
The security model and operations model meet in one place: a runner tier is a product surface. It needs clear trust boundaries, capacity management, dashboards, and a runbook.
The final page connects delivery evidence with ongoing platform operations.