CI/CD: Plan, Apply, and Drift Detection

This chapter wires Terraform into GitHub Actions so every PR gets a plan, every merge gets an apply, and drift gets caught before it bites.

Why Automate Terraform

Terraform on a laptop is fine for learning. In production, you want CI to apply, for two reasons:

  1. Reviewable. Every change is a PR with a visible plan. Nothing slips in.
  2. Consistent. CI uses the same Terraform version, the same provider versions, and the same AWS credential path every time. No "works on my machine".

Locally applied Terraform is also a security issue: every engineer who applies needs broad credentials on their own machine. CI can instead use short-lived, scoped credentials via OIDC.

The Standard Workflow

On every pull request: terraform plan. Comment the plan on the PR.

On merge to main: terraform apply. If prod, require manual approval.

On a schedule: terraform plan against the unchanged main branch. Any change it reports is drift.

That's the whole shape. The examples below use GitHub Actions, but the same idea works for GitLab, Bitbucket Pipelines, and Azure DevOps.

Authenticating to AWS

The wrong way: store long-lived AWS keys in GitHub secrets.

The right way: OIDC. GitHub Actions requests a signed identity token for each job and exchanges it with AWS STS for short-lived credentials. No long-lived secrets are stored in GitHub.

One-Time AWS Setup

Create an IAM OIDC provider for GitHub:

resource "aws_iam_openid_connect_provider" "github" {
  url             = "https://token.actions.githubusercontent.com"
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = ["6938fd4d98bab03faadb97b34396831e3780aea1"]
}

Create a role the workflow can assume:

resource "aws_iam_role" "github_actions" {
  name = "github-actions-terraform"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Principal = {
        Federated = aws_iam_openid_connect_provider.github.arn
      }
      Action = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        StringEquals = {
          "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com"
        }
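        StringLike = {
          # The trailing "*" matches every branch, tag, PR, and environment in the repo.
          # Narrow it for any role that is allowed to apply.
          "token.actions.githubusercontent.com:sub" = "repo:your-org/your-repo:*"
        }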
      }
    }]
  })
}

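Attach policies that cover only what Terraform manages. A scoped policy is better than AdministratorAccess, because any workflow run that satisfies the trust policy gets everything the role can do.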
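As a rough illustration of what "scoped" can look like, here is a minimal sketch. It assumes the configuration only manages S3 buckets under a hypothetical name prefix and keeps its state in a hypothetical bucket called your-org-terraform-state; none of these names come from this chapter, and a real policy would list the actions your own resources need (including the lock table, if you use one).

```hcl
# Sketch only: bucket names and the managed-resource scope are assumptions.
data "aws_iam_policy_document" "terraform_scoped" {
  # Read and write the Terraform state objects.
  statement {
    sid       = "StateBucketList"
    actions   = ["s3:ListBucket"]
    resources = ["arn:aws:s3:::your-org-terraform-state"]
  }

  statement {
    sid       = "StateObjects"
    actions   = ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"]
    resources = ["arn:aws:s3:::your-org-terraform-state/*"]
  }

  # Manage only the application buckets this configuration owns.
  statement {
    sid     = "ManagedBuckets"
    actions = ["s3:*"]
    resources = [
      "arn:aws:s3:::your-org-app-*",
      "arn:aws:s3:::your-org-app-*/*",
    ]
  }
}

# Inline policy on the role defined above.
resource "aws_iam_role_policy" "terraform_scoped" {
  name   = "terraform-scoped"
  role   = aws_iam_role.github_actions.id
  policy = data.aws_iam_policy_document.terraform_scoped.json
}
```

Starting narrow and widening when a plan fails with AccessDenied tends to be easier than starting from AdministratorAccess and trying to trim it later.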

Now your workflow can assume github-actions-terraform without secrets.

GitHub Actions: Plan on PR

.github/workflows/terraform-plan.yml:

name: Terraform Plan

on:
  pull_request:
    paths:
      - "terraform/**"
      - ".github/workflows/terraform-plan.yml"

permissions:
  id-token: write        # for OIDC
  contents: read
  pull-requests: write   # to comment on the PR

jobs:
  plan:
    runs-on: ubuntu-latest

    defaults:
      run:
        working-directory: terraform/environments/dev

    steps:
      - uses: actions/checkout@v4

      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.6.0

      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-actions-terraform
          aws-region: us-east-1

      - run: terraform fmt -check
      - run: terraform init
      - run: terraform validate

      - name: Plan
        id: plan
        run: terraform plan -no-color -out=tfplan
        continue-on-error: true

      - name: Comment Plan
        uses: actions/github-script@v7
        env:
          # Pass the plan through the environment so backticks or ${...} in the
          # output cannot break (or inject code into) the script below.
          PLAN: ${{ steps.plan.outputs.stdout }}
        with:
          script: |
            const output = `#### Terraform Plan \`${{ steps.plan.outcome }}\`
            <details><summary>Show Plan</summary>

            \`\`\`\n${process.env.PLAN}\n\`\`\`

            </details>`;
            await github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: output
            })

      - name: Fail on plan error
        if: steps.plan.outcome == 'failure'
        run: exit 1

Now every PR gets a plan comment. Reviewers read the plan before approving.
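Since the plan job runs on unmerged code, it may be worth giving it a role that cannot change anything. The following is a sketch under stated assumptions: the role name github-actions-terraform-plan is hypothetical, and the subject claim shown is the one GitHub issues for pull_request-triggered jobs that do not declare an environment.

```hcl
# Hypothetical read-only role for the PR plan job.
resource "aws_iam_role" "github_actions_plan" {
  name = "github-actions-terraform-plan" # assumed name, not used elsewhere in this chapter

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Federated = aws_iam_openid_connect_provider.github.arn }
      Action    = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        StringEquals = {
          "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com"
          # Only tokens minted for pull_request runs in this repo match.
          "token.actions.githubusercontent.com:sub" = "repo:your-org/your-repo:pull_request"
        }
      }
    }]
  })
}

# AWS-managed read-only policy: enough to refresh most resources, no writes.
resource "aws_iam_role_policy_attachment" "plan_read_only" {
  role       = aws_iam_role.github_actions_plan.name
  policy_arn = "arn:aws:iam::aws:policy/ReadOnlyAccess"
}
```

To use it, the plan workflow's role-to-assume would point at this role instead. ReadOnlyAccess does not permit writing a lock entry, so a backend with DynamoDB locking would need either an extra statement for the lock table or terraform plan -lock=false.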

GitHub Actions: Apply on Merge

.github/workflows/terraform-apply.yml:

name: Terraform Apply

on:
  push:
    branches: [main]
    paths:
      - "terraform/**"

permissions:
  id-token: write
  contents: read

jobs:
  apply-dev:
    runs-on: ubuntu-latest
    environment: dev
    defaults:
      run:
        working-directory: terraform/environments/dev
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.6.0   # same version the plan job used
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-actions-terraform
          aws-region: us-east-1
      - run: terraform init
      - run: terraform apply -auto-approve

  apply-prod:
    runs-on: ubuntu-latest
    environment: prod       # GitHub environment with required reviewers
    needs: apply-dev
    defaults:
      run:
        working-directory: terraform/environments/prod
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.6.0   # same version the plan job used
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-actions-terraform-prod
          aws-region: us-east-1
      - run: terraform init
      - run: terraform apply -auto-approve

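Two jobs: dev applies automatically; prod waits for approval through GitHub environment protection rules. needs: apply-dev means prod won't run if dev failed. Note that terraform apply computes a fresh plan at merge time, so what it applies can differ from the PR comment if main or the infrastructure changed in between.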

GitHub environment protection rules are configured in repo settings: required reviewers, branch restrictions, wait timers.
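If you would rather keep those settings in code than in the settings UI, the GitHub Terraform provider can manage environments too. This is a minimal sketch that assumes the integrations/github provider is already configured; the repository name, team ID, and wait time are placeholders.

```hcl
# Sketch only: assumes the integrations/github provider is configured for your org.
resource "github_repository_environment" "prod" {
  repository  = "your-repo" # placeholder repository name
  environment = "prod"
  wait_timer  = 10 # minutes to wait before the job may start (illustrative value)

  # Required reviewers: someone from this team must approve the deployment.
  reviewers {
    teams = [1234567] # placeholder numeric team ID
  }

  # Branch restriction: only protected branches may deploy to this environment.
  deployment_branch_policy {
    protected_branches     = true
    custom_branch_policies = false
  }
}
```

The environment name has to match the environment: prod line in the apply job for the protection rules to take effect.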

Drift Detection

Running terraform plan on a schedule tells you if the real world has drifted from your config.

.github/workflows/terraform-drift.yml:

name: Terraform Drift Detection

on:
  schedule:
    - cron: "0 8 * * *"    # daily at 08:00 UTC
  workflow_dispatch:

permissions:
  id-token: write
  contents: read
  issues: write

jobs:
  drift:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        env: [dev, staging, prod]
    defaults:
      run:
        working-directory: terraform/environments/${{ matrix.env }}
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.6.0   # same version the plan and apply jobs use
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          # One role for brevity; prefer a separate role per environment (see Common Pitfalls).
          role-to-assume: arn:aws:iam::123456789012:role/github-actions-terraform
          aws-region: us-east-1
      - run: terraform init
      - name: Plan
        id: plan
        run: terraform plan -detailed-exitcode -no-color
        continue-on-error: true
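      - name: Alert on drift
        if: steps.plan.outputs.exitcode == '2'
        uses: actions/github-script@v7
        with:
          script: |
            await github.rest.issues.create({
              owner: context.repo.owner,
              repo: context.repo.repo,
              title: `Terraform drift in ${{ matrix.env }}`,
              body: 'terraform plan detected unexpected changes. Investigate.'
            })
      - name: Fail on plan error
        if: steps.plan.outputs.exitcode == '1'
        run: exit 1

-detailed-exitcode returns 0 for no changes, 1 for an error, and 2 for changes present. The exitcode output comes from the setup-terraform wrapper, which is enabled by default. The job opens a GitHub issue when drift is detected and fails when the plan itself errors, so a broken plan is not mistaken for a clean one. As written, it opens a new issue on every run for as long as the drift persists.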

You can post to Slack, PagerDuty, or your incident system instead.

Policy Gates

Block PRs that violate policy before apply runs.

tfsec / Checkov

Both are static security scanners. Run them on every PR and fail the build on critical findings.

      - name: tfsec
        uses: aquasecurity/tfsec-action@v1
        with:
          soft_fail: false
      - name: Checkov
        uses: bridgecrewio/checkov-action@master   # pin to a release tag or commit SHA in real pipelines
        with:
          directory: terraform/
          quiet: true

OPA / Conftest

Custom policy with Rego. Example: "no public S3 buckets".

      - name: Conftest
        run: |
          # Reads the tfplan file written by `terraform plan -out=tfplan` earlier in the same job.
          terraform show -json tfplan > plan.json
          conftest test --policy policies/ plan.json

Chapter 10 covers these properly.

Module Docs with terraform-docs

Regenerate module docs on each PR and push the result back to the PR branch:

      - name: terraform-docs
        uses: terraform-docs/gh-actions@v1.0.0
        with:
          working-dir: modules/vpc
          output-file: README.md
          git-push: true

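Keeps module README files up to date automatically. Because git-push commits to the PR branch, the job needs contents: write permission and must check out the PR's head ref.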

Running Locally Alongside CI

Even with CI, engineers still run terraform plan locally when iterating. Rules:

  • Never terraform apply locally on production. It breaks the CI audit trail.
  • Dev apply locally is fine for speed during development.
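  • State locks prevent collisions between local and CI applies, provided the backend has locking configured (with the S3 backend on Terraform 1.6, that means a DynamoDB lock table).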

Some teams disable local apply on prod state via IAM (only the CI role can write to the prod state bucket).
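One way to express that restriction is a bucket policy that denies state writes to everyone except the CI role. The sketch below assumes a hypothetical bucket name (your-org-terraform-state-prod) and reuses the prod role ARN from the apply workflow; adjust both to your setup.

```hcl
# Sketch only: the bucket name is an assumption.
data "aws_iam_policy_document" "prod_state_ci_only" {
  statement {
    sid       = "DenyStateWritesExceptCI"
    effect    = "Deny"
    actions   = ["s3:PutObject", "s3:DeleteObject"]
    resources = ["arn:aws:s3:::your-org-terraform-state-prod/*"]

    # Applies to every principal...
    principals {
      type        = "*"
      identifiers = ["*"]
    }

    # ...unless the caller is a session of the prod CI role.
    condition {
      test     = "ArnNotLike"
      variable = "aws:PrincipalArn"
      values   = ["arn:aws:iam::123456789012:role/github-actions-terraform-prod"]
    }
  }
}

resource "aws_s3_bucket_policy" "prod_state" {
  bucket = "your-org-terraform-state-prod"
  policy = data.aws_iam_policy_document.prod_state_ci_only.json
}
```

Engineers can still read the state and run terraform plan locally. An explicit Deny also locks out administrators, so if you need a break-glass path, add that role's ARN to the values list.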

The Test Workflow

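Combined pipeline: every PR runs format, validate, lint, security scan, and plan. If any step fails, the PR is red.

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform fmt -check -recursive
      - run: terraform init -backend=false   # no state access needed for validate
      - run: terraform validate
      - uses: terraform-linters/setup-tflint@v4
      - run: tflint --recursive
      - uses: aquasecurity/tfsec-action@v1
      # plan needs state and credentials: authenticate, then re-init with the backend
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-actions-terraform
          aws-region: us-east-1
      - run: terraform init
      - run: terraform plan -no-color

The static checks are fast and run offline; only plan needs credentials and the real backend, which is why it comes last and after a full terraform init. Apply runs only after merge.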

Common Pitfalls

Long-lived access keys in CI. Use OIDC. If your CI supports it, there's no excuse.

auto-approve without review. -auto-approve is fine as long as a reviewer approved the plan in the PR. The reviewer is the approval; Terraform's interactive prompt is redundant in CI.

No drift detection. Drift compounds. Schedule a plan.

Plan output eaten by CI logs. Post the plan as a PR comment. Make reviewers see it.

Shared role for all environments. One leaked token and every env is compromised. Separate IAM role per env.

Running apply from feature branches. Only main (or a release branch) triggers apply.
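The shared-role and feature-branch pitfalls can both be addressed in the trust policy. The sketch below creates one role per environment and only trusts tokens issued to jobs that declare the matching GitHub environment. It assumes GitHub environments named dev, staging, and prod, and it produces role names of the form github-actions-terraform-<env>, which differs slightly from the dev role name used earlier in this chapter.

```hcl
# Sketch only: environment names and repo path are assumptions.
locals {
  environments = toset(["dev", "staging", "prod"])
}

resource "aws_iam_role" "github_actions_env" {
  for_each = local.environments
  name     = "github-actions-terraform-${each.key}"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Federated = aws_iam_openid_connect_provider.github.arn }
      Action    = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        StringEquals = {
          "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com"
          # Jobs that set `environment: <name>` receive this subject claim.
          "token.actions.githubusercontent.com:sub" = "repo:your-org/your-repo:environment:${each.key}"
        }
      }
    }]
  })
}
```

Combined with a branch restriction on each GitHub environment, a job on a feature branch cannot enter the environment and therefore cannot obtain a token that any of these roles will accept. Jobs that do not declare an environment, such as the PR plan and the scheduled drift check, get a different subject claim and would need their own roles.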

Next Steps

Continue to 10-testing-and-validation.md to catch bugs before they become outages.