Testing and Validation: Catching Bugs Before They Cost Money
This chapter covers the validation layers around Terraform: fmt, validate, tflint, security scanners, terraform test, Terratest, and policy-as-code.
The Layers
You don't pick one tool; you stack them. Each catches a different class of problem.
- terraform fmt: Formatting (style)
- terraform validate: Syntax and internal consistency
- tflint: Lint rules, provider-specific best practices
- tfsec / Checkov: Security misconfigurations
- terraform test: Plan/apply-based tests (native, 1.6+)
- Terratest: Real-integration tests in Go
- OPA / Conftest: Custom policy over the plan JSON
- Sentinel: HashiCorp's policy-as-code (HCP Terraform)
Each layer is faster and cheaper than the one after it. Put the cheap ones in pre-commit hooks and the expensive ones in CI.
terraform fmt
Formats your code to Terraform's canonical style.
terraform fmt # format files in the current directory
terraform fmt -recursive # include subdirectories
terraform fmt -check # exit 1 if any file would be changed (for CI)
terraform fmt -diff # show the changes it would make
In CI, always:
terraform fmt -check -recursive
Fails fast if anyone committed unformatted code. Zero excuses.
terraform validate
Checks syntax and internal consistency. Doesn't hit any cloud APIs.
terraform init -backend=false
terraform validate
Fails on: typos in resource types, missing required arguments, references to things that don't exist, type mismatches.
Won't catch: runtime errors, missing IAM permissions, values that depend on the real cloud.
Always run in CI. It's fast and catches a surprising amount.
tflint
A Terraform linter. Catches things validate doesn't:
- Deprecated syntax.
- Unused variables.
- AWS-specific rules (invalid instance types, missing required tags, etc.).
- Naming conventions.
Install:
brew install tflint
# or the binary from github.com/terraform-linters/tflint
Configure with a .tflint.hcl:
plugin "terraform" {
  enabled = true
  preset  = "recommended"
}
plugin "aws" {
  enabled = true
  version = "0.30.0"
  source  = "github.com/terraform-linters/tflint-ruleset-aws"
}
rule "terraform_naming_convention" {
  enabled = true
}
rule "terraform_unused_declarations" {
  enabled = true
}
Run:
tflint --init # once, to download plugins
tflint --recursive
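The AWS plugin catches "you used an instance type that doesn't exist". terraform validate and plan generally accept such a value, so without the linter the mistake tends to surface only when apply calls the AWS API, which is a much slower feedback loop.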
Security Scanners: tfsec and Checkov
Static analysis that knows common security mistakes.
tfsec
brew install tfsec
tfsec .
Example findings:
Check: aws-s3-enable-versioning
Severity: MEDIUM
Location: main.tf:15
S3 bucket does not have versioning enabled.
Check: aws-ec2-no-public-egress-sgr
Severity: MEDIUM
Location: sg.tf:23
Security group rule allows egress to 0.0.0.0/0.
Checkov
pip install checkov
checkov -d .
Similar scope, different rule set. Some teams run both; many pick one.
Use these for:
- Public resources (S3 buckets, security groups) that should be private.
- Missing encryption at rest.
- IAM policies with * actions.
- Unrestricted ingress rules.
Not a substitute for code review, but a useful safety net.
terraform test (Native)
Since Terraform 1.6, terraform test is built in. You write test cases in HCL.
tests/bucket.tftest.hcl:
variables {
  environment = "test"
}
run "valid_config" {
  command = plan
  assert {
    condition     = length(aws_s3_bucket.notes.bucket) > 0
    error_message = "bucket name must not be empty"
  }
}
run "production_uses_large_instance" {
  command = plan
  variables {
    environment = "prod"
  }
  assert {
    condition     = aws_instance.web.instance_type == "m5.large"
    error_message = "prod must use m5.large"
  }
}
Run:
terraform test
Each run block is a test. command = plan runs a plan (no cloud changes). command = apply creates real resources; terraform test attempts to destroy them when the test file finishes, but run it against a test environment and confirm nothing leaked.
Great for module testing: you can verify outputs and resource configuration without deploying.
Terratest
Go-based integration testing. Deploys real infrastructure, asserts behavior, destroys it.
Example (test/vpc_test.go):
package test
import (
	"testing"
	"github.com/gruntwork-io/terratest/modules/terraform"
	"github.com/stretchr/testify/assert"
)
func TestVPCModule(t *testing.T) {
	t.Parallel()
	opts := &terraform.Options{
		TerraformDir: "../modules/vpc",
		Vars: map[string]interface{}{
			"cidr_block": "10.99.0.0/16",
			"name":       "test-vpc",
		},
	}
	// Deferred before the apply so cleanup still runs if the apply fails partway.
	defer terraform.Destroy(t, opts)
	terraform.InitAndApply(t, opts)
	// Read a module output and assert on it.
	vpcID := terraform.Output(t, opts, "vpc_id")
	assert.NotEmpty(t, vpcID)
}
Run:
cd test
go test -v -timeout 30m
Terratest is more work to set up than terraform test, but:
- Tests against real cloud resources (catches provider bugs, IAM issues, eventual consistency).
- Lets you assert HTTP-level behavior (does the deployed ALB actually respond?).
- Full Go, so you can script anything.
Use Terratest for modules that go through real API calls (networking, ALBs, Lambda deployments).
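The "does the deployed ALB actually respond?" check usually needs retries, because a freshly created load balancer can return errors for a while before it settles. Terratest ships its own HTTP helpers for this; the block below is only a standard-library sketch of the idea, not Terratest's implementation. The names waitForStatus, statusFunc, and httpStatus are hypothetical, and the retry count and pause in main are arbitrary.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
	"time"
)

// statusFunc probes a URL and reports the HTTP status code it got back.
// Injecting it keeps the retry loop testable without a network.
type statusFunc func(url string) (int, error)

// httpStatus is the real probe: one GET with a timeout.
func httpStatus(url string) (int, error) {
	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// waitForStatus probes url until it returns want or the attempts run out.
func waitForStatus(probe statusFunc, url string, want, attempts int, pause time.Duration) error {
	if attempts < 1 {
		attempts = 1
	}
	var lastErr error
	for i := 0; i < attempts; i++ {
		got, err := probe(url)
		if err == nil && got == want {
			return nil
		}
		if err != nil {
			lastErr = err
		} else {
			lastErr = fmt.Errorf("got status %d, want %d", got, want)
		}
		// Sleep between attempts, but not after the last one.
		if i < attempts-1 {
			time.Sleep(pause)
		}
	}
	return fmt.Errorf("%s not healthy after %d attempts: %w", url, attempts, lastErr)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: pass the URL to probe as the first argument")
		return
	}
	// 30 attempts, 5 seconds apart: illustrative values, tune per service.
	if err := waitForStatus(httpStatus, os.Args[1], http.StatusOK, 30, 5*time.Second); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("endpoint responded with 200")
}
```

In a Terratest test you would read the URL from a module output with terraform.Output and fail the test if the wait returns an error.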
Policy as Code
Policy says "plans must satisfy these rules". Stronger than linting because it runs on the actual plan output, not the source code.
OPA / Conftest
OPA (Open Policy Agent), run through the Conftest wrapper:
# policies/s3.rego
package main
# rego.v1 syntax (contains / if) parses on both recent 0.x and 1.x releases of OPA
import rego.v1
deny contains msg if {
	resource := input.resource_changes[_]
	resource.type == "aws_s3_bucket_public_access_block"
	resource.change.after.block_public_acls == false
	msg := sprintf("S3 bucket %s must block public ACLs", [resource.address])
}
Run:
terraform plan -out=tfplan.binary
terraform show -json tfplan.binary > plan.json
conftest test --policy policies/ plan.json
Any deny rule that matches fails the check.
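To make concrete which part of the plan JSON that rule reads, here is a rough Go equivalent of the same check. It is a sketch for illustration only: it is not how Conftest or OPA evaluate policies, the plan struct models just the three fields the rule touches, and the function name denyPublicACLs and the sample plan in main are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// plan models only the slice of `terraform show -json` output the rule reads.
type plan struct {
	ResourceChanges []struct {
		Address string `json:"address"`
		Type    string `json:"type"`
		Change  struct {
			After map[string]interface{} `json:"after"`
		} `json:"change"`
	} `json:"resource_changes"`
}

// denyPublicACLs returns one message per public access block whose planned
// block_public_acls value is explicitly false.
func denyPublicACLs(planJSON []byte) ([]string, error) {
	var p plan
	if err := json.Unmarshal(planJSON, &p); err != nil {
		return nil, fmt.Errorf("parse plan JSON: %w", err)
	}
	var msgs []string
	for _, rc := range p.ResourceChanges {
		if rc.Type != "aws_s3_bucket_public_access_block" {
			continue
		}
		// After is nil for deletions, and a missing or non-boolean value
		// fails the type assertion, so neither case is reported.
		blocked, ok := rc.Change.After["block_public_acls"].(bool)
		if ok && !blocked {
			msgs = append(msgs, fmt.Sprintf("S3 bucket %s must block public ACLs", rc.Address))
		}
	}
	return msgs, nil
}

func main() {
	// Hypothetical, hand-written plan fragment; a real plan has many more fields.
	sample := []byte(`{"resource_changes":[{
		"address":"aws_s3_bucket_public_access_block.notes",
		"type":"aws_s3_bucket_public_access_block",
		"change":{"after":{"block_public_acls":false}}}]}`)
	msgs, err := denyPublicACLs(sample)
	if err != nil {
		panic(err)
	}
	for _, m := range msgs {
		fmt.Println(m)
	}
}
```

Like the Rego rule, this only flags a value that is explicitly false; a bucket with no public access block resource at all passes both, which is worth a separate rule.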
Sentinel
HashiCorp's policy-as-code language, available in HCP Terraform.
import "tfplan/v2" as tfplan
s3_buckets = filter tfplan.resource_changes as _, rc {
	rc.type is "aws_s3_bucket_public_access_block"
}
main = rule {
	all s3_buckets as _, bucket {
		bucket.change.after.block_public_acls is true
	}
}
Attached to a workspace; every plan is checked. Violations block apply.
OPA vs Sentinel
OPA: open source; its language (Rego) is used well beyond Terraform. Works with any CI.
Sentinel: HashiCorp-only, HCP Terraform / Terraform Enterprise. More integrated with the HCP UI.
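Pick based on how you deliver Terraform: Sentinel only applies if you run on HCP Terraform or Terraform Enterprise, while OPA works anywhere you can produce the plan JSON.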
A Realistic Pipeline
What to run where:
- Pre-commit hook: terraform fmt, terraform validate, tflint (fast)
- PR CI (always): fmt check, validate, tflint, tfsec, plan, conftest on plan
- PR CI (optional): terraform test (native)
- Nightly: Terratest against a sandbox account
- Apply to prod: Sentinel (if using HCP) or conftest gate
Each layer runs when its feedback is valuable without blocking you.
Pre-Commit Hook
Install pre-commit (pre-commit.com) and add .pre-commit-config.yaml:
repos:
  - repo: https://github.com/antonbabenko/pre-commit-terraform
    rev: v1.86.0
    hooks:
      - id: terraform_fmt
      - id: terraform_validate
      - id: terraform_tflint
      - id: terraform_tfsec
      - id: terraform_docs
pre-commit install wires these as git hooks. Every commit runs them.
Common Pitfalls
Skipping terraform fmt. Unformatted diffs make real changes hard to see. Free tool; run it.
Depending only on terraform validate. It catches syntax and internal-consistency errors, but nothing about security, policy, or how the real cloud will respond. Layer more.
Running Terratest on every PR. Slow and expensive. Reserve for scheduled runs or release branches.
Policy that's too strict. If 80% of PRs fail policy, engineers route around it. Make policy actionable: say what to fix, not just what's wrong.
No cleanup in Terratest. A test that leaks resources is worse than no test. defer terraform.Destroy always.
Testing only the happy path. Testing that terraform plan succeeds for valid input is half a test. Also test that invalid input fails with the right error.
Next Steps
Continue to 11-ecosystem.md for the tools built on top of Terraform.