Best Practices: Habits and Anti-Patterns
This chapter collects the directory layouts, naming conventions, production patterns, and anti-patterns that separate a healthy Terraform codebase from a growing liability.
Directory Structure
A layout that scales:
terraform/
├── modules/ # shared modules, versioned
│ ├── vpc/
│ ├── rds/
│ └── app-service/
├── environments/ # one directory per env
│ ├── dev/
│ │ ├── main.tf
│ │ ├── backend.tf
│ │ ├── providers.tf
│ │ ├── versions.tf
│ │ └── terraform.tfvars
│ ├── staging/
│ └── prod/
├── bootstrap/ # state bucket, lock table (local state)
└── policies/ # OPA / Sentinel / custom policies
Rules:
- Each environment is its own Terraform config, with its own state.
- Modules are shared: either in modules/ (same repo) or published to a registry.
- One directory, one state file. One terraform apply never spans directories.
- Bootstrap is separate. It owns the state bucket; it can't use that bucket as its own backend.
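The "one directory, one state file" rule shows up concretely in each environment's backend.tf. A minimal sketch — the bucket, key, and table names here are illustrative:

```hcl
# environments/prod/backend.tf
terraform {
  backend "s3" {
    bucket         = "my-tfstate"                          # shared state bucket (created by bootstrap/)
    key            = "environments/prod/terraform.tfstate" # unique key per directory
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"                     # lock table (also from bootstrap/)
    encrypt        = true
  }
}
```

Each environment directory gets its own key, so an apply in dev can never touch prod state.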
Naming Conventions
Pick a convention, enforce with tflint. Examples:
# Good
resource "aws_s3_bucket" "app_logs" { ... }
resource "aws_iam_role" "lambda_executor" { ... }
variable "instance_type" { ... }
output "bucket_name" { ... }
# Bad
resource "aws_s3_bucket" "MyBucket" { ... }
resource "aws_iam_role" "LambdaRole1" { ... }
variable "InstanceType" { ... }
Use snake_case for resource names, variables, and outputs. That matches the provider's argument naming and is Terraform-idiomatic.
For resources of the same type, don't repeat the type in the name:
# Redundant
resource "aws_s3_bucket" "logs_bucket" { ... }
# Better
resource "aws_s3_bucket" "logs" { ... }
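The "enforce with tflint" part can be wired up with the terraform_naming_convention rule from tflint's bundled terraform ruleset. A minimal .tflint.hcl sketch:

```hcl
# .tflint.hcl
plugin "terraform" {
  enabled = true
  preset  = "recommended"
}

# Flags MyBucket, LambdaRole1, InstanceType, etc. as violations
rule "terraform_naming_convention" {
  enabled = true
  format  = "snake_case"
}
```

Run tflint in CI alongside plan, and the convention stops being a matter of review vigilance.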
Tagging Strategy
Tag every resource. At minimum:
locals {
common_tags = {
Environment = var.environment
ManagedBy = "terraform"
Project = "notes"
Owner = "platform-team"
CostCenter = "engineering"
}
}
Many providers support default tags at the provider level (AWS does):
provider "aws" {
region = "us-east-1"
default_tags {
tags = local.common_tags
}
}
Every resource managed by this provider picks up the tags automatically. Reduces repetition.
Tags are the primary tool for cost allocation, audit, and cleanup automation. Invest in them.
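Per-resource tags still layer on top of the common set — merge() combines the shared map with resource-specific entries (the bucket and Name values here are illustrative):

```hcl
resource "aws_s3_bucket" "app_logs" {
  bucket = "myapp-logs" # illustrative name

  # Later maps override earlier ones, so resource-specific tags win on collisions
  tags = merge(local.common_tags, {
    Name = "myapp-logs"
  })
}
```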
Provider Version Pinning
Always pin:
terraform {
required_version = ">= 1.6"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0" # pessimistic: 5.x, not 6.0
}
random = {
source = "hashicorp/random"
version = "~> 3.0"
}
}
}
~> 5.0 means 5.0 or later but less than 6.0 — you pick up 5.x minor and patch releases while blocking the breaking changes that arrive at a major-version boundary.
Check in .terraform.lock.hcl. It records exact provider versions and hashes per platform; committing it means CI resolves the same versions every team member has.
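If developers run macOS but CI runs Linux, the lock file needs hashes for both platforms. terraform providers lock can record them in one pass:

```shell
# Record provider hashes for every platform that will run terraform init
terraform providers lock \
  -platform=linux_amd64 \
  -platform=darwin_arm64 \
  -platform=darwin_amd64
```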
Destruction Protection
For databases, KMS keys, and other precious resources:
resource "aws_db_instance" "main" {
# ... lots of config ...
lifecycle {
prevent_destroy = true
}
deletion_protection = true # AWS-side
}
Two layers:
- lifecycle.prevent_destroy: Terraform refuses to plan the destroy. Fails at plan time.
- deletion_protection = true: the AWS API refuses to delete. Belt and suspenders.
When you genuinely want to delete (migration, retirement), remove prevent_destroy, plan, verify, apply. Deliberate, not accidental.
Secret Management
State contains sensitive data. Tfvars can contain sensitive data. Neither should be in git.
Rules:
- Never commit .tfvars files with secrets. Gitignore files matching *.secret.tfvars if you use that convention.
- Pull secrets at runtime. Use aws_secretsmanager_secret_version or aws_ssm_parameter data sources; pass IDs and ARNs to resources, not plaintext.
- Use external secret managers for application secrets. Terraform provisions the infrastructure; secrets live in Secrets Manager or Vault.
- Mark sensitive variables and outputs. Prevents logging accidents in terraform apply.
data "aws_secretsmanager_secret_version" "db_password" {
secret_id = "prod/db/password"
}
resource "aws_db_instance" "main" {
password = data.aws_secretsmanager_secret_version.db_password.secret_string
}
The password is still in state. Encrypt state at rest (Chapter 7) and restrict access.
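Marking variables and outputs sensitive, as the rules above call for, is one attribute each; Terraform then redacts the values in plan and apply output (they remain readable in state):

```hcl
variable "db_password" {
  type      = string
  sensitive = true # redacted in CLI output; still stored in state
}

output "db_endpoint" {
  value     = aws_db_instance.main.endpoint
  sensitive = true # shown as <sensitive>; reveal deliberately with `terraform output -raw db_endpoint`
}
```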
Importing Existing Resources
Inheriting a system that was built by click-ops? Don't recreate it; import it.
import {
to = aws_vpc.main
id = "vpc-0abc123"
}
resource "aws_vpc" "main" {
cidr_block = "10.0.0.0/16"
# ... match the existing config
}
terraform plan -generate-config-out=generated.tf writes the matching config for you (then refine).
Import one resource at a time. Verify each plan shows no changes before importing the next.
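That careful loop, sketched as commands (import blocks and -generate-config-out need Terraform 1.5+):

```shell
# 1. Add the import block, then let Terraform draft the resource config
terraform plan -generate-config-out=generated.tf
# 2. Review and refine generated.tf, then re-plan
terraform plan    # expect: 1 to import, 0 to add, 0 to change, 0 to destroy
# 3. Execute the import
terraform apply
# 4. Only then move on to the next resource
```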
The .gitignore Checklist
Every Terraform repo's .gitignore:
# Local state
*.tfstate
*.tfstate.*
# Plan output
*.tfplan
*.tfplan.*
# Secrets
*.secret.tfvars
.env
.envrc
# Crash logs
crash.log
crash.*.log
# Terraform directory
.terraform/
.terraform.tfstate.lock.info
# IDE / OS
.DS_Store
.idea/
.vscode/
Recovery Recipes
Commands you'll reach for eventually.
"Someone deleted a resource in the console"
terraform plan
# You see: "resource will be created" on something that should exist already
Option A: import it back.
import {
to = aws_s3_bucket.notes
id = "existing-bucket-name"
}
Option B: let Terraform recreate it (if it's truly gone and can be rebuilt).
"State is corrupted"
# With S3 backend + versioning
aws s3api list-object-versions --bucket my-tfstate --prefix projects/notes/terraform.tfstate
# Find the last-good version, restore:
aws s3api get-object --bucket my-tfstate --key projects/notes/terraform.tfstate \
--version-id <last-good-version> ./terraform.tfstate
terraform state push ./terraform.tfstate
"The wrong plan was applied to prod"
# Revert the git commit that changed the config:
git revert <commit>
# Then apply the reverted config:
terraform apply
If the damage is state-level (drift from the revert), terraform plan will show what Terraform thinks needs to change. Review carefully.
"I need to rename a resource"
terraform state mv aws_s3_bucket.old aws_s3_bucket.new
Then edit the config to match. plan should show no changes.
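Since Terraform 1.1 the same rename can be expressed declaratively with a moved block, which travels with the config through code review instead of being a one-off CLI step:

```hcl
# Rename the resource block in config, then add:
moved {
  from = aws_s3_bucket.old
  to   = aws_s3_bucket.new
}
```

plan should again show no changes; remove the moved block once every state that uses this config has been applied.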
Anti-Patterns
Catch these in code review.
Monolithic State
One state for the entire company. Every apply locks everything. One bad plan breaks everything. Split by project or env.
State in Git
Ever. Don't.
Unversioned Modules
source = "git::...?ref=main" breaks the moment the module author changes something. Pin to a tag or commit.
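Pinned, the same source looks like this — the repository URL and tag are illustrative:

```hcl
module "vpc" {
  # ?ref= accepts a tag or a full commit SHA; tags are readable, SHAs are immutable
  source = "git::https://github.com/example-org/terraform-modules.git//vpc?ref=v1.4.2"
}
```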
Huge Modules
A 2000-line "platform" module with 80 inputs. Break it up into composable pieces.
Hidden Cross-Env Coupling
Project B's terraform apply depends on Project A's state via terraform_remote_state, and you didn't document it. Now nobody can change A safely. Use explicit interfaces (SSM, Secrets Manager).
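An explicit interface, sketched with SSM (the parameter name is an illustrative, documented contract): Project A publishes a value it owns; Project B reads it by well-known name, never by reaching into A's state:

```hcl
# Project A: publish the value as part of its own apply
resource "aws_ssm_parameter" "vpc_id" {
  name  = "/project-a/prod/vpc_id"
  type  = "String"
  value = aws_vpc.main.id
}

# Project B: consume by name
data "aws_ssm_parameter" "vpc_id" {
  name = "/project-a/prod/vpc_id"
}
```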
No CI
Engineers apply from laptops. Audit trail: nonexistent. IAM: overly broad. Move to CI.
terraform apply --auto-approve Locally
Rarely what you want. Read the plan. The prompt is a seatbelt.
Secrets in tfvars
Committed to git. Eventually leaks. Use a secrets manager.
"Temporary" Imports Never Cleaned Up
import blocks stay in config forever. After import, terraform plan should show no changes; remove the import block in the next PR.
Mixed Count and for_each
A codebase that uses count in some places and for_each in others for the same kind of collection. Both work; the inconsistency doesn't. Pick one per kind of use.
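When picking, for_each keyed by name is usually the safer default: removing one element doesn't renumber, and therefore recreate, the others the way a count index shift does. A sketch with illustrative bucket names:

```hcl
resource "aws_s3_bucket" "this" {
  for_each = toset(["logs", "assets", "backups"]) # illustrative
  bucket   = "myapp-${each.key}"
}

# Removing "assets" later destroys only that bucket; with count,
# the index shift would plan to destroy and recreate the ones after it.
```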
No Module Tests
Modules other teams depend on have no tests. Every breaking change is found in production.
The One-Page Checklist
For a team starting fresh, do these in order:
- Pick a directory layout (modules/ and environments/).
- Bootstrap remote state (a versioned S3 bucket plus a DynamoDB lock table).
- Set up CI (plan on PR, apply on merge).
- Use OIDC for AWS auth.
- Pin provider versions; commit the lock file.
- Tag every resource with common tags.
- Protect production resources (prevent_destroy, deletion_protection).
- Scan with tfsec and Checkov.
- Schedule drift detection.
- Document the repo structure in a README.
Do these and you're ahead of most teams.
Where to Go From Here
You have the commands, the patterns, the pipeline, and the list of bad habits to avoid. The next level is depth:
- Terraform docs (developer.hashicorp.com/terraform): the canonical reference.
- terraform-aws-modules on the public registry: read the source of production-grade modules.
- Anton Babenko's terraform-best-practices: the practical companion site.
- AWS Well-Architected Framework: what "good infrastructure" actually means, above Terraform.
- The OpenTofu discussions: community governance in action.
Build something. Break it. Fix it. Migrate it from workspace-per-env to directory-per-env when you realize why. That's where Terraform stops being a config language and becomes a tool you trust with production.