Resources and Data Sources: Where the Infrastructure Lives

This chapter covers the resource lifecycle, data sources, dependencies, and the count and for_each meta-arguments you'll use every day.

The Resource Block

A resource block declares one infrastructure object for Terraform to manage.

resource "aws_instance" "web" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t3.micro"

  tags = {
    Name = "web-server"
  }
}

The parts of the block:

  • First label: resource type (provided by the provider, e.g. aws_instance).
  • Second label: local name (you pick).
  • Body: arguments (some required, some optional, documented by the provider).
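
The two labels together form the resource's address, which the rest of the configuration uses to read its attributes. A minimal sketch, assuming the aws_instance.web resource shown above; the output names are illustrative:

```hcl
# Address format: <resource type>.<local name>.<attribute>
output "web_instance_id" {
  value = aws_instance.web.id
}

output "web_private_ip" {
  value = aws_instance.web.private_ip
}
```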

Find documentation at registry.terraform.io for every provider. The AWS provider's docs page for aws_instance lists every argument and attribute.

The Resource Lifecycle

For each managed resource, Terraform performs one of four operations, depending on how your config, the state file, and the real infrastructure differ.

Create

When a resource is in config but not in state, Terraform creates it.

Read (Refresh)

Before every plan, Terraform asks the provider: "what's the current state of every resource I'm tracking?" It compares the answer with its state file so the plan reflects reality. This is refresh. You rarely invoke it directly; plan does it automatically.

Update

When an argument changes:

  • In-place update: some attributes can change without replacing the resource (tags, for example).
  • Replace: some attributes force a new resource (the AMI of an EC2 instance, the CIDR of a VPC). Terraform destroys the old and creates the new.

The plan tells you which: ~ means in-place, -/+ means replace.
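
As a sketch of the difference, here is the earlier instance annotated with what a change to each argument would typically produce. The provider decides which arguments force replacement, so treat the comments as illustrative and trust the plan output:

```hcl
resource "aws_instance" "web" {
  # Changing the AMI forces replacement: the plan marks the resource with -/+.
  ami = "ami-0c55b159cbfafe1f0"

  instance_type = "t3.micro"

  # Changing tags is an in-place update: the plan marks the resource with ~.
  tags = {
    Name = "web-server"
  }
}
```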

Delete

When a resource is in state but not in config, Terraform deletes it.
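
Sometimes you want Terraform to stop managing an object without destroying it. Assuming Terraform 1.7 or later, a removed block can express that. In this sketch, aws_instance.web stands for a resource whose block you have already deleted from the config:

```hcl
# Forget the resource (drop it from state) but leave the real instance running.
removed {
  from = aws_instance.web

  lifecycle {
    destroy = false
  }
}
```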

Data Sources

A data block reads an existing object without managing it. It is read-only: the result feeds into Terraform's expression evaluation, and nothing is created or modified.

data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"]    # Canonical

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"
}

Data sources are essential. Use them to:

  • Look up the latest AMI instead of hardcoding.
  • Read your own AWS account ID and region.
  • Find a VPC by tag.
  • Read a secret from Secrets Manager.
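
The last two lookups in the list might be sketched as follows. The tag value and secret name are placeholders, not values from this guide. Keep in mind that a secret read this way is also stored in the Terraform state.

```hcl
# Find a VPC by tag. The tag value "main" is a placeholder.
data "aws_vpc" "main" {
  tags = {
    Name = "main"
  }
}

# Read the current version of a secret. "prod/db/password" is a placeholder name.
data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = "prod/db/password"
}

locals {
  vpc_id      = data.aws_vpc.main.id
  db_password = data.aws_secretsmanager_secret_version.db_password.secret_string
}
```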

Useful common ones:

data "aws_caller_identity" "current" {}      # account ID
data "aws_region" "current" {}               # current region
data "aws_availability_zones" "available" {} # AZs in current region
data "aws_ami" "ubuntu" { ... }              # AMI lookup
data "aws_vpc" "default" { default = true }  # default VPC

Reference: data.aws_caller_identity.current.account_id.
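
Data source attributes can be used anywhere an expression is allowed. A small sketch, where the local names and the bucket naming scheme are hypothetical:

```hcl
data "aws_caller_identity" "current" {}
data "aws_availability_zones" "available" {}

locals {
  # Hypothetical naming scheme: the account ID keeps the bucket name unique.
  log_bucket_name = "logs-${data.aws_caller_identity.current.account_id}"

  # First availability zone returned for the current region.
  first_az = data.aws_availability_zones.available.names[0]
}
```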

Dependencies

Terraform builds a dependency graph from your config and applies resources in dependency order, running resources that don't depend on each other in parallel. Dependencies come in two flavors.

Implicit Dependencies

Whenever resource A references resource B, A depends on B.

resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "web" {
  vpc_id     = aws_vpc.main.id       # reference → implicit dep
  cidr_block = "10.0.1.0/24"
}

Terraform creates the VPC before the subnet because of the reference.

Implicit dependencies are the default. Use references everywhere; the dependency graph builds itself.

Explicit Dependencies

Sometimes A must run after B, but A doesn't reference B. Use depends_on:

resource "aws_iam_role_policy_attachment" "lambda_logs" {
  role       = aws_iam_role.lambda.name
  policy_arn = aws_iam_policy.logs.arn
}

resource "aws_lambda_function" "api" {
  # ... doesn't reference the policy attachment
  depends_on = [aws_iam_role_policy_attachment.lambda_logs]
}

Here the Lambda function doesn't reference the policy attachment, but it won't work until the attachment exists. depends_on tells Terraform.

Use explicit dependencies sparingly. If you find yourself reaching for depends_on, check first whether a reference would work.

count: Identical Copies

Create N copies of a resource:

resource "aws_instance" "web" {
  count = 3

  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"

  tags = {
    Name = "web-${count.index}"
  }
}

Inside the block, count.index is the index of the current copy: 0, 1, 2.

References outside the block need the index:

output "first_web_id" {
  value = aws_instance.web[0].id
}

output "all_web_ids" {
  value = aws_instance.web[*].id     # splat: all of them
}

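Count is fine for identical, numbered resources. Lowering count simply removes the highest indices. Where it hurts: when count is driven by a list and you remove an item from the front or middle, every later item shifts down one index, and Terraform updates or replaces each shifted resource and destroys the last one. For per-item distinct resources, use for_each instead.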
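
The index shift is easiest to see when count is driven by a list. A sketch with a hypothetical user_names variable; the comments describe how addresses map to names, not literal plan output:

```hcl
variable "user_names" {
  type    = list(string)
  default = ["alice", "bob", "carol"]
}

resource "aws_iam_user" "team" {
  count = length(var.user_names)
  name  = var.user_names[count.index]
}

# With the default list:
#   aws_iam_user.team[0] -> alice
#   aws_iam_user.team[1] -> bob
#   aws_iam_user.team[2] -> carol
#
# Remove "alice" from the list and the same addresses now map to:
#   aws_iam_user.team[0] -> bob    (was alice: planned change)
#   aws_iam_user.team[1] -> carol  (was bob: planned change)
#   aws_iam_user.team[2]           (no longer in config: destroyed)
```

Only one user was removed, yet every address is touched.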

for_each: Per-Key Uniqueness

Create one resource per key:

resource "aws_s3_bucket" "per_env" {
  for_each = toset(["dev", "staging", "prod"])

  bucket = "notes-${each.key}"
}

Inside the block:

  • each.key: the key (the string).
  • each.value: the value (same as key when the input is a set of strings).

With a map:

resource "aws_instance" "by_role" {
  for_each = {
    web    = "t3.medium"
    worker = "m5.large"
    db     = "r5.xlarge"
  }

  ami           = data.aws_ami.ubuntu.id
  instance_type = each.value

  tags = {
    Role = each.key
  }
}

References use the key:

aws_instance.by_role["web"].id
aws_instance.by_role["worker"].id

Adding or removing a key affects only that one resource; neighbors aren't disturbed.
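
To collect attributes from every instance, a for expression or values() plays the role that the splat plays for count. A sketch assuming the aws_instance.by_role resource above; the output names are illustrative:

```hcl
# Map of role => instance ID, e.g. keys "web", "worker", "db".
output "instance_ids_by_role" {
  value = { for role, instance in aws_instance.by_role : role => instance.id }
}

# Plain list of IDs: values() turns the map of instances into a list first.
output "all_instance_ids" {
  value = values(aws_instance.by_role)[*].id
}
```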

count vs for_each

count       Identical copies; order matters; removal is disruptive.
for_each    Per-key; order doesn't matter; removal is surgical.

Rule of thumb: if the items are distinct (different names, different configs), use for_each. If they're truly identical copies of the same thing, count is fine.
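
If an existing resource already uses count and you want to switch it to for_each, a moved block (assuming Terraform 1.1 or later) can tell Terraform that an old index and a new key are the same object, so nothing is destroyed. A sketch with hypothetical keys "a" and "b":

```hcl
resource "aws_instance" "web" {
  # Previously: count = 2
  for_each = toset(["a", "b"])

  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"
}

# Map each old index to its new key so the existing instances are kept.
moved {
  from = aws_instance.web[0]
  to   = aws_instance.web["a"]
}

moved {
  from = aws_instance.web[1]
  to   = aws_instance.web["b"]
}
```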

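Some older modules take an enabled variable and set count = var.enabled ? 1 : 0 on each resource to create it conditionally. This works but is awkward, since every reference to the resource then needs an index. Modern idiom: lift the conditional to the caller by putting count or for_each on the module block itself (supported since Terraform 0.13).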
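
Both forms side by side, as a sketch. The variable name, the log group, and the module path are hypothetical. one() assumes Terraform 0.15 or later, and count on a module block assumes 0.13 or later.

```hcl
variable "enable_monitoring" {
  type    = bool
  default = false
}

# Conditional on the resource: zero or one copy.
resource "aws_cloudwatch_log_group" "app" {
  count = var.enable_monitoring ? 1 : 0
  name  = "/app/web"
}

# Every reference must now handle the empty case.
# one() returns the single element, or null when there is none.
output "log_group_name" {
  value = one(aws_cloudwatch_log_group.app[*].name)
}

# Conditional lifted to the caller: the module's resources need no count.
module "monitoring" {
  source = "./modules/monitoring" # hypothetical local module
  count  = var.enable_monitoring ? 1 : 0
}
```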

Dynamic Blocks

When a resource has nested blocks that depend on input data, use dynamic:

resource "aws_security_group" "web" {
  name = "web"

  dynamic "ingress" {
    for_each = var.ingress_rules

    content {
      from_port   = ingress.value.from_port
      to_port     = ingress.value.to_port
      protocol    = ingress.value.protocol
      cidr_blocks = ingress.value.cidr_blocks
    }
  }
}

var.ingress_rules might look like:

ingress_rules = [
  { from_port = 443, to_port = 443, protocol = "tcp", cidr_blocks = ["0.0.0.0/0"] },
  { from_port = 80,  to_port = 80,  protocol = "tcp", cidr_blocks = ["0.0.0.0/0"] },
]
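
For ingress.value.from_port and the other attribute lookups to be type-checked, the variable needs a matching declaration. One possible declaration, offered as a sketch rather than the only valid shape:

```hcl
variable "ingress_rules" {
  description = "Ingress rules to render as nested ingress blocks."

  type = list(object({
    from_port   = number
    to_port     = number
    protocol    = string
    cidr_blocks = list(string)
  }))

  # An empty list renders no ingress blocks at all.
  default = []
}
```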

Use dynamic blocks sparingly. They're powerful but harder to read. Where a static block works, it is usually clearer.

The lifecycle Meta-Argument

A lifecycle block inside a resource customizes how Terraform creates, updates, and destroys that resource:

resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"

  lifecycle {
    create_before_destroy = true
    prevent_destroy       = false
    ignore_changes        = [tags["LastUpdated"]]
    replace_triggered_by  = [aws_launch_template.web.latest_version]
  }
}

What each does:

  • create_before_destroy: when replacing, create the new resource before destroying the old. Useful for zero-downtime changes.
  • prevent_destroy: error if Terraform tries to destroy this resource. Use on databases and anything precious. Remove the line when you genuinely want to destroy.
  • ignore_changes: don't plan an update or replacement when these attributes drift. Common for attributes that something outside Terraform changes (e.g. tags set by auto-tagging policies).
  • replace_triggered_by: replace this resource when a referenced resource or attribute changes (Terraform 1.2 and later). Useful for dependencies the API doesn't model cleanly.
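
In practice most resources need only one of these settings. A sketch of prevent_destroy guarding a stateful resource, with a placeholder bucket name:

```hcl
resource "aws_s3_bucket" "backups" {
  bucket = "example-backups" # placeholder; bucket names must be globally unique

  lifecycle {
    # Any plan that would destroy this bucket fails with an error instead.
    prevent_destroy = true
  }
}
```

The guard lives in the resource block itself, so deleting the whole block from the config removes the protection along with it.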

A Real Example: Web Servers Across Availability Zones

Bringing it together:

data "aws_availability_zones" "available" {}

data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"]

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }
}

variable "instance_count" {
  type    = number
  default = 3
}

resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "public" {
  count = length(data.aws_availability_zones.available.names)

  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index)
  availability_zone = data.aws_availability_zones.available.names[count.index]

  tags = { Name = "public-${count.index}" }
}

resource "aws_instance" "web" {
  count = var.instance_count

  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"
  subnet_id     = aws_subnet.public[count.index % length(aws_subnet.public)].id

  lifecycle {
    create_before_destroy = true
  }

  tags = { Name = "web-${count.index}" }
}

This creates a VPC, one public subnet per AZ, and instances spread round-robin across the subnets. Change instance_count and Terraform scales up or down, adding or removing the highest-numbered instances.
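
To check the placement after an apply, outputs along these lines could be added to the same configuration. The output names are illustrative:

```hcl
# All instance IDs, in index order.
output "web_instance_ids" {
  value = aws_instance.web[*].id
}

# Map of instance Name tag => subnet ID, showing the round-robin spread.
output "web_subnet_by_name" {
  value = { for instance in aws_instance.web : instance.tags["Name"] => instance.subnet_id }
}
```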

Common Pitfalls

Using count when for_each fits. Removing an item shifts every later index, causing unnecessary churn. Prefer for_each for distinct items.

Hardcoded AMIs. AMIs vary by region and go out of date. Use a data source.

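Circular dependencies. A references B, which references A. Terraform can't order the graph and reports a cycle error. Restructure, for example by moving one of the references into a separate resource, or introduce a data source.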

Omitting prevent_destroy on databases. One change that forces replacement plus one unreviewed terraform apply equals data loss. Protect stateful resources.

Dynamic blocks for static configs. Two known ingress rules don't need a dynamic block. Write them out.

Forgetting ignore_changes on autoscaling-managed resources. If autoscaling changes an ASG's desired capacity, or another system rewrites tags, outside Terraform, every plan will show a change. Ignore exactly those attributes and no others.
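
A sketch of that last case. It assumes a launch template named aws_launch_template.web and the aws_subnet.public subnets from the earlier example; the sizes are placeholders.

```hcl
resource "aws_autoscaling_group" "web" {
  min_size         = 1
  max_size         = 6
  desired_capacity = 2 # initial value only; autoscaling adjusts it afterwards

  vpc_zone_identifier = aws_subnet.public[*].id

  launch_template {
    id      = aws_launch_template.web.id # assumed to be defined elsewhere
    version = "$Latest"
  }

  lifecycle {
    # Leave the capacity chosen by autoscaling alone on later applies.
    ignore_changes = [desired_capacity]
  }
}
```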

Next Steps

Continue to 05-state.md to understand what Terraform is tracking behind your back.