Chapter 02 — Infrastructure as Code (IaC) (Part 1)
By the end of this chapter, you will be able to:
- Explain and demonstrate the plan -> review -> apply workflow.
- Provision a multi-node Kind cluster with Terraform.
- Use pre-commit guardrails to keep broken or unsafe IaC changes out of the repository.
Infrastructure as Code (IaC) is not just about automation speed; it’s about repeatability and safety. In this chapter, we build our local foundation using Terraform and Kind, focusing on a reviewed execution model.
Manual cluster creation leads to inconsistency and “snowflake” clusters. Without a shared state and strict locking, team collaboration becomes a source of race conditions and unintended resource destruction.
We use Kind (Kubernetes in Docker) to simulate a production cluster locally. This allows us to practice our deployment workflows in a safe, disposable environment before moving to the cloud.
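To see why Kind makes a good disposable lab, the same topology the Terraform module codifies can be sketched as a raw kind config for quick throwaway experiments (the file and cluster names here are arbitrary, and running the commented commands assumes the `kind` CLI and Docker are installed):

```shell
# Write a kind config mirroring the module's topology: an ingress-ready
# control-plane with host port mappings, plus one worker node.
cat > kind-sre.yaml <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    extraPortMappings:
      - containerPort: 30080
        hostPort: 8080
        protocol: TCP
      - containerPort: 30443
        hostPort: 8443
        protocol: TCP
  - role: worker
EOF

# Disposable lifecycle: create, experiment, throw away.
# kind create cluster --name sre-scratch --config kind-sre.yaml
# kind delete cluster --name sre-scratch
echo "wrote kind-sre.yaml"
```

The point of the exercise is the delete command: because the whole cluster is cattle, you can destroy and recreate it freely while practicing workflows.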
Our sre/ repo organizes infrastructure into isolated modules. The kind_cluster module codifies our multi-node topology and Flux bootstrap.
Kind cluster layout
terraform {
  required_providers {
    kind = {
      source  = "tehcyx/kind"
      version = "0.9.0"
    }
    helm = {
      source  = "hashicorp/helm"
      version = "~> 2.12"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.25"
    }
    null = {
      source  = "hashicorp/null"
      version = "~> 3.2"
    }
    # Required by the time_sleep resource below.
    time = {
      source  = "hashicorp/time"
      version = "~> 0.11"
    }
  }
}
provider "kind" {}

provider "helm" {
  kubernetes {
    host                   = kind_cluster.sre.endpoint
    client_certificate     = kind_cluster.sre.client_certificate
    client_key             = kind_cluster.sre.client_key
    cluster_ca_certificate = kind_cluster.sre.cluster_ca_certificate
  }
}

provider "kubernetes" {
  host                   = kind_cluster.sre.endpoint
  client_certificate     = kind_cluster.sre.client_certificate
  client_key             = kind_cluster.sre.client_key
  cluster_ca_certificate = kind_cluster.sre.cluster_ca_certificate
}
locals {
  kubeconfig_path = pathexpand("${path.module}/kubeconfig.yaml")

  # Four leading spaces align this line under spec.sync in the
  # FluxInstance manifest rendered below.
  flux_pull_secret_yaml = var.flux_git_token != "" ? "    pullSecret: \"flux-system\"\n" : ""

  backup_s3_secret_enabled = nonsensitive(var.r2_access_key_id != "" && var.r2_secret_access_key != "")
}
resource "kind_cluster" "sre" {
  name            = "sre-control-plane"
  wait_for_ready  = true
  kubeconfig_path = local.kubeconfig_path

  kind_config {
    api_version = "kind.x-k8s.io/v1alpha4"
    kind        = "Cluster"

    networking {
      api_server_address = "127.0.0.1"
      api_server_port    = 6443
      kube_proxy_mode    = "iptables"
    }

    containerd_config_patches = [
      <<-EOT
      [plugins."io.containerd.grpc.v1.cri".registry.mirrors."localhost:5001"]
        endpoint = ["http://kind-registry:5000"]
      EOT
    ]

    node {
      role = "control-plane"

      kubeadm_config_patches = [
        <<-EOT
        kind: InitConfiguration
        nodeRegistration:
          kubeletExtraArgs:
            node-labels: "ingress-ready=true"
            authorization-mode: "Webhook"
        EOT
      ]

      extra_port_mappings {
        container_port = 30080
        host_port      = 8080
        listen_address = "127.0.0.1"
        protocol       = "TCP"
      }

      extra_port_mappings {
        container_port = 30443
        host_port      = 8443
        listen_address = "127.0.0.1"
        protocol       = "TCP"
      }
    }

    node {
      role = "worker"
    }
  }
}
resource "null_resource" "merge_kubeconfig" {
  depends_on = [kind_cluster.sre]

  provisioner "local-exec" {
    when        = create
    command     = "${path.module}/scripts/merge-kubeconfig.sh \"${local.kubeconfig_path}\""
    interpreter = ["/bin/bash", "-c"]
  }
}

resource "time_sleep" "wait_for_cluster" {
  depends_on      = [null_resource.merge_kubeconfig]
  create_duration = "30s"
}
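The fixed 30-second sleep is a blunt instrument: it wastes time on fast machines and may be too short on slow ones. A retry helper that polls for readiness is a more robust pattern. A sketch (`wait_ready` is a hypothetical helper; against the real cluster the probe would be `kubectl get nodes`):

```shell
# Retry the given command until it succeeds, up to 30 attempts 1s apart.
wait_ready() {
  local tries=0
  until "$@" >/dev/null 2>&1; do
    tries=$((tries + 1))
    if [ "$tries" -ge 30 ]; then
      echo "gave up waiting for: $*" >&2
      return 1
    fi
    sleep 1
  done
  echo "ready"
}

# Real usage (assumes KUBECONFIG points at the new cluster):
# wait_ready kubectl get nodes
wait_ready true  # demo probe that succeeds immediately
```

This could replace the `time_sleep` resource in a `local-exec` provisioner, but the fixed sleep keeps the module dependency-free and predictable.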
output "kubeconfig" {
  description = "Path to the generated kubeconfig for the kind cluster"
  value       = local.kubeconfig_path
}

output "kubeconfig_load_instructions" {
  description = "How to use the generated kubeconfig"
  value       = <<-EOT
    export KUBECONFIG="${local.kubeconfig_path}"
    kubectl get nodes
    # Optional: merge into your default kubeconfig
    ${path.module}/scripts/merge-kubeconfig.sh "${local.kubeconfig_path}"
    kubectl config use-context sre-control-plane
  EOT
}
resource "kubernetes_namespace" "traefik" {
  metadata {
    name = "traefik"
  }
  depends_on = [time_sleep.wait_for_cluster]
}

resource "helm_release" "traefik" {
  name       = "traefik"
  repository = "https://traefik.github.io/charts"
  chart      = "traefik"
  namespace  = "traefik"
  version    = "34.5.0"
  depends_on = [kubernetes_namespace.traefik]

  set {
    name  = "service.type"
    value = "NodePort"
  }
  set {
    name  = "ports.web.nodePort"
    value = "30080"
  }
  set {
    name  = "ports.websecure.nodePort"
    value = "30443"
  }
  set {
    name  = "providers.kubernetesIngress.enabled"
    value = "true"
  }
  set {
    name  = "providers.kubernetesCRD.enabled"
    value = "true"
  }
}
resource "helm_release" "metrics_server" {
  name       = "metrics-server"
  repository = "https://kubernetes-sigs.github.io/metrics-server/"
  chart      = "metrics-server"
  namespace  = "kube-system"
  version    = "3.12.2"
  depends_on = [time_sleep.wait_for_cluster]

  set {
    name  = "args[0]"
    value = "--kubelet-insecure-tls"
  }
}
resource "null_resource" "flux_operator_install" {
  depends_on = [time_sleep.wait_for_cluster]

  triggers = {
    kubeconfig_path = local.kubeconfig_path
    repo_url        = var.flux_git_repository_url
    repo_branch     = var.flux_git_repository_branch
    repo_path       = var.flux_kustomization_path
    provider        = "github"
  }

  provisioner "local-exec" {
    when        = create
    interpreter = ["/bin/bash", "-c"]
    command     = "kubectl --kubeconfig=\"${local.kubeconfig_path}\" apply -f https://github.com/controlplaneio-fluxcd/flux-operator/releases/latest/download/install.yaml"
  }
}
resource "null_resource" "flux_instance" {
  depends_on = [
    null_resource.flux_operator_install,
    kubernetes_secret.flux_git_auth,
  ]

  triggers = {
    kubeconfig_path = local.kubeconfig_path
  }

  provisioner "local-exec" {
    when        = create
    interpreter = ["/bin/bash", "-c"]
    command     = <<-EOC
      cat <<EOF | kubectl --kubeconfig="${local.kubeconfig_path}" apply -f -
      apiVersion: fluxcd.controlplane.io/v1
      kind: FluxInstance
      metadata:
        name: flux
        namespace: flux-system
      spec:
        distribution:
          version: "${var.flux_version}"
          registry: ghcr.io/fluxcd
        components:
          - source-controller
          - kustomize-controller
          - helm-controller
          - notification-controller
          - image-reflector-controller
          - image-automation-controller
        cluster:
          type: kubernetes
        sync:
          kind: GitRepository
          url: "${var.flux_git_repository_url}"
          ref: "refs/heads/${var.flux_git_repository_branch}"
          provider: generic
          path: "${var.flux_kustomization_path}"
      ${local.flux_pull_secret_yaml}EOF
    EOC
  }

  provisioner "local-exec" {
    when        = destroy
    on_failure  = continue
    command     = "kubectl --kubeconfig=\"${self.triggers.kubeconfig_path}\" delete fluxinstance flux -n flux-system --ignore-not-found=true --wait=false"
    interpreter = ["/bin/bash", "-c"]
  }
}
resource "null_resource" "flux_pre_destroy" {
  depends_on = [
    kind_cluster.sre,
    kubernetes_namespace.traefik,
    kubernetes_namespace.bootstrap,
    null_resource.flux_instance,
  ]

  triggers = {
    kubeconfig_path = local.kubeconfig_path
    namespaces      = "develop,staging,production,observability,traefik"
  }

  provisioner "local-exec" {
    when        = destroy
    on_failure  = continue
    command     = "\"${path.module}/../scripts/flux-pre-destroy.sh\" \"${self.triggers.kubeconfig_path}\" \"${self.triggers.namespaces}\""
    interpreter = ["/bin/bash", "-c"]
  }
}
resource "kubernetes_secret" "flux_git_auth" {
  # Git credentials for Flux; referenced by null_resource.flux_instance above.
  metadata {
    name      = "flux-system"
    namespace = "flux-system"
  }

  data = {
    username = "git"
    password = var.flux_git_token
  }

  type = "Opaque"

  depends_on = [null_resource.flux_operator_install]
}
resource "kubernetes_config_map" "cluster_settings" {
  # The opening lines of this resource were lost in the source; the resource
  # and metadata names here are reconstructions following the common Flux
  # cluster-settings substitution pattern.
  metadata {
    name      = "cluster-settings"
    namespace = "flux-system"
  }

  data = {
    cloudflare_proxied = "disabled"
    cluster_name       = "sre-control-plane"
    image_registry     = var.image_registry
    git_owner          = var.git_owner
  }

  depends_on = [null_resource.flux_operator_install]
}

resource "kubernetes_secret" "uptrace_dsn" {
  # Reconstructed header and metadata: name/namespace are assumptions.
  metadata {
    name      = "uptrace-dsn"
    namespace = "flux-system"
  }

  type = "Opaque"

  data = {
    uptrace_dsn = var.uptrace_dsn
  }

  depends_on = [null_resource.flux_operator_install]
}
resource "kubernetes_namespace" "bootstrap" {
  # Reconstructed header: the for_each set is an assumption based on the
  # namespace list used by the pre-destroy hook above.
  for_each = toset(["develop", "staging", "production", "observability"])

  metadata {
    name = each.key
  }

  depends_on = [time_sleep.wait_for_cluster]

  lifecycle {
    ignore_changes = [
      metadata[0].labels,
    ]
  }
}

resource "kubernetes_secret" "ghcr_credentials" {
  # Reconstructed header: one image pull secret per bootstrap namespace.
  for_each = kubernetes_namespace.bootstrap

  metadata {
    name      = "ghcr-credentials-docker"
    namespace = each.key
  }

  type = "kubernetes.io/dockerconfigjson"

  data = {
    ".dockerconfigjson" = jsonencode({
      auths = {
        "ghcr.io" = {
          username = var.ghcr_username
          password = var.ghcr_token
          auth     = base64encode("${var.ghcr_username}:${var.ghcr_token}")
        }
      }
    })
  }
}
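To see exactly what that `jsonencode`/`base64encode` pair produces, the same `.dockerconfigjson` payload can be built by hand. A sketch with placeholder credentials (the Terraform above uses `var.ghcr_username` and `var.ghcr_token`):

```shell
# Placeholder credentials for illustration only.
USER="octocat"
TOKEN="example-token"

# The `auth` field is base64("user:token"), mirroring base64encode() above.
AUTH=$(printf '%s:%s' "$USER" "$TOKEN" | base64)

# The value stored under ".dockerconfigjson", mirroring jsonencode() above.
printf '{"auths":{"ghcr.io":{"username":"%s","password":"%s","auth":"%s"}}}\n' \
  "$USER" "$TOKEN" "$AUTH"
```

The kubelet reads the `auth` field when pulling images, which is why the secret duplicates the username and token inside it.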
resource "kubernetes_secret" "sops_age" {
  # Reconstructed header: SOPS age key consumed by the kustomize-controller.
  metadata {
    name      = "sops-age"
    namespace = "flux-system"
  }

  type = "Opaque"

  data = {
    "age.agekey" = var.sops_age_key
  }
}

resource "kubernetes_secret" "cnpg_backup_s3" {
  # Reconstructed header: one backup secret per bootstrap namespace, created
  # only when the R2 credentials are set (see local.backup_s3_secret_enabled).
  for_each = {
    for k, v in kubernetes_namespace.bootstrap : k => v
    if local.backup_s3_secret_enabled
  }

  metadata {
    name      = "cnpg-backup-s3"
    namespace = each.key
  }

  type = "Opaque"

  data = merge(
    {
      ACCESS_KEY_ID     = var.r2_access_key_id
      ACCESS_SECRET_KEY = var.r2_secret_access_key
      BUCKET            = var.r2_bucket
    },
    var.r2_endpoint != "" ? { ENDPOINT = var.r2_endpoint } : {},
    var.r2_region != "" ? { REGION = var.r2_region } : {},
  )

  depends_on = [kubernetes_namespace.bootstrap]
}
output "flux_operator_installed" {
  description = "Indicates that Flux Operator has been installed"
  value       = null_resource.flux_operator_install.id != ""
}

output "flux_instance_created" {
  description = "Indicates that FluxInstance has been created"
  value       = "flux"
  depends_on  = [null_resource.flux_instance]
}
We enforce a “sanity check” layer before code leaves the workstation. Hooks for formatting, validation, and security scanning block broken changes before they reach the repository.
IaC hook baseline
default_install_hook_types:
  - pre-commit
  - pre-push
  - pre-merge-commit
  - prepare-commit-msg

repos:
  - repo: local
    hooks:
      - id: master-branch-check
        name: Protected branch guard
        entry: scripts/pre-commit-master-check.sh
        language: script
        always_run: true
        pass_filenames: false
        stages: [pre-commit, pre-push, pre-merge-commit]
        args:
          - --protected=master
          - --protected=main
      - id: prevent-amend-after-push
        name: Prevent amending pushed commits
        entry: scripts/prevent-amend-after-push.sh
        language: script
        always_run: true
        pass_filenames: false
        stages: [prepare-commit-msg]

  - repo: local
    hooks:
      - id: flux-kustomize-validate
        name: Flux kustomize validate
        entry: scripts/flux-kustomize-validate.sh
        language: script
        files: ^flux/.*\.ya?ml$
        pass_filenames: true
        require_serial: true
        stages: [pre-commit]
      - id: terraform-fmt
        name: Terraform format check
        entry: terraform fmt -recursive -diff -check
        language: system
        files: \.tf$
        pass_filenames: false
        stages: [pre-commit]
      - id: terraform-validate
        name: Terraform validate
        entry: scripts/terraform-validate.sh
        language: script
        files: \.(tf|tfvars)$
        pass_filenames: false
        require_serial: true
        stages: [pre-commit]
      - id: terraform-security
        name: Terraform security scan
        entry: scripts/terraform-security.sh
        language: script
        files: \.(tf|tfvars)$
        pass_filenames: false
        require_serial: true
        stages: [pre-commit]

  - repo: local
    hooks:
      - id: no-secrets
        name: Block sensitive files
        entry: scripts/block-secrets.sh
        language: script
        files: (kubeconfig|\.key$|\.pem$|credentials|\.env$)
        stages: [pre-commit]

  - repo: https://github.com/koalaman/shellcheck-precommit
    rev: v0.10.0
    hooks:
      - id: shellcheck
        files: \.sh$
        args: [--severity=warning]
        stages: [pre-commit]

  - repo: https://github.com/adrienverge/yamllint
    rev: v1.35.1
    hooks:
      - id: yamllint
        files: \.ya?ml$
        args: [-d, relaxed]
        stages: [pre-commit]
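You can sanity-check the `no-secrets` file pattern without invoking pre-commit at all: the `files` key is a regular expression matched against each staged path, and `grep -E` approximates it closely. The sample paths below are made up for illustration:

```shell
# The pattern from the no-secrets hook above.
pattern='(kubeconfig|\.key$|\.pem$|credentials|\.env$)'

# Which of these sample paths would the hook flag?
printf '%s\n' kubeconfig.yaml deploy/app.env notes/readme.md certs/tls.pem \
  | grep -E "$pattern"
# → kubeconfig.yaml, deploy/app.env, certs/tls.pem (readme.md passes)
```

Once the config is in place, `pre-commit install` wires the hooks into `.git/hooks`, and `pre-commit run --all-files` exercises every hook against the whole tree.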
Initialize your local environment using the guarded workflow:
cd infra/terraform/kind_cluster
terraform init
terraform plan -out=tfplan
# Review the plan output, then apply
terraform apply tfplan
kubectl get nodes --context kind-sre-control-plane
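The reviewed-execution model can be tightened further with a plan guard: render the saved plan to JSON and refuse to apply if it contains destroy actions. This is a hypothetical sketch, not the course's script; the `grep` is a crude stand-in for properly parsing `resource_changes[].change.actions`, and it runs here against an inline, heavily trimmed sample of `terraform show -json tfplan` output:

```shell
# Inline sample of (trimmed) `terraform show -json tfplan` output.
cat > plan.json <<'EOF'
{"resource_changes":[{"address":"kind_cluster.sre","change":{"actions":["delete","create"]}}]}
EOF

# Guard: count delete actions; block the apply when any are present.
deletes=$(grep -o '"delete"' plan.json | wc -l)
if [ "$deletes" -gt 0 ]; then
  echo "BLOCKED: plan contains $deletes delete action(s); get a second review"
else
  echo "OK: no destroy actions in plan"
fi
```

Wired into CI between `terraform plan -out=tfplan` and `terraform apply tfplan`, a guard like this turns "someone should have noticed the replacement" into a hard stop.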
Result: partial drift plus unexpected replacement in unrelated resources. Recovery takes longer because no one can prove which plan produced the final state.
Observed symptoms (what the team sees first): One apply job …
Safe investigation sequence:
- Identify every job: list every plan and apply job that touched the same environment.
- Compare artifacts: compare the reviewed plan artifact with a fresh plan against current state.
- Confirm …
- terraform fmt -recursive -diff -check: ensures consistent formatting.
- scripts/terraform-validate.sh: catches configuration errors.
- scripts/terraform-security.sh: flags security misconfigurations using Checkov.
Safe …
Done When
You have completed this chapter when:
- You can explain and demonstrate the plan -> review -> apply workflow.
- You have successfully provisioned a 3-node Kind cluster using Terraform.
- You can identify …