Chapter 07: Resource Management & QoS

Why This Chapter Exists

Unbounded workloads create noisy-neighbor incidents and unpredictable recovery. This chapter enforces resource discipline:

  • requests/limits per container
  • namespace quotas
  • predictable QoS behavior under pressure

Guardrails

  • Every workload must define CPU/memory requests and limits.
  • Namespaces must enforce LimitRange and ResourceQuota.
  • OOM and throttling analysis must happen before scaling decisions.
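
As a rough illustration of the third guardrail, the sketch below lists pods whose containers were last terminated with reason OOMKilled. The function name `oom_killed_pods` is hypothetical and not part of this repo; the namespace defaults to develop only as an assumption.

```shell
# Hypothetical helper: print the names of pods in a namespace that have a
# container whose last termination reason was OOMKilled.
oom_killed_pods() {
  ns="${1:-develop}"
  # One line per pod: <pod name><TAB><last termination reasons of its containers>
  kubectl -n "$ns" get pods \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}{end}' |
    awk -F'\t' '$2 ~ /OOMKilled/ { print $1 }'
}
```

Running `oom_killed_pods develop` before raising replica counts shows whether the problem is a memory limit rather than load. For CPU, `kubectl -n develop top pod --containers` gives current usage to compare against limits, assuming metrics-server is installed in the cluster.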

Repo Mapping

  • App resources:
    • flux/apps/backend/base/deployment.yaml
    • flux/apps/frontend/base/deployment.yaml
  • Namespace quotas/limits:
    • flux/infrastructure/resource-management/develop/
    • flux/infrastructure/resource-management/staging/
    • flux/infrastructure/resource-management/production/
  • Flux wiring:
    • flux/bootstrap/flux-system/infrastructure.yaml
    • flux/bootstrap/flux-system/apps.yaml

Current Implementation (This Repo)

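  • Backend and frontend define requests and limits for CPU, memory, and ephemeral-storage.
  • The develop, staging, and production namespaces each have a LimitRange and a ResourceQuota, applied via Flux.
  • The apps Kustomizations depend on the resource-management Kustomizations, so namespace limits are in place before the apps reconcile.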
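
One possible way to confirm this state from the command line is sketched below. Both function names are hypothetical, and the Kustomization name passed to `kustomization_deps` is an assumption because this chapter does not state it; list the real names with `kubectl -n flux-system get kustomizations`.

```shell
# Hypothetical helper: list LimitRange and ResourceQuota objects in each
# environment namespace named in this chapter.
check_namespace_controls() {
  for ns in develop staging production; do
    echo "== $ns =="
    kubectl -n "$ns" get limitrange,resourcequota
  done
}

# Hypothetical helper: print the names of the Kustomizations that a given
# Flux Kustomization depends on (spec.dependsOn).
# $1 is the Kustomization name, supplied by the caller.
kustomization_deps() {
  kubectl -n flux-system get kustomization "$1" \
    -o jsonpath='{.spec.dependsOn[*].name}'
}
```

An empty result from `kustomization_deps` would suggest the dependency is not wired for that Kustomization.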

Lab Files

  • lab.md
  • quiz.md

Done When

  • learner can explain Burstable vs Guaranteed vs BestEffort with real manifests
  • learner can verify quota/limitrange enforcement in cluster
  • learner can diagnose OOM/resource pressure from pod events and metrics
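
To make the three QoS classes concrete, here is a minimal sketch of the classification rule for a single-container pod. The function `qos_class` is hypothetical and simplified: it compares values as plain strings (so `1` and `1000m` count as different) and ignores multi-container pods, where every container must qualify for the pod to be Guaranteed. Only CPU and memory matter; ephemeral-storage does not affect the QoS class.

```shell
# Hypothetical helper: qos_class CPU_REQ CPU_LIM MEM_REQ MEM_LIM
# Pass an empty string "" for any value the container does not set.
qos_class() {
  cpu_req="$1"; cpu_lim="$2"; mem_req="$3"; mem_lim="$4"

  # Kubernetes defaults a missing request to the limit when a limit is set.
  if [ -z "$cpu_req" ]; then cpu_req="$cpu_lim"; fi
  if [ -z "$mem_req" ]; then mem_req="$mem_lim"; fi

  if [ -z "$cpu_req$cpu_lim$mem_req$mem_lim" ]; then
    # No CPU or memory request or limit at all.
    echo BestEffort
  elif [ -n "$cpu_lim" ] && [ -n "$mem_lim" ] &&
       [ "$cpu_req" = "$cpu_lim" ] && [ "$mem_req" = "$mem_lim" ]; then
    # Both limits set, and requests equal limits for CPU and memory.
    echo Guaranteed
  else
    echo Burstable
  fi
}
```

For example, `qos_class 100m 500m 128Mi 256Mi` prints Burstable. The class the cluster actually assigned can be read from a running pod with `kubectl -n develop get pod <pod-name> -o jsonpath='{.status.qosClass}'`.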

Lab: Requests, Limits, QoS, and OOM Analysis

Goal

Validate resource guardrails in develop:

  • verify requests/limits are present
  • verify namespace quota and default limits
  • trigger controlled memory pressure and analyze behavior

Prerequisites

  • Flux healthy
  • develop namespace workloads running

kubectl -n flux-system get kustomizations
kubectl -n develop get deploy backend frontend
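
For the first lab goal (requests/limits are present), one way to inspect the live Deployments is sketched below. The function name `show_resources` is hypothetical; the exact rendering of the resources field may differ between kubectl versions.

```shell
# Hypothetical helper: show_resources NAMESPACE DEPLOYMENT
# Prints one line per container: <container name><TAB><resources stanza>.
# A container with nothing set shows an empty stanza.
show_resources() {
  kubectl -n "$1" get deploy "$2" \
    -o jsonpath='{range .spec.template.spec.containers[*]}{.name}{"\t"}{.resources}{"\n"}{end}'
}
```

Calling `show_resources develop backend` and `show_resources develop frontend` should show both requests and limits for every container.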

Step 1: Verify Namespace Controls

kubectl -n develop get limitrange
kubectl -n develop describe limitrange default-container-limits
kubectl -n develop get resourcequota
kubectl -n develop describe resourcequota compute-quota

Expected:

Quiz: Chapter 07 (Resource Management & QoS)

Questions

  1. Why are requests/limits mandatory for production workloads?

  2. What QoS class do pods usually get when requests and limits are both set but not equal?

  3. What Kubernetes object enforces namespace-wide total resource caps?

  4. What Kubernetes object provides default/min/max resource values per container?

  5. Which statement is correct?

  • A) BestEffort pods are safest for critical APIs.
  • B) ResourceQuota helps prevent one namespace from exhausting cluster capacity.
  • C) OOMKilled means the container exceeded CPU limit.

  6. Why include ephemeral-storage requests/limits?