Chapter 08: Resource Management & QoS
Learning Objectives
By the end of this chapter, you will be able to:
- Set requests, limits, and quotas per namespace for CPU and memory
- Predict QoS class assignment under resource pressure
- Distinguish OOMKilled behavior from node-pressure eviction
- Justify scaling decisions with resource utilization evidence
Start with the video for the concept overview, then work through each lesson section.
In Kubernetes, applications share the same physical hardware. Without resource discipline, one “noisy neighbor” can crash every other service on the node. In this chapter, we implement strict resource guardrails to ensure predictable and stable cluster behavior.
1. The Problem: The “Noisy Neighbor” Incident
A service starts consuming memory aggressively during a traffic peak. Because it has no limits, it starves neighboring Pods of memory, forcing Kubernetes to “evict” healthy Pods to save the host node. This causes a cascading failure across the entire node, affecting unrelated production workloads.
2. The Concept: Requests, Limits, and QoS
Kubernetes uses three related settings to decide which Pods matter most under resource pressure:
- Requests: the CPU/memory a Pod is guaranteed to receive; the scheduler uses requests to decide where a Pod can be placed.
- Limits: the absolute maximum a container may consume; exceeding the memory limit gets the container OOMKilled, while exceeding the CPU limit only causes throttling.
- QoS Classes: Guaranteed (requests equal limits for both CPU and memory in every container), Burstable (at least one request or limit is set, but the Pod does not qualify as Guaranteed), and BestEffort (no requests or limits at all).
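As an illustration, here are two hypothetical `resources` blocks (the values are examples, not our production settings) showing how the same container would land in different QoS classes:

```yaml
# Guaranteed: requests == limits for BOTH cpu and memory
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 100m
    memory: 128Mi
---
# Burstable: requests are set lower than limits, so the Pod may burst
resources:
  requests:
    cpu: 50m
    memory: 64Mi
  limits:
    cpu: 200m
    memory: 256Mi
# BestEffort: omit the resources block entirely (avoid this in production --
# BestEffort Pods are evicted first under node pressure)
```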
3. The Code: Resource Blocks
Our sre/ repo enforces explicit resource definitions for every container. The resources block in our deployment manifest is our contract with the Kubernetes scheduler.
Backend resource block
```yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
  labels:
    app: backend
    app.kubernetes.io/name: backend
    app.kubernetes.io/component: api
spec:
  replicas: 1
  revisionHistoryLimit: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 1
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
        app.kubernetes.io/name: backend
        app.kubernetes.io/component: api
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      imagePullSecrets:
        - name: ghcr-credentials-docker
      securityContext:
        runAsNonRoot: true
        runAsUser: 10001
        runAsGroup: 10001
        fsGroup: 10001
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: backend
          image: backend:latest
          imagePullPolicy: IfNotPresent
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            runAsUser: 10001
            runAsGroup: 10001
            capabilities:
              drop:
                - ALL
          ports:
            - containerPort: 8080
              name: http
              protocol: TCP
          env:
            - name: PORT
              value: "8080"
            - name: NAMESPACE
              value: "${NAMESPACE}"
            - name: ENVIRONMENT
              value: "${ENVIRONMENT}"
            - name: LOG_LEVEL
              value: "${LOG_LEVEL}"
            - name: SERVICE_NAME
              value: "backend"
            - name: SERVICE_VERSION
              value: "v1.0.0"
            - name: DEPLOYMENT_ENVIRONMENT
              value: "${ENVIRONMENT}"
            - name: OTEL_RESOURCE_ATTRIBUTES
              value: "k8s.cluster.name=${cluster_name}"
            - name: UPTRACE_DSN
              valueFrom:
                secretKeyRef:
                  name: backend-secrets
                  key: uptrace-dsn
            - name: OTEL_EXPORTER_OTLP_HEADERS
              valueFrom:
                secretKeyRef:
                  name: backend-secrets
                  key: uptrace-headers
            - name: JWT_SECRET
              valueFrom:
                secretKeyRef:
                  name: backend-secrets
                  key: jwt-secret
            - name: POSTGRES_USER
              valueFrom:
                secretKeyRef:
                  name: app-postgres-app
                  key: username
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: app-postgres-app
                  key: password
            - name: POSTGRES_HOST
              value: app-postgres-rw
            - name: POSTGRES_DB
              value: app
          livenessProbe:
            httpGet:
              path: /livez
              port: http
            initialDelaySeconds: 15
            periodSeconds: 20
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /readyz
              port: http
            initialDelaySeconds: 5
            periodSeconds: 10
            timeoutSeconds: 3
            failureThreshold: 3
          startupProbe:
            httpGet:
              path: /healthz
              port: http
            initialDelaySeconds: 0
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 30
          resources:
            requests:
              cpu: 10m
              memory: 32Mi
              ephemeral-storage: 64Mi
            limits:
              cpu: 100m
              memory: 128Mi
              ephemeral-storage: 128Mi
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: cache
              mountPath: /home/app/.cache
      volumes:
        - name: tmp
          emptyDir: {}
        - name: cache
          emptyDir:
            sizeLimit: 10Mi
```
4. The Guardrail: Namespace Quotas
To prevent a single environment from consuming the entire cluster’s resources, we use ResourceQuotas. This provides an additional layer of protection by rejecting any deployment that exceeds the namespace’s total memory or CPU budget.
Develop namespace quota
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: develop
spec:
  hard:
    requests.cpu: "1"
    requests.memory: 1Gi
    limits.cpu: "2"
    limits.memory: 2Gi
    pods: "15"
```
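It is worth sanity-checking the headroom this quota leaves. A quick sketch of the arithmetic (illustrative only, using the backend's requests of 10m CPU and 32Mi memory against the quota's 1-core / 1Gi request budget):

```shell
# Illustrative headroom arithmetic for the develop quota.
# Budget: requests.cpu = 1 core (1000m), requests.memory = 1Gi (1024Mi).
CPU_BUDGET_M=1000; MEM_BUDGET_MI=1024
POD_CPU_M=10;      POD_MEM_MI=32

echo "Pods that fit by CPU requests:    $(( CPU_BUDGET_M / POD_CPU_M ))"
echo "Pods that fit by memory requests: $(( MEM_BUDGET_MI / POD_MEM_MI ))"
# Both figures exceed the hard pods: "15" cap, so the pod-count
# quota is the binding constraint for this workload.
```

Either way, `kubectl describe resourcequota compute-quota -n develop` (a standard kubectl command) reports the current USED versus HARD values for each constraint.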
5. Verification: Did I Get It?
Verify your workload’s QoS class and its actual resource consumption:
```shell
# Check the assigned QoS class
kubectl get pod -n develop <pod-name> -o jsonpath='{.status.qosClass}'

# Check actual CPU/memory usage
kubectl top pod -n develop <pod-name>
```
Expected Output: For the backend manifest above, the QoS class is Burstable (its requests are lower than its limits), and kubectl top should report usage that stays within the defined limits.
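To tell an in-place OOMKill apart from a node-pressure eviction, you can inspect two standard Pod status fields (the commands below assume the same develop namespace; substitute your Pod name):

```shell
# An OOMKill is recorded on the container: the last terminated reason
# is typically "OOMKilled" when the container breached its memory limit.
kubectl get pod -n develop <pod-name> \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'

# A node-pressure eviction is recorded at the Pod level instead:
# status.reason is typically "Evicted" when the kubelet reclaimed resources.
kubectl get pod -n develop <pod-name> -o jsonpath='{.status.reason}'
```

The distinction matters: an OOMKilled container is restarted in place by its Pod's restart policy, while an evicted Pod is terminated and must be replaced by its controller.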