Chapter 07: Security Context & Pod Hardening
Learning Objectives
By the end of this chapter, you will be able to:
- Configure a non-root user, a read-only root filesystem, dropped capabilities, and a seccomp baseline
- Debug permission failures without escalating to privileged mode
- Diff an insecure manifest against a golden security baseline
Start with the video for the concept overview, then work through each lesson section.
Container defaults are not production-safe. If an attacker gains shell access to a Pod, the security context is one of the main controls that limits what they can do next and makes it much harder to take over the node. In this chapter, we implement baseline hardening to enforce the principle of least privilege.
1. The Problem: The “Privileged Escape”
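A container is compromised through a web vulnerability. Without a hardened security context, an attacker who is root inside the container has a far easier path to the host node, for example through excess Linux capabilities, a writable filesystem for staging tools, or a kernel exploit. From the node, they can reach other Pods, mounted Secrets, and credentials for the cluster API.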
2. The Concept: Least Privilege at Runtime
We strip away every capability the container doesn’t absolutely need.
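- Non-Root: No process should ever run as UID 0. With runAsNonRoot: true, the kubelet refuses to start a container that would run as root.
- ReadOnly Filesystem: Make the root filesystem immutable so attackers cannot drop tools or malware into it or modify existing binaries. Writable paths must be mounted explicitly.
- Privilege Escalation: Block any attempt by a process to gain more permissions than it started with, for example through setuid binaries.
- Dropped Capabilities: Drop ALL Linux capabilities and add back only those the application needs (none, in our case).
- Seccomp: Apply the container runtime's default seccomp profile (RuntimeDefault) to filter system calls the application is unlikely to need.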
3. The Code: The Hardened Deployment
Our sre/ repo defines a strict security baseline for every application. The securityContext block is our contract for secure runtime execution.
Hardened backend deployment
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
  labels:
    app: backend
    app.kubernetes.io/name: backend
    app.kubernetes.io/component: api
spec:
  replicas: 1
  revisionHistoryLimit: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 1
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
        app.kubernetes.io/name: backend
        app.kubernetes.io/component: api
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      imagePullSecrets:
        - name: ghcr-credentials-docker
      securityContext:
        runAsNonRoot: true
        runAsUser: 10001
        runAsGroup: 10001
        fsGroup: 10001
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: backend
          image: backend:latest
          imagePullPolicy: IfNotPresent
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            runAsUser: 10001
            runAsGroup: 10001
            capabilities:
              drop:
                - ALL
          ports:
            - containerPort: 8080
              name: http
              protocol: TCP
          env:
            - name: PORT
              value: "8080"
            - name: NAMESPACE
              value: "${NAMESPACE}"
            - name: ENVIRONMENT
              value: "${ENVIRONMENT}"
            - name: LOG_LEVEL
              value: "${LOG_LEVEL}"
            - name: SERVICE_NAME
              value: "backend"
            - name: SERVICE_VERSION
              value: "v1.0.0"
            - name: DEPLOYMENT_ENVIRONMENT
              value: "${ENVIRONMENT}"
            - name: OTEL_RESOURCE_ATTRIBUTES
              value: "k8s.cluster.name=${cluster_name}"
            - name: UPTRACE_DSN
              valueFrom:
                secretKeyRef:
                  name: backend-secrets
                  key: uptrace-dsn
            - name: OTEL_EXPORTER_OTLP_HEADERS
              valueFrom:
                secretKeyRef:
                  name: backend-secrets
                  key: uptrace-headers
            - name: JWT_SECRET
              valueFrom:
                secretKeyRef:
                  name: backend-secrets
                  key: jwt-secret
            - name: POSTGRES_USER
              valueFrom:
                secretKeyRef:
                  name: app-postgres-app
                  key: username
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: app-postgres-app
                  key: password
            - name: POSTGRES_HOST
              value: app-postgres-rw
            - name: POSTGRES_DB
              value: app
          livenessProbe:
            httpGet:
              path: /livez
              port: http
            initialDelaySeconds: 15
            periodSeconds: 20
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /readyz
              port: http
            initialDelaySeconds: 5
            periodSeconds: 10
            timeoutSeconds: 3
            failureThreshold: 3
          startupProbe:
            httpGet:
              path: /healthz
              port: http
            initialDelaySeconds: 0
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 30
          resources:
            requests:
              cpu: 10m
              memory: 32Mi
              ephemeral-storage: 64Mi
            limits:
              cpu: 100m
              memory: 128Mi
              ephemeral-storage: 128Mi
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: cache
              mountPath: /home/app/.cache
      volumes:
        - name: tmp
          emptyDir: {}
        - name: cache
          emptyDir:
            sizeLimit: 10Mi
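One way to compare another manifest against this baseline is a plain text check for the settings that matter. The sketch below is an illustration under stated assumptions, not part of the sre/ repo: the function name check_baseline is hypothetical, and because it only searches for literal strings it cannot tell which container a setting belongs to. A YAML-aware tool or an admission policy would be more robust.

```shell
# check_baseline FILE
# Prints every hardening setting from the baseline that does not appear
# in FILE, and returns non-zero if any is missing.
# Hypothetical helper: a crude text match, not a YAML-aware validator.
check_baseline() {
  file=$1
  missing=0
  for setting in \
    'runAsNonRoot: true' \
    'allowPrivilegeEscalation: false' \
    'readOnlyRootFilesystem: true' \
    'type: RuntimeDefault' \
    '- ALL'
  do
    # -F: fixed string, -q: quiet, -e: the next argument is the pattern,
    # even when it starts with a dash (as "- ALL" does).
    if ! grep -qF -e "$setting" "$file"; then
      echo "missing: $setting"
      missing=1
    fi
  done
  return "$missing"
}

# Usage (the file name is an assumption):
#   check_baseline backend-deployment.yaml
```

Run against an insecure manifest, each "missing:" line names one field to restore before the manifest matches the baseline.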
4. The Guardrail: ReadOnly with Exceptions
One of the most effective protections is readOnlyRootFilesystem: true. However, apps often need to write temporary files, caches, or logs. Instead of making the whole filesystem writable, we mount small, dedicated emptyDir volumes at the specific paths that need it. In the Deployment above, those paths are /tmp and /home/app/.cache.
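When an application fails under a read-only root filesystem, the usual first step is to find the path it tried to write, rather than to relax the security context. A minimal sketch, assuming the application logs the operating system's error text (the function name find_write_failures is hypothetical):

```shell
# find_write_failures
# Reads log lines on stdin and prints those that look like a blocked write.
# "Read-only file system" is the Linux EROFS error text and
# "Permission denied" is the EACCES text; applications may phrase
# these differently in their own logs.
find_write_failures() {
  # -i: case-insensitive, -E: extended regex so "|" means "or".
  # "|| true" keeps the exit status 0 when nothing matches.
  grep -iE 'read-only file system|permission denied' || true
}

# Usage against the cluster from this chapter (requires kubectl access):
#   kubectl -n develop logs deploy/backend | find_write_failures
```

Each path reported as read-only is a candidate for its own small emptyDir mount. A "Permission denied" on an already mounted volume usually points at ownership (runAsUser, runAsGroup, fsGroup) rather than at the read-only root.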
Namespace security baseline
---
apiVersion: v1
kind: Namespace
metadata:
  name: auth
  labels:
    name: auth
    managed-by: flux
---
apiVersion: v1
kind: Namespace
metadata:
  name: cert-manager
  labels:
    name: cert-manager
    managed-by: flux
---
apiVersion: v1
kind: Namespace
metadata:
  name: develop
  labels:
    name: develop
    environment: development
    managed-by: flux
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest
---
apiVersion: v1
kind: Namespace
metadata:
  name: staging
  labels:
    name: staging
    environment: staging
    managed-by: flux
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest
---
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    name: production
    environment: production
    managed-by: flux
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest
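The pod-security.kubernetes.io/enforce: restricted label tells the built-in Pod Security Admission controller to reject Pods that violate the restricted profile, and the warn label surfaces the same violations as warnings to the client. In the manifest above, auth and cert-manager carry no such labels, so they are not held to this baseline; the source does not say whether that omission is intentional. The sketch below lists such namespaces from a manifest file. It assumes conventional two-space YAML indentation, and the function name list_unenforced_namespaces and the file name namespaces.yaml are hypothetical.

```shell
# list_unenforced_namespaces FILE
# Prints the metadata.name of every document in a multi-document manifest
# that has no pod-security.kubernetes.io/enforce label.
# Hypothetical helper; assumes "metadata:" at column 0 and its "name:" key
# indented by exactly two spaces.
list_unenforced_namespaces() {
  awk '
    # Report the document collected so far, then reset the state.
    function emit() {
      if (name != "" && !enforced) print name
      name = ""; enforced = 0; in_meta = 0
    }
    /^---/ { emit(); next }
    /^metadata:/ { in_meta = 1; next }
    # Two-space indent: metadata.name, not the deeper "name" label.
    in_meta && /^  name:/ { name = $2 }
    # "enforce:" does not match "enforce-version:".
    /pod-security\.kubernetes\.io\/enforce:/ { enforced = 1 }
    END { emit() }
  ' "$1"
}

# Usage (the file name is an assumption):
#   list_unenforced_namespaces namespaces.yaml
#
# To preview what enforcing would do to a live namespace, a server-side
# dry run reports existing Pods that would violate the profile:
#   kubectl label --dry-run=server --overwrite ns auth \
#     pod-security.kubernetes.io/enforce=restricted
```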
5. Verification: Did I Get It?
Verify that your Pod is truly hardened:
# Exec into the backend and check your User ID
kubectl -n develop exec -it deploy/backend -- id
# Try to create a file in a forbidden location
kubectl -n develop exec -it deploy/backend -- touch /bin/malware
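Expected Output: the first command should report uid=10001 and gid=10001, matching runAsUser and runAsGroup. The second should fail with an error such as touch: /bin/malware: Read-only file system (the exact wording depends on the image's touch implementation). This shows that the read-only root filesystem is in effect.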
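The id and touch checks cover the user and the root filesystem. The remaining settings can be read from the kernel's view of the process. A minimal sketch, assuming a Linux /proc filesystem and an image that ships grep (distroless images will not); the function name show_hardening_status is hypothetical:

```shell
# show_hardening_status [STATUS_FILE]
# Prints the lines of a /proc/<pid>/status file that reflect the security
# context. Defaults to the calling process.
#   Uid:        real, effective, saved and filesystem UID (0 means root)
#   CapEff:     effective capability bitmask (all zeros = none)
#   NoNewPrivs: 1 when the process can no longer gain privileges
#   Seccomp:    0 = disabled, 1 = strict, 2 = filter
show_hardening_status() {
  grep -E '^(Uid|CapEff|NoNewPrivs|Seccomp):' "${1:-/proc/self/status}"
}

# Inside the Pod from this chapter, PID 1 is normally the application:
#   kubectl -n develop exec deploy/backend -- \
#     grep -E '^(Uid|CapEff|NoNewPrivs|Seccomp):' /proc/1/status
```

With the hardened Deployment, one would expect Uid to show 10001, CapEff to be all zeros, NoNewPrivs to be 1, and Seccomp to be 2, although which fields appear depends on the node's kernel version.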