Kubernetes Best Practices
Overview
This guide covers best practices for running Kubernetes in production, including security, scalability, reliability, and operational excellence.
Prerequisites
- Basic understanding of Kubernetes concepts
- Knowledge of container orchestration
- Familiarity with DevOps practices
- Understanding of cloud-native principles
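Learning Objectives
- Understand the main categories of Kubernetes production best practices
- Harden workloads with security contexts, network policies, and RBAC
- Scale workloads with autoscaling and resource requests and limits
- Improve reliability with health probes and disruption budgets
- Apply operational controls such as quotas, limit ranges, and labels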
Table of Contents
- Security Best Practices
- Scalability Best Practices
- Reliability Best Practices
- Operational Best Practices
- Resource Management
- Implementation Examples
- Best Practices Checklist
- Practice Exercises
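Security Best Practices
Pod Security Context
apiVersion: v1
kind: Pod
metadata:
  name: secure-pod
spec:
  # Pod-level settings apply to every container in the pod
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
  containers:
  - name: secure-container
    # The stock nginx image writes to /var/cache/nginx and /var/run, so it
    # typically needs an unprivileged variant or writable emptyDir mounts
    # to start under these settings
    image: nginx
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop:
        - ALL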
Network Policies
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
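A default-deny policy blocks all traffic for the selected pods, including DNS lookups, so it is usually paired with narrow allow rules. The sketch below shows one possible pairing. It assumes cluster DNS runs in the kube-system namespace, that namespaces carry the automatic kubernetes.io/metadata.name label, and that the client pods use a hypothetical role: frontend label; adjust all three to the actual cluster.

```yaml
# Allow every pod in this namespace to reach cluster DNS (assumed to be in kube-system)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
---
# Allow ingress to app: web pods only from pods labeled role: frontend (hypothetical label)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-ingress
spec:
  podSelector:
    matchLabels:
      app: web
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: frontend
    ports:
    - protocol: TCP
      port: 8080
```

Network policies are additive: a connection is permitted if any policy selecting the pod allows it, so these rules open only the listed paths while the default-deny policy keeps everything else closed. Enforcement depends on the cluster's network plugin supporting NetworkPolicy.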
RBAC Configuration
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: minimal-access
rules:
# Read-only access to pods in the core ("") API group
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: minimal-access-binding
subjects:
- kind: ServiceAccount
  name: app-service-account
  # namespace is required for ServiceAccount subjects; "default" is a
  # placeholder for the namespace that holds the account
  namespace: default
roleRef:
  kind: Role
  name: minimal-access
  apiGroup: rbac.authorization.k8s.io
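The binding above refers to a ServiceAccount that the guide does not define. A minimal sketch of the missing pieces follows, assuming the Role, RoleBinding, and account all live in the default namespace; the pod name rbac-demo-pod is hypothetical.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-service-account
  namespace: default        # assumption: same namespace as the Role and RoleBinding
---
apiVersion: v1
kind: Pod
metadata:
  name: rbac-demo-pod       # hypothetical name, for illustration only
  namespace: default
spec:
  # The pod authenticates to the API server as this account,
  # so it receives only the permissions granted by minimal-access
  serviceAccountName: app-service-account
  containers:
  - name: app
    image: nginx
```

One way to check the result is `kubectl auth can-i list pods --as=system:serviceaccount:default:app-service-account`, which should answer yes for pods and no for other resources.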
Scalability Best Practices
Horizontal Pod Autoscaling
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
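CPU utilization is measured against each container's CPU request, so the target Deployment needs requests set for this autoscaler to work. The autoscaling/v2 API also accepts an optional behavior section that limits how quickly replicas are added or removed. The values below are illustrative assumptions, not tuned recommendations.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      # Wait 5 minutes of consistently lower load before removing pods
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 50            # remove at most 50% of current replicas per minute
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Pods
        value: 4             # add at most 4 pods per minute
        periodSeconds: 60
```

A longer scale-down window trades some idle capacity for fewer replica oscillations when load is bursty.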
Resource Requests and Limits
apiVersion: v1
kind: Pod
metadata:
  name: resource-pod
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
Reliability Best Practices
Liveness and Readiness Probes
apiVersion: v1
kind: Pod
metadata:
  name: probe-pod
spec:
  containers:
  - name: app
    image: nginx
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 3
      periodSeconds: 3
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 5
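For containers that start slowly, a short initialDelaySeconds on the liveness probe can cause restarts before the application is up. A startup probe is one way to handle this: liveness and readiness checks are held back until it succeeds. The sketch below reuses the /healthz endpoint from the example above; the pod name and the thresholds are assumptions chosen for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: startup-probe-pod    # hypothetical name
spec:
  containers:
  - name: app
    image: nginx
    startupProbe:
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 5
      failureThreshold: 30   # allows up to 30 * 5 = 150 seconds for startup
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 3       # begins only after the startup probe has succeeded
```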
Pod Disruption Budget
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web
Operational Best Practices
Resource Quotas
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 4Gi
    limits.cpu: "8"
    limits.memory: 8Gi
Limit Ranges
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
spec:
  limits:
  - default:
      memory: 256Mi
      cpu: 500m
    defaultRequest:
      memory: 128Mi
      cpu: 250m
    type: Container
Resource Management
Namespace Configuration
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    name: production
    environment: prod
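Namespace labels can also carry policy. On clusters where the built-in Pod Security Admission controller is enabled, the labels below ask the API server to reject pods that violate the baseline profile and to warn about and audit anything that falls short of the stricter restricted profile. The choice of levels is an assumption for illustration; a common approach is to start with warn and audit only, then tighten enforce once workloads comply.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    name: production
    environment: prod
    # Reject pods that violate the baseline profile
    pod-security.kubernetes.io/enforce: baseline
    # Surface, but do not block, violations of the restricted profile
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
```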
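Resource Labels and Annotations
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  labels:
    app: web
    environment: production
    team: frontend
  annotations:
    description: "Web application deployment"
    contact: "team@example.com"
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    # The original example stopped at the template labels; a pod spec is
    # required for a valid Deployment, so a placeholder container is shown
    spec:
      containers:
      - name: web
        image: nginx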
Implementation Examples
Complete Production Setup
apiVersion: apps/v1
kind: Deployment
metadata:
  name: production-app
  labels:
    app: production
    environment: prod
spec:
  replicas: 3
  selector:
    matchLabels:
      app: production
  template:
    metadata:
      labels:
        app: production
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
      containers:
      - name: app
        image: nginx:1.21
        ports:
        - containerPort: 8080
        resources:
          requests:
            memory: "128Mi"
            cpu: "250m"
          limits:
            memory: "256Mi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          capabilities:
            drop:
            - ALL
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: production-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: production
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: production-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: production-app
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
Monitoring and Logging Setup
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: app-monitor
spec:
  selector:
    matchLabels:
      app: production
  endpoints:
  - port: metrics
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
data:
  fluent.conf: |
    <source>
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      tag kubernetes.*
      read_from_head true
      <parse>
        @type json
      </parse>
    </source>
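A ServiceMonitor is a Prometheus Operator resource that selects Services, not pods, and its endpoints entry refers to a Service port by name. For the app-monitor definition above to find a scrape target, a Service labeled app: production must expose a port named metrics in the same namespace. A minimal sketch follows; the Service name reuse, the port number 9090, and the assumption that the application serves metrics there are all illustrative.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: production-app
  labels:
    app: production          # matched by the ServiceMonitor selector
spec:
  selector:
    app: production          # routes to the Deployment's pods
  ports:
  - name: http
    port: 80
    targetPort: 8080
  - name: metrics            # matched by "port: metrics" in the ServiceMonitor
    port: 9090               # hypothetical metrics port
    targetPort: 9090
```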
Best Practices Checklist
Security
- Enable RBAC
- Use Pod Security Contexts
- Implement Network Policies
- Secure Secrets Management
- Regular Security Updates
- Image Scanning
- Audit Logging
Scalability
- Use HPA
- Set Resource Requests/Limits
- Implement Load Balancing
- Use Node Autoscaling
- Optimize Performance
- Cache Effectively
- Use CDN
Reliability
- Use Health Checks
- Implement PodDisruptionBudgets
- Set Up Monitoring
- Configure Logging
- Backup Critical Data
- Disaster Recovery Plan
- High Availability Setup
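For the high availability item, one common building block is spreading replicas across failure domains so that a single zone outage does not remove every pod. The sketch below adds a topology spread constraint to a Deployment like the web-app example earlier in this guide. It assumes nodes carry the standard topology.kubernetes.io/zone label, and the container is a placeholder.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      topologySpreadConstraints:
      - maxSkew: 1                                # zones may differ by at most one pod
        topologyKey: topology.kubernetes.io/zone  # assumes nodes are labeled with their zone
        whenUnsatisfiable: ScheduleAnyway         # prefer spreading, but never block scheduling
        labelSelector:
          matchLabels:
            app: web
      containers:
      - name: web
        image: nginx
```

Setting whenUnsatisfiable to DoNotSchedule makes the spread a hard requirement, at the cost of pods staying pending when a zone lacks capacity.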
Operations
- Use Resource Quotas
- Implement Limit Ranges
- Label Resources
- Document Everything
- Automate Deployments
- Monitor Costs
- Regular Maintenance
Practice Exercises
- Implement security best practices
- Configure scalability features
- Set up reliability measures
- Establish operational procedures
- Create production checklist