Kubernetes Scaling¶
Overview¶
This guide covers scaling strategies in Kubernetes, including horizontal and vertical scaling, autoscaling configurations, and best practices for managing application scale.
Prerequisites¶
- Basic understanding of Kubernetes concepts
- Knowledge of pod and deployment management
- Familiarity with resource metrics
- Understanding of load balancing
Learning Objectives¶
- Understand scaling concepts
- Learn horizontal pod scaling
- Master vertical scaling
- Implement autoscaling
- Configure cluster scaling
Table of Contents¶
- Horizontal Pod Scaling
- Vertical Pod Scaling
- Cluster Autoscaling
- Custom Metrics Scaling
- Best Practices
Horizontal Pod Scaling¶
HorizontalPodAutoscaler¶
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
```
Multiple Metrics¶
When several metrics are listed, the HPA computes a desired replica count for each one and scales to the largest of those values.
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa-multi
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
```
Vertical Pod Scaling¶
VerticalPodAutoscaler¶
The VPA is not part of the core control plane; it ships separately with the kubernetes/autoscaler project and must be installed before the following object has any effect.
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: app-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: app-deployment
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: '*'
      minAllowed:
        cpu: 100m
        memory: 50Mi
      maxAllowed:
        cpu: 1
        memory: 500Mi
```
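If you only want sizing guidance without the VPA evicting or resizing pods, a recommendation-only variant is a low-risk starting point. A minimal sketch using updateMode "Off":

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: app-vpa-recommend
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: app-deployment
  updatePolicy:
    updateMode: "Off"   # produce recommendations only; never evict or resize pods
```

The suggested requests then appear in the object's status, for example via kubectl describe vpa app-vpa-recommend.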
Manual Resource Scaling¶
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
      - name: app
        image: app:1.0
        resources:
          requests:            # what the scheduler reserves for the pod
            memory: "64Mi"
            cpu: "250m"
          limits:              # hard ceiling enforced at runtime
            memory: "128Mi"
            cpu: "500m"
```
Cluster Autoscaling¶
Cluster Autoscaler Configuration¶
apiVersion: "autoscaling.k8s.io/v1"
kind: ClusterAutoscaler
metadata:
name: cluster-autoscaler
spec:
scaleDown:
enabled: true
delayAfterAdd: 10m
delayAfterDelete: 10s
delayAfterFailure: 3m
scaleUp:
enabled: true
delayAfterAdd: 10s
delayAfterDelete: 10s
delayAfterFailure: 3m
minSize: 1
maxSize: 10
targetNodeUtilization: 50
Node Group Configuration¶
apiVersion: "autoscaling.k8s.io/v1"
kind: NodeGroup
metadata:
name: worker-nodes
spec:
minSize: 1
maxSize: 5
machineType: "t2.medium"
labels:
role: worker
taints: []
Custom Metrics Scaling¶
Custom Metrics HPA¶
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa-custom
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: requests_per_second
      target:
        type: AverageValue
        averageValue: 1k
```
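Per-pod custom metrics cover signals that belong to the scaled pods themselves. For signals that live outside the cluster, such as queue depth, autoscaling/v2 also offers External metrics. A minimal sketch, assuming a hypothetical external metric named queue_messages_ready exposed by an external metrics adapter:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa-external
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metric:
        name: queue_messages_ready   # hypothetical metric name
        selector:
          matchLabels:
            queue: worker_tasks      # hypothetical label selector
      target:
        type: AverageValue
        averageValue: "30"
```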
Prometheus Adapter Configuration¶
The Prometheus Adapter has no MetricsAdapter resource; its discovery rules live in a ConfigMap mounted by the adapter Deployment (the name and namespace below are common defaults, adjust them to your installation):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: adapter-config
  namespace: monitoring
data:
  config.yaml: |
    rules:
    - seriesQuery: 'http_requests_total'
      resources:
        overrides:
          kubernetes_namespace: {resource: "namespace"}
          kubernetes_pod_name: {resource: "pod"}
      name:
        matches: "^(.*)_total"
        as: "${1}_per_second"
      metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
```
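For the HPA to see these metrics, the adapter also has to be registered with the aggregation layer as the custom metrics API. A hedged sketch, assuming the adapter is exposed by a Service named custom-metrics-apiserver in the monitoring namespace:

```yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  group: custom.metrics.k8s.io
  version: v1beta1
  service:
    name: custom-metrics-apiserver   # assumed adapter Service name
    namespace: monitoring            # assumed namespace
  insecureSkipTLSVerify: true        # acceptable for a demo; use caBundle in production
  groupPriorityMinimum: 100
  versionPriority: 100
```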
Best Practices¶
Resource Requests and Limits¶
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-pod
spec:
  containers:
  - name: app
    image: app:1.0
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
```
Pod Disruption Budget¶
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: myapp
```
Common Pitfalls¶
- Incorrect metric configuration
- Poor resource planning
- Missing pod disruption budgets
- Inadequate monitoring
- Improper scaling thresholds (see the HPA behavior sketch after this list)
- Resource contention
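Thresholds are not only the target utilization but also how quickly the HPA reacts to it. autoscaling/v2 exposes a behavior section for this; a hedged sketch that dampens scale-down with a stabilization window while letting scale-up double the replica count each minute:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa-behavior
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # wait 5 minutes before acting on lower metrics
      policies:
      - type: Pods
        value: 1                        # remove at most one pod per minute
        periodSeconds: 60
    scaleUp:
      policies:
      - type: Percent
        value: 100                      # allow the replica count to double per minute
        periodSeconds: 60
```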
Implementation Examples¶
Complete Scaling Configuration¶
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"
          limits:
            cpu: "200m"
            memory: "256Mi"
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web
```
Advanced Scaling Strategy¶
Avoid running the VPA in "Auto" mode alongside an HPA that scales the same workload on CPU or memory; the two controllers act on the same signals and will work against each other. Drive the HPA from custom or external metrics instead (see the sketch after this example), or run the VPA in recommendation-only mode.
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-app-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: web-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: '*'
      minAllowed:
        cpu: 50m
        memory: 64Mi
      maxAllowed:
        cpu: 500m
        memory: 512Mi
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-app-pdb
spec:
  minAvailable: "50%"
  selector:
    matchLabels:
      app: web
```
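One way to let this VPA and an HPA coexist on web-app is to keep the HPA off CPU and memory entirely and drive it from a custom metric instead. A hedged sketch, assuming the requests_per_second metric from the Prometheus Adapter section is available for the web pods (the target value is only an example):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-custom-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: requests_per_second   # served by the adapter rules shown earlier
      target:
        type: AverageValue
        averageValue: "100"
```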
Resources for Further Learning¶
- Kubernetes Horizontal Pod Autoscaling
- Vertical Pod Autoscaling
- Cluster Autoscaler
- Custom Metrics API
Practice Exercises¶
- Implement horizontal pod autoscaling
- Configure vertical pod autoscaling
- Set up cluster autoscaling
- Implement custom metrics scaling
- Configure pod disruption budgets