Understanding Kubernetes: Part 31 Vertical Pod Autoscaler (VPA)
📢 If you’ve been following our Kubernetes series 2025, welcome back! For new readers, check out Horizontal Pod Autoscaler (HPA)
A Vertical Pod Autoscaler (VPA) automatically adjusts the CPU and memory requests of a pod's containers (scaling limits proportionally) based on observed usage. Unlike the Horizontal Pod Autoscaler (HPA), which scales the number of pods, VPA resizes the pods themselves while keeping the replica count unchanged.
Monitors resource usage of running pods.
Recommends or applies changes to CPU and memory requests/limits.
Evicts and recreates pods with updated resource requests if necessary.
VPA requires the Metrics Server to collect usage data; the Recommender can optionally load historical usage from an external monitoring system such as Prometheus.
1. Automatic Resource Optimization
Keeps CPU and memory requests close to what pods actually use, reducing both under- and over-provisioning.
2. Preventing Out-of-Memory (OOM) Kills
Raises memory requests (and proportionally scaled limits) as observed usage grows, reducing pod crashes caused by memory starvation.
3. Reducing Wasted Resources
Lowers resource requests for over-provisioned pods, reducing unnecessary cloud costs.
4. Works Alongside HPA
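HPA handles replica scaling, while VPA sizes the individual pods. The two should not act on the same CPU or memory metric for the same workload; combine them only when HPA scales on a different signal (for example a custom metric) or when VPA controls a resource that HPA does not use.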
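One way to keep the two autoscalers from competing is to restrict VPA to a single resource. The manifest below is a minimal sketch, assuming a hypothetical Deployment named `webapp` whose HPA scales on CPU utilization; the VPA name is also illustrative. It uses the `controlledResources` field so that VPA manages memory only.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: webapp-vpa-memory-only   # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp                 # assumed Deployment, also targeted by an HPA on CPU
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        # VPA adjusts memory only, leaving CPU to the HPA
        controlledResources: ["memory"]
```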
VPA Recommender — Analyzes historical and real-time resource usage and suggests CPU/memory adjustments.
VPA Updater — Applies changes by evicting and recreating pods when needed.
VPA Admission Controller — Adjusts resource requests when a new pod is created.
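VPA's update policy supports the following modes:

| Mode | Description |
| --- | --- |
| Off | Only computes recommendations without applying them. |
| Initial | Sets resource requests when pods are created but does not modify running ones. |
| Recreate | Sets resource requests at creation and evicts running pods to apply updated requests. |
| Auto | Automatically updates pod resources, evicting and recreating the pod if needed. Currently behaves like Recreate. |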
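A low-risk way to start is recommendation-only mode. The following is a minimal sketch, assuming a Deployment named `webapp` and a hypothetical VPA name; with `updateMode: "Off"` the Recommender still publishes suggestions, but no pod is evicted.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: webapp-vpa-recommend-only   # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp                    # assumed Deployment name
  updatePolicy:
    # Compute recommendations only; running pods are left untouched
    updateMode: "Off"
```

Switching the value to `"Initial"` or `"Auto"` later changes only how recommendations are applied, not how they are computed.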
A Vertical Pod Autoscaler (VPA) is created using the following YAML configuration:
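A minimal sketch of such a manifest, assuming the VPA object is named `webapp-vpa` and using illustrative minimum and maximum values (neither the name nor the values are specified in this article):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: webapp-vpa          # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp            # the Deployment VPA manages
  updatePolicy:
    updateMode: "Auto"      # apply recommendations automatically
  resourcePolicy:
    containerPolicies:
      - containerName: "*"  # applies to every container in the pod
        minAllowed:         # illustrative lower bound for recommendations
          cpu: 100m
          memory: 128Mi
        maxAllowed:         # illustrative upper bound for recommendations
          cpu: "1"
          memory: 1Gi
```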
`targetRef` – Specifies the workload VPA manages, here the deployment `webapp`.
`updatePolicy.updateMode` – Set to `Auto`, meaning VPA will automatically adjust resources.
`resourcePolicy.containerPolicies` – Defines the minimum and maximum CPU and memory that VPA may recommend for all containers (`"*"`).
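To apply the VPA configuration, run `kubectl apply -f <manifest-file>`.
To check VPA recommendations, run `kubectl get vpa`.
To describe recommendations in detail, run `kubectl describe vpa <vpa-name>`.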
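The recommendation itself is stored in the object's status. As a rough sketch of its shape when the object is printed as YAML (for example with `kubectl get vpa <vpa-name> -o yaml`), with a hypothetical container name and placeholder values rather than real measurements:

```yaml
status:
  recommendation:
    containerRecommendations:
      - containerName: webapp   # hypothetical container name
        lowerBound:             # below this, the container is likely under-provisioned
          cpu: 100m
          memory: 128Mi
        target:                 # the request VPA would set
          cpu: 250m
          memory: 256Mi
        uncappedTarget:         # target before minAllowed/maxAllowed are applied
          cpu: 250m
          memory: 256Mi
        upperBound:             # above this, the container is likely over-provisioned
          cpu: "1"
          memory: 1Gi
```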
To delete a VPA, run `kubectl delete vpa <vpa-name>`.
The Vertical Pod Autoscaler (VPA) helps keep resource requests in line with actual usage and prevents performance issues caused by over- or under-provisioning. It can complement HPA, provided the two do not act on the same CPU or memory metric, keeping CPU and memory usage efficient and pods stable.
As a Senior DevOps Engineer, I leveraged VPA for efficient resource management:
Preventing Memory Issues: Configured VPA to automatically adjust memory requests for high-load services, preventing OOM kills.
Optimizing CPU Usage: Used VPA recommendations to adjust CPU requests for services, reducing wasted resources.
Cost Reduction: Applied VPA to lower over-allocated resource requests, leading to significant cloud cost savings.
Take your Kubernetes journey to the next level with the Master Kubernetes: Zero to Hero course! 🌟 Whether you’re a beginner or aiming to sharpen your skills, this hands-on course covers:
✅ Kubernetes Basics — Grasp essential concepts like nodes, pods, and services.
✅ Advanced Scaling — Learn HPA, VPA, and resource optimization.
✅ Monitoring Tools — Master Prometheus, Grafana, and AlertManager.
✅ Real-World Scenarios — Build production-ready Kubernetes setups.
Don’t miss your chance to become a Kubernetes expert! 💻✨
🔥 Start Learning Now: [Join the Master Kubernetes Course + FREE Access to Terraform Course]()