Understanding Kubernetes: Part 31 Vertical Pod Autoscaler (VPA)

📢 If you’ve been following our 2025 Kubernetes series, welcome back! New readers may want to start with Part 30: Horizontal Pod Autoscaler (HPA).

What is a Vertical Pod Autoscaler (VPA)?

A Vertical Pod Autoscaler (VPA) automatically adjusts the CPU and memory requests of a pod’s containers based on observed usage; by default, limits are scaled proportionally so that the original request-to-limit ratio is kept. Unlike the Horizontal Pod Autoscaler (HPA), which changes the number of pods, VPA resizes each pod while leaving the replica count unchanged.
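
To make the terms concrete, the fragment below sketches where requests and limits live in a Deployment. The name webapp matches the target used later in this article; the labels, image, and sizes are assumed values for illustration only.

```yaml
# Illustrative Deployment; image, labels, and sizes are assumed values.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
spec:
  replicas: 3                 # VPA leaves the replica count alone
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
        - name: web
          image: nginx:1.27   # placeholder image
          resources:
            requests:         # the values VPA recommends and rewrites
              cpu: "250m"
              memory: "256Mi"
            limits:           # by default scaled to keep the 1:2 request-to-limit ratio
              cpu: "500m"
              memory: "512Mi"
```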

How VPA Works

Monitors resource usage of running pods.
Recommends or applies changes to CPU and memory requests/limits.
Evicts and recreates pods with updated resource requests if necessary.

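VPA is not part of core Kubernetes and must be installed separately. Its Recommender reads usage data from the Metrics API (typically served by Metrics Server) and can optionally load historical data from Prometheus.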

Use Cases

1. Automatic Resource Optimization

Keeps pod CPU and memory requests close to actual usage, avoiding under- or over-provisioning.

2. Preventing Out-of-Memory (OOM) Kills

Raises memory requests, and proportionally the limits, when usage grows or a container is OOM-killed, so recreated pods are less likely to run out of memory.

3. Reducing Wasted Resources

Lowers resource requests for over-provisioned pods, reducing unnecessary cloud costs.

4. Works Alongside HPA

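HPA handles replica scaling while VPA sizes individual pods. Avoid letting both act on the same metric: if HPA scales on CPU or memory utilization, restrict VPA to the other resource or drive HPA from custom or external metrics.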
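
One way to combine the two without conflict, sketched under the assumption that the workload is the webapp Deployment used later in this article: VPA manages only memory, while HPA scales replicas on CPU utilization. The object names, replica bounds, and the 70% target are illustrative.

```yaml
# VPA controls memory only, so it never touches the CPU requests HPA relies on.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: webapp-vpa-memory       # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        controlledResources: ["memory"]
---
# HPA scales the replica count on CPU utilization.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webapp-hpa              # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  minReplicas: 2                # assumed bounds
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```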

VPA Components

VPA Recommender — Analyzes historical and real-time resource usage and suggests CPU/memory adjustments.
VPA Updater — Applies changes by evicting and recreating pods when needed.
VPA Admission Controller — Adjusts resource requests when a new pod is created.
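
Because the Updater removes pods through the eviction API, a PodDisruptionBudget can limit how many replicas it takes down at once. A minimal sketch, assuming the target pods carry the label app: webapp (a hypothetical label not given in this article) and that at least two replicas should stay available:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: webapp-pdb        # hypothetical name
spec:
  minAvailable: 2         # evictions that would drop below this are refused
  selector:
    matchLabels:
      app: webapp         # assumed pod label
```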

VPA Modes

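VPA supports four update modes, set through updatePolicy.updateMode:

| Mode | Description |
| --- | --- |
| Off | Only computes recommendations; never changes pods. |
| Initial | Sets resource requests when pods are created but does not modify running ones. |
| Recreate | Sets requests at pod creation and evicts running pods whose requests drift too far from the recommendation. |
| Auto | Applies recommendations at pod creation and to running pods; at the time of writing this behaves like Recreate. |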
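
A cautious first step is to run VPA in recommendation-only mode and review its suggestions before allowing any evictions. A minimal sketch, assuming a Deployment named webapp; the VPA name is illustrative:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: webapp-vpa-dryrun   # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  updatePolicy:
    updateMode: "Off"       # compute recommendations only; pods are never evicted
```

Once the recommendations look reasonable, the mode can be switched to "Initial" or "Auto".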

VPA Syntax

A Vertical Pod Autoscaler (VPA) is created using the following YAML configuration:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: webapp-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  updatePolicy:
    updateMode: "Auto"  # Options: "Off", "Initial", "Recreate", "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:
          cpu: "100m"
          memory: "128Mi"
        maxAllowed:
          cpu: "2"
          memory: "4Gi"
        controlledResources: ["cpu", "memory"]

Explanation:

targetRef – Specifies the deployment (webapp) for VPA.
updatePolicy.updateMode – Set to Auto, meaning VPA will automatically adjust resources.
resourcePolicy.containerPolicies – Sets lower and upper bounds (minAllowed/maxAllowed) on the CPU and memory requests VPA may recommend for all containers ("*").
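
Policies can also be set per container instead of with the wildcard. The sketch below assumes a pod with two containers named web and log-shipper (both hypothetical): the sidecar is excluded from autoscaling, and for web only the requests are changed while limits stay as declared.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: webapp-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: "log-shipper"     # assumed sidecar name
        mode: "Off"                      # VPA ignores this container
      - containerName: "web"             # assumed main container name
        controlledValues: RequestsOnly   # leave limits untouched
        minAllowed:
          cpu: "100m"
          memory: "128Mi"
        maxAllowed:
          cpu: "2"
          memory: "4Gi"
```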

Applying VPA

To apply the VPA configuration:

kubectl apply -f vpa.yaml

To check VPA recommendations:

kubectl get vpa

To see the detailed recommendation for a specific VPA:

kubectl describe vpa webapp-vpa
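
The recommendation itself is stored in the object's status and can also be read with kubectl get vpa webapp-vpa -o yaml. The fragment below sketches the shape of that field; the container name and every number are made-up placeholders, not output from a real cluster.

```yaml
# Shape of the VPA status; all values are placeholders.
status:
  recommendation:
    containerRecommendations:
      - containerName: web      # assumed container name
        lowerBound:             # below this, the container is considered under-provisioned
          cpu: 150m
          memory: 200Mi
        target:                 # the requests VPA would apply
          cpu: 250m
          memory: 300Mi
        uncappedTarget:         # recommendation before minAllowed/maxAllowed are applied
          cpu: 250m
          memory: 300Mi
        upperBound:             # above this, the container is considered over-provisioned
          cpu: "1"
          memory: 1Gi
```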

Removing VPA

To delete a VPA:

kubectl delete vpa webapp-vpa

Conclusion

The Vertical Pod Autoscaler (VPA) keeps resource requests in line with actual usage, reducing both the performance problems caused by under-provisioning and the cost of over-provisioning. It complements HPA as long as the two do not scale on the same CPU or memory metric.

In My Previous Role

As a Senior DevOps Engineer, I leveraged VPA for efficient resource management:

Preventing Memory Issues: Configured VPA to automatically adjust memory requests for high-load services, preventing OOM kills.
Optimizing CPU Usage: Used VPA recommendations to adjust CPU requests for services, reducing wasted resources.
Cost Reduction: Applied VPA to lower over-allocated resource requests, leading to significant cloud cost savings.



Last updated 8 months ago