Best Practices and Limitations

This document describes best practices for the Workload Autoscaler and the limitations of the InPlace Update Mode.

InPlace Update Mode Limitations

Cluster version requirement

Your Kubernetes cluster version must be 1.33 or higher.

Memory Limits: Decrease vs Increase

🔻 Decreasing Memory Limit

Not supported in InPlace. InPlace resizing does not allow lowering the Memory limit.
Fallback behavior: When a new recommendation would reduce an existing Pod’s Memory limit, the Workload Autoscaler automatically falls back to ReCreate mode and recreates the Pod.
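The fallback rule above can be sketched as a small decision function. This is an illustrative sketch only: `parse_memory` and `update_mode_for_memory` are hypothetical helper names, not part of the Workload Autoscaler, and the quantity parser handles only a common subset of Kubernetes memory suffixes.

```python
# Subset of the suffixes Kubernetes accepts for memory quantities.
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity such as '512Mi' or '2G' to bytes."""
    # Try two-character suffixes first so 'Mi' is not mistaken for 'M'.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * _SUFFIXES[suffix])
    return int(quantity)  # no suffix: plain byte count


def update_mode_for_memory(current_limit: str, recommended_limit: str) -> str:
    """Return 'ReCreate' if the recommendation lowers the Memory limit."""
    if parse_memory(recommended_limit) < parse_memory(current_limit):
        return "ReCreate"  # a decrease is not applied InPlace
    return "InPlace"
```

For example, `update_mode_for_memory("1Gi", "512Mi")` yields `"ReCreate"`, while an increase or an unchanged limit stays on `"InPlace"`.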

🔺 Increasing Memory Limit

May require container restarts. Some workloads (e.g., Java applications) cannot dynamically adapt to Memory limit changes.
Best practice: In the container’s ResizePolicy, set the restart policy for the Memory resource to RestartContainer. Any increase of the Memory limit then restarts the container automatically so that the new limit takes effect.

Notes

By default, the Workload Autoscaler sets the ResizePolicy for all resources of all containers to NotRequired.
If you have manually configured a container’s ResizePolicy for any resource, the Workload Autoscaler will not overwrite it. For details, see the example in the Kubernetes documentation.
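As a rough illustration of the ResizePolicy setting described above, the following sketch builds the container fragment of a Pod template as a Python dictionary and prints it as JSON. The field names `resizePolicy`, `resourceName`, and `restartPolicy` are the Kubernetes container fields. The container name, image, and resource values are placeholders, and `restart_policy` is a hypothetical helper.

```python
import json

# Container fragment of a Pod template. Name, image and quantities are
# placeholders chosen for illustration.
container = {
    "name": "app",
    "image": "registry.example.com/java-app:1.0",
    "resizePolicy": [
        # CPU changes are applied without restarting the container.
        {"resourceName": "cpu", "restartPolicy": "NotRequired"},
        # Memory changes restart the container so the runtime sees the new limit.
        {"resourceName": "memory", "restartPolicy": "RestartContainer"},
    ],
    "resources": {
        "requests": {"cpu": "500m", "memory": "1Gi"},
        "limits": {"memory": "2Gi"},
    },
}


def restart_policy(container: dict, resource: str) -> str:
    """Return the configured restart policy, defaulting to NotRequired."""
    for rule in container.get("resizePolicy", []):
        if rule["resourceName"] == resource:
            return rule["restartPolicy"]
    return "NotRequired"


print(json.dumps(container, indent=2))
```

Because a manually configured policy is not overwritten, setting only the memory entry is enough for workloads that cannot adapt to a new Memory limit at runtime.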

Pod QoS class must not change

A Pod’s QoS class is determined at creation time (one of Guaranteed, Burstable, or BestEffort). InPlace updates must not cause a change in QoS class:

BestEffort Pods (no CPU/Memory requests or limits at startup): You cannot add any CPU/Memory requests or limits, because adding requests would convert the Pod to Burstable, which is not allowed in InPlace updates. Therefore, BestEffort Pods cannot use in-place vertical scaling. If you need scaling, specify requests at creation time so the Pod is at least Burstable.
Guaranteed Pods (for every container, CPU and Memory requests equal limits): After InPlace adjustments, each container must still satisfy requests == limits. To increase or decrease CPU/Memory, you must update both request and limit to the same value. For example, going from 2 CPU to 3 CPU requires setting both request and limit to 3. You cannot change only one of them, or the Pod will no longer be Guaranteed.
Burstable Pods (at least one container has a CPU or Memory request or limit, but the Pod does not meet the Guaranteed criteria): You may adjust CPU/Memory, but the update must not make CPU and Memory requests equal to their limits in every container, because the Pod would then become Guaranteed. The update also must not clear all requests and limits, which would turn the Pod into BestEffort. In short, the Pod must keep its original QoS class.

If an InPlace operation violates any of the above QoS rules, the Workload Autoscaler falls back to ReCreate mode and explicitly recreates (re-schedules) the target Pod.
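The QoS rules above can be checked before applying a resize. The sketch below is a simplified classifier, not the Workload Autoscaler's implementation: `qos_class` and `preserves_qos` are hypothetical names, only CPU and Memory are considered, and quantities are compared as literal strings (so `"1"` and `"1000m"` would count as different).

```python
RESOURCES = ("cpu", "memory")


def qos_class(containers: list) -> str:
    """Classify a Pod from its containers' resources (simplified rules)."""
    any_set = False
    all_guaranteed = True
    for container in containers:
        res = container.get("resources", {})
        limits = res.get("limits", {})
        # Kubernetes defaults a missing request to the limit.
        requests = {**limits, **res.get("requests", {})}
        for name in RESOURCES:
            if name in requests or name in limits:
                any_set = True
            # Guaranteed needs a limit and an equal request for each resource.
            if name not in limits or requests.get(name) != limits[name]:
                all_guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if all_guaranteed else "Burstable"


def preserves_qos(current: list, proposed: list) -> bool:
    """True if the proposed resources keep the Pod's original QoS class."""
    return qos_class(current) == qos_class(proposed)
```

Under these assumptions, raising both request and limit of a Guaranteed container from 2 CPU to 3 CPU passes the check, whereas raising only the limit makes the Pod Burstable and fails it.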

Note: Such fallback events are expected to occur only when a Workload is first configured with an AutoscalingPolicy or when certain related configurations of the AutoscalingPolicy are modified. They should not occur during normal operation.

`PodResizePending` during scale-up

When scaling up a Pod, you may see the PodResizePending condition on the Pod if the node hosting it does not have enough remaining resources to satisfy the new requests.

In this scenario, the Workload Autoscaler will fall back to ReCreate mode to recreate/re-schedule the Pod.
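To spot this situation yourself, you can inspect the Pod's status conditions. The sketch below works on a Pod object already loaded as a dictionary (for example from `kubectl get pod <name> -o json`). `resize_pending` is a hypothetical helper, and the sample Pod, including its message text, is made up for illustration.

```python
def resize_pending(pod: dict):
    """Return (reason, message) of an active PodResizePending condition, or None."""
    for cond in pod.get("status", {}).get("conditions", []):
        if cond.get("type") == "PodResizePending" and cond.get("status") == "True":
            return cond.get("reason"), cond.get("message")
    return None


# Illustrative Pod status; the message text is a placeholder.
sample_pod = {
    "status": {
        "conditions": [
            {"type": "Ready", "status": "True"},
            {
                "type": "PodResizePending",
                "status": "True",
                "reason": "Deferred",
                "message": "node has insufficient resources for the resize",
            },
        ]
    }
}

pending = resize_pending(sample_pod)
if pending is not None:
    reason, message = pending
    print(f"resize pending ({reason}): {message}")
```

In Kubernetes 1.33 the condition's reason is either `Deferred` (the resize may become possible later) or `Infeasible` (the node can never satisfy it).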

Coexisting with HPA

Using the Workload Autoscaler together with the HPA (Horizontal Pod Autoscaler) on the same resource can produce unexpected behavior, because both react to the same usage metric. If you need both, configure them to manage different resources: for example, let the HPA scale on CPU usage while the Workload Autoscaler adjusts only Memory.
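One way to check this separation is to compare the resources an HPA scales on with the resources adjusted vertically. The sketch below uses the standard `autoscaling/v2` HorizontalPodAutoscaler shape. The workload name, replica counts, and target utilization are placeholders. The set passed as `vertically_scaled` is an assumed stand-in for whatever resources your AutoscalingPolicy manages, since this document does not describe that schema.

```python
# HPA that scales on CPU utilization only (placeholder names and numbers).
hpa = {
    "apiVersion": "autoscaling/v2",
    "kind": "HorizontalPodAutoscaler",
    "metadata": {"name": "web"},
    "spec": {
        "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": "web"},
        "minReplicas": 2,
        "maxReplicas": 10,
        "metrics": [
            {
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    "target": {"type": "Utilization", "averageUtilization": 70},
                },
            }
        ],
    },
}


def hpa_resources(hpa: dict) -> set:
    """Names of the resources the HPA uses as scaling metrics."""
    return {
        metric["resource"]["name"]
        for metric in hpa["spec"].get("metrics", [])
        if metric.get("type") == "Resource"
    }


def overlapping_resources(hpa: dict, vertically_scaled: set) -> set:
    """Resources managed by both the HPA and the vertical autoscaler."""
    return hpa_resources(hpa) & vertically_scaled
```

An empty result, as with `overlapping_resources(hpa, {"memory"})`, means the two autoscalers manage different resources.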

Best Practices

Specify resource requests for all containers

Whenever possible, set resource requests for every container in all workloads. These do not need to be perfectly precise.

Prefer not to set limits

Avoid specifying limits whenever feasible. Instead, set requests to place Pods in the Burstable QoS class.

Set a restart policy for workloads that cannot adapt Memory InPlace

For workloads such as Java applications that cannot adapt to Memory limit changes dynamically, manually set the container’s ResizePolicy for Memory to RestartContainer. The container then restarts whenever an InPlace update modifies the Memory limit, so the new limit takes effect.