Use PodMutation During a NodePool Merge
This guide explains how to use PodMutation to move workloads from multiple existing NodePools to one new NodePool during a merge.
The goal is straightforward:
- Keep the workload manifests unchanged at first
- Define the old scheduling rules in workloadSelector
- Remove those old rules in mutationRemove
- Add the new merged NodePool rules in mutationAdd
- Recreate the Pods so the webhook applies the new scheduling constraints
In practice, PodMutation acts as a transition layer for the merge. Instead of editing every workload immediately, you let the admission webhook rewrite Pod scheduling rules when replacement Pods are created.
How the merge works
In this example:
- the first old NodePool is selected by example.com/node-pool=pool-a
- the second old NodePool is selected by example.com/node-role=pool-b
- both old NodePools use the taint key dedicated, but with different values
- the merged NodePool is selected by example.com/node-pool=pool-merged
The merge is implemented with two PodMutation objects:
- one matches workloads that still point to pool-a
- one matches workloads that still point to pool-b
Each mutation removes the old nodeSelector and toleration, then adds the new merged NodePool selector and toleration.
Important behavior
This workflow depends on the current webhook behavior in the agent:
- The mutating webhook applies PodMutation only on Pod CREATE.
- Existing Pods are not rewritten in place.
- DaemonSet-owned Pods are skipped by the webhook.
- Inside one PodMutation, mutationRemove is applied before mutationAdd.
- The webhook refreshes its PodMutation cache every 30 seconds, so wait briefly after creating or updating a PodMutation.
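The remove-before-add ordering can be illustrated with a small simulation. This is only a sketch of the ordering described above, not the webhook's actual implementation; the function name mutate_selector and the "key=value" line format are hypothetical and exist only for this illustration.

```shell
# Simulate one PodMutation on a nodeSelector given as "key=value" lines.
# Usage: mutate_selector <selector-lines> <key-to-remove> <pair-to-add>
mutate_selector() {
  # Step 1 (mutationRemove): drop every entry whose key matches exactly.
  kept=$(printf '%s\n' "$1" | awk -F= -v key="$2" '$1 != key')
  # Step 2 (mutationAdd): append the new entry after the removal.
  if [ -n "$kept" ]; then
    printf '%s\n%s\n' "$kept" "$3"
  else
    printf '%s\n' "$3"
  fi
}

# A Pod that targets pool-a and also carries an unrelated selector.
before='example.com/node-pool=pool-a
disktype=ssd'

mutate_selector "$before" example.com/node-pool example.com/node-pool=pool-merged
# prints:
#   disktype=ssd
#   example.com/node-pool=pool-merged
```

Because the removal runs first, reusing the same key (example.com/node-pool) in mutationAdd leaves only the merged value, and unrelated entries such as disktype=ssd pass through untouched.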
Prerequisites
- The cloudpilot-webhook deployment is healthy.
- The PodMutation CRD is installed.
- The old and new NodePools already exist.
- The merged NodePool is schedulable before you restart workloads.
For this example, the cluster must have nodes that use these labels and taints:
- example.com/node-pool=pool-a with dedicated=pool-a:NoSchedule
- example.com/node-role=pool-b with dedicated=pool-b:NoSchedule
- example.com/node-pool=pool-merged with dedicated=pool-merged:NoSchedule
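A quick preflight check for the node labels might look like the sketch below. The helper names count_nodes and check_pools are hypothetical. The sketch checks labels only; it does not verify taints, the webhook deployment, or the CRD.

```shell
# Count the nodes that match a label selector such as key=value.
count_nodes() {
  kubectl get nodes -l "$1" --no-headers 2>/dev/null | wc -l | tr -d ' '
}

# Report the node count for each selector; fail if any selector has no nodes.
check_pools() {
  missing=0
  for selector in "$@"; do
    n=$(count_nodes "$selector")
    echo "$selector: $n node(s)"
    [ "$n" -gt 0 ] || missing=1
  done
  return "$missing"
}

# Example call against a live cluster:
#   check_pools example.com/node-pool=pool-a \
#               example.com/node-role=pool-b \
#               example.com/node-pool=pool-merged
```

A non-zero exit status from check_pools means at least one of the expected NodePools has no labeled nodes yet, in which case restarting workloads would be premature.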
Procedure
The command examples below assume you save the two YAML snippets from this page as:
- merge-nodepools-workloads.yaml
- merge-nodepools-podmutation.yaml
- Apply workloads that still target the old NodePools.
- Confirm that the current Pods are using the old selectors and tolerations.
- Create the PodMutation rules for the merge.
- Wait for the webhook cache to refresh.
- Restart the workloads so that new Pods pass through admission.
- Confirm that the replacement Pods now target the merged NodePool.
# 1. Apply the example workloads
kubectl apply -f merge-nodepools-workloads.yaml
# 2. Verify the current scheduling rules
kubectl get pods -n merge-demo -o wide
kubectl get pod -n merge-demo -l app=app-pool-a -o yaml | grep -A12 "nodeSelector:"
kubectl get pod -n merge-demo -l app=app-pool-b -o yaml | grep -A12 "nodeSelector:"
# 3. Apply the PodMutation rules for the merge
kubectl apply -f merge-nodepools-podmutation.yaml
# 4. Wait for the webhook cache refresh
sleep 35
# 5. Recreate Pods so the webhook can mutate them
kubectl rollout restart deployment/app-pool-a -n merge-demo
kubectl rollout restart deployment/app-pool-b -n merge-demo
kubectl rollout status deployment/app-pool-a -n merge-demo --timeout=120s
kubectl rollout status deployment/app-pool-b -n merge-demo --timeout=120s
# 6. Verify that the workloads now target the merged NodePool
kubectl get pods -n merge-demo -o wide
kubectl get pod -n merge-demo -l app=app-pool-a -o yaml | grep -A12 "nodeSelector:"
kubectl get pod -n merge-demo -l app=app-pool-b -o yaml | grep -A12 "nodeSelector:"
Expected result
Before the merge:
- app-pool-a Pods target example.com/node-pool=pool-a
- app-pool-b Pods target example.com/node-role=pool-b
After the restart:
- both workloads create new Pods with example.com/node-pool=pool-merged
- both workloads tolerate dedicated=pool-merged:NoSchedule
- unrelated selectors and tolerations remain unchanged
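Instead of reading the grep output by eye, the nodeSelector check can be scripted. The following is a minimal sketch, assuming the merge-demo namespace and app labels from this example; the helper names pod_pool_values and all_on_merged_pool are hypothetical. It inspects only the example.com/node-pool selector, not tolerations.

```shell
# Print "<pod-name> <value of example.com/node-pool>" for the Pods of one app.
# Dots inside the label key are escaped for kubectl's JSONPath parser.
pod_pool_values() {
  kubectl get pods -n merge-demo -l "app=$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.spec.nodeSelector.example\.com/node-pool}{"\n"}{end}'
}

# Succeed only if at least one Pod is listed and every Pod selects pool-merged.
all_on_merged_pool() {
  pod_pool_values "$1" | awk '
    $2 != "pool-merged" { bad = 1; print "not migrated: " $1 }
    END { exit (NR == 0 || bad) }'
}

# Example call after the rollout has finished:
#   all_on_merged_pool app-pool-a && all_on_merged_pool app-pool-b
```

Run the check only after kubectl rollout status reports completion; otherwise Pods from the previous ReplicaSet that are still terminating may be reported as not migrated.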
Example workloads
apiVersion: v1
kind: Namespace
metadata:
  name: merge-demo
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-pool-a
  namespace: merge-demo
  labels:
    app: app-pool-a
spec:
  replicas: 2
  selector:
    matchLabels:
      app: app-pool-a
  template:
    metadata:
      labels:
        app: app-pool-a
    spec:
      nodeSelector:
        example.com/node-pool: pool-a
      tolerations:
        - key: dedicated
          value: pool-a
          effect: NoSchedule
      containers:
        - name: nginx
          image: nginx:1.25-alpine
          resources:
            requests:
              cpu: 100m
              memory: 64Mi
            limits:
              cpu: 200m
              memory: 128Mi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-pool-b
  namespace: merge-demo
  labels:
    app: app-pool-b
spec:
  replicas: 2
  selector:
    matchLabels:
      app: app-pool-b
  template:
    metadata:
      labels:
        app: app-pool-b
    spec:
      nodeSelector:
        example.com/node-role: pool-b
      tolerations:
        - key: dedicated
          value: pool-b
          effect: NoSchedule
      containers:
        - name: nginx
          image: nginx:1.25-alpine
          resources:
            requests:
              cpu: 100m
              memory: 64Mi
            limits:
              cpu: 200m
              memory: 128Mi
Example PodMutation rules
apiVersion: agent.cloudpilot.ai/v1alpha1
kind: PodMutation
metadata:
  name: merge-pool-a
spec:
  enable: true
  priority: 100
  workloadSelector:
    nodeSelectorMatch:
      example.com/node-pool: pool-a
    tolerationMatch:
      - key: dedicated
        value: pool-a
        effect: NoSchedule
  mutationRemove:
    nodeSelectorKeys:
      - example.com/node-pool
    tolerationKeys:
      - dedicated
  mutationAdd:
    nodeSelector:
      example.com/node-pool: pool-merged
    tolerations:
      - key: dedicated
        value: pool-merged
        effect: NoSchedule
---
apiVersion: agent.cloudpilot.ai/v1alpha1
kind: PodMutation
metadata:
  name: merge-pool-b
spec:
  enable: true
  priority: 100
  workloadSelector:
    nodeSelectorMatch:
      example.com/node-role: pool-b
    tolerationMatch:
      - key: dedicated
        value: pool-b
        effect: NoSchedule
  mutationRemove:
    nodeSelectorKeys:
      - example.com/node-role
    tolerationKeys:
      - dedicated
  mutationAdd:
    nodeSelector:
      example.com/node-pool: pool-merged
    tolerations:
      - key: dedicated
        value: pool-merged
        effect: NoSchedule
Notes
- If you apply the PodMutation and restart the workload immediately, the new Pods may still use the old rules because the webhook cache has not refreshed yet.
- If you do not recreate the Pods, the merge does not take effect because existing Pods are not mutated in place.
- If the old NodePools use different label keys, create one PodMutation per source rule set, as shown above.
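To find Pods that were created too early to be mutated, one option is to compare each Pod's creation timestamp against a cutoff. This is a sketch under assumptions: the helper name stale_pods is hypothetical, and the cutoff (for example, the time you applied the PodMutation plus the 30-second cache refresh) is something you supply yourself.

```shell
# Read "<pod-name> <creationTimestamp>" lines on stdin and print the names of
# Pods created before the cutoff. UTC timestamps in the same RFC 3339 format
# sort lexicographically, so a plain string comparison is enough.
stale_pods() {
  awk -v cutoff="$1" '$2 < cutoff { print $1 }'
}

# Example call against a live cluster (the cutoff value is illustrative):
#   kubectl get pods -n merge-demo \
#     -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.metadata.creationTimestamp}{"\n"}{end}' \
#     | stale_pods 2024-05-01T10:00:30Z
```

Any Pod listed by stale_pods predates the cutoff and still carries its original scheduling rules until it is recreated.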