Permissions Required to Use CloudPilot AI
CloudPilot AI uses a two-phase installation to minimize security risk. Phase 1 requires read-only access to Kubernetes cluster resources. Phase 2 (optional) enables cluster optimization and requires additional Kubernetes RBAC and cloud provider permissions.
CloudPilot AI follows the principle of least privilege: you grant only the minimal permissions required for it to perform its operations, and no more.
Note: All credentials are mounted in your local cluster components and will not be stored or synced to our servers or the public internet.
Phase 1: Read-Only Permissions
During the first agent installation phase, CloudPilot AI deploys an agent that collects metadata about your Kubernetes cluster. This requires read-only access to core Kubernetes APIs across namespaces.
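A ClusterRole and a Role are required with the following permissions. All rules are read-only (get, list, watch) except those for CloudPilot AI's own autoscaling policy CRDs (evpa.cloudpilot.ai) and for the leader-election leases: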
Agent RBAC Rule Matrix (Phase 1)
| Scope | API Group | Resources | Verbs | Purpose |
|---|---|---|---|---|
| Cluster | karpenter.k8s.aws | ec2nodeclasses | get, list, watch | Read EC2NodeClass metadata used by Karpenter-integrated environments. |
| Cluster | storage.k8s.io | csinodes | get, list, watch | Discover CSI/node storage topology. |
| Cluster | "" (core) | namespaces, nodes, pods, persistentvolumeclaims, persistentvolumes | get, list, watch | Collect cluster/workload and storage inventory. |
| Cluster | apps | deployments, daemonsets, statefulsets, replicasets | get, list, watch | Read workload controllers for optimization analysis. |
| Cluster | policy | poddisruptionbudgets | get, list, watch | Evaluate disruption constraints during planning. |
| Cluster | metrics.k8s.io | pods, nodes | get, list, watch | Read runtime resource metrics. |
| Cluster | apiextensions.k8s.io | customresourcedefinitions | get, list, watch | Discover available CRDs. |
| Cluster | evpa.cloudpilot.ai | autoscalingpolicyconfigurations, autoscalingpolicyconfigurations/status | get, list, watch | Read autoscaling configuration CRDs. |
| Cluster | evpa.cloudpilot.ai | autoscalingpolicies, autoscalingpolicies/status, recommendationpolicies | create, update, list, watch, patch, delete | Manage autoscaling policy and recommendation resources. |
| Namespace | coordination.k8s.io | leases | get, list, create, update, patch, watch | Leader election and coordination for the agent. |
These permissions let the agent discover and report node and workload configurations, which are used to calculate the optimization plan.
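As a rough illustration of how rows of the matrix map onto Kubernetes RBAC, the sketch below encodes three of the rows as `rules` entries of a ClusterRole manifest, built as a plain Python dictionary. The role name `cloudpilot-agent-readonly` and the helpers `rule` and `is_read_only` are hypothetical and exist only for this example; the manifests actually shipped by the installer may be organized differently.

```python
import json

# Verbs that only read state from the API server.
READ_ONLY_VERBS = {"get", "list", "watch"}


def rule(api_group, resources, verbs):
    """Build one RBAC PolicyRule as a plain dict."""
    return {
        "apiGroups": [api_group],
        "resources": list(resources),
        "verbs": list(verbs),
    }


# Three rows copied from the Phase 1 matrix; "" denotes the core API group.
rules = [
    rule("", ["namespaces", "nodes", "pods",
              "persistentvolumeclaims", "persistentvolumes"],
         ["get", "list", "watch"]),
    rule("apps", ["deployments", "daemonsets", "statefulsets", "replicasets"],
         ["get", "list", "watch"]),
    rule("policy", ["poddisruptionbudgets"], ["get", "list", "watch"]),
]

cluster_role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "ClusterRole",
    "metadata": {"name": "cloudpilot-agent-readonly"},  # hypothetical name
    "rules": rules,
}


def is_read_only(policy_rule):
    """Return True when the rule grants nothing beyond get/list/watch."""
    return set(policy_rule["verbs"]) <= READ_ONLY_VERBS


print(json.dumps(cluster_role, indent=2))
print(all(is_read_only(r) for r in cluster_role["rules"]))
```

A check like `is_read_only` can be run over a rendered manifest before applying it, to confirm which rules go beyond read access.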
Phase 2: Additional Permissions for Cluster Optimization
If you proceed to Phase 2, CloudPilot AI will deploy components that actively optimize your cluster. This includes managing Kubernetes workloads and interacting with your cloud provider’s infrastructure APIs (e.g., EC2, Auto Scaling Groups).
Kubernetes RBAC Permissions
To manage Kubernetes resources, CloudPilot AI requires the following RBAC permissions:
AWS Optimizer RBAC Rule Matrix (Phase 2)
| Scope | API Group | Resources | Verbs | Purpose |
|---|---|---|---|---|
| Namespace | coordination.k8s.io | leases | get, watch, create, patch, update | Leader election for optimizer components. |
| Namespace | "" (core) | configmaps, secrets | get, list, watch | Read runtime config and TLS material. |
| Namespace | "" (core) | secrets (resourceNames: cloudpilot-aws-optimizer-cert, cloudpilot-webhook) | update | Rotate/update webhook and optimizer cert secrets. |
| Cluster | karpenter.k8s.aws | ec2nodeclasses | get, list, watch, create, delete, patch, update | Read and manage EC2NodeClass definitions. |
| Cluster | karpenter.k8s.aws | ec2nodeclasses/status | patch, update | Update EC2NodeClass status when needed. |
| Cluster | karpenter.sh | nodepools, nodepools/status | get, list, watch, create, delete, patch, update | Manage Karpenter NodePool lifecycle and tuning. |
| Cluster | karpenter.sh | nodeclaims, nodeclaims/status | get, list, watch, create, delete, patch, update | Manage NodeClaim provisioning and cleanup. |
| Cluster | admissionregistration.k8s.io | mutatingwebhookconfigurations, validatingwebhookconfigurations | get, update, list, watch | Maintain webhook registration and configuration. |
| Cluster | certificates.k8s.io | * | get, create, update, patch, delete, approve | Full CSR lifecycle for component certificates. |
| Cluster | agent.cloudpilot.ai | podmutations | get, create, update, patch, delete, list, watch | Manage pod mutation CRDs used by optimization flows. |
| Cluster | "" (core) | pods, pods/log, nodes, persistentvolumes, persistentvolumeclaims, replicationcontrollers, namespaces | get, list, watch, update, patch | Read/adjust workload and node-related state. |
| Cluster | "" (core) | nodes | patch, delete, update | Apply node-level optimization operations. |
| Cluster | "" (core) | pods/eviction, pods, events | create (pods/eviction, events), delete (pods), patch (events) | Evict/delete pods and publish events during actions. |
| Cluster | storage.k8s.io | storageclasses, csinodes, volumeattachments | get, list, watch | Storage capability and attachment awareness. |
| Cluster | apps | daemonsets, deployments, replicasets, statefulsets | get, list, watch, update, patch | Workload controller read/write for optimization. |
| Cluster | apiextensions.k8s.io | customresourcedefinitions | get, list, watch, update | CRD discovery and updates required by controller logic. |
| Cluster | apiextensions.k8s.io | customresourcedefinitions/status (resourceNames: ec2nodeclasses.karpenter.k8s.aws, nodepools.karpenter.sh, nodeclaims.karpenter.sh) | patch | Patch status of key Karpenter CRDs. |
| Cluster | policy | poddisruptionbudgets | get, list, watch | Respect disruption policies during operations. |
These permissions are needed to ensure system stability while performing operations such as scaling, provisioning, and termination.
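The two secret-related rows in the matrix show how `resourceNames` narrows a grant: secrets in the namespace are readable, but only two named secrets can be updated. The sketch below models that with a deliberately simplified matcher. The function names `allows` and `role_allows` are hypothetical, and the matcher ignores wildcards, subresources, and non-resource URLs, so it is not a substitute for the real Kubernetes authorizer.

```python
def allows(rule, api_group, resource, verb, name=None):
    """Simplified RBAC match: group, resource, verb, then optional resourceNames."""
    if api_group not in rule["apiGroups"]:
        return False
    if resource not in rule["resources"]:
        return False
    if verb not in rule["verbs"]:
        return False
    names = rule.get("resourceNames", [])
    # An empty resourceNames list means the rule applies to every object name.
    return not names or name in names


# Rows taken from the Phase 2 matrix ("" is the core API group).
read_rule = {
    "apiGroups": [""],
    "resources": ["configmaps", "secrets"],
    "verbs": ["get", "list", "watch"],
}
rotate_rule = {
    "apiGroups": [""],
    "resources": ["secrets"],
    "resourceNames": ["cloudpilot-aws-optimizer-cert", "cloudpilot-webhook"],
    "verbs": ["update"],
}
role_rules = [read_rule, rotate_rule]


def role_allows(api_group, resource, verb, name=None):
    """A request is allowed if any rule in the role matches it."""
    return any(allows(r, api_group, resource, verb, name) for r in role_rules)


print(role_allows("", "secrets", "update", "cloudpilot-webhook"))
print(role_allows("", "secrets", "update", "some-other-secret"))
print(role_allows("", "secrets", "get", "some-other-secret"))
```

In this model, updating `cloudpilot-webhook` is allowed, updating any other secret is denied, and reading any secret in the namespace is allowed.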
Cloud Provider Permissions
AWS IAM Permissions
In Phase 2, CloudPilot AI requires an IAM role with the following permissions:
AWS IAM Permission Matrix
| Category | Actions | Resource Scope | Condition/Note |
|---|---|---|---|
| ASG read | autoscaling:DescribeAutoScalingGroups, autoscaling:DescribeAutoScalingInstances, autoscaling:DescribeLaunchConfigurations, autoscaling:DescribeScalingActivities, autoscaling:DescribeTags | * | Discover ASG topology and state. |
| ASG write | autoscaling:SetDesiredCapacity, autoscaling:TerminateInstanceInAutoScalingGroup, autoscaling:UpdateAutoScalingGroup | * | Scale and rebalance existing ASGs. |
| EC2/Karpenter runtime | ssm:GetParameter, ec2:Describe* (images/instances/types/subnets/security groups/launch templates/regions/spot), ec2:RunInstances, ec2:CreateFleet, ec2:CreateLaunchTemplate, ec2:DeleteLaunchTemplate, ec2:CreateTags, pricing:GetProducts, savingsplans:DescribeSavingsPlans | * | Node provisioning and pricing-aware decisions. |
| Conditional EC2 terminate | ec2:TerminateInstances | * | Limited by condition ec2:ResourceTag/karpenter.sh/nodepool = *. |
| Pass node IAM role | iam:PassRole | arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/CloudPilotNodeRole-${CLUSTER_NAME} | Allow node launch with designated role. |
| Cluster metadata read | eks:DescribeCluster, eks:DescribeNodegroup | Cluster/Nodegroup ARN | Discover endpoint and nodegroup metadata. |
| Instance profile create | iam:CreateInstanceProfile | * | Requires cluster/region request tags and karpenter.k8s.aws/ec2nodeclass tag pattern. |
| Instance profile tag | iam:TagInstanceProfile | * | Requires matching request/resource tags for cluster/region and nodeclass. |
| Instance profile manage | iam:AddRoleToInstanceProfile, iam:RemoveRoleFromInstanceProfile, iam:DeleteInstanceProfile | * | Requires matching resource tags for cluster/region and nodeclass. |
| Instance profile read | iam:GetInstanceProfile | * | Read back created instance profiles. |
These permissions include access to EC2 and Auto Scaling Group APIs, used to manage compute capacity dynamically.
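To make the conditional rows more concrete, the following sketch assembles two of them as IAM policy statements in Python. It is an assumption-laden illustration rather than the policy CloudPilot AI ships: the `Sid` values, the function names, the `StringLike` operator chosen to express `= *`, and the example account ID and cluster name are all hypothetical.

```python
import json


def terminate_statement():
    """ec2:TerminateInstances, limited to instances carrying a nodepool tag."""
    return {
        "Sid": "AllowTerminateTaggedInstances",  # hypothetical Sid
        "Effect": "Allow",
        "Action": "ec2:TerminateInstances",
        "Resource": "*",
        "Condition": {
            # StringLike with "*" is an assumed way to require that the tag exists.
            "StringLike": {"ec2:ResourceTag/karpenter.sh/nodepool": "*"}
        },
    }


def pass_role_statement(partition, account_id, cluster_name):
    """iam:PassRole scoped to the single node role named in the matrix."""
    return {
        "Sid": "AllowPassNodeRole",  # hypothetical Sid
        "Effect": "Allow",
        "Action": "iam:PassRole",
        "Resource": (
            f"arn:{partition}:iam::{account_id}:"
            f"role/CloudPilotNodeRole-{cluster_name}"
        ),
    }


policy = {
    "Version": "2012-10-17",
    "Statement": [
        terminate_statement(),
        # Placeholder account ID and cluster name, for illustration only.
        pass_role_statement("aws", "111122223333", "demo"),
    ],
}

print(json.dumps(policy, indent=2))
```

Scoping `iam:PassRole` to one role ARN and gating termination on a tag keeps the two most sensitive actions from applying to unrelated resources in the account.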
AlibabaCloud RAM Permissions
For AlibabaCloud, the required RAM policy is as follows:
AlibabaCloud RAM Permission Matrix
| Category | Actions | Resource Scope | Condition/Note |
|---|---|---|---|
| Core discovery/provisioning | vpc:DescribeVSwitches, ecs:CreateAutoProvisioningGroup, ecs:DescribeSecurityGroups, ecs:DescribeAvailableResource, ecs:DescribeInstances, ecs:DescribeImages, cs:DescribeClusterDetail, cs:DescribeKubernetesVersionMetadata, cs:DescribeClusterAttachScripts, cs:DescribeClusterNodePools, ess:DescribeScalingGroups, ecs:AddTags, ram:CreateServiceLinkedRole | * | Discover cluster/infrastructure state and provision capacity. |
| Conditional ECS terminate | ecs:DeleteInstance | * | Only when acs:ResourceTag/cloudpilot.ai/managed = true. |
| Conditional ESS detach | ess:DetachInstances | * | Only when acs:ResourceTag/ack.aliyun.com = ${INTERNAL_CLUSTER_ID}. |
These permissions allow CloudPilot AI to interact with ECS and ScalingGroup APIs for node lifecycle management.
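The two conditional rows can be sketched as RAM policy statements in the same way. This is a minimal illustration under stated assumptions: the `StringEquals` operator is assumed from the equality conditions in the matrix, and the function name `conditional_statements` and the cluster ID `c-example` are hypothetical placeholders.

```python
import json


def conditional_statements(internal_cluster_id):
    """Build the two tag-conditioned statements from the RAM matrix."""
    return [
        {
            "Effect": "Allow",
            "Action": "ecs:DeleteInstance",
            "Resource": "*",
            "Condition": {
                # Only instances tagged as managed by CloudPilot AI.
                "StringEquals": {"acs:ResourceTag/cloudpilot.ai/managed": "true"}
            },
        },
        {
            "Effect": "Allow",
            "Action": "ess:DetachInstances",
            "Resource": "*",
            "Condition": {
                # Only resources tagged with this cluster's internal ID.
                "StringEquals": {"acs:ResourceTag/ack.aliyun.com": internal_cluster_id}
            },
        },
    ]


ram_policy = {
    "Version": "1",
    "Statement": conditional_statements("c-example"),  # hypothetical cluster ID
}

print(json.dumps(ram_policy, indent=2))
```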
All cloud provider permissions align closely with those used by Karpenter, with the addition of access to scaling group APIs (AWS/AutoScalingGroup, AlibabaCloud/ScalingGroup) to enable optimization of existing nodes.
Workload Autoscaler Additional Permissions
When installing Workload Autoscaler, CloudPilot AI requires the following Kubernetes RBAC permissions.
These permissions allow the component to:
- Read and patch target workloads (Deployment/StatefulSet/ReplicaSet) and related Pods
- Evict and resize Pods when applying optimization recommendations
- Manage Workload Autoscaler CRDs and status resources
- Create/approve CSR resources and update webhook configurations for TLS rotation
- Discover scrape endpoints for metrics collection
- Perform leader election and manage namespaced secrets/config/events
RBAC Rule Matrix
The following table lists the permissions required by Workload Autoscaler. It is organized by capability and omits installation manifest metadata such as apiVersion, kind, and Helm template placeholders.
| Scope | API Group | Resources | Verbs | Purpose |
|---|---|---|---|---|
| Cluster | "" (core) | pods, pods/eviction, pods/resize, pods/status | create, update, get, list, watch, patch, delete | Pod-level actions for recommendation execution (including eviction and in-place resize). |
| Cluster | apps | deployments, statefulsets, replicasets | get, update, list, watch, patch | Read/update/patch workload objects affected by autoscaling decisions. |
| Cluster | evpa.cloudpilot.ai | autoscalingpolicyconfigurations, autoscalingpolicyconfigurations/status | create, update, list, watch, patch, delete | Manage Workload Autoscaler configuration CRDs. |
| Cluster | evpa.cloudpilot.ai | autoscalingpolicies, autoscalingpolicies/status, recommendationpolicies | update, list, watch, patch | Consume and update policy/recommendation resources. |
| Cluster | certificates.k8s.io | certificatesigningrequests, certificatesigningrequests/approval | get, create, update | CSR workflow for webhook TLS certificates. |
| Cluster | certificates.k8s.io | signers (resourceNames: kubernetes.io/kubelet-serving, beta.eks.amazonaws.com/app-serving) | approve | Approve signer-specific certificate requests. |
| Cluster | admissionregistration.k8s.io | mutatingwebhookconfigurations, validatingwebhookconfigurations | get, update, patch, list, watch | Maintain webhook configuration state. |
| Cluster | discovery.k8s.io | endpointslices | get, list, watch | Service discovery for metrics scraping. |
| Cluster | "" (core) | nodes, nodes/proxy, nodes/metrics, services, endpoints | get, list, watch | Node/service endpoint discovery and metrics access. |
| Cluster | networking.k8s.io | ingresses | get, list, watch | Discover ingress endpoints for metrics-related integrations. |
| Cluster | non-resource URL | /metrics | get | Access metrics endpoint. |
| Namespace | coordination.k8s.io | leases | get, create, update, watch | Leader election. |
| Namespace | "" (core) | configmaps (resourceNames: kube-root-ca.crt) | get, update | Read/update cluster CA config for certificate workflows. |
| Namespace | "" (core) | secrets (resourceNames: workload-autoscaler-webhook) | get, update, patch | Manage webhook TLS secret content. |
| Namespace | "" (core) | secrets | list, watch | Observe secret changes needed by runtime components. |
| Namespace | "" (core) | events | create, patch | Emit operational events. |
| Namespace (OpenShift) | security.openshift.io | securitycontextconstraints (resourceNames: privileged) | use | Required for Node Agent execution on OpenShift. |
Required Binding
- Bind the Workload Autoscaler agent service account (cloudpilot-agent) to the agent cluster-scope permission set in the deployment namespace.
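One way to picture this binding is as a ClusterRoleBinding whose subject is the `cloudpilot-agent` service account. The sketch below builds such a manifest as a Python dictionary. It assumes a ClusterRoleBinding is what is intended here; the binding name, the ClusterRole name `cloudpilot-agent`, and the namespace `cloudpilot` are hypothetical and should be replaced with the values from your installation.

```python
import json


def cluster_role_binding(name, cluster_role, service_account, namespace):
    """Build a ClusterRoleBinding that grants a ClusterRole to one service account."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        "metadata": {"name": name},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": cluster_role,
        },
        "subjects": [
            {
                "kind": "ServiceAccount",
                "name": service_account,
                # Namespace where the service account is deployed.
                "namespace": namespace,
            }
        ],
    }


binding = cluster_role_binding(
    name="cloudpilot-agent-binding",   # hypothetical binding name
    cluster_role="cloudpilot-agent",   # hypothetical ClusterRole name
    service_account="cloudpilot-agent",
    namespace="cloudpilot",            # hypothetical deployment namespace
)

print(json.dumps(binding, indent=2))
```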
Commitment to Privacy and Security
CloudPilot AI adheres strictly to data protection best practices. We access only the data necessary for operation and optimization, and we never request or retain unnecessary privileges.
By design, we reduce attack surfaces through scoped permissions and continuous review of access requirements. This ensures secure, compliant, and efficient operation within your environment.
For further assistance, feel free to reach out to us through our Slack channel.
