Permissions Required to Use CloudPilot AI

CloudPilot AI uses a two-phase installation to minimize security risk. Phase 1 requires read-only access to Kubernetes cluster resources. Phase 2 (optional) enables cluster optimization and requires additional Kubernetes RBAC and cloud provider permissions.

CloudPilot AI follows the principle of least privilege: you grant it only the minimal permissions required to perform its operations. Here is the overall architecture:

[Figure: cloudpilot-ai-role architecture diagram]

Note: All credentials are mounted in components running in your own cluster. They are not stored on or synced to our servers or the public internet.

Phase 1: Read-Only Permissions

During the first agent installation phase, CloudPilot AI deploys an agent that collects metadata about your Kubernetes cluster. This requires read-only access to core Kubernetes APIs across namespaces.

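A ClusterRole and a Role are required with the following permissions. Every rule is limited to get, list, and watch except for two: the CloudPilot AI autoscaling policy resources in evpa.cloudpilot.ai and the leader-election leases, which the agent must also be able to write.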

Agent RBAC Rule Matrix (Phase 1)

| Scope | API Group | Resources | Verbs | Purpose |
| --- | --- | --- | --- | --- |
| Cluster | karpenter.k8s.aws | ec2nodeclasses | get, list, watch | Read EC2NodeClass metadata used by Karpenter-integrated environments. |
| Cluster | storage.k8s.io | csinodes | get, list, watch | Discover CSI/node storage topology. |
| Cluster | "" (core) | namespaces, nodes, pods, persistentvolumeclaims, persistentvolumes | get, list, watch | Collect cluster/workload and storage inventory. |
| Cluster | apps | deployments, daemonsets, statefulsets, replicasets | get, list, watch | Read workload controllers for optimization analysis. |
| Cluster | policy | poddisruptionbudgets | get, list, watch | Evaluate disruption constraints during planning. |
| Cluster | metrics.k8s.io | pods, nodes | get, list, watch | Read runtime resource metrics. |
| Cluster | apiextensions.k8s.io | customresourcedefinitions | get, list, watch | Discover available CRDs. |
| Cluster | evpa.cloudpilot.ai | autoscalingpolicyconfigurations, autoscalingpolicyconfigurations/status | get, list, watch | Read autoscaling configuration CRDs. |
| Cluster | evpa.cloudpilot.ai | autoscalingpolicies, autoscalingpolicies/status, recommendationpolicies | create, update, list, watch, patch, delete | Manage autoscaling policy and recommendation resources. |
| Namespace | coordination.k8s.io | leases | get, list, create, update, patch, watch | Leader election and coordination for the agent. |

These permissions let the agent discover and report node and workload configurations, which CloudPilot AI uses to calculate the optimization plan.
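To see which Phase 1 rules go beyond get, list, and watch, the matrix can be expressed in the shape of Kubernetes RBAC policy rules and filtered. The following is a minimal sketch that encodes only a subset of the rows above. The helper names (`write_verbs`, `rules_with_writes`) are illustrative and not part of CloudPilot AI.

```python
# Verbs that never modify cluster state.
READ_ONLY_VERBS = {"get", "list", "watch"}

# A subset of the Phase 1 matrix, shaped like Kubernetes RBAC policy rules.
PHASE1_RULES = [
    {"apiGroups": [""],
     "resources": ["namespaces", "nodes", "pods",
                   "persistentvolumeclaims", "persistentvolumes"],
     "verbs": ["get", "list", "watch"]},
    {"apiGroups": ["apps"],
     "resources": ["deployments", "daemonsets", "statefulsets", "replicasets"],
     "verbs": ["get", "list", "watch"]},
    {"apiGroups": ["evpa.cloudpilot.ai"],
     "resources": ["autoscalingpolicies", "autoscalingpolicies/status",
                   "recommendationpolicies"],
     "verbs": ["create", "update", "list", "watch", "patch", "delete"]},
    {"apiGroups": ["coordination.k8s.io"],
     "resources": ["leases"],
     "verbs": ["get", "list", "create", "update", "patch", "watch"]},
]


def write_verbs(rule):
    """Return the sorted verbs in a rule that can modify state."""
    return sorted(set(rule["verbs"]) - READ_ONLY_VERBS)


def rules_with_writes(rules):
    """Return (api_group, write_verbs) pairs for rules that are not read-only."""
    return [(r["apiGroups"][0], write_verbs(r)) for r in rules if write_verbs(r)]


for group, verbs in rules_with_writes(PHASE1_RULES):
    print(f"{group or 'core'}: {', '.join(verbs)}")
```

In this subset, only the evpa.cloudpilot.ai policy resources and the leader-election leases carry write verbs. The rules on core and apps resources are read-only.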

Phase 2: Additional Permissions for Cluster Optimization

If you proceed to Phase 2, CloudPilot AI will deploy components that actively optimize your cluster. This includes managing Kubernetes workloads and interacting with your cloud provider’s infrastructure APIs (e.g., EC2, Auto Scaling Groups).

Kubernetes RBAC Permissions

To manage Kubernetes resources, CloudPilot AI requires the following RBAC permissions:

AWS Optimizer RBAC Rule Matrix (Phase 2)

| Scope | API Group | Resources | Verbs | Purpose |
| --- | --- | --- | --- | --- |
| Namespace | coordination.k8s.io | leases | get, watch, create, patch, update | Leader election for optimizer components. |
| Namespace | "" (core) | configmaps, secrets | get, list, watch | Read runtime config and TLS material. |
| Namespace | "" (core) | secrets (resourceNames: cloudpilot-aws-optimizer-cert, cloudpilot-webhook) | update | Rotate/update webhook and optimizer cert secrets. |
| Cluster | karpenter.k8s.aws | ec2nodeclasses | get, list, watch, create, delete, patch, update | Read and manage EC2NodeClass definitions. |
| Cluster | karpenter.k8s.aws | ec2nodeclasses/status | patch, update | Update EC2NodeClass status when needed. |
| Cluster | karpenter.sh | nodepools, nodepools/status | get, list, watch, create, delete, patch, update | Manage Karpenter NodePool lifecycle and tuning. |
| Cluster | karpenter.sh | nodeclaims, nodeclaims/status | get, list, watch, create, delete, patch, update | Manage NodeClaim provisioning and cleanup. |
| Cluster | admissionregistration.k8s.io | mutatingwebhookconfigurations, validatingwebhookconfigurations | get, update, list, watch | Maintain webhook registration and configuration. |
| Cluster | certificates.k8s.io | * | get, create, update, patch, delete, approve | Full CSR lifecycle for component certificates. |
| Cluster | agent.cloudpilot.ai | podmutations | get, create, update, patch, delete, list, watch | Manage pod mutation CRDs used by optimization flows. |
| Cluster | "" (core) | pods, pods/log, nodes, persistentvolumes, persistentvolumeclaims, replicationcontrollers, namespaces | get, list, watch, update, patch | Read/adjust workload and node-related state. |
| Cluster | "" (core) | nodes | patch, delete, update | Apply node-level optimization operations. |
| Cluster | "" (core) | pods/eviction, pods, events | create (pods/eviction, events), delete (pods), patch (events) | Evict/delete pods and publish events during actions. |
| Cluster | storage.k8s.io | storageclasses, csinodes, volumeattachments | get, list, watch | Storage capability and attachment awareness. |
| Cluster | apps | daemonsets, deployments, replicasets, statefulsets | get, list, watch, update, patch | Workload controller read/write for optimization. |
| Cluster | apiextensions.k8s.io | customresourcedefinitions | get, list, watch, update | CRD discovery and updates required by controller logic. |
| Cluster | apiextensions.k8s.io | customresourcedefinitions/status (resourceNames: ec2nodeclasses.karpenter.k8s.aws, nodepools.karpenter.sh, nodeclaims.karpenter.sh) | patch | Patch status of key Karpenter CRDs. |
| Cluster | policy | poddisruptionbudgets | get, list, watch | Respect disruption policies during operations. |

These permissions are needed to ensure system stability while performing operations such as scaling, provisioning, and termination.
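As an illustration of how the namespace-scoped rows of the matrix map onto a Kubernetes object, the sketch below builds a Role manifest as a plain dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The role name and the namespace `cloudpilot` are placeholders chosen for this example. The actual names used by the CloudPilot AI installer are not stated here.

```python
import json


def namespaced_optimizer_role(namespace, name="cloudpilot-optimizer-example"):
    """Build a Role manifest covering the three namespace-scoped Phase 2 rows.

    `name` and `namespace` are hypothetical placeholders.
    """
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": [
            # Leader election for optimizer components.
            {"apiGroups": ["coordination.k8s.io"],
             "resources": ["leases"],
             "verbs": ["get", "watch", "create", "patch", "update"]},
            # Read runtime config and TLS material.
            {"apiGroups": [""],
             "resources": ["configmaps", "secrets"],
             "verbs": ["get", "list", "watch"]},
            # Write access is restricted to two named secrets via resourceNames.
            {"apiGroups": [""],
             "resources": ["secrets"],
             "resourceNames": ["cloudpilot-aws-optimizer-cert",
                               "cloudpilot-webhook"],
             "verbs": ["update"]},
        ],
    }


role = namespaced_optimizer_role("cloudpilot")
print(json.dumps(role, indent=2))
```

The third rule shows the effect of `resourceNames`: the optimizer can read any secret in its namespace but can update only the two certificate secrets listed in the matrix.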

Cloud Provider Permissions

AWS IAM Permissions

In Phase 2, CloudPilot AI requires an IAM role with the following permissions:

AWS IAM Permission Matrix

| Category | Actions | Resource Scope | Condition/Note |
| --- | --- | --- | --- |
| ASG read | autoscaling:DescribeAutoScalingGroups, autoscaling:DescribeAutoScalingInstances, autoscaling:DescribeLaunchConfigurations, autoscaling:DescribeScalingActivities, autoscaling:DescribeTags | * | Discover ASG topology and state. |
| ASG write | autoscaling:SetDesiredCapacity, autoscaling:TerminateInstanceInAutoScalingGroup, autoscaling:UpdateAutoScalingGroup | * | Scale and rebalance existing ASGs. |
| EC2/Karpenter runtime | ssm:GetParameter, ec2:Describe* (images/instances/types/subnets/security groups/launch templates/regions/spot), ec2:RunInstances, ec2:CreateFleet, ec2:CreateLaunchTemplate, ec2:DeleteLaunchTemplate, ec2:CreateTags, pricing:GetProducts, savingsplans:DescribeSavingsPlans | * | Node provisioning and pricing-aware decisions. |
| Conditional EC2 terminate | ec2:TerminateInstances | * | Limited by condition ec2:ResourceTag/karpenter.sh/nodepool = *. |
| Pass node IAM role | iam:PassRole | arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/CloudPilotNodeRole-${CLUSTER_NAME} | Allow node launch with designated role. |
| Cluster metadata read | eks:DescribeCluster, eks:DescribeNodegroup | Cluster/Nodegroup ARN | Discover endpoint and nodegroup metadata. |
| Instance profile create | iam:CreateInstanceProfile | * | Requires cluster/region request tags and karpenter.k8s.aws/ec2nodeclass tag pattern. |
| Instance profile tag | iam:TagInstanceProfile | * | Requires matching request/resource tags for cluster/region and nodeclass. |
| Instance profile manage | iam:AddRoleToInstanceProfile, iam:RemoveRoleFromInstanceProfile, iam:DeleteInstanceProfile | * | Requires matching resource tags for cluster/region and nodeclass. |
| Instance profile read | iam:GetInstanceProfile | * | Read back created instance profiles. |

These permissions include access to EC2 and Auto Scaling Group APIs, used to manage compute capacity dynamically.
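Two rows of the matrix narrow their scope rather than using an unconditional `*`: the tag-conditioned `ec2:TerminateInstances` and the `iam:PassRole` limited to a single role ARN. The sketch below shows how such statements might look in an IAM policy document. The `StringLike` operator, the `Sid` values, and the sample account ID and cluster name are assumptions made for illustration. The matrix gives only the condition key and the ARN template.

```python
import json


def scoped_statements(partition, account_id, cluster_name):
    """Build the two scoped statements from the matrix as IAM policy dicts."""
    return [
        {
            "Sid": "ConditionalEC2Terminate",  # hypothetical Sid
            "Effect": "Allow",
            "Action": "ec2:TerminateInstances",
            "Resource": "*",
            # Assumption: "= *" in the matrix is expressed with StringLike,
            # which matches only instances that carry the nodepool tag.
            "Condition": {
                "StringLike": {"ec2:ResourceTag/karpenter.sh/nodepool": "*"}
            },
        },
        {
            "Sid": "PassNodeIAMRole",  # hypothetical Sid
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": (
                f"arn:{partition}:iam::{account_id}"
                f":role/CloudPilotNodeRole-{cluster_name}"
            ),
        },
    ]


# Placeholder values standing in for AWS_PARTITION, AWS_ACCOUNT_ID, CLUSTER_NAME.
policy = {
    "Version": "2012-10-17",
    "Statement": scoped_statements("aws", "123456789012", "demo"),
}
print(json.dumps(policy, indent=2))
```

With these statements, termination applies only to instances tagged with a Karpenter node pool, and the only role that can be passed to new nodes is the designated node role for the cluster.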

AlibabaCloud RAM Permissions

For AlibabaCloud, the required RAM policy is as follows:

AlibabaCloud RAM Permission Matrix

| Category | Actions | Resource Scope | Condition/Note |
| --- | --- | --- | --- |
| Core discovery/provisioning | vpc:DescribeVSwitches, ecs:CreateAutoProvisioningGroup, ecs:DescribeSecurityGroups, ecs:DescribeAvailableResource, ecs:DescribeInstances, ecs:DescribeImages, cs:DescribeClusterDetail, cs:DescribeKubernetesVersionMetadata, cs:DescribeClusterAttachScripts, cs:DescribeClusterNodePools, ess:DescribeScalingGroups, ecs:AddTags, ram:CreateServiceLinkedRole | * | Discover cluster/infrastructure state and provision capacity. |
| Conditional ECS terminate | ecs:DeleteInstance | * | Only when acs:ResourceTag/cloudpilot.ai/managed = true. |
| Conditional ESS detach | ess:DetachInstances | * | Only when acs:ResourceTag/ack.aliyun.com = ${INTERNAL_CLUSTER_ID}. |

These permissions allow CloudPilot AI to interact with ECS and ScalingGroup APIs for node lifecycle management.
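The two conditional rows can be sketched as RAM policy statements in the same way. This example assumes the `StringEquals` operator for both tag conditions and uses a made-up cluster ID. The matrix specifies only the condition keys and expected values, so treat the exact operator as an assumption.

```python
import json


def conditional_ram_statements(internal_cluster_id):
    """Build the tag-conditioned delete and detach statements as dicts."""
    return [
        {
            "Effect": "Allow",
            "Action": "ecs:DeleteInstance",
            "Resource": "*",
            # Only instances tagged as managed by CloudPilot AI.
            "Condition": {
                "StringEquals": {"acs:ResourceTag/cloudpilot.ai/managed": "true"}
            },
        },
        {
            "Effect": "Allow",
            "Action": "ess:DetachInstances",
            "Resource": "*",
            # Only resources tagged with this cluster's internal ID.
            "Condition": {
                "StringEquals": {
                    "acs:ResourceTag/ack.aliyun.com": internal_cluster_id
                }
            },
        },
    ]


# "c-example" is a placeholder for INTERNAL_CLUSTER_ID.
ram_policy = {"Version": "1", "Statement": conditional_ram_statements("c-example")}
print(json.dumps(ram_policy, indent=2))
```

Both destructive actions are gated on resource tags, so instances that CloudPilot AI does not manage, or that belong to a different cluster, fall outside the grant.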

All cloud provider permissions align closely with those used by Karpenter, with the addition of access to scaling group APIs (AWS/AutoScalingGroup, AlibabaCloud/ScalingGroup) to enable existing node optimization.

Workload Autoscaler Additional Permissions

Installing Workload Autoscaler requires the following Kubernetes RBAC permissions.

These permissions allow the component to:

  • Read and patch target workloads (Deployment/StatefulSet/ReplicaSet) and related Pods
  • Evict and resize Pods when applying optimization recommendations
  • Manage Workload Autoscaler CRDs and status resources
  • Create/approve CSR resources and update webhook configurations for TLS rotation
  • Discover scrape endpoints for metrics collection
  • Perform leader election and manage namespaced secrets/config/events

RBAC Rule Matrix

The following table lists the permissions required by Workload Autoscaler. It is organized by capability and omits installation manifest metadata such as apiVersion, kind, and Helm template placeholders.

| Scope | API Group | Resources | Verbs | Purpose |
| --- | --- | --- | --- | --- |
| Cluster | "" (core) | pods, pods/eviction, pods/resize, pods/status | create, update, get, list, watch, patch, delete | Pod-level actions for recommendation execution (including eviction and in-place resize). |
| Cluster | apps | deployments, statefulsets, replicasets | get, update, list, watch, patch | Read/update/patch workload objects affected by autoscaling decisions. |
| Cluster | evpa.cloudpilot.ai | autoscalingpolicyconfigurations, autoscalingpolicyconfigurations/status | create, update, list, watch, patch, delete | Manage Workload Autoscaler configuration CRDs. |
| Cluster | evpa.cloudpilot.ai | autoscalingpolicies, autoscalingpolicies/status, recommendationpolicies | update, list, watch, patch | Consume and update policy/recommendation resources. |
| Cluster | certificates.k8s.io | certificatesigningrequests, certificatesigningrequests/approval | get, create, update | CSR workflow for webhook TLS certificates. |
| Cluster | certificates.k8s.io | signers (resourceNames: kubernetes.io/kubelet-serving, beta.eks.amazonaws.com/app-serving) | approve | Approve signer-specific certificate requests. |
| Cluster | admissionregistration.k8s.io | mutatingwebhookconfigurations, validatingwebhookconfigurations | get, update, patch, list, watch | Maintain webhook configuration state. |
| Cluster | discovery.k8s.io | endpointslices | get, list, watch | Service discovery for metrics scraping. |
| Cluster | "" (core) | nodes, nodes/proxy, nodes/metrics, services, endpoints | get, list, watch | Node/service endpoint discovery and metrics access. |
| Cluster | networking.k8s.io | ingresses | get, list, watch | Discover ingress endpoints for metrics-related integrations. |
| Cluster | non-resource URL | /metrics | get | Access metrics endpoint. |
| Namespace | coordination.k8s.io | leases | get, create, update, watch | Leader election. |
| Namespace | "" (core) | configmaps (resourceName: kube-root-ca.crt) | get, update | Read/update cluster CA config for certificate workflows. |
| Namespace | "" (core) | secrets (resourceName: workload-autoscaler-webhook) | get, update, patch | Manage webhook TLS secret content. |
| Namespace | "" (core) | secrets | list, watch | Observe secret changes needed by runtime components. |
| Namespace | "" (core) | events | create, patch | Emit operational events. |
| Namespace (OpenShift) | security.openshift.io | securitycontextconstraints (resourceName: privileged) | use | Required for Node Agent execution on OpenShift. |
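Three rows in the matrix above use less common RBAC features: the CSR approval subresource, the signer-scoped `approve` verb, and a non-resource URL. The sketch below shows how these rows might translate into Kubernetes policy rules. It covers only those rows, and the helper `is_non_resource_rule` is illustrative rather than part of any CloudPilot AI component.

```python
# Selected Workload Autoscaler rows, shaped like Kubernetes RBAC policy rules.
CERT_AND_METRICS_RULES = [
    # CSR workflow for webhook TLS certificates.
    {"apiGroups": ["certificates.k8s.io"],
     "resources": ["certificatesigningrequests",
                   "certificatesigningrequests/approval"],
     "verbs": ["get", "create", "update"]},
    # Approval is limited to two named signers via resourceNames.
    {"apiGroups": ["certificates.k8s.io"],
     "resources": ["signers"],
     "resourceNames": ["kubernetes.io/kubelet-serving",
                       "beta.eks.amazonaws.com/app-serving"],
     "verbs": ["approve"]},
    # Non-resource URLs use nonResourceURLs instead of apiGroups/resources.
    {"nonResourceURLs": ["/metrics"], "verbs": ["get"]},
]


def is_non_resource_rule(rule):
    """Return True if the rule targets a URL path rather than an API resource."""
    return "nonResourceURLs" in rule


for rule in CERT_AND_METRICS_RULES:
    target = rule.get("nonResourceURLs") or rule["resources"]
    kind = "url" if is_non_resource_rule(rule) else "resource"
    print(f"{kind}: {', '.join(target)} -> {', '.join(rule['verbs'])}")
```

Rules with `nonResourceURLs` are valid only in a ClusterRole, which is consistent with the Cluster scope shown for the `/metrics` row.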

Required Binding

  • Bind the Workload Autoscaler agent service account (cloudpilot-agent) to the agent cluster-scope permission set in the deployment namespace.
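One way to express this binding is a ClusterRoleBinding whose subject is the `cloudpilot-agent` service account in the deployment namespace. The sketch below assumes that interpretation. The ClusterRole name, binding name, and namespace are hypothetical placeholders, since only the service account name is given above.

```python
import json


def agent_cluster_role_binding(namespace, cluster_role,
                               service_account="cloudpilot-agent"):
    """Build a ClusterRoleBinding for the agent service account.

    `namespace` and `cluster_role` are placeholders supplied by the caller.
    """
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        "metadata": {"name": f"{service_account}-binding"},
        "subjects": [
            {"kind": "ServiceAccount",
             "name": service_account,
             "namespace": namespace},
        ],
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": cluster_role,
        },
    }


binding = agent_cluster_role_binding("cloudpilot", "cloudpilot-agent-example")
print(json.dumps(binding, indent=2))
```

A ServiceAccount subject always needs its namespace, which is why the deployment namespace appears in the binding even though the permissions themselves are cluster-scoped.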

Commitment to Privacy and Security

CloudPilot AI adheres strictly to data protection best practices. We access only the data necessary for operation and optimization, and we never request or retain unnecessary privileges.

By design, we reduce attack surfaces through scoped permissions and continuous review of access requirements. This ensures secure, compliant, and efficient operation within your environment.

For further assistance, feel free to reach out to us through our Slack channel.
