Custom IAM Roles for EKS

This guide explains how to configure custom IAM roles for CloudPilot on EKS, and how to use those roles during installation, migration, and upgrade.

This guide applies when you install CloudPilot on EKS with CUSTOM_NODE_ROLE and/or CUSTOM_CONTROLLER_ROLE.

Behavior When Custom Roles Are Provided

When a custom IAM role is provided:

  1. The installer no longer updates the custom role trust policy.
  2. The installer no longer attaches, detaches, or rewrites policies on the custom role.
  3. The installer validates that the custom role already satisfies the minimum CloudPilot requirements and fails fast if it does not.
  4. For a custom node role, the installer does not require the role to inherit permissions from the EKS managed node group role. It only validates the minimum permissions listed below.

The installer still manages non-role AWS resources such as cluster access entries, subnet/security-group tags, and the EKS OIDC provider when UPDATE_AWS_RESOURCE=true.

The IAM identity that runs the installer must also be allowed to call iam:SimulatePrincipalPolicy on the custom roles. Without that permission, the installer cannot complete the validation step.
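
As a minimal sketch of that permission, the helper below prints a policy document that allows iam:SimulatePrincipalPolicy on specific role ARNs. The function name print_simulate_policy, the Sid, and the choice to scope Resource to the custom roles are assumptions made for illustration; they are not taken from the CloudPilot installer.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of CloudPilot): prints an IAM policy document
# that allows iam:SimulatePrincipalPolicy on the given role ARNs only.
# Usage: print_simulate_policy <role-arn> [<role-arn> ...]
print_simulate_policy() {
  local resources="" arn
  for arn in "$@"; do
    # Build a comma-separated list of quoted ARNs for the Resource array.
    resources="${resources:+$resources, }\"$arn\""
  done
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowSimulateOnCustomRoles",
      "Effect": "Allow",
      "Action": "iam:SimulatePrincipalPolicy",
      "Resource": [ $resources ]
    }
  ]
}
EOF
}
```

For example, once the variables from Prepare the Role Files are exported, print_simulate_policy "$NODE_ROLE_ARN" "arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/${CONTROLLER_ROLE_NAME}" writes a document that can be attached to the identity running the installer.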

Prepare the Role Files

Whether you apply the roles with AWS CLI or paste the policies in AWS Console, start by exporting the same environment variables and generating the same JSON files.

The examples below use envsubst to render shell variables into JSON files. If envsubst is not available on your machine, install GNU gettext first.

1. Export the required environment variables

Replace the cluster and role name placeholders below with your own values. AWS_PARTITION can usually stay at its default of aws:

export AWS_PARTITION=${AWS_PARTITION:-aws}
export CLUSTER_NAME="<your-cluster-name>"
export CLUSTER_REGION="<your-cluster-region>"
export NODE_ROLE_NAME="<your-node-role-name>"
export CONTROLLER_ROLE_NAME="<your-controller-role-name>"
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query 'Account' --output text)
export OIDC_PROVIDER_HOSTPATH=$(
  aws eks describe-cluster \
    --name "$CLUSTER_NAME" \
    --region "$CLUSTER_REGION" \
    --query 'cluster.identity.oidc.issuer' \
    --output text | sed 's#^https://##'
)
export NODE_ROLE_ARN="arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/${NODE_ROLE_NAME}"
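
Because envsubst renders an unset variable as an empty string, a missing value produces a malformed ARN rather than an error. The sketch below is one way to catch that before generating the files. The function name check_required_vars is hypothetical, and the "None" check assumes the AWS CLI text output for a null field (for example, a cluster without an OIDC issuer).

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of CloudPilot): prints the name of
# every required variable that is unset, empty, or the literal string "None"
# (what the AWS CLI prints with --output text when a queried field is null).
# Returns 0 when all variables look usable, 1 otherwise.
check_required_vars() {
  local name value status=0
  for name in AWS_PARTITION CLUSTER_NAME CLUSTER_REGION NODE_ROLE_NAME \
              CONTROLLER_ROLE_NAME AWS_ACCOUNT_ID OIDC_PROVIDER_HOSTPATH \
              NODE_ROLE_ARN; do
    # Read the variable named by $name; the names are fixed above,
    # so eval is not handling untrusted input here.
    eval "value=\${$name-}"
    if [ -z "$value" ] || [ "$value" = "None" ]; then
      echo "$name"
      status=1
    fi
  done
  return "$status"
}
```

Running check_required_vars after the exports prints nothing and returns 0 when every value is present. Any name it prints should be fixed before continuing.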

2. Generate the JSON files

Use > instead of >> so that rerunning the command overwrites the old file instead of appending a second JSON document.

cat <<'EOF' | envsubst > node-role-trust-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

cat <<'EOF' | envsubst > controller-role-trust-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER_HOSTPATH}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER_HOSTPATH}:aud": "sts.amazonaws.com",
          "${OIDC_PROVIDER_HOSTPATH}:sub": "system:serviceaccount:cloudpilot:cloudpilot-admin"
        }
      }
    }
  ]
}
EOF

cat <<'EOF' | envsubst > controller-role-minimum-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "CloudPilotReadAutoscalingAndNodeGroup",
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeScalingActivities",
        "autoscaling:DescribeTags",
        "ec2:DescribeImages",
        "ec2:DescribeInstanceTypes",
        "ec2:DescribeLaunchTemplateVersions",
        "ec2:GetInstanceTypesFromInstanceRequirements",
        "eks:DescribeNodegroup"
      ],
      "Resource": "*"
    },
    {
      "Sid": "CloudPilotMutateAutoscaling",
      "Effect": "Allow",
      "Action": [
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup",
        "autoscaling:UpdateAutoScalingGroup"
      ],
      "Resource": "*"
    },
    {
      "Sid": "CloudPilotProvisionEC2Capacity",
      "Effect": "Allow",
      "Action": [
        "ssm:GetParameter",
        "ec2:DescribeImages",
        "ec2:RunInstances",
        "ec2:DescribeSubnets",
        "ec2:DescribeSecurityGroups",
        "ec2:DescribeLaunchTemplates",
        "ec2:DescribeInstances",
        "ec2:DescribeInstanceTypes",
        "ec2:DescribeInstanceTypeOfferings",
        "ec2:DescribeAvailabilityZones",
        "ec2:DeleteLaunchTemplate",
        "ec2:CreateTags",
        "ec2:CreateLaunchTemplate",
        "ec2:CreateFleet",
        "ec2:DescribeSpotPriceHistory",
        "pricing:GetProducts",
        "savingsplans:DescribeSavingsPlans",
        "ec2:DescribeRegions"
      ],
      "Resource": "*"
    },
    {
      "Sid": "CloudPilotTerminateClusterInstancesOnly",
      "Effect": "Allow",
      "Action": "ec2:TerminateInstances",
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "ec2:ResourceTag/kubernetes.io/cluster/${CLUSTER_NAME}": [
            "owned",
            "shared"
          ]
        }
      }
    },
    {
      "Sid": "CloudPilotPassNodeRole",
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": "${NODE_ROLE_ARN}"
    },
    {
      "Sid": "CloudPilotDescribeCluster",
      "Effect": "Allow",
      "Action": "eks:DescribeCluster",
      "Resource": "arn:${AWS_PARTITION}:eks:${CLUSTER_REGION}:${AWS_ACCOUNT_ID}:cluster/${CLUSTER_NAME}"
    },
    {
      "Sid": "CloudPilotCreateScopedInstanceProfiles",
      "Effect": "Allow",
      "Action": [
        "iam:CreateInstanceProfile"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:RequestTag/kubernetes.io/cluster/${CLUSTER_NAME}": "owned",
          "aws:RequestTag/topology.kubernetes.io/region": "${CLUSTER_REGION}"
        },
        "StringLike": {
          "aws:RequestTag/karpenter.k8s.aws/ec2nodeclass": "*"
        }
      }
    },
    {
      "Sid": "CloudPilotTagScopedInstanceProfiles",
      "Effect": "Allow",
      "Action": [
        "iam:TagInstanceProfile"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:ResourceTag/kubernetes.io/cluster/${CLUSTER_NAME}": "owned",
          "aws:ResourceTag/topology.kubernetes.io/region": "${CLUSTER_REGION}",
          "aws:RequestTag/kubernetes.io/cluster/${CLUSTER_NAME}": "owned",
          "aws:RequestTag/topology.kubernetes.io/region": "${CLUSTER_REGION}"
        },
        "StringLike": {
          "aws:ResourceTag/karpenter.k8s.aws/ec2nodeclass": "*",
          "aws:RequestTag/karpenter.k8s.aws/ec2nodeclass": "*"
        }
      }
    },
    {
      "Sid": "CloudPilotManageScopedInstanceProfiles",
      "Effect": "Allow",
      "Action": [
        "iam:AddRoleToInstanceProfile",
        "iam:RemoveRoleFromInstanceProfile",
        "iam:DeleteInstanceProfile"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:ResourceTag/kubernetes.io/cluster/${CLUSTER_NAME}": "owned",
          "aws:ResourceTag/topology.kubernetes.io/region": "${CLUSTER_REGION}"
        },
        "StringLike": {
          "aws:ResourceTag/karpenter.k8s.aws/ec2nodeclass": "*"
        }
      }
    },
    {
      "Sid": "CloudPilotReadInstanceProfiles",
      "Effect": "Allow",
      "Action": "iam:GetInstanceProfile",
      "Resource": "*"
    }
  ]
}
EOF

Apply the Custom Roles

After the JSON files are generated, choose either the AWS CLI workflow or the AWS Console workflow.

Option 1: Apply with AWS CLI

This path is recommended because it uses AWS managed policies for the node role and a generated inline policy for the controller role.

aws iam update-assume-role-policy \
  --role-name "$NODE_ROLE_NAME" \
  --policy-document file://node-role-trust-policy.json

aws iam attach-role-policy --role-name "$NODE_ROLE_NAME" --policy-arn "arn:${AWS_PARTITION}:iam::aws:policy/AmazonEKSWorkerNodePolicy"
aws iam attach-role-policy --role-name "$NODE_ROLE_NAME" --policy-arn "arn:${AWS_PARTITION}:iam::aws:policy/AmazonEKS_CNI_Policy"
aws iam attach-role-policy --role-name "$NODE_ROLE_NAME" --policy-arn "arn:${AWS_PARTITION}:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
aws iam attach-role-policy --role-name "$NODE_ROLE_NAME" --policy-arn "arn:${AWS_PARTITION}:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy"

aws iam update-assume-role-policy \
  --role-name "$CONTROLLER_ROLE_NAME" \
  --policy-document file://controller-role-trust-policy.json

aws iam put-role-policy \
  --role-name "$CONTROLLER_ROLE_NAME" \
  --policy-name CloudPilotControllerMinimumPolicy \
  --policy-document file://controller-role-minimum-policy.json
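
To confirm that the node role ended up with all four managed policies, a small filter like the one below can compare the attached policy names against the list in this guide. The function name missing_node_policies is hypothetical; this is only a sketch and not the check the installer performs.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of CloudPilot): reads attached policy names
# (separated by spaces, tabs, or newlines) on stdin and prints each managed
# policy from this guide that is not among them.
missing_node_policies() {
  local attached required
  # Normalize all whitespace to single spaces and pad both ends so that
  # whole-word matching with " name " works for the first and last entries.
  attached=" $(tr -s '[:space:]' ' ') "
  for required in AmazonEKSWorkerNodePolicy AmazonEKS_CNI_Policy \
                  AmazonEC2ContainerRegistryReadOnly AmazonEBSCSIDriverPolicy; do
    case "$attached" in
      *" $required "*) ;;        # already attached
      *) echo "$required" ;;     # not found in the input
    esac
  done
}
```

One way to feed it is aws iam list-attached-role-policies --role-name "$NODE_ROLE_NAME" --query 'AttachedPolicies[].PolicyName' --output text | missing_node_policies. Empty output means nothing from the list is missing.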

Option 2: Apply in AWS Console

Use the same generated files from the Prepare the Role Files section.

Node role

  1. Open IAM -> Roles -> your node role.
  2. Open Trust relationships -> Edit trust policy.
  3. Paste the content of node-role-trust-policy.json, and save.
  4. Open Permissions.
  5. Attach these AWS managed policies:
    • AmazonEKSWorkerNodePolicy
    • AmazonEKS_CNI_Policy
    • AmazonEC2ContainerRegistryReadOnly
    • AmazonEBSCSIDriverPolicy

Controller role

  1. Open IAM -> Roles -> your controller role.
  2. Open Trust relationships -> Edit trust policy.
  3. Paste the content of controller-role-trust-policy.json, and save.
  4. Open Permissions.
  5. Create an inline policy from the content of controller-role-minimum-policy.json.

Role Requirements Reference

Use this section to understand what the generated files and managed policies are meant to satisfy.

Node role requirements

The node role must trust EC2 and include the minimum permissions below.

| Requirement | Minimum permissions | Why CloudPilot needs it |
| --- | --- | --- |
| Trust policy | sts:AssumeRole from ec2.amazonaws.com | Lets EC2 instances launched by CloudPilot assume the role |
| Cluster bootstrap | Permissions provided by AmazonEKSWorkerNodePolicy | Lets worker nodes discover cluster metadata and join the cluster |
| VPC CNI | Permissions provided by AmazonEKS_CNI_Policy | Lets the AWS VPC CNI manage ENIs and secondary IPs |
| ECR image pull | Permissions provided by AmazonEC2ContainerRegistryReadOnly | Lets nodes pull container images from ECR |
| EBS CSI | Permissions provided by AmazonEBSCSIDriverPolicy | Lets the EBS CSI driver manage EBS volumes used by workloads |

Controller role requirements

The controller role must trust the CloudPilot service account through the cluster OIDC provider and include the minimum permissions below.

| Requirement | Minimum permissions | Why CloudPilot needs it |
| --- | --- | --- |
| Trust policy | sts:AssumeRoleWithWebIdentity from the cluster OIDC provider, restricted to system:serviceaccount:cloudpilot:cloudpilot-admin and aud=sts.amazonaws.com | Lets the CloudPilot controller assume the role through IRSA |
| Read-only cluster and autoscaling discovery | autoscaling:DescribeAutoScalingGroups, autoscaling:DescribeAutoScalingInstances, autoscaling:DescribeLaunchConfigurations, autoscaling:DescribeScalingActivities, autoscaling:DescribeTags, ec2:DescribeImages, ec2:DescribeInstanceTypes, ec2:DescribeLaunchTemplateVersions, ec2:GetInstanceTypesFromInstanceRequirements, eks:DescribeNodegroup, eks:DescribeCluster | Lets the controller inspect node groups, launch templates, instance types, and cluster metadata |
| Autoscaling mutations | autoscaling:SetDesiredCapacity, autoscaling:TerminateInstanceInAutoScalingGroup, autoscaling:UpdateAutoScalingGroup | Lets the controller rebalance existing node groups |
| EC2 provisioning | ssm:GetParameter, ec2:RunInstances, ec2:DescribeSubnets, ec2:DescribeSecurityGroups, ec2:DescribeLaunchTemplates, ec2:DescribeInstances, ec2:DescribeInstanceTypes, ec2:DescribeInstanceTypeOfferings, ec2:DescribeAvailabilityZones, ec2:DeleteLaunchTemplate, ec2:CreateTags, ec2:CreateLaunchTemplate, ec2:CreateFleet, ec2:DescribeSpotPriceHistory, pricing:GetProducts, savingsplans:DescribeSavingsPlans, ec2:DescribeRegions | Lets the controller calculate capacity options and create EC2 capacity |
| Terminate CloudPilot-managed nodes | ec2:TerminateInstances with ec2:ResourceTag/kubernetes.io/cluster/${CLUSTER_NAME} equal to owned or shared | Limits direct EC2 termination to cluster-owned/shared nodes |
| Pass node role | iam:PassRole on ${NODE_ROLE_ARN} | Lets the controller launch instances with the node IAM role |
| Instance profile lifecycle | iam:CreateInstanceProfile, iam:TagInstanceProfile, iam:AddRoleToInstanceProfile, iam:RemoveRoleFromInstanceProfile, iam:DeleteInstanceProfile, iam:GetInstanceProfile with the tag conditions used in the generated policy | Lets the controller manage Karpenter instance profiles safely inside the cluster scope |

Use Custom Roles During Installation, Migration, and Upgrade

After the custom roles have been prepared, export the CloudPilot installation variables and make CloudPilot use the custom role names.

export CUSTOM_NODE_ROLE="$NODE_ROLE_NAME"
export CUSTOM_CONTROLLER_ROLE="$CONTROLLER_ROLE_NAME"

Scenario 1: Fresh install with custom roles

If the cluster has not run CloudPilot phase2 yet, export the variables above before running the phase2 install script.

Important notes:

  1. Run the role-preparation steps in the earlier sections first.
  2. Run phase1 before phase2, as usual.
  3. When CUSTOM_NODE_ROLE and CUSTOM_CONTROLLER_ROLE are set, the installer validates those roles and uses them directly instead of creating and managing the default CloudPilot roles.

Scenario 2: Migrate an existing cluster from default roles to custom roles

If the cluster is already installed with the default CloudPilot roles, you can migrate it by re-running phase2 with the custom role variables exported.

This updates the phase2 installation to reference the custom roles. The script validates the custom roles but does not modify them.

Scenario 3: Use the upgrade script while upgrading to a newer version

If you are already planning to upgrade CloudPilot to a newer version, export the custom role variables before running the EKS upgrade script. When the upgrade script reaches the target version’s phase2 install step, that phase2 run will use the custom roles.

Important limitations:

  1. upgrade.sh only runs phase2 when there is an actual version transition to apply.
  2. upgrade.sh may set UPDATE_AWS_RESOURCE automatically from its internal version matrix when the user does not provide it.
  3. For custom-role installation or migration, explicitly set UPDATE_AWS_RESOURCE=true so that phase2 also updates cluster access entries or aws-auth mappings for the custom node role.
  4. If the cluster is already on the latest target version and you only want to switch roles, upgrade.sh is not enough.
  5. In that case, re-run the current version’s phase2 install script directly, as shown in Scenario 2.
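
The limitations above come down to a few environment variables being set before the script runs. A minimal sketch of a pre-flight check follows. The function name check_custom_role_env and its messages are hypothetical, and it only inspects the variables named in this guide.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of CloudPilot) to run before a
# phase2 or upgrade run that should use custom roles.
# Prints one line per problem and returns non-zero if any are found.
check_custom_role_env() {
  local status=0
  # At least one custom role variable must be provided.
  if [ -z "${CUSTOM_NODE_ROLE-}" ] && [ -z "${CUSTOM_CONTROLLER_ROLE-}" ]; then
    echo "neither CUSTOM_NODE_ROLE nor CUSTOM_CONTROLLER_ROLE is set"
    status=1
  fi
  # This guide asks for an explicit UPDATE_AWS_RESOURCE=true on role changes.
  if [ "${UPDATE_AWS_RESOURCE-}" != "true" ]; then
    echo "UPDATE_AWS_RESOURCE is not set to true"
    status=1
  fi
  return "$status"
}
```

Calling check_custom_role_env in the same shell that will run the script returns 0 and prints nothing when the environment matches the recommendations in this section.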

Final step: update the NodeClass role in CloudPilot Console

After any of the scenarios above, update the NodeClass configuration in CloudPilot Console so newly provisioned EC2 nodes use the custom node role.

  1. Open CloudPilot Console and go to the cluster’s Node Autoscaler configuration.
  2. Open the NodeClass used by your NodePool.
  3. Set Role to your custom node role name, for example clusterall-noderole.
  4. Save the NodeClass.

What happens to the old default CloudPilot roles

After a successful migration, CloudPilot will start using the custom roles for future phase2 runs and controller/node provisioning flows.

The migration step does not automatically delete the old default CloudPilot IAM roles. If you want to remove the old default roles, do that only after you have confirmed:

  1. The controller is running with the custom controller role.
  2. New nodes launched by CloudPilot are using the custom node role.
  3. The cluster is healthy after the migration.
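
The first confirmation can be sketched as a comparison between a role ARN and the expected role name. The helper names role_name_from_arn and uses_role are hypothetical. Reading the ARN from the service account annotation, as in the usage note, assumes the controller is wired through the standard IRSA annotation eks.amazonaws.com/role-arn, which this guide does not state explicitly.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of CloudPilot): prints the role name from a
# role ARN, that is, everything after the last "/".
role_name_from_arn() {
  printf '%s\n' "${1##*/}"
}

# Hypothetical check: succeeds when the ARN names the expected role.
# Usage: uses_role <role-arn> <expected-role-name>
uses_role() {
  [ "$(role_name_from_arn "$1")" = "$2" ]
}
```

For example, uses_role "$(kubectl -n cloudpilot get serviceaccount cloudpilot-admin -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}')" "$CUSTOM_CONTROLLER_ROLE" succeeds only when the annotated role is the custom controller role. The namespace and service account name come from the trust policy generated earlier.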

Validation Summary

When a custom role is provided, the installer validates:

  1. The role exists.
  2. The trust policy matches the required principal and conditions.
  3. The role can actually perform the minimum required actions.
  4. For the controller role, iam:PassRole targets the exact node role ARN that CloudPilot will use.

If any of these checks fail, the installer exits before changing the cluster installation state.
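
To approximate the third check by hand, the IAM policy simulator can be queried from the AWS CLI. The wrapper below is a sketch under stated assumptions: the function name denied_actions is hypothetical, and it is not the installer's own validation code. Actions guarded by tag conditions in the generated policy (for example ec2:TerminateInstances) may be reported as denied unless matching context values are supplied with --context-entries.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of CloudPilot): prints the actions from the
# list that the role is not allowed to perform according to the IAM policy
# simulator. The caller needs iam:SimulatePrincipalPolicy on the role.
# Usage: denied_actions <role-arn> <action> [<action> ...]
denied_actions() {
  local role_arn="$1"
  shift
  # Keep only results whose decision is not "allowed" and print their names.
  aws iam simulate-principal-policy \
    --policy-source-arn "$role_arn" \
    --action-names "$@" \
    --query "EvaluationResults[?EvalDecision!='allowed'].EvalActionName" \
    --output text
}
```

For example, denied_actions "arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/${CONTROLLER_ROLE_NAME}" ec2:RunInstances ec2:CreateFleet autoscaling:SetDesiredCapacity prints nothing when all three actions are allowed.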
