Set up a cluster and add Karpenter

Karpenter automatically provisions new nodes in response to unschedulable pods. Karpenter does this by observing events within the Kubernetes cluster, and then sending commands to the underlying cloud provider.

This guide shows how to get started with Karpenter by installing it into a cluster and exercising it with a test workload. To use Karpenter, you must be running a supported Kubernetes cluster on Alibaba Cloud.

The steps below use the Karpenter provider for Alibaba Cloud with ACK (Alibaba Cloud Container Service for Kubernetes).

Step 1. Install utilities

Karpenter requires cloud provider permissions to provision nodes. For now, it authenticates with an AccessKey ID and AccessKey secret (AK/SK) stored as a Secret in the cluster; the project plans to adopt RRSA (RAM Roles for Service Accounts) to improve credential security.

Install the command-line tools used in this guide before proceeding: the Alibaba Cloud CLI (aliyun), kubectl, helm, and jq.

Run the following command to configure the Alibaba Cloud CLI with your AK/SK credentials:

aliyun configure
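Before moving on, it can help to confirm that the tools are actually on your PATH. The following is a minimal sketch, assuming the tool list is the one used by the commands in this guide (aliyun, kubectl, helm, jq); missing_tools is a hypothetical helper name, not part of any of these CLIs.

```shell
# Print the name of every tool in the argument list that is not on PATH.
# missing_tools is a hypothetical helper written for this guide.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumption: these are the tools the rest of this guide relies on.
missing=$(missing_tools aliyun kubectl helm jq)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing >&2
else
  echo "All required tools found."
fi
```

If the check reports a missing tool, install it before continuing with the next step.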

Step 2. Prepare an Alibaba Cloud ACK cluster

Note: If you already have an ACK cluster, skip this step.

To create a cluster quickly, see Create ACK cluster with Terraform.

Step 3. Configure the environment variables

Run the following commands to set the corresponding environment variables:

export CLUSTER_NAME=<your_cluster_name>             # Name of your ACK cluster
export CLUSTER_REGION=<your_cluster_region>         # Alibaba Cloud region ID of the cluster
export ALIBABACLOUD_AK=<alibaba_cloud_access_key>   # Alibaba Cloud AccessKey ID (AK)
export ALIBABACLOUD_SK=<alibaba_cloud_secret_key>   # Alibaba Cloud AccessKey secret (SK)

# Look up the cluster ID by cluster name
export CLUSTER_ID=$(aliyun cs GET /clusters | jq -r --arg CLUSTER_NAME "$CLUSTER_NAME" '.[] | select(.name == $CLUSTER_NAME) | .cluster_id')

# Look up the cluster's vSwitch IDs (one per line)
export VSWITCH_IDS=$(aliyun cs GET /api/v1/clusters --header "Content-Type=application/json;" | jq -r --arg CLUSTER_NAME "$CLUSTER_NAME" '.clusters[] | select(.name == $CLUSTER_NAME) | .vswitch_ids[]')

# Look up the cluster's security group ID
export SECURITYGROUP_ID=$(aliyun cs GET /api/v1/clusters --header "Content-Type=application/json;" | jq -r --arg CLUSTER_NAME "$CLUSTER_NAME" '.clusters[] | select(.name == $CLUSTER_NAME) | .security_group_id')
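If the cluster name does not match any cluster, the jq select filters produce no output and the looked-up variables end up empty, which only surfaces later as a confusing tagging or install error. A small check such as the sketch below can catch that early; unset_vars is a hypothetical helper, not something provided by the aliyun CLI.

```shell
# Print the name of every listed variable that is unset or empty.
# unset_vars is a hypothetical helper written for this guide.
unset_vars() {
  for name in "$@"; do
    # Indirectly read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}

# List any variables from this step that still need a value.
unset_vars CLUSTER_NAME CLUSTER_REGION ALIBABACLOUD_AK ALIBABACLOUD_SK \
  CLUSTER_ID VSWITCH_IDS SECURITYGROUP_ID
```

No output means every variable has a value. Any name that is printed should be fixed before continuing.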

Karpenter finds the security group and vSwitches to use by their tags. Run the following commands to tag them:

# Tag the security group
aliyun tag TagResources --region ${CLUSTER_REGION} --RegionId ${CLUSTER_REGION} --ResourceARN.1 "acs:ecs:*:*:securitygroup/${SECURITYGROUP_ID}" --Tags "{\"karpenter.sh/discovery\": \"$CLUSTER_NAME\"}"

# Tag each vSwitch (VSWITCH_IDS holds one ID per line)
while IFS= read -r vs_id; do
  aliyun tag TagResources --region ${CLUSTER_REGION} --RegionId ${CLUSTER_REGION} --ResourceARN.1 "acs:vpc:*:*:vswitch/${vs_id}" --Tags "{\"karpenter.sh/discovery\": \"$CLUSTER_NAME\"}"
done <<< "$VSWITCH_IDS"
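The backslash-escaped JSON and the ARN strings in the tagging commands are easy to mistype. One option is to build them with printf, as in the sketch below. discovery_tags and resource_arn are hypothetical helper names, and the sketch assumes the cluster name contains no double quotes or backslashes.

```shell
# Build the JSON tag map that marks a resource for Karpenter discovery.
# Assumes the cluster name needs no JSON escaping.
discovery_tags() {
  printf '{"karpenter.sh/discovery": "%s"}' "$1"
}

# Build a resource ARN in the form used by the tagging commands.
# $1 = product (ecs or vpc), $2 = resource type, $3 = resource ID.
resource_arn() {
  printf 'acs:%s:*:*:%s/%s' "$1" "$2" "$3"
}

# Example with placeholder values (not real resources).
discovery_tags "demo-cluster"; echo
resource_arn vpc vswitch "vsw-example"; echo
```

With these helpers, the flags in the tagging commands could be written as --ResourceARN.1 "$(resource_arn vpc vswitch "$vs_id")" and --Tags "$(discovery_tags "$CLUSTER_NAME")".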

Step 5. Install Karpenter

Run the following commands to install the latest version of the Karpenter provider for Alibaba Cloud:

helm repo add karpenter-provider-alibabacloud https://cloudpilot-ai.github.io/karpenter-provider-alibabacloud

# 0.0.1 is the latest development version
helm upgrade karpenter karpenter-provider-alibabacloud/karpenter --install --version 0.0.1 \
  --namespace karpenter-system --create-namespace \
  --set "alibabacloud.access_key_id=${ALIBABACLOUD_AK}" \
  --set "alibabacloud.access_key_secret=${ALIBABACLOUD_SK}" \
  --set "alibabacloud.region_id=${CLUSTER_REGION}" \
  --set "controller.settings.clusterID=${CLUSTER_ID}" \
  --wait
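Although --wait makes Helm block until the release's resources are ready, it can still be useful to check the controller pods afterwards. The sketch below filters the output of kubectl get pods --no-headers and prints any pod that is not in the Running state; not_running is a hypothetical helper, and it assumes the default column layout where STATUS is the third column.

```shell
# Read `kubectl get pods --no-headers` output on stdin and print the name
# of every pod whose STATUS column (third field) is not "Running".
# not_running is a hypothetical helper written for this guide.
not_running() {
  awk '$3 != "Running" { print $1 }'
}

# Intended usage against the cluster (left commented so the sketch is
# self-contained):
#   kubectl get pods --namespace karpenter-system --no-headers | not_running
```

An empty result means every pod in the namespace is Running.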

Step 6. Create NodePool/ECSNodeClass

Create the following ECSNodeClass and NodePool:

cat > nodeclass.yaml <<EOF
apiVersion: karpenter.k8s.alibabacloud/v1alpha1
kind: ECSNodeClass
metadata:
  name: defaultnodeclass
spec:
  vSwitchSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}" # filled in with your cluster name
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}" # filled in with your cluster name
  imageSelectorTerms:
    # ContainerOS only supports x86_64 Linux nodes, and it is faster to initialize
    - alias: ContainerOS
EOF

kubectl apply -f nodeclass.yaml
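Because the heredoc delimiter EOF is unquoted, the shell substitutes ${CLUSTER_NAME} at the moment the file is written, so the variable from Step 3 must be set in the same shell session. The short sketch below illustrates the difference between an unquoted and a quoted delimiter, using a hypothetical variable DEMO_CLUSTER_NAME so that it does not overwrite your real settings.

```shell
DEMO_CLUSTER_NAME=demo-cluster   # hypothetical value for illustration only

# Unquoted delimiter: the shell expands the variable.
expanded=$(cat <<EOF
karpenter.sh/discovery: "${DEMO_CLUSTER_NAME}"
EOF
)

# Quoted delimiter: the text is kept literally, with no expansion.
literal=$(cat <<'EOF'
karpenter.sh/discovery: "${DEMO_CLUSTER_NAME}"
EOF
)

echo "$expanded"
echo "$literal"
```

If the generated nodeclass.yaml contains an empty discovery tag value, CLUSTER_NAME was most likely unset when the file was created.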


cat > nodepool.yaml <<EOF
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: ecsnodepool
spec:
  disruption:
    budgets:
      - nodes: 95%
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
      nodeClassRef:
        group: "karpenter.k8s.alibabacloud"
        kind: ECSNodeClass
        name: defaultnodeclass
EOF

kubectl apply -f nodepool.yaml

Step 7. Scale up deployment

Run the following commands to create a test deployment and trigger a node scale-up:

cat > deployment.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 1
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: karpenter.sh/capacity-type
                    operator: Exists
      securityContext:
        runAsUser: 1000
        runAsGroup: 3000
        fsGroup: 2000
      containers:
        - image: public.ecr.aws/eks-distro/kubernetes/pause:3.2
          name: inflate
          resources:
            requests:
              cpu: 250m
              memory: 250Mi
          securityContext:
            allowPrivilegeEscalation: false
EOF

kubectl apply -f deployment.yaml

About one minute after the deployment is applied, a new node is created and the pod is scheduled onto it:

$ kubectl get nodes
NAME                       STATUS   ROLES    AGE   VERSION
cn-shenzhen.172.16.1.58    Ready    <none>   22m   v1.31.1-aliyun.1
cn-shenzhen.172.16.2.142   Ready    <none>   22m   v1.31.1-aliyun.1
cn-shenzhen.172.16.5.18    Ready    <none>   43s   v1.31.1-aliyun.1

$ kubectl get po -l app=inflate -owide
NAME                       READY   STATUS    RESTARTS   AGE     IP           NODE                      NOMINATED NODE   READINESS GATES
inflate-7758687f95-5r42m   1/1     Running   0          3m14s   10.169.2.2   cn-shenzhen.172.16.5.18   <none>           <none>
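Instead of re-running these commands by hand until the new node appears, you could poll with a small retry loop. The sketch below is one way to do that; retry is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary choices rather than recommended values.

```shell
# Run a command repeatedly until it succeeds or the attempts run out.
# usage: retry <attempts> <delay_seconds> <command> [args...]
# retry is a hypothetical helper; it uses plain (global) shell variables.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    i=$((i + 1))
    # Sleep only if another attempt follows.
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Intended usage against the cluster (left commented so the sketch is
# self-contained):
#   retry 30 10 kubectl wait --for=condition=Ready pod -l app=inflate --timeout=10s
```

The function returns 0 as soon as the command succeeds and 1 if every attempt fails, so it can be chained with && before the kubectl get commands above.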

Step 8. Scale down deployment

kubectl scale deployment/inflate --replicas=0

After the deployment is scaled down, Karpenter deletes the new node after some time:

$ kubectl get nodes
NAME                       STATUS   ROLES    AGE   VERSION
cn-shenzhen.172.16.1.58    Ready    <none>   22m   v1.31.1-aliyun.1
cn-shenzhen.172.16.2.142   Ready    <none>   22m   v1.31.1-aliyun.1

Step 9. Uninstall Karpenter

You can run the following commands to uninstall Karpenter:

helm uninstall karpenter --namespace karpenter-system
kubectl delete ns karpenter-system