Say Hello to AWS Karpenter
AWS Karpenter is an open-source, flexible, high-performance Kubernetes cluster autoscaler. It helps improve application availability and cluster efficiency by rapidly launching right-sized compute resources in response to changing application load.
Karpenter also manages the entire node lifecycle, automatically terminating nodes when they’re no longer needed. Karpenter uses SQS queues to handle EC2 Spot Instance interruption notices, ensuring your applications have time to gracefully handle instance terminations.
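For example, when Karpenter is installed via Helm (as in the Terraform walkthrough later in this post), the interruption queue is wired in through chart values roughly like the following. This is only a sketch; the cluster and queue names are placeholders, not values from this setup:

settings:
  clusterName: my-cluster                  # placeholder: your EKS cluster name
  interruptionQueue: Karpenter-my-cluster  # placeholder: SQS queue receiving Spot interruption notices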
Karpenter's key feature is just-in-time provisioning: it can provision new nodes in seconds, much faster than traditional autoscalers. Imagine a sudden spike in traffic to your e-commerce website during a flash sale. Karpenter can quickly spin up new nodes to handle the increased load, ensuring your customers have a smooth shopping experience.
Additionally, Karpenter works with a wide variety of instance types, sizes, and purchase options (On-Demand, Spot, etc.). You can configure Karpenter to use a mix of c5.xlarge, m5.2xlarge, and r5.large instances based on your workload needs, and it will automatically choose the most suitable and cost-effective option.
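In Provisioner terms, that mix can be expressed as a requirement on the well-known instance-type label. A minimal fragment (using the same v1alpha5 API as the example further below) might look like this; the exact list is up to you:

requirements:
  - key: node.kubernetes.io/instance-type
    operator: In
    values: ["c5.xlarge", "m5.2xlarge", "r5.large"]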
Karpenter also uses sophisticated bin-packing algorithms to maximize resource utilization. If you have a pod that requires 1.5 CPU cores and 3 GB of memory, Karpenter might place it on a c5.large instance (2 CPU, 4 GB RAM) instead of a larger, more expensive instance, optimizing your costs. It can also automatically remove nodes that are no longer needed, helping to reduce costs.
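For illustration, a pod with those requests might look like the sketch below (the name and image are made up for this example):

apiVersion: v1
kind: Pod
metadata:
  name: checkout          # hypothetical workload name
spec:
  containers:
    - name: app
      image: nginx        # placeholder image
      resources:
        requests:
          cpu: "1500m"    # 1.5 CPU cores
          memory: 3Gi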
Karpenter uses Kubernetes CRDs for configuration, making it easy to integrate with existing Kubernetes workflows. You can also configure Karpenter to use Spot Instances for your batch processing jobs; if a Spot Instance is interrupted, Karpenter will quickly provision a replacement so the job keeps running. Here is a minimal Provisioner (the legacy v1alpha5 API) that allows both Spot and On-Demand capacity:
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot", "on-demand"]
  limits:
    resources:
      cpu: 1000
  provider:
    instanceProfile: KarpenterNodeInstanceProfile-${CLUSTER_NAME}
Karpenter vs. Cluster Autoscaler:
While both tools aim to autoscale Kubernetes clusters, Karpenter offers several advantages:
- Faster scaling: Karpenter can provision nodes in seconds, compared to minutes for Cluster Autoscaler.
- More flexible: Karpenter can work with a wider variety of instance types and sizes.
- Better bin-packing: Karpenter’s algorithms result in more efficient resource utilization.
- Simpler configuration: Karpenter uses Kubernetes-native CRDs for configuration.
Specifically, Karpenter works by:
- Watching for pods that the Kubernetes scheduler has marked as unschedulable
- Evaluating scheduling constraints (resource requests, node selectors, affinities, tolerations, and topology spread constraints) requested by the pods
- Provisioning nodes that meet the requirements of the pods
- Scheduling the pods to run on the new nodes
- Removing the nodes when the nodes are no longer needed
Karpenter’s Disruption feature
The Disruption feature (previously known as Consolidation) is a powerful set of capabilities that helps maintain an efficient, cost-effective, and up-to-date Kubernetes cluster. It consists of three main components: Expiration, Drift, and Consolidation.
Expiration ensures that nodes in your cluster don’t become stale or outdated. By default, Karpenter will automatically expire instances after 720 hours (30 days), forcing a refresh of your nodes.
Imagine you’re running a financial services application that requires the latest security patches. With Expiration, you can ensure that no node in your cluster is older than 30 days, minimizing the risk of running outdated software.
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  ttlSecondsUntilExpired: 2592000 # 30 days in seconds
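In the newer v1beta1 NodePool API (which the karpenter.yaml manifest later in this post uses), the equivalent knob lives under the disruption block. A fragment of the NodePool spec, as a sketch:

spec:
  disruption:
    expireAfter: 720h   # 30 days; nodes older than this are replaced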
Drift detection allows Karpenter to identify and rectify discrepancies between the desired state (as defined in your NodePool and EC2NodeClass configurations) and the actual state of your nodes.
You might update your NodePool to use a new AMI with enhanced monitoring capabilities. Drift detection will identify existing nodes that don’t match this new configuration and gradually replace them, ensuring your entire cluster adopts the new capabilities without manual intervention.
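For example, if the EC2NodeClass selects AMIs by tag, changing that tag (or the AMI behind it) is exactly the kind of change Drift detects and reconciles. A fragment of the EC2NodeClass spec, as a sketch; the tag below is hypothetical and used only for illustration:

spec:
  amiFamily: AL2
  amiSelectorTerms:
    - tags:
        monitoring: enhanced   # hypothetical tag marking the new AMI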
Consolidation optimizes your cluster for cost-efficiency by intelligently packing workloads onto the most appropriate nodes and removing unnecessary capacity.
Let’s say your cluster has three nodes:
- Node A: 4 CPU, 16GB RAM, 50% utilized
- Node B: 4 CPU, 16GB RAM, 30% utilized
- Node C: 4 CPU, 16GB RAM, 20% utilized
Karpenter’s Consolidation might move all workloads to Nodes A and B, then terminate Node C, saving you the cost of an entire EC2 instance.
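This behavior is driven by the NodePool's disruption block. The manifest later in this post uses consolidationPolicy: WhenEmpty, which only removes empty nodes; to let Karpenter actively repack and shrink a cluster like the one above, the v1beta1 API also offers WhenUnderutilized. A fragment, as a sketch:

spec:
  disruption:
    consolidationPolicy: WhenUnderutilized   # repack workloads and remove underutilized nodes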
Let’s also say that your batch processing job is running on a fleet of Spot Instances. Karpenter uses the EC2 Fleet API to request instances based on your NodePool configuration. If AWS can’t fulfill the request for a specific instance type, Karpenter quickly adjusts:
- It might request alternative instance types specified in your NodePool.
- If necessary, it could relax soft constraints (like preferred Availability Zones) to ensure your workloads keep running.
Spot-to-Spot Consolidation is a specialized form of consolidation designed to optimize costs while maintaining high availability for workloads running on Spot Instances.
Suppose your NodePool configuration includes 15 different instance types across multiple families (c5, m5, r5, etc.). This gives Karpenter a wide pool of instance types to choose from, increasing the chances of fulfilling capacity requests.
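A NodePool aimed at Spot with a deliberately wide choice of instance types can be sketched with requirements like these (the exact families and sizes are up to your workload; three families times five sizes already gives Karpenter at least 15 candidate types, in line with the example above):

requirements:
  - key: karpenter.sh/capacity-type
    operator: In
    values: ["spot"]
  - key: karpenter.k8s.aws/instance-category
    operator: In
    values: ["c", "m", "r"]
  - key: karpenter.k8s.aws/instance-size
    operator: In
    values: ["large", "xlarge", "2xlarge", "4xlarge", "8xlarge"]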
AWS Fargate
AWS Fargate for EKS provides a serverless compute engine for Kubernetes that allows you to run pods without managing the underlying EC2 instances. It combines EKS (for the control plane) with Fargate (for the data plane), resulting in a fully managed, serverless Kubernetes environment.
Fargate also provides VM-level isolation for each pod, enhancing security. If you’re running a multi-tenant SaaS application, each customer’s workload can run in its own isolated environment, reducing the risk of noisy-neighbor issues or security breaches between tenants.
Fargate uses Fargate Profiles to determine which pods should run on Fargate. You can specify the subnets, namespaces, and labels for pods that should use Fargate. EKS’s scheduler decides which pods run on Fargate based on the Fargate Profile configuration.
Suppose you have a mixed cluster with both EC2 and Fargate. You can use a Fargate profile (together with node selectors or taints/tolerations) to control which pods run on Fargate:
apiVersion: eks.amazonaws.com/v1beta1
kind: FargateProfile
metadata:
  name: my-fargate-profile
  namespace: default
spec:
  selectors:
    - namespace: my-namespace
      labels:
        env: prod
        service: backend
  subnets:
    - subnet-1234567890abcdef0
    - subnet-234567890abcdef01
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fargate-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      nodeSelector:
        kubernetes.io/arch: amd64
        eks.amazonaws.com/compute-type: fargate
      containers:
        - name: app
          image: nginx
Fargate automatically scales the underlying compute resources based on your pods’ resource requirements. If your application suddenly receives a traffic spike, you don’t need to worry about scaling EC2 instances: simply increase the number of pod replicas, and Fargate will provision the necessary resources.
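If you want the replica count itself to follow load, you can put a HorizontalPodAutoscaler in front of the Fargate-backed Deployment. A sketch targeting the fargate-app Deployment above; it assumes the Kubernetes Metrics Server is installed in the cluster:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: fargate-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: fargate-app
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

Each new replica lands on its own Fargate-managed compute, so there are no nodes to scale yourself.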
Hands-on Practice
The provided Terraform code sets up the following components:
- An EKS cluster
- A VPC with public and private subnets
- Fargate profiles for running pods
- Karpenter for cluster auto-scaling
- Some EKS add-ons (CoreDNS, VPC CNI, kube-proxy)
More specifically, the configuration:
- Creates an EKS cluster running version 1.30
- Sets up Fargate profiles for the "karpenter" and "kube-system" namespaces
- Configures Karpenter using the eks-blueprints-addons module
- Sets up the necessary IAM roles and policies
provider "aws" {
region = local.region
}
# Required for public ECR where Karpenter artifacts are hosted
provider "aws" {
region = "us-east-1"
alias = "virginia"
}
provider "kubernetes" {
host = module.eks.cluster_endpoint
cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
exec {
api_version = "client.authentication.k8s.io/v1beta1"
command = "aws"
# This requires the awscli to be installed locally where Terraform is executed
args = ["eks", "get-token", "--cluster-name", module.eks.cluster_name]
}
}
provider "helm" {
kubernetes {
host = module.eks.cluster_endpoint
cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
exec {
api_version = "client.authentication.k8s.io/v1beta1"
command = "aws"
# This requires the awscli to be installed locally where Terraform is executed
args = ["eks", "get-token", "--cluster-name", module.eks.cluster_name]
}
}
}
data "aws_ecrpublic_authorization_token" "token" {
provider = aws.virginia
}
data "aws_availability_zones" "available" {}
locals {
name = "t101-${basename(path.cwd)}"
region = "ap-northeast-2"
vpc_cidr = "10.10.0.0/16"
azs = slice(data.aws_availability_zones.available.names, 0, 3)
tags = {
Blueprint = local.name
GithubRepo = "github.com/aws-ia/terraform-aws-eks-blueprints"
}
}
################################################################################
# Cluster
################################################################################

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.11"

  cluster_name                   = local.name
  cluster_version                = "1.30"
  cluster_endpoint_public_access = true

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets

  # Fargate profiles use the cluster primary security group so these are not utilized
  create_cluster_security_group = false
  create_node_security_group    = false

  enable_cluster_creator_admin_permissions = true

  fargate_profiles = {
    karpenter = {
      selectors = [
        { namespace = "karpenter" }
      ]
    }
    kube_system = {
      name = "kube-system"
      selectors = [
        { namespace = "kube-system" }
      ]
    }
  }

  tags = merge(local.tags, {
    # NOTE - if creating multiple security groups with this module, only tag the
    # security group that Karpenter should utilize with the following tag
    # (i.e. - at most, only one security group should have this tag in your account)
    "karpenter.sh/discovery" = local.name
  })
}
################################################################################
# EKS Blueprints Addons
################################################################################

module "eks_blueprints_addons" {
  source  = "aws-ia/eks-blueprints-addons/aws"
  version = "~> 1.16"

  cluster_name      = module.eks.cluster_name
  cluster_endpoint  = module.eks.cluster_endpoint
  cluster_version   = module.eks.cluster_version
  oidc_provider_arn = module.eks.oidc_provider_arn

  # We want to wait for the Fargate profiles to be deployed first
  create_delay_dependencies = [for prof in module.eks.fargate_profiles : prof.fargate_profile_arn]

  eks_addons = {
    coredns = {
      configuration_values = jsonencode({
        computeType = "Fargate"
        # Ensure that we fully utilize the minimum amount of resources that are supplied by
        # Fargate https://docs.aws.amazon.com/eks/latest/userguide/fargate-pod-configuration.html
        # Fargate adds 256 MB to each pod's memory reservation for the required Kubernetes
        # components (kubelet, kube-proxy, and containerd). Fargate rounds up to the following
        # compute configuration that most closely matches the sum of vCPU and memory requests in
        # order to ensure pods always have the resources that they need to run.
        resources = {
          limits = {
            cpu = "0.25"
            # We are targeting the smallest Task size of 512Mb, so we subtract 256Mb from the
            # request/limit to ensure we can fit within that task
            memory = "256M"
          }
          requests = {
            cpu = "0.25"
            # We are targeting the smallest Task size of 512Mb, so we subtract 256Mb from the
            # request/limit to ensure we can fit within that task
            memory = "256M"
          }
        }
      })
    }
    vpc-cni    = {}
    kube-proxy = {}
  }

  enable_karpenter = true

  karpenter = {
    repository_username = data.aws_ecrpublic_authorization_token.token.user_name
    repository_password = data.aws_ecrpublic_authorization_token.token.password
  }

  karpenter_node = {
    # Use static name so that it matches what is defined in `karpenter.yaml` example manifest
    iam_role_use_name_prefix = false
  }

  tags = local.tags
}
resource "aws_eks_access_entry" "karpenter_node_access_entry" {
cluster_name = module.eks.cluster_name
principal_arn = module.eks_blueprints_addons.karpenter.node_iam_role_arn
kubernetes_groups = []
type = "EC2_LINUX"
}
################################################################################
# Supporting Resources
################################################################################
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "~> 5.0"
name = local.name
cidr = local.vpc_cidr
azs = local.azs
private_subnets = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 4, k)]
public_subnets = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 8, k + 48)]
enable_nat_gateway = true
single_nat_gateway = true
public_subnet_tags = {
"kubernetes.io/role/elb" = 1
}
private_subnet_tags = {
"kubernetes.io/role/internal-elb" = 1
# Tags subnets for Karpenter auto-discovery
"karpenter.sh/discovery" = local.name
}
tags = local.tags
}
terraform init
tree .terraform
cat .terraform/modules/modules.json | jq
tree .terraform/providers/registry.terraform.io/hashicorp -L 2
After applying the configuration, you will end up with four Fargate nodes. Here is why, and what that implies:
1. CoreDNS Pods (2 Fargate nodes):
   - CoreDNS is crucial for DNS resolution within the cluster.
   - The configuration sets specific resource requests for CoreDNS (0.25 CPU and 256M memory).
   - Running two replicas ensures high availability for DNS services.
   - Each CoreDNS pod gets its own Fargate node, providing isolation and dedicated resources.
2. Karpenter Pods (2 Fargate nodes):
   - Karpenter is the cluster autoscaler, critical for dynamic resource management.
   - While not explicitly defined in the Terraform code, Karpenter typically runs with two replicas for high availability.
   - The Fargate profile for the "karpenter" namespace ensures these pods run on Fargate.
   - Each Karpenter pod gets its own Fargate node, again for isolation and reliability.
sigridjineth@sigridjineth-Z590-VISION-G:~/sigrid$ aws ec2 describe-vpcs --filter 'Name=isDefault,Values=false' --output yaml
Vpcs:
- CidrBlock: 10.10.0.0/16
CidrBlockAssociationSet:
- AssociationId: vpc-cidr-assoc-059e3b4e0738b90fd
CidrBlock: 10.10.0.0/16
CidrBlockState:
State: associated
DhcpOptionsId: dopt-74a4271f
InstanceTenancy: default
IsDefault: false
OwnerId: '712218945685'
State: available
Tags:
- Key: aws:cloudformation:stack-name
Value: t101-2
- Key: aws:cloudformation:logical-id
Value: TerraformVPC
- Key: aws:cloudformation:stack-id
Value: arn:aws:cloudformation:ap-northeast-2:712218945685:stack/t101-2/1d9e3f70-413f-11ef-9021-022596c73975
- Key: Name
Value: Terraform-VPC
VpcId: vpc-074fdf7c7db333935
sigridjineth@sigridjineth-Z590-VISION-G:~/sigrid/terraform-aws-eks-blueprints/patterns/karpenter$ terraform state list
data.aws_availability_zones.available
module.vpc.aws_default_network_acl.this[0]
module.vpc.aws_default_route_table.default[0]
module.vpc.aws_default_security_group.this[0]
module.vpc.aws_eip.nat[0]
module.vpc.aws_internet_gateway.this[0]
module.vpc.aws_nat_gateway.this[0]
module.vpc.aws_route.private_nat_gateway[0]
module.vpc.aws_route.public_internet_gateway[0]
module.vpc.aws_route_table.private[0]
module.vpc.aws_route_table.public[0]
module.vpc.aws_route_table_association.private[0]
module.vpc.aws_route_table_association.private[1]
module.vpc.aws_route_table_association.private[2]
module.vpc.aws_route_table_association.public[0]
module.vpc.aws_route_table_association.public[1]
module.vpc.aws_route_table_association.public[2]
module.vpc.aws_subnet.private[0]
module.vpc.aws_subnet.private[1]
module.vpc.aws_subnet.private[2]
module.vpc.aws_subnet.public[0]
module.vpc.aws_subnet.public[1]
module.vpc.aws_subnet.public[2]
module.vpc.aws_vpc.this[0]
sigridjineth@sigridjineth-Z590-VISION-G:~/sigrid/terraform-aws-eks-blueprints/patterns/karpenter$ aws ec2 describe-vpcs --filter 'Name=isDefault,Values=false' --output yaml
Vpcs:
- CidrBlock: 10.10.0.0/16
CidrBlockAssociationSet:
- AssociationId: vpc-cidr-assoc-059e3b4e0738b90fd
CidrBlock: 10.10.0.0/16
CidrBlockState:
State: associated
DhcpOptionsId: dopt-74a4271f
InstanceTenancy: default
IsDefault: false
OwnerId: '712218945685'
State: available
Tags:
- Key: aws:cloudformation:stack-name
Value: t101-2
- Key: aws:cloudformation:logical-id
Value: TerraformVPC
- Key: aws:cloudformation:stack-id
Value: arn:aws:cloudformation:ap-northeast-2:712218945685:stack/t101-2/1d9e3f70-413f-11ef-9021-022596c73975
- Key: Name
Value: Terraform-VPC
VpcId: vpc-074fdf7c7db333935
- CidrBlock: 10.0.0.0/16
CidrBlockAssociationSet:
- AssociationId: vpc-cidr-assoc-05bda9f8dc1d22922
CidrBlock: 10.0.0.0/16
CidrBlockState:
State: associated
DhcpOptionsId: dopt-74a4271f
InstanceTenancy: default
IsDefault: false
sigridjineth@sigridjineth-Z590-VISION-G:~/sigrid/terraform-aws-eks-blueprints/patterns/karpenter$ echo "data.aws_availability_zones.available" | terraform console
{
"all_availability_zones" = tobool(null)
"exclude_names" = toset(null) /* of string */
"exclude_zone_ids" = toset(null) /* of string */
"filter" = toset(null) /* of object */
"group_names" = toset([
"ap-northeast-2",
])
"id" = "ap-northeast-2"
"names" = tolist([
"ap-northeast-2a",
"ap-northeast-2b",
"ap-northeast-2c",
"ap-northeast-2d",
])
"state" = tostring(null)
"timeouts" = null /* object */
"zone_ids" = tolist([
"apne2-az1",
"apne2-az2",
"apne2-az3",
"apne2-az4",
])
}
It takes around 3 to 5 minutes to create the new VPC resources in your account.
After that, let’s apply module.eks to create the cluster and the Fargate profiles that will back the worker nodes.
terraform apply -target="module.eks" -auto-approve
Apply complete! Resources: 24 added, 0 changed, 0 destroyed.
Outputs:
configure_kubectl = "aws eks --region ap-northeast-2 update-kubeconfig --name t101-karpenter"
sigridjineth@sigridjineth-Z590-VISION-G:~/sigrid/terraform-aws-eks-blueprints/patterns/karpenter$ terraform state list
data.aws_availability_zones.available
module.eks.data.aws_caller_identity.current
module.eks.data.aws_iam_policy_document.assume_role_policy[0]
module.eks.data.aws_iam_session_context.current
module.eks.data.aws_partition.current
module.eks.data.tls_certificate.this[0]
module.eks.aws_cloudwatch_log_group.this[0]
module.eks.aws_ec2_tag.cluster_primary_security_group["Blueprint"]
module.eks.aws_ec2_tag.cluster_primary_security_group["GithubRepo"]
module.eks.aws_ec2_tag.cluster_primary_security_group["karpenter.sh/discovery"]
module.eks.aws_eks_access_entry.this["cluster_creator"]
module.eks.aws_eks_access_policy_association.this["cluster_creator_admin"]
module.eks.aws_eks_cluster.this[0]
module.eks.aws_iam_openid_connect_provider.oidc_provider[0]
module.eks.aws_iam_policy.cluster_encryption[0]
module.eks.aws_iam_role.this[0]
module.eks.aws_iam_role_policy_attachment.cluster_encryption[0]
module.eks.aws_iam_role_policy_attachment.this["AmazonEKSClusterPolicy"]
module.eks.aws_iam_role_policy_attachment.this["AmazonEKSVPCResourceController"]
module.eks.time_sleep.this[0]
module.vpc.aws_default_network_acl.this[0]
module.vpc.aws_default_route_table.default[0]
module.vpc.aws_default_security_group.this[0]
module.vpc.aws_eip.nat[0]
module.vpc.aws_internet_gateway.this[0]
module.vpc.aws_nat_gateway.this[0]
module.vpc.aws_route.private_nat_gateway[0]
module.vpc.aws_route.public_internet_gateway[0]
module.vpc.aws_route_table.private[0]
module.vpc.aws_route_table.public[0]
module.vpc.aws_route_table_association.private[0]
module.vpc.aws_route_table_association.private[1]
module.vpc.aws_route_table_association.private[2]
module.vpc.aws_route_table_association.public[0]
module.vpc.aws_route_table_association.public[1]
module.vpc.aws_route_table_association.public[2]
module.vpc.aws_subnet.private[0]
module.vpc.aws_subnet.private[1]
module.vpc.aws_subnet.private[2]
module.vpc.aws_subnet.public[0]
module.vpc.aws_subnet.public[1]
module.vpc.aws_subnet.public[2]
module.vpc.aws_vpc.this[0]
module.eks.module.fargate_profile["karpenter"].data.aws_caller_identity.current
module.eks.module.fargate_profile["karpenter"].data.aws_iam_policy_document.assume_role_policy[0]
module.eks.module.fargate_profile["karpenter"].data.aws_partition.current
module.eks.module.fargate_profile["karpenter"].data.aws_region.current
module.eks.module.fargate_profile["karpenter"].aws_eks_fargate_profile.this[0]
module.eks.module.fargate_profile["karpenter"].aws_iam_role.this[0]
module.eks.module.fargate_profile["karpenter"].aws_iam_role_policy_attachment.this["AmazonEKSFargatePodExecutionRolePolicy"]
module.eks.module.fargate_profile["karpenter"].aws_iam_role_policy_attachment.this["AmazonEKS_CNI_Policy"]
module.eks.module.fargate_profile["kube_system"].data.aws_caller_identity.current
module.eks.module.fargate_profile["kube_system"].data.aws_iam_policy_document.assume_role_policy[0]
module.eks.module.fargate_profile["kube_system"].data.aws_partition.current
module.eks.module.fargate_profile["kube_system"].data.aws_region.current
module.eks.module.fargate_profile["kube_system"].aws_eks_fargate_profile.this[0]
module.eks.module.fargate_profile["kube_system"].aws_iam_role.this[0]
module.eks.module.fargate_profile["kube_system"].aws_iam_role_policy_attachment.this["AmazonEKSFargatePodExecutionRolePolicy"]
module.eks.module.fargate_profile["kube_system"].aws_iam_role_policy_attachment.this["AmazonEKS_CNI_Policy"]
module.eks.module.kms.data.aws_caller_identity.current[0]
module.eks.module.kms.data.aws_iam_policy_document.this[0]
module.eks.module.kms.data.aws_partition.current[0]
module.eks.module.kms.aws_kms_alias.this["cluster"]
module.eks.module.kms.aws_kms_key.this[0]
VPCID=vpc-0a94d83d1d2a23be3
aws ec2 describe-subnets --filters "Name=vpc-id,Values=$VPCID" | jq
sigridjineth@sigridjineth-Z590-VISION-G:~/sigrid/terraform-aws-eks-blueprints/patterns/karpenter$ aws ec2 describe-subnets --filters "Name=vpc-id,Values=$VPCID" --output text
SUBNETS False ap-northeast-2c apne2-az3 251 10.0.50.0/24 False False False False False 712218945685 available arn:aws:ec2:ap-northeast-2:712218945685:subnet/subnet-0af1d2622cf944a13 subnet-0af1d2622cf944a13 vpc-0a94d83d1d2a23be3
PRIVATEDNSNAMEOPTIONSONLAUNCH False False ip-name
TAGS GithubRepo github.com/aws-ia/terraform-aws-eks-blueprints
TAGS Blueprint t101-karpenter
TAGS kubernetes.io/role/elb 1
TAGS Name t101-karpenter-public-ap-northeast-2c
SUBNETS False ap-northeast-2b apne2-az2 251 10.0.49.0/24 False False False False False 712218945685 available arn:aws:ec2:ap-northeast-2:712218945685:subnet/subnet-02e84d81a53f9e261 subnet-02e84d81a53f9e261 vpc-0a94d83d1d2a23be3
PRIVATEDNSNAMEOPTIONSONLAUNCH False False ip-name
TAGS GithubRepo github.com/aws-ia/terraform-aws-eks-blueprints
TAGS kubernetes.io/role/elb 1
TAGS Name t101-karpenter-public-ap-northeast-2b
TAGS Blueprint t101-karpenter
SUBNETS False ap-northeast-2a apne2-az1 250 10.0.48.0/24 False False False False False 712218945685 available arn:aws:ec2:ap-northeast-2:712218945685:subnet/subnet-0eaf312876e26277c subnet-0eaf312876e26277c vpc-0a94d83d1d2a23be3
PRIVATEDNSNAMEOPTIONSONLAUNCH False False ip-name
TAGS Name t101-karpenter-public-ap-northeast-2a
TAGS Blueprint t101-karpenter
TAGS GithubRepo github.com/aws-ia/terraform-aws-eks-blueprints
TAGS kubernetes.io/role/elb 1
SUBNETS False ap-northeast-2c apne2-az3 4091 10.0.32.0/20 False False False False False 712218945685 available arn:aws:ec2:ap-northeast-2:712218945685:subnet/subnet-022e63af9a750193e subnet-022e63af9a750193e vpc-0a94d83d1d2a23be3
PRIVATEDNSNAMEOPTIONSONLAUNCH False False ip-name
TAGS karpenter.sh/discovery t101-karpenter
TAGS kubernetes.io/role/internal-elb 1
TAGS GithubRepo github.com/aws-ia/terraform-aws-eks-blueprints
:
# Check the public and private subnet CIDRs
## private_subnets = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 4, k)]
## public_subnets = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 8, k + 48)]
terraform state show 'module.vpc.aws_subnet.public[0]'
terraform state show 'module.vpc.aws_subnet.private[0]'
aws eks --region ap-northeast-2 update-kubeconfig --name t101-karpenter
sigridjineth@sigridjineth-Z590-VISION-G:~/sigrid/terraform-aws-eks-blueprints/patterns/karpenter$ kubectl cluster-info
Kubernetes control plane is running at https://9690F6AC485BDC1530DA59EE517421DC.gr7.ap-northeast-2.eks.amazonaws.com
CoreDNS is running at https://9690F6AC485BDC1530DA59EE517421DC.gr7.ap-northeast-2.eks.amazonaws.com/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
The resulting setup is an EKS cluster running on Fargate nodes, where:
- Karpenter is installed and configured for autoscaling
- Various AWS resources (IAM roles, CloudWatch events, SQS queues) are set up to support Karpenter’s operation
- The cluster is using EKS version 1.30 with containerd as the container runtime
- Secrets are encrypted using AWS KMS for enhanced security
# Deploy: takes about 2 minutes
terraform apply -auto-approve
# Verify
terraform state list
data.aws_ecrpublic_authorization_token.token
aws_eks_access_entry.karpenter_node_access_entry
module.eks_blueprints_addons.data.aws_caller_identity.current
module.eks_blueprints_addons.data.aws_eks_addon_version.this["coredns"]
module.eks_blueprints_addons.data.aws_eks_addon_version.this["kube-proxy"]
module.eks_blueprints_addons.data.aws_eks_addon_version.this["vpc-cni"]
module.eks_blueprints_addons.data.aws_iam_policy_document.karpenter[0]
module.eks_blueprints_addons.data.aws_iam_policy_document.karpenter_assume_role[0]
module.eks_blueprints_addons.data.aws_partition.current
module.eks_blueprints_addons.data.aws_region.current
module.eks_blueprints_addons.aws_cloudwatch_event_rule.karpenter["health_event"]
module.eks_blueprints_addons.aws_cloudwatch_event_rule.karpenter["instance_rebalance"]
module.eks_blueprints_addons.aws_cloudwatch_event_rule.karpenter["instance_state_change"]
module.eks_blueprints_addons.aws_cloudwatch_event_rule.karpenter["spot_interupt"]
module.eks_blueprints_addons.aws_cloudwatch_event_target.karpenter["health_event"]
module.eks_blueprints_addons.aws_cloudwatch_event_target.karpenter["instance_rebalance"]
module.eks_blueprints_addons.aws_cloudwatch_event_target.karpenter["instance_state_change"]
module.eks_blueprints_addons.aws_cloudwatch_event_target.karpenter["spot_interupt"]
module.eks_blueprints_addons.aws_eks_addon.this["coredns"]
module.eks_blueprints_addons.aws_eks_addon.this["kube-proxy"]
module.eks_blueprints_addons.aws_eks_addon.this["vpc-cni"]
module.eks_blueprints_addons.aws_iam_instance_profile.karpenter[0]
module.eks_blueprints_addons.aws_iam_role.karpenter[0]
module.eks_blueprints_addons.aws_iam_role_policy_attachment.karpenter["AmazonEC2ContainerRegistryReadOnly"]
module.eks_blueprints_addons.aws_iam_role_policy_attachment.karpenter["AmazonEKSWorkerNodePolicy"]
module.eks_blueprints_addons.aws_iam_role_policy_attachment.karpenter["AmazonEKS_CNI_Policy"]
module.eks_blueprints_addons.time_sleep.this
module.eks_blueprints_addons.module.karpenter.data.aws_caller_identity.current[0]
module.eks_blueprints_addons.module.karpenter.data.aws_iam_policy_document.assume[0]
module.eks_blueprints_addons.module.karpenter.data.aws_iam_policy_document.this[0]
module.eks_blueprints_addons.module.karpenter.data.aws_partition.current[0]
module.eks_blueprints_addons.module.karpenter.aws_iam_policy.this[0]
module.eks_blueprints_addons.module.karpenter.aws_iam_role.this[0]
module.eks_blueprints_addons.module.karpenter.aws_iam_role_policy_attachment.this[0]
module.eks_blueprints_addons.module.karpenter.helm_release.this[0]
module.eks_blueprints_addons.module.karpenter_sqs.data.aws_iam_policy_document.this[0]
module.eks_blueprints_addons.module.karpenter_sqs.aws_sqs_queue.this[0]
module.eks_blueprints_addons.module.karpenter_sqs.aws_sqs_queue_policy.this[0]
terraform show
...
# Check the Kubernetes cluster, node, and pod information
kubectl cluster-info
kubectl get nodes -L node.kubernetes.io/instance-type -L topology.kubernetes.io/zone
kubectl get node -owide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
fargate-ip-10-10-36-94.ap-northeast-2.compute.internal Ready <none> 10m v1.30.0-eks-404b9c6 10.10.36.94 <none> Amazon Linux 2 5.10.219-208.866.amzn2.x86_64 containerd://1.7.11
fargate-ip-10-10-4-201.ap-northeast-2.compute.internal Ready <none> 10m v1.30.0-eks-404b9c6 10.10.4.201 <none> Amazon Linux 2 5.10.219-208.866.amzn2.x86_64 containerd://1.7.11
fargate-ip-10-10-43-93.ap-northeast-2.compute.internal Ready <none> 10m v1.30.0-eks-404b9c6 10.10.43.93 <none> Amazon Linux 2 5.10.219-208.866.amzn2.x86_64 containerd://1.7.11
fargate-ip-10-10-46-178.ap-northeast-2.compute.internal Ready <none> 10m v1.30.0-eks-404b9c6 10.10.46.178 <none> Amazon Linux 2 5.10.219-208.866.amzn2.x86_64 containerd://1.7.11
kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
karpenter karpenter-6b8687f5db-r9b7q 1/1 Running 0 12m 10.10.36.94 fargate-ip-10-10-36-94.ap-northeast-2.compute.internal <none> <none>
karpenter karpenter-6b8687f5db-v8zwb 1/1 Running 0 12m 10.10.46.178 fargate-ip-10-10-46-178.ap-northeast-2.compute.internal <none> <none>
kube-system coredns-86dcddd859-x9zp8 1/1 Running 0 12m 10.10.4.201 fargate-ip-10-10-4-201.ap-northeast-2.compute.internal <none> <none>
kube-system coredns-86dcddd859-xxk97 1/1 Running 0 12m 10.10.43.93 fargate-ip-10-10-43-93.ap-northeast-2.compute.internal <none> <none>
# Check the Helm chart
helm list -n karpenter
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
karpenter karpenter 1 2024-07-20 23:34:26.74931 +0900 KST deployed karpenter-0.35.0 0.35.0
# SQS queue and EventBridge event rules for Karpenter to utilize for spot termination handling, capacity re-balancing, etc.
## https://jerryljh.tistory.com/132 , https://aws.github.io/aws-eks-best-practices/karpenter/
helm get values -n karpenter karpenter
USER-SUPPLIED VALUES:
serviceAccount:
annotations:
eks.amazonaws.com/role-arn: arn:aws:iam::911283464785:role/karpenter-2024072203075821610000000c
name: karpenter
settings:
aws:
clusterEndpoint: https://163CC620EAB64480AA969E78489258AD.yl4.ap-northeast-2.eks.amazonaws.com
clusterName: t101-karpenter
interruptionQueueName: karpenter-t101-karpenter
clusterEndpoint: https://163CC620EAB64480AA969E78489258AD.yl4.ap-northeast-2.eks.amazonaws.com
clusterName: t101-karpenter
interruptionQueue: karpenter-t101-karpenter
# Encrypt Kubernetes secrets with AWS KMS on existing clusters
## Symmetric, Can encrypt and decrypt data , Created in the same AWS Region as the cluster
## Warning - You can't disable secrets encryption after enabling it. This action is irreversible.
kubectl get secret -n karpenter
kubectl get secret -n karpenter sh.helm.release.v1.karpenter.v1 -o json | jq
terraform state list
terraform state show 'module.eks_blueprints_addons.module.karpenter_sqs.aws_sqs_queue_policy.this[0]'
Let’s take a look at karpenter.yaml.
---
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2
  role: karpenter-t101-karpenter
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: t101-karpenter
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: t101-karpenter
  tags:
    karpenter.sh/discovery: t101-karpenter
---
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClassRef:
        name: default
      requirements:
        - key: "karpenter.k8s.aws/instance-category"
          operator: In
          values: ["c", "m", "r"]
        - key: "karpenter.k8s.aws/instance-cpu"
          operator: In
          values: ["4", "8", "16", "32"]
        - key: "karpenter.k8s.aws/instance-hypervisor"
          operator: In
          values: ["nitro"]
        - key: "karpenter.k8s.aws/instance-generation"
          operator: Gt
          values: ["2"]
  limits:
    cpu: 1000
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 30s
EC2NodeClass defines the properties of the EC2 instances that Karpenter can provision:
- `amiFamily: AL2`: Specifies Amazon Linux 2 as the base AMI.
- `role: karpenter-t101-karpenter`: The IAM role for the EC2 instances.
- `subnetSelectorTerms` and `securityGroupSelectorTerms`: Use tags to select the appropriate subnets and security groups.
- `tags`: Adds a discovery tag to the instances.
NodePool defines the desired state of the node group and how Karpenter should manage it:
- `nodeClassRef`: References the EC2NodeClass to use.
- `requirements`: Sets constraints on the types of instances to use:
  - Instance categories: c (compute optimized), m (general purpose), r (memory optimized)
  - CPU counts: 4, 8, 16, or 32
  - Must use the Nitro hypervisor
  - Must be generation 3 or newer
- `limits`: Sets a maximum of 1000 CPU cores for this node pool.
- `disruption`: Configures node consolidation:
  - `consolidationPolicy: WhenEmpty`: Only consolidate nodes when they're empty.
  - `consolidateAfter: 30s`: Wait 30 seconds before consolidating.
kubectl get ec2nodeclass,nodepool
NAME AGE
ec2nodeclass.karpenter.k8s.aws/default 20s
NAME NODECLASS
nodepool.karpenter.sh/default default
Let’s create a test deployment, inflate, which runs pause containers that simply request CPU:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 0
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      terminationGracePeriodSeconds: 0
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: 1
kubectl get nodes -L karpenter.sh/nodepool -L node.kubernetes.io/instance-type -L topology.kubernetes.io/zone -L karpenter.sh/capacity-type
NAME STATUS ROLES AGE VERSION NODEPOOL INSTANCE-TYPE ZONE CAPACITY-TYPE
fargate-ip-10-0-1-181.ap-northeast-2.compute.internal Ready <none> 132m v1.30.0-eks-404b9c6 ap-northeast-2a
fargate-ip-10-0-15-74.ap-northeast-2.compute.internal Ready <none> 132m v1.30.0-eks-404b9c6 ap-northeast-2a
fargate-ip-10-0-35-74.ap-northeast-2.compute.internal Ready <none> 28m v1.30.0-eks-404b9c6 ap-northeast-2c
fargate-ip-10-0-4-211.ap-northeast-2.compute.internal Ready <none> 28m v1.30.0-eks-404b9c6 ap-northeast-2a
ip-10-0-3-231.ap-northeast-2.compute.internal Ready <none> 140m v1.30.0-eks-036c24b c4.2xlarge ap-northeast-2a
ip-10-0-45-99.ap-northeast-2.compute.internal Ready <none> 140m v1.30.0-eks-036c24b c4.2xlarge ap-northeast-2c
If you don’t want to use the Helm repository method, you can rewrite it as follows:
# Variable definitions
variable "cluster_name" {}
variable "cluster_endpoint" {}
variable "node_iam_role_arn" {}

# Clone the Karpenter repository from GitHub and prepare the chart
resource "null_resource" "prepare_karpenter_chart" {
  provisioner "local-exec" {
    command = <<-EOT
      git clone https://github.com/aws/karpenter-provider-aws.git
      mkdir -p karpenter-charts
      cp -R karpenter-provider-aws/charts/karpenter karpenter-charts/
      find karpenter-charts -type l | xargs rm
      rm -rf karpenter-provider-aws
    EOT
  }
}

# Generate the Karpenter values file
data "template_file" "karpenter_values" {
  template = file("${path.module}/karpenter-values.yaml")
  vars = {
    cluster_name      = var.cluster_name
    cluster_endpoint  = var.cluster_endpoint
    node_iam_role_arn = var.node_iam_role_arn
  }
}

resource "local_file" "karpenter_values" {
  content  = data.template_file.karpenter_values.rendered
  filename = "${path.module}/rendered-karpenter-values.yaml"
}

# Render the Helm template and fix the port settings
resource "null_resource" "helm_template_and_fix" {
  depends_on = [null_resource.prepare_karpenter_chart, local_file.karpenter_values]

  provisioner "local-exec" {
    command = <<-EOT
      helm template karpenter ./karpenter-charts/karpenter \
        --namespace karpenter \
        --create-namespace \
        -f ${local_file.karpenter_values.filename} \
        > ${path.module}/karpenter-resources.yaml

      # Fix duplicate port names
      sed -i 's/name: http-metrics/name: http-metrics-1/' ${path.module}/karpenter-resources.yaml
      sed -i 's/targetPort: http-metrics/targetPort: http-metrics-1/' ${path.module}/karpenter-resources.yaml
    EOT
  }

  triggers = {
    always_run = "${timestamp()}"
  }
}

# Apply the resources with kubectl
resource "null_resource" "apply_karpenter" {
  depends_on = [null_resource.helm_template_and_fix]

  provisioner "local-exec" {
    command = "kubectl apply -f ${path.module}/karpenter-resources.yaml"
  }

  triggers = {
    helm_template_execution = null_resource.helm_template_and_fix.id
  }
}
The Node Allocatable Capacity view from eks-node-viewer offers the insights below.
- Each node in a Kubernetes cluster has a certain amount of resources (CPU, memory, etc.) that can be allocated to pods. This is not the total capacity of the node, as some resources are reserved for system processes and Kubernetes components.
- The tool shows resource requests rather than actual usage, which varies dynamically as pods run. Showing requests is crucial for understanding how the Kubernetes scheduler works: the scheduler uses these requests to determine which nodes have enough capacity to run new pods.
go install github.com/awslabs/eks-node-viewer/cmd/eks-node-viewer@latest
ls $HOME/go/bin
export PATH=$PATH:$HOME/go/bin
6 nodes (800m/23820m) 3.4% cpu █░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ $0.955/hour | $697.014
8 pods (0 pending 8 running 8 bound)
ip-10-0-3-231.ap-northeast-2.compute.internal cpu █░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
ip-10-0-45-99.ap-northeast-2.compute.internal cpu █░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
fargate-ip-10-0-15-74.ap-northeast-2.compute.internal cpu ████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
fargate-ip-10-0-1-181.ap-northeast-2.compute.internal cpu ████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
fargate-ip-10-0-38-164.ap-northeast-2.compute.internal cpu ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
fargate-ip-10-0-18-151.ap-northeast-2.compute.internal cpu ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
•
←/→ page • q: quit
Finally, here is how to clean everything up.
helm uninstall kube-ops-view -n kube-system
terraform destroy -target="module.eks_blueprints_addons" -auto-approve
terraform destroy -target="module.eks" -auto-approve
terraform destroy -auto-approve
aws ec2 describe-vpcs --filter 'Name=isDefault,Values=false' --output yaml
rm -rf ~/.kube/config