Say Hello to AWS Karpenter
AWS Karpenter is an open-source, flexible, high-performance Kubernetes cluster autoscaler. It helps improve application availability and cluster efficiency by rapidly launching right-sized compute resources in response to changing application load.
Karpenter also manages the entire node lifecycle, automatically terminating nodes when they’re no longer needed. Karpenter uses SQS queues to handle EC2 Spot Instance interruption notices, ensuring your applications have time to gracefully handle instance terminations.
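For example, when Karpenter is installed via Helm (as in the Terraform walkthrough later in this post), the interruption queue is wired in through chart values roughly like the following. This is only a sketch; the cluster and queue names are placeholders, not values from this setup:

settings:
  clusterName: my-cluster                  # placeholder: your EKS cluster name
  interruptionQueue: Karpenter-my-cluster  # placeholder: SQS queue receiving Spot interruption notices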
Karpenter's key feature is just-in-time provisioning: it can provision new nodes in seconds, much faster than traditional autoscalers. Imagine a sudden spike in traffic to your e-commerce website during a flash sale. Karpenter can quickly spin up new nodes to handle the increased load, ensuring your customers have a smooth shopping experience.
Additionally, Karpenter works with a wide variety of instance types, sizes, and purchase options (On-Demand, Spot, etc.). You can configure Karpenter to use a mix of c5.xlarge, m5.2xlarge, and r5.large instances based on your workload needs, and it will automatically choose the most suitable and cost-effective option.
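In Provisioner terms, that mix can be expressed as a requirement on the well-known instance-type label. A minimal fragment (using the same v1alpha5 API as the example further below) might look like this; the exact list is up to you:

requirements:
  - key: node.kubernetes.io/instance-type
    operator: In
    values: ["c5.xlarge", "m5.2xlarge", "r5.large"]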
Karpenter also uses sophisticated bin-packing algorithms to maximize resource utilization. If you have a pod that requires 1.5 CPU cores and 3 GB of memory, Karpenter might place it on a c5.large instance (2 CPU, 4 GB RAM) instead of a larger, more expensive instance, optimizing your costs. It can also automatically remove nodes that are no longer needed, helping to reduce costs.
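For illustration, a pod with those requests might look like the sketch below (the name and image are made up for this example):

apiVersion: v1
kind: Pod
metadata:
  name: checkout          # hypothetical workload name
spec:
  containers:
    - name: app
      image: nginx        # placeholder image
      resources:
        requests:
          cpu: "1500m"    # 1.5 CPU cores
          memory: 3Gi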
Karpenter uses Kubernetes CRDs for configuration, making it easy to integrate with existing Kubernetes workflows. You can also configure Karpenter to use Spot Instances for your batch processing jobs; if a Spot Instance is interrupted, Karpenter will quickly provision a replacement so the job keeps running. Here is a minimal Provisioner (the legacy v1alpha5 API) that allows both Spot and On-Demand capacity:
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot", "on-demand"]
  limits:
    resources:
      cpu: 1000
  provider:
    instanceProfile: KarpenterNodeInstanceProfile-${CLUSTER_NAME}
Karpenter vs. Cluster Autoscaler:
While both tools aim to autoscale Kubernetes clusters, Karpenter offers several advantages:
- Faster scaling: Karpenter can provision nodes in seconds, compared to minutes for Cluster Autoscaler.
- More flexible: Karpenter can work with a wider variety of instance types and sizes.
- Better bin-packing: Karpenter’s algorithms result in more efficient resource utilization.
- Simpler configuration: Karpenter uses Kubernetes-native CRDs for configuration.
Specifically, Karpenter works by:
- Watching for pods that the Kubernetes scheduler has marked as unschedulable
- Evaluating scheduling constraints (resource requests, node selectors, affinities, tolerations, and topology spread constraints) requested by the pods
- Provisioning nodes that meet the requirements of the pods
- Scheduling the pods to run on the new nodes
- Removing the nodes when the nodes are no longer needed
Karpenter’s Disruption feature
The Disruption feature (previously known as Consolidation) is a powerful set of capabilities that helps maintain an efficient, cost-effective, and up-to-date Kubernetes cluster. It consists of three main components: Expiration, Drift, and Consolidation.
Expiration ensures that nodes in your cluster don’t become stale or outdated. By default, Karpenter will automatically expire instances after 720 hours (30 days), forcing a refresh of your nodes.
Imagine you’re running a financial services application that requires the latest security patches. With Expiration, you can ensure that no node in your cluster is older than 30 days, minimizing the risk of running outdated software.
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  ttlSecondsUntilExpired: 2592000 # 30 days in seconds
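In the newer v1beta1 NodePool API (which the karpenter.yaml manifest later in this post uses), the equivalent knob lives under the disruption block. A fragment of the NodePool spec, as a sketch:

spec:
  disruption:
    expireAfter: 720h   # 30 days; nodes older than this are replaced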
Drift detection allows Karpenter to identify and rectify discrepancies between the desired state (as defined in your NodePool and EC2NodeClass configurations) and the actual state of your nodes.
You might update your NodePool to use a new AMI with enhanced monitoring capabilities. Drift detection will identify existing nodes that don’t match this new configuration and gradually replace them, ensuring your entire cluster adopts the new capabilities without manual intervention.
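For example, if the EC2NodeClass selects AMIs by tag, changing that tag (or the AMI behind it) is exactly the kind of change Drift detects and reconciles. A fragment of the EC2NodeClass spec, as a sketch; the tag below is hypothetical and used only for illustration:

spec:
  amiFamily: AL2
  amiSelectorTerms:
    - tags:
        monitoring: enhanced   # hypothetical tag marking the new AMI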
Consolidation optimizes your cluster for cost-efficiency by intelligently packing workloads onto the most appropriate nodes and removing unnecessary capacity.
Let’s say your cluster has three nodes:
- Node A: 4 CPU, 16GB RAM, 50% utilized
- Node B: 4 CPU, 16GB RAM, 30% utilized
- Node C: 4 CPU, 16GB RAM, 20% utilized
Karpenter’s Consolidation might move all workloads to Nodes A and B, then terminate Node C, saving you the cost of an entire EC2 instance.
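This behavior is driven by the NodePool's disruption block. The manifest later in this post uses consolidationPolicy: WhenEmpty, which only removes empty nodes; to let Karpenter actively repack and shrink a cluster like the one above, the v1beta1 API also offers WhenUnderutilized. A fragment, as a sketch:

spec:
  disruption:
    consolidationPolicy: WhenUnderutilized   # repack workloads and remove underutilized nodes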
Let’s also say that your batch processing job is running on a fleet of Spot Instances. Karpenter uses the EC2 Fleet API to request instances based on your NodePool configuration. If AWS can’t fulfill the request for a specific instance type, Karpenter quickly adjusts:
- It might request alternative instance types specified in your NodePool.
- If necessary, it could relax soft constraints (like preferred Availability Zones) to ensure your workloads keep running.
Spot-to-Spot Consolidation is a specialized form of consolidation designed to optimize costs while maintaining high availability for workloads running on Spot Instances.
Suppose your NodePool configuration includes 15 different instance types across multiple families (c5, m5, r5, etc.). This gives Karpenter a wide pool of instance types to choose from, increasing the chances of fulfilling capacity requests.
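A NodePool aimed at Spot with a deliberately wide choice of instance types can be sketched with requirements like these (the exact families and sizes are up to your workload; three families times five sizes already gives Karpenter at least 15 candidate types, in line with the example above):

requirements:
  - key: karpenter.sh/capacity-type
    operator: In
    values: ["spot"]
  - key: karpenter.k8s.aws/instance-category
    operator: In
    values: ["c", "m", "r"]
  - key: karpenter.k8s.aws/instance-size
    operator: In
    values: ["large", "xlarge", "2xlarge", "4xlarge", "8xlarge"]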
AWS Fargate
AWS Fargate for EKS provides a serverless compute engine for Kubernetes that allows you to run pods without managing the underlying EC2 instances. It combines EKS (for the control plane) with Fargate (for the data plane), resulting in a fully managed, serverless Kubernetes environment.
Fargate also provides VM-level isolation for each pod, enhancing security. If you’re running a multi-tenant SaaS application, each customer’s workload can run in its own isolated environment, reducing the risk of noisy-neighbor issues or security breaches between tenants.
Fargate uses Fargate Profiles to determine which pods should run on Fargate. You can specify the subnets, namespaces, and labels for pods that should use Fargate. EKS’s scheduler decides which pods run on Fargate based on the Fargate Profile configuration.
Suppose you have a mixed cluster with both EC2 and Fargate. You can use a Fargate profile (together with node selectors or taints/tolerations) to control which pods run on Fargate:
apiVersion: eks.amazonaws.com/v1beta1
kind: FargateProfile
metadata:
  name: my-fargate-profile
  namespace: default
spec:
  selectors:
    - namespace: my-namespace
      labels:
        env: prod
        service: backend
  subnets:
    - subnet-1234567890abcdef0
    - subnet-234567890abcdef01
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fargate-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      nodeSelector:
        kubernetes.io/arch: amd64
        eks.amazonaws.com/compute-type: fargate
      containers:
        - name: app
          image: nginx
Fargate automatically scales the underlying compute resources based on your pods’ resource requirements. If your application suddenly receives a traffic spike, you don’t need to worry about scaling EC2 instances: simply increase the number of pod replicas, and Fargate will provision the necessary resources.
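If you want the replica count itself to follow load, you can put a HorizontalPodAutoscaler in front of the Fargate-backed Deployment. A sketch targeting the fargate-app Deployment above; it assumes the Kubernetes Metrics Server is installed in the cluster:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: fargate-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: fargate-app
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

Each new replica lands on its own Fargate-managed compute, so there are no nodes to scale yourself.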
Hands-on Practice
The provided Terraform code sets up the following components:
- An EKS cluster
- A VPC with public and private subnets
- Fargate profiles for running pods
- Karpenter for cluster auto-scaling
- Some EKS add-ons (CoreDNS, VPC CNI, kube-proxy)
More specifically, the configuration:
- Creates an EKS cluster running version 1.30
- Sets up Fargate profiles for the "karpenter" and "kube-system" namespaces
- Configures Karpenter using the eks-blueprints-addons module
- Sets up the necessary IAM roles and policies
provider "aws" {
region = local.region
}
# Required for public ECR where Karpenter artifacts are hosted
provider "aws" {
region = "us-east-1"
alias = "virginia"
}
provider "kubernetes" {
host = module.eks.cluster_endpoint
cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
exec {
api_version = "client.authentication.k8s.io/v1beta1"
command = "aws"
# This requires the awscli to be installed locally where Terraform is executed
args = ["eks", "get-token", "--cluster-name", module.eks.cluster_name]
}
}
provider "helm" {
kubernetes {
host = module.eks.cluster_endpoint
cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
exec {
api_version = "client.authentication.k8s.io/v1beta1"
command = "aws"
# This requires the awscli to be installed locally where Terraform is executed
args = ["eks", "get-token", "--cluster-name", module.eks.cluster_name]
}
}
}
data "aws_ecrpublic_authorization_token" "token" {
provider = aws.virginia
}
data "aws_availability_zones" "available" {}
locals {
name = "t101-${basename(path.cwd)}"
region = "ap-northeast-2"
vpc_cidr = "10.10.0.0/16"
azs = slice(data.aws_availability_zones.available.names, 0, 3)
tags = {
Blueprint = local.name
GithubRepo = "github.com/aws-ia/terraform-aws-eks-blueprints"
}
}
################################################################################
# Cluster
################################################################################

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.11"

  cluster_name                   = local.name
  cluster_version                = "1.30"
  cluster_endpoint_public_access = true

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets

  # Fargate profiles use the cluster primary security group so these are not utilized
  create_cluster_security_group = false
  create_node_security_group    = false

  enable_cluster_creator_admin_permissions = true

  fargate_profiles = {
    karpenter = {
      selectors = [
        { namespace = "karpenter" }
      ]
    }
    kube_system = {
      name = "kube-system"
      selectors = [
        { namespace = "kube-system" }
      ]
    }
  }

  tags = merge(local.tags, {
    # NOTE - if creating multiple security groups with this module, only tag the
    # security group that Karpenter should utilize with the following tag
    # (i.e. - at most, only one security group should have this tag in your account)
    "karpenter.sh/discovery" = local.name
  })
}
################################################################################
# EKS Blueprints Addons
################################################################################

module "eks_blueprints_addons" {
  source  = "aws-ia/eks-blueprints-addons/aws"
  version = "~> 1.16"

  cluster_name      = module.eks.cluster_name
  cluster_endpoint  = module.eks.cluster_endpoint
  cluster_version   = module.eks.cluster_version
  oidc_provider_arn = module.eks.oidc_provider_arn

  # We want to wait for the Fargate profiles to be deployed first
  create_delay_dependencies = [for prof in module.eks.fargate_profiles : prof.fargate_profile_arn]

  eks_addons = {
    coredns = {
      configuration_values = jsonencode({
        computeType = "Fargate"
        # Ensure that we fully utilize the minimum amount of resources that are supplied by
        # Fargate https://docs.aws.amazon.com/eks/latest/userguide/fargate-pod-configuration.html
        # Fargate adds 256 MB to each pod's memory reservation for the required Kubernetes
        # components (kubelet, kube-proxy, and containerd). Fargate rounds up to the following
        # compute configuration that most closely matches the sum of vCPU and memory requests in
        # order to ensure pods always have the resources that they need to run.
        resources = {
          limits = {
            cpu = "0.25"
            # We are targeting the smallest Task size of 512Mb, so we subtract 256Mb from the
            # request/limit to ensure we can fit within that task
            memory = "256M"
          }
          requests = {
            cpu = "0.25"
            # We are targeting the smallest Task size of 512Mb, so we subtract 256Mb from the
            # request/limit to ensure we can fit within that task
            memory = "256M"
          }
        }
      })
    }
    vpc-cni    = {}
    kube-proxy = {}
  }

  enable_karpenter = true

  karpenter = {
    repository_username = data.aws_ecrpublic_authorization_token.token.user_name
    repository_password = data.aws_ecrpublic_authorization_token.token.password
  }

  karpenter_node = {
    # Use static name so that it matches what is defined in `karpenter.yaml` example manifest
    iam_role_use_name_prefix = false
  }

  tags = local.tags
}
resource "aws_eks_access_entry" "karpenter_node_access_entry" {
cluster_name = module.eks.cluster_name
principal_arn = module.eks_blueprints_addons.karpenter.node_iam_role_arn
kubernetes_groups = []
type = "EC2_LINUX"
}
################################################################################
# Supporting Resources
################################################################################
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "~> 5.0"
name = local.name
cidr = local.vpc_cidr
azs = local.azs
private_subnets = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 4, k)]
public_subnets = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 8, k + 48)]
enable_nat_gateway = true
single_nat_gateway = true
public_subnet_tags = {
"kubernetes.io/role/elb" = 1
}
private_subnet_tags = {
"kubernetes.io/role/internal-elb" = 1
# Tags subnets for Karpenter auto-discovery
"karpenter.sh/discovery" = local.name
}
tags = local.tags
}
terraform init
tree .terraform
cat .terraform/modules/modules.json | jq
tree .terraform/providers/registry.terraform.io/hashicorp -L 2
After applying the configuration, you will end up with four Fargate nodes. Here is why, and what that implies:
1. CoreDNS Pods (2 Fargate nodes):
   - CoreDNS is crucial for DNS resolution within the cluster.
   - The configuration sets specific resource requests for CoreDNS (0.25 CPU and 256M memory).
   - Running two replicas ensures high availability for DNS services.
   - Each CoreDNS pod gets its own Fargate node, providing isolation and dedicated resources.
2. Karpenter Pods (2 Fargate nodes):
   - Karpenter is the cluster autoscaler, critical for dynamic resource management.
   - While not explicitly defined in the Terraform code, Karpenter typically runs with two replicas for high availability.
   - The Fargate profile for the "karpenter" namespace ensures these pods run on Fargate.
   - Each Karpenter pod gets its own Fargate node, again for isolation and reliability.
sigridjineth@sigridjineth-Z590-VISION-G:~/sigrid$ aws ec2 describe-vpcs --filter 'Name=isDefault,Values=false' --output yaml
Vpcs:
- CidrBlock: 10.10.0.0/16
CidrBlockAssociationSet:
- AssociationId: vpc-cidr-assoc-059e3b4e0738b90fd
CidrBlock: 10.10.0.0/16
CidrBlockState:
State: associated
DhcpOptionsId: dopt-74a4271f
InstanceTenancy: default
IsDefault: false
OwnerId: '712218945685'
State: available
Tags:
- Key: aws:cloudformation:stack-name
Value: t101-2
- Key: aws:cloudformation:logical-id
Value: TerraformVPC
- Key: aws:cloudformation:stack-id
Value: arn:aws:cloudformation:ap-northeast-2:712218945685:stack/t101-2/1d9e3f70-413f-11ef-9021-022596c73975
- Key: Name
Value: Terraform-VPC
VpcId: vpc-074fdf7c7db333935
sigridjineth@sigridjineth-Z590-VISION-G:~/sigrid/terraform-aws-eks-blueprints/patterns/karpenter$ terraform state list
data.aws_availability_zones.available
module.vpc.aws_default_network_acl.this[0]
module.vpc.aws_default_route_table.default[0]
module.vpc.aws_default_security_group.this[0]
module.vpc.aws_eip.nat[0]
module.vpc.aws_internet_gateway.this[0]
module.vpc.aws_nat_gateway.this[0]
module.vpc.aws_route.private_nat_gateway[0]
module.vpc.aws_route.public_internet_gateway[0]
module.vpc.aws_route_table.private[0]
module.vpc.aws_route_table.public[0]
module.vpc.aws_route_table_association.private[0]
module.vpc.aws_route_table_association.private[1]
module.vpc.aws_route_table_association.private[2]
module.vpc.aws_route_table_association.public[0]
module.vpc.aws_route_table_association.public[1]
module.vpc.aws_route_table_association.public[2]
module.vpc.aws_subnet.private[0]
module.vpc.aws_subnet.private[1]
module.vpc.aws_subnet.private[2]
module.vpc.aws_subnet.public[0]
module.vpc.aws_subnet.public[1]
module.vpc.aws_subnet.public[2]
module.vpc.aws_vpc.this[0]
sigridjineth@sigridjineth-Z590-VISION-G:~/sigrid/terraform-aws-eks-blueprints/patterns/karpenter$ aws ec2 describe-vpcs --filter 'Name=isDefault,Values=false' --output yaml
Vpcs:
- CidrBlock: 10.10.0.0/16
CidrBlockAssociationSet:
- AssociationId: vpc-cidr-assoc-059e3b4e0738b90fd
CidrBlock: 10.10.0.0/16
CidrBlockState:
State: associated
DhcpOptionsId: dopt-74a4271f
InstanceTenancy: default
IsDefault: false
OwnerId: '712218945685'
State: available
Tags:
- Key: aws:cloudformation:stack-name
Value: t101-2
- Key: aws:cloudformation:logical-id
Value: TerraformVPC
- Key: aws:cloudformation:stack-id
Value: arn:aws:cloudformation:ap-northeast-2:712218945685:stack/t101-2/1d9e3f70-413f-11ef-9021-022596c73975
- Key: Name
Value: Terraform-VPC
VpcId: vpc-074fdf7c7db333935
- CidrBlock: 10.0.0.0/16
CidrBlockAssociationSet:
- AssociationId: vpc-cidr-assoc-05bda9f8dc1d22922
CidrBlock: 10.0.0.0/16
CidrBlockState:
State: associated
DhcpOptionsId: dopt-74a4271f
InstanceTenancy: default
IsDefault: false
sigridjineth@sigridjineth-Z590-VISION-G:~/sigrid/terraform-aws-eks-blueprints/patterns/karpenter$ echo "data.aws_availability_zones.available" | terraform console
{
"all_availability_zones" = tobool(null)
"exclude_names" = toset(null) /* of string */
"exclude_zone_ids" = toset(null) /* of string */
"filter" = toset(null) /* of object */
"group_names" = toset([
"ap-northeast-2",
])
"id" = "ap-northeast-2"
"names" = tolist([
"ap-northeast-2a",
"ap-northeast-2b",
"ap-northeast-2c",
"ap-northeast-2d",
])
"state" = tostring(null)
"timeouts" = null /* object */
"zone_ids" = tolist([
"apne2-az1",
"apne2-az2",
"apne2-az3",
"apne2-az4",
])
}
It takes around 3 to 5 minutes to create the new VPC resources in your account.
After that, let’s apply module.eks to create the cluster and the Fargate profiles that will back the worker nodes.
terraform apply -target="module.eks" -auto-approve
Apply complete! Resources: 24 added, 0 changed, 0 destroyed.
Outputs:
configure_kubectl = "aws eks --region ap-northeast-2 update-kubeconfig --name t101-karpenter"
sigridjineth@sigridjineth-Z590-VISION-G:~/sigrid/terraform-aws-eks-blueprints/patterns/karpenter$ terraform state list
data.aws_availability_zones.available
module.eks.data.aws_caller_identity.current
module.eks.data.aws_iam_policy_document.assume_role_policy[0]
module.eks.data.aws_iam_session_context.current
module.eks.data.aws_partition.current
module.eks.data.tls_certificate.this[0]
module.eks.aws_cloudwatch_log_group.this[0]
module.eks.aws_ec2_tag.cluster_primary_security_group["Blueprint"]
module.eks.aws_ec2_tag.cluster_primary_security_group["GithubRepo"]
module.eks.aws_ec2_tag.cluster_primary_security_group["karpenter.sh/discovery"]
module.eks.aws_eks_access_entry.this["cluster_creator"]
module.eks.aws_eks_access_policy_association.this["cluster_creator_admin"]
module.eks.aws_eks_cluster.this[0]
module.eks.aws_iam_openid_connect_provider.oidc_provider[0]
module.eks.aws_iam_policy.cluster_encryption[0]
module.eks.aws_iam_role.this[0]
module.eks.aws_iam_role_policy_attachment.cluster_encryption[0]
module.eks.aws_iam_role_policy_attachment.this["AmazonEKSClusterPolicy"]
module.eks.aws_iam_role_policy_attachment.this["AmazonEKSVPCResourceController"]
module.eks.time_sleep.this[0]
module.vpc.aws_default_network_acl.this[0]
module.vpc.aws_default_route_table.default[0]
module.vpc.aws_default_security_group.this[0]
module.vpc.aws_eip.nat[0]
module.vpc.aws_internet_gateway.this[0]
module.vpc.aws_nat_gateway.this[0]
module.vpc.aws_route.private_nat_gateway[0]
module.vpc.aws_route.public_internet_gateway[0]
module.vpc.aws_route_table.private[0]
module.vpc.aws_route_table.public[0]
module.vpc.aws_route_table_association.private[0]
module.vpc.aws_route_table_association.private[1]
module.vpc.aws_route_table_association.private[2]
module.vpc.aws_route_table_association.public[0]
module.vpc.aws_route_table_association.public[1]
module.vpc.aws_route_table_association.public[2]
module.vpc.aws_subnet.private[0]
module.vpc.aws_subnet.private[1]
module.vpc.aws_subnet.private[2]
module.vpc.aws_subnet.public[0]
module.vpc.aws_subnet.public[1]
module.vpc.aws_subnet.public[2]
module.vpc.aws_vpc.this[0]
module.eks.module.fargate_profile["karpenter"].data.aws_caller_identity.current
module.eks.module.fargate_profile["karpenter"].data.aws_iam_policy_document.assume_role_policy[0]
module.eks.module.fargate_profile["karpenter"].data.aws_partition.current
module.eks.module.fargate_profile["karpenter"].data.aws_region.current
module.eks.module.fargate_profile["karpenter"].aws_eks_fargate_profile.this[0]
module.eks.module.fargate_profile["karpenter"].aws_iam_role.this[0]
module.eks.module.fargate_profile["karpenter"].aws_iam_role_policy_attachment.this["AmazonEKSFargatePodExecutionRolePolicy"]
module.eks.module.fargate_profile["karpenter"].aws_iam_role_policy_attachment.this["AmazonEKS_CNI_Policy"]
module.eks.module.fargate_profile["kube_system"].data.aws_caller_identity.current
module.eks.module.fargate_profile["kube_system"].data.aws_iam_policy_document.assume_role_policy[0]
module.eks.module.fargate_profile["kube_system"].data.aws_partition.current
module.eks.module.fargate_profile["kube_system"].data.aws_region.current
module.eks.module.fargate_profile["kube_system"].aws_eks_fargate_profile.this[0]
module.eks.module.fargate_profile["kube_system"].aws_iam_role.this[0]
module.eks.module.fargate_profile["kube_system"].aws_iam_role_policy_attachment.this["AmazonEKSFargatePodExecutionRolePolicy"]
module.eks.module.fargate_profile["kube_system"].aws_iam_role_policy_attachment.this["AmazonEKS_CNI_Policy"]
module.eks.module.kms.data.aws_caller_identity.current[0]
module.eks.module.kms.data.aws_iam_policy_document.this[0]
module.eks.module.kms.data.aws_partition.current[0]
module.eks.module.kms.aws_kms_alias.this["cluster"]
module.eks.module.kms.aws_kms_key.this[0]
VPCID=vpc-0a94d83d1d2a23be3
aws ec2 describe-subnets --filters "Name=vpc-id,Values=$VPCID" | jq
sigridjineth@sigridjineth-Z590-VISION-G:~/sigrid/terraform-aws-eks-blueprints/patterns/karpenter$ aws ec2 describe-subnets --filters "Name=vpc-id,Values=$VPCID" --output text
SUBNETS False ap-northeast-2c apne2-az3 251 10.0.50.0/24 False False False False False 712218945685 available arn:aws:ec2:ap-northeast-2:712218945685:subnet/subnet-0af1d2622cf944a13 subnet-0af1d2622cf944a13 vpc-0a94d83d1d2a23be3
PRIVATEDNSNAMEOPTIONSONLAUNCH False False ip-name
TAGS GithubRepo github.com/aws-ia/terraform-aws-eks-blueprints
TAGS Blueprint t101-karpenter
TAGS kubernetes.io/role/elb 1
TAGS Name t101-karpenter-public-ap-northeast-2c
SUBNETS False ap-northeast-2b apne2-az2 251 10.0.49.0/24 False False False False False 712218945685 available arn:aws:ec2:ap-northeast-2:712218945685:subnet/subnet-02e84d81a53f9e261 subnet-02e84d81a53f9e261 vpc-0a94d83d1d2a23be3
PRIVATEDNSNAMEOPTIONSONLAUNCH False False ip-name
TAGS GithubRepo github.com/aws-ia/terraform-aws-eks-blueprints
TAGS kubernetes.io/role/elb 1
TAGS Name t101-karpenter-public-ap-northeast-2b
TAGS Blueprint t101-karpenter
SUBNETS False ap-northeast-2a apne2-az1 250 10.0.48.0/24 False False False False False 712218945685 available arn:aws:ec2:ap-northeast-2:712218945685:subnet/subnet-0eaf312876e26277c subnet-0eaf312876e26277c vpc-0a94d83d1d2a23be3
PRIVATEDNSNAMEOPTIONSONLAUNCH False False ip-name
TAGS Name t101-karpenter-public-ap-northeast-2a
TAGS Blueprint t101-karpenter
TAGS GithubRepo github.com/aws-ia/terraform-aws-eks-blueprints
TAGS kubernetes.io/role/elb 1
SUBNETS False ap-northeast-2c apne2-az3 4091 10.0.32.0/20 False False False False False 712218945685 available arn:aws:ec2:ap-northeast-2:712218945685:subnet/subnet-022e63af9a750193e subnet-022e63af9a750193e vpc-0a94d83d1d2a23be3
PRIVATEDNSNAMEOPTIONSONLAUNCH False False ip-name
TAGS karpenter.sh/discovery t101-karpenter
TAGS kubernetes.io/role/internal-elb 1
TAGS GithubRepo github.com/aws-ia/terraform-aws-eks-blueprints
:
# Check the public and private subnet CIDRs
## private_subnets = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 4, k)]
## public_subnets = [for k, v in local.azs : cidrsubnet(local.vpc_cidr, 8, k + 48)]
terraform state show 'module.vpc.aws_subnet.public[0]'
terraform state show 'module.vpc.aws_subnet.private[0]'
aws eks --region ap-northeast-2 update-kubeconfig --name t101-karpenter
sigridjineth@sigridjineth-Z590-VISION-G:~/sigrid/terraform-aws-eks-blueprints/patterns/karpenter$ kubectl cluster-info
Kubernetes control plane is running at https://9690F6AC485BDC1530DA59EE517421DC.gr7.ap-northeast-2.eks.amazonaws.com
CoreDNS is running at https://9690F6AC485BDC1530DA59EE517421DC.gr7.ap-northeast-2.eks.amazonaws.com/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
The resulting setup is an EKS cluster running on Fargate nodes, where:
- Karpenter is installed and configured for autoscaling
- Various AWS resources (IAM roles, CloudWatch events, SQS queues) are set up to support Karpenter’s operation
- The cluster is using EKS version 1.30 with containerd as the container runtime
- Secrets are encrypted using AWS KMS for enhanced security
# Deploy: takes about 2 minutes
terraform apply -auto-approve
# Verify
terraform state list
data.aws_ecrpublic_authorization_token.token
aws_eks_access_entry.karpenter_node_access_entry
module.eks_blueprints_addons.data.aws_caller_identity.current
module.eks_blueprints_addons.data.aws_eks_addon_version.this["coredns"]
module.eks_blueprints_addons.data.aws_eks_addon_version.this["kube-proxy"]
module.eks_blueprints_addons.data.aws_eks_addon_version.this["vpc-cni"]
module.eks_blueprints_addons.data.aws_iam_policy_document.karpenter[0]
module.eks_blueprints_addons.data.aws_iam_policy_document.karpenter_assume_role[0]
module.eks_blueprints_addons.data.aws_partition.current
module.eks_blueprints_addons.data.aws_region.current
module.eks_blueprints_addons.aws_cloudwatch_event_rule.karpenter["health_event"]
module.eks_blueprints_addons.aws_cloudwatch_event_rule.karpenter["instance_rebalance"]
module.eks_blueprints_addons.aws_cloudwatch_event_rule.karpenter["instance_state_change"]
module.eks_blueprints_addons.aws_cloudwatch_event_rule.karpenter["spot_interupt"]
module.eks_blueprints_addons.aws_cloudwatch_event_target.karpenter["health_event"]
module.eks_blueprints_addons.aws_cloudwatch_event_target.karpenter["instance_rebalance"]
module.eks_blueprints_addons.aws_cloudwatch_event_target.karpenter["instance_state_change"]
module.eks_blueprints_addons.aws_cloudwatch_event_target.karpenter["spot_interupt"]
module.eks_blueprints_addons.aws_eks_addon.this["coredns"]
module.eks_blueprints_addons.aws_eks_addon.this["kube-proxy"]
module.eks_blueprints_addons.aws_eks_addon.this["vpc-cni"]
module.eks_blueprints_addons.aws_iam_instance_profile.karpenter[0]
module.eks_blueprints_addons.aws_iam_role.karpenter[0]
module.eks_blueprints_addons.aws_iam_role_policy_attachment.karpenter["AmazonEC2ContainerRegistryReadOnly"]
module.eks_blueprints_addons.aws_iam_role_policy_attachment.karpenter["AmazonEKSWorkerNodePolicy"]
module.eks_blueprints_addons.aws_iam_role_policy_attachment.karpenter["AmazonEKS_CNI_Policy"]
module.eks_blueprints_addons.time_sleep.this
module.eks_blueprints_addons.module.karpenter.data.aws_caller_identity.current[0]
module.eks_blueprints_addons.module.karpenter.data.aws_iam_policy_document.assume[0]
module.eks_blueprints_addons.module.karpenter.data.aws_iam_policy_document.this[0]
module.eks_blueprints_addons.module.karpenter.data.aws_partition.current[0]
module.eks_blueprints_addons.module.karpenter.aws_iam_policy.this[0]
module.eks_blueprints_addons.module.karpenter.aws_iam_role.this[0]
module.eks_blueprints_addons.module.karpenter.aws_iam_role_policy_attachment.this[0]
module.eks_blueprints_addons.module.karpenter.helm_release.this[0]
module.eks_blueprints_addons.module.karpenter_sqs.data.aws_iam_policy_document.this[0]
module.eks_blueprints_addons.module.karpenter_sqs.aws_sqs_queue.this[0]
module.eks_blueprints_addons.module.karpenter_sqs.aws_sqs_queue_policy.this[0]
terraform show
...
# Check the Kubernetes cluster, node, and pod information
kubectl cluster-info
kubectl get nodes -L node.kubernetes.io/instance-type -L topology.kubernetes.io/zone
kubectl get node -owide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
fargate-ip-10-10-36-94.ap-northeast-2.compute.internal Ready <none> 10m v1.30.0-eks-404b9c6 10.10.36.94 <none> Amazon Linux 2 5.10.219-208.866.amzn2.x86_64 containerd://1.7.11
fargate-ip-10-10-4-201.ap-northeast-2.compute.internal Ready <none> 10m v1.30.0-eks-404b9c6 10.10.4.201 <none> Amazon Linux 2 5.10.219-208.866.amzn2.x86_64 containerd://1.7.11
fargate-ip-10-10-43-93.ap-northeast-2.compute.internal Ready <none> 10m v1.30.0-eks-404b9c6 10.10.43.93 <none> Amazon Linux 2 5.10.219-208.866.amzn2.x86_64 containerd://1.7.11
fargate-ip-10-10-46-178.ap-northeast-2.compute.internal Ready <none> 10m v1.30.0-eks-404b9c6 10.10.46.178 <none> Amazon Linux 2 5.10.219-208.866.amzn2.x86_64 containerd://1.7.11
kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
karpenter karpenter-6b8687f5db-r9b7q 1/1 Running 0 12m 10.10.36.94 fargate-ip-10-10-36-94.ap-northeast-2.compute.internal <none> <none>
karpenter karpenter-6b8687f5db-v8zwb 1/1 Running 0 12m 10.10.46.178 fargate-ip-10-10-46-178.ap-northeast-2.compute.internal <none> <none>
kube-system coredns-86dcddd859-x9zp8 1/1 Running 0 12m 10.10.4.201 fargate-ip-10-10-4-201.ap-northeast-2.compute.internal <none> <none>
kube-system coredns-86dcddd859-xxk97 1/1 Running 0 12m 10.10.43.93 fargate-ip-10-10-43-93.ap-northeast-2.compute.internal <none> <none>
# Check the Helm chart
helm list -n karpenter
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
karpenter karpenter 1 2024-07-20 23:34:26.74931 +0900 KST deployed karpenter-0.35.0 0.35.0
# SQS queue and EventBridge event rules for Karpenter to utilize for spot termination handling, capacity re-balancing, etc.
## https://jerryljh.tistory.com/132 , https://aws.github.io/aws-eks-best-practices/karpenter/
helm get values -n karpenter karpenter
USER-SUPPLIED VALUES:
serviceAccount:
annotations:
eks.amazonaws.com/role-arn: arn:aws:iam::911283464785:role/karpenter-2024072203075821610000000c
name: karpenter
settings:
aws:
clusterEndpoint: https://163CC620EAB64480AA969E78489258AD.yl4.ap-northeast-2.eks.amazonaws.com
clusterName: t101-karpenter
interruptionQueueName: karpenter-t101-karpenter
clusterEndpoint: https://163CC620EAB64480AA969E78489258AD.yl4.ap-northeast-2.eks.amazonaws.com
clusterName: t101-karpenter
interruptionQueue: karpenter-t101-karpenter
# Encrypt Kubernetes secrets with AWS KMS on existing clusters
## Symmetric, Can encrypt and decrypt data , Created in the same AWS Region as the cluster
## Warning - You can't disable secrets encryption after enabling it. This action is irreversible.
kubectl get secret -n karpenter
kubectl get secret -n karpenter sh.helm.release.v1.karpenter.v1 -o json | jq
terraform state list
terraform state show 'module.eks_blueprints_addons.module.karpenter_sqs.aws_sqs_queue_policy.this[0]'
Let’s take a look at karpenter.yaml.
---
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2
  role: karpenter-t101-karpenter
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: t101-karpenter
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: t101-karpenter
  tags:
    karpenter.sh/discovery: t101-karpenter
---
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClassRef:
        name: default
      requirements:
        - key: "karpenter.k8s.aws/instance-category"
          operator: In
          values: ["c", "m", "r"]
        - key: "karpenter.k8s.aws/instance-cpu"
          operator: In
          values: ["4", "8", "16", "32"]
        - key: "karpenter.k8s.aws/instance-hypervisor"
          operator: In
          values: ["nitro"]
        - key: "karpenter.k8s.aws/instance-generation"
          operator: Gt
          values: ["2"]
  limits:
    cpu: 1000
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 30s
EC2NodeClass defines the properties of the EC2 instances that Karpenter can provision:
- `amiFamily: AL2`: Specifies Amazon Linux 2 as the base AMI.
- `role: karpenter-t101-karpenter`: The IAM role for the EC2 instances.
- `subnetSelectorTerms` and `securityGroupSelectorTerms`: Use tags to select the appropriate subnets and security groups.
- `tags`: Adds a discovery tag to the instances.
NodePool defines the desired state of the node group and how Karpenter should manage it:
- `nodeClassRef`: References the EC2NodeClass to use.
- `requirements`: Sets constraints on the types of instances to use:
  - Instance categories: c (compute optimized), m (general purpose), r (memory optimized)
  - CPU counts: 4, 8, 16, or 32
  - Must use the Nitro hypervisor
  - Must be generation 3 or newer
- `limits`: Sets a maximum of 1000 CPU cores for this node pool.
- `disruption`: Configures node consolidation:
  - `consolidationPolicy: WhenEmpty`: Only consolidate nodes when they're empty.
  - `consolidateAfter: 30s`: Wait 30 seconds before consolidating.
kubectl get ec2nodeclass,nodepool
NAME AGE
ec2nodeclass.karpenter.k8s.aws/default 20s
NAME NODECLASS
nodepool.karpenter.sh/default default
Let’s create a test deployment, inflate, which runs pause containers that simply request CPU:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 0
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      terminationGracePeriodSeconds: 0
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: 1
kubectl get nodes -L karpenter.sh/nodepool -L node.kubernetes.io/instance-type -L topology.kubernetes.io/zone -L karpenter.sh/capacity-type
NAME STATUS ROLES AGE VERSION NODEPOOL INSTANCE-TYPE ZONE CAPACITY-TYPE
fargate-ip-10-0-1-181.ap-northeast-2.compute.internal Ready <none> 132m v1.30.0-eks-404b9c6 ap-northeast-2a
fargate-ip-10-0-15-74.ap-northeast-2.compute.internal Ready <none> 132m v1.30.0-eks-404b9c6 ap-northeast-2a
fargate-ip-10-0-35-74.ap-northeast-2.compute.internal Ready <none> 28m v1.30.0-eks-404b9c6 ap-northeast-2c
fargate-ip-10-0-4-211.ap-northeast-2.compute.internal Ready <none> 28m v1.30.0-eks-404b9c6 ap-northeast-2a
ip-10-0-3-231.ap-northeast-2.compute.internal Ready <none> 140m v1.30.0-eks-036c24b c4.2xlarge ap-northeast-2a
ip-10-0-45-99.ap-northeast-2.compute.internal Ready <none> 140m v1.30.0-eks-036c24b c4.2xlarge ap-northeast-2c
If you don’t want to use the Helm repository method, you can rewrite it as follows:
# Variable definitions
variable "cluster_name" {}
variable "cluster_endpoint" {}
variable "node_iam_role_arn" {}

# Clone the Karpenter repository from GitHub and prepare the chart
resource "null_resource" "prepare_karpenter_chart" {
  provisioner "local-exec" {
    command = <<-EOT
      git clone https://github.com/aws/karpenter-provider-aws.git
      mkdir -p karpenter-charts
      cp -R karpenter-provider-aws/charts/karpenter karpenter-charts/
      find karpenter-charts -type l | xargs rm
      rm -rf karpenter-provider-aws
    EOT
  }
}

# Generate the Karpenter values file
data "template_file" "karpenter_values" {
  template = file("${path.module}/karpenter-values.yaml")
  vars = {
    cluster_name      = var.cluster_name
    cluster_endpoint  = var.cluster_endpoint
    node_iam_role_arn = var.node_iam_role_arn
  }
}

resource "local_file" "karpenter_values" {
  content  = data.template_file.karpenter_values.rendered
  filename = "${path.module}/rendered-karpenter-values.yaml"
}

# Render the Helm template and fix the port settings
resource "null_resource" "helm_template_and_fix" {
  depends_on = [null_resource.prepare_karpenter_chart, local_file.karpenter_values]

  provisioner "local-exec" {
    command = <<-EOT
      helm template karpenter ./karpenter-charts/karpenter \
        --namespace karpenter \
        --create-namespace \
        -f ${local_file.karpenter_values.filename} \
        > ${path.module}/karpenter-resources.yaml

      # Fix duplicate port names
      sed -i 's/name: http-metrics/name: http-metrics-1/' ${path.module}/karpenter-resources.yaml
      sed -i 's/targetPort: http-metrics/targetPort: http-metrics-1/' ${path.module}/karpenter-resources.yaml
    EOT
  }

  triggers = {
    always_run = "${timestamp()}"
  }
}

# Apply the resources with kubectl
resource "null_resource" "apply_karpenter" {
  depends_on = [null_resource.helm_template_and_fix]

  provisioner "local-exec" {
    command = "kubectl apply -f ${path.module}/karpenter-resources.yaml"
  }

  triggers = {
    helm_template_execution = null_resource.helm_template_and_fix.id
  }
}
The Node Allocatable Capacity view from eks-node-viewer offers the insights below.
- Each node in a Kubernetes cluster has a certain amount of resources (CPU, memory, etc.) that can be allocated to pods. This is not the total capacity of the node, as some resources are reserved for system processes and Kubernetes components.
- The tool shows resource requests rather than actual usage, which varies dynamically as pods run. Showing requests is crucial for understanding how the Kubernetes scheduler works: the scheduler uses these requests to determine which nodes have enough capacity to run new pods.
go install github.com/awslabs/eks-node-viewer/cmd/eks-node-viewer@latest
ls $HOME/go/bin
export PATH=$PATH:$HOME/go/bin
6 nodes (800m/23820m) 3.4% cpu █░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ $0.955/hour | $697.014
8 pods (0 pending 8 running 8 bound)
ip-10-0-3-231.ap-northeast-2.compute.internal cpu █░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
ip-10-0-45-99.ap-northeast-2.compute.internal cpu █░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
fargate-ip-10-0-15-74.ap-northeast-2.compute.internal cpu ████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
fargate-ip-10-0-1-181.ap-northeast-2.compute.internal cpu ████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
fargate-ip-10-0-38-164.ap-northeast-2.compute.internal cpu ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
fargate-ip-10-0-18-151.ap-northeast-2.compute.internal cpu ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
•
←/→ page • q: quit
Finally, here is how to clean everything up.
helm uninstall kube-ops-view -n kube-system
terraform destroy -target="module.eks_blueprints_addons" -auto-approve
terraform destroy -target="module.eks" -auto-approve
terraform destroy -auto-approve
aws ec2 describe-vpcs --filter 'Name=isDefault,Values=false' --output yaml
rm -rf ~/.kube/config