EKS Cluster Setup and Management with eksctl

After years of deploying EKS clusters across various scenarios, from weekend projects to enterprise deployments, I've developed strong opinions about the four main ways to deploy them: eksctl, Terraform, AWS CDK, and CloudFormation. Let me walk you through my experiences and help you choose the right approach for your needs.

Quick Overview: My Personal Take

Before diving deep, here's my TL;DR:

  • eksctl: My go-to for personal projects and quick starts

  • Terraform: Best for enterprise and multi-cloud setups

  • CDK: Great when your team loves TypeScript/Python

  • CloudFormation: Solid but verbose; I mainly use it via CDK

Deep Dive: Each Method in Detail

1. eksctl: The Swift and Simple Approach

When I'm prototyping or need a cluster quickly, eksctl is my first choice. Here's why:

# The simplest way to get started
eksctl create cluster --name weekend-project --nodes 2

# My favorite production setup
eksctl create cluster -f prod-config.yaml

My go-to production configuration:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: prod-cluster
  region: us-west-2
vpc:
  cidr: "192.168.0.0/16"
  nat:
    gateway: HighlyAvailable
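# Managed node groups: the spot and instanceTypes fields used below belong to them
managedNodeGroups:
  # On-demand nodes reserved for pods that tolerate the "dedicated" taint
  - name: critical-workloads
    instanceType: m5.xlarge
    desiredCapacity: 3
    privateNetworking: true
    labels:
      workload: critical
    taints:
      - key: dedicated
        value: critical
        effect: NoSchedule

  # Cheaper Spot capacity for interruptible workloads
  - name: spot-workloads
    instanceTypes: ["t3.large", "t3.xlarge"]
    spot: true
    desiredCapacity: 2
    labels:
      workload: flexible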
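
The taint on the critical-workloads group keeps ordinary pods off those nodes, so a workload meant to run there needs a matching toleration plus a node selector for the label. The following is a minimal sketch of those two pod-spec fields as plain TypeScript objects. The `placementFor` helper and the interface names are hypothetical and only illustrate the shape; the key, value, and effect come from the configuration above.

```typescript
// Hypothetical types mirroring the Kubernetes pod-spec fields involved.
interface Toleration {
  key: string;
  operator: "Equal" | "Exists";
  value?: string;
  effect: "NoSchedule" | "PreferNoSchedule" | "NoExecute";
}

interface PodPlacement {
  nodeSelector: Record<string, string>;
  tolerations: Toleration[];
}

// Build the scheduling fields a pod needs to land on a labeled, tainted node group.
// The node selector steers the pod to the group; the toleration lets it past the taint.
function placementFor(
  labelKey: string,
  labelValue: string,
  taintKey: string,
  taintValue: string,
): PodPlacement {
  const nodeSelector: Record<string, string> = {};
  nodeSelector[labelKey] = labelValue;
  return {
    nodeSelector,
    tolerations: [
      { key: taintKey, operator: "Equal", value: taintValue, effect: "NoSchedule" },
    ],
  };
}

// Values taken from the critical-workloads node group.
const criticalPlacement = placementFor("workload", "critical", "dedicated", "critical");
console.log(JSON.stringify(criticalPlacement, null, 2));
```

Merging the printed object into a Deployment's pod template spec should be enough for the scheduler to place those pods on the critical nodes, assuming nothing else in the spec conflicts with it.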

When I Use It

  • Personal projects

  • Quick prototypes

  • Simple production setups

  • When teaching others about EKS

2. Terraform: The Enterprise Workhorse

For my enterprise clients or complex multi-cloud setups, Terraform is unbeatable. Here's my battle-tested setup:

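module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  # Pinned to v19: manage_aws_auth_configmap and aws_auth_roles (used below)
  # were moved out of the root module in v20
  version = "~> 19.0"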

  cluster_name    = "enterprise-cluster"
  cluster_version = "1.27"

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets

  # My preferred node group setup
  eks_managed_node_groups = {
    critical = {
      min_size     = 3
      max_size     = 10
      desired_size = 3

      instance_types = ["m5.xlarge"]
      capacity_type  = "ON_DEMAND"

      labels = {
        workload = "critical"
      }

      taints = [
        {
          key    = "dedicated"
          value  = "critical"
          effect = "NO_SCHEDULE"
        }
      ]
    }

    spot = {
      min_size     = 2
      max_size     = 20
      desired_size = 2

      instance_types = ["t3.large", "t3.xlarge"]
      capacity_type  = "SPOT"

      labels = {
        workload = "flexible"
      }
    }
  }

  # Authentication and RBAC
  manage_aws_auth_configmap = true
  aws_auth_roles = [
    {
      # Placeholder ARN: substitute your own 12-digit account ID and role name
      rolearn  = "arn:aws:iam::666666666666:role/role1"
      username = "role1"
      groups   = ["system:masters"]
    },
  ]
}

When I Use It

  • Multi-cloud environments

  • Complex infrastructure requirements

  • When working with enterprise clients

  • Need for strong state management

3. AWS CDK: The Developer's Choice

When I'm working with teams that live and breathe TypeScript or Python, CDK feels natural:

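import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as eks from 'aws-cdk-lib/aws-eks';
import * as ec2 from 'aws-cdk-lib/aws-ec2';

export class EksStack extends cdk.Stack {
  // In CDK v2, Construct comes from the separate 'constructs' package
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {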
    super(scope, id, props);

    // My preferred cluster setup
    const cluster = new eks.Cluster(this, 'DevCluster', {
      version: eks.KubernetesVersion.V1_27,
      defaultCapacity: 0,
      vpc: new ec2.Vpc(this, 'EksVpc', {
        maxAzs: 3,
        natGateways: 1,
      }),
      endpointAccess: eks.EndpointAccess.PRIVATE,
    });

    // Critical workload node group
    cluster.addNodegroupCapacity('CriticalNodes', {
      instanceTypes: [ec2.InstanceType.of(ec2.InstanceClass.M5, ec2.InstanceSize.XLARGE)],
      minSize: 3,
      maxSize: 10,
      desiredSize: 3,
      labels: {
        workload: 'critical',
      },
      taints: [{
        key: 'dedicated',
        value: 'critical',
        effect: eks.TaintEffect.NO_SCHEDULE,
      }],
    });

    // Spot instances for flexible workloads
    cluster.addNodegroupCapacity('SpotNodes', {
      instanceTypes: [
        ec2.InstanceType.of(ec2.InstanceClass.T3, ec2.InstanceSize.LARGE),
        ec2.InstanceType.of(ec2.InstanceClass.T3, ec2.InstanceSize.XLARGE),
      ],
      capacityType: eks.CapacityType.SPOT,
      minSize: 2,
      maxSize: 20,
      desiredSize: 2,
      labels: {
        workload: 'flexible',
      },
    });
  }
}

When I Use It

  • Teams comfortable with TypeScript/Python

  • Need for strong type checking

  • AWS-focused environments

  • Want to leverage existing AWS constructs

4. CloudFormation: The AWS Native

While I rarely write raw CloudFormation anymore, understanding it helps when working with CDK or debugging:

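AWSTemplateFormatVersion: '2010-09-09'
# Fragment: EksServiceRole, NodeInstanceRole, and ClusterSecurityGroup are
# assumed to be defined elsewhere in this template
Parameters:
  SubnetIds:
    Type: List<AWS::EC2::Subnet::Id>
Resources: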
  EksCluster:
    Type: 'AWS::EKS::Cluster'
    Properties:
      Name: native-cluster
      Version: '1.27'
      RoleArn: !GetAtt EksServiceRole.Arn
      ResourcesVpcConfig:
        SecurityGroupIds: 
          - !Ref ClusterSecurityGroup
        SubnetIds: !Ref SubnetIds

  EksNodegroup:
    Type: 'AWS::EKS::Nodegroup'
    Properties:
      ClusterName: !Ref EksCluster
      NodeRole: !GetAtt NodeInstanceRole.Arn
      ScalingConfig:
        MinSize: 2
        DesiredSize: 3
        MaxSize: 10
      Subnets: !Ref SubnetIds
      InstanceTypes: 
        - t3.large

When I Use It

  • Need AWS-native solutions

  • Working with existing CloudFormation stacks

  • Simple, standalone clusters

Making the Choice: My Decision Framework

I choose my deployment method based on these factors:

  1. Project Scale

    • Personal/Small Team → eksctl

    • Enterprise → Terraform/CDK

  2. Team Experience

    • AWS experts → CDK

    • Multi-cloud teams → Terraform

    • Kubernetes focused → eksctl

  3. Infrastructure Complexity

    • Simple → eksctl

    • Complex → Terraform

    • AWS-specific complexity → CDK

  4. Long-term Maintenance

    • Need state management → Terraform

    • Heavy AWS integration → CDK

    • Quick deployment → eksctl
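
The four factors above can be read as a simple voting scheme: each factor casts a vote for the tool it points to, and the tool with the most votes wins. Here is a minimal sketch of that reading. The `ProjectProfile` type, the `recommend` function, and the equal weighting of factors are assumptions made for illustration; the framework itself does not say how to weigh one factor against another.

```typescript
type Tool = "eksctl" | "Terraform" | "CDK";
const TOOLS: Tool[] = ["eksctl", "Terraform", "CDK"];

// Hypothetical profile: one field per factor in the framework.
interface ProjectProfile {
  scale: "small" | "enterprise";
  team: "aws" | "multi-cloud" | "kubernetes";
  complexity: "simple" | "complex" | "aws-specific";
  maintenance: "state-management" | "aws-integration" | "quick-deployment";
}

// Count one vote per factor (assumption: all factors weigh the same).
function tallyVotes(p: ProjectProfile): Record<Tool, number> {
  const votes: Record<Tool, number> = { eksctl: 0, Terraform: 0, CDK: 0 };

  // 1. Project scale: enterprise points to both Terraform and CDK.
  if (p.scale === "small") {
    votes.eksctl += 1;
  } else {
    votes.Terraform += 1;
    votes.CDK += 1;
  }

  // 2. Team experience
  const byTeam: Record<ProjectProfile["team"], Tool> = {
    "aws": "CDK",
    "multi-cloud": "Terraform",
    "kubernetes": "eksctl",
  };
  votes[byTeam[p.team]] += 1;

  // 3. Infrastructure complexity
  const byComplexity: Record<ProjectProfile["complexity"], Tool> = {
    "simple": "eksctl",
    "complex": "Terraform",
    "aws-specific": "CDK",
  };
  votes[byComplexity[p.complexity]] += 1;

  // 4. Long-term maintenance
  const byMaintenance: Record<ProjectProfile["maintenance"], Tool> = {
    "state-management": "Terraform",
    "aws-integration": "CDK",
    "quick-deployment": "eksctl",
  };
  votes[byMaintenance[p.maintenance]] += 1;

  return votes;
}

// Return every tool tied for the most votes, so ties stay visible to the caller.
function recommend(p: ProjectProfile): Tool[] {
  const votes = tallyVotes(p);
  let best = 0;
  for (const tool of TOOLS) {
    best = Math.max(best, votes[tool]);
  }
  return TOOLS.filter((tool) => votes[tool] === best);
}

const weekendProject: ProjectProfile = {
  scale: "small",
  team: "kubernetes",
  complexity: "simple",
  maintenance: "quick-deployment",
};
console.log(recommend(weekendProject));
```

Returning a list rather than a single tool is deliberate: a tie (for example, Terraform and CDK on an enterprise project) is a signal to weigh the factors by hand rather than trust the count.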

Conclusion

After years of working with EKS, I've learned that no single tool is perfect for every situation. eksctl remains my favorite for quick starts and simple production deployments, but I don't hesitate to reach for Terraform or CDK when the situation calls for it.

The key is understanding each tool's strengths and choosing the right one for your specific needs. Don't let anyone tell you there's only one "right way" to deploy EKS – use what works best for your situation.

Remember: The best tool is the one that helps you ship reliable infrastructure while keeping your team productive and your maintenance burden manageable.