EKS Cluster Architecture: Control Plane, Node Groups, and Endpoint Access Modes


EKS manages the Kubernetes control plane. You manage the data plane — the nodes that run your workloads. Node group type (managed, self-managed, Fargate) and cluster endpoint access mode (public, private, or both) are the first architectural decisions that shape everything downstream.


What AWS manages and what you manage

EKS is a managed Kubernetes service, but "managed" applies narrowly to the control plane — the API server, etcd, scheduler, and controller manager. AWS runs these across multiple AZs in an AWS-owned account. You cannot access the control plane nodes directly.

The data plane — the worker nodes that run your actual workloads — runs in your VPC, in your account. You manage it, or you use Fargate to have AWS manage the underlying instances (but you still manage pods, deployments, and resource configurations).

The split responsibility model


EKS control plane components are fully AWS-managed: high availability, patching, Kubernetes version upgrades (when you initiate them). The data plane is your responsibility: node OS patching, node group upgrades, cluster networking (VPC CNI), and storage drivers.

Prerequisites

  • Kubernetes architecture (API server, etcd, kubelet)
  • AWS VPC and EC2
  • IAM roles

Key Points

  • Control plane runs in AWS-managed account — you access it via the Kubernetes API endpoint only.
  • Your nodes communicate with the control plane via the cluster API endpoint (public or private).
  • EKS version upgrades: control plane upgrades first, then node groups (you trigger node group upgrades separately).
  • The aws-auth ConfigMap (or access entries API in newer EKS) maps AWS IAM principals to Kubernetes RBAC roles.

Node group types

Managed node groups (recommended for most workloads): AWS provisions and manages EC2 instances using an EKS-optimized AMI and an ASG. Node upgrades (new AMI for K8s version bump) happen via a managed rolling update — nodes are cordoned, drained, terminated, and replaced. You specify instance type, capacity, and labels/taints; AWS handles the rest.

Self-managed node groups: you manage the EC2 instances directly using your own launch templates. Required if you need custom AMIs, specific boot configurations, or instance types not supported by managed node groups. More operational overhead.

Fargate profiles: workloads matching specific namespace/label selectors run on Fargate instead of EC2 nodes. No node management. Each pod gets dedicated compute. No DaemonSets, no hostPath volumes, and not all EKS add-ons support Fargate — DaemonSet-based components cannot run there, though CoreDNS can if a matching Fargate profile exists.
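A Fargate profile can be declared in Terraform alongside the cluster. A minimal sketch, reusing the resource names from the examples in this article (aws_eks_cluster.main, aws_subnet.private) plus a hypothetical aws_iam_role.fargate_pod_execution for the pod execution role:

```hcl
resource "aws_eks_fargate_profile" "batch" {
  cluster_name           = aws_eks_cluster.main.name
  fargate_profile_name   = "batch"
  pod_execution_role_arn = aws_iam_role.fargate_pod_execution.arn  # hypothetical role name
  subnet_ids             = aws_subnet.private[*].id  # Fargate requires private subnets

  # Only pods matching a selector are scheduled onto Fargate
  selector {
    namespace = "batch"
    labels = {
      compute = "fargate"
    }
  }
}
```

Pods in the batch namespace carrying the compute=fargate label land on Fargate; everything else stays on the EC2 node groups.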

resource "aws_eks_node_group" "main" {
  cluster_name    = aws_eks_cluster.main.name
  node_group_name = "main"
  node_role_arn   = aws_iam_role.node.arn
  subnet_ids      = aws_subnet.private[*].id

  instance_types = ["m6i.xlarge"]

  scaling_config {
    desired_size = 3
    max_size     = 10
    min_size     = 1
  }

  update_config {
    max_unavailable = 1  # rolling update: 1 node unavailable at a time
  }

  labels = {
    role = "general"
  }

  # Use launch template for custom configurations
  launch_template {
    id      = aws_launch_template.eks_node.id
    version = aws_launch_template.eks_node.latest_version
  }
}
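The launch_template block above references a template the snippet does not define. A minimal sketch of what it might contain — the name and settings are illustrative. Note that image_id is deliberately omitted, so the managed node group keeps supplying the EKS-optimized AMI; setting image_id would switch the group to a custom AMI and make you responsible for bootstrap user data:

```hcl
resource "aws_launch_template" "eks_node" {
  name = "eks-node"

  # No image_id: the managed node group injects the EKS-optimized AMI.

  # Require IMDSv2 on the nodes
  metadata_options {
    http_tokens                 = "required"
    http_put_response_hop_limit = 2
  }

  tag_specifications {
    resource_type = "instance"
    tags = {
      Name = "eks-node"
    }
  }
}
```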

Cluster endpoint access modes

The EKS cluster API endpoint can be public, private, or both. This controls who can reach the Kubernetes API server:

Public (default): the API endpoint is accessible from the internet. Your kubectl commands, CI/CD pipelines, and applications can reach it from anywhere with valid credentials. Convenient, but the API server is internet-exposed.

Private: the API endpoint is only accessible from within your VPC (or peered networks). All kubectl access must go through a bastion host, VPN, or AWS Direct Connect. Required for strict network isolation requirements.

Public and private (recommended): external clients (developers, CI/CD) use the public endpoint; in-cluster nodes use the private endpoint (within VPC, no internet roundtrip). This is the practical default — developer convenience without routing all node-to-control-plane traffic over the internet.

resource "aws_eks_cluster" "main" {
  name     = "production"
  role_arn = aws_iam_role.cluster.arn
  version  = "1.29"

  vpc_config {
    subnet_ids              = concat(aws_subnet.private[*].id, aws_subnet.public[*].id)
    endpoint_private_access = true
    endpoint_public_access  = true
    public_access_cidrs     = ["198.51.100.0/24", "203.0.113.0/24"]  # restrict public endpoint to known CIDRs
  }

  # Enable control plane logging
  enabled_cluster_log_types = ["api", "audit", "authenticator", "controllerManager", "scheduler"]
}

public_access_cidrs restricts which IP ranges can reach the public API endpoint — your office IPs, CI/CD runner IPs, VPN exit nodes. This significantly reduces the attack surface of the public endpoint.

EKS add-ons: managing core cluster components

EKS manages several cluster-critical components as add-ons, with AWS handling version compatibility and updates:

  • VPC CNI (aws-vpc-cni): assigns VPC IPs to pods (each pod gets a real VPC IP)
  • CoreDNS: cluster DNS resolution
  • kube-proxy: iptables rules for Service networking
  • EBS CSI driver (aws-ebs-csi-driver): dynamic EBS provisioning for PVCs
  • EFS CSI driver (aws-efs-csi-driver): EFS persistent volumes
  • Pod Identity Agent: EKS Pod Identity, the newer alternative to IAM Roles for Service Accounts (IRSA) for granting pods AWS permissions

# List available add-ons for your cluster version
aws eks describe-addon-versions --kubernetes-version 1.29

# Install/update an add-on
aws eks create-addon \
  --cluster-name production \
  --addon-name aws-vpc-cni \
  --addon-version v1.18.0-eksbuild.1 \
  --resolve-conflicts OVERWRITE

Key operational note: when upgrading the Kubernetes cluster version, add-ons must also be upgraded. AWS marks add-ons as requiring updates after a cluster version upgrade. Run aws eks describe-addon --cluster-name production --addon-name aws-vpc-cni to check the status.
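Add-ons can also be pinned in Terraform, which keeps their versions reviewable alongside the cluster version. A sketch, assuming the cluster resource from the earlier example:

```hcl
resource "aws_eks_addon" "ebs_csi" {
  cluster_name  = aws_eks_cluster.main.name
  addon_name    = "aws-ebs-csi-driver"
  addon_version = "v1.30.0-eksbuild.1"  # illustrative version; bump alongside cluster upgrades

  # In practice the EBS CSI driver also needs AWS permissions, supplied via
  # service_account_role_arn (IRSA) or an EKS Pod Identity association.

  # OVERWRITE mirrors --resolve-conflicts from the CLI example above
  resolve_conflicts_on_update = "OVERWRITE"
}
```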

The VPC CNI add-on is particularly important — it controls how many pods can run per node (limited by ENIs and IP addresses per ENI). Certain instance types support ENI prefix delegation (assigning /28 CIDR blocks per ENI slot), significantly increasing pod density.
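Prefix delegation is toggled through the VPC CNI's environment configuration. With the managed add-on this can be passed via configuration values — a sketch, assuming the add-on is managed through the aws_eks_addon resource:

```hcl
resource "aws_eks_addon" "vpc_cni" {
  cluster_name = aws_eks_cluster.main.name
  addon_name   = "aws-vpc-cni"

  # Each ENI slot then holds a /28 prefix (16 IPs) instead of a single IP,
  # raising the max-pods ceiling per node on supported (Nitro) instance types.
  configuration_values = jsonencode({
    env = {
      ENABLE_PREFIX_DELEGATION = "true"
    }
  })
}
```

After enabling this, node groups must be recycled and max-pods recalculated for the higher density to take effect.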

IAM access to the cluster

EKS uses two mechanisms to grant IAM principals access to the Kubernetes API:

aws-auth ConfigMap (legacy): maps IAM roles and users to Kubernetes RBAC groups. The creator of the cluster gets automatic system:masters access; everyone else must be added to aws-auth.

EKS Access Entries (recommended; supported on Kubernetes 1.23+ clusters at recent platform versions): manage IAM-to-Kubernetes access via the EKS API rather than a ConfigMap. Less error-prone — a misconfigured aws-auth ConfigMap can lock you out of your cluster.

# Add access entry (new API)
aws eks create-access-entry \
  --cluster-name production \
  --principal-arn arn:aws:iam::123456789012:role/DeveloperRole \
  --type STANDARD

aws eks associate-access-policy \
  --cluster-name production \
  --principal-arn arn:aws:iam::123456789012:role/DeveloperRole \
  --policy-arn arn:aws:eks::aws:cluster-access-policy/AmazonEKSViewPolicy \
  --access-scope type=namespace,namespaces=default,staging
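The same access entry can be expressed in Terraform via aws_eks_access_entry and aws_eks_access_policy_association — a sketch reusing the role ARN from the CLI example:

```hcl
resource "aws_eks_access_entry" "developer" {
  cluster_name  = aws_eks_cluster.main.name
  principal_arn = "arn:aws:iam::123456789012:role/DeveloperRole"
  type          = "STANDARD"
}

resource "aws_eks_access_policy_association" "developer_view" {
  cluster_name  = aws_eks_cluster.main.name
  principal_arn = aws_eks_access_entry.developer.principal_arn
  policy_arn    = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSViewPolicy"

  # Scope read-only access to specific namespaces
  access_scope {
    type       = "namespace"
    namespaces = ["default", "staging"]
  }
}
```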

An EKS cluster has endpoint_public_access=false and endpoint_private_access=true. A developer running kubectl from their laptop gets 'connection refused'. They have valid AWS credentials and their IAM role is in aws-auth. What is the most likely cause?

Difficulty: easy

Context: the developer is working from home, not connected to VPN. The cluster is in a private VPC with no NAT gateway. The cluster was recently created.

  • A. The developer's IAM credentials are expired
    Incorrect. Expired credentials would produce an authentication error from the API server, not 'connection refused'. A refused connection means the API endpoint is unreachable at the network level.
  • B. With private-only endpoint access, the Kubernetes API is only reachable from within the VPC — the developer cannot reach it from their laptop without VPN or a bastion host
    Correct! Private-only endpoint access means the cluster API endpoint resolves to a private IP within the VPC and is not reachable from the internet. The developer's laptop is on the internet. Without VPN, Direct Connect, or a bastion host in the VPC, there's no network path to the API. Fix: either enable public endpoint access (with restricted CIDRs), or connect the developer via VPN before running kubectl.
  • C. The developer needs to add the cluster's CA certificate to their local trust store
    Incorrect. Certificate trust issues produce TLS errors, not connection refused. Connection refused is a network-level failure before TLS happens.
  • D. kubectl is configured with the wrong cluster endpoint URL
    Incorrect. A wrong URL would typically produce a DNS resolution failure or a connection to the wrong host. Here, the correct endpoint URL resolves to a private IP with no network path from the developer's laptop, so the connection fails before any Kubernetes-level error can occur.

Hint: What does private-only endpoint access mean for connectivity from outside the VPC?