Avoiding IP consumption in Amazon EKS
This article shows how Amazon EKS allocates IP addresses in a cluster and how you can avoid running out of them. If you are impatient to get started, take a look at this repository, which contains all the Terraform configuration in one place.
Container Network Interface and pod networking
Every Amazon EKS cluster comes pre-configured with the AWS VPC CNI plugin, which provides networking for pods. The plugin assigns an IP address to each pod as it is created. To do so, it needs to know the underlying instance's ENI capacity, how many interfaces and secondary IP addresses the particular instance type supports, and how many IP addresses are still available in your VPC.
The VPC CNI becomes a problem if you don't do proper network planning before going to production. The limitation has two aspects. First, if your subnets contain only a small number of IP addresses, you can quickly run into trouble. Second, pod IPs come from ENIs attached to the nodes, so the number of pods that can run on a node is limited by the ENI capacity of the instance and therefore by the EC2 instance type.
To illustrate the first limitation, assume you created three subnets with a /28 prefix and associated them with your worker nodes. Each /28 subnet contains 16 addresses, of which AWS reserves 5, leaving 11 usable addresses per subnet, or 33 in total for your worker nodes and the pods launched in the cluster. If your cluster has 5 worker nodes, each node consumes one of those addresses, which leaves room for only 28 pods, not a lot if you are running multiple applications within that VPC.
To show the second limitation, suppose you have a t3.micro instance, which supports 2 ENIs and 2 IP addresses per ENI. The maximum number of pods per node is given by:

max pods = (number of ENIs × (IPs per ENI − 1)) + 2

For the t3.micro this gives 2 × (2 − 1) + 2 = 4 pods per instance, which is clearly very limited and almost certainly means you are wasting compute resources.
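If you want to check the ENI and IP-per-ENI figures that feed this formula for a given instance type, they can be queried with the AWS CLI; a small sketch (field names per the EC2 DescribeInstanceTypes API):

```bash
# Look up the ENI limit and IPv4 addresses per ENI for t3.micro
aws ec2 describe-instance-types \
  --instance-types t3.micro \
  --query "InstanceTypes[].NetworkInfo.[MaximumNetworkInterfaces,Ipv4AddressesPerInterface]" \
  --output text
```

Plugging the two returned numbers into the formula gives the maximum pod count for that instance type.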
How to solve the problem?
1. Adding additional IPv4 CIDR blocks to the VPC
EKS allows clusters to be created in a VPC with additional IPv4 CIDR blocks. By adding secondary CIDR blocks from the 100.64.0.0/10 or 198.19.0.0/16 ranges to the VPC, pods no longer need to consume RFC 1918 IP addresses from the VPC's primary range.
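As a minimal sketch of how the secondary range can be added (the VPC ID, availability zone, and CIDR sizes below are placeholders, and the repository linked at the end does this with Terraform instead of the CLI):

```bash
# Associate a secondary CIDR block with the existing VPC (placeholder VPC ID)
aws ec2 associate-vpc-cidr-block \
  --vpc-id vpc-0123456789abcdef0 \
  --cidr-block 100.64.0.0/16

# Carve a subnet for pods out of the secondary range (repeat per availability zone)
aws ec2 create-subnet \
  --vpc-id vpc-0123456789abcdef0 \
  --cidr-block 100.64.0.0/19 \
  --availability-zone us-east-1a
```

Keep in mind that for pods to actually draw their addresses from these subnets, the VPC CNI also needs custom networking enabled (the AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG setting on the aws-node daemon set plus per-AZ ENIConfig resources); see the AWS documentation for the details.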
A secondary CIDR range relieves the pressure on your primary range and makes room for more pods in the cluster. However, it does not solve the second limitation: your instances can still be underutilized, and you don't want to increase the worker node count unless you are already hitting high resource utilization on the existing nodes.
2. Replacing the VPC CNI with the Calico CNI
You can also tackle the problem with an alternative CNI. There are many CNIs available (e.g. Calico, Weave Net, Flannel). I chose Calico because it is straightforward to install and it lets you take advantage of the full set of Calico networking features, including its flexible IP address management capabilities.
To get an EKS cluster to use the Calico CNI, a few steps are required: first uninstall the default VPC CNI, then install Calico, remove the per-host pod limit, and finally restart the kubelet on the worker nodes.
With the introduction out of the way, let's get into the details. First, to uninstall the VPC CNI using the Helm provider, I replaced the container image of the aws-node daemon set with google/pause, which effectively removes the pods that manage VPC CNI networking. Take a look at the code below.
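The article automates this step with the Helm provider, but as a rough manual equivalent (a sketch, not the author's exact code) the same effect can be achieved with kubectl:

```bash
# Swap the aws-node container image for the no-op pause image,
# which disables the VPC CNI without deleting the daemon set.
kubectl -n kube-system set image daemonset/aws-node aws-node=google/pause
```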
Installing the Calico CNI is then pretty simple: I followed the official documentation and applied the YAML manifest it provides, again via the Helm provider.
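As a sketch of the manual equivalent, the install boils down to applying the Calico manifest; the URL and version below are assumptions, so check the Calico documentation for the release you are targeting:

```bash
# Apply the Calico manifest for clusters where Calico provides pod networking
# (VXLAN mode is commonly used on EKS); version and URL are illustrative.
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/calico-vxlan.yaml
```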
Then, to remove the per-host pod limit and solve the max-pod capacity problem on each worker node, I created a daemon set of privileged containers with access to the host PID namespace and the host filesystem. The daemon set uses a busybox init container to do the work: modify the kubelet config file, change the kubelet service behavior, and finally restart the kubelet. You can see a sketch of the bash code below.
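Below is a minimal sketch of what such an init-container script can look like. The file paths and the way max pods is configured depend on the EKS AMI version, so treat this as an illustration rather than the exact script from the repository:

```bash
#!/bin/sh
# Runs in a privileged busybox init container with the host filesystem mounted at /host
# (assumed mount point). Drops the max-pods override and restarts the kubelet so it
# falls back to the Kubernetes default of 110 pods per node.
chroot /host /bin/sh -c '
  # Remove any --max-pods flag from the kubelet unit (naive text edit)
  sed -i "s/--max-pods=[0-9]*//g" /etc/systemd/system/kubelet.service
  # Drop the maxPods entry from the kubelet config file (a jq-based edit would be safer)
  sed -i "/\"maxPods\"/d" /etc/kubernetes/kubelet/kubelet-config.json
  systemctl daemon-reload
  systemctl restart kubelet
'
```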
After that, your worker nodes should report the Kubernetes default max pods value (110). The following command returns the maximum pods value for <node_name>:

```bash
kubectl get node <node_name> -o jsonpath='{.status.capacity.pods}{"\n"}'
```
And you are all set. Your EKS cluster should now have Calico CNI installed and running.
I implemented the whole process with Terraform and its Helm provider to get a fully automated solution. I modified the official Terraform EKS module; using official Terraform modules keeps the AWS components simple and follows best practices from verified providers (a.k.a. don't reinvent the wheel). The solution was developed as a submodule inside the official module, and you can see the full code at the following link.
https://github.com/matiaszilli/terraform-aws-eks/tree/master/modules/calico_cni
Wrapping up
Every technology has its peculiarities and pitfalls. The advantages of using EKS are enormous, but as you move into large, complex architectures, EKS can end up consuming a lot of IP addresses. Don't let that become a problem: make sure you have enough networking room for your system!
I would really appreciate any questions, feedback, or comments, so feel free to ping me here or leave a comment below this post.