We chatted with Justin Garrison (@rothgar), a Developer Advocate at AWS, about the challenges of Kubernetes at scale, what he’s seen in his experience, and what he thinks the future holds for container orchestration.
Read on for insights from a practitioner who has solved real-world problems in the field, and now helps guide developers through the exciting changes in the container landscape.
Justin, let’s jump right in. What do you do as a Sr. Developer Advocate at AWS, and why did you choose to go down that route after being an engineer?
I am on the EKS service team at AWS, so I'm involved in the feedback and prioritization efforts to make our container services better for customers. I worked as an engineer at various companies before this, and being a developer advocate lets me stay in touch with my developer roots and gives me the opportunity to take an active role in the Cloud Native and Kubernetes communities.
My previous engineering roles were focused on internal engineering efforts, and I wanted a role that would give me freedom to help the community and customers with non-code resources like blog posts, videos, and workshops.
How did you start your real-world experience with Kubernetes and what have been the most important changes since you began working with containers and orchestration?
I was trying to solve problems to allow development teams to ship applications faster and make the infrastructure more reliable. I was looking at containers because the VMs with configuration management we were using at the time weren’t enough.
We started with a handful of container hosts, which worked well, and I became the human orchestrator. That worked until I wanted to take a vacation and we needed to scale some of our applications. That's when the need for a software orchestrator made a lot of sense to me.
The biggest change with Kubernetes since then has been the ecosystem. So many new features and projects have emerged; some fix pain points with existing infrastructure and others solve problems in Kubernetes-native ways.
Where do you see Kubernetes going in the next five years? What excites you the most?
Kubernetes is starting to be invisible for a lot of organizations. This is great because it is becoming so ubiquitous and reliable that we don’t need to think about it anymore. When I drive to the store I don’t think about the engine or cylinders. I only think about pedals and steering.
Kubernetes is the engine for a lot of people’s infrastructure, but the interface to the engine abstracts away the details. If we switch to gas, hybrid, or electric engines, the user doesn’t worry about it because their interface is still the same.
That’s the part that excites me the most. Kubernetes was intended to be extended, but it keeps a similar interface for the user. More and more can be controlled through the Kubernetes API with controllers, which lets us help people focus on driving.
I’ve seen you writing about dockershim being deprecated. Can you give a brief overview of that for some of the less experienced readers, as well as explain the repercussions?
Many users want to run docker commands from inside their workloads for things like building containers or monitoring. Running docker inside a docker container requires a way to communicate with the docker engine on the host, which is usually accomplished by mounting the docker socket.
When Kubernetes first started, it was tightly coupled with the docker engine for running containers through code in Kubernetes called the dockershim. As the container ecosystem matured, alternative container runtimes became available with unique features, and the Container Runtime Interface (CRI) was developed to help users pick whichever runtime they wanted. New container features can be developed in the runtime and Kubernetes only needs to use the CRI to run them, but for the docker engine, Kubernetes still needs to use the dockershim code.
The dockershim code is being removed from Kubernetes in favor of runtimes that support the CRI interface. This means that to build containers from inside Kubernetes pods, users will need to move to different build tools that do not require the docker engine, or run instances outside of Kubernetes that run the docker engine.
I built a tool for users to check their clusters to see if any workloads are mounting the docker socket volume to make sure they are not affected by this change. You can find it at this link: https://cftc.info/dds
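For readers curious what that kind of check involves, here is a minimal sketch (not the tool linked above) that lists pods defining a hostPath volume pointing at the docker socket, using the official Kubernetes Python client. The function name and output format are illustrative only, and the sketch assumes you have kubeconfig access to the cluster.

```python
# Minimal sketch: find pods that define a hostPath volume for the docker
# socket, using the official Kubernetes Python client.
from kubernetes import client, config

DOCKER_SOCKET = "/var/run/docker.sock"

def pods_mounting_docker_socket():
    config.load_kube_config()  # or config.load_incluster_config() inside a pod
    v1 = client.CoreV1Api()
    affected = []
    for pod in v1.list_pod_for_all_namespaces().items:
        for volume in pod.spec.volumes or []:
            # hostPath volumes expose files from the node; the docker socket
            # is the usual way workloads talk to the host's docker engine.
            if volume.host_path and volume.host_path.path == DOCKER_SOCKET:
                affected.append(f"{pod.metadata.namespace}/{pod.metadata.name}")
    return affected

if __name__ == "__main__":
    for name in pods_mounting_docker_socket():
        print(name)
```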
Regarding scaling, how is Karpenter an improvement over past autoscaling technologies such as Cluster Autoscaler? Are there distinct advantages to using it with a comprehensive service like EKS?
Automatically scaling Kubernetes nodes works similarly to scaling workloads: if you don’t have enough resources, you add one or more nodes. This has worked great for clusters that have similar workload needs.
Workloads are not always similar and don’t always need the same thing. Some workloads need ½ a CPU and 1 GB of memory while others need 8 CPUs and 16 GB of memory. The only way to run both of those workloads in the same cluster is to either pick the largest workload requirement (e.g. machines with at least 8 CPUs and 16 GB of memory) or create separate node groups for each workload. This gets more complex when you have multiple CPU architectures (e.g. arm) and compute accelerators (e.g. GPUs). The number of node groups you need to configure and autoscale grows rapidly.
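For illustration, here is roughly what those two example workloads’ resource requests might look like, expressed with the official Kubernetes Python client’s models; the container names and images are hypothetical. A node group sized for the larger request leaves most of its capacity idle whenever it runs the smaller one.

```python
# Illustrative only: the two example workloads above as Kubernetes
# resource requests, built with the official Python client's models.
from kubernetes import client

small = client.V1Container(
    name="small-service",
    image="example/small:latest",  # hypothetical image
    resources=client.V1ResourceRequirements(
        requests={"cpu": "500m", "memory": "1Gi"},  # half a CPU, 1 GB
    ),
)

large = client.V1Container(
    name="large-service",
    image="example/large:latest",  # hypothetical image
    resources=client.V1ResourceRequirements(
        requests={"cpu": "8", "memory": "16Gi"},  # 8 CPUs, 16 GB
    ),
)
```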
Karpenter is groupless. Nodes are created directly for workload needs without any pre-configuration. I like to call it “workload native” because whatever the workload needs is what it gets. This opens up the possibility for a wider variety of workloads and lowers the management overhead.
Karpenter works great with EKS to lower the complexity required to run a Kubernetes cluster. With a fully managed control plane and a workload native data plane, the operational burden goes down a lot. Application developers can get right-sized resources when they need them. It’s a project I’m really excited about, and I think it can help companies scale and manage their clusters better than the traditional “one size fits all” approach.
In general, do you think the benefits of using a multi-cluster architecture (when warranted) outweigh the challenges? What are some types of situations that warrant a really sophisticated multi-cluster architecture?
I often tell people not to use Kubernetes unless they have a Kubernetes problem. The same goes for using lots of Kubernetes clusters. Don’t create 20 clusters until you’re able to maintain two. Don’t run 100 clusters until you know you need to.
The problems users are solving with dozens or hundreds of clusters range from availability and localization to internal company process. Sometimes it’s easier to create 50 clusters than it is to change a legacy policy. Figuring out which problems you’re solving with multiple clusters is always a good start.
It’s best to solve technical problems with technical solutions and people problems with process, but identifying which is which is not always easy. If you have problems of workload isolation, multi-region deployments, request latency, or incident blast radius, then running multiple Kubernetes clusters is probably a good option. If you have problems of slow upgrade cycles, varying application stability, deployment delays for review boards, or application updates that take weeks to roll out, then multiple clusters will exacerbate the existing problems.
What are some of the main challenges with massive scale Kubernetes?
Scaling Kubernetes has similar problems to scaling servers. If you use more compute resources, you’ll run into limits like open file limits and PID exhaustion. If you scale horizontally and add more servers, then you have to synchronize configuration and patching and help processes coordinate. Kubernetes has similar problems. Some people use large, shared clusters and others use lots of smaller clusters. There’s no right answer for everyone, which makes it difficult to know which option is best for you.
Some of the solutions we’ve created with Linux servers (e.g. configuration management) can apply to Kubernetes, but there are new interfaces which create new challenges. Many companies use service providers to avoid needing to deal with lower levels of abstraction. You don’t have to rack and stack servers anymore if you move to the cloud. You don’t have to manage the Kubernetes control plane with Amazon EKS, and you don’t need to patch servers if you use compute options like AWS Fargate. It doesn’t eliminate all of the challenges, but it greatly reduces your responsibility.
Does EKS do anything special as the cluster scales in terms of adjusting the cluster configuration as the number of nodes and/or the number of pods increases?
There are multiple areas where Kubernetes needs to scale as clusters get larger with more nodes or workloads. There’s the API server which is the main point of communication, but there’s also the etcd database and controllers. EKS handles control plane scaling automatically for customers as the clusters require it.
We help solve pain points for Kubernetes that are not part of EKS. Customers can use Fargate as serverless and isolated workload compute. We built Karpenter to help simplify EC2 instance selection and management, and we’ve created AWS Controllers for Kubernetes (ACK), which help developers provision additional AWS infrastructure without leaving the comfort of kubectl.
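As a rough sketch of the ACK idea, the snippet below declares an S3 bucket as a Kubernetes object. The `s3.services.k8s.aws/v1alpha1` group/version, the `Bucket` spec fields, and the bucket name are assumptions for illustration, so check the ACK documentation for the exact schema in your cluster. Most users would simply `kubectl apply` the equivalent YAML; the Python client is used here only to keep the sketches consistent.

```python
# Sketch of the ACK idea: AWS resources declared as Kubernetes objects.
# Group/version, spec fields, and names are assumptions; verify against
# the ACK S3 controller docs before using.
from kubernetes import client, config

config.load_kube_config()
custom = client.CustomObjectsApi()

bucket = {
    "apiVersion": "s3.services.k8s.aws/v1alpha1",  # assumed group/version
    "kind": "Bucket",
    "metadata": {"name": "my-example-bucket"},      # hypothetical name
    "spec": {"name": "my-example-bucket"},          # assumed spec field
}

# The ACK controller running in the cluster reconciles this object into
# an actual S3 bucket in the AWS account it is configured for.
custom.create_namespaced_custom_object(
    group="s3.services.k8s.aws",
    version="v1alpha1",
    namespace="default",
    plural="buckets",
    body=bucket,
)
```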
We know many customers have use cases for Kubernetes outside of an AWS region. We support AWS Local Zones and AWS Outposts for EKS worker nodes, and EKS Anywhere to create EKS clusters on customer-owned hardware.
All of these environments have different constraints and scaling properties. We make it easier for customers in each use case to run Kubernetes the way they want.
How do users gain confidence in EKS to use it for their different use cases?
One of the cool things about EKS is how much of it is open source. You can look at what we use for Kubernetes binaries through the EKS Distro project, which powers all the different deployments of EKS. You can download and use the same binaries we use. If you want to use our container-native operating system, Bottlerocket, it’s open source and you can build your own copy of the OS from scratch.
We know some customers want the ability to see what goes into their infrastructure. It helps build trust and allows us to show exactly how we manage Kubernetes at scale.