AWS EKS vs Azure AKS - my thoughts and reflection after using both in Production

I am lucky enough to work with both Azure AKS and AWS EKS (on EC2), and I decided to dedicate a post on my blog to the two. I also have to admit that this is the most opinionated post I have written so far. I tried to put more emphasis on their differences than on their similarities. Here are the three areas in which I compare them: 
  • AKS vs EKS from Cluster Management Standpoint, 
  • AKS vs EKS from Networking Standpoint, 
  • AKS vs EKS from Scalability Standpoint

AKS vs EKS from Cluster Management Standpoint 

Both AKS and EKS manage the master node (AKA the control plane) for you completely, and there isn't much difference I noticed between the two in that area. However, Azure does not charge you anything for the control plane it manages and you only pay for the worker nodes, whereas EKS charges you a fixed monthly price for the control plane (around USD 70, depending on the region).

One feature I like in EKS is called Managed Node Groups. This feature automates and offloads a lot of the configuration and maintenance of your nodes. There is a similar feature in AKS that also helps you add new nodes to and drain nodes from your cluster, but I find the EKS one to be more seamless. (this is probably the most opinionated statement in this post :) )
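To give an idea of what that looks like in practice, a managed node group can be described declaratively in an eksctl config file and EKS takes care of provisioning, AMI updates and draining. A minimal sketch, with made-up cluster and node-group names and sizes:

```yaml
# eksctl config sketch; "demo-cluster", "workers" and the sizes are hypothetical.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: demo-cluster
  region: eu-west-1
managedNodeGroups:
  - name: workers
    instanceType: m5.large
    minSize: 2
    maxSize: 5
    desiredCapacity: 3
```

You would apply a file like this with `eksctl create cluster -f cluster.yaml`; the node group then shows up as a managed resource in the EKS console rather than as a plain Auto Scaling group you maintain yourself.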

Both EKS and AKS use RBAC for managing permissions around Kubernetes resources, but the way you grant the cluster permission to access other cloud resources such as databases, object storage, etc. differs due to the differences between AWS IAM and Azure AD. In my personal experience I found the AKS integration with Azure AD to be more seamless than the EKS integration with AWS IAM. (yet another opinionated statement in this post)
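On the EKS side, for example, mapping AWS IAM identities onto Kubernetes RBAC goes through the aws-auth ConfigMap in the kube-system namespace. A sketch of what that mapping looks like (the account ID, role and group names are hypothetical):

```yaml
# aws-auth sketch: maps an IAM role to a Kubernetes group; values are made up.
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: arn:aws:iam::111122223333:role/eks-developer
      username: developer
      groups:
        - dev-team   # must match a subject in a RoleBinding/ClusterRoleBinding
```

Keeping this ConfigMap in sync with your IAM roles is exactly the kind of manual glue that the Azure AD integration in AKS largely hides from you.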

AKS vs EKS from Networking Standpoint 

From a cluster standpoint, the networking setup between EKS and AKS has more similarities than differences. From a pod perspective, AWS has amazon-vpc-cni-k8s and Azure has the Azure VNet CNI plugin, and both of them do the same thing, which is to let a pod take its IP directly from the VPC or VNet pool.

I had some struggle getting an Ingress controller up and running in AWS EKS. The Ingress couldn't find the Application Load Balancer and create a new instance. It turned out the issue was due to some tags missing from the subnets in which my EKS cluster was deployed. This Stackoverflow question helped me to solve the issue.
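For reference, these are the tags the AWS Load Balancer Controller looks for when discovering which subnets to place a load balancer in (shown here as YAML for readability; the cluster name is hypothetical):

```yaml
# Subnet tags expected by the AWS Load Balancer Controller.
publicSubnets:
  kubernetes.io/cluster/demo-cluster: shared   # or "owned"
  kubernetes.io/role/elb: "1"                  # internet-facing load balancers
privateSubnets:
  kubernetes.io/cluster/demo-cluster: shared
  kubernetes.io/role/internal-elb: "1"         # internal load balancers
```

If those tags are missing, the controller has no way to pick subnets and the Ingress silently fails to get a load balancer, which is exactly the symptom I hit.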

AKS vs EKS from Scalability Standpoint

Scalability is the part where I felt their differences become more apparent than in the first two areas. EKS supports up to 1,000 nodes per cluster while the number for Azure AKS is just 100, which is significantly different if you are thinking of running Kubernetes at scale. I have to admit, though, that none of the workloads I have ever come across reached the threshold of either provider.

Kubernetes is a distributed platform which is generally designed so that the nodes communicate with each other over a LAN with a stable connection. Therefore it is not possible to have a Kubernetes cluster that spans multiple regions. There are, however, solutions like Kubernetes Federation (I wrote a whole post about it here) that ease the process of synchronising two independent clusters running in parallel.

It is also possible to achieve a similar setup to some degree with the tools that Azure and AWS offer. In Azure you can use Azure Traffic Manager to distribute traffic across two AKS clusters running in different regions, while in AWS you can leverage Route 53 and AWS Global Accelerator (figure 1) to achieve the same thing. I found the AWS solution for running multiple Kubernetes clusters across different regions to be more enterprise-ready than Azure's.
 
Figure 1: Multi Region EKS setup
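To make the DNS part of that setup concrete, the idea is one record per regional cluster with latency-based (or failover) routing. A rough sketch of the Route 53 records involved; the domain, identifiers and load balancer names are all made up, and the real configuration is JSON submitted via the API or CLI:

```yaml
# Sketch of latency-based routing across two regional EKS clusters.
recordSets:
  - name: app.example.com
    type: A
    setIdentifier: us-east-1
    region: us-east-1                         # latency-based routing
    aliasTarget: alb-us-east-1.example.com    # regional cluster's load balancer
  - name: app.example.com
    type: A
    setIdentifier: eu-west-1
    region: eu-west-1
    aliasTarget: alb-eu-west-1.example.com
```

Each client is routed to whichever regional endpoint answers fastest for them; Global Accelerator can then sit in front to give the whole setup static anycast IPs.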

To conclude: when it comes to scalability, I believe EKS outperforms AKS. A big part of that is not due to the service itself but rather the other tools that AWS offers, like Route 53 and Global Accelerator, which I found to be more powerful than their Azure counterparts. 

