Architecting Kubernetes for High Availability, Fault Tolerance and Business Continuity
Kubernetes takes care of many things and solves many problems, except the ones it doesn't know about, such as region failures and human error.
In this post I want to compare and contrast the very common single-cluster setup spread across multiple Availability Zones with a multi-cluster setup spread across different regions. Hopefully, by the end of this post, you will have some idea of when to use which setup, no matter which cloud provider you are on, be it AWS, Azure, or GCP.
Single-Cluster Setup:
In this setup the Kubernetes nodes and their storage are distributed across multiple Availability Zones (AZs). This model ensures the nodes are physically separated from each other, so an outage in one AZ will not take the entire cluster out of service. At the same time, communication between the nodes goes over a private connection and is not routed over the internet, no matter which cloud provider you use.
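One way to check that a cluster's nodes really are spread across zones is to inspect the well-known `topology.kubernetes.io/zone` label that Kubernetes sets on each node. The sketch below is a minimal, hypothetical helper: it assumes you have already fetched the node label dicts (for example from `kubectl get nodes -o json`) and only does the counting.

```python
from collections import Counter

# Well-known label Kubernetes puts on every node for its availability zone.
ZONE_LABEL = "topology.kubernetes.io/zone"

def nodes_per_zone(node_labels):
    """Count nodes per availability zone.

    `node_labels` is a list of label dicts, one per node, as found in each
    node's metadata (e.g. extracted from `kubectl get nodes -o json`).
    """
    return Counter(labels.get(ZONE_LABEL, "unknown") for labels in node_labels)

def is_zone_balanced(node_labels, max_skew=1):
    """True if no zone has more than `max_skew` nodes above the smallest zone."""
    counts = nodes_per_zone(node_labels)
    return max(counts.values()) - min(counts.values()) <= max_skew

if __name__ == "__main__":
    labels = [
        {ZONE_LABEL: "eastus-1"},
        {ZONE_LABEL: "eastus-2"},
        {ZONE_LABEL: "eastus-1"},
    ]
    print(dict(nodes_per_zone(labels)))
    print(is_zone_balanced(labels))
```

The same `max_skew` idea is what Kubernetes topology spread constraints enforce at scheduling time; this snippet just makes the check explicit for illustration.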
The following image, which I took from the official Azure documentation, illustrates what a cross-AZ setup looks like - but it's pretty much the same if you are on AWS or GKE.
On Azure, setting up a cluster that spans multiple Availability Zones is almost the same as setting up a cluster within one AZ. The only thing you need to do is tick the AZ checkbox, and that's it; Azure takes care of the rest of the configuration for you.
AWS also recommends running an EKS cluster spread across three or more Availability Zones, and setting up an EKS cluster that spans multiple AZs requires minimal configuration. The same goes for Google GKE.
Multi-Cluster Setup
The image above, which I took from the official Azure docs, demonstrates how a multi-cluster setup works; the setup would be pretty much the same on AWS and GCP.
Advantages of Multi-Cluster Setup:
The advantages of a multi-cluster setup go beyond just high availability; it can also be leveraged to reduce latency. Say your application has visitors from all around the world. In the single-cluster setup, you are forced to process every request in the one cluster you have, no matter where that request originates. In the multi-region setup, however, you can easily introduce a routing policy in the cloud DNS service to send traffic to a specific cluster depending on the request origin. For example, a request from the U.S. goes to the cluster deployed in a U.S. region, and a request originating from Europe gets processed by the cluster running in a European region. This helps reduce network latency and the impact of network partitions.
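The routing decision described above boils down to a lookup from request origin to the nearest cluster. The sketch below is purely illustrative: the region names and endpoints are hypothetical, and in a real deployment this mapping lives in a geo-routing DNS policy (for example Route 53 geolocation records or Azure Traffic Manager), not in application code.

```python
# Hypothetical origin-to-cluster mapping; endpoints are made-up examples.
CLUSTER_BY_ORIGIN = {
    "US": "https://us-east.example.com",
    "EU": "https://eu-west.example.com",
}
# Fall back to one cluster for origins without a dedicated region.
DEFAULT_CLUSTER = CLUSTER_BY_ORIGIN["US"]

def route_request(origin):
    """Pick the cluster endpoint closest to the request's origin."""
    return CLUSTER_BY_ORIGIN.get(origin, DEFAULT_CLUSTER)

if __name__ == "__main__":
    print(route_request("EU"))  # European visitors hit the EU cluster
    print(route_request("JP"))  # unmapped origins fall back to the default
```

A DNS-level policy does the same thing one layer earlier: the resolver hands each client the IP of the nearest cluster, so the application never sees the routing decision at all.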

