Managing Kubernetes clusters is hard. You need stable infrastructure, frequent updates and top-notch security. While all major cloud providers have a managed solution, I feel like Google Kubernetes Engine truly succeeds in taking away most worries of operating a production-ready cluster. Let's go over some lessons I've learned about features you can select when creating a Google Kubernetes Engine cluster.
👍 Release channels. Automatic updates of your Kubernetes version. Long-term projects should keep up with upstream or risk getting stale and insecure. It's exactly what you want from a managed Kubernetes cluster and it just works.
👎 HTTP load balancing. GKE automatically deploys load balancers for (L7) ingresses and (L4) services of type LoadBalancer. While this is convenient, the large number of load balancers carries a hefty price tag. The number of L4 and L7 load balancers scales linearly with the number of exposed services and the number of namespaces, respectively. My recommendation is to use a custom ingress controller such as Traefik or Istio, which results in one single global load balancer.
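To make that concrete, here is a rough sketch of two apps sharing one entry point. It assumes Traefik is already installed and registered under the ingress class name `traefik`; the hostnames, service names and ports are made up for illustration.

```yaml
# Hypothetical example: two apps behind a single ingress controller.
# The app Service stays ClusterIP, so no load balancer is provisioned for it.
apiVersion: v1
kind: Service
metadata:
  name: shop                     # hypothetical service name
spec:
  type: ClusterIP
  selector:
    app: shop
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: apps
spec:
  ingressClassName: traefik      # assumption: the class name Traefik was installed with
  rules:
    - host: shop.example.com     # hypothetical hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: shop
                port:
                  number: 80
    - host: blog.example.com     # hypothetical hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: blog       # a second ClusterIP Service, defined like "shop"
                port:
                  number: 80
```

With a setup like this, only the ingress controller's own Service needs to be of type LoadBalancer, no matter how many apps you route through it.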
👎 Preemptible nodes. Unless your cluster has a substantial size, this feature is a trap. The cost in man-hours to get this working often greatly outweighs the money saved. On top of this, the persistent instability lurking in the background takes focus away from what matters - adding business value. My recommendation is to reduce cost with auto-scaled node pools and only look into this feature once you have dedicated platform teams.
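As a minimal sketch of the auto-scaled alternative, this is roughly what a node pool could look like when declared through Config Connector (covered further down). The cluster name, location, machine type and node counts are assumptions picked for illustration, not recommendations.

```yaml
# Hypothetical auto-scaled node pool, declared as a Config Connector resource.
apiVersion: container.cnrm.cloud.google.com/v1beta1
kind: ContainerNodePool
metadata:
  name: autoscaled-pool          # hypothetical name
spec:
  location: europe-west1         # assumption: must match your cluster's location
  clusterRef:
    name: my-cluster             # hypothetical ContainerCluster resource
  initialNodeCount: 1
  autoscaling:
    minNodeCount: 1              # floor during quiet hours
    maxNodeCount: 5              # ceiling that caps the bill
  nodeConfig:
    machineType: e2-standard-4   # assumption: pick what fits your workloads
```

The cluster autoscaler then adds and removes regular nodes based on pending pods, which captures a good share of the savings without the eviction handling that preemptible nodes demand.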
👍 Config connector. It brings my personal favorite Kubernetes feature to Google Cloud Platform: declarative management of resources. Being able to treat your other Google Cloud Services as Kubernetes manifests is extremely powerful. Check out the example below and the complete list of all resources.
apiVersion: pubsub.cnrm.cloud.google.com/v1beta1
kind: PubSubTopic
metadata:
  name: my-topic
---
apiVersion: pubsub.cnrm.cloud.google.com/v1beta1
kind: PubSubSubscription
metadata:
  name: my-subscription
spec:
  topicRef:
    name: my-topic
  ackDeadlineSeconds: 60
👎 Application Manager. Google's take on GitOps operators. These operators are essential for reaching high deployment frequencies and tackling configuration drift. Unfortunately, Application Manager feels lacking compared to the alternatives. My recommendation is to use Argo CD instead.
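For a feel of what that looks like, here is a small sketch of an Argo CD `Application`. It assumes Argo CD runs in the `argocd` namespace of the same cluster; the repository URL, path and target namespace are placeholders.

```yaml
# Hypothetical Argo CD Application that keeps a namespace in sync with Git.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app                   # hypothetical name
  namespace: argocd              # assumption: namespace where Argo CD is installed
spec:
  project: default
  source:
    repoURL: https://github.com/example/my-app-config.git   # placeholder repository
    targetRevision: main
    path: manifests              # placeholder directory with Kubernetes manifests
  destination:
    server: https://kubernetes.default.svc   # the cluster Argo CD itself runs in
    namespace: my-app
  syncPolicy:
    automated:
      prune: true                # delete resources that were removed from Git
      selfHeal: true             # revert manual changes, i.e. configuration drift
```

The `automated` sync policy is what addresses the two points above: every merge gets deployed without a manual step, and drift is reverted as soon as it is detected.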
👍 Cloud Operations. Observability is the foundation of SRE practices. Choosing a different managed solution makes little sense and doing it yourself, for example with a Prometheus/Loki/Grafana stack, adds more complexity than benefits. Cloud Operations will bring your monitoring and alerts to the next level.
👎 Istio. Istio is great, but this feature is outdated. Since its launch, Istio has been drastically simplified with a consolidated control plane and an operator. My recommendation is to manage Istio yourself.
👍 Workload identity. Simplifies IAM and enables key rotations. Get used to this feature, as it will soon be the default way of working. Instead of mounting GCP service account keys on pods through secrets, you create a binding between a Kubernetes service account and a Google Cloud service account. Check out the introduction.
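The binding itself can be sketched in two manifests. This assumes Workload Identity is enabled on the cluster, Config Connector is installed, and a Google service account already exists as an `IAMServiceAccount` resource named `my-app`; the project ID, namespace and names are placeholders.

```yaml
# Kubernetes side: annotate the service account your pods run as.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app                   # hypothetical Kubernetes service account
  namespace: default
  annotations:
    # placeholder Google service account email
    iam.gke.io/gcp-service-account: my-app@my-project.iam.gserviceaccount.com
---
# Google Cloud side: allow that Kubernetes service account to act as the Google one.
apiVersion: iam.cnrm.cloud.google.com/v1beta1
kind: IAMPolicyMember
metadata:
  name: my-app-workload-identity
  namespace: default
spec:
  # format: PROJECT_ID.svc.id.goog[NAMESPACE/KUBERNETES_SERVICE_ACCOUNT]
  member: "serviceAccount:my-project.svc.id.goog[default/my-app]"
  role: roles/iam.workloadIdentityUser
  resourceRef:
    apiVersion: iam.cnrm.cloud.google.com/v1beta1
    kind: IAMServiceAccount
    name: my-app                 # assumption: existing IAMServiceAccount resource
```

Pods that set `serviceAccountName: my-app` then get short-lived Google credentials automatically, with no key file to mount or rotate.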
Google Kubernetes Engine has plenty of features. Some are amazing. Others feel a bit underwhelming. In the end, our lives as cloud engineers become a little bit easier with each newly released feature.