Google Kubernetes Engine (GKE) offers two different ways to perform service discovery and DNS resolution: the in-cluster kube-dns functionality, and GCP managed Cloud DNS. Either approach can be combined with the performance-enhancing NodeLocal DNSCache add-on.
New GKE Autopilot clusters use Cloud DNS as a fully managed DNS solution, with no configuration required on your part. For GKE Standard clusters, you have the following DNS provider choices:
- kube-dns (default)
- Cloud DNS, configured for either cluster scope or VPC scope
- Your own DNS deployment that you install and run yourself (such as CoreDNS)
In this blog, we break down the differences between the DNS providers for your GKE Standard clusters, and guide you to the best solution for your specific situation.
Kube-DNS
kube-dns is the default DNS provider for GKE Standard clusters, providing DNS resolution for services and pods within the cluster. If you select this option, GKE deploys the necessary kube-dns components in the kube-system namespace: the kube-dns pods, the kube-dns-autoscaler, the kube-dns ConfigMap and the kube-dns Service.
kube-dns is also the only DNS provider for Autopilot clusters running versions earlier than 1.25.9-gke.400 and 1.26.4-gke.500.
kube-dns is a suitable solution for workloads with moderate DNS query volumes that have stringent DNS resolution latency requirements (e.g. under ~2-4ms). Because every DNS query is resolved within the cluster, kube-dns can provide low-latency DNS resolution for all of them.
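To check whether a workload actually sees lookup times in that range, you can time resolutions from inside a pod. The following is a minimal sketch rather than a benchmarking tool: the function names are ours, the hostname in the usage comment is a hypothetical placeholder, and repeated lookups of the same name may be answered from a cache along the way, so treat the numbers as indicative only.

```python
import math
import socket
import statistics
import time
from typing import Callable, Dict, List


def measure_lookup_ms(
    hostname: str,
    samples: int = 20,
    resolver: Callable[[str], object] = lambda h: socket.getaddrinfo(h, None),
) -> List[float]:
    """Time `samples` lookups of `hostname` and return each duration in ms.

    `resolver` defaults to the system resolver; it is injectable so the
    function can be exercised without network access.
    """
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        resolver(hostname)
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


def summarize(timings_ms: List[float]) -> Dict[str, float]:
    """Return median, nearest-rank 95th percentile, and max of the timings."""
    ordered = sorted(timings_ms)
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[p95_index],
        "max_ms": ordered[-1],
    }


# Example usage from inside a pod (hypothetical service name):
# print(summarize(measure_lookup_ms("my-service.my-namespace.svc.cluster.local")))
```

Looking at the 95th percentile and the maximum, not just the median, is a deliberate choice here: bursty workloads tend to show up as tail latency long before the median moves.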
If you notice DNS timeouts or failed DNS resolutions for bursty workload traffic patterns when using kube-dns, consider scaling the number of kube-dns pods and enabling NodeLocal DNSCache for the cluster. You can scale the number of kube-dns pods ahead of time by manually tuning the kube-dns autoscaler to the cluster's DNS traffic patterns. Using kube-dns along with NodeLocal DNSCache (discussed below) also reduces overhead on the kube-dns pods for DNS resolution of external services.
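To reason about what tuning the autoscaler would do before you change it, it helps to model the replica count. The sketch below assumes the kube-dns autoscaler follows the "linear" mode of the upstream cluster-proportional-autoscaler, where the replica count is the larger of a per-core and a per-node ratio. The default values used here (256 cores and 16 nodes per replica) are assumptions for illustration; check the kube-dns-autoscaler ConfigMap in your own cluster for the real values.

```python
import math
from typing import Optional


def linear_replicas(
    nodes: int,
    cores: int,
    cores_per_replica: float = 256,   # assumed value, verify in your cluster
    nodes_per_replica: float = 16,    # assumed value, verify in your cluster
    min_replicas: int = 1,
    max_replicas: Optional[int] = None,
    prevent_single_point_failure: bool = True,
) -> int:
    """Estimate kube-dns replicas under a linear, cluster-proportional model."""
    by_cores = math.ceil(cores / cores_per_replica)
    by_nodes = math.ceil(nodes / nodes_per_replica)
    replicas = max(by_cores, by_nodes, min_replicas)
    # With more than one node, keep at least two replicas so that losing
    # a single pod does not take down cluster DNS.
    if prevent_single_point_failure and nodes > 1:
        replicas = max(replicas, 2)
    if max_replicas is not None:
        replicas = min(replicas, max_replicas)
    return replicas


# A 100-node cluster with 4 cores per node:
print(linear_replicas(nodes=100, cores=400))                       # 7
# Halving nodes_per_replica roughly doubles the replica count:
print(linear_replicas(nodes=100, cores=400, nodes_per_replica=8))  # 13
```

Lowering the nodes-per-replica ratio is the usual lever for bursty traffic in this model, since it adds replicas without waiting for the cluster itself to grow.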
While scaling up kube-dns and using NodeLocal DNSCache (NLD) helps in the short term, it does not guarantee reliable DNS resolution during sudden traffic spikes. Migrating to Cloud DNS is a more robust, long-term solution that keeps DNS resolution reliable across varying DNS query volumes. You can switch the DNS provider of an existing GKE Standard cluster from kube-dns to Cloud DNS without re-creating the cluster.
Logging DNS queries with kube-dns takes manual effort: you need to create a new kube-dns debug pod with log-queries enabled.
Cloud DNS
Cloud DNS is a Google-managed service that is designed for high scalability and availability. It scales elastically to adapt to your DNS query volume, providing consistent and reliable DNS query resolution regardless of traffic volume. Because it is a managed service, you do not need to maintain any additional infrastructure, which reduces operational overhead. Cloud DNS also supports DNS resolution across the entire VPC, which is not currently possible with kube-dns.
When you use Multi Cluster Services (MCS) in GKE, Cloud DNS also provides DNS resolution for services across your fleet of clusters.
Unlike kube-dns, Cloud DNS provides Pod and Service DNS resolution that auto-scales and offers a 100% service-level agreement, reducing DNS timeouts and providing consistent DNS resolution latency for heavy DNS workloads.
Cloud DNS also integrates with Cloud Monitoring, giving you greater visibility into DNS queries for enhanced troubleshooting and analysis.
The Cloud DNS controller automatically provisions DNS records in Cloud DNS for pods and for ClusterIP, headless and ExternalName services.
You can configure Cloud DNS to provide GKE DNS resolution in either cluster scope (the default) or VPC scope. With VPC scope, the DNS records are resolvable within the entire VPC, by way of a private DNS zone that is created automatically. With cluster scope, the DNS records are resolvable only within the cluster.
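Whichever scope you choose, the records follow the standard Kubernetes naming scheme for services and for pods behind headless services. The helpers below are an illustrative sketch of that scheme; the function names and the example domain `cluster-a.example` are hypothetical, and the assumption that a VPC-scope cluster uses a custom domain in place of `cluster.local` should be checked against your own cluster configuration.

```python
def service_fqdn(service: str, namespace: str,
                 cluster_domain: str = "cluster.local") -> str:
    """Build the DNS name of a Service: <service>.<namespace>.svc.<domain>."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def headless_pod_fqdn(pod_hostname: str, service: str, namespace: str,
                      cluster_domain: str = "cluster.local") -> str:
    """Build the DNS name of a pod that backs a headless Service."""
    return f"{pod_hostname}.{service_fqdn(service, namespace, cluster_domain)}"


# Default cluster domain:
print(service_fqdn("checkout", "shop"))
# checkout.shop.svc.cluster.local

# Hypothetical custom domain, as a VPC-scope cluster might use:
print(headless_pod_fqdn("db-0", "db", "shop", cluster_domain="cluster-a.example"))
# db-0.db.shop.svc.cluster-a.example
```

A distinct domain per cluster matters most with VPC scope: once names are resolvable from anywhere in the VPC, two clusters that both answer for the same suffix would be ambiguous to clients outside them.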
While Cloud DNS offers enhanced features, it does come with usage-based costs. In return, you save on compute costs and overhead because the kube-dns pods are removed when you use Cloud DNS. For typical cluster sizes and workload traffic patterns, Cloud DNS is usually more cost effective than running kube-dns.
You can migrate clusters from kube-dns to Cloud DNS cluster scope without downtime or changes to your applications. The reverse (migrating from Cloud DNS to kube-dns) is not a seamless operation.
NodeLocal DNSCache
NodeLocal DNSCache is a GKE add-on that you can run in addition to either kube-dns or Cloud DNS. The node-local-dns pod gets deployed on the GKE nodes after the option has been enabled (subject to a node upgrade procedure).
NodeLocal DNSCache (NLD) reduces average DNS resolution times by resolving DNS requests locally, on the same node as the requesting pod, and forwarding only the requests it cannot resolve to the other DNS servers in the cluster. This is a great fit for clusters that have heavy internal DNS query loads.
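The effect of a node-local cache on upstream load can be illustrated with a toy model. This is not how NodeLocal DNSCache is implemented; it is a small sketch of the general idea (answer from a local cache while an entry is fresh, forward only on a miss), and the class name, the fixed TTL and the injected clock are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Tuple


class TtlDnsCache:
    """Toy per-node cache that forwards to an upstream resolver on a miss."""

    def __init__(self, upstream: Callable[[str], str],
                 ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._upstream = upstream
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}
        self.upstream_queries = 0  # how many lookups left the node

    def resolve(self, name: str) -> str:
        now = self._clock()
        cached = self._entries.get(name)
        if cached is not None and cached[1] > now:
            return cached[0]  # fresh entry: answered locally
        # Miss or expired entry: forward upstream and remember the answer.
        self.upstream_queries += 1
        answer = self._upstream(name)
        self._entries[name] = (answer, now + self._ttl)
        return answer


# 1,000 lookups of the same name result in a single upstream query
# while the cached entry is still fresh.
cache = TtlDnsCache(upstream=lambda name: "10.0.0.1")
for _ in range(1000):
    cache.resolve("checkout.shop.svc.cluster.local")
print(cache.upstream_queries)
```

The more often pods on a node repeat the same lookups, the larger the share of queries that never reach the cluster DNS servers, which is why the benefit grows with internal query volume.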
Enable NLD during maintenance windows. Please note that node pools must be re-created for this change to take effect.
Final thoughts
The choice of DNS provider for your GKE Standard cluster has implications for performance and reliability, as well as for your operations and overall service discovery architecture. It is therefore crucial for GKE Standard users to understand their DNS options in light of their application and architecture objectives. GKE Standard clusters let you use either kube-dns or Cloud DNS as your DNS provider, so you can optimize for either low-latency DNS resolution or a simple, scalable and reliable DNS solution. You can learn more about DNS for your GKE cluster from the GKE documentation. If you have any further questions, feel free to contact us.
We thank the Google Cloud team member who contributed to the blog: Selin Goksu, Technical Solutions Developer, Google