Securing a Private GKE Cluster on Google Cloud

If you've been following my Google Cloud learning journey, you know I've been working through the security-focused labs on Google Skills for Partners. This week I tackled the Cloud Security Fundamentals Challenge Lab, and honestly, it helped me gain deeper knowledge about security inside Google Cloud.

No step-by-step walkthroughs here. Instead, let me share what the challenge taught me and how I'd apply it in a real infrastructure role.

The Scenario: You're the New Security Engineer

The lab drops you straight into a realistic scenario: you've just joined the security team at a fictional company as a junior engineer, and you need to deploy a production-grade Kubernetes Engine cluster for a development team — but it has to meet strict security standards. No handholding. Figure it out.

That framing matters. It forced me to think like an actual cloud security engineer rather than someone following a recipe.

The Big Idea: Least Privilege, Applied End-to-End

The throughline of everything I did was least privilege — giving cloud resources only the permissions they absolutely need, and nothing more.

This principle sounds simple. Actually applying it end-to-end across IAM roles, service accounts, and cluster configuration is where the real learning happens.

Here's how it played out across the five tasks:

1. Custom IAM Roles vs. Predefined Roles

One thing that clicked for me: you shouldn't reach for broad predefined roles (like roles/storage.admin) when your workload only needs a narrow set of actions. Google Cloud lets you craft a custom role with exactly the permissions required, such as reading, creating, and updating objects in Cloud Storage, and nothing more.

The lesson: always audit what a workload actually needs before assigning a role. Over-permissioned service accounts are a common source of cloud security incidents.

Below is an example CLI command to create a custom IAM role.

gcloud iam roles create orca_storage_editor_796 \
  --project=$DEVSHELL_PROJECT_ID \
  --title="Orca Storage Editor" \
  --description="Custom role for GKE cluster storage access" \
  --permissions="storage.buckets.get,storage.objects.get,storage.objects.list,storage.objects.update,storage.objects.create" \
  --stage=GA
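
One way to make the "audit first" habit concrete is to diff the permissions you plan to grant against the ones the workload is known to need. The sketch below is plain POSIX shell and makes no Google Cloud calls. `extra_permissions` is a hypothetical helper name of my own, and both arguments are assumed to be comma-separated strings like the one passed to --permissions.

```shell
# Hypothetical helper: print each permission in the "granted" list
# that does not appear in the "needed" list (both comma-separated).
extra_permissions() {
  needed=$1
  granted=$2
  printf '%s\n' "$granted" | tr ',' '\n' | while IFS= read -r perm; do
    [ -n "$perm" ] || continue         # skip empty entries
    case ",$needed," in
      *",$perm,"*) ;;                  # expected permission, nothing to report
      *) printf '%s\n' "$perm" ;;      # granted but not in the needed list
    esac
  done
}

# Usage:
# extra_permissions "storage.objects.get,storage.objects.list" \
#   "storage.objects.get,storage.objects.list,storage.objects.delete"
```

Anything the helper prints is a permission worth justifying or removing before the role is created.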

2. Service Accounts Are Identities, Not Just Credentials

Before this lab, I thought of service accounts somewhat abstractly. After it, I see them as proper identities, like an employee badge. You create the account, then deliberately grant it specific roles for specific reasons.

The cluster in this lab got its own dedicated service account. The minimum built-in roles required for a GKE cluster to function (monitoring and logging) came straight from Google's own hardening guide. Anything extra — like storage access for the dev team — came as an additional, explicitly justified custom role.

The minimum required permissions for the service account used by a Kubernetes Engine cluster are covered by the following three built-in roles:

roles/monitoring.viewer
roles/monitoring.metricWriter
roles/logging.logWriter

This deliberate, documented approach is exactly how production environments should be managed.

Below is an example CLI command for creating the service account.

gcloud iam service-accounts create orca-private-cluster-396-sa \
  --display-name="Orca Private Cluster SA" \
  --project=$DEVSHELL_PROJECT_ID

And here is an example CLI command for assigning the required roles to the service account.

gcloud projects add-iam-policy-binding $DEVSHELL_PROJECT_ID \
  --member="serviceAccount:orca-private-cluster-396-sa@$DEVSHELL_PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/monitoring.viewer"
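
The command above binds a single role. Since the cluster needs all three baseline roles, a small loop can cover them in one go. This is a sketch under the same assumptions as the rest of this post (the project variable and service account name shown earlier); `bind_cluster_roles` is a hypothetical helper name.

```shell
# Hypothetical helper: bind the three baseline GKE node roles to one service account.
bind_cluster_roles() {
  project=$1
  sa_email=$2
  for role in \
    roles/monitoring.viewer \
    roles/monitoring.metricWriter \
    roles/logging.logWriter
  do
    gcloud projects add-iam-policy-binding "$project" \
      --member="serviceAccount:$sa_email" \
      --role="$role"
  done
}

# Usage:
# bind_cluster_roles "$DEVSHELL_PROJECT_ID" \
#   "orca-private-cluster-396-sa@$DEVSHELL_PROJECT_ID.iam.gserviceaccount.com"
```

The custom storage role can be bound the same way, using the project-scoped form --role="projects/$DEVSHELL_PROJECT_ID/roles/orca_storage_editor_796".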

3. Private Clusters Are Genuinely Private

This was the biggest conceptual shift for me. A "private" GKE cluster in Google Cloud means:

Worker nodes have no public IP addresses
The cluster's control plane (master) endpoint is also not publicly reachable
Access to the cluster is restricted to an explicitly approved network range

In practice, that means you can't just run kubectl from your laptop. You need to go through a jumphost — a VM that lives inside the same VPC — and connect from there. This puts an additional layer of network isolation between the cluster and the outside world.

The security payoff is significant: even if credentials were leaked, an attacker couldn't reach the cluster's API server from outside the VPC.

The image above shows the jump host after it has been successfully created.

4. Network Topology Matters

Setting up master authorized networks forced me to think carefully about which IP address to trust — specifically, the internal IP of the jumphost, not its external one. Using a /32 CIDR (a single host) rather than a broader range is deliberate: it authorizes exactly one machine.

Below is an example CLI command for getting the jumphost internal IP.

gcloud compute instances describe orca-jumphost \
  --zone=us-east4-b \
  --format='get(networkInterfaces[0].networkIP)'

This kind of precision is easy to skip when you're moving fast, but it's the difference between "private cluster" and "private cluster with a wide-open door."
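
To show where that /32 ends up, here is a rough sketch of cluster creation with the jumphost's internal IP as the only authorized network. The helper name `create_private_cluster` is hypothetical, and the network and subnetwork are passed in as arguments because their names depend on your environment. Depending on your gcloud version you may also need to supply --master-ipv4-cidr, which I've left out here.

```shell
# Hypothetical helper: create a private cluster that trusts exactly one internal IP.
# Arguments: cluster name, zone, network, subnetwork, service account email, jumphost internal IP.
create_private_cluster() {
  cluster=$1
  zone=$2
  network=$3
  subnet=$4
  sa_email=$5
  jumphost_ip=$6
  gcloud container clusters create "$cluster" \
    --zone="$zone" \
    --network="$network" \
    --subnetwork="$subnet" \
    --service-account="$sa_email" \
    --enable-ip-alias \
    --enable-private-nodes \
    --enable-private-endpoint \
    --enable-master-authorized-networks \
    --master-authorized-networks="${jumphost_ip}/32"   # single host, not a range
}

# Usage (network and subnet names are placeholders):
# JUMPHOST_IP=$(gcloud compute instances describe orca-jumphost \
#   --zone=us-east4-b --format='get(networkInterfaces[0].networkIP)')
# create_private_cluster orca-cluster-463 us-east4-b MY_VPC MY_SUBNET \
#   "orca-private-cluster-396-sa@$DEVSHELL_PROJECT_ID.iam.gserviceaccount.com" "$JUMPHOST_IP"
```

Appending /32 inside the helper keeps the "one machine only" decision in a single place instead of relying on whoever types the command.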

5. The Jumphost Pattern in Practice

Finally, deploying a test application from the jumphost made the whole architecture tangible. SSH into the jumphost via Identity-Aware Proxy (no external IP needed), install the GKE auth plugin, pull cluster credentials with the --internal-ip flag, and then kubectl works as expected.

The IAP tunnel is itself a security feature — it means the jumphost doesn't need a public IP either. The whole stack starts to feel like defense in depth, not just security theater.
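
For reference, the SSH step over IAP might look like the sketch below. It assumes the jumphost name and zone used elsewhere in this post, that your account holds the IAP tunnel role (roles/iap.tunnelResourceAccessor), and that a firewall rule allows SSH from Google's IAP range (35.235.240.0/20). `ssh_via_iap` is a hypothetical helper name.

```shell
# Hypothetical helper: open an SSH session to a VM through an IAP tunnel,
# so the VM does not need an external IP.
ssh_via_iap() {
  instance=$1
  zone=$2
  gcloud compute ssh "$instance" \
    --zone="$zone" \
    --tunnel-through-iap
}

# Usage:
# ssh_via_iap orca-jumphost us-east4-b
```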

Below is an example of the CLI commands for deploying the app from the jumphost after you SSH into it.

# Install auth plugin
sudo apt-get install -y google-cloud-sdk-gke-gcloud-auth-plugin

# Set environment variable
echo "export USE_GKE_GCLOUD_AUTH_PLUGIN=True" >> ~/.bashrc && source ~/.bashrc

# Fetch cluster credentials
gcloud container clusters get-credentials orca-cluster-463 \
  --internal-ip \
  --project=$DEVSHELL_PROJECT_ID \
  --zone=us-east4-b

# Deploy the app
kubectl create deployment hello-server \
  --image=gcr.io/google-samples/hello-app:1.0

What Surprised Me

A few things caught me off guard:

The auth plugin requirement. Modern GKE requires google-cloud-sdk-gke-gcloud-auth-plugin to be installed for kubectl to work. Forgetting this step will leave you puzzled when credential setup appears to succeed but kubectl commands then fail to authenticate.

Order of operations matters. You have to know the jumphost's internal IP before creating the cluster, because that IP goes into the master authorized networks at cluster creation time. This is the kind of dependency that's obvious in hindsight but easy to miss the first time.

Private endpoint = no external API access, period. With --enable-private-endpoint, there's no fallback. You must use the jumphost. This is the right security call for a production cluster, but it means your operational playbook needs to account for it.

Even though the cluster is private, the applications it serves can still be public and accessed via a load balancer. This is expected behavior: "private cluster" refers to restricting access to the control plane and nodes, not preventing applications from being exposed externally.

Below is an example CLI command for verifying the GKE cluster actually works.

# Expose via load balancer
kubectl expose deployment hello-server \
  --type=LoadBalancer \
  --port=80 \
  --target-port=8080

# Get the external IP
kubectl get service hello-server --watch
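
If you'd rather not watch the output by hand, a small polling loop can wait for the external IP instead. This is a sketch, assuming the hello-server service created above; `wait_for_lb_ip` and the POLL_SECONDS variable are hypothetical names of my own.

```shell
# Hypothetical helper: poll until the service's load balancer has an IP, then print it.
# Arguments: service name, optional maximum number of attempts (default 30).
wait_for_lb_ip() {
  service=$1
  attempts=${2:-30}
  while [ "$attempts" -gt 0 ]; do
    ip=$(kubectl get service "$service" \
      -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
    if [ -n "$ip" ]; then
      printf '%s\n' "$ip"
      return 0
    fi
    attempts=$((attempts - 1))
    sleep "${POLL_SECONDS:-10}"   # wait before asking again
  done
  return 1                        # gave up without seeing an IP
}

# Usage (run on the jumphost, where kubectl can reach the cluster):
# wait_for_lb_ip hello-server
```

Because the load balancer IP is public, the final curl against it can run from any machine with internet access, not just the jumphost.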

Real-World Takeaways

If I were setting up a GKE cluster in a real environment tomorrow, here's what this lab would have me do differently:

Never use default service accounts for production workloads. Always create dedicated accounts with documented, scoped permissions.
Use private clusters with private endpoints for anything sensitive. The operational overhead is worth it.
Document your IAM decisions. Custom roles should have clear descriptions explaining why each permission was granted.
IAP for administrative access beats bastions with public IPs. No public IP on your jumphost is strictly better.
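
The first two takeaways can be spot-checked from the CLI. The sketch below reads two fields from gcloud container clusters describe. The helper name `audit_cluster` and the warning text are my own, and I'm assuming boolean fields print as True or true, so the comparison accepts either.

```shell
# Hypothetical helper: warn if a cluster's nodes use the default service account
# or if the private endpoint is not enabled. Returns non-zero when it warns.
audit_cluster() {
  cluster=$1
  zone=$2
  status=0

  sa=$(gcloud container clusters describe "$cluster" --zone="$zone" \
    --format='value(nodeConfig.serviceAccount)')
  if [ -z "$sa" ] || [ "$sa" = "default" ]; then
    echo "WARN: nodes use the default service account"
    status=1
  fi

  private=$(gcloud container clusters describe "$cluster" --zone="$zone" \
    --format='value(privateClusterConfig.enablePrivateEndpoint)')
  case "$private" in
    [Tt]rue) ;;   # private endpoint is on
    *) echo "WARN: private endpoint is not enabled"; status=1 ;;
  esac

  return "$status"
}

# Usage:
# audit_cluster orca-cluster-463 us-east4-b
```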

Final Thoughts

This Challenge Lab left me with a much clearer mental model of how IAM, private networking, and cluster configuration fit together into a coherent security posture. Not just how to run the commands — but why each piece of the architecture exists.

If you're on the Google Cloud security learning path, I'd strongly recommend sitting with the concepts before reaching for the CLI. The "why" is what transfers to real work.

Have questions about cloud security fundamentals or GKE networking? Drop a comment below — always happy to compare notes.
