GKE

This guide covers a production Nebula install on GKE using Cloud SQL for PostgreSQL, Google Cloud Storage (via HMAC keys or a MinIO bridge), DynamoDB-compatible orchestration tables, GKE Workload Identity, and external secrets via External Secrets Operator with GCP Secret Manager.

Prereqs

Before helm install, the following must be in place on the cluster side.

Cluster

GKE 1.30+ (Autopilot or Standard mode both work; Standard gives more control over node pools)
Workload Identity enabled on the cluster (--workload-pool=<project>.svc.id.goog) — required for keyless SA binding to GCP IAM
OIDC provider is implicit on GKE when Workload Identity is enabled; no separate step needed
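
To confirm the Workload Identity prerequisite before going further, a minimal sketch along these lines can compare the pool the cluster reports with the expected `<project>.svc.id.goog` value. The helper names are illustrative and not part of the Nebula bundle; the sketch assumes a regional cluster and an authenticated `gcloud`.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not shipped with Nebula.

# Build the Workload Identity pool name GKE expects for a project.
workload_pool() {
  printf '%s.svc.id.goog' "$1"
}

# Return 0 when the cluster reports the expected pool, 1 otherwise.
# Usage: check_workload_identity <cluster> <region> <project>
check_workload_identity() {
  local cluster="$1" region="$2" project="$3" actual
  actual="$(gcloud container clusters describe "$cluster" \
    --region "$region" --project "$project" \
    --format='value(workloadIdentityConfig.workloadPool)')"
  if [ "$actual" = "$(workload_pool "$project")" ]; then
    echo "Workload Identity enabled: $actual"
  else
    echo "Workload Identity pool mismatch or unset: '${actual}'" >&2
    return 1
  fi
}
```

An empty value from the describe call usually means Workload Identity was never enabled on the cluster.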

Addons + controllers

Component	Purpose	Install reference
GKE Cluster Autoscaler	Node autoscaling	GKE built-in: `--enable-autoscaling` per node pool
nginx Ingress Controller (or GCE Ingress)	HTTP/HTTPS ingress	kubernetes.github.io/ingress-nginx
cert-manager	TLS from Let’s Encrypt	cert-manager.io/docs
External Secrets Operator (recommended)	Sync from GCP Secret Manager	external-secrets.io

On GKE Standard you create node pools manually; size them to match the workload sizing table below. GKE Autopilot provisions nodes on demand from pod resource requests, so set resources.requests precisely to let Autopilot select a suitable machine family.

GCP-managed resources (recommended)

Cloud SQL for PostgreSQL 16 in the same region as the cluster, with Private IP enabled. Nebula requires vector, pg_partman, and pg_cron; confirm all three are available in your Cloud SQL version, enable the required database flags for vector/pg_cron, then run nebula-enterprise postgres provision to create the Nebula database, user, extensions, and chart credential Secret.
GCS bucket in the same region. Grant the Nebula service account roles/storage.objectAdmin on the bucket.
DynamoDB-compatible service for orchestration state, reachable from Nebula pods. Before Helm install, run nebula-enterprise orchestration dynamodb ensure to create or verify the four pk / sk orchestration tables and writer-authority records; use --endpoint-url for non-AWS endpoints and set the matching NEBULA_ORCHESTRATION_DYNAMODB_* values in the chart.

Object storage note: the chart’s objectStorage block uses S3-protocol env vars. GCS exposes an S3-compatible XML API at https://storage.googleapis.com. Create an HMAC key for the Nebula service account (Cloud Storage → Settings → Interoperability in the Cloud Console), store it as the credentialsSecret, and set objectStorage.forcePathStyle: false for the GCS XML API. Alternatively, run a MinIO gateway in front of GCS.

Workload Identity setup

Create a GCP service account for Nebula:

gcloud iam service-accounts create nebula-sa \
  --project <project>

Bind it to the Kubernetes service account the chart creates:

gcloud iam service-accounts add-iam-policy-binding \
  nebula-sa@<project>.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:<project>.svc.id.goog[nebula/<release>-nebula-sa]"

Replace <release> with your helm install release name.
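
The member string is easy to mistype, so it can help to build it from its parts. The following is a small sketch, assuming the `nebula` namespace and the `<release>-nebula-sa` naming used in the binding command above; `wi_member` is a hypothetical helper, not something the bundle provides.

```shell
#!/usr/bin/env bash
# Sketch only: wi_member is a hypothetical helper, not part of the bundle.
# The default namespace (nebula) and the -nebula-sa suffix follow the
# binding command shown above.

# Print the IAM member string for a release's Kubernetes service account.
# Usage: wi_member <project> <release> [namespace]
wi_member() {
  local project="$1" release="$2" namespace="${3:-nebula}"
  printf 'serviceAccount:%s.svc.id.goog[%s/%s-nebula-sa]' \
    "$project" "$namespace" "$release"
}
```

The output can be passed directly as the `--member` value, for example `--member "$(wi_member <project> <release>)"`.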

Grant the GCP service account access to GCS:

gcloud storage buckets add-iam-policy-binding gs://<bucket> \
  --role roles/storage.objectAdmin \
  --member "serviceAccount:nebula-sa@<project>.iam.gserviceaccount.com"

If ESO uses the same GCP service account for Secret Manager access, also grant roles/secretmanager.secretAccessor on the secrets.

Annotate the Kubernetes service account in your values file:

serviceAccount:
  annotations:
    iam.gke.io/gcp-service-account: nebula-sa@<project>.iam.gserviceaccount.com

Install

1. Push images to Artifact Registry

tar -xzf nebula-enterprise-<version>.tar.gz
cd nebula-enterprise-<version>/
sha256sum -c checksums.txt
docker load -i images.tar

REGION=us-central1
AR="${REGION}-docker.pkg.dev/<project>/<repo>"
gcloud auth configure-docker "${REGION}-docker.pkg.dev"

docker tag nebula:enterprise-<version>              "${AR}/nebula/nebula-runtime:<version>"
docker tag nebula-graph-engine:enterprise-<version> "${AR}/nebula/graph-engine:<version>"
docker tag nebula-postgres:enterprise-<version>     "${AR}/nebula/postgres:<version>"
docker push "${AR}/nebula/nebula-runtime:<version>"
docker push "${AR}/nebula/graph-engine:<version>"
docker push "${AR}/nebula/postgres:<version>"
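
If you script this step, the tag and push pairs above can be expressed as a loop. This is a sketch under the assumption that the three image names and target paths are exactly those shown above; `ar_ref` and `push_all` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch only: ar_ref and push_all are hypothetical helpers. The image
# names mirror the tag commands above.

# Map a locally loaded image name to its Artifact Registry reference.
# Usage: ar_ref <registry-prefix> <local-image-name> <version>
ar_ref() {
  local ar="$1" local_name="$2" version="$3" target
  case "$local_name" in
    nebula)              target="nebula-runtime" ;;
    nebula-graph-engine) target="graph-engine" ;;
    nebula-postgres)     target="postgres" ;;
    *) echo "unknown image: $local_name" >&2; return 1 ;;
  esac
  printf '%s/nebula/%s:%s' "$ar" "$target" "$version"
}

# Tag and push every bundle image; stops at the first failure.
# Usage: push_all <registry-prefix> <version>
push_all() {
  local ar="$1" version="$2" name ref
  for name in nebula nebula-graph-engine nebula-postgres; do
    ref="$(ar_ref "$ar" "$name" "$version")" || return 1
    docker tag "${name}:enterprise-${version}" "$ref" || return 1
    docker push "$ref" || return 1
  done
}
```

With `AR` set as above, `push_all "$AR" <version>` performs the same six commands.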

For private-cluster GKE (no public-registry egress), also mirror third-party images:

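# Pull first (from a host with public egress) if the image is not already local.
docker pull public.ecr.aws/docker/library/busybox:1.37.0
docker tag public.ecr.aws/docker/library/busybox:1.37.0       "${AR}/busybox:1.37.0"
docker push "${AR}/busybox:1.37.0"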

Then set the mirrored repositories in your values file:

image:
  busybox:
    repository: busybox

2. Seed secrets in GCP Secret Manager

Generate the JWT RSA private key with the commands in Service authentication before creating NEBULA_JWT_PRIVATE_KEY_PEM.

echo -n "sk-..."           | gcloud secrets create OPENAI_API_KEY       --data-file=-
echo -n "$(openssl rand -hex 32)" | gcloud secrets create NEBULA_SECRET_KEY --data-file=-
echo -n "nebula-YYYY-MM"   | gcloud secrets create NEBULA_JWT_KID       --data-file=-
gcloud secrets create NEBULA_JWT_PRIVATE_KEY_PEM --data-file=nebula-jwt-private.pem
echo -n "[]"               | gcloud secrets create NEBULA_JWT_RETIRED_PUBLIC_KEYS_JSON --data-file=-
# Repeat for NEBULA_SERVICE_API_KEY, NEBULA_WEBHOOK_HMAC_SECRET, and
# NEBULA_INTERNAL_WAKE_TOKEN.
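
`gcloud secrets create` fails when the secret already exists, which matters if you rerun the seeding step. One possible approach, sketched below, is to create the secret when it is missing and add a new version otherwise. `seed_secret` is a hypothetical helper; it assumes the default project and replication policy are already configured for `gcloud`.

```shell
#!/usr/bin/env bash
# Sketch only: seed_secret is a hypothetical helper. It reads the secret
# value from stdin so the value never appears in the argument list.

# Create the secret if it is missing, otherwise add a new version.
# Usage: printf '%s' "<value>" | seed_secret <NAME>
seed_secret() {
  local name="$1"
  if gcloud secrets describe "$name" </dev/null >/dev/null 2>&1; then
    gcloud secrets versions add "$name" --data-file=-
  else
    gcloud secrets create "$name" --data-file=-
  fi
}
```

For example: `printf '%s' "$(openssl rand -hex 32)" | seed_secret NEBULA_SECRET_KEY`.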

NEBULA_JWT_RETIRED_PUBLIC_KEYS_JSON can stay [] on a fresh install. Populate it only during JWT signing-key rotation; see Service authentication.

For an empty Cloud SQL instance, use the bundle helper as the canonical logical bootstrap:

PGPASSWORD=<admin-password> \
./nebula-enterprise postgres provision \
  --namespace nebula \
  --admin-url "postgresql://postgres@<cloud-sql-private-ip-or-dns>:5432/postgres?sslmode=require" \
  --nebula-database nebula \
  --nebula-user nebula \
  --nebula-secret nebula-postgres-credentials

If your platform team provisions Postgres separately, mirror the same contract: a Nebula user/database, required extensions in the Nebula database, and a Kubernetes Secret with username and password keys. Run the read-only verifier before Helm install:

PGPASSWORD=<admin-password> \
./nebula-enterprise postgres verify \
  --namespace nebula \
  --admin-url "postgresql://postgres@<cloud-sql-private-ip-or-dns>:5432/postgres?sslmode=require" \
  --nebula-database nebula \
  --nebula-user nebula \
  --nebula-secret nebula-postgres-credentials

3. Copy + fill the reference values file

The bundle ships helm/examples/gke/values.yaml with GKE-specific knobs pre-wired (Workload Identity annotation, GCS endpoint, nginx ingress, Secret Manager ESO). Copy it, fill in the <placeholder> markers, and save as your-values.yaml.

4. Install

gcloud container clusters get-credentials <cluster> --region <region> --project <project>

helm install nebula ./helm/nebula-<version>.tgz \
  -n nebula --create-namespace \
  -f helm/examples/_common/production-sizing.yaml \
  -f your-values.yaml

_common/production-sizing.yaml is the shared production-shape sizing block (replicas, CPU/memory requests + limits, persistence) used by all three cloud-managed K8s examples (EKS/AKS/GKE). Omit it to keep the chart’s minimal-dev defaults; override per-workload in your-values.yaml to fit your GKE node SKUs.

The chart runs schema migrations and catalog-apply automatically via a per-revision Job (<release>-nebula-migrations-<revision>). API and worker pods gate startup on an init container that polls public.nebula_release_contract for the install’s release row. releaseContract.releaseId and releaseContract.gitSha are stamped by bundle.sh and consumed automatically.
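
Because API and worker pods wait on the migrations Job, it can be useful to block on that Job explicitly after `helm install`. The sketch below assumes `<revision>` in the Job name is the Helm release revision (1 on a fresh install); the helper names and the 10m default timeout are illustrative.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical; the Job name pattern
# <release>-nebula-migrations-<revision> comes from the text above.

# Print the migrations Job name for a release and revision.
migrations_job() {
  printf '%s-nebula-migrations-%s' "$1" "$2"
}

# Block until the migrations Job for a revision completes.
# Usage: wait_for_migrations <release> <revision> [namespace] [timeout]
wait_for_migrations() {
  local release="$1" revision="$2" namespace="${3:-nebula}" timeout="${4:-10m}"
  kubectl -n "$namespace" wait --for=condition=complete \
    "job/$(migrations_job "$release" "$revision")" --timeout="$timeout"
}
```

If the wait times out, `kubectl -n nebula logs job/<job-name>` is the usual next place to look.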

5. Verify

kubectl -n nebula get pods
kubectl -n nebula get ingress nebula
curl -fsS https://nebula.<your-domain>.com/v1/health
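
The health endpoint may not answer immediately while certificates are issued and pods start. A retry loop around the same `curl` check is one way to handle that; the following sketch uses a hypothetical `wait_for_health` helper with arbitrary default attempt and delay values.

```shell
#!/usr/bin/env bash
# Sketch only: wait_for_health is a hypothetical helper wrapping the curl
# check above; the attempts and delay defaults are arbitrary.

# Poll a URL until curl succeeds or the attempts run out.
# Usage: wait_for_health <url> [attempts] [delay-seconds]
wait_for_health() {
  local url="$1" attempts="${2:-30}" delay="${3:-10}" i
  for ((i = 1; i <= attempts; i++)); do
    if curl -fsS --max-time 5 "$url" >/dev/null 2>&1; then
      echo "healthy after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
  done
  echo "not healthy after ${attempts} attempt(s): ${url}" >&2
  return 1
}
```

For example: `wait_for_health "https://nebula.<your-domain>.com/v1/health"`.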

Upgrade

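Pull the new bundle, push the new images to Artifact Registry, then upgrade with the same values files you passed at install. When values are supplied with -f, Helm does not carry over files from earlier runs, so omitting production-sizing.yaml would revert those workloads to the chart defaults.

helm upgrade nebula ./helm/nebula-<new-version>.tgz \
  -n nebula \
  -f helm/examples/_common/production-sizing.yaml \
  -f your-values.yaml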

Sizing reference

Workload	Starter	When to scale
API	2 replicas, 1 CPU / 2-4 GB	HPA on CPU >70% sustained
Worker	2 replicas, 2 CPU / 4-8 GB	HPA on queue depth (Orchestration metric)
Graph engine	2 replicas, 2 CPU / 4-8 GB	Manual; restart-sensitive (WAL replay)
Compactor	1 replica, 1 CPU / 2-4 GB	Single-writer; do not scale horizontally
Queue	1 replica, 8 GB PVC	Single-broker is fine up to ~10k workflows/min

Recommended GKE machine types: n2-standard-4 (4 vCPU / 16 GB) for API, worker, Orchestration; n2-highmem-4 (4 vCPU / 32 GB) for graph-engine and compactor.
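
On GKE Standard, the recommendation above translates into one node pool per machine type. The sketch below only prints the `gcloud` commands so they can be reviewed before running; the pool names, cluster name, region, and node counts are placeholders, and `node_pool_cmd` is a hypothetical helper. On regional clusters, `--min-nodes` and `--max-nodes` apply per zone.

```shell
#!/usr/bin/env bash
# Sketch only: pool names, node counts, and the helper are hypothetical.
# Machine types come from the recommendation above.

# Print the gcloud command for one autoscaled node pool.
# Usage: node_pool_cmd <cluster> <region> <pool> <machine-type> <min> <max>
node_pool_cmd() {
  printf 'gcloud container node-pools create %s --cluster %s --region %s --machine-type %s --enable-autoscaling --min-nodes %s --max-nodes %s\n' \
    "$3" "$1" "$2" "$4" "$5" "$6"
}

# General pool for API and worker; high-memory pool for graph-engine
# and compactor.
node_pool_cmd my-cluster us-central1 nebula-general n2-standard-4 1 3
node_pool_cmd my-cluster us-central1 nebula-highmem n2-highmem-4 1 2
```

Autopilot clusters do not need this step, since nodes are provisioned from pod requests.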

Troubleshooting

Workload Identity not bound — pods receive permission denied from GCS

Confirm the Kubernetes SA annotation is set: kubectl -n nebula describe sa <release>-nebula-sa should show iam.gke.io/gcp-service-account. Also verify the IAM binding: gcloud iam service-accounts get-iam-policy nebula-sa@<project>.iam.gserviceaccount.com should list the workloadIdentityUser binding for the K8s SA. Ensure the cluster’s Workload Identity pool (<project>.svc.id.goog) is enabled.
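
The annotation check can be scripted so that it compares the actual value with the expected service account email instead of relying on a visual scan. This is a sketch assuming the `nebula-sa` GCP service account name used earlier in this guide; `check_ksa_annotation` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch only: check_ksa_annotation is a hypothetical helper.

# Compare the Kubernetes SA's Workload Identity annotation with the
# expected GCP service account email.
# Usage: check_ksa_annotation <release> <project> [namespace]
check_ksa_annotation() {
  local release="$1" project="$2" namespace="${3:-nebula}" expected actual
  expected="nebula-sa@${project}.iam.gserviceaccount.com"
  actual="$(kubectl -n "$namespace" get sa "${release}-nebula-sa" \
    -o jsonpath='{.metadata.annotations.iam\.gke\.io/gcp-service-account}')"
  if [ "$actual" = "$expected" ]; then
    echo "annotation ok: $actual"
  else
    echo "annotation mismatch: expected '$expected', got '$actual'" >&2
    return 1
  fi
}
```

A mismatch here points at the values file; a match with continued permission errors points at the IAM binding or the bucket role.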

GCE Ingress (not nginx) provisioning slow

The GCE Ingress controller provisions a Google Cloud Load Balancer, which can take 5-10 minutes. Check kubectl -n nebula describe ingress nebula for events. If you need faster provisioning, set ingress.className: nginx and install the nginx Ingress controller instead.

pgvector missing on Cloud SQL — 'extension vector does not exist'

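Cloud SQL for PostgreSQL 16.3+ supports pgvector via the vector extension. On Cloud SQL, vector is normally enabled with CREATE EXTENSION in the Nebula database and does not need a database flag; pg_cron, by contrast, requires the cloudsql.enable_pg_cron=on flag. Once the extensions are available, run nebula-enterprise postgres provision or have your platform workflow satisfy nebula-enterprise postgres verify. Cloud SQL docs: Use pgvector.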
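
Before provisioning, you can check which of the three required extensions the instance actually offers by querying `pg_available_extensions`. The sketch below is one way to do that; `missing_extensions` is a hypothetical helper that reads extension names on stdin and prints the required ones that are absent.

```shell
#!/usr/bin/env bash
# Sketch only: missing_extensions is a hypothetical helper. The required
# list (vector, pg_partman, pg_cron) comes from the prerequisites above.

# Read available extension names (one per line) on stdin and print each
# required extension that is not among them.
missing_extensions() {
  local available ext
  available="$(cat)"
  for ext in vector pg_partman pg_cron; do
    printf '%s\n' "$available" | grep -qx "$ext" || printf '%s\n' "$ext"
  done
}
```

A possible invocation, with the same admin URL and PGPASSWORD used for provisioning: `psql "<admin-url>" -At -c "SELECT name FROM pg_available_extensions" | missing_extensions`. Empty output means all three are available.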

GCS HMAC credentials rejected by graph-engine

Verify the HMAC key is created for a service account (not a user account). In the Cloud Console, service account HMAC keys are managed under Cloud Storage → Settings → Interoperability. Store the Access ID and Secret in the Kubernetes Secret referenced by objectStorage.credentialsSecret. The Secret must contain the keys AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, with those exact uppercase names, because the chart’s nebula.objectStorageEnv helper reads them via secretKeyRef.key.
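
To rule out key-name mistakes, the Secret can be created from a small script instead of by hand. This is a sketch, assuming the HMAC Access ID and Secret are exported as `GCS_HMAC_ACCESS_ID` and `GCS_HMAC_SECRET` (hypothetical variable names); `create_hmac_secret` is likewise a hypothetical helper. Note that `--from-literal` places the values in the process argument list, so run it from a trusted shell.

```shell
#!/usr/bin/env bash
# Sketch only: create_hmac_secret and the GCS_HMAC_* variables are
# hypothetical. The Secret key names are the ones the chart expects.

# Create the object-storage credentials Secret from environment variables.
# Usage: create_hmac_secret <secret-name> [namespace]
create_hmac_secret() {
  local name="$1" namespace="${2:-nebula}"
  if [ -z "${GCS_HMAC_ACCESS_ID:-}" ] || [ -z "${GCS_HMAC_SECRET:-}" ]; then
    echo "set GCS_HMAC_ACCESS_ID and GCS_HMAC_SECRET first" >&2
    return 1
  fi
  kubectl -n "$namespace" create secret generic "$name" \
    --from-literal=AWS_ACCESS_KEY_ID="$GCS_HMAC_ACCESS_ID" \
    --from-literal=AWS_SECRET_ACCESS_KEY="$GCS_HMAC_SECRET"
}
```

Use the same Secret name that objectStorage.credentialsSecret references in your values file.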
