This guide covers a production Nebula install on AKS using Azure-managed Postgres, Azure Blob Storage (S3-compatible endpoint), and Azure Key Vault secrets via External Secrets Operator.

Prereqs

Before helm install, the following must be in place on the cluster side.

Cluster

  • AKS 1.30+
  • OIDC issuer enabled on the cluster (az aks update --enable-oidc-issuer --enable-workload-identity) — required for Workload Identity federation
  • Cluster nodes must have outbound internet access, or images must be mirrored to Azure Container Registry (ACR) first

Addons + controllers

| Component | Purpose | Install reference |
| --- | --- | --- |
| Cluster Autoscaler (or Karpenter for Azure preview) | Node autoscaling | AKS addon: --enable-cluster-autoscaler |
| nginx Ingress Controller (or AGIC) | HTTP/HTTPS ingress | kubernetes.github.io/ingress-nginx |
| Azure Disk CSI Driver | Premium SSD volumes for graph-engine / compactor / RabbitMQ | AKS built-in: enabled by default on AKS 1.21+ |
| cert-manager | TLS certificate provisioning from Let’s Encrypt | cert-manager.io/docs |
| External Secrets Operator (recommended) | Sync from Azure Key Vault | external-secrets.io |
  • Azure Database for PostgreSQL Flexible Server in the same virtual network as the AKS cluster. Nebula requires vector, pg_partman, and pg_cron; in the Azure portal, navigate to Server parameters → azure.extensions and add the required extensions, then enable pg_cron preloading (add pg_cron to shared_preload_libraries and restart the server) before bootstrap. Run nebula-enterprise postgres provision to create the Nebula/Hatchet databases, users, extensions, and chart credential Secrets. Private access (VNet-integrated) is strongly recommended.
  • Azure Blob Storage account with a container for graph segments. The chart’s object storage path uses Azure Blob’s S3-compatible API endpoint — see the note under Object storage below.
Known limitation: the chart’s objectStorage block emits S3-protocol environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, S3_ENDPOINT_URL). Azure Blob exposes an S3-compatible endpoint (Storage account → Settings → S3 compatibility, currently in preview). Enable it and use HMAC access keys as the credentialsSecret. If the S3-compat preview is not available in your region or subscription tier, run a MinIO gateway in front of Azure Blob as a bridge.

Workload Identity setup

Workload Identity replaces the legacy aad-pod-identity approach. Steps:
  1. Create a managed identity in the same resource group as the cluster:
    az identity create --name nebula-wi --resource-group <rg>
    
  2. Federate the managed identity with the AKS OIDC issuer for the Nebula service account:
    AKS_OIDC_ISSUER="$(az aks show --name <cluster> --resource-group <rg> \
      --query 'oidcIssuerProfile.issuerUrl' -o tsv)"
    az identity federated-credential create \
      --name nebula-federated \
      --identity-name nebula-wi \
      --resource-group <rg> \
      --issuer "$AKS_OIDC_ISSUER" \
      --subject "system:serviceaccount:nebula:<release>-nebula-sa" \
      --audience api://AzureADTokenExchange
    
    Replace <release> with your helm install release name (e.g. nebula).
  3. Grant the managed identity Storage Blob Data Contributor on the Blob container. If ESO uses the same identity, also grant it Key Vault Secrets User on the Key Vault.
  4. Record the managed identity Client ID — you’ll set it under serviceAccount.annotations in your values file.
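Steps 3 and 4 can be sketched with Azure CLI as follows. This is a minimal sketch, not the bundle's own tooling: the function name grant_nebula_roles, the environment variables (RG, STORAGE_ACCOUNT, CONTAINER, KEY_VAULT, CLIENT_ID), and the output file name are illustrative assumptions, and the Key Vault role only takes effect if the vault uses Azure RBAC authorization. The azure.workload.identity/client-id annotation is the standard AKS Workload Identity annotation; whether the chart also sets the azure.workload.identity/use pod label is not stated in this guide, so check the reference values file.

```shell
# Hypothetical helper: nothing runs until you call the function.
# Example call:
#   RG=my-rg STORAGE_ACCOUNT=myacct CONTAINER=graph-segments KEY_VAULT=my-kv grant_nebula_roles
grant_nebula_roles() {
  # Role assignments target the identity's principal (object) ID, not its client ID.
  principal_id="$(az identity show --name nebula-wi --resource-group "$RG" \
    --query principalId -o tsv)"
  storage_id="$(az storage account show --name "$STORAGE_ACCOUNT" --resource-group "$RG" \
    --query id -o tsv)"
  kv_id="$(az keyvault show --name "$KEY_VAULT" --query id -o tsv)"

  # Step 3: blob access, scoped to the single container.
  az role assignment create \
    --assignee-object-id "$principal_id" \
    --assignee-principal-type ServicePrincipal \
    --role "Storage Blob Data Contributor" \
    --scope "${storage_id}/blobServices/default/containers/${CONTAINER}"

  # Step 3 (optional part): only needed if ESO uses this same identity.
  az role assignment create \
    --assignee-object-id "$principal_id" \
    --assignee-principal-type ServicePrincipal \
    --role "Key Vault Secrets User" \
    --scope "$kv_id"

  # Step 4: print the client ID to record for the values file.
  az identity show --name nebula-wi --resource-group "$RG" --query clientId -o tsv
}

# Values fragment for step 4. CLIENT_ID falls back to a placeholder when unset.
CLIENT_ID="${CLIENT_ID:-<managed-identity-client-id>}"
cat > workload-identity-values.yaml <<EOF
serviceAccount:
  annotations:
    azure.workload.identity/client-id: "${CLIENT_ID}"
EOF
```

The fragment can be merged into your values file by hand, or passed as an extra -f file at install time.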

Install

1. Push images to your ACR

tar -xzf nebula-enterprise-<version>.tar.gz
cd nebula-enterprise-<version>/
sha256sum -c checksums.txt
docker load -i images.tar

ACR=<your-registry>.azurecr.io
az acr login --name <your-registry>

docker tag nebula:enterprise-<version>              "${ACR}/nebula/nebula-runtime:<version>"
docker tag nebula-graph-engine:enterprise-<version> "${ACR}/nebula/graph-engine:<version>"
docker tag nebula-postgres:enterprise-<version>     "${ACR}/nebula/postgres:<version>"
docker push "${ACR}/nebula/nebula-runtime:<version>"
docker push "${ACR}/nebula/graph-engine:<version>"
docker push "${ACR}/nebula/postgres:<version>"
For air-gapped AKS (no public-registry egress), also mirror third-party images to ACR and override image.*.repository in your values file:
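# Run on a workstation with public-registry access; docker tag needs each image present locally.
docker pull ghcr.io/hatchet-dev/hatchet/hatchet-engine:v0.79.0
docker pull ghcr.io/hatchet-dev/hatchet/hatchet-admin:v0.79.0
docker pull ghcr.io/hatchet-dev/hatchet/hatchet-migrate:v0.79.0
docker pull rabbitmq:3.13.7-management
docker pull public.ecr.aws/docker/library/busybox:1.37.0
docker tag ghcr.io/hatchet-dev/hatchet/hatchet-engine:v0.79.0 "${ACR}/hatchet-engine:v0.79.0"
docker tag ghcr.io/hatchet-dev/hatchet/hatchet-admin:v0.79.0  "${ACR}/hatchet-admin:v0.79.0"
docker tag ghcr.io/hatchet-dev/hatchet/hatchet-migrate:v0.79.0 "${ACR}/hatchet-migrate:v0.79.0"
docker tag rabbitmq:3.13.7-management                           "${ACR}/rabbitmq:3.13.7-management"
docker tag public.ecr.aws/docker/library/busybox:1.37.0       "${ACR}/busybox:1.37.0"
docker push "${ACR}/hatchet-engine:v0.79.0"
docker push "${ACR}/hatchet-admin:v0.79.0"
docker push "${ACR}/hatchet-migrate:v0.79.0"
docker push "${ACR}/rabbitmq:3.13.7-management"
docker push "${ACR}/busybox:1.37.0"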
Then set the mirrored repositories in your values file:
image:
  hatchetEngine:
    repository: hatchet-engine
  hatchetAdmin:
    repository: hatchet-admin
  hatchetMigrate:
    repository: hatchet-migrate
  rabbitmq:
    repository: rabbitmq
  busybox:
    repository: busybox

2. Seed secrets in Azure Key Vault

Create a Key Vault and store one secret per Nebula key, or store a JSON blob at a single secret name and use ESO’s dataFrom extraction. Generate the JWT RSA private key with the commands in Service authentication before setting NEBULA-JWT-PRIVATE-KEY-PEM. Example using individual secrets:
az keyvault secret set --vault-name <kv> --name OPENAI-API-KEY      --value "sk-..."
az keyvault secret set --vault-name <kv> --name NEBULA-SECRET-KEY   --value "$(openssl rand -hex 32)"
az keyvault secret set --vault-name <kv> --name NEBULA-SERVICE-API-KEY --value "$(openssl rand -hex 32)"
az keyvault secret set --vault-name <kv> --name NEBULA-WEBHOOK-HMAC-SECRET --value "$(openssl rand -hex 32)"
az keyvault secret set --vault-name <kv> --name NEBULA-JWT-KID --value "nebula-YYYY-MM"
az keyvault secret set --vault-name <kv> --name NEBULA-JWT-PRIVATE-KEY-PEM --file nebula-jwt-private.pem
az keyvault secret set --vault-name <kv> --name NEBULA-JWT-RETIRED-PUBLIC-KEYS-JSON --value "[]"
az keyvault secret set --vault-name <kv> --name NEBULA-INTERNAL-WAKE-TOKEN --value "$(openssl rand -hex 32)"
az keyvault secret set --vault-name <kv> --name NEBULA-VECTOR-BUILD-HATCHET-TRIGGER-TOKEN --value "$(openssl rand -hex 32)"
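To sync these Key Vault secrets into the cluster, ESO needs a SecretStore that points at the vault and an ExternalSecret that maps each Key Vault name to a key in a Kubernetes Secret. The sketch below writes both manifests to a file for review. The resource names, the target Secret name nebula-secrets, the underscore-style key names, and the external-secrets.io/v1beta1 API version are assumptions (newer ESO releases also serve v1); check helm/examples/aks/values.yaml for the names the chart actually expects.

```shell
# Placeholders fall back to the guide's defaults when the variables are unset.
KEY_VAULT="${KEY_VAULT:-<kv>}"
RELEASE="${RELEASE:-nebula}"

cat > nebula-external-secrets.yaml <<EOF
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: nebula-azure-kv          # hypothetical name
  namespace: nebula
spec:
  provider:
    azurekv:
      authType: WorkloadIdentity
      vaultUrl: "https://${KEY_VAULT}.vault.azure.net"
      serviceAccountRef:
        name: ${RELEASE}-nebula-sa
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: nebula-secrets           # hypothetical name
  namespace: nebula
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: SecretStore
    name: nebula-azure-kv
  target:
    name: nebula-secrets         # hypothetical target Secret
  data:
    - secretKey: OPENAI_API_KEY  # assumed key name inside the Secret
      remoteRef:
        key: OPENAI-API-KEY
    - secretKey: NEBULA_SECRET_KEY
      remoteRef:
        key: NEBULA-SECRET-KEY
    # Repeat one entry per remaining Key Vault secret set above.
EOF

# Review the file, then apply it with: kubectl apply -f nebula-external-secrets.yaml
```

Key Vault secret names cannot contain underscores, which is why the vault side uses hyphens while the Kubernetes side can map to any key name.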
NEBULA-JWT-RETIRED-PUBLIC-KEYS-JSON can stay [] on a fresh install. Populate it only during JWT signing-key rotation; see Service authentication.

For an empty Flexible Server, use the bundle helper as the canonical logical bootstrap:
PGPASSWORD=<admin-password> \
./nebula-enterprise postgres provision \
  --namespace nebula \
  --admin-url "postgresql://postgres@<server>.postgres.database.azure.com:5432/postgres?sslmode=require" \
  --nebula-database nebula \
  --nebula-user nebula \
  --nebula-secret nebula-postgres-credentials \
  --hatchet-database hatchet \
  --hatchet-user hatchet \
  --hatchet-secret hatchet-postgres-credentials
If your platform team provisions Postgres separately, mirror the same contract: distinct Nebula and Hatchet users, distinct logical databases, required extensions in the Nebula database, and a Hatchet database_url Secret that is already URL-encoded. Run the read-only verifier before Helm install:
PGPASSWORD=<admin-password> \
./nebula-enterprise postgres verify \
  --namespace nebula \
  --admin-url "postgresql://postgres@<server>.postgres.database.azure.com:5432/postgres?sslmode=require" \
  --nebula-database nebula \
  --nebula-user nebula \
  --nebula-secret nebula-postgres-credentials \
  --hatchet-database hatchet \
  --hatchet-user hatchet \
  --hatchet-secret hatchet-postgres-credentials

3. Copy + fill the reference values file

The bundle ships helm/examples/aks/values.yaml with every AKS-specific knob pre-wired. Copy it, fill in the <placeholder> markers (ACR login server, Flexible Server hostname, Blob storage account, managed identity client ID, Key Vault name, domain), and save as your-values.yaml.

4. Install

helm install nebula ./helm/nebula-<version>.tgz \
  -n nebula --create-namespace \
  -f helm/examples/_common/production-sizing.yaml \
  -f your-values.yaml
_common/production-sizing.yaml is the shared production-shape sizing block (replicas, CPU/memory requests + limits, persistence) used by all three cloud-managed K8s examples (EKS/AKS/GKE). Omit it to keep the chart’s minimal-dev defaults; override per-workload in your-values.yaml to fit your AKS node SKUs.

The chart runs schema migrations and catalog-apply automatically via a per-revision Job (<release>-nebula-migrations-<revision>); API and worker pods gate startup on an init container that polls public.nebula_release_contract for the install’s release row. releaseContract.releaseId and releaseContract.gitSha are stamped into the bundled values by bundle.sh and are consumed automatically.
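Because API and worker pods wait on the migrations Job, it can help to watch that Job directly after an install or upgrade. The helper below is a sketch under assumptions: the function name is hypothetical, jq is assumed to be installed, and the Job name is built from the <release>-nebula-migrations-<revision> pattern described above using the revision that helm status reports.

```shell
# Hypothetical helper: wait_for_migrations [release] [namespace]
wait_for_migrations() {
  release="${1:-nebula}"
  namespace="${2:-nebula}"

  # helm exposes the current release revision as "version" in its JSON status output.
  revision="$(helm status "$release" -n "$namespace" -o json | jq -r '.version')"
  job="${release}-nebula-migrations-${revision}"

  # Block until the Job completes; on timeout or failure, show recent logs and fail.
  if ! kubectl -n "$namespace" wait --for=condition=complete "job/${job}" --timeout=10m; then
    kubectl -n "$namespace" logs "job/${job}" --tail=50
    return 1
  fi
}
```

Call it as wait_for_migrations nebula nebula once helm install returns; pods stuck in their init container usually point back to this Job.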

5. Verify

az aks get-credentials --name <cluster> --resource-group <rg>
kubectl -n nebula get pods
kubectl -n nebula get ingress nebula
curl -fsS https://nebula.<your-domain>.com/v1/health

Upgrade

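Pull the new bundle, push new images to your ACR, then upgrade with the same values files you passed at install. When values files are supplied, Helm applies only those files, so omitting the sizing file reverts those workloads to the chart defaults:
helm upgrade nebula ./helm/nebula-<new-version>.tgz \
  -n nebula \
  -f helm/examples/_common/production-sizing.yaml \
  -f your-values.yaml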

Sizing reference

| Workload | Starter | When to scale |
| --- | --- | --- |
| API | 2 replicas, 1 CPU / 2-4 GB | HPA on CPU >70% sustained |
| Worker | 2 replicas, 2 CPU / 4-8 GB | HPA on queue depth (Hatchet metric) |
| Graph engine | 2 replicas, 2 CPU / 4-8 GB | Manual; restart-sensitive (WAL replay) |
| Compactor | 1 replica, 1 CPU / 2-4 GB | Single-writer; do not scale horizontally |
| RabbitMQ | 1 replica, 8 GB PVC | Single-broker is fine up to ~10k workflows/min |
Recommended AKS node SKUs for the starter shape: Standard_D4s_v5 (4 vCPU / 16 GB) for API, worker, and Hatchet; Standard_D8s_v5 (8 vCPU / 32 GB) for graph-engine and compactor.
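As an illustration of the API row in the sizing table, a CPU-based HorizontalPodAutoscaler with a 70% target might look like the sketch below. The Deployment name nebula-api and the replica ceiling of 6 are assumptions, not values taken from the chart. If the chart exposes its own autoscaling settings, prefer those over a hand-written manifest.

```shell
# Assumed Deployment name; confirm with: kubectl -n nebula get deploy
API_DEPLOYMENT="${API_DEPLOYMENT:-nebula-api}"

cat > nebula-api-hpa.yaml <<EOF
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ${API_DEPLOYMENT}
  namespace: nebula
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ${API_DEPLOYMENT}
  minReplicas: 2            # starter shape from the sizing table
  maxReplicas: 6            # illustrative ceiling, not from the chart
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
EOF
```

Utilization is measured against the pods' CPU requests, so the target only behaves as intended when the API pods have CPU requests set, as they do in the production sizing file.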

Troubleshooting

Check that the managed identity’s federated credential subject exactly matches system:serviceaccount:<namespace>:<release>-nebula-sa. The release name prefix is part of the service account name. Confirm with kubectl -n nebula get sa and compare to az identity federated-credential list --identity-name nebula-wi --resource-group <rg>.
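The comparison described above can be scripted so that a near-miss (for example a subject without the release prefix) is caught by an exact match instead of by eye. This is a small sketch; the variable and function names are illustrative, and the defaults assume both the namespace and the release are called nebula.

```shell
NAMESPACE="${NAMESPACE:-nebula}"
RELEASE="${RELEASE:-nebula}"

# The subject the federated credential must carry, built from the pattern above.
EXPECTED_SUBJECT="system:serviceaccount:${NAMESPACE}:${RELEASE}-nebula-sa"
echo "expected subject: ${EXPECTED_SUBJECT}"

# Reads candidate subjects on stdin, one per line, and succeeds only on an
# exact whole-line match (-F fixed string, -x whole line, -q quiet).
subject_matches() {
  grep -Fxq -- "$EXPECTED_SUBJECT"
}

# Against a live identity (requires az, with the resource group in RG):
#   az identity federated-credential list --identity-name nebula-wi \
#     --resource-group "$RG" --query '[].subject' -o tsv | subject_matches \
#     && echo "subject matches" || echo "subject mismatch"
```

With the defaults, the expected subject is system:serviceaccount:nebula:nebula-nebula-sa, which shows how the release name appears twice when the release is itself named nebula.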
nginx Ingress on AKS provisions a public Azure Load Balancer automatically. The provisioning can take 3-5 minutes on a fresh cluster. Check kubectl -n ingress-nginx get svc ingress-nginx-controller for the external IP assignment. If it stays in Pending, verify that the cluster’s subnet has enough IP space and that the AKS service principal / managed identity has Network Contributor on the virtual network.
The azure.extensions server parameter must include vector before bootstrap. Then run nebula-enterprise postgres provision or have your platform workflow satisfy nebula-enterprise postgres verify; the contract requires the extension to be enabled at the server level and installed in the Nebula database.
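The server-level allow-list can also be set from Azure CLI instead of the portal. The helper below is a sketch: the function name and the RG and PG_SERVER variables are assumptions, and the set call replaces the whole allow-list, so append any extensions your server already permits.

```shell
# Hypothetical helper; call as: RG=my-rg PG_SERVER=my-server allow_nebula_extensions
allow_nebula_extensions() {
  # Allow-list the three extensions Nebula requires at the server level.
  # This overwrites the current azure.extensions value.
  az postgres flexible-server parameter set \
    --resource-group "$RG" --server-name "$PG_SERVER" \
    --name azure.extensions --value VECTOR,PG_PARTMAN,PG_CRON

  # Print the preload list so you can confirm pg_cron is present before bootstrap.
  az postgres flexible-server parameter show \
    --resource-group "$RG" --server-name "$PG_SERVER" \
    --name shared_preload_libraries --query value -o tsv
}
```

Allow-listing only permits the extensions; installing them in the Nebula database is still done by nebula-enterprise postgres provision or by your platform workflow.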
Azure Blob’s S3-compatible endpoint requires HMAC keys, not the storage account connection string. Generate HMAC keys under Storage account → Access keys → Enable S3 compatible HMAC. Store the Access Key ID and Secret Access Key in the Kubernetes Secret referenced by objectStorage.credentialsSecret with keys AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY (those exact uppercase names; the chart’s nebula.objectStorageEnv helper reads them via secretKeyRef.key).
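A Secret with the two required key names could be written as in the sketch below. The Secret name nebula-object-storage, the file name, and the HMAC_* variable names are assumptions; only the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY key names come from the chart contract described above.

```shell
# Keep the generated file private; it contains credentials once filled in.
umask 077

SECRET_NAME="${SECRET_NAME:-nebula-object-storage}"   # hypothetical Secret name

cat > object-storage-secret.yaml <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: ${SECRET_NAME}
  namespace: nebula
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: "${HMAC_ACCESS_KEY_ID:-<hmac-access-key-id>}"
  AWS_SECRET_ACCESS_KEY: "${HMAC_SECRET_ACCESS_KEY:-<hmac-secret-access-key>}"
EOF

# Apply with: kubectl apply -f object-storage-secret.yaml
# Then delete the local file and set objectStorage.credentialsSecret to the Secret name.
```

If you already sync secrets through External Secrets Operator, storing the HMAC pair in Key Vault and mapping it to the same two key names avoids keeping credentials on disk at all.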