Azure AKS, GCP GKE, and AWS EKS — Terraform and Helm provisioning guide
This document explains how the K2view monitoring infrastructure is provisioned on Kubernetes across Azure (AKS), GCP (GKE), and AWS (EKS) using the K2view Terraform blueprints. It covers what the blueprints deploy, the required inputs, how to run the deployment, and what needs to be configured after deployment to connect Fabric metrics.
This document is for platform engineers and DevOps teams responsible for standing up or maintaining the K2view observability stack on cloud-managed Kubernetes clusters.
[ Azure + GCP + AWS ] The Grafana Agent Helm chart is the same across all three cloud platforms.
The K2view Terraform blueprints provision the cluster infrastructure and deploy the Grafana Agent observability stack during the same Terraform run. The monitoring deployment is controlled by a single flag:
deploy_grafana_agent = true
When this flag is true, Terraform deploys the Grafana k8s-monitoring Helm chart into the cluster. This chart installs the Grafana Agent itself, kube-state-metrics, prometheus-node-exporter, and the log-collection components.
Grafana Agent is configured to remote-write metrics to an external Prometheus endpoint and forward logs to an external Loki endpoint. Both endpoints are provided as Terraform input variables. Before running, you must supply three inputs: the cluster name, the endpoint URLs, and the credentials for each endpoint.
Important: The blueprints deploy the Grafana Agent and its supporting components. They do NOT automatically configure Grafana Agent to scrape Fabric pods. Fabric metric collection requires additional configuration after the stack is deployed. See Section 6.
[ Azure + GCP + AWS ] Applies to all three cloud platforms.
Before running the Terraform deployment, confirm the following prerequisites. The Grafana Agent requires two external endpoints to send data to: a Prometheus-compatible remote-write endpoint for metrics and a Loki endpoint for logs. These can be Grafana Cloud endpoints or self-hosted Prometheus and Loki instances. You will need the endpoint URL and username for each, plus an access token (or password) with write permission.
Note: The GCP and AWS modules also support a Tempo (distributed tracing) endpoint. Tracing is disabled by default. If not using Tempo, the token placeholder is still required in the Terraform variable but traces will not be sent.
The blueprints also deploy the K2view Agent (k2v_agent), which connects the cluster to K2view K2cloud Orchestrator via a mailbox ID. This is separate from monitoring but is deployed in the same Terraform run. You will need a mailbox ID issued by K2view for this cluster.
[ Azure / AKS ] Blueprint path: blueprints/Azure/terraform/AKS/
The Azure blueprint deploys an AKS cluster and, optionally, Grafana Agent. The Grafana Agent values are supplied via a separate YAML file rather than individual Terraform variables.
Copy or edit the template:
blueprints/Azure/terraform/AKS/terraform.tfvars.template
Key variables to set:
cluster_name = "your-cluster-name"
resource_group_name = "your-resource-group"
location = "your-azure-region"
deploy_grafana_agent = true
mailbox_id = "your-k2view-mailbox-id"
Edit the Grafana Agent values file in the AKS directory:
blueprints/Azure/terraform/AKS/grafana-agent-values.yaml
Replace all placeholder tokens:
```yaml
cluster:
  name: <YOUR_CLUSTER_NAME>

externalServices:
  prometheus:
    host: <PROMETHEUS_URL>
    basicAuth:
      username: <PROMETHEUS_USER>
      password: <GRAFANA_TOKEN>
  loki:
    host: <LOKI_URL>
    basicAuth:
      username: <LOKI_USER>
      password: <GRAFANA_TOKEN>
```
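Before running terraform apply, it can help to confirm that no placeholder tokens remain in the values file. A minimal sketch (the helper name and sample are ours, not part of the blueprints):

```python
import re

def find_placeholders(values_yaml_text):
    """Return the <PLACEHOLDER> tokens still present in a values file.

    Run it on the contents of grafana-agent-values.yaml before
    `terraform apply` to catch tokens you forgot to replace.
    """
    return sorted(set(re.findall(r"<[A-Z_]+>", values_yaml_text)))

# Example: one token was replaced, one was forgotten.
sample = """
externalServices:
  prometheus:
    host: https://prometheus.example.com
    basicAuth:
      username: "123456"
      password: <GRAFANA_TOKEN>
"""
print(find_placeholders(sample))  # ['<GRAFANA_TOKEN>']
```

An empty result means every placeholder has been replaced.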
**Note:** The Azure blueprint uses a local copy of the k8s-monitoring chart from blueprints/Azure/helm/charts/grafana-agent/k8s-monitoring/. GCP and AWS pull from the published Grafana Helm registry. The chart behavior is the same.
cd blueprints/Azure/terraform/AKS
terraform init
terraform plan
terraform apply
**Private clusters:** If private_cluster_enabled = true in your tfvars, Grafana Agent and other Helm-based components will not be deployed by Terraform. They must be deployed manually after the cluster is created and a private network path is available.
[ GCP / GKE ] Blueprint path: blueprints/gcp/terraform/GKE/
The GCP blueprint deploys a GKE cluster and, optionally, Grafana Agent. All Grafana Agent configuration is passed as Terraform variables.
Copy or edit the template:
blueprints/gcp/terraform/GKE/terraform.tfvars.template
Key variables to set:
project_id = "your-gcp-project-id"
cluster_name = "your-cluster-name"
region = "gcp-region"
deploy_grafana_agent = true
grafana_token = "your-grafana-access-policy-token"
mailbox_id = "your-k2view-mailbox-id"
The Prometheus and Loki host URLs and usernames are pre-populated in the module variables with Grafana Cloud defaults. If using a different endpoint, override it in the tfvars file:
```hcl
# Only needed if NOT using the Grafana Cloud defaults
externalservices_prometheus_host     = "https://<PROMETHEUS_HOST>"
externalservices_prometheus_username = "<PROMETHEUS_USER>"
externalservices_loki_host           = "https://<LOKI_HOST>"
externalservices_loki_username       = "<LOKI_USER>"
```
cd blueprints/gcp/terraform/GKE
terraform init
terraform plan
terraform apply
[ AWS / EKS ] Blueprint path: blueprints/aws/terraform/EKS/
The AWS blueprint deploys an EKS cluster and, optionally, Grafana Agent. The pattern is the same as GCP — all configuration is passed as Terraform variables.
Create a tfvars file from the variables:
cluster_name = "your-cluster-name"
region = "aws-region"
deploy_grafana_agent = true
grafana_token = "your-grafana-access-policy-token"
mailbox_id = "your-k2view-mailbox-id"
As with GCP, the Prometheus and Loki host URLs use Grafana Cloud defaults from the module variables. Override if using different endpoints.
Note: deploy_grafana_agent defaults to false in the AWS blueprint. You must explicitly set it to true in your tfvars file for Grafana Agent to be deployed.
cd blueprints/aws/terraform/EKS
terraform init
terraform plan
terraform apply
[ Azure + GCP + AWS ] The Grafana Agent k8s-monitoring chart deploys the same components on all three clouds.
After a successful Terraform apply with deploy_grafana_agent = true, the Grafana Agent, kube-state-metrics, prometheus-node-exporter, and the log-collection components are present in the cluster.
All components are deployed into the grafana-agent namespace.
To confirm the deployment:
kubectl get pods -n grafana-agent
All pods should reach Running status within a few minutes of the Helm release completing.
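The check above can be automated. A minimal sketch that reports pods not yet in the Running phase, assuming the output of `kubectl get pods -n grafana-agent -o json` has been parsed into a dict (the helper name is ours, not part of the blueprints):

```python
def not_running(pods_json):
    """Return the names of pods that are not in the Running phase.

    `pods_json` is the parsed JSON output of:
        kubectl get pods -n grafana-agent -o json
    """
    return [
        item["metadata"]["name"]
        for item in pods_json["items"]
        if item["status"]["phase"] != "Running"
    ]

# Example with a stubbed pod list:
stub = {"items": [
    {"metadata": {"name": "grafana-agent-0"}, "status": {"phase": "Running"}},
    {"metadata": {"name": "kube-state-metrics-abc"}, "status": {"phase": "Pending"}},
]}
print(not_running(stub))  # ['kube-state-metrics-abc']
```

An empty list means the deployment has converged.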
[ Azure + GCP + AWS ] Applies to all three cloud platforms.
Critical gap: The Grafana Agent deployment does NOT automatically scrape Fabric pods. The k8s-monitoring chart collects node metrics, cluster state, and pod logs by default — but it has no built-in knowledge of the Fabric JMX Exporter endpoint. You must add Fabric metric collection explicitly after deployment.
There are two approaches. Choose based on the level of control you need.
The k8s-monitoring chart supports annotation-based autodiscovery. When enabled, any pod annotated with the scrape annotation is automatically discovered and scraped.
Add the following to your Grafana Agent values override (grafana-agent-values.yaml for Azure, or as Terraform variable overrides for GCP/AWS):
```yaml
metrics:
  autoDiscover:
    enabled: true
```
Add the following annotations to the Fabric pod spec or deployment template:
```yaml
annotations:
  k8s.grafana.com/scrape: "true"
  k8s.grafana.com/metrics.portNumber: "7170"
```
For iid_finder metrics on port 7270, configure a second scrape annotation or use Option B below for per-port control.
Note: Annotation-based autodiscovery will scrape any pod in the cluster with the scrape annotation set to true. Apply the annotation only to pods you intend to monitor and ensure metric filtering rules are in place to control volume. See How to Control Metric Volume with Filtering and Relabeling.
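To see what a keep-style allow-list does to metric volume, here is a small Python illustration using the same metric-family regex that appears in the Option B pipeline. This is illustration only — in the cluster, the filtering is done by the Agent's relabeling rules, not by Python (and Prometheus relabel regexes are fully anchored, hence `fullmatch`):

```python
import re

# Allow-list of metric families worth forwarding.
KEEP = re.compile(r"fabric_.*|jvm_.*|tomcat_.*|process_.*")

def filter_metrics(names):
    """Keep only metric names matching the allow-list regex."""
    return [n for n in names if KEEP.fullmatch(n)]

scraped = [
    "fabric_read_total",
    "jvm_memory_bytes_used",
    "jmx_config_reload_success_total",  # dropped: not on the allow-list
    "tomcat_sessions_active",
]
print(filter_metrics(scraped))
# ['fabric_read_total', 'jvm_memory_bytes_used', 'tomcat_sessions_active']
```

Anything not matching the allow-list is dropped before remote-write, which is how you keep ingestion volume under control.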
For production deployments, an explicit River pipeline gives you full control over which pods are scraped, how metrics are filtered, and what labels are applied. This is the recommended approach.
Create a River configuration file (e.g., fabric-scrape.river):
```river
// Discover Fabric pods by label
discovery.relabel "fabric_pods" {
  targets = discovery.kubernetes.pods.targets

  rule {
    source_labels = ["__meta_kubernetes_pod_label_app"]
    regex         = "fabric"
    action        = "keep"
  }

  rule {
    source_labels = ["__meta_kubernetes_pod_container_port_number"]
    regex         = "7170"
    action        = "keep"
  }
}

// Filter to useful metric families before forwarding
prometheus.relabel "fabric_filter" {
  rule {
    source_labels = ["__name__"]
    regex         = "fabric_.*|jvm_.*|tomcat_.*|process_.*"
    action        = "keep"
  }

  forward_to = [prometheus.relabel.metrics_service.receiver]
}

// Scrape Fabric pods and forward to the filter
prometheus.scrape "fabric_jmx" {
  targets    = discovery.relabel.fabric_pods.output
  job_name   = "fabric-jmx"
  forward_to = [prometheus.relabel.fabric_filter.receiver]
}
```
Adjust the label selector (app=fabric) to match the actual labels on your Fabric pods.
For Azure (values file):
```yaml
# In grafana-agent-values.yaml, add:
extraConfig: |
  <paste River config inline here>
```
Or pass it as a file during Helm upgrade:
```shell
helm upgrade grafana-k8s-monitoring . \
  --namespace grafana-agent \
  --values grafana-agent-values.yaml \
  --set-file extraConfig=fabric-scrape.river
```
For GCP and AWS (Terraform), add the River config as an additional Helm set in the grafana-agent module, or run a separate helm upgrade after the initial Terraform apply.
[ Azure + GCP + AWS ] Applies to all three cloud platforms.
kubectl get pods -n grafana-agent
Expected: all pods in Running state with no restart loops.
kubectl logs -n grafana-agent -l app.kubernetes.io/name=grafana-agent --tail=50
Look for successful remote-write requests to the Prometheus and Loki endpoints, and for the absence of authentication or connection errors.
Query your Prometheus endpoint for infrastructure metrics that should be present immediately after deployment:
# Node metrics — from prometheus-node-exporter
node_cpu_seconds_total
# Kubernetes state — from kube-state-metrics
kube_pod_status_ready
Once Fabric scraping is configured, also check:
# Fabric JVM metrics
jvm_memory_bytes_used
# Fabric product metrics
fabric_read_total
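These checks can also be scripted against the Prometheus HTTP API, whose standard instant-query endpoint is /api/v1/query. A minimal sketch that builds the query URL (the helper name is ours; in practice you would fetch it with basic-auth credentials):

```python
from urllib.parse import urlencode

def build_query_url(prometheus_base, promql):
    """Build an instant-query URL for the standard Prometheus HTTP API."""
    return f"{prometheus_base.rstrip('/')}/api/v1/query?" + urlencode({"query": promql})

url = build_query_url("https://prometheus.example.com", "fabric_read_total")
print(url)
# https://prometheus.example.com/api/v1/query?query=fabric_read_total
```

A non-empty `result` array in the JSON response means the metric is being ingested.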
In Grafana, query Loki for recent logs from the cluster:
{cluster="<YOUR_CLUSTER_NAME>"}
Pod logs should appear in Loki within about a minute of Grafana Agent starting.
Before deployment:
- Prometheus and Loki endpoint URLs and usernames are known, and a write-capable access token is available.
- The K2view mailbox ID is available.
- deploy_grafana_agent = true is set in the tfvars file.

After deployment:
- All pods in the grafana-agent namespace reach Running status.
- Infrastructure metrics (node_cpu_seconds_total, kube_pod_status_ready) are queryable in Prometheus.
- Cluster logs are queryable in Loki.

After adding Fabric scraping:
- Fabric JVM metrics (jvm_memory_bytes_used) and product metrics (fabric_read_total) are queryable in Prometheus.