Monitoring and Observability for Consul Connect Service Mesh in Kubernetes

In our previous post on Secure Communication in Kubernetes with Consul Connect and Vault Agent Injector, we established a robust service mesh foundation. Now, let's take it to the next level by implementing comprehensive monitoring and observability for your Consul Connect service mesh in Kubernetes.
Monitoring a service mesh is crucial for understanding service performance, identifying bottlenecks, troubleshooting issues, and ensuring overall system health. In this blog post, we'll explore how to implement a complete observability stack including metrics collection with Prometheus, visualization with Grafana, and distributed tracing with Jaeger for your Consul Connect-enabled Kubernetes applications.
Prerequisites: This article builds upon our previous Consul Connect and Vault Agent Injector setup. Ensure you have completed the setup from Secure Communication in Kubernetes with Consul Connect and Vault Agent Injector before proceeding.
Our observability stack integrates seamlessly with Consul Connect to provide comprehensive monitoring, metrics collection, and distributed tracing across your service mesh.
The architecture consists of four key components that work together to provide complete observability: the Envoy sidecar proxies that emit metrics and traces, Prometheus for metrics collection, Grafana for visualization, and Jaeger for distributed tracing.
First, let's enhance our existing Consul Connect setup to enable comprehensive metrics collection.
Update your consul-values.yaml to enable metrics and tracing:
# consul-values-observability.yaml
global:
  name: consul
  datacenter: dc1
  consulAPITimeout: 5s
  # Enable metrics collection
  metrics:
    enabled: true
    enableAgentMetrics: true
    agentMetricsRetentionTime: "1m"

server:
  enabled: false

client:
  enabled: false

connectInject:
  enabled: true
  default: false
  consulNode:
    meta:
      pod-name: ${HOSTNAME}
      node-name: ${NODE_NAME}
  k8sAllowNamespaces: ["*"]
  k8sDenyNamespaces: []

  # Enable metrics and tracing for Connect proxies
  metrics:
    defaultEnabled: true
    defaultEnableMerging: true
    enableGatewayMetrics: true

  # Configure Envoy proxy settings for observability
  envoyExtraArgs: |
    --log-level info
    --component-log-level upstream:debug,connection:info

  # Enable tracing
  centralConfig:
    enabled: true
    defaultProtocol: "http"
    proxyDefaults: |
      {
        "config": {
          "envoy_tracing_json": {
            "http": {
              "name": "envoy.tracers.zipkin",
              "typedConfig": {
                "@type": "type.googleapis.com/envoy.extensions.tracers.zipkin.v3.ZipkinConfig",
                "collector_cluster": "jaeger-collector",
                "collector_endpoint_version": "HTTP_JSON",
                "collector_endpoint": "/api/v1/spans",
                "shared_span_context": false
              }
            }
          }
        }
      }

controller:
  enabled: true

ui:
  enabled: false

dns:
  enabled: false

externalServers:
  enabled: true
  hosts: ["consul.example.com"]
  httpsPort: 8501
  useSystemRoots: true

# Enable Prometheus metrics scraping
prometheus:
  enabled: true
# Upgrade Consul Connect with observability features
helm upgrade consul hashicorp/consul \
  --namespace consul \
  --values consul-values-observability.yaml

# Verify the upgrade
kubectl rollout status deployment/consul-connect-injector -n consul
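After the rollout, it is worth spot-checking that an injected Envoy sidecar really serves Prometheus-format metrics on its admin port. A minimal sketch, assuming a Connect-injected pod labeled `app=dynamic-app` in the `demo` namespace (adjust the label selector and namespace to your own workload):

```shell
# Find a Connect-injected pod (label and namespace are assumptions; adjust as needed)
POD=$(kubectl get pods -n demo -l app=dynamic-app -o jsonpath='{.items[0].metadata.name}')

# Forward the Envoy admin port (19000) and pull the Prometheus-format stats endpoint
kubectl port-forward -n demo "$POD" 19000:19000 &
PF_PID=$!
sleep 2
curl -s localhost:19000/stats/prometheus | head -n 20
kill "$PF_PID"
```

If you see `envoy_*` metric lines, the sidecars are ready to be scraped.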
Deploy Prometheus with the official kube-prometheus-stack for comprehensive Kubernetes and service mesh monitoring.
# Add Prometheus community Helm repository
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Create monitoring namespace
kubectl create namespace monitoring

# Create Prometheus values for Consul Connect integration
# (note the quoted 'EOF': it keeps ${1} in the relabel config from being expanded by the shell)
cat > prometheus-values.yaml <<'EOF'
# Prometheus configuration for Consul Connect monitoring
prometheus:
  prometheusSpec:
    # Enable service mesh scraping
    additionalScrapeConfigs:
      - job_name: 'consul-connect-envoy-sidecar'
        kubernetes_sd_configs:
          - role: pod
            namespaces:
              names: ['demo', 'default']
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_container_name]
            regex: consul-connect-envoy-sidecar
            action: keep
          - source_labels: [__meta_kubernetes_pod_annotation_consul_hashicorp_com_connect_inject]
            regex: 'true'
            action: keep
          - source_labels: [__address__]
            regex: '([^:]+):.*'
            target_label: __address__
            replacement: '${1}:19000' # Envoy admin port
          - source_labels: [__meta_kubernetes_pod_name]
            target_label: instance
          - source_labels: [__meta_kubernetes_namespace]
            target_label: namespace
        metrics_path: '/stats/prometheus'

      - job_name: 'consul-external-servers'
        static_configs:
          - targets: ['consul.example.com:8500']
        metrics_path: '/v1/agent/metrics'
        params:
          format: ['prometheus']

    # Storage configuration
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: standard
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 50Gi

    # Resource configuration
    resources:
      requests:
        cpu: 500m
        memory: 2Gi
      limits:
        cpu: 1000m
        memory: 4Gi

# Grafana configuration
grafana:
  enabled: true
  adminPassword: "admin123" # Change in production
  persistence:
    enabled: true
    size: 10Gi
  resources:
    requests:
      cpu: 250m
      memory: 512Mi
    limits:
      cpu: 500m
      memory: 1Gi

  # Pre-configure Consul Connect dashboards
  dashboardProviders:
    dashboardproviders.yaml:
      apiVersion: 1
      providers:
        - name: 'consul-connect'
          orgId: 1
          folder: 'Consul Connect'
          type: file
          disableDeletion: false
          editable: true
          options:
            path: /var/lib/grafana/dashboards/consul-connect

  dashboardsConfigMaps:
    consul-connect: consul-connect-dashboards

# Node exporter for infrastructure metrics
nodeExporter:
  enabled: true

# Alertmanager for notifications
alertmanager:
  enabled: true
  alertmanagerSpec:
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 200m
        memory: 256Mi
EOF

# Install Prometheus stack
helm install prometheus-stack prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --values prometheus-values.yaml

# Wait for deployment
kubectl wait --for=condition=ready pod -l app.kubernetes.io/name=prometheus -n monitoring --timeout=300s
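Once Prometheus is up, you can confirm that the custom scrape jobs were actually picked up. A quick check, assuming the release name `prometheus-stack` used above (the generated service name may differ slightly between chart versions; `kubectl get svc -n monitoring` will show the exact name):

```shell
# Forward the Prometheus API port (service name assumed from the prometheus-stack release)
kubectl port-forward -n monitoring svc/prometheus-stack-kube-prom-prometheus 9090:9090 &
PF_PID=$!
sleep 2

# List active scrape targets; expect consul-connect-envoy-sidecar among the jobs
curl -s localhost:9090/api/v1/targets | jq -r '.data.activeTargets[].labels.job' | sort -u
kill "$PF_PID"
```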
Deploy Jaeger to capture and analyze distributed traces from your service mesh.
# Add Jaeger Helm repository
helm repo add jaegertracing https://jaegertracing.github.io/helm-charts
helm repo update

# Create Jaeger values for Consul Connect integration
cat > jaeger-values.yaml <<EOF
# Jaeger configuration for Consul Connect tracing
provisionDataStore:
  cassandra: false
  elasticsearch: true

storage:
  type: elasticsearch
  elasticsearch:
    host: jaeger-elasticsearch-master
    port: 9200

agent:
  enabled: true
  daemonset:
    useHostNetwork: true

collector:
  enabled: true
  replicaCount: 2
  service:
    type: ClusterIP
    # Expose Zipkin-compatible endpoint for Envoy
    zipkin:
      port: 9411
  resources:
    requests:
      cpu: 250m
      memory: 512Mi
    limits:
      cpu: 500m
      memory: 1Gi

query:
  enabled: true
  replicaCount: 1
  service:
    type: ClusterIP
  resources:
    requests:
      cpu: 250m
      memory: 512Mi
    limits:
      cpu: 500m
      memory: 1Gi

# Configure ingress for Jaeger UI
ingress:
  enabled: true
  hosts:
    - jaeger.local
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /

# Elasticsearch for trace storage
elasticsearch:
  enabled: true
  replicas: 1
  minimumMasterNodes: 1
  resources:
    requests:
      cpu: 500m
      memory: 2Gi
    limits:
      cpu: 1000m
      memory: 4Gi
  volumeClaimTemplate:
    storageClassName: standard
    accessModes: ["ReadWriteOnce"]
    resources:
      requests:
        storage: 30Gi
EOF

# Install Jaeger
helm install jaeger jaegertracing/jaeger \
  --namespace monitoring \
  --values jaeger-values.yaml

# Wait for Jaeger components
kubectl wait --for=condition=ready pod -l app.kubernetes.io/name=jaeger -n monitoring --timeout=300s
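With Jaeger deployed, you can reach its UI without going through the ingress by port-forwarding the query component. A minimal sketch; the deployment name `jaeger-query` and UI port 16686 are chart defaults and may vary between chart versions, so verify with `kubectl get deploy,svc -n monitoring`:

```shell
# Forward the Jaeger query UI (default container port 16686)
kubectl port-forward -n monitoring deploy/jaeger-query 16686:16686 &
PF_PID=$!
sleep 2

# List the services Jaeger has received traces for
curl -s localhost:16686/api/services | jq .
kill "$PF_PID"
```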
Create comprehensive Grafana dashboards for Consul Connect monitoring.
# Create ConfigMap with Consul Connect dashboards
kubectl create configmap consul-connect-dashboards -n monitoring --from-literal=consul-connect-overview.json='
{
  "dashboard": {
    "id": null,
    "title": "Consul Connect Service Mesh Overview",
    "tags": ["consul", "service-mesh"],
    "timezone": "browser",
    "panels": [
      {
        "title": "Service Mesh Request Rate",
        "type": "graph",
        "targets": [
          {
            "expr": "sum(rate(envoy_http_downstream_rq_total[5m])) by (service)",
            "legendFormat": "{{service}}"
          }
        ],
        "yAxes": [
          {
            "label": "Requests/sec"
          }
        ]
      },
      {
        "title": "Service Mesh Error Rate",
        "type": "graph",
        "targets": [
          {
            "expr": "sum(rate(envoy_http_downstream_rq_xx{envoy_response_code_class=\"5\"}[5m])) by (service) / sum(rate(envoy_http_downstream_rq_total[5m])) by (service)",
            "legendFormat": "{{service}} 5xx errors"
          }
        ]
      },
      {
        "title": "Service Mesh Response Times",
        "type": "graph",
        "targets": [
          {
            "expr": "histogram_quantile(0.99, sum(rate(envoy_http_downstream_rq_time_bucket[5m])) by (service, le))",
            "legendFormat": "{{service}} p99"
          },
          {
            "expr": "histogram_quantile(0.95, sum(rate(envoy_http_downstream_rq_time_bucket[5m])) by (service, le))",
            "legendFormat": "{{service}} p95"
          }
        ]
      },
      {
        "title": "Active Connections",
        "type": "graph",
        "targets": [
          {
            "expr": "sum(envoy_http_downstream_cx_active) by (service)",
            "legendFormat": "{{service}}"
          }
        ]
      },
      {
        "title": "Consul Service Health",
        "type": "stat",
        "targets": [
          {
            "expr": "sum(consul_catalog_service_count)",
            "legendFormat": "Total Services"
          },
          {
            "expr": "sum(consul_health_node_failure)",
            "legendFormat": "Failed Nodes"
          }
        ]
      }
    ],
    "refresh": "30s",
    "time": {
      "from": "now-1h",
      "to": "now"
    }
  }
}'

# Create service-specific dashboard
kubectl create configmap consul-connect-service-details -n monitoring --from-literal=service-details.json='
{
  "dashboard": {
    "title": "Consul Connect Service Details",
    "panels": [
      {
        "title": "Request Volume by Service",
        "type": "graph",
        "targets": [
          {
            "expr": "sum(rate(envoy_http_downstream_rq_total[5m])) by (envoy_http_conn_manager_prefix)",
            "legendFormat": "{{envoy_http_conn_manager_prefix}}"
          }
        ]
      },
      {
        "title": "Upstream Connection Status",
        "type": "graph",
        "targets": [
          {
            "expr": "envoy_cluster_upstream_cx_active",
            "legendFormat": "Active - {{envoy_cluster_name}}"
          },
          {
            "expr": "envoy_cluster_upstream_cx_connect_fail",
            "legendFormat": "Failed - {{envoy_cluster_name}}"
          }
        ]
      },
      {
        "title": "Certificate Expiry",
        "type": "graph",
        "targets": [
          {
            "expr": "envoy_server_days_until_first_cert_expiring",
            "legendFormat": "Days until cert expiry"
          }
        ]
      }
    ]
  }
}'
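To verify the dashboards landed in Grafana, port-forward the Grafana service and query its HTTP API with the admin password set in prometheus-values.yaml. The service name `prometheus-stack-grafana` is an assumption based on the release name used earlier; confirm it with `kubectl get svc -n monitoring`:

```shell
# Forward Grafana (the kube-prometheus-stack Grafana service listens on port 80)
kubectl port-forward -n monitoring svc/prometheus-stack-grafana 3000:80 &
PF_PID=$!
sleep 2

# Search for the Consul dashboards via the API (admin123 is the demo password from above)
curl -s -u admin:admin123 "localhost:3000/api/search?query=Consul" | jq .
kill "$PF_PID"
```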
Update your applications to fully utilize the observability stack.
# dynamic-app-observability.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dynamic-app
  namespace: demo
  labels:
    app: dynamic-app
    version: v1.0.0
spec:
  replicas: 2
  selector:
    matchLabels:
      app: dynamic-app
  template:
    metadata:
      labels:
        app: dynamic-app
        version: v1.0.0
      annotations:
        # Consul Connect annotations
        "consul.hashicorp.com/connect-inject": "true"
        "consul.hashicorp.com/connect-service": "dynamic-app"
        "consul.hashicorp.com/connect-service-port": "8080"
        "consul.hashicorp.com/connect-service-upstreams": "mysql-server:3306"

        # Enable metrics scraping
        "consul.hashicorp.com/service-metrics-enabled": "true"
        "consul.hashicorp.com/service-metrics-path": "/metrics"
        "consul.hashicorp.com/service-metrics-port": "8080"

        # Prometheus annotations for application metrics
        "prometheus.io/scrape": "true"
        "prometheus.io/port": "8080"
        "prometheus.io/path": "/metrics"

        # Vault Agent Injector annotations
        "vault.hashicorp.com/agent-inject": "true"
        "vault.hashicorp.com/agent-inject-status": "update"
        "vault.hashicorp.com/agent-inject-vault-addr": "https://vault.example.com:8200"
        "vault.hashicorp.com/role": "dynamic-app-consul"
        "vault.hashicorp.com/agent-inject-secret-config.ini": "dynamic-app/db/creds/app"
        "vault.hashicorp.com/agent-inject-template-config.ini": |
          [DEFAULT]
          LogLevel = DEBUG
          Port = 8080

          [DATABASE]
          Address = 127.0.0.1
          Port = 3306
          Database = my_app
          {{- with secret "dynamic-app/db/creds/app" }}
          User = {{ .Data.username }}
          Password = {{ .Data.password }}
          {{- end }}

          [VAULT]
          Enabled = True
          InjectToken = True
          Namespace =
          Address = https://vault.example.com:8200
          KeyPath = dynamic-app/transit
          KeyName = app

          [OBSERVABILITY]
          MetricsEnabled = True
          TracingEnabled = True
          JaegerEndpoint = http://jaeger-collector.monitoring.svc.cluster.local:14268/api/traces
    spec:
      serviceAccountName: dynamic-app-consul
      containers:
        - name: dynamic-app
          image: ghcr.io/infralovers/nomad-vault-mysql:1.0.0
          ports:
            - containerPort: 8080
              name: http
          env:
            - name: CONFIG_FILE
              value: "/vault/secrets/config.ini"
            - name: VAULT_ADDR
              value: "https://vault.example.com:8200"
            # Tracing configuration
            - name: JAEGER_ENDPOINT
              value: "http://jaeger-collector.monitoring.svc.cluster.local:14268/api/traces"
            - name: JAEGER_SERVICE_NAME
              value: "dynamic-app"
            - name: JAEGER_SAMPLER_TYPE
              value: "const"
            - name: JAEGER_SAMPLER_PARAM
              value: "1"
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 5
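Apply the manifest and confirm that both the Connect sidecar and the Vault agent were injected alongside the application container:

```shell
kubectl apply -f dynamic-app-observability.yaml

# Wait for the pods, then list each pod's containers:
# expect the app container plus the injected Consul and Vault sidecars
kubectl wait --for=condition=ready pod -l app=dynamic-app -n demo --timeout=300s
kubectl get pods -n demo -l app=dynamic-app \
  -o jsonpath='{range .items[*]}{.metadata.name}{": "}{range .spec.containers[*]}{.name}{" "}{end}{"\n"}{end}'
```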
Create ServiceMonitor resources for automatic Prometheus scraping.
# consul-connect-service-monitors.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: consul-connect-envoy-sidecars
  namespace: monitoring
  labels:
    app: consul-connect
    component: envoy-sidecar
spec:
  selector:
    matchLabels:
      service: consul-connect-proxy
  endpoints:
    - port: envoy-metrics
      path: /stats/prometheus
      interval: 30s
      scrapeTimeout: 10s
  namespaceSelector:
    matchNames:
      - demo
      - default

---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: consul-external-servers
  namespace: monitoring
  labels:
    app: consul
    component: server
spec:
  selector:
    matchLabels:
      app: consul-external
  endpoints:
    - port: http
      path: /v1/agent/metrics
      params:
        format: ["prometheus"]
      interval: 30s

---
apiVersion: v1
kind: Service
metadata:
  name: consul-external
  namespace: monitoring
  labels:
    app: consul-external
spec:
  type: ExternalName
  externalName: consul.example.com
  ports:
    - port: 8500
      name: http
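Apply the manifests and confirm that the Prometheus Operator has accepted the ServiceMonitors:

```shell
kubectl apply -f consul-connect-service-monitors.yaml

# Both ServiceMonitors should be listed; Prometheus picks them up automatically
kubectl get servicemonitors -n monitoring
```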
Create alerting rules for proactive monitoring of your service mesh.
# consul-connect-alerts.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: consul-connect-alerts
  namespace: monitoring
  labels:
    app: consul-connect
    component: alerting
spec:
  groups:
    - name: consul-connect.rules
      rules:
        - alert: ConsulConnectServiceDown
          expr: up{job="consul-connect-envoy-sidecar"} == 0
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "Consul Connect service is down"
            description: "Service {{ $labels.instance }} has been down for more than 5 minutes."

        - alert: ConsulConnectHighErrorRate
          expr: |
            (
              sum(rate(envoy_http_downstream_rq_xx{envoy_response_code_class="5"}[5m])) by (service) /
              sum(rate(envoy_http_downstream_rq_total[5m])) by (service)
            ) * 100 > 10
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "High error rate in Consul Connect service"
            description: "Service {{ $labels.service }} has an error rate of {{ $value }}% over the last 5 minutes."

        - alert: ConsulConnectHighLatency
          expr: |
            histogram_quantile(0.99,
              sum(rate(envoy_http_downstream_rq_time_bucket[5m])) by (service, le)
            ) > 2000
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "High latency in Consul Connect service"
            description: "Service {{ $labels.service }} has a 99th percentile latency of {{ $value }}ms."

        - alert: ConsulServiceUnhealthy
          expr: consul_catalog_service_count{state="critical"} > 0
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "Consul services are unhealthy"
            description: "{{ $value }} Consul services are in a critical state."

        - alert: ConsulConnectCertificateExpiringSoon
          expr: envoy_server_days_until_first_cert_expiring < 7
          for: 1h
          labels:
            severity: warning
          annotations:
            summary: "Consul Connect certificate expiring soon"
            description: "Certificate will expire in {{ $value }} days."
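Apply the rules and confirm the PrometheusRule resource was created; the operator then loads the group into Prometheus, where it appears under Status → Rules:

```shell
kubectl apply -f consul-connect-alerts.yaml

# The resource should exist; Prometheus reloads rules shortly afterwards
kubectl get prometheusrules -n monitoring consul-connect-alerts
```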
Monitor Consul Connect health checks and service registration:
# Create health check monitoring script
kubectl create configmap health-check-monitor -n demo --from-literal=monitor.sh='
#!/bin/bash
while true; do
  echo "=== Consul Service Status ==="
  curl -s "https://consul.example.com:8500/v1/agent/services" | jq .

  echo "=== Connect Proxy Status ==="
  curl -s localhost:19000/clusters | grep -E "(health_flags|priority|max_requests)"

  echo "=== Certificate Information ==="
  curl -s localhost:19000/certs | jq .

  sleep 30
done'

# Run ad-hoc health monitoring (alpine is used instead of curlimages/curl,
# which runs as a non-root user and therefore cannot install packages with apk)
kubectl run health-monitor --image=alpine:latest -n demo --rm -it --restart=Never -- sh -c "
  apk add --no-cache curl jq
  while true; do
    echo '=== Service Mesh Health Check ==='
    curl -s localhost:19000/ready && echo 'Envoy Ready' || echo 'Envoy Not Ready'
    curl -s localhost:19000/stats | grep server.state
    sleep 30
  done
"
Service Mesh Level:
- Request rate: rate(envoy_http_downstream_rq_total[5m])
- Error rate: rate(envoy_http_downstream_rq_xx[5m])
- Latency: histogram_quantile(0.95, envoy_http_downstream_rq_time_bucket)
- Active connections: envoy_http_downstream_cx_active

Infrastructure Level:
- Node CPU, memory, disk, and network metrics collected by the node exporter deployed with the Prometheus stack
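As a sanity check on the error-rate formula used throughout this post (5xx requests divided by total requests), here is the same arithmetic in plain shell with made-up sample counter deltas; the numbers are illustrative only:

```shell
# Hypothetical counter increases over a 5m window (illustrative values)
TOTAL_RQ=1200      # increase of envoy_http_downstream_rq_total
ERR_5XX=36         # increase of envoy_http_downstream_rq_xx{...class="5"}

# Error rate as a percentage, mirroring the PromQL ratio used in the alert rule
ERROR_RATE=$(awk -v e="$ERR_5XX" -v t="$TOTAL_RQ" 'BEGIN { printf "%.1f", (e/t)*100 }')
echo "5xx error rate: ${ERROR_RATE}%"   # prints "5xx error rate: 3.0%"
```

With the 10% threshold from the ConsulConnectHighErrorRate alert, this sample window would not fire.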
Implementing comprehensive observability for your Consul Connect service mesh in Kubernetes provides crucial insights into service performance, security, and health. With Prometheus metrics, Grafana dashboards, and Jaeger tracing, you can:
- Track request rates, error rates, and latencies for every service in the mesh
- Follow individual requests across service boundaries with distributed traces
- Alert proactively on service outages, error spikes, and expiring mTLS certificates
This observability stack builds upon our secure service mesh foundation, providing the visibility needed to operate Consul Connect reliably in production Kubernetes environments.
Key takeaways:
- Consul Connect's Envoy sidecars expose rich metrics on their admin port (19000) that Prometheus can scrape directly
- Envoy's Zipkin-compatible tracer integrates with Jaeger for end-to-end distributed traces
- ServiceMonitors and PrometheusRules make scraping and alerting declarative and version-controlled
Start monitoring your Consul Connect service mesh today and gain the insights needed to maintain high availability, performance, and security in your Kubernetes environment.
For implementing multi-platform service mesh architectures that span Kubernetes, Nomad, bare metal, and VMs, see our comprehensive guide on Multi-Platform Service Mesh: Connecting Kubernetes, Nomad, Bare Metal, and VMs with Consul Connect.
Are you interested in our courses, or do you simply have a question that needs answering? You can contact us at any time! We will do our best to answer all your questions.
Contact us