
Observability

This page covers collecting and visualizing Semantic Router metrics with Prometheus and Grafana. The deployment method (Docker Compose vs Kubernetes) is covered in docker-quickstart.md.


1. Metrics & Endpoints Summary

| Component | Endpoint | Notes |
| --- | --- | --- |
| Router metrics | :9190/metrics | Prometheus format (flag: --metrics-port) |
| Router health (future probe) | :8080/health | HTTP readiness/liveness candidate |
| Envoy metrics (optional) | :19000/stats/prometheus | If you enable Envoy |

Dashboard JSON: deploy/llm-router-dashboard.json.

Primary source file exposing metrics: src/semantic-router/cmd/main.go (uses promhttp).


2. Docker Compose Observability

Compose bundles: prometheus, grafana, semantic-router, (optional) envoy, mock-vllm.

Key files:

  • config/prometheus.yaml
  • config/grafana/datasource.yaml
  • config/grafana/dashboards.yaml
  • deploy/llm-router-dashboard.json

Start (with testing profile example):

CONFIG_FILE=/app/config/config.testing.yaml docker compose --profile testing up --build

Access:

Expected Prometheus targets:

  • semantic-router:9190
  • envoy-proxy:19000 (optional)
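
One way to confirm these targets are healthy without opening the UI is to read the Prometheus HTTP API endpoint /api/v1/targets and list every active target whose health is not "up". The sketch below assumes Prometheus is reachable at http://localhost:9090 (an assumption about the Compose port mapping); the helper names `down_targets` and `fetch_targets` are illustrative only.

```python
import json
import urllib.request

# Assumed Prometheus address; adjust to match your Compose port mapping.
PROMETHEUS_URL = "http://localhost:9090"


def down_targets(payload):
    """Return (scrape URL, last error) for every active target that is not up.

    `payload` is the decoded JSON body of GET /api/v1/targets.
    """
    active = payload.get("data", {}).get("activeTargets", [])
    return [
        (target.get("scrapeUrl", ""), target.get("lastError", ""))
        for target in active
        if target.get("health") != "up"
    ]


def fetch_targets(base_url=PROMETHEUS_URL):
    """Fetch the targets payload from a running Prometheus instance."""
    with urllib.request.urlopen(f"{base_url}/api/v1/targets", timeout=5) as resp:
        return json.load(resp)
```

Against a running stack, `down_targets(fetch_targets())` should return an empty list once every target has been scraped successfully.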

3. Kubernetes Observability

After applying deploy/kubernetes/, you get services:

  • semantic-router (gRPC)
  • semantic-router-metrics (metrics 9190)

3.1 Prometheus Operator (ServiceMonitor)

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: semantic-router
  namespace: semantic-router
spec:
  selector:
    matchLabels:
      app: semantic-router
      service: metrics
  namespaceSelector:
    matchNames: ["semantic-router"]
  endpoints:
    - port: metrics
      interval: 15s
      path: /metrics

Ensure the metrics Service carries a label like service: metrics. (It does in the provided manifests.)
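
The label requirement follows from how matchLabels works: every key/value pair in the selector must be present on the Service, while extra labels on the Service are ignored. A minimal sketch of that rule follows; the two Service label sets are hypothetical examples, not copied from the manifests.

```python
def selector_matches(match_labels, service_labels):
    """True when every matchLabels pair is present on the Service (logical AND)."""
    return all(service_labels.get(key) == value for key, value in match_labels.items())


# Selector taken from the ServiceMonitor in this section.
match_labels = {"app": "semantic-router", "service": "metrics"}

# Hypothetical label sets, for illustration only.
metrics_service = {"app": "semantic-router", "service": "metrics"}
grpc_service = {"app": "semantic-router"}

print(selector_matches(match_labels, metrics_service))  # True
print(selector_matches(match_labels, grpc_service))     # False
```

A Service missing the service: metrics label is skipped silently, which is why a selector mismatch usually shows up as a missing target rather than as an error.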

3.2 Plain Prometheus Static Scrape

scrape_configs:
  - job_name: semantic-router
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs:
      - source_labels: [__meta_kubernetes_service_name]
        regex: semantic-router-metrics
        action: keep
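
The keep rule above drops every discovered endpoint whose service name does not match the regex. Prometheus joins the source label values with ";" by default and anchors the regex at both ends, so a partial match is not enough. The sketch below mimics that behavior with Python's `re` module (Prometheus itself uses RE2 syntax, which is equivalent for a literal pattern like this one); `keep_target` is a hypothetical helper name.

```python
import re


def keep_target(labels, source_labels, regex, separator=";"):
    """Mimic a `keep` relabel rule.

    The source label values are joined with `separator`, and the target is
    kept only if the regex matches the whole joined string.
    """
    joined = separator.join(labels.get(name, "") for name in source_labels)
    return re.fullmatch(regex, joined) is not None


source = ["__meta_kubernetes_service_name"]
pattern = "semantic-router-metrics"

print(keep_target({source[0]: "semantic-router-metrics"}, source, pattern))  # True
print(keep_target({source[0]: "semantic-router"}, source, pattern))          # False
```

Because of the anchoring, the gRPC Service named semantic-router is dropped and only the endpoints behind semantic-router-metrics are scraped.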

3.3 Port Forward for Spot Checks

kubectl -n semantic-router port-forward svc/semantic-router-metrics 9190:9190
curl -s localhost:9190/metrics | head
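
To see which router-specific series are exposed rather than just the first few lines, the text output can be filtered by metric name. The sketch below assumes the port-forward above is active on localhost:9190; `metric_names` and `fetch_metrics` are hypothetical helper names, and the parsing is deliberately minimal (it only extracts names, not labels or values).

```python
import urllib.request


def metric_names(exposition_text, prefix="llm_"):
    """Return the sorted, unique metric names that start with `prefix`.

    Comment lines (# HELP / # TYPE) and blank lines are skipped; the name
    is everything before the first '{' or space on a sample line.
    """
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name.startswith(prefix):
            names.add(name)
    return sorted(names)


def fetch_metrics(url="http://localhost:9190/metrics"):
    """Download the raw metrics text from the forwarded port."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")
```

With the port-forward running, `metric_names(fetch_metrics())` lists the llm_ series currently exported; an empty list suggests the router has not handled any traffic yet.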

3.4 Grafana Dashboard Provision

If using kube-prometheus-stack or a Grafana sidecar:

apiVersion: v1
kind: ConfigMap
metadata:
  name: semantic-router-dashboard
  namespace: semantic-router
  labels:
    grafana_dashboard: "1"
data:
  llm-router-dashboard.json: |
    # Paste JSON from deploy/llm-router-dashboard.json

Otherwise, import the JSON manually in the Grafana UI.
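
Pasting a large dashboard into YAML by hand is error-prone, so the ConfigMap can also be generated. The sketch below builds the same manifest as a Python dictionary and prints it as JSON, which kubectl accepts in place of YAML. It assumes it is run from the repository root so that deploy/llm-router-dashboard.json resolves; the function names are hypothetical.

```python
import json


def build_dashboard_configmap(dashboard_json,
                              name="semantic-router-dashboard",
                              namespace="semantic-router"):
    """Wrap the dashboard JSON text in a ConfigMap manifest (as a dict)."""
    json.loads(dashboard_json)  # fail early if the dashboard is not valid JSON
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {
            "name": name,
            "namespace": namespace,
            # Label from the manifest above; the Grafana sidecar selects on it.
            "labels": {"grafana_dashboard": "1"},
        },
        "data": {"llm-router-dashboard.json": dashboard_json},
    }


def main(path="deploy/llm-router-dashboard.json"):
    """Print the manifest as JSON, suitable for `kubectl apply -f -`."""
    with open(path, encoding="utf-8") as handle:
        manifest = build_dashboard_configmap(handle.read())
    print(json.dumps(manifest, indent=2))
```

Calling `main()` and piping the output to `kubectl apply -f -` should create the same ConfigMap as the hand-written version.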


4. Key Metrics (Sample)

| Metric | Type | Description |
| --- | --- | --- |
| llm_category_classifications_count | counter | Number of category classification operations |
| llm_model_completion_tokens_total | counter | Tokens emitted per model |
| llm_model_routing_modifications_total | counter | Model switch / routing adjustments |
| llm_model_completion_latency_seconds | histogram | Completion latency distribution |
| process_cpu_seconds_total / process_resident_memory_bytes | standard | Runtime resource usage |

Use typical PromQL patterns:

rate(llm_model_completion_tokens_total[5m])
histogram_quantile(0.95, sum by (le) (rate(llm_model_completion_latency_seconds_bucket[5m])))
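
The second query estimates the 95th percentile from bucket counts rather than from raw observations. The sketch below is a simplified illustration of that idea: find the bucket containing the target rank, then interpolate linearly inside it. It is not Prometheus' actual implementation, and the bucket values are made up for the example.

```python
import math


def histogram_quantile(q, buckets):
    """Approximate a quantile from cumulative histogram buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs sorted by
    upper bound and ending with the +Inf bucket.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound) or count == prev_count:
                # Rank falls in the +Inf bucket (or an empty one):
                # report the previous finite upper bound.
                return prev_bound
            # Linear interpolation between the bucket's lower and upper bound.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Made-up latency buckets in seconds: (le, cumulative count).
buckets = [(0.1, 10), (0.5, 60), (1.0, 90), (math.inf, 100)]

print(round(histogram_quantile(0.5, buckets), 2))  # 0.42
print(histogram_quantile(0.95, buckets))           # 1.0
```

The 0.95 result shows why bucket boundaries matter: once the rank lands in the +Inf bucket, the estimate is capped at the highest finite bound.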

5. Troubleshooting

| Symptom | Likely Cause | Check | Fix |
| --- | --- | --- | --- |
| Target DOWN (Docker) | Service name mismatch | Prometheus /targets | Ensure semantic-router container running |
| Target DOWN (K8s) | Label/selectors mismatch | kubectl get ep semantic-router-metrics | Align labels or ServiceMonitor selector |
| No new tokens metrics | No traffic | Generate chat/completions via Envoy | Send test requests |
| Dashboard empty | Datasource URL wrong | Grafana datasource settings | Point to http://prometheus:9090 (Docker) or cluster Prometheus |
| Large 5xx spikes | Backend model unreachable | Router logs | Verify vLLM endpoints configuration |