Prometheus Metrics for the Wallet Service

The wallet service can expose an HTTP endpoint that serves application-level metrics in the Prometheus text format. These metrics can be fetched with curl, scraped by a Prometheus server, or visualized in Grafana dashboards.

📘
Note:
Metrics are off by default. Enable them by setting telemetry.enable_prom: true in wallet.yaml. When enabled, the metrics server listens on port 9090 by default (configurable via telemetry.prom_port).

The following metrics are exposed when Prometheus is enabled:

Metric Name	Dimensions	Description
wallet_http_inbound_request_total	path, status	Total inbound HTTP requests.
wallet_http_inbound_request_duration_seconds	path, status	Inbound request duration (histogram; Prometheus exposes _count, _sum, _bucket).
wallet_http_inbound_client_ip_requests_total	ip_addr	Inbound requests by client IP.
wallet_http_outbound_request_in_flight	—	Current number of outbound HTTP requests in flight.
wallet_http_outbound_request_duration_seconds	—	Outbound request duration (histogram).
wallet_http_outbound_request_total	code, method	Total outbound HTTP requests by response code and method.
wallet_http_outbound_request_dns_duration_seconds	event	DNS resolution latency (event: dns_start, dns_done).
wallet_http_outbound_request_tls_duration_seconds	event	TLS handshake latency (event: tls_handshake_start, tls_handshake_done).
gateway_listener_messages_total	event_type	Total gateway webhook messages by event type.

In addition, when Prometheus is enabled the registry includes the standard Go and process collectors (e.g. go_*, process_*).
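
To inspect these metrics without a full Prometheus setup, the exposed text can be fetched and parsed directly. The following is a minimal sketch, not the service's own tooling: the /metrics path, the localhost host name, and the sample label value "/v1/wallets" are assumptions for illustration, and the parser covers only the simple "name{labels} value" lines of the Prometheus text format.

```python
import re
import urllib.request
from typing import Dict, Tuple

Labels = Tuple[Tuple[str, str], ...]

# One sample line: metric name, optional {labels}, then the value.
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text: str) -> Dict[Tuple[str, Labels], float]:
    """Parse Prometheus text exposition into {(name, labels): value}."""
    samples: Dict[Tuple[str, Labels], float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = tuple(sorted(_LABEL.findall(raw_labels or "")))
        samples[(name, labels)] = float(value)
    return samples


def fetch_metrics(host: str = "localhost", port: int = 9090,
                  path: str = "/metrics") -> Dict[Tuple[str, Labels], float]:
    """Fetch and parse metrics. Host and path are assumed, not documented."""
    with urllib.request.urlopen(f"http://{host}:{port}{path}", timeout=5) as resp:
        return parse_metrics(resp.read().decode("utf-8"))


# Illustrative payload using metric names from the table above.
SAMPLE_TEXT = """\
# TYPE wallet_http_inbound_request_total counter
wallet_http_inbound_request_total{path="/v1/wallets",status="200"} 42
wallet_http_outbound_request_in_flight 3
"""

parsed = parse_metrics(SAMPLE_TEXT)
```

Keying each sample by its name plus sorted labels keeps series with different path or status values separate, which matches how Prometheus itself identifies a time series.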

Health Status for the Wallet Stack Services

Each service in the deployment provides a /health endpoint that reports the service's health through its HTTP status code. The steps below describe how to use this endpoint to keep the system reliable.

1. Enable Micro-Service Health Monitoring

Poll the /health endpoint of each micro-service at a regular interval. The response reflects the service's current status.

2. Decode Status Codes

The HTTP status code returned by /health indicates the health of the service. 200 OK means the service is healthy; any other code, such as 503 Service Unavailable, suggests a possible issue.
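
The probing and decoding steps above could be sketched as follows. This is an illustration only: the function names are hypothetical, the URL passed to probe_health is whatever address your service is deployed at, and the "unreachable" category for connection failures is an assumption added here, not something the services define.

```python
import urllib.error
import urllib.request
from typing import Optional


def probe_health(url: str, timeout: float = 2.0) -> Optional[int]:
    """Return the HTTP status code of a /health URL, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises for non-2xx responses; the code is still the answer.
        return err.code
    except OSError:
        # Connection refused, DNS failure, timeout, and similar errors.
        return None


def classify_health(status_code: Optional[int]) -> str:
    """Map a probe result to a coarse health state."""
    if status_code is None:
        return "unreachable"
    if status_code == 200:
        return "healthy"
    return "unhealthy"
```

A caller would typically combine the two, for example classify_health(probe_health(url)), and run that on a fixed schedule.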

3. Set up Instant Alerts

Configure alerts that notify you whenever /health returns a status code other than 200, so that issues can be resolved quickly.
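
One way to avoid a flood of duplicate notifications is to alert only when the health state changes. The sketch below assumes a hypothetical HealthAlerter class and a notify callback that you supply (for example a function that sends email or posts to a chat channel); neither is part of the wallet stack.

```python
from typing import Callable, Optional


class HealthAlerter:
    """Send a notification when /health leaves or returns to status 200."""

    def __init__(self, service: str, notify: Callable[[str], None]) -> None:
        self.service = service
        self.notify = notify
        self._last_ok = True  # assume healthy until a check says otherwise

    def observe(self, status_code: Optional[int]) -> None:
        """Record one health check result (None means no response)."""
        ok = status_code == 200
        if not ok and self._last_ok:
            self.notify(f"{self.service}: /health returned {status_code}")
        elif ok and not self._last_ok:
            self.notify(f"{self.service}: /health recovered (200)")
        self._last_ok = ok


# Usage: collect alerts in a list instead of sending them anywhere.
alerts = []
alerter = HealthAlerter("wallet", alerts.append)
for code in (200, 503, 503, 200):
    alerter.observe(code)
```

With the sequence above, one alert is raised for the first 503 and one for the recovery; the repeated 503 stays silent.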

4. Apply Auto-Restart

When a service repeatedly fails health checks, trigger an automatic restart. Tune the number of failures required to trigger the restart so that transient errors do not cause unnecessary restarts.
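
The failure threshold can be expressed as a small counter of consecutive failed checks. This is a sketch under assumptions: the RestartPolicy class and the default threshold of 3 are hypothetical, and the actual restart (for example through your process supervisor or orchestrator) is left to the caller.

```python
class RestartPolicy:
    """Signal a restart after N consecutive failed health checks."""

    def __init__(self, failure_threshold: int = 3) -> None:
        if failure_threshold < 1:
            raise ValueError("failure_threshold must be at least 1")
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0

    def record(self, healthy: bool) -> bool:
        """Record one check result; return True when a restart is due."""
        if healthy:
            self.consecutive_failures = 0  # any success resets the count
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            self.consecutive_failures = 0  # start fresh after the restart
            return True
        return False


# Usage: two failures, a success, then three failures in a row.
policy = RestartPolicy(failure_threshold=3)
decisions = [policy.record(ok) for ok in (False, False, True, False, False, False)]
```

Counting only consecutive failures means a single slow response does not restart the service, while a sustained outage still does.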

5. Connect Health Checks and Logs

Correlate health check results with service logs to gain insight into the cause of a failure and resolve problems faster.
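
A simple way to make this correlation possible is to write every health check result as a timestamped, structured log line that can later be matched against the service's own logs. The field names below (ts, service, event, status_code, healthy) and the function name are assumptions for illustration, not a format the services require.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def health_log_record(service: str, status_code: Optional[int],
                      checked_at: Optional[datetime] = None) -> str:
    """Build one JSON log line describing a health check result."""
    if checked_at is None:
        checked_at = datetime.now(timezone.utc)
    record = {
        "ts": checked_at.isoformat(),    # UTC timestamp for joining with logs
        "service": service,
        "event": "health_check",
        "status_code": status_code,      # None when the service did not respond
        "healthy": status_code == 200,
    }
    return json.dumps(record, sort_keys=True)


# Usage: a failed check at a fixed time, ready to be written to a log file.
line = health_log_record(
    "wallet", 503, datetime(2024, 1, 1, tzinfo=timezone.utc)
)
```

Searching the service logs around the ts value of the first unhealthy record is usually the quickest route to the underlying error.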

🗣️ We Are Here to Help!

Please contact us via email or support chat if you encounter an issue or bug, or if you need assistance. Include any relevant details about the problem. To request a wallet form or an Institutional Vault Approver form, please click here or contact our sales team.