Voice Registrar Service metrics and alerts
Find the metrics Voice Registrar Service exposes and the alerts defined for Voice Registrar Service.
Service | CRD or annotations? | Port | Endpoint/Selector | Metrics update interval |
---|---|---|---|---|
Voice Registrar Service | Supports both CRD and annotations | 11500 | http://<pod-ipaddress>:11500/metrics | 30 seconds |
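To spot-check a single pod by hand, the endpoint in the table above can be read directly. The following is a minimal sketch, not a Genesys-provided tool: it builds the scrape URL from the documented port and path, and parses the Prometheus text exposition format into simple tuples. The function names and the sample metric names used when testing it are hypothetical.

```python
import re
from urllib.request import urlopen

METRICS_PORT = 11500  # port listed in the table above

def metrics_url(pod_ip: str) -> str:
    """Build the scrape URL for one Voice Registrar Service pod."""
    return f"http://{pod_ip}:{METRICS_PORT}/metrics"

# A sample line looks like: name{label="value",...} 12.5 [timestamp]
_SAMPLE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')

def parse_metrics(text: str) -> list[tuple[str, dict[str, str], float]]:
    """Parse exposition text into (name, labels, value) tuples.

    Comment lines (# HELP, # TYPE) and blank lines are skipped.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue  # ignore anything that is not a sample line
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples

def fetch_metrics(pod_ip: str, timeout: float = 5.0):
    """Fetch and parse the metrics of one pod (requires network access to the pod)."""
    with urlopen(metrics_url(pod_ip), timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

Because the metrics update interval is 30 seconds, reading the endpoint more often than that is unlikely to show new values.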
Metrics
Voice Registrar Service exposes Genesys-defined, Registrar Service–specific metrics as well as some standard Kafka metrics. You can query Prometheus directly to see all the metrics that the Registrar Service exposes. The metrics listed in the following table are likely to be particularly useful. Genesys does not commit to maintaining other currently available Voice Registrar Service metrics that are not documented on this page.
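One way to query Prometheus directly is through its HTTP API instant-query endpoint, `/api/v1/query`. The sketch below assumes the metric names start with `registrar_` or `kafka_`, as the prefixes in the table on this page suggest. The base URL you pass in is specific to your deployment, and the helper names are hypothetical. Note that the `kafka_` prefix may also match metrics from other services that share the same Prometheus instance.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Selector matching every series whose name has one of the two prefixes.
REGISTRAR_SELECTOR = '{__name__=~"registrar_.*|kafka_.*"}'

def instant_query_url(base_url: str, promql: str) -> str:
    """Build a Prometheus instant-query URL for the given PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"

def metric_names(response_body: str) -> list[str]:
    """Extract the distinct metric names from an instant-query JSON response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise RuntimeError(f"Prometheus query failed: {payload.get('error')}")
    names = {series["metric"].get("__name__", "") for series in payload["data"]["result"]}
    return sorted(names)

def list_registrar_metrics(base_url: str, timeout: float = 10.0) -> list[str]:
    """List metric names matching the selector (requires access to Prometheus)."""
    url = instant_query_url(base_url, REGISTRAR_SELECTOR)
    with urlopen(url, timeout=timeout) as resp:
        return metric_names(resp.read().decode("utf-8"))
```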
Metric and description | Metric details | Indicator of |
---|---|---|
registrar_ Number of registrations. | Unit: N/A Type: counter | Traffic |
registrar_ Health level of the registrar node: -1 – fail | Unit: N/A Type: gauge | Errors |
registrar_ Time taken to process the request (ms). | Unit: milliseconds Type: histogram | Latency |
registrar_ Number of active SIP registrations. | Unit: N/A Type: gauge | Traffic |
kafka_ Consumer latency is the time difference between when the message is produced and when the message is consumed. That is, the time when the consumer received the message minus the time when the producer produced the message. | Unit: Type: histogram | Latency |
kafka_ Current Kafka consumer connection state: 0 – disconnected | Unit: Type: gauge | |
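The consumer latency definition in the table (consume time minus produce time, recorded as a histogram) can be illustrated with a small sketch. This is not the service's implementation: the class name and the bucket bounds are hypothetical, and the only behavior taken from Prometheus is that histogram buckets are cumulative, with each `le` bucket counting every observation less than or equal to its bound.

```python
from bisect import bisect_left

class LatencyHistogram:
    """Toy cumulative histogram for message latencies, in seconds."""

    def __init__(self, bounds: list[float]):
        self.bounds = sorted(bounds)
        # One counter per bound, plus a final slot for the implicit +Inf bucket.
        self.counts = [0] * (len(self.bounds) + 1)
        self.total = 0.0

    def observe(self, produced_at: float, consumed_at: float) -> float:
        """Record one message; latency is consume time minus produce time."""
        latency = consumed_at - produced_at
        # Index of the smallest bound that is >= latency ("le" semantics).
        self.counts[bisect_left(self.bounds, latency)] += 1
        self.total += latency
        return latency

    def cumulative(self) -> dict[float, int]:
        """Return cumulative counts keyed by upper bound, ending with +Inf."""
        result, running = {}, 0
        for bound, count in zip(self.bounds + [float("inf")], self.counts):
            running += count
            result[bound] = running
        return result
```

For example, with bounds of 0.1, 0.5, and 1.0 seconds, a message consumed 0.25 seconds after it was produced is counted in the 0.5, 1.0, and +Inf buckets, but not in the 0.1 bucket.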
Alerts
The following alerts are defined for Voice Registrar Service.
Alert | Severity | Description | Based on | Threshold |
---|---|---|---|---|
Kafka events latency is too high | Warning | Actions: | kafka_consumer_latency_bucket | Latency for more than 5% of messages is more than 0.5 seconds for topic {{ $labels.topic }}. |
Too many Kafka consumer failed health checks | Warning | Actions: | kafka_consumer_error_total | Health check failed more than 10 times in 5 minutes for Kafka consumer for topic {{$labels.topic}}. |
Too many Kafka consumer request timeouts | Warning | Actions: | kafka_consumer_error_total | There were more than 10 request timeouts within 5 minutes for the Kafka consumer for topic {{$labels.topic}}. |
Too many Kafka consumer crashes | Critical | Actions: | kafka_consumer_error_total | There were more than 3 Kafka consumer crashes within 5 minutes for service {{ $labels.container }}. |
Kafka not available | Critical | Kafka is not available for pod {{ $labels.pod }}. Actions: | kafka_producer_state, kafka_consumer_state | Kafka is not available for pod {{ $labels.pod }} for 5 consecutive minutes. |
Redis disconnected for 5 minutes | Warning | Actions: | redis_state | Redis is not available for pod {{ $labels.pod }} for 5 minutes. |
Redis disconnected for 10 minutes | Critical | Actions: | redis_state | Redis is not available for pod {{ $labels.pod }} for 10 minutes. |
Pod Failed | Warning | Pod {{ $labels.pod }} failed. Actions: | kube_pod_status_phase | Pod {{ $labels.pod }} is in Failed state. |
Pod Unknown state | Warning | Pod {{ $labels.pod }} is in Unknown state. Actions: | kube_pod_status_phase | Pod {{ $labels.pod }} is in Unknown state for 5 minutes. |
Pod Pending state | Warning | Pod {{ $labels.pod }} is in Pending state. Actions: | kube_pod_status_phase | Pod {{ $labels.pod }} is in Pending state for 5 minutes. |
Pod Not ready for 10 minutes | Critical | Actions: | kube_pod_status_ready | Pod {{ $labels.pod }} is in the NotReady state for 10 minutes. |
Container restarted repeatedly | Critical | Actions: | kube_pod_container_status_restarts_total | Container {{ $labels.container }} was restarted 5 or more times within 15 minutes. |
Pod CPU greater than 65% | Warning | High CPU load for pod {{ $labels.pod }}. Actions: | container_cpu_usage_seconds_total, kube_pod_container_resource_limits | Container {{ $labels.container }} CPU usage exceeded 65% for 5 minutes. |
Pod memory greater than 65% | Warning | High memory usage for pod {{ $labels.pod }}. Actions: | container_memory_working_set_bytes, kube_pod_container_resource_limits | Container {{ $labels.container }} memory usage exceeded 65% for 5 minutes. |
Pod memory greater than 80% | Critical | Critical memory usage for pod {{ $labels.pod }}. Actions: | container_memory_working_set_bytes, kube_pod_container_resource_limits | Container {{ $labels.container }} memory usage exceeded 80% for 5 minutes. |
Pod CPU greater than 80% | Critical | Critical CPU load for pod {{ $labels.pod }}. Actions: | container_cpu_usage_seconds_total, kube_pod_container_resource_limits | Container {{ $labels.container }} CPU usage exceeded 80% for 5 minutes. |
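The thresholds in the table can be read as simple arithmetic over the underlying metrics. The sketch below is one interpretation under stated assumptions, not the actual alert rule expressions, which this page does not show. It assumes that the latency alert compares the cumulative `le="0.5"` bucket against the total count, and that the CPU and memory alerts compare usage against the container limit. The function names are hypothetical, and the "for 5 minutes" duration conditions are not modeled.

```python
from typing import Optional

def slow_fraction(le_counts: dict[float, float], threshold_s: float = 0.5) -> float:
    """Fraction of messages slower than threshold_s, from cumulative bucket counts.

    le_counts maps each bucket upper bound to its cumulative count and must
    contain both threshold_s and float("inf") (the total count).
    """
    total = le_counts[float("inf")]
    if total == 0:
        return 0.0  # no messages observed, so nothing was slow
    return 1.0 - le_counts[threshold_s] / total

def latency_alert(le_counts: dict[float, float],
                  threshold_s: float = 0.5,
                  max_fraction: float = 0.05) -> bool:
    """True when more than 5% of messages took longer than 0.5 seconds."""
    return slow_fraction(le_counts, threshold_s) > max_fraction

def usage_severity(usage: float, limit: float) -> Optional[str]:
    """Map resource usage relative to its limit onto the table's severities."""
    percent = 100.0 * usage / limit
    if percent > 80:
        return "Critical"
    if percent > 65:
        return "Warning"
    return None
```

For example, if 90 of 100 messages fall in the 0.5-second bucket, 10% were slower than the threshold and the latency condition holds. A container using 700 MiB of a 1000 MiB memory limit sits at 70%, which maps to the Warning severity.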