Voice RQ Service metrics and alerts
Find the metrics Voice RQ Service exposes and the alerts defined for Voice RQ Service.
Service | CRD or annotations? | Port | Endpoint/Selector | Metrics update interval |
---|---|---|---|---|
Voice RQ Service | Supports both CRD and annotations | 12000 | http://<pod-ipaddress>:12000/metrics | 30 seconds |
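To spot-check the endpoint in the table above, you can scrape it by hand. The following is a minimal sketch, not a supported tool: it assumes the pod IP is reachable from where the script runs and that the endpoint returns the standard Prometheus text exposition format. The function names `parse_metrics` and `scrape_rq_pod` are hypothetical.

```python
from urllib.request import urlopen

RQ_METRICS_PORT = 12000  # port listed in the table above


def parse_metrics(text, prefix="rqnode_"):
    """Parse Prometheus text exposition output into {series: value}.

    Only series whose name starts with `prefix` are kept.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line is "<name>{<labels>} <value> [<timestamp>]".
        if "}" in line:
            end = line.rfind("}") + 1
            series, rest = line[:end], line[end:].split()
        else:
            parts = line.split()
            series, rest = parts[0], parts[1:]
        if series.startswith(prefix) and rest:
            samples[series] = float(rest[0])
    return samples


def scrape_rq_pod(pod_ip):
    """Fetch and parse the metrics of one pod (assumes network access)."""
    url = f"http://{pod_ip}:{RQ_METRICS_PORT}/metrics"
    with urlopen(url, timeout=5) as response:
        return parse_metrics(response.read().decode("utf-8"))
```

Calling `scrape_rq_pod("<pod-ipaddress>")` with a real pod address returns a dictionary of the `rqnode_` series the pod currently exposes.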
Metrics
You can query Prometheus directly to see all the metrics that the Voice RQ Service exposes. The following metrics are likely to be particularly useful. Genesys does not commit to maintaining other currently available Voice RQ Service metrics that are not documented on this page.
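One way to list every series the service exposes is the Prometheus HTTP API instant-query endpoint (`/api/v1/query`). The sketch below assumes a Prometheus server at a base URL you supply and assumes that all Voice RQ Service metrics share the `rqnode_` prefix shown on this page. The helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# PromQL selector matching every series whose name starts with "rqnode_".
RQ_SELECTOR = '{__name__=~"rqnode_.*"}'


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def extract_metric_names(payload):
    """Return the sorted, de-duplicated metric names in a query response."""
    if payload.get("status") != "success":
        raise RuntimeError(f"Prometheus query failed: {payload}")
    names = {
        series["metric"].get("__name__", "")
        for series in payload["data"]["result"]
    }
    return sorted(name for name in names if name)


def list_rq_metrics(base_url):
    """Query Prometheus for all rqnode_ metric names (assumes network access)."""
    with urlopen(build_query_url(base_url, RQ_SELECTOR), timeout=10) as response:
        return extract_metric_names(json.load(response))
```

For example, `list_rq_metrics("http://prometheus.example:9090")` would return the metric names known to that server; the host name here is a placeholder.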
Metric and description | Metric details | Indicator of |
---|---|---|
rqnode_ Number of clients connected. | Unit: N/A Type: gauge | Traffic |
rqnode_ Number of active streams present. | Unit: N/A Type: gauge | Traffic |
rqnode_ Number of XREAD requests received. | Unit: N/A Type: counter | Traffic |
rqnode_ Number of XADD requests received. | Unit: N/A Type: counter | Traffic |
rqnode_ Current Redis connection state. | Unit: N/A Type: gauge | Errors |
rqnode_ The number of Redis disconnects that occurred for the RQ node. | Unit: Type: counter | Errors |
rqnode_ Number of errors received from Consul during the leadership process. | Unit: N/A Type: counter | Errors |
rqnode_ Service master role is active. | Unit: N/A Type: gauge | Saturation |
rqnode_ Service backup role is active. | Unit: N/A Type: gauge | Saturation |
rqnode_ RQ latency; that is, the time duration between when an event is added to Redis and when it's read via XREAD. | Unit: Type: histogram | Latency |
rqnode_ RQ latency; that is, the time duration between when a message is received and when it's added to the list. | Unit: Type: histogram | Latency |
rqnode_ Latency caused by Redis read/write. | Unit: Type: histogram | Latency |
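The three latency metrics are histograms, which Prometheus exposes as cumulative bucket counters. As a rough illustration of how a percentile is read from such data, the sketch below estimates a quantile by linear interpolation inside the bucket that contains the target rank, similar in spirit to the PromQL `histogram_quantile()` function. The bucket boundaries in the usage note are hypothetical, because this page does not list the actual buckets or their unit.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate quantile q from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by upper bound,
    with the last entry being the +Inf bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations recorded yet
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Target falls in the +Inf bucket: report the highest
                # finite upper bound instead of interpolating.
                return prev_bound
            # Linear interpolation within the matching bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound
```

With hypothetical buckets `[(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]`, calling `estimate_quantile(0.95, buckets)` interpolates inside the bucket bounded by 0.5 and 1.0.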
Alerts
The following alerts are defined for Voice RQ Service.
Alert | Severity | Description | Based on | Threshold |
---|---|---|---|---|
Number of Redis streams is too high | Warning | Too many active sessions. | rqnode_streams | More than 10000 active streams running. |
Redis disconnected for 5 minutes | Warning | Redis is not available for the pod {{ $labels.pod }}. | redis_state | Redis is not available for the pod {{ $labels.pod }} for 5 minutes. |
Redis disconnected for 10 minutes | Critical | Redis is not available for the pod {{ $labels.pod }}. | redis_state | Redis is not available for the pod {{ $labels.pod }} for 10 minutes. |
Pod failed | Warning | Pod {{ $labels.pod }} failed. | kube_pod_status_phase | Pod {{ $labels.pod }} is in Failed state. |
Pod Unknown state | Warning | Pod {{ $labels.pod }} in Unknown state. | kube_pod_status_phase | Pod {{ $labels.pod }} in Unknown state for 5 minutes. |
Pod Pending state | Warning | Pod {{ $labels.pod }} is in the Pending state. | kube_pod_status_phase | Pod {{ $labels.pod }} is in the Pending state for 5 minutes. |
Pod not ready for 10 minutes | Critical | Pod {{ $labels.pod }} in NotReady state. | kube_pod_status_ready | Pod {{ $labels.pod }} in NotReady state for 10 minutes. |
Container restarted repeatedly | Critical | Container {{ $labels.container }} was repeatedly restarted. | kube_pod_container_status_restarts_total | Container {{ $labels.container }} was restarted 5 or more times within 15 minutes. |
Pod memory greater than 65% | Warning | High memory usage for pod {{ $labels.pod }}. | container_memory_working_set_bytes, kube_pod_container_resource_requests_memory_bytes | Container {{ $labels.container }} memory usage exceeded 65% for 5 minutes. |
Pod memory greater than 80% | Critical | Critical memory usage for pod {{ $labels.pod }}. | container_memory_working_set_bytes, kube_pod_container_resource_requests_memory_bytes | Container {{ $labels.container }} memory usage exceeded 80% for 5 minutes. |
Pod CPU greater than 65% | Warning | High CPU load for pod {{ $labels.pod }}. | container_cpu_usage_seconds_total, container_spec_cpu_period | Container {{ $labels.container }} CPU usage exceeded 65% for 5 minutes. |
Pod CPU greater than 80% | Critical | Critical CPU load for pod {{ $labels.pod }}. | container_cpu_usage_seconds_total, container_spec_cpu_period | Container {{ $labels.container }} CPU usage exceeded 80% for 5 minutes. |
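
The memory and CPU alerts pair a 65% Warning threshold with an 80% Critical threshold, each of which must hold for 5 minutes. The sketch below illustrates that logic only; it is not the shipped alert rule. It assumes the usage ratio has already been computed (for memory, presumably the working set divided by the requested memory, which this page does not spell out), and the function names are hypothetical.

```python
WARNING_RATIO = 0.65    # "greater than 65%" alerts
CRITICAL_RATIO = 0.80   # "greater than 80%" alerts
HOLD_SECONDS = 5 * 60   # threshold must be exceeded for 5 minutes


def held_above(samples, threshold, hold_seconds=HOLD_SECONDS):
    """Return True if the most recent samples stayed above `threshold`
    for at least `hold_seconds`.

    samples: list of (unix_timestamp, usage_ratio), oldest first.
    """
    if not samples:
        return False
    newest_ts = samples[-1][0]
    run_start = None
    # Walk backwards until the first sample at or below the threshold.
    for ts, ratio in reversed(samples):
        if ratio <= threshold:
            break
        run_start = ts
    return run_start is not None and newest_ts - run_start >= hold_seconds


def classify_usage(samples):
    """Map a usage history to 'Critical', 'Warning', or None."""
    if held_above(samples, CRITICAL_RATIO):
        return "Critical"
    if held_above(samples, WARNING_RATIO):
        return "Warning"
    return None
```

Checking the Critical threshold first means a container that stays above 80% is reported once at the higher severity, which mirrors how the two table rows are meant to be read together.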