Config Service metrics and alerts
Find the metrics Config Service exposes and the alerts defined for Config Service.
Service | CRD or annotations? | Port | Endpoint/Selector | Metrics update interval |
---|---|---|---|---|
Config Service | Supports both CRD and annotations | 9100 | http://<pod-ipaddress>:9100/metrics | 30 seconds |
Metrics
You can query Prometheus directly to see all the metrics that the Voice Config Service exposes. The following metrics are likely to be particularly useful. Genesys does not commit to maintaining Config Service metrics that are currently available but not documented on this page.
Metric and description | Metric details | Indicator of |
---|---|---|
config_ Number of device responses for each request. | Unit: N/A Type: counter | Traffic |
config_ Number of Tenant responses for each request. | Unit: N/A Type: counter | Traffic |
config_ Number of Get responses for each request. | Unit: N/A Type: counter | Traffic |
config_ Number of agent responses for each request. | Unit: N/A Type: counter | Traffic |
config_ Current Redis connection state: -1 – error | Unit: N/A Type: gauge | Errors |
service_ Displays the version of Voice Config Service that is currently running. In the case of this metric, the labels provide the important information. The metric value is always 1 and does not provide any information. | Unit: N/A Type: gauge | |
config_ Health level of the config node: -1 – error | Unit: N/A Type: gauge | Errors |
config_ Generic error during health check. | Unit: N/A Type: gauge | |
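Querying Prometheus directly can be sketched with its HTTP API instant-query endpoint (`/api/v1/query`). The snippet below is a minimal illustration, not a Genesys tool: the Prometheus base URL, the helper names, and the presence of a `pod` label on the queried gauge are assumptions, and `redis_state` is used only because the alerts on this page are based on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def instant_query_url(prometheus_url, promql):
    """Build the URL of a Prometheus instant query."""
    return f"{prometheus_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def vector_samples(payload):
    """Turn an instant-query JSON payload into (labels, value) pairs."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Each result item carries its labels and a [timestamp, "value"] pair.
    return [(item["metric"], float(item["value"][1])) for item in data["result"]]


def run_query(prometheus_url, promql, timeout=10.0):
    """Run an instant query against a reachable Prometheus server."""
    with urlopen(instant_query_url(prometheus_url, promql), timeout=timeout) as resp:
        return vector_samples(json.load(resp))


def pods_in_error(samples):
    """Pods whose gauge reports -1, the value documented as the error state.

    Assumes each series has a `pod` label (hypothetical for this sketch).
    """
    return sorted(labels.get("pod", "<unknown>") for labels, value in samples if value == -1)
```

For example, `pods_in_error(run_query("http://prometheus.example:9090", "redis_state"))` would list the pods currently reporting a Redis error, assuming that server address and label layout.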
Alerts
The following alerts are defined for Config Service.
Alert | Severity | Description | Based on | Threshold |
---|---|---|---|---|
Redis disconnected for 5 minutes | Warning | Actions: | redis_state | Redis is not available for pod {{ $labels.pod }} for 5 minutes. |
Redis disconnected for 10 minutes | Critical | Actions: | redis_state | Redis is not available for the pod {{ $labels.pod }} for 10 minutes. |
Pod Failed | Warning | Actions: | kube_pod_status_phase | Pod failed {{ $labels.pod }}. |
Pod Unknown state | Warning | Actions: | kube_pod_status_phase | Pod {{ $labels.pod }} is in Unknown state for 5 minutes. |
Pod Pending state | Warning | Actions: | kube_pod_status_phase | Pod {{ $labels.pod }} is in Pending state for 5 minutes. |
Pod Not ready for 10 minutes | Critical | Actions: | kube_pod_status_ready | Pod {{ $labels.pod }} is in NotReady state for 10 minutes. |
Container restarted repeatedly | Critical | Actions: | kube_pod_container_status_restarts_total | Container {{ $labels.container }} was restarted 5 or more times within 15 minutes. |
Pod memory greater than 65% | Warning | High memory usage for pod {{ $labels.pod }}. Actions: | container_memory_working_set_bytes, kube_pod_container_resource_requests_memory_bytes | Container {{ $labels.container }} memory usage exceeded 65% for 5 minutes. |
Pod memory greater than 80% | Critical | Critical memory usage for pod {{ $labels.pod }}. Actions: | container_memory_working_set_bytes, kube_pod_container_resource_requests_memory_bytes | Container {{ $labels.container }} memory usage exceeded 80% for 5 minutes. |
Pod CPU greater than 65% | Warning | High CPU load for pod {{ $labels.pod }}. Actions: | container_cpu_usage_seconds_total, container_spec_cpu_period | Container {{ $labels.container }} CPU usage exceeded 65% for 5 minutes. |
Pod CPU greater than 80% | Critical | Critical CPU load for pod {{ $labels.pod }}. Actions: | container_cpu_usage_seconds_total, container_spec_cpu_period | Container {{ $labels.container }} CPU usage exceeded 80% for 5 minutes. |
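The memory thresholds and the "for 5 minutes" wording above can be illustrated with a small Python sketch. It is not the actual alert rule, which this page does not show. It assumes that memory usage is computed as the working set divided by the memory request, and that a threshold must be exceeded at every sample across the whole window before an alert fires. All function names are hypothetical.

```python
def usage_percent(used_bytes, requested_bytes):
    """Memory usage as a percentage of the container's memory request (assumed formula)."""
    if requested_bytes <= 0:
        raise ValueError("requested_bytes must be positive")
    return 100.0 * used_bytes / requested_bytes


def severity(percent, warning=65.0, critical=80.0):
    """Map a usage percentage to the severity names used in the alerts table."""
    if percent > critical:
        return "Critical"
    if percent > warning:
        return "Warning"
    return None


def held_for(samples, predicate, seconds):
    """Return True if `predicate` has held without interruption for `seconds`.

    `samples` is a list of (timestamp_seconds, value) pairs sorted by time.
    Any sample that fails the predicate resets the streak.
    """
    if not samples:
        return False
    streak_start = None
    for timestamp, value in samples:
        if predicate(value):
            if streak_start is None:
                streak_start = timestamp
        else:
            streak_start = None
    return streak_start is not None and samples[-1][0] - streak_start >= seconds
```

With one sample per minute, `held_for(samples, lambda p: p > 65.0, 300)` becomes true only once six consecutive samples spanning five minutes all exceed 65%. A single dip below the threshold restarts the count, which is why short spikes do not trigger these alerts.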