Difference between revisions of "VM/Current/VMPEGuide/VoiceDialPlanServiceMetrics"
Revision as of 22:17, February 23, 2022
Find the metrics Dial Plan Service exposes and the alerts defined for Dial Plan Service.
Service | CRD or annotations? | Port | Endpoint/Selector | Metrics update interval |
---|---|---|---|---|
Dial Plan Service | Supports both CRD and annotations | 8800 | http://<pod-ipaddress>:8800/metrics | 30 seconds |
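As a rough illustration of reading this endpoint by hand, the following Python sketch fetches the exposition text from port 8800 and parses it into a dictionary. It is a minimal sketch, not Genesys tooling: the function names `fetch_metrics_text` and `parse_samples` are hypothetical, and the parser assumes that samples carry no trailing timestamp.

```python
import urllib.request


def fetch_metrics_text(pod_ip, port=8800, timeout=5.0):
    """Fetch the raw text served at http://<pod-ipaddress>:8800/metrics."""
    url = f"http://{pod_ip}:{port}/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_samples(text):
    """Map each sample (metric name plus label set) to its float value.

    Assumption: no sample line ends with an optional timestamp, so the
    value is always the last space-separated token on the line.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        key, _, value = line.rpartition(" ")
        samples[key] = float(value)
    return samples
```

Calling `parse_samples(fetch_metrics_text("<pod-ipaddress>"))` from inside the cluster would give one entry per exposed series. In practice Prometheus scrapes the endpoint for you, so this is mainly useful for spot checks.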
Metrics
You can query Prometheus directly to see all the metrics that the Voice Dial Plan Service exposes. The following metrics are likely to be particularly useful. Genesys does not commit to maintaining other currently available Dial Plan Service metrics that are not documented on this page.
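One way to query Prometheus directly is through its HTTP API instant-query endpoint, `/api/v1/query`. The sketch below builds such a request in Python. The base URL `http://prometheus.example:9090` and the helper names are assumptions for illustration; substitute the address of your own Prometheus server.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(prometheus_base, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    query = urllib.parse.urlencode({"query": promql})
    return f"{prometheus_base.rstrip('/')}/api/v1/query?{query}"


def instant_query(prometheus_base, promql, timeout=5.0):
    """Run an instant query and return the list of result series."""
    url = build_query_url(prometheus_base, promql)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = json.load(resp)
    if body.get("status") != "success":
        raise RuntimeError(f"Prometheus query failed: {body}")
    return body["data"]["result"]


# Selector matching every series whose name starts with "dialplan_".
ALL_DIALPLAN_SERIES = '{__name__=~"dialplan_.*"}'

# Hypothetical server address; not part of the product documentation.
EXAMPLE_URL = build_query_url("http://prometheus.example:9090", ALL_DIALPLAN_SERIES)
```

Passing `ALL_DIALPLAN_SERIES` to `instant_query` should list every series with the `dialplan_` prefix, including any that are not documented on this page.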
Metric and description | Metric details | Indicator of |
---|---|---|
dialplan_ Aggregated health level of the dialplan node for dependent services such as Redis and the Envoy sidecar connection: -1 – fail | Unit: N/A; Type: gauge | Health |
dialplan_ Current Redis connection state: 0 – disconnected | Unit: N/A; Type: gauge | Health |
dialplan_ Number of dialplan requests received. | Unit: N/A; Type: counter | Traffic |
dialplan_ The number of Dial Plan failure responses. | Unit: N/A; Type: counter | Traffic |
dialplan_ Dialplan request processing duration histogram, in ms. | Unit: milliseconds; Type: histogram | Latency |
dialplan_ Redis fetch latency, measured in milliseconds. | Unit: milliseconds; Type: histogram | Latency |
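The two latency metrics are histograms, so percentiles have to be estimated from cumulative bucket counts rather than read off directly. The following Python sketch shows one common way to do that, using linear interpolation inside the bucket that contains the requested rank (similar in spirit to the PromQL `histogram_quantile` function). The bucket boundaries and counts in the example are made up for illustration and are not the service's actual buckets.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: iterable of (upper_bound, cumulative_count) pairs that
    includes a final bucket with upper bound math.inf.
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The rank falls in the overflow bucket; report the
                # highest finite bound instead of infinity.
                return prev_bound
            if count == prev_count:
                return bound
            # Interpolate linearly between the bucket's two bounds.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical buckets in milliseconds: (upper bound, cumulative count).
EXAMPLE_BUCKETS = [(100, 50), (250, 80), (500, 95), (1000, 99), (math.inf, 100)]
P95_MS = histogram_quantile(0.95, EXAMPLE_BUCKETS)
```

With these example buckets the estimated 95th percentile lands at the 500 ms bucket boundary. The estimate is only as precise as the bucket layout, so values near a threshold should be read with some caution.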
Alerts
The following alerts are defined for Dial Plan Service.
Alert | Severity | Description | Based on | Threshold |
---|---|---|---|---|
DialPlan processing time > 0.5 seconds | Warning | Actions: | dialplan_response_time | When the latency for 95% of the dial plan messages is more than 0.5 seconds for a duration of 5 minutes, then this warning alarm is raised for the {{ $labels.container }}. |
DialPlan processing time > 2 seconds | Critical | Actions: | dialplan_response_time | When the latency for 95% of the dial plan messages is more than 2 seconds for a duration of 5 minutes, then this critical alarm is raised for the {{ $labels.container }}. |
Aggregated service health failing for 5 minutes | Critical | Actions: | dialplan_health_level | Dependent services or the Envoy sidecar is not available for 5 minutes in the pod {{ $labels.pod }}. |
Redis disconnected for 5 minutes | Warning | Actions: | redis_state | Redis is not available for the pod {{ $labels.pod }} for 5 minutes. |
Redis disconnected for 10 minutes | Critical | Actions: | redis_state | Redis is not available for the pod {{ $labels.pod }} for 10 minutes. |
Pod Failed | Warning | Actions: | kube_pod_status_phase | Pod {{ $labels.pod }} failed. |
Pod Unknown state | Warning | Actions: | kube_pod_status_phase | Pod {{ $labels.pod }} is in Unknown state for 5 minutes. |
Pod Pending state | Warning | Actions: | kube_pod_status_phase | Pod {{ $labels.pod }} is in the Pending state for 5 minutes. |
Pod Not ready for 10 minutes | Critical | Actions: | kube_pod_status_ready | Pod {{ $labels.pod }} is in the NotReady state for 10 minutes. |
Pod memory greater than 65% | Warning | High memory usage for pod {{ $labels.pod }}. Actions: | container_memory_working_set_bytes, kube_pod_container_resource_limits | Container {{ $labels.container }} memory usage exceeded 65% for 5 minutes. |
Pod memory greater than 80% | Critical | Critical memory usage for pod {{ $labels.pod }}. Actions: | container_memory_working_set_bytes, kube_pod_container_resource_limits | Container {{ $labels.container }} memory usage exceeded 80% for 5 minutes. |
Pod CPU greater than 65% | Warning | High CPU load for pod {{ $labels.pod }}. Actions: | container_cpu_usage_seconds_total, kube_pod_container_resource_limits | Container {{ $labels.container }} CPU usage exceeded 65% for 5 minutes. |
Pod CPU greater than 80% | Critical | Critical CPU load for pod {{ $labels.pod }}. Actions: | container_cpu_usage_seconds_total, kube_pod_container_resource_limits | Container {{ $labels.container }} CPU usage exceeded 80% for 5 minutes. |
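The memory and CPU alerts above compare usage against the container's resource limit and fire only when the breach lasts for the stated duration. The Python sketch below approximates that logic for illustration. It is not the actual alert rule definition: the function names are hypothetical, and the real rules are evaluated by Prometheus, whose timing behaviour may differ in detail.

```python
def classify_usage(used, limit):
    """Return the alert severity for a usage/limit pair, or None."""
    pct = 100.0 * used / limit
    if pct > 80:
        return "Critical"
    if pct > 65:
        return "Warning"
    return None


def sustained_above(samples, threshold_pct, duration_s):
    """Check whether usage stayed above a threshold for duration_s.

    samples: list of (timestamp_seconds, usage_percent), oldest first.
    Returns True only if every sample in the trailing run exceeds the
    threshold and that run spans at least duration_s seconds.
    """
    if not samples:
        return False
    latest_ts = samples[-1][0]
    breach_start = None
    # Walk backwards from the newest sample until the breach ends.
    for ts, pct in reversed(samples):
        if pct <= threshold_pct:
            break
        breach_start = ts
    return breach_start is not None and latest_ts - breach_start >= duration_s


# Eleven samples at the 30-second update interval cover 5 minutes.
EXAMPLE_SAMPLES = [(t, 70.0) for t in range(0, 301, 30)]
```

For example, `sustained_above(EXAMPLE_SAMPLES, 65, 300)` holds for a container that sat at 70% of its limit for the full 5 minutes, which corresponds to the Warning threshold but not the Critical one.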