Voice Platform Reporting Server metrics and alerts

This topic is part of the manual Genesys Voice Platform Private Edition Guide for version Current of Genesys Voice Platform.


Find the metrics Voice Platform Reporting Server exposes and the alerts defined for Voice Platform Reporting Server.

Service: Voice Platform Reporting Server
CRD or annotations?: ServiceMonitor / PodMonitor
Port: 9116
Metrics update interval: (not listed)
Endpoint/Selector:

Metrics endpoint:
curl -v "http://<RS_POD_IP>:9116/snmp?target=127.0.0.1%3A1161&module=if_mib"
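
The query string in the curl command above carries the SNMP target and module, with the colon in the target percent-encoded as %3A. A minimal Python sketch that builds the same URL is shown here; the helper name `build_metrics_url` and the sample pod IP are hypothetical and only serve the illustration.

```python
from urllib.parse import urlencode


def build_metrics_url(pod_ip: str, port: int = 9116,
                      target: str = "127.0.0.1:1161",
                      module: str = "if_mib") -> str:
    """Return the scrape URL for the RS metrics endpoint (hypothetical helper)."""
    # urlencode percent-encodes the colon in the target as %3A.
    query = urlencode({"target": target, "module": module})
    return f"http://{pod_ip}:{port}/snmp?{query}"


if __name__ == "__main__":
    # 10.0.0.5 is a placeholder pod IP, not a value from this guide.
    print(build_metrics_url("10.0.0.5"))
```

The defaults mirror the port, path, target, and module values used in the settings on this page.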

Enabling metrics:

Service/Pod Monitoring Settings

prometheus:
  enabled: true
  metric:
    port: 9116

Enable for Prometheus operator

podMonitor:
  enabled: true
  metric:
    path: /snmp
    module: [ if_mib ]
    target: [ 127.0.0.1:1161 ]

monitoring:
  prometheusRulesEnabled: true
  grafanaEnabled: true

monitor:
  monitorName: gvp-monitoring

See details about the metrics and alerts in the sections that follow.

Metrics

Each metric is listed with its description, an example series, its details (unit, type, label, sample value), and what it is an indicator of.

rsQueueName

The name of the message queue.

Example: rsQueueName{gvpConfigDBID="172",rsQueueIndex="1",rsQueueName="rs.queue.remote_cdr.rm"}

Unit: DisplayString
Type: Gauge
Label:
Sample value: 1
Indicator of: Information

rsQueueSize

The size of the message queue.

Example: rsQueueSize{gvpConfigDBID="172",rsQueueIndex="1"}

Unit: Unsigned32
Type: Gauge
Label:
Sample value: 1
Indicator of: Traffic

rsDequeueCount

The dequeue count of the message queue.

Example: rsDequeueCount{gvpConfigDBID="172",rsQueueIndex="1"}

Unit: Counter64
Type: Counter
Label:
Sample value: 0
Indicator of: Traffic

rsEnqueueCount

The enqueue count of the message queue.

Example: rsEnqueueCount{gvpConfigDBID="172",rsQueueIndex="1"}

Unit: Counter64
Type: Counter
Label:
Sample value: 4
Indicator of: Traffic

rsUptime

The time (in hundredths of a second) since the RS was started.

Unit: Unsigned32
Type: Gauge
Label:
Sample value: 30619972
Indicator of: Information
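
The raw values of these metrics are easier to read after a small conversion. The following Python sketch converts rsUptime from hundredths of a second into a duration, and estimates how the queue backlog changed between two scrapes from the rsEnqueueCount and rsDequeueCount counters. The function names are hypothetical, and two points are assumptions rather than documented behavior: that the backlog changes by enqueues minus dequeues, and that a counter value lower than the previous one means the counter restarted from zero.

```python
from datetime import timedelta


def rs_uptime(ticks: int) -> timedelta:
    """Convert an rsUptime value (hundredths of a second) to a timedelta."""
    # One tick is a hundredth of a second, that is 10 milliseconds.
    return timedelta(milliseconds=ticks * 10)


def counter_delta(previous: int, current: int) -> int:
    """Increase of a counter between two scrapes."""
    # Assumption: a lower current value means the counter restarted from zero.
    return current if current < previous else current - previous


def net_queue_change(enq_prev: int, enq_now: int,
                     deq_prev: int, deq_now: int) -> int:
    """Enqueues minus dequeues between two scrapes (assumed backlog change)."""
    return counter_delta(enq_prev, enq_now) - counter_delta(deq_prev, deq_now)


if __name__ == "__main__":
    # 30619972 is the sample rsUptime value listed above.
    print(rs_uptime(30619972))
    # The counter readings below are illustrative, not taken from this guide.
    print(net_queue_change(4, 10, 0, 2))
```

With the sample value 30619972, the uptime works out to a little over three and a half days. A positive result from `net_queue_change` suggests the queue is filling faster than it drains.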


Alerts

The following alerts are defined for Voice Platform Reporting Server.

Each alert is listed with its severity, description, the metrics it is based on, and its threshold.

PodStatusNotReady
Severity: CRITICAL
Description: Triggers when the RS pod status is Not Ready for 30 minutes. This is controlled through the override-value.yaml file.
Based on: kube_pod_status_ready
Threshold: 30 minutes


ContainerRestartedRepeatedly
Severity: CRITICAL
Description: Triggers when the RS or RS SNMP container restarts 5 or more times within 15 minutes.
Based on: kube_pod_container_status_restarts_total
Threshold: 15 minutes


InitContainerFailingRepeatedly
Severity: CRITICAL
Description: Triggers when the RS init container fails 5 or more times within 15 minutes.
Based on: kube_pod_init_container_status_restarts_total
Threshold: 15 minutes


ContainerCPUreached80percent
Severity: HIGH
Description: Triggers when the RS container CPU utilization stays above 80% for 15 minutes.
Based on: container_cpu_usage_seconds_total, container_spec_cpu_quota, container_spec_cpu_period
Threshold: 15 minutes


ContainerMemoryUsage80percent
Severity: HIGH
Description: Triggers when the RS container memory utilization stays above 80% for 15 minutes.
Based on: container_memory_usage_bytes, kube_pod_container_resource_limits_memory_bytes
Threshold: 15 minutes


RSQueueSizeCritical
Severity: HIGH
Description: Triggers when the RS JMS message queue size stays above 15000 (a backlog of approximately 3 GB) for 15 minutes.
Based on: rsQueueSize
Threshold: 15 minutes


PVC50PercentFilled
Severity: HIGH
Description: Triggers when the RS PVC is 50% filled.
Based on: kubelet_volume_stats_used_bytes, kubelet_volume_stats_capacity_bytes
Threshold: 15 minutes


PVC80PercentFilled
Severity: CRITICAL
Description: Triggers when the RS PVC is 80% filled.
Based on: kubelet_volume_stats_used_bytes, kubelet_volume_stats_capacity_bytes
Threshold: 5 minutes
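
The alert conditions above combine the listed metrics in a few recurring ways: a ratio against a limit (CPU, memory, PVC) and a value that stays beyond a threshold for a period (queue size). The Python sketch below illustrates that arithmetic. It is not the rule definition shipped with the product: the function names are hypothetical, the comparison operators (strictly above, or at least) are assumptions, and `pvc_alert` ignores the duration thresholds and reports only the more severe of the two PVC alerts.

```python
from typing import Optional, Sequence, Tuple


def cpu_utilization_percent(usage_delta_s: float, interval_s: float,
                            quota_us: float, period_us: float) -> float:
    """CPU use as a percentage of the container limit.

    usage_delta_s: increase of container_cpu_usage_seconds_total over the interval.
    quota_us / period_us: container_spec_cpu_quota / container_spec_cpu_period.
    """
    if interval_s <= 0 or quota_us <= 0 or period_us <= 0:
        raise ValueError("interval, quota and period must be positive")
    cores_used = usage_delta_s / interval_s   # average cores consumed
    cores_limit = quota_us / period_us        # cores allowed by the limit
    return 100.0 * cores_used / cores_limit


def sustained_above(samples: Sequence[Tuple[float, float]],
                    threshold: float, duration_s: float) -> bool:
    """True if the value has stayed above threshold for duration_s up to the
    latest sample. samples are (timestamp_s, value) pairs sorted by time."""
    breach_start = None
    for ts, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = ts
        else:
            breach_start = None  # any dip resets the waiting period
    if breach_start is None:
        return False
    return samples[-1][0] - breach_start >= duration_s


def pvc_alert(used_bytes: float, capacity_bytes: float) -> Optional[str]:
    """Name of the most severe PVC alert matching the fill level, if any."""
    if capacity_bytes <= 0:
        raise ValueError("capacity must be positive")
    filled = 100.0 * used_bytes / capacity_bytes
    if filled >= 80:
        return "PVC80PercentFilled"
    if filled >= 50:
        return "PVC50PercentFilled"
    return None


if __name__ == "__main__":
    # Illustrative readings, not values from this guide.
    print(cpu_utilization_percent(120.0, 300.0, 50000, 100000))
    queue = [(0, 16000), (300, 16000), (600, 16000), (900, 16000)]
    print(sustained_above(queue, threshold=15000, duration_s=900))
    print(pvc_alert(used_bytes=6 * 1024**3, capacity_bytes=10 * 1024**3))
```

For example, `sustained_above` can be called with rsQueueSize samples, a threshold of 15000, and a duration of 900 seconds to mirror the RSQueueSizeCritical description.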