Voice Platform Resource Manager metrics and alerts

This topic is part of the manual Genesys Voice Platform Private Edition Guide for version Current of Genesys Voice Platform.


Find the metrics Voice Platform Resource Manager exposes and the alerts defined for Voice Platform Resource Manager.

Service: Voice Platform Resource Manager
CRD or annotations?: ServiceMonitor / PodMonitor
Port: 9116, 8200
Metrics update interval:
Endpoint/Selector:

Metrics endpoints:
curl -v "http://<RM_POD_IP>:9116/snmp?target=127.0.0.1%3A1161&module=if_mib"
curl -v "http://<RM_POD_IP>:8200/log"
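
The two endpoint URLs above can also be assembled in a script, for example for a quick health check. The following is a minimal sketch, not part of the product: the helper names `snmp_metrics_url` and `log_metrics_url` and the pod IP `10.0.0.1` are hypothetical, while the ports, paths, and query parameters are taken from the curl commands above.

```python
from urllib.parse import urlencode


def snmp_metrics_url(pod_ip: str, port: int = 9116,
                     target: str = "127.0.0.1:1161",
                     module: str = "if_mib") -> str:
    """Build the /snmp metrics URL (hypothetical helper)."""
    # urlencode percent-encodes the colon in the target as %3A,
    # matching the URL used in the curl command.
    query = urlencode({"target": target, "module": module})
    return f"http://{pod_ip}:{port}/snmp?{query}"


def log_metrics_url(pod_ip: str, port: int = 8200) -> str:
    """Build the /log metrics URL (hypothetical helper)."""
    return f"http://{pod_ip}:{port}/log"


if __name__ == "__main__":
    # 10.0.0.1 is a placeholder; substitute the real RM pod IP.
    print(snmp_metrics_url("10.0.0.1"))
    print(log_metrics_url("10.0.0.1"))
```

Passing the query parameters through `urlencode` avoids hand-escaping the `target` value.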

Enable metrics:

Service/Pod Monitoring Settings
prometheus:
  enabled: true
  metric:
    port: 9116
  log:
    port: 8200
Enable for Prometheus operator
podMonitor:
  enabled: true
  metric:
    path: /snmp
    module: [ if_mib ]
    target: [ 127.0.0.1:1161 ]
  log:
    path: /log
monitoring:
  prometheusRulesEnabled: true
  grafanaEnabled: true
monitor:
  monitorName: gvp-monitoring
  logFilePrefixName: RM

Metrics

rmTotal5xxInviteSent

Number of 5xx responses received for INVITE requests sent by RM.

Unit: Unsigned32
Type: Gauge
Label:
Sample value: 514
Indicator of: Error

rmTotal4xxInviteSent

Number of 4xx responses received for INVITE requests sent by RM.

Unit: Unsigned32
Type: Gauge
Label:
Sample value: 10
Indicator of: Error

rmPRStatus

GVP_RM_PhysicalResourceTable.Status: Current state of the resource.

Example: rmPRStatus{gvpConfigDBID="174",rmPRName="mcp-10.206.5.89",rmPRStatus="AVAILABLE"}

Unit: DisplayString
Type: Gauge
Label:
Sample value: 1
Indicator of: Information

rmPRActiveCalls

GVP_RM_PhysicalResourceTable.ActiveCalls: Number of calls currently handled by the resource.

Example: rmPRActiveCalls{gvpConfigDBID="174",rmPRName="mcp-10.206.5.89"}

Unit: Unsigned32
Type: Gauge
Label:
Sample value:
Indicator of: Traffic

rmPRTotalCalls

GVP_RM_PhysicalResourceTable.TotalCalls: Total number of calls handled by this resource since it was connected to RM.

Example: rmPRTotalCalls{gvpConfigDBID="174",rmPRName="mcp-10.206.5.89"}

Unit: Unsigned32
Type: Gauge
Label:
Sample value: 150
Indicator of: Traffic

rmTenantCurrentInboundCalls

Number of active inbound calls that use this tenant.

Unit: Unsigned32
Type: Gauge
Label:
Sample value: 2
Indicator of: Traffic

rmTenantPeakCalls

Maximum number of concurrent calls to this tenant since it became active.

Unit:
Type: Gauge
Label:
Sample value: 10
Indicator of: Traffic
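
When checking the scraped output by hand, it can help to split a sample line into its metric name, labels, and value. The sketch below assumes the endpoint returns lines in the Prometheus text exposition format, such as `rmPRStatus{...} 1`. The function name `parse_sample` is hypothetical, and the parser is deliberately simple: it ignores comment lines, timestamps, and escaped quotes inside label values.

```python
import re

# metric_name, optional {labels}, whitespace, value
_SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)$'
)
_LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_sample(line: str):
    """Return (name, labels, value) for one sample line (hypothetical helper)."""
    match = _SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a sample line: {line!r}")
    # Label pairs look like key="value", separated by commas.
    labels = dict(_LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


if __name__ == "__main__":
    line = ('rmPRStatus{gvpConfigDBID="174",'
            'rmPRName="mcp-10.206.5.89",rmPRStatus="AVAILABLE"} 1')
    name, labels, value = parse_sample(line)
    print(name, labels["rmPRName"], labels["rmPRStatus"], value)
```

For rmPRStatus, the resource state is carried in the `rmPRStatus` label rather than in the numeric value, so the label is the field to inspect.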


Alerts

The following alerts are defined for Voice Platform Resource Manager.

RM Service Down

Severity: CRITICAL
Description: RM pods are not in the ready state and the RM service is not available.
Based on: kube_pod_container_status_running
Threshold: 0

InitContainerFailingRepeatedly

Severity: CRITICAL
Description: Triggers when the RM init container fails 5 or more times within 15 minutes.
Based on: kube_pod_init_container_status_restarts_total
Threshold: 15 mins

ContainerRestartedRepeatedly

Severity: CRITICAL
Description: Triggers when the RM or RM SNMP container restarts 5 or more times within 15 minutes.
Based on: kube_pod_container_status_restarts_total
Threshold: 15 mins

PodStatusNotReady

Severity: CRITICAL
Description: Triggers when the RM pod status is Not Ready for 30 minutes. This is controlled by override-value.yaml.
Based on: kube_pod_status_ready
Threshold: 30 mins

RMTotal5XXErrorForINVITE

Severity: HIGH
Description: RM MIB counter statistics are collected every 30 seconds. Triggers when the MIB counter total5xxInviteSent increases by 5 within 5 minutes.
Based on: rmTotal5xxInviteSent
Threshold: 5 mins

RMTotal4XXErrorForINVITE

Severity: MEDIUM
Description: RM MIB counter statistics are collected every 60 seconds. Triggers when the MIB counter total4xxInviteSent increases by 10 within 60 seconds.
Based on: rmTotal4xxInviteSent
Threshold: 1 min

RMInterNodeConnectivityBroken

Severity: HIGH
Description: Inter-node connectivity between RM nodes is lost for 5 minutes.
Based on: gvp_rm_log_parser_warn_total
Threshold: 5 mins

RMConfigServerConnectionLost

Severity: HIGH
Description: RM lost its connection to GVP Configuration Server for 5 minutes.
Based on: gvp_rm_log_parser_warn_total
Threshold: 5 mins

RMSocketInterNodeError

Severity: HIGH
Description: RM inter-node socket error for 5 minutes.
Based on: gvp_rm_log_parser_eror_total
Threshold: 5 mins

ContainerCPUreached80percentForRM0

Severity: HIGH
Description: Triggers when the RM container CPU utilization exceeds 80% for 15 minutes.
Based on: container_cpu_usage_seconds_total, container_spec_cpu_quota, container_spec_cpu_period
Threshold: 15 mins

ContainerCPUreached80percentForRM1

Severity: HIGH
Description: Triggers when the RM container CPU utilization exceeds 80% for 15 minutes.
Based on: container_cpu_usage_seconds_total, container_spec_cpu_quota, container_spec_cpu_period
Threshold: 15 mins

ContainerMemoryUsage80percentForRM0

Severity: HIGH
Description: Triggers when the RM container memory utilization exceeds 80% for 15 minutes.
Based on: container_memory_rss, kube_pod_container_resource_limits_memory_bytes
Threshold: 15 mins

ContainerMemoryUsage80percentForRM1

Severity: HIGH
Description: Triggers when the RM container memory utilization exceeds 80% for 15 minutes.
Based on: container_memory_rss, kube_pod_container_resource_limits_memory_bytes
Threshold: 15 mins

MCPPortsExceeded

Severity: HIGH
Description: All the MCP ports in the MCP LRG are exceeded.
Based on: gvp_rm_log_parser_eror_total
Threshold: 1 min

RMServiceDegradedTo50Percentage

Severity: HIGH
Description: One of the RM containers is not in the running state for 5 minutes.
Based on: kube_pod_container_status_running
Threshold: 5 mins

RMMatchingIVRTenantNotFound

Severity: MEDIUM
Description: A matching IVR profile tenant could not be found for 2 minutes.
Based on: gvp_rm_log_parser_eror_total
Threshold: 2 mins

RMResourceAllocationFailed

Severity: MEDIUM
Description: RM resource allocation failed for 1 minute.
Based on: gvp_rm_log_parser_eror_total
Threshold: 1 min
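
The counter-based alerts (RMTotal5XXErrorForINVITE and RMTotal4XXErrorForINVITE) compare a counter's growth against a limit over a time window. The sketch below illustrates that check under stated assumptions; it is not the actual alert rule, which is not shown on this page. The names `increase_in_window` and `should_alert` are hypothetical, samples are assumed to be sorted by time, and counter resets are not handled.

```python
from typing import List, Tuple

# (unix_timestamp_seconds, counter_value); assumed sorted by timestamp
Sample = Tuple[float, float]


def increase_in_window(samples: List[Sample], now: float,
                       window_s: float) -> float:
    """Last value minus first value among samples in [now - window_s, now]."""
    in_window = [v for t, v in samples if now - window_s <= t <= now]
    if len(in_window) < 2:
        return 0.0  # not enough data to measure growth
    return in_window[-1] - in_window[0]


def should_alert(samples: List[Sample], now: float,
                 window_s: float = 300.0,
                 min_increase: float = 5.0) -> bool:
    """Defaults mirror the 5xx alert: an increase of 5 within 5 minutes."""
    return increase_in_window(samples, now, window_s) >= min_increase


if __name__ == "__main__":
    # One sample every 30 seconds; the counter jumps from 514 to 519 at t=300.
    samples = [(float(t), 514.0) for t in range(0, 300, 30)] + [(300.0, 519.0)]
    print(should_alert(samples, now=300.0))
```

For the 4xx alert, the same check would use `window_s=60.0` and `min_increase=10.0`, per the description above.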