Voice Platform Resource Manager metrics and alerts
Find the metrics Voice Platform Resource Manager exposes and the alerts defined for Voice Platform Resource Manager.
Service | CRD or annotations? | Port | Endpoint/Selector | Metrics update interval |
Voice Platform Resource Manager | ServiceMonitor / PodMonitor | 9116, 8200 |
Metrics endpoints:
curl -v "http://<RM_POD_IP>:9116/snmp?target="
curl -v "http://<RM_POD_IP>:8200/log"
Enable metrics: Service/Pod Monitoring Settings
prometheus:
enabled: true
port: 9116
port: 8200
podMonitor:
enabled: true
path: /snmp
module: [ if_mib ]
target: [ ]
path: /log
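The two endpoints above can also be assembled programmatically, for example in a health-check script. The following is a minimal sketch, not part of the product: the function name `rm_metric_urls` is hypothetical, the SNMP target value is a placeholder you must supply for your deployment, and passing `module` as a query parameter is an assumption based on the `module: [ if_mib ]` setting.

```python
from urllib.parse import urlencode

SNMP_PORT = 9116  # SNMP metrics port listed in the table above
LOG_PORT = 8200   # log metrics port listed in the table above


def rm_metric_urls(pod_ip, snmp_target, module="if_mib"):
    """Build the scrape URLs for one RM pod.

    snmp_target is deployment-specific and is not given in this document;
    the caller must supply it. module defaults to the value in the settings.
    """
    # urlencode percent-encodes reserved characters such as ":" in the target.
    query = urlencode({"target": snmp_target, "module": module})
    return {
        "snmp": f"http://{pod_ip}:{SNMP_PORT}/snmp?{query}",
        "log": f"http://{pod_ip}:{LOG_PORT}/log",
    }


if __name__ == "__main__":
    # Both addresses below are placeholders for illustration only.
    for name, url in rm_metric_urls("10.0.0.5", "192.0.2.10:1161").items():
        print(name, url)
```

The resulting URLs can be passed to curl as shown above, or to any HTTP client.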
See details about the metrics and alerts in the sections that follow.
Metrics
Metric and description | Metric details | Indicator of |
Number of 5xx responses received for INVITE requests sent by RM | Unit: Unsigned32, Type: Gauge | Error |
Number of 4xx responses received for INVITE requests sent by RM | Unit: Unsigned32, Type: Gauge | Error |
GVP_RM_PhysicalResourceTable.Status: Current state of the resource (rmPRStatus{gvpConfigDBID="174",rmPRName="mcp-",rmPRStatus="AVAILABLE"}) | Unit: DisplayString, Type: Gauge | Information |
GVP_RM_PhysicalResourceTable.ActiveCalls: Number of calls currently handled by the resource (rmPRActiveCalls{gvpConfigDBID="174",rmPRName="mcp-"}) | Unit: Unsigned32, Type: Gauge | Traffic |
GVP_RM_PhysicalResourceTable.TotalCalls: Total number of calls handled by this resource since it was connected to RM (rmPRTotalCalls{gvpConfigDBID="174",rmPRName="mcp-"}) | Unit: Unsigned32, Type: Gauge | Traffic |
Number of active inbound calls that use this tenant | Unit: Unsigned32, Type: Gauge | Traffic |
Maximum number of concurrent calls to this tenant since it became active | Unit: , Type: Gauge | Traffic |
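To check what the endpoint returns for the per-resource metrics, the scraped text can be split into metric name, labels, and value. The sketch below is a simplified reader for the Prometheus text format, not a complete parser. The function names are hypothetical, and the sample values (3 and 1) are invented for illustration; only the metric and label names come from the table above.

```python
import re

# name{labels} value [timestamp]; the label block is optional.
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def active_calls_by_resource(samples):
    """Map each rmPRName label to its rmPRActiveCalls value."""
    return {
        labels["rmPRName"]: value
        for name, labels, value in samples
        if name == "rmPRActiveCalls" and "rmPRName" in labels
    }


# Illustrative scrape output; the numeric values are made up.
SAMPLE_TEXT = '''
# TYPE rmPRActiveCalls gauge
rmPRActiveCalls{gvpConfigDBID="174",rmPRName="mcp-"} 3
rmPRStatus{gvpConfigDBID="174",rmPRName="mcp-",rmPRStatus="AVAILABLE"} 1
'''

if __name__ == "__main__":
    print(active_calls_by_resource(parse_samples(SAMPLE_TEXT)))
```

Note that rmPRStatus carries the resource state in a label rather than in the sample value, so status checks should read the `rmPRStatus` label.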
Alerts
The following alerts are defined for Voice Platform Resource Manager.
Alert | Severity | Description | Based on | Threshold |
RM Service Down | CRITICAL | RM pods are not in the ready state and the RM service is not available. | kube_pod_container_status_running | 0 |
InitContainerFailingRepeatedly | CRITICAL | Triggers when the RM init container fails 5 or more times within 15 minutes. | kube_pod_init_container_status_restarts_total | 15 mins |
ContainerRestartedRepeatedly | CRITICAL | Triggers when the RM or RM SNMP container restarts 5 or more times within 15 minutes. | kube_pod_container_status_restarts_total | 15 mins |
PodStatusNotReady | CRITICAL | Triggers when the RM pod status is Not Ready for 30 minutes. This is controlled by override-value.yaml. | kube_pod_status_ready | 30 mins |
RMTotal5XXErrorForINVITE | HIGH | RM MIB counter statistics are collected every 30 seconds. Triggers when the MIB counter total5xxInviteSent increases by 5 from its previous value within 5 minutes. | rmTotal5xxInviteSent | 5 mins |
RMTotal4XXErrorForINVITE | MEDIUM | RM MIB counter statistics are collected every 60 seconds. Triggers when the MIB counter total4xxInviteSent increases by 10 from its previous value within 60 seconds. | rmTotal4xxInviteSent | 1 min |
RMInterNodeConnectivityBroken | HIGH | Inter-node connectivity between RM nodes has been lost for 5 minutes. | gvp_rm_log_parser_warn_total | 5 mins |
RMConfigServerConnectionLost | HIGH | RM has lost its connection to the GVP Configuration Server for 5 minutes. | gvp_rm_log_parser_warn_total | 5 mins |
RMSocketInterNodeError | HIGH | An RM inter-node socket error has persisted for 5 minutes. | gvp_rm_log_parser_eror_total | 5 mins |
ContainerCPUreached80percentForRM0 | HIGH | Triggers when the RM container CPU utilization stays above 80% for 15 minutes. | container_cpu_usage_seconds_total, container_spec_cpu_quota, container_spec_cpu_period | 15 mins |
ContainerCPUreached80percentForRM1 | HIGH | Triggers when the RM container CPU utilization stays above 80% for 15 minutes. | container_cpu_usage_seconds_total, container_spec_cpu_quota, container_spec_cpu_period | 15 mins |
ContainerMemoryUsage80percentForRM0 | HIGH | Triggers when the RM container memory utilization stays above 80% for 15 minutes. | container_memory_rss, kube_pod_container_resource_limits_memory_bytes | 15 mins |
ContainerMemoryUsage80percentForRM1 | HIGH | Triggers when the RM container memory utilization stays above 80% for 15 minutes. | container_memory_rss, kube_pod_container_resource_limits_memory_bytes | 15 mins |
MCPPortsExceeded | HIGH | All the MCP ports in the MCP LRG are exceeded. | gvp_rm_log_parser_eror_total | 1 min |
RMServiceDegradedTo50Percentage | HIGH | One of the RM containers has not been in the running state for 5 minutes. | kube_pod_container_status_running | 5 mins |
RMMatchingIVRTenantNotFound | MEDIUM | A matching IVR profile tenant could not be found for 2 minutes. | gvp_rm_log_parser_eror_total | 2 mins |
RMResourceAllocationFailed | MEDIUM | RM resource allocation has failed for 1 minute. | gvp_rm_log_parser_eror_total | 1 min |
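The arithmetic behind two of these alert types can be illustrated as follows. This is a sketch under stated assumptions, not the deployed alert rules, which are not shown in this document. It assumes that "increases by 5 within 5 minutes" means last sample minus first sample inside the window, and that CPU utilization is the rate of container_cpu_usage_seconds_total divided by the core limit (container_spec_cpu_quota / container_spec_cpu_period). The function names and sample data are hypothetical.

```python
def counter_increase(samples, window_seconds, now):
    """Increase of a metric over the trailing window.

    samples is a list of (timestamp_seconds, value) pairs sorted by time.
    Returns last minus first value inside the window, or 0.0 if fewer
    than two samples fall in it.
    """
    in_window = [v for t, v in samples if now - window_seconds <= t <= now]
    if len(in_window) < 2:
        return 0.0
    return float(in_window[-1] - in_window[0])


def cpu_utilization_percent(usage_start, usage_end, interval_seconds, quota, period):
    """CPU usage as a percentage of the container's CPU limit.

    usage_start/usage_end: container_cpu_usage_seconds_total at both ends
    of the interval. quota/period: container_spec_cpu_quota and
    container_spec_cpu_period, whose ratio is the limit in cores.
    """
    cores_used = (usage_end - usage_start) / interval_seconds
    cores_limit = quota / period
    return 100.0 * cores_used / cores_limit


if __name__ == "__main__":
    # Made-up rmTotal5xxInviteSent readings, one every 30 seconds.
    values = [100, 100, 101, 101, 102, 103, 103, 104, 105, 105, 106]
    readings = [(i * 30, v) for i, v in enumerate(values)]
    increase = counter_increase(readings, window_seconds=300, now=300)
    print("5xx alert fires:", increase >= 5)

    # Made-up CPU figures: 1080 CPU-seconds used in 600 s against a 2-core limit.
    pct = cpu_utilization_percent(0.0, 1080.0, 600, quota=200000, period=100000)
    print("CPU alert condition met:", pct > 80)
```

For the CPU and memory alerts, the condition must also hold for the full 15-minute duration before the alarm is raised; the sketch evaluates only a single interval.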