Voice SIP Proxy Service metrics and alerts
Find the metrics Voice SIP Proxy Service exposes and the alerts defined for Voice SIP Proxy Service.
Service | CRD or annotations? | Port | Endpoint/Selector | Metrics update interval |
---|---|---|---|---|
Voice SIP Proxy Service | Supports both CRD and annotations | 11400 | http://<pod-ipaddress>:11400/metrics | 30 seconds |
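
To check what a single pod exposes, the endpoint in the table above can be read directly. The following is a minimal sketch rather than an official client. It assumes the endpoint returns the standard Prometheus text exposition format, and the sample payload and its metric names (`example_requests_total`, `example_active_nodes`) are hypothetical.

```python
import re
from urllib.request import urlopen

# One sample line of the text exposition format looks like:
#   metric_name{label="value",...} 12.5
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text.

    Escape sequences inside label values are kept as-is (not unescaped).
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # blank lines, HELP and TYPE comments
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, raw_value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(raw_value)))
    return samples


def scrape(pod_ip, port=11400, timeout=5.0):
    """Fetch and parse http://<pod-ipaddress>:11400/metrics for one pod."""
    with urlopen(f"http://{pod_ip}:{port}/metrics", timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))


# Hypothetical payload; real metric names come from the running service.
SAMPLE = """\
# HELP example_requests_total Illustrative counter.
# TYPE example_requests_total counter
example_requests_total{pod="voice-sipproxy-0"} 42
example_active_nodes 3
"""
parsed = parse_metrics(SAMPLE)
```

Calling `scrape("<pod-ipaddress>")` from inside the cluster would return the same tuple structure for the live service.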
Metrics
Voice SIP Proxy Service exposes Genesys-defined, SIP Proxy Service–specific metrics as well as some standard Kafka metrics. You can query Prometheus directly to see all the metrics that the SIP Proxy Service exposes. The following metrics are likely to be particularly useful. Genesys does not commit to maintaining other currently available SIP Proxy Service metrics that are not documented on this page.
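
As one way to query Prometheus directly, the sketch below lists the metric names that share a prefix by calling the label-values endpoint of the Prometheus HTTP API. The Prometheus address is a placeholder, the helper names are illustrative, and the prefix is assumed to contain no regular-expression metacharacters.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def metric_names_url(prometheus_base_url, prefix):
    """Build a Prometheus HTTP API URL that lists metric names with a prefix."""
    selector = '{__name__=~"%s.*"}' % prefix
    query = urlencode({"match[]": selector})
    return f"{prometheus_base_url.rstrip('/')}/api/v1/label/__name__/values?{query}"


def list_metric_names(prometheus_base_url, prefix, timeout=10.0):
    """Return the sorted metric names Prometheus reports for the prefix."""
    url = metric_names_url(prometheus_base_url, prefix)
    with urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    if payload.get("status") != "success":
        raise RuntimeError(f"Prometheus query failed: {payload}")
    return sorted(payload["data"])


# "http://prometheus.example:9090" is a placeholder, not a real deployment address.
url = metric_names_url("http://prometheus.example:9090", "sipproxy_")
```

Running `list_metric_names` with the same arguments against a reachable Prometheus server would return every metric name that starts with `sipproxy_`, including any that are not documented in the table that follows.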
Metric and description | Metric details | Indicator of |
---|---|---|
sipproxy_ Total number of received requests. | Unit: N/A Type: counter | Traffic |
sipproxy_ The total number of rejected requests. | Unit: N/A Type: counter | Errors |
sipproxy_ The total number of received requests that were processed by SIP Proxy itself. | Unit: N/A Type: counter | Traffic |
sipproxy_ The total number of forwarded requests. | Unit: N/A Type: counter | Traffic |
sipproxy_ Total count of sip-node reselections. | Unit: N/A Type: counter | Errors |
sipproxy_ Total count of forwarded responses. | Unit: N/A Type: counter | Traffic |
sipproxy_ SIP response latency. | Unit: Type: histogram | Latency |
sipproxy_ Total number of REGISTER requests that SIP Proxy received for processing. | Unit: N/A Type: counter | Traffic |
sipproxy_ Total number of REGISTER requests for processing that were rejected. | Unit: N/A Type: counter | Errors |
sipproxy_ Current calculated calls per second. | Unit: N/A Type: gauge | Saturation |
sipproxy_ Current number of active SIP nodes. | Unit: N/A Type: gauge | |
sipproxy_ Current number of discovered SIP nodes. | Unit: N/A Type: gauge | |
sipproxy_ Current count of discovered tenants. | Unit: N/A Type: gauge | |
sipproxy_ Current number of errors while processing records received from Consul. | Unit: N/A Type: counter | |
sipproxy_ Current number of Consul errors. | Unit: N/A Type: counter | |
sipproxy_ Indicates whether the SIP node has available capacity. | Unit: Type: gauge | |
service_ Displays the version of Voice SIP Proxy Service that is currently running. In the case of this metric, the labels provide the important information. The metric value is always 1 and does not provide any information. | Unit: N/A Type: gauge | |
sipproxy_ Health level of the SIP Proxy node: -1 – fail | Unit: N/A Type: gauge | |
sipproxy_ Status of the Envoy proxy: -1 – error | Unit: N/A Type: gauge | |
sipproxy_ Status of the Config node connection: 0 – disconnected | Unit: N/A Type: gauge | |
sip_ Total number of created server transactions. | Unit: N/A Type: counter | Traffic |
sip_ Total number of created client transactions. | Unit: N/A Type: counter | Traffic |
sip_ Total number of deleted server transactions. | Unit: N/A Type: counter | Traffic |
sip_ Total number of deleted client transactions. | Unit: N/A Type: counter | Traffic |
sip_ Current number of client transactions. | Unit: N/A Type: gauge | Saturation |
sip_ Current number of server transactions. | Unit: N/A Type: gauge | Saturation |
sip_ Total number of server transactions rejected for internal reasons. | Unit: N/A Type: counter | Errors |
sip_ Current number of active SIP Proxy forwarding contexts. | Unit: N/A Type: gauge | Saturation |
sip_ Total traffic received, measured in bytes. | Unit: bytes Type: counter | Traffic |
sip_ Total traffic sent, measured in bytes. | Unit: bytes Type: counter | Traffic |
sip_ Total number of transport errors. | Unit: N/A Type: counter | Errors |
sip_ Total number of requests to wait for drain events on stream transports. | Unit: N/A Type: counter | |
sip_ Total number of flood events on the stream transports. | Unit: N/A Type: counter | |
http_ The time duration between the HTTP client request and the response, measured in seconds. | Unit: seconds Type: histogram | Latency |
http_ The number of HTTP client responses received. | Unit: N/A Type: counter | Traffic |
log_ The total amount of log output, measured in bytes. | Unit: bytes Type: counter | |
kafka_ Number of messages received from Kafka. | Unit: Type: counter | Traffic |
kafka_ Number of Kafka consumer errors. | Unit: Type: counter | Errors |
kafka_ Consumer latency is the time difference between when the message is produced and when the message is consumed. That is, the time when the consumer received the message minus the time when the producer produced the message. | Unit: Type: histogram | Latency |
kafka_ Number of Kafka consumer rebalance events. | Unit: Type: counter | |
kafka_ Current state of the Kafka consumer. | Unit: Type: gauge | |
kafka_ Number of messages received from Kafka. | Unit: Type: counter | Traffic |
kafka_ Number of Kafka producer pending events. | Unit: Type: gauge | Saturation |
kafka_ Age of the oldest producer pending event in seconds. | Unit: seconds Type: gauge | |
kafka_ Number of Kafka producer errors. | Unit: Type: counter | Errors |
kafka_ Current state of the Kafka producer. | Unit: Type: gauge | |
kafka_ Biggest event size so far. | Unit: Type: gauge | |
kafka_ Exposed config to compare with biggest event size. | Unit: Type: gauge | |
kafka_ Number of dropped events. | Unit: Type: gauge | |
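
Most of the metrics above are counters, which only increase and start again from zero when a pod restarts, so they are usually read as an increase or a rate over a time window rather than as raw values. The sketch below shows that calculation under simple assumptions. It is not Prometheus's own `rate()` implementation (which also extrapolates to the window boundaries), and the sample values are invented for illustration.

```python
def counter_increase(samples):
    """Total increase of a counter across samples, tolerating resets.

    samples: list of (timestamp_seconds, value) pairs in time order.
    A value lower than its predecessor is treated as a restart from zero.
    """
    increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        if current >= previous:
            increase += current - previous
        else:
            increase += current  # counter was reset, so count up from zero
    return increase


def per_second_rate(samples):
    """Average per-second rate over the sampled window."""
    if len(samples) < 2:
        return 0.0
    elapsed = samples[-1][0] - samples[0][0]
    return counter_increase(samples) / elapsed if elapsed > 0 else 0.0


# Four scrapes at the 30-second update interval; the pod restarts
# between the second and third scrape (invented values).
scrapes = [(0, 100.0), (30, 160.0), (60, 20.0), (90, 80.0)]
total = counter_increase(scrapes)
rate = per_second_rate(scrapes)
```

Here the increase is counted as 60 + 20 + 60 across the three intervals, so the drop caused by the restart does not produce a negative rate.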
Alerts
The following alerts are defined for Voice SIP Proxy Service.
Alert | Severity | Description | Based on | Threshold |
---|---|---|---|---|
Too many Kafka pending events | Critical | Too many Kafka producer pending events for pod {{ $labels.pod }}. This alert means there are issues with SIP REGISTER processing on this voice-sipproxy. Actions: | kafka_producer_queue_depth | Too many Kafka producer pending events for service {{ $labels.container }} (more than 100 in 5 minutes). |
SIP server response time too high | Warning | Actions: | sipproxy_response_latency_bucket | SIP response latency for more than 95% of messages forwarded to {{ $labels.sip_node_id }} is more than 1 second for sipproxy-node {{ $labels.pod }}. |
Pod status failed | Warning | Actions: | kube_pod_status_phase | Pod {{ $labels.pod }} is in Failed state. |
Pod status Unknown | Warning | Pod {{ $labels.pod }} is in Unknown state. Actions: | kube_pod_status_phase | Pod {{ $labels.pod }} is in Unknown state for 5 minutes. |
Pod status Pending | Warning | Pod {{ $labels.pod }} is in Pending state. Actions: | kube_pod_status_phase | Pod {{ $labels.pod }} is in Pending state for 5 minutes. |
Pod status NotReady | Critical | Pod {{ $labels.pod }} is in NotReady state. Actions: | kube_pod_status_ready | Pod {{ $labels.pod }} is in NotReady state for 5 minutes. |
Container restarted repeatedly | Critical | Container {{ $labels.container }} was repeatedly restarted. Actions: | kube_pod_container_status_restarts_total | Container {{ $labels.container }} was restarted 5 or more times within 15 minutes. |
No sip-nodes available for 2 minutes | Critical | No sip-nodes are available for the pod {{ $labels.pod }}. Actions: | sipproxy_active_sip_nodes_count | No sip-nodes are available for the pod {{ $labels.pod }} for 2 minutes. |
sip-node capacity limit reached | Warning | The sip-node {{ $labels.sip_node_id }} hit capacity limit on {{ $labels.pod }}. Actions: | sipproxy_sip_node_is_capacity_available | The sip-node {{ $labels.sip_node_id }} hit capacity limit on {{ $labels.pod }} for 3 consecutive minutes. |
Pod CPU greater than 80% | Critical | Critical CPU load for pod {{ $labels.pod }}. Actions: | container_cpu_usage_seconds_total, container_spec_cpu_period | Container {{ $labels.container }} CPU usage exceeded 80% for 5 minutes. |
Pod CPU greater than 65% | Warning | High CPU load for pod {{ $labels.pod }}. Actions: | container_cpu_usage_seconds_total, container_spec_cpu_period | Container {{ $labels.container }} CPU usage exceeded 65% for 5 minutes. |
Pod memory greater than 80% | Critical | Critical memory usage for pod {{ $labels.pod }}. Actions: | container_memory_working_set_bytes, kube_pod_container_resource_requests_memory_bytes | Container {{ $labels.container }} memory usage exceeded 80% for 5 minutes. |
Pod memory greater than 65% | Warning | Pod {{ $labels.pod }} has high memory usage. Actions: | container_memory_working_set_bytes, kube_pod_container_resource_requests_memory_bytes | Container {{ $labels.container }} memory usage exceeded 65% for 5 minutes. |
Config node fail | Warning | The request to the config node failed. Action: | http_client_response_count | Requests to the config node fail for 5 consecutive minutes. |
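
Most of the thresholds above require a condition to hold for a period (for example, 2 or 5 minutes) before the alert fires. The sketch below illustrates that idea using the sip-node availability alert as an example. The actual rule expressions are not shown on this page, so the `sustained` helper, the comparison with zero, and the sample values are assumptions made for illustration only.

```python
def sustained(samples, predicate, hold_seconds):
    """Return True if predicate has held continuously for hold_seconds.

    samples: list of (timestamp_seconds, value) pairs in time order.
    The breach must still be in progress at the most recent sample.
    """
    breach_started = None
    for timestamp, value in samples:
        if predicate(value):
            if breach_started is None:
                breach_started = timestamp  # first sample of the current breach
        else:
            breach_started = None  # condition cleared, so start over
    if breach_started is None:
        return False
    return samples[-1][0] - breach_started >= hold_seconds


def no_active_nodes(value):
    """Assumed condition: the active sip-node gauge reads zero."""
    return value == 0


# Invented gauge readings at 30-second intervals.
outage = [(0, 2), (30, 0), (60, 0), (90, 0), (120, 0), (150, 0)]
blip = [(0, 0), (30, 0), (60, 1), (90, 0)]

fired = sustained(outage, no_active_nodes, hold_seconds=120)
recovered = sustained(blip, no_active_nodes, hold_seconds=120)
```

In the first series the gauge stays at zero from second 30 through second 150, which meets the 2-minute hold. In the second series the brief recovery at second 60 restarts the clock, so the condition is not yet met.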