Dial Manager metrics and alerts

This topic is part of the Outbound (CX Contact) Private Edition Guide for the current version of Outbound (CX Contact).


Find the metrics that Dial Manager (DM) exposes and the alerts defined for it.

Service: Dial Manager
CRD or annotations: ServiceMonitor
Port: 3109
Endpoint/Selector: /metrics
Metrics update interval: 15 seconds
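
As a quick way to check what the endpoint returns, the following is a minimal Python sketch that fetches one scrape and parses it. It assumes the endpoint serves the standard Prometheus text exposition format, which a ServiceMonitor normally implies. Only the port (3109) and path (/metrics) come from the service details above; the host name and the function names are hypothetical.

```python
import re
from urllib.request import urlopen

# Hypothetical address: only port 3109 and the /metrics path come from this page.
METRICS_URL = "http://localhost:3109/metrics"

# One sample line: metric name, optional {labels}, then the value.
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# One label pair inside the braces: key="value" (escapes are kept as written).
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse Prometheus text exposition into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def fetch_metrics(url=METRICS_URL, timeout=5.0):
    """Fetch one scrape from the endpoint and parse it."""
    with urlopen(url, timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))
```

Calling `fetch_metrics()` no more often than the 15-second update interval avoids reading the same values twice.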

See details about the metrics and alerts in the sections below.

Metrics

Each entry below gives the metric name and description, followed by the metric details. No "Indicator of" value is listed for any of these metrics.
cxc_dm_healthy_instance

Healthy instance.

Unit:

Type: Gauge
Label: n/a
Sample value: 4.2

cxc_dm_processed_batches_total

Total processed batches.

Unit:

Type: Counter
Label: "'media', 'ccid', 'tenant_name'"
Sample value: 42

cxc_dm_processed_messages_total

Total processed messages.

Unit:

Type: Counter
Label: "'media', 'ccid', 'tenant_name'"
Sample value: 42

cxc_dm_opt_out_messages_total

Total opt-out messages.

Unit:

Type: Counter
Label: "'media', 'ccid', 'tenant_name'"
Sample value: 42

cxc_dm_failed_processed_messages_total

Total failed messages.

Unit:

Type: Counter
Label: "'media', 'ccid', 'tenant_name'"
Sample value: 42

cxc_dm_batch_size

Batch size histogram.

Unit:

Type: Histogram
Label: "'media', 'ccid', 'tenant_name'"
Sample value: [1, 2, 3]

cxc_dm_process_message_duration_seconds

Processing message duration histogram.

Unit: seconds

Type: Histogram
Label: "'media', 'ccid', 'tenant_name'"
Sample value: [1, 2, 3]

cxc_dm_delivery_buffer_size

Delivery buffer size.

Unit:

Type: Gauge
Label: 'media'
Sample value: 4.2

cxc_dm_test_messages_total

Total test messages.

Unit:

Type: Counter
Label: "'media', 'ccid', 'tenant_name'"
Sample value: 42

cxc_dm_failed_test_messages_total

Total failed test messages.

Unit:

Type: Counter
Label: "'media', 'ccid', 'tenant_name'"
Sample value: 42

cxc_dm_nexus_service_status

The current status of the connection to the Nexus service.

Unit:

Type: Gauge
Label: "'ccid', 'tenant_name'"
Sample value: 4.2

cxc_dm_request_count

Total requests made to Nexus via WebSocket.

Unit:

Type: Counter
Label: "'media', 'ccid', 'tenant_name', 'code'"
Sample value: 42

cxc_dm_request_latencies_ms

Request latencies histogram by tenant, in milliseconds.

Unit: milliseconds

Type: Histogram
Label:
Sample value: [1, 2, 3]

cxc_dm_request_out_count

Total outgoing requests by verb (method), destination, and code.

Unit:

Type: Counter
Label: "'method', 'destination', 'code'"
Sample value: 42

cxc_dm_request_out_latencies_ms

Outgoing request latencies histogram by verb, destination, and code, in milliseconds.

Unit: milliseconds

Type: Histogram
Label:
Sample value: [1, 2, 3]

cxc_dm_elasticsearch_service_latencies_ms

Elasticsearch request latencies histogram by verb (method), destination, and code, in milliseconds.

Unit: milliseconds

Type: Histogram
Label: "'method', 'destination', 'code'"
Sample value: [1, 2, 3]
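
The Counter and Histogram metrics above are cumulative, so their raw values are usually less useful than figures derived from them. The following Python sketch shows two common derivations under stated assumptions: the increase of a counter between scrapes (treating a drop in value as a counter reset, for example after a pod restart) and a percentile estimated from cumulative histogram buckets by linear interpolation, in the style of Prometheus `histogram_quantile`. The function names and the bucket boundaries in the usage note are hypothetical, because this page does not list the actual bucket boundaries.

```python
import math


def counter_increase(samples):
    """Total increase across (timestamp, value) samples sorted by time.

    A drop in value is treated as a counter reset: the counter is assumed
    to have restarted from zero, so the new value counts as the increase.
    """
    increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        if current >= previous:
            increase += current - previous
        else:
            increase += current
    return increase


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    buckets must be sorted by upper bound and end with the +Inf bucket.
    Observations are assumed to be spread evenly within each bucket.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound) or count == lower_count:
                # No finite upper bound or empty bucket: nothing to interpolate.
                return lower_bound
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound
```

For example, with hypothetical `cxc_dm_request_latencies_ms` buckets `[(100, 10), (500, 60), (1000, 90), (5000, 100), (math.inf, 100)]`, `histogram_quantile(0.5, buckets)` estimates a median of about 420 ms.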


Alerts

The following alerts are defined for Dial Manager.

CXC-DM-LatencyHigh

Triggered when the latency for Dial Manager is above the defined threshold.

Severity: HIGH
Based on:
Threshold: 5000ms for 5m

CXC-CPUUsage

Triggered when the CPU utilization of a pod is beyond the threshold.

Severity: HIGH
Based on:
Threshold: 300% for 5m

CXC-MemoryUsage

Triggered when the memory utilization of a pod is beyond the threshold.

Severity: HIGH
Based on:
Threshold: 70% for 5m

CXC-PodNotReadyCount

Triggered when the number of pods ready for a CX Contact deployment is less than or equal to the threshold.

Severity: HIGH
Based on:
Threshold: 1 for 5m

CXC-PodRestartsCount

Triggered when the restart count for a pod is beyond the threshold.

Severity: HIGH
Based on:
Threshold: 1 for 5m

CXC-MemoryUsagePD

Triggered when the memory usage of a pod is above the critical threshold.

Severity: HIGH
Based on:
Threshold: 90% for 5m

CXC-PodRestartsCountPD

Triggered when the restart count is beyond the critical threshold.

Severity: HIGH
Based on:
Threshold: 5 for 5m

CXC-PodsNotReadyPD

Triggered when there are no pods ready for CX Contact deployment.

Severity: HIGH
Based on:
Threshold: 0 for 1m
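
Each threshold above pairs a value with a duration, such as 5000ms for 5m. The following Python sketch illustrates one plausible reading of that pairing, assuming Prometheus-style "for" semantics: the alert fires only when every sample stays beyond the threshold for the whole duration, and a single sample back within the threshold restarts the clock. This page does not state how the rules are evaluated, so the function name and the semantics are assumptions, and the greater-than comparison applies only to the "above" or "beyond" alerts.

```python
def alert_fires(samples, threshold, for_seconds):
    """Return True if values stayed above threshold for at least for_seconds.

    samples is a list of (timestamp_seconds, value) pairs sorted by time.
    Assumed semantics: any sample at or below the threshold resets the timer.
    """
    breach_start = None
    for timestamp, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = timestamp  # first sample of a new breach
            if timestamp - breach_start >= for_seconds:
                return True
        else:
            breach_start = None  # breach interrupted, start over
    return False
```

With one sample every 15 seconds, a latency that stays above 5000 for 300 seconds satisfies `alert_fires(samples, 5000, 300)`, while a single dip partway through does not.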