Dial Manager metrics and alerts

This topic is part of the Outbound (CX Contact) Private Edition Guide for the current version of Outbound (CX Contact).

Metrics

Metric	Description	Type	Labels	Sample value
cxc_dm_healthy_instance	Healthy instance.	Gauge	n/a	4.2
cxc_dm_processed_batches_total	Total processed batches.	Counter	media, ccid, tenant_name	42
cxc_dm_processed_messages_total	Total processed messages.	Counter	media, ccid, tenant_name	42
cxc_dm_opt_out_messages_total	Total opt-out messages.	Counter	media, ccid, tenant_name	42
cxc_dm_failed_processed_messages_total	Total failed messages.	Counter	media, ccid, tenant_name	42
cxc_dm_batch_size	Batch size histogram.	Histogram	media, ccid, tenant_name	[1, 2, 3]
cxc_dm_process_message_duration_seconds	Message processing duration histogram.	Histogram	media, ccid, tenant_name	[1, 2, 3]
cxc_dm_delivery_buffer_size	Delivery buffer size.	Gauge	media	4.2
cxc_dm_test_messages_total	Total test messages.	Counter	media, ccid, tenant_name	42
cxc_dm_failed_test_messages_total	Total failed test messages.	Counter	media, ccid, tenant_name	42
cxc_dm_nexus_service_status	The current status of the connection to the Nexus service.	Gauge	ccid, tenant_name	4.2
cxc_dm_request_count	Total requests made to Nexus via websocket.	Counter	media, ccid, tenant_name, code	42
cxc_dm_request_latencies_ms	Request latencies histogram by tenant, in milliseconds.	Histogram		[1, 2, 3]
cxc_dm_request_out_count	Total outgoing requests by verb, destination, and code.	Counter	method, destination, code	42
cxc_dm_request_out_latencies_ms	Outgoing request latencies histogram by verb, destination, and code, in milliseconds.	Histogram		[1, 2, 3]
cxc_dm_elasticsearch_service_latencies_ms	Elasticsearch request latencies histogram by verb, destination, and code, in milliseconds.	Histogram	method, destination, code	[1, 2, 3]
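
Counter metrics such as cxc_dm_processed_messages_total and cxc_dm_failed_processed_messages_total only ever increase, so they are usually read as the change between two scrapes rather than as absolute values. The following Python sketch shows one way to derive a failure ratio from two scrapes. It is an illustration only: the function names are hypothetical, and it assumes that failed messages are also counted in cxc_dm_processed_messages_total, which this topic does not state.

```python
from typing import Dict


def counter_increase(previous: float, current: float) -> float:
    """Return the increase of a monotonic counter between two scrapes.

    If the value went backwards, the process is assumed to have restarted
    and the counter to have reset to zero, so the current value is taken
    as the whole increase.
    """
    if current < previous:
        return current
    return current - previous


def failure_ratio(prev: Dict[str, float], curr: Dict[str, float]) -> float:
    """Share of messages that failed between two scrapes (hypothetical helper).

    Assumption: failed messages are included in the processed total.
    """
    processed = counter_increase(
        prev["cxc_dm_processed_messages_total"],
        curr["cxc_dm_processed_messages_total"],
    )
    failed = counter_increase(
        prev["cxc_dm_failed_processed_messages_total"],
        curr["cxc_dm_failed_processed_messages_total"],
    )
    if processed == 0:
        return 0.0  # nothing processed in the interval, so no ratio to report
    return failed / processed


if __name__ == "__main__":
    # Made-up sample values for a single label set (media, ccid, tenant_name).
    earlier = {
        "cxc_dm_processed_messages_total": 1000.0,
        "cxc_dm_failed_processed_messages_total": 10.0,
    }
    later = {
        "cxc_dm_processed_messages_total": 1500.0,
        "cxc_dm_failed_processed_messages_total": 35.0,
    }
    print(failure_ratio(earlier, later))  # 25 failed out of 500 processed -> 0.05
```

In practice the same calculation would be done per label set, because each combination of media, ccid, and tenant_name is a separate time series.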

Alerts

The following alerts are defined for Dial Manager.

Alert	Severity	Description	Threshold
CXC-DM-LatencyHigh	HIGH	Triggered when the latency for Dial Manager is above the defined threshold.	5000 ms for 5m
CXC-CPUUsage	HIGH	Triggered when the CPU utilization of a pod is beyond the threshold.	300% for 5m
CXC-MemoryUsage	HIGH	Triggered when the memory utilization of a pod is beyond the threshold.	70% for 5m
CXC-PodNotReadyCount	HIGH	Triggered when the number of pods ready for a CX Contact deployment is less than or equal to the threshold.	1 for 5m
CXC-PodRestartsCount	HIGH	Triggered when the restart count for a pod is beyond the threshold.	1 for 5m
CXC-MemoryUsagePD	HIGH	Triggered when the memory usage of a pod is above the critical threshold.	90% for 5m
CXC-PodRestartsCountPD	HIGH	Triggered when the restart count is beyond the critical threshold.	5 for 5m
CXC-PodsNotReadyPD	HIGH	Triggered when there are no pods ready for a CX Contact deployment.	0 for 1m
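
Each threshold in the table pairs a value with a duration, for example "5000 ms for 5m". The Python sketch below illustrates one plausible reading of that notation: the condition must hold on every sample for the whole duration before the alert triggers, similar to a Prometheus-style for clause. The actual rule expressions are not given in this topic, so the function name, the sampling interval, and the comparison operators used here are assumptions.

```python
import operator
from typing import Callable, Iterable, Optional, Tuple


def holds_for(
    samples: Iterable[Tuple[float, float]],
    threshold: float,
    duration_s: float,
    compare: Callable[[float, float], bool] = operator.gt,
) -> bool:
    """Return True if compare(value, threshold) holds continuously for duration_s.

    samples is an iterable of (timestamp_seconds, value) pairs sorted by time.
    Any sample that does not satisfy the condition restarts the clock.
    """
    started: Optional[float] = None
    for timestamp, value in samples:
        if compare(value, threshold):
            if started is None:
                started = timestamp  # first sample of a new breach
            if timestamp - started >= duration_s:
                return True
        else:
            started = None  # condition cleared, so the duration starts over
    return False


if __name__ == "__main__":
    # Made-up latency samples in milliseconds, one per minute.
    steady = [(t, 6000.0) for t in range(0, 301, 60)]
    dipped = [(0, 6000.0), (60, 6000.0), (120, 6000.0),
              (180, 4000.0), (240, 6000.0), (300, 6000.0)]

    # CXC-DM-LatencyHigh style check: above 5000 ms for 5 minutes.
    print(holds_for(steady, 5000.0, 300.0))  # True
    print(holds_for(dipped, 5000.0, 300.0))  # False, the dip at 180 s resets the clock
```

Alerts that compare in the other direction, such as CXC-PodNotReadyCount (ready pods less than or equal to 1), can pass compare=operator.le instead of the default.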
