Difference between revisions of "PEC-OU/Current/CXCPEGuide/DMMetrics"

Latest revision as of 14:22, February 7, 2022

This topic is part of the Outbound (CX Contact) Private Edition Guide for the Current version of Outbound (CX Contact).

Metrics

Metric	Description	Type	Labels	Sample value
cxc_dm_healthy_instance	Healthy instance.	Gauge	n/a	4.2
cxc_dm_processed_batches_total	Total processed batches.	Counter	media, ccid, tenant_name	42
cxc_dm_processed_messages_total	Total processed messages.	Counter	media, ccid, tenant_name	42
cxc_dm_opt_out_messages_total	Total opt out messages.	Counter	media, ccid, tenant_name	42
cxc_dm_failed_processed_messages_total	Total failed messages.	Counter	media, ccid, tenant_name	42
cxc_dm_batch_size	Batch size histogram.	Histogram	media, ccid, tenant_name	[1, 2, 3]
cxc_dm_process_message_duration_seconds	Processing message duration histogram.	Histogram	media, ccid, tenant_name	[1, 2, 3]
cxc_dm_delivery_buffer_size	Delivery buffer size.	Gauge	media	4.2
cxc_dm_test_messages_total	Total test messages.	Counter	media, ccid, tenant_name	42
cxc_dm_failed_test_messages_total	Total failed test messages.	Counter	media, ccid, tenant_name	42
cxc_dm_nexus_service_status	The current status of the connection to the Nexus service.	Gauge	ccid, tenant_name	4.2
cxc_dm_request_count	Total requests made to Nexus via websocket.	Counter	media, ccid, tenant_name, code	42
cxc_dm_request_latencies_ms	Request latencies histogram by tenant, in milliseconds.	Histogram		[1, 2, 3]
cxc_dm_request_out_count	Total out requests by verb, destination, and code.	Counter	method, destination, code	42
cxc_dm_request_out_latencies_ms	Out Request latencies histogram by verb, destination, and code, in milliseconds.	Histogram		[1, 2, 3]
cxc_dm_elasticsearch_service_latencies_ms	Elasticsearch Request latencies histogram by verb, destination, and code, in milliseconds.	Histogram	method, destination, code	[1, 2, 3]
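
As an illustration of how the counters above might be consumed, the following Python sketch computes a per-media failure ratio from a scraped payload. It rests on several assumptions that this page does not confirm: that the endpoint serves the Prometheus text exposition format, that cxc_dm_processed_messages_total counts every attempted message (including failed ones), and that the sample payload and its label values (sms, email, c1, acme) are representative. All of these are invented for the example. The function names are hypothetical as well.

```python
import re
from collections import defaultdict

# Hypothetical payload; real label values and numbers will differ.
SAMPLE = """\
# TYPE cxc_dm_processed_messages_total counter
cxc_dm_processed_messages_total{media="sms",ccid="c1",tenant_name="acme"} 40
cxc_dm_processed_messages_total{media="email",ccid="c1",tenant_name="acme"} 60
# TYPE cxc_dm_failed_processed_messages_total counter
cxc_dm_failed_processed_messages_total{media="sms",ccid="c1",tenant_name="acme"} 4
cxc_dm_failed_processed_messages_total{media="email",ccid="c1",tenant_name="acme"} 3
"""

# metric_name{label="value",...} value
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Yield (name, labels, value) for each sample line, skipping comments."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if not match:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        yield name, labels, float(value)


def failure_ratio_by_media(text):
    """Return {media: failed / processed}, summed over all other labels."""
    processed = defaultdict(float)
    failed = defaultdict(float)
    for name, labels, value in parse_samples(text):
        media = labels.get("media", "")
        if name == "cxc_dm_processed_messages_total":
            processed[media] += value
        elif name == "cxc_dm_failed_processed_messages_total":
            failed[media] += value
    # Skip media types with no processed messages to avoid dividing by zero.
    return {m: failed[m] / total for m, total in processed.items() if total > 0}


if __name__ == "__main__":
    for media, ratio in failure_ratio_by_media(SAMPLE).items():
        print(f"{media}: {ratio:.1%} failed")
```

Because counters are cumulative, a ratio of raw values covers the whole lifetime of the process. For a recent-window view, a monitoring system would normally compare the increase of each counter over a time range instead.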

Alerts

The following alerts are defined for Dial Manager.

Alert	Severity	Description	Threshold
CXC-DM-LatencyHigh	HIGH	Triggered when the latency for Dial Manager is above the defined threshold.	5000ms for 5m
CXC-CPUUsage	HIGH	Triggered when the CPU utilization of a pod is beyond the threshold.	300% for 5m
CXC-MemoryUsage	HIGH	Triggered when the memory utilization of a pod is beyond the threshold.	70% for 5m
CXC-PodNotReadyCount	HIGH	Triggered when the number of pods ready for a CX Contact deployment is less than or equal to the threshold.	1 for 5m
CXC-PodRestartsCount	HIGH	Triggered when the restart count for a pod is beyond the threshold.	1 for 5m
CXC-MemoryUsagePD	HIGH	Triggered when the memory usage of a pod is above the critical threshold.	90% for 5m
CXC-PodRestartsCountPD	HIGH	Triggered when the restart count is beyond the critical threshold.	5 for 5m
CXC-PodsNotReadyPD	HIGH	Triggered when there are no pods ready for a CX Contact deployment.	0 for 1m
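
The thresholds above pair a limit with a duration, such as "5000ms for 5m". A minimal Python sketch of one plausible reading follows: the alert fires only when every sample over the most recent window exceeds the limit. That continuous-hold interpretation is an assumption, not something this page states, and the function name `fires` and the sample data are hypothetical. The 15-second spacing mirrors the MetricsUpdateInterval value shown in the page source.

```python
def fires(samples, threshold, hold_seconds):
    """Return True if the value has exceeded threshold continuously
    for at least hold_seconds, measured up to the latest sample.

    samples: list of (timestamp_seconds, value) in ascending time order.
    """
    breach_start = None
    for timestamp, value in samples:
        if value > threshold:
            # Remember when the current unbroken breach began.
            if breach_start is None:
                breach_start = timestamp
        else:
            # Any sample at or below the threshold resets the window.
            breach_start = None
    if breach_start is None:
        return False
    return samples[-1][0] - breach_start >= hold_seconds


if __name__ == "__main__":
    # Hypothetical latency samples (ms), one every 15 seconds for 5 minutes.
    steady = [(t, 6000.0) for t in range(0, 301, 15)]
    print(fires(steady, threshold=5000, hold_seconds=300))

    # Same series with one sample dipping below the threshold midway.
    dipped = list(steady)
    dipped[10] = (dipped[10][0], 4000.0)
    print(fires(dipped, threshold=5000, hold_seconds=300))
```

Under this reading, a single dip below the limit restarts the clock, so brief spikes do not raise an alert while sustained breaches do.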

@@ Line 6: / Line 6: @@
 |Endpoint=/metrics
 |MetricsUpdateInterval=15 seconds
+|MetricsDefined=Yes
 |PEMetric={{PEMetric
 |Metric=cxc_dm_healthy_instance
@@ Line 102: / Line 103: @@
 }}
 |AlertsDefined=Yes
+|PEAlert={{PEAlert
+|Alert=CXC-DM-LatencyHigh
+|Severity=HIGH
+|AlertDescription=Triggered when the latency for dial manager is above the defined threshold.
+|Threshold=5000ms for 5m
+}}{{PEAlert
+|Alert=CXC-CPUUsage
+|Severity=HIGH
+|AlertDescription=Triggered when the CPU utilization of a pod is beyond the threshold
+|Threshold=300% for 5m
+}}{{PEAlert
+|Alert=CXC-MemoryUsage
+|Severity=HIGH
+|AlertDescription=Triggered when the memory utilization of a pod is beyond the threshold.
+|Threshold=70% for 5m
+}}{{PEAlert
+|Alert=CXC-PodNotReadyCount
+|Severity=HIGH
+|AlertDescription=Triggered when the number of pods ready for a CX Contact deployment is less than or equal to the threshold.
+|Threshold=1 for 5m
+}}{{PEAlert
+|Alert=CXC-PodRestartsCount
+|Severity=HIGH
+|AlertDescription=Triggered when the restart count for a pod is beyond the threshold.
+|Threshold=1 for 5m
+}}{{PEAlert
+|Alert=CXC-MemoryUsagePD
+|Severity=HIGH
+|AlertDescription=Triggered when the memory usage of a pod is above the critical threshold.
+|Threshold=90% for 5m
+}}{{PEAlert
+|Alert=CXC-PodRestartsCountPD
+|Severity=HIGH
+|AlertDescription=Triggered when the restart count is beyond the critical threshold.
+|Threshold=5 for 5m
+}}{{PEAlert
+|Alert=CXC-PodsNotReadyPD
+|Severity=HIGH
+|AlertDescription=Triggered when there are no pods ready for CX Contact deployment.
+|Threshold=0 for 1m
+}}
 }}
