Compliance Manager metrics and alerts
Find the metrics that Compliance Manager (CPLM) exposes and the alerts defined for it.
Service | CRD or annotations? | Port | Endpoint/Selector | Metrics update interval |
---|---|---|---|---|
Compliance Manager | ServiceMonitor | 3107 | /metrics | 15 seconds |
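As a rough illustration of what a scraper reads from this endpoint, the following sketch builds the scrape URL from the port and path in the table and parses Prometheus text-format samples. It is not part of the product: the host name `cplm.example.local`, the use of plain HTTP, and the sample series names in the usage note are assumptions made for the example.

```python
from urllib.request import urlopen

METRICS_PORT = 3107            # port listed in the table above
METRICS_PATH = "/metrics"      # endpoint listed in the table above
SCRAPE_INTERVAL_SECONDS = 15   # metrics update interval listed in the table above


def metrics_url(host: str) -> str:
    """Build the scrape URL (plain HTTP is an assumption)."""
    return f"http://{host}:{METRICS_PORT}{METRICS_PATH}"


def parse_samples(text: str) -> dict[str, float]:
    """Parse Prometheus text exposition lines into {series: value}.

    Simplified on purpose: comment lines (# HELP, # TYPE) are skipped and
    samples are assumed to carry no trailing timestamp.
    """
    samples: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated token on the line.
        series, value = line.rsplit(" ", 1)
        samples[series] = float(value)
    return samples


def scrape(host: str, timeout: float = 5.0) -> dict[str, float]:
    """Fetch and parse one scrape from a (hypothetical) host."""
    with urlopen(metrics_url(host), timeout=timeout) as response:
        return parse_samples(response.read().decode("utf-8"))
```

For example, `parse_samples('example_up 1\n')` yields a dictionary with one entry for the made-up series `example_up`. In a cluster, the ServiceMonitor normally does this scraping for you; the sketch is only useful for ad hoc checks.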
See details about the metrics and alerts in the sections that follow.
Metrics
Metric and description | Metric details | Indicator of |
---|---|---|
compliance_ Total number of history API calls. | Type: Counter | |
compliance_ Total number of validation requests currently being processed. | Type: Gauge | |
compliance_ Total number of completed validation calls. | Type: Counter | |
compliance_ Number of validation requests with Success status. | Type: Counter | |
compliance_ Number of validation requests with Failed status. | Type: Counter | |
compliance_ Number of validation requests by tenant with Success result. | Type: Counter | |
compliance_ Number of validation requests by tenant with Fail result. | Type: Counter | |
cxc_ Healthy instance. | Type: Gauge | |
cxc_ The latencies of all HTTP requests, distributed by method, path, and HTTP response code. | Type: Histogram | |
cxc_ The number of all HTTP requests, distributed by method, path, and HTTP response code. | Type: Counter | |
compliance_ Total number of Redis connections made. | Type: Counter | |
compliance_ Total number of Redis connections closed. The current number of open connections can be calculated by subtracting this value from compliance_redis_connections_made. | Type: Counter | |
compliance_ Total number of reported Redis errors. | Type: Counter | |
compliance_ Total number of calls placed by OCS, broken down by GSW_CALL_RESULT. | Type: Counter | |
cxc_ Total outbound requests by verb, destination, and code. | Type: Counter | |
cxc_ Outbound request latencies histogram by verb, destination, and code, in milliseconds. | Type: Histogram | |
cxc_ Elasticsearch request latencies histogram by verb, destination, and code, in milliseconds. | Type: Histogram | |
cxc_ Total number of validation requests rejected because a rate limit was exceeded, broken down by customer (tenant) and limit reason {device, customerId, overall}. | Type: Counter | |
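The Redis connection counters can be combined to estimate how many connections are open right now. The sketch below shows that arithmetic under simple assumptions: the two inputs are the latest values of compliance_redis_connections_made and of the connections-closed counter, read from the same instance at the same scrape. The function name is hypothetical.

```python
def current_redis_connections(made_total: float, closed_total: float) -> float:
    """Estimate open Redis connections from the two cumulative counters.

    made_total   -- latest value of compliance_redis_connections_made
    closed_total -- latest value of the connections-closed counter
    """
    current = made_total - closed_total
    # Defensive clamp: values read from different scrapes or across a
    # restart can briefly make the difference negative.
    return max(current, 0.0)
```

With 120 connections made and 117 closed, the estimate is 3 open connections. A sustained result of 0 corresponds to the condition that a "no active connections" alert would watch for.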
Alerts
The following alerts are defined for Compliance Manager.
Alert | Severity | Description | Based on | Threshold |
---|---|---|---|---|
CXC-Compliance-LatencyHigh | HIGH | Triggered when the latency for API responses is beyond the defined threshold. | | 5000ms for 5m |
CXC-CoM-Redis-no-active-connections | HIGH | Triggered when CX Contact Compliance has no active Redis connection for 2 minutes. | | 2m |
CXC-CPUUsage | HIGH | Triggered when the CPU utilization of a pod is beyond the threshold. | | 300% for 5m |
CXC-MemoryUsage | HIGH | Triggered when the memory utilization of a pod is beyond the threshold. | | 70% for 5m |
CXC-PodNotReadyCount | HIGH | Triggered when the number of pods ready for a CX Contact deployment is less than or equal to the threshold. | | 1 for 5m |
CXC-PodRestartsCount | HIGH | Triggered when the restart count for a pod is beyond the threshold. | | 1 for 5m |
CXC-MemoryUsagePD | HIGH | Triggered when the memory usage of a pod is above the critical threshold. | | 90% for 5m |
CXC-PodRestartsCountPD | HIGH | Triggered when the restart count is beyond the critical threshold. | | 5 for 5m |
CXC-PodsNotReadyPD | HIGH | Triggered when there are no pods ready for a CX Contact deployment. | | 0 for 1m |
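Thresholds written as "value for duration" (for example, 70% for 5m) mean that the condition must hold continuously for the whole duration before the alert fires. The sketch below illustrates that reading for "above threshold" alerts only. It assumes Prometheus-style "for" semantics, which this page does not state explicitly, and the `Sample` type and `breaches_for` function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    timestamp: float  # seconds since an arbitrary start
    value: float      # observed value, e.g. memory utilization in percent


def breaches_for(samples: list[Sample], threshold: float,
                 duration_seconds: float) -> bool:
    """Return True if the most recent samples have all been above the
    threshold for at least duration_seconds without interruption."""
    if not samples:
        return False
    ordered = sorted(samples, key=lambda s: s.timestamp)
    latest = ordered[-1]
    run_start = None
    # Walk backwards from the newest sample until the condition breaks.
    for sample in reversed(ordered):
        if sample.value <= threshold:
            break
        run_start = sample.timestamp
    if run_start is None:
        return False  # the newest sample is not above the threshold
    return latest.timestamp - run_start >= duration_seconds
```

For a memory alert of 70% for 5m, one sample per minute at 75% over five minutes satisfies `breaches_for(samples, 70, 300)`, while a single dip to 60% in the middle restarts the clock and the check returns False.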