UCS metrics and alerts
Find the metrics UCS exposes and the alerts defined for UCS.
Service | CRD or annotations? | Port | Endpoint/Selector | Metrics update interval |
---|---|---|---|---|
UCS | ServiceMonitor / podAnnotations | 10052 | ucsx.ucsx.svc.cluster.local:10052/metrics | 30 seconds |
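As a quick way to check what the endpoint in the table returns, the sketch below reads Prometheus text-format output and collects the samples into a dictionary. It is an illustration only: the `http://` scheme, the helper names `parse_metrics` and `fetch_metrics`, the `method` label, and every value in `SAMPLE` are assumptions, and the metric names are borrowed from the "Based on" column of the alerts table.

```python
import urllib.request
from typing import Dict

# Host, port, and path come from the table; the http:// scheme is an assumption.
METRICS_URL = "http://ucsx.ucsx.svc.cluster.local:10052/metrics"


def parse_metrics(text: str) -> Dict[str, float]:
    """Map each series (name plus optional labels) to its sample value."""
    samples: Dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Labeled series: keep everything up to the closing brace as the key.
            cut = line.rindex("}") + 1
            series, rest = line[:cut], line[cut:].split()
        else:
            series, *rest = line.split()
        if rest:
            # The first token after the series is the value; an optional
            # timestamp may follow and is ignored here.
            samples[series] = float(rest[0])
    return samples


def fetch_metrics(url: str = METRICS_URL, timeout_s: float = 5.0) -> Dict[str, float]:
    """Fetch and parse the endpoint; works only from inside the cluster network."""
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return parse_metrics(resp.read().decode("utf-8"))


# Hypothetical payload for offline experimentation; values are made up.
SAMPLE = """\
# TYPE ucsx_performance gauge
ucsx_performance 37.5
# TYPE ucsx_memory gauge
ucsx_memory 412
ucsx_http_request_duration_count{method="GET"} 1200
"""

parsed = parse_metrics(SAMPLE)
```

Calling `fetch_metrics()` every 30 seconds would match the update interval listed in the table; polling more often is unlikely to return fresher values.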
Metrics
Metric and description | Metric details | Indicator of |
---|---|---|
ucsx_ Basic metric, created when the metrics service starts; cannot be disabled. | Unit: Percentage, Type: Gauge | CPU usage |
ucsx_ Resident Set Size | Unit: MB, Type: Gauge | |
ucsx_ Total number of all HTTP requests | Unit: Integer, Type: Histogram | |
ucsx_ Total duration of all HTTP requests | Unit: Milliseconds, Type: Histogram | |
ucsx_ Total number of all Elasticsearch requests | Unit: Integer, Type: Histogram | |
ucsx_ Total duration of all Elasticsearch requests | Unit: Milliseconds, Type: Histogram | |
ucsx_ Tenant DB status | Unit: Integer, Type: Gauge | |
ucsx_ Elasticsearch status | Unit: Integer, Type: Gauge | |
ucsx_ Master DB status | Unit: Integer, Type: Gauge | |
ucsx_ Count of the overload protection events | Unit: Integer, Type: Counter | |
Alerts
The following alerts are defined for UCS.
Alert | Severity | Description | Based on | Threshold |
---|---|---|---|---|
ucsx_instance_high_cpu_utilization | warning | Triggered when average CPU usage is more than 80% | ucsx_performance | 5 minutes |
ucsx_instance_high_memory_usage | warning | Triggered when average memory usage is more than 800 MB | ucsx_memory | 5 minutes |
ucsx_instance_high_http_request_rate | warning | Triggered when the request rate is more than 120 requests per second on one UCS-X instance | ucsx_http_request_duration_count | 30 minutes |
ucsx_instance_slow_http_response | critical | Triggered when the average HTTP response time is more than 500 ms | ucsx_http_request_duration_sum, ucsx_http_request_duration_count | 5 minutes |
ucsx_elasticsearch_slow_processing_time | critical | Triggered when the Elasticsearch internal processing time is more than 500 ms | ucsx_elastic_search_sum, ucsx_elastic_search_count | 5 minutes |
ucsx_tenantdb_health_status | critical | Triggered when there is no connection to the tenant DB | ucsx_tenantdb_health_status | 2 minutes |
ucsx_elasticsearch_health_status | critical | Triggered when there is no connection to Elasticsearch | ucsx_elasticsearch_health_status | 2 minutes |
ucsx_masterdb_health_status | warning | Triggered when there is no connection to the master DB | ucsx_masterdb_health_status | 2 minutes |
ucsx_instance_overloaded | warning | Triggered when the overload protection rate is more than 0 | ucsx_overload_protection_count | 5 minutes |
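The two HTTP alerts combine the `_sum` and `_count` series of the request-duration histogram. The sketch below shows one plausible reading of that arithmetic: request rate is the change in `ucsx_http_request_duration_count` divided by elapsed time, and average response time is the change in `ucsx_http_request_duration_sum` divided by the change in count. The actual rule expressions are not shown in this document, so the `Snapshot` type, the function names, and the sample numbers are hypothetical; the sketch also assumes the counters do not reset between the two snapshots.

```python
from dataclasses import dataclass

# Thresholds taken from the alerts table.
MAX_REQUESTS_PER_SECOND = 120.0
MAX_AVG_RESPONSE_MS = 500.0


@dataclass
class Snapshot:
    """One reading of the HTTP duration histogram (hypothetical helper type)."""
    timestamp_s: float
    duration_sum_ms: float  # ucsx_http_request_duration_sum
    duration_count: float   # ucsx_http_request_duration_count


def request_rate(prev: Snapshot, curr: Snapshot) -> float:
    """Requests per second between two snapshots."""
    elapsed = curr.timestamp_s - prev.timestamp_s
    if elapsed <= 0:
        raise ValueError("snapshots must be in chronological order")
    return (curr.duration_count - prev.duration_count) / elapsed


def avg_response_ms(prev: Snapshot, curr: Snapshot) -> float:
    """Mean request duration in milliseconds between two snapshots."""
    requests = curr.duration_count - prev.duration_count
    if requests <= 0:
        return 0.0  # no traffic in the window, so nothing to average
    return (curr.duration_sum_ms - prev.duration_sum_ms) / requests


# Made-up readings taken 300 seconds (5 minutes) apart.
prev = Snapshot(timestamp_s=0.0, duration_sum_ms=0.0, duration_count=0.0)
curr = Snapshot(timestamp_s=300.0, duration_sum_ms=27_000_000.0, duration_count=45_000.0)

rate = request_rate(prev, curr)        # 45000 requests / 300 s
average = avg_response_ms(prev, curr)  # 27000000 ms / 45000 requests

high_request_rate = rate > MAX_REQUESTS_PER_SECOND
slow_http_response = average > MAX_AVG_RESPONSE_MS
```

With these sample readings both conditions hold. The table lists 30 minutes for the request-rate alert and 5 minutes for the slow-response alert, so in practice the two snapshots would be spaced differently for each check.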