UCS metrics and alerts
Find the metrics UCS exposes and the alerts defined for UCS.
| Service | CRD or annotations? | Port | Endpoint/Selector | Metrics update interval |
|---|---|---|---|---|
| UCS | ServiceMonitor / podAnnotations | 10052 | ucsx.ucsx.svc.cluster.local:10052/metrics | 30 seconds |
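As a rough illustration of how the endpoint in the table can be inspected by hand, the sketch below fetches the metrics page and parses it as Prometheus text exposition. The `http://` scheme, the function names `fetch_metrics` and `parse_metrics`, and the sample values are assumptions made for this example; only the host, port, and path come from the table. The parser is deliberately minimal and ignores optional timestamps.

```python
import urllib.request

# Host, port, and path come from the table above; the http:// scheme is an assumption.
METRICS_URL = "http://ucsx.ucsx.svc.cluster.local:10052/metrics"


def fetch_metrics(url=METRICS_URL, timeout=5.0):
    """Return the raw metrics text served at `url` (reachable only inside the cluster)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_metrics(text):
    """Parse simple exposition lines into {sample name (with labels): value}.

    Minimal sketch: comment lines are skipped and lines that carry a trailing
    timestamp are not handled.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.rpartition(" ")
        if not name:
            continue
        try:
            samples[name] = float(value)
        except ValueError:
            continue
    return samples


# Hypothetical sample payload, used here instead of a live scrape.
SAMPLE = """\
# TYPE ucsx_tenantdb_health_status gauge
ucsx_tenantdb_health_status 1
# TYPE ucsx_overload_protection_count counter
ucsx_overload_protection_count 0
"""

samples = parse_metrics(SAMPLE)
```

Inside the cluster, `parse_metrics(fetch_metrics())` would apply the same parsing to the live endpoint, assuming it is served over plain HTTP.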
Metrics
| Metric and description | Metric details | Indicator of |
|---|---|---|
| ucsx_ Basic metric, created when the metrics service starts; cannot be disabled. | Unit: Percentage Type: Gauge | CPU Usage |
| ucsx_ Resident Set Size | Unit: MB Type: Gauge | |
| ucsx_ Total number of all HTTP requests | Unit: Integer Type: Histogram | |
| ucsx_ Total duration of all HTTP requests | Unit: Milliseconds Type: Histogram | |
| ucsx_ Total number of all Elasticsearch requests | Unit: Integer Type: Histogram | |
| ucsx_ Total duration of all Elasticsearch requests | Unit: Milliseconds Type: Histogram | |
| ucsx_ Tenant DB status | Unit: Integer Type: Gauge | |
| ucsx_ Elasticsearch status | Unit: Integer Type: Gauge | |
| ucsx_ Master DB status | Unit: Integer Type: Gauge | |
| ucsx_ Count of the overload protection events | Unit: Integer Type: Counter | |
Alerts
The following alerts are defined for UCS.
| Alert | Severity | Description | Based on | Threshold |
|---|---|---|---|---|
| ucsx_instance_high_cpu_utilization | warning | Triggered when average CPU usage is more than 80% | ucsx_performance | 5 minutes |
| ucsx_instance_high_memory_usage | warning | Triggered when average memory usage is more than 800 MB | ucsx_memory | 5 minutes |
| ucsx_instance_high_http_request_rate | warning | Triggered when the request rate is more than 120 requests per second on one UCS-X instance | ucsx_http_request_duration_count | 30 minutes |
| ucsx_instance_slow_http_response | critical | Triggered when the average HTTP response time is more than 500 ms | ucsx_http_request_duration_sum, ucsx_http_request_duration_count | 5 minutes |
| ucsx_elasticsearch_slow_processing_time | critical | Triggered when the Elasticsearch internal processing time is more than 500 ms | ucsx_elastic_search_sum, ucsx_elastic_search_count | 5 minutes |
| ucsx_tenantdb_health_status | critical | Triggered when there is no connection to the tenant DB | ucsx_tenantdb_health_status | 2 minutes |
| ucsx_elasticsearch_health_status | critical | Triggered when there is no connection to Elasticsearch | ucsx_elasticsearch_health_status | 2 minutes |
| ucsx_masterdb_health_status | warning | Triggered when there is no connection to the master DB | ucsx_masterdb_health_status | 2 minutes |
| ucsx_instance_overloaded | warning | Triggered when the overload protection rate is more than 0 | ucsx_overload_protection_count | 5 minutes |
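The page does not show the alert expressions themselves, so the sketch below only approximates the arithmetic behind two of the alerts: the average response time derived from the `_sum` and `_count` series, and the request rate derived from `_count`. The helper names and all readings are hypothetical; the 500 ms and 120 requests per second limits and the 5-minute and 30-minute windows are taken from the table. It assumes the `_sum` series is in milliseconds and does not handle counter resets.

```python
def average_duration_ms(sum_start, sum_end, count_start, count_end):
    """Average duration over a window, from the deltas of _sum and _count."""
    requests = count_end - count_start
    if requests <= 0:
        return None  # no traffic in the window, so no meaningful average
    return (sum_end - sum_start) / requests


def request_rate_per_second(count_start, count_end, window_seconds):
    """Requests per second over a window, from the delta of _count."""
    return (count_end - count_start) / window_seconds


# Hypothetical readings of ucsx_http_request_duration_sum / _count, 5 minutes apart.
avg_ms = average_duration_ms(1_200_000.0, 1_500_000.0, 4_000, 4_500)
slow_http_response = avg_ms is not None and avg_ms > 500

# Hypothetical readings of ucsx_http_request_duration_count, 30 minutes apart.
rate = request_rate_per_second(100_000, 330_400, 30 * 60)
high_request_rate = rate > 120
```

With these made-up readings both conditions hold. The same two helpers would apply to `ucsx_elastic_search_sum` and `ucsx_elastic_search_count` for the Elasticsearch processing-time alert.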