Difference between revisions of "Draft: UCS/Current/UCSPEGuide/Metrics"

From Genesys Documentation

Revision as of 07:55, June 24, 2021

This is a draft page; the published version of this page can be found at UCS/Current/UCSPEGuide/Metrics.

Learn which metrics you should monitor for Universal Contact Service (UCS) and when to sound the alarm.


Describe metrics (compatible with Prometheus endpoints) that customers can use to create their own monitoring dashboard in a tool like Grafana.

Make sure to identify any metrics that are important to monitor for alarming purposes, and include sample thresholds the customer should alarm on.
UCS provides internal monitoring metrics through a Prometheus endpoint on port 10052.
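
As a rough illustration of consuming this endpoint, the sketch below parses Prometheus text-format samples into (name, labels, value) tuples. The sample text and its label values are invented for illustration, and the parser is deliberately simplified (it does not unescape label values).

```python
import re

# One sample line has the shape: name{label="value",...} value
_SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Parse Prometheus text exposition lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank, HELP and TYPE lines
            continue
        match = _SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, raw_value = match.groups()
        labels = dict(_LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(raw_value)))
    return samples


# Hypothetical scrape output; the label values are made up.
EXAMPLE = '''
# TYPE ucsx_performance gauge
ucsx_performance{metric="cpuUsage",nodeId="node-1",pid="42"} 12.5
ucsx_performance{metric="loopDelay",nodeId="node-1",pid="42"} 3
'''

for name, labels, value in parse_samples(EXAMPLE):
    print(name, labels["metric"], value)
```

In a live setup the text would come from an HTTP GET against port 10052. The exact path (commonly `/metrics` for Prometheus endpoints) is an assumption here and is not stated on this page.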


Common Performance Metrics

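| Name (type) | Key | Units | Additional keys | Alarm condition | Description | Common description |
|---|---|---|---|---|---|---|
| ucsx_performance (Gauge) | metric="cpuUsage" | % | 'nodeId', 'pid' | > 95 | CPU usage | Basic metric, created when the metrics service starts; cannot be disabled |
| ucsx_performance (Gauge) | metric="loopDelay" | ms | 'nodeId', 'pid' | | Event loop lag | Basic metric, created when the metrics service starts; cannot be disabled |
| ucsx_internal_queue (Gauge) | metric="queueSize" | | 'nodeId', 'pid', 'endpoint' | | Number of requests waiting for processing | |
| ucsx_internal_queue (Gauge) | metric="queueDelay" | ms | 'nodeId', 'pid', 'endpoint' | > 10000 | Time requests spend waiting for processing | |
| ucsx_timings (Gauge) | metric="cpuTime" | s | 'nodeId', 'pid' | | CPU time used by the process | Basic metric, created when the metrics service starts; cannot be disabled |
| ucsx_timings (Gauge) | metric="sysTime" | s | 'nodeId', 'pid' | | System mode time | Basic metric, created when the metrics service starts; cannot be disabled |
| ucsx_timings (Gauge) | metric="userTime" | s | 'nodeId', 'pid' | | User mode time | Basic metric, created when the metrics service starts; cannot be disabled |
| ucsx_timings (Gauge) | metric="upTime" | s | 'nodeId', 'pid' | | Process running time | Basic metric, created when the metrics service starts; cannot be disabled |
| ucsx_memory (Gauge) | metric="rss" | MB | 'nodeId', 'pid' | > 1024 | Resident Set Size | Basic metric, created when the metrics service starts; cannot be disabled |
| ucsx_memory (Gauge) | metric="heapTotal" | MB | 'nodeId', 'pid' | | | Basic metric, created when the metrics service starts; cannot be disabled |
| ucsx_memory (Gauge) | metric="heapUsed" | MB | 'nodeId', 'pid' | | | Basic metric, created when the metrics service starts; cannot be disabled |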
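
The alarm conditions listed above can be checked mechanically. The following is a minimal sketch, assuming samples have already been parsed into (name, labels, value) tuples; the function name and the sample readings are hypothetical.

```python
# Thresholds copied from the "Alarm condition" column above.
ALARM_THRESHOLDS = {
    ("ucsx_performance", "cpuUsage"): 95,          # percent
    ("ucsx_internal_queue", "queueDelay"): 10000,  # milliseconds
    ("ucsx_memory", "rss"): 1024,                  # megabytes
}


def find_alarms(samples):
    """Return (name, metric, value, threshold) for every sample above its threshold.

    `samples` is an iterable of (name, labels, value) tuples, where `labels`
    is a dict that carries the "metric" key.
    """
    alarms = []
    for name, labels, value in samples:
        threshold = ALARM_THRESHOLDS.get((name, labels.get("metric")))
        if threshold is not None and value > threshold:
            alarms.append((name, labels["metric"], value, threshold))
    return alarms


# Made-up readings for illustration only.
readings = [
    ("ucsx_performance", {"metric": "cpuUsage", "nodeId": "n1", "pid": "42"}, 97.0),
    ("ucsx_internal_queue", {"metric": "queueDelay", "nodeId": "n1", "pid": "42"}, 250.0),
    ("ucsx_memory", {"metric": "rss", "nodeId": "n1", "pid": "42"}, 1500.0),
]
for alarm in find_alarms(readings):
    print(alarm)
```

In practice these checks would more likely live in Prometheus or Grafana alerting rules; the Python form only makes the comparison explicit.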

Database Connection Metrics

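| Name (type) | Key | Units | Additional keys | Alarm condition | Description | Common description |
|---|---|---|---|---|---|---|
| ucsx_dbPool (Gauge) | metric="total" | | 'nodeId', 'pid', 'ccId', 'address' | | Total count of connections to a particular database (defined by key 'ccId') | Basic metric, created when the call-center-storage service starts; cannot be disabled. Shows connection pool usage |
| ucsx_dbPool (Gauge) | metric="idle" | | 'nodeId', 'pid', 'ccId', 'address' | | Total count of idle connections to a particular database (defined by key 'ccId') | Basic metric, created when the call-center-storage service starts; cannot be disabled. Shows connection pool usage |
| ucsx_dbPool (Gauge) | metric="wait" | | 'nodeId', 'pid', 'ccId', 'address' | | Total count of connections in 'waiting' state to a particular database (defined by key 'ccId') | Basic metric, created when the call-center-storage service starts; cannot be disabled. Shows connection pool usage |
| ucsx_dbPool (Gauge) | metric="max" | | 'nodeId', 'pid', 'ccId', 'address' | | Maximum number of connections available for this database instance | Basic metric, created when the call-center-storage service starts; cannot be disabled. Shows connection pool usage |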
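
One way to turn the pool gauges into a single dashboard value is to compute how much of the pool is busy. This is a sketch only: the function names are hypothetical, and reading 'wait' as "requests queued for a connection" is an assumption about the metric's meaning.

```python
def pool_utilization(total, idle, max_connections):
    """Fraction of the pool's maximum that is currently busy (total - idle)."""
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    return (total - idle) / max_connections


def pool_is_saturated(total, idle, wait, max_connections):
    """True when every allowed connection is busy and something is waiting.

    Assumes `wait` counts requests queued for a free connection.
    """
    return wait > 0 and (total - idle) >= max_connections


# Made-up gauge values for one 'ccId'.
print(pool_utilization(total=8, idle=2, max_connections=10))
print(pool_is_saturated(total=10, idle=0, wait=3, max_connections=10))
```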

HTTP Request Metrics

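| Name (type) | Key | Units | Additional keys | Additional configuration key | Alarm condition | Description | Common description |
|---|---|---|---|---|---|---|---|
| ucsx_http_request_duration_bucket (Histogram) | le="10" | ms | 'nodeId', 'pid', 'endpoint', 'url', 'code', 'method', 'ccId', 'ccName' | Set /rest-gws/metrics/enabled = false to disable | > 10000 | Count of HTTP requests with a duration of 10 ms or less | Metrics of HTTP request duration |
| ucsx_http_request_duration_bucket (Histogram) | le="50" | ms | 'nodeId', 'pid', 'endpoint', 'url', 'code', 'method', 'ccId', 'ccName' | Set /rest-gws/metrics/enabled = false to disable | | Count of HTTP requests with a duration of 50 ms or less | Metrics of HTTP request duration |
| ucsx_http_request_duration_bucket (Histogram) | le="200" | ms | 'nodeId', 'pid', 'endpoint', 'url', 'code', 'method', 'ccId', 'ccName' | Set /rest-gws/metrics/enabled = false to disable | | Count of HTTP requests with a duration of 200 ms or less | Metrics of HTTP request duration |
| ucsx_http_request_duration_bucket (Histogram) | le="1000" | ms | 'nodeId', 'pid', 'endpoint', 'url', 'code', 'method', 'ccId', 'ccName' | Set /rest-gws/metrics/enabled = false to disable | | Count of HTTP requests with a duration of 1000 ms or less | Metrics of HTTP request duration |
| ucsx_http_request_duration_bucket (Histogram) | le="+Inf" | ms | 'nodeId', 'pid', 'endpoint', 'url', 'code', 'method', 'ccId', 'ccName' | Set /rest-gws/metrics/enabled = false to disable | | Count of all HTTP requests | Metrics of HTTP request duration |
| ucsx_http_request_duration_sum (Histogram) | | ms | 'nodeId', 'pid', 'endpoint', 'url', 'code', 'method', 'ccId', 'ccName' | Set /rest-gws/metrics/enabled = false to disable | | Total duration of all HTTP requests | Metrics of HTTP request duration |
| ucsx_http_request_duration_count (Histogram) | | | 'nodeId', 'pid', 'endpoint', 'url', 'code', 'method', 'ccId', 'ccName' | Set /rest-gws/metrics/enabled = false to disable | | Total number of all HTTP requests | Metrics of HTTP request duration |
| ucsx_http_total_requests_count (Counter) | | | 'nodeId', 'pid', 'endpoint' | | | Cumulative counter of all HTTP requests | |
| ucsx_http_active_requests_count (Gauge) | | | 'nodeId', 'pid', 'endpoint' | | | Count of all incomplete requests | |
| ucsx_http_requests_per_second (Gauge) | | requests/s | 'nodeId', 'pid', 'endpoint' | | | Rate of HTTP request processing | |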
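
Because the `le` buckets are cumulative, the histogram can be reduced to an average and an approximate percentile. The sketch below uses linear interpolation within a bucket, the same idea as PromQL's `histogram_quantile`; the function names and the bucket counts are made up for illustration.

```python
import math


def mean_duration_ms(duration_sum, duration_count):
    """Average request duration: the _sum series divided by the _count series."""
    return duration_sum / duration_count if duration_count else 0.0


def estimate_quantile(q, buckets):
    """Estimate the q-quantile (0 < q <= 1) from cumulative histogram buckets.

    `buckets` is a list of (upper_bound_ms, cumulative_count) pairs sorted by
    bound and ending with (math.inf, total).
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                return lower_bound  # cannot interpolate inside the +Inf bucket
            in_bucket = count - lower_count
            return lower_bound + (upper_bound - lower_bound) * (rank - lower_count) / in_bucket
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Hypothetical cumulative counts for le="10", "50", "200", "1000", "+Inf".
buckets = [(10, 50), (50, 80), (200, 95), (1000, 99), (math.inf, 100)]
print(mean_duration_ms(4200, 100))     # 42.0
print(estimate_quantile(0.5, buckets)) # 10.0
print(estimate_quantile(0.9, buckets))
```

With only four finite bucket bounds the percentile is coarse; it is best read as "which bucket the percentile falls in" rather than as a precise latency.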

CometD Metrics

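| Name (type) | Key | Units | Additional keys | Additional configuration key | Alarm condition | Description | Common description |
|---|---|---|---|---|---|---|---|
| ucsx_cometd_request_duration_bucket (Histogram) | le="10" | ms | 'nodeId', 'pid', 'endpoint', 'url', 'code', 'method', 'ccId', 'ccName' | Set /rest-gws-cometd/metrics/enabled = false to disable | > 10000 | Count of CometD requests with a duration of 10 ms or less | Metrics of CometD request duration |
| ucsx_cometd_request_duration_bucket (Histogram) | le="50" | ms | 'nodeId', 'pid', 'endpoint', 'url', 'code', 'method', 'ccId', 'ccName' | Set /rest-gws-cometd/metrics/enabled = false to disable | | Count of CometD requests with a duration of 50 ms or less | Metrics of CometD request duration |
| ucsx_cometd_request_duration_bucket (Histogram) | le="200" | ms | 'nodeId', 'pid', 'endpoint', 'url', 'code', 'method', 'ccId', 'ccName' | Set /rest-gws-cometd/metrics/enabled = false to disable | | Count of CometD requests with a duration of 200 ms or less | Metrics of CometD request duration |
| ucsx_cometd_request_duration_bucket (Histogram) | le="1000" | ms | 'nodeId', 'pid', 'endpoint', 'url', 'code', 'method', 'ccId', 'ccName' | Set /rest-gws-cometd/metrics/enabled = false to disable | | Count of CometD requests with a duration of 1000 ms or less | Metrics of CometD request duration |
| ucsx_cometd_request_duration_bucket (Histogram) | le="+Inf" | ms | 'nodeId', 'pid', 'endpoint', 'url', 'code', 'method', 'ccId', 'ccName' | Set /rest-gws-cometd/metrics/enabled = false to disable | | Count of all CometD requests | Metrics of CometD request duration |
| ucsx_cometd_request_duration_sum (Histogram) | | ms | 'nodeId', 'pid', 'endpoint', 'url', 'code', 'method', 'ccId', 'ccName' | Set /rest-gws-cometd/metrics/enabled = false to disable | | Total duration of all CometD requests | Metrics of CometD request duration |
| ucsx_cometd_request_duration_count (Histogram) | | | 'nodeId', 'pid', 'endpoint', 'url', 'code', 'method', 'ccId', 'ccName' | Set /rest-gws-cometd/metrics/enabled = false to disable | | Total number of all CometD requests | Metrics of CometD request duration |
| ucsx_cometd_requests_total (Counter) | | | 'nodeId', 'pid', 'endpoint' | | | Cumulative counter of all CometD requests | |
| ucsx_cometd_active_requests_count (Gauge) | | | 'nodeId', 'pid', 'endpoint' | | | Count of all incomplete requests | |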
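
No per-second gauge is listed for CometD, so a request rate has to be derived from the cumulative counter. The sketch below shows one way to do that between two scrapes; the function name is hypothetical, and treating a decrease as a process restart is an assumption (Prometheus itself handles counter resets in a similar spirit).

```python
def counter_rate(prev_value, prev_time, curr_value, curr_time):
    """Per-second increase of a cumulative counter between two scrapes.

    If the counter went down, the process is assumed to have restarted, and
    the current value is treated as the whole increase since the restart.
    Times are in seconds.
    """
    elapsed = curr_time - prev_time
    if elapsed <= 0:
        raise ValueError("scrapes must be in chronological order")
    if curr_value >= prev_value:
        increase = curr_value - prev_value
    else:
        increase = curr_value  # counter reset
    return increase / elapsed


# Made-up readings of ucsx_cometd_requests_total taken 60 seconds apart.
print(counter_rate(1200, 0, 1500, 60))   # 5.0
print(counter_rate(1500, 60, 30, 120))   # 0.5 (after a reset)
```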

Internal Functions Calls Monitoring

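| Name (type) | Key | Units | Additional keys | Additional configuration key | Description | Common description |
|---|---|---|---|---|---|---|
| ucsx_internal_calls_bucket (Histogram) | le="10" | ms | 'nodeId', 'pid', 'function' | Set /${serviceName}/callMetrics = false to disable | Count of internal calls with a duration of 10 ms or less | Duration metrics for service method calls |
| ucsx_internal_calls_bucket (Histogram) | le="50" | ms | 'nodeId', 'pid', 'function' | Set /${serviceName}/callMetrics = false to disable | Count of internal calls with a duration of 50 ms or less | Duration metrics for service method calls |
| ucsx_internal_calls_bucket (Histogram) | le="200" | ms | 'nodeId', 'pid', 'function' | Set /${serviceName}/callMetrics = false to disable | Count of internal calls with a duration of 200 ms or less | Duration metrics for service method calls |
| ucsx_internal_calls_bucket (Histogram) | le="1000" | ms | 'nodeId', 'pid', 'function' | Set /${serviceName}/callMetrics = false to disable | Count of internal calls with a duration of 1000 ms or less | Duration metrics for service method calls |
| ucsx_internal_calls_bucket (Histogram) | le="+Inf" | ms | 'nodeId', 'pid', 'function' | Set /${serviceName}/callMetrics = false to disable | Count of all internal calls | Duration metrics for service method calls |
| ucsx_internal_calls_sum (Histogram) | | | 'nodeId', 'pid', 'function' | Set /${serviceName}/callMetrics = false to disable | Total duration of all calls of the function | Duration metrics for service method calls |
| ucsx_internal_calls_count (Histogram) | | | 'nodeId', 'pid', 'function' | Set /${serviceName}/callMetrics = false to disable | Total number of all calls of the function | Duration metrics for service method calls |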
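
To spot slow functions, the cumulative buckets can be turned into a "share of calls slower than N ms" figure per 'function' key. This is a minimal sketch with a hypothetical function name and invented counts.

```python
import math


def slow_call_fraction(bucket_counts, threshold_ms=1000.0):
    """Fraction of calls slower than threshold_ms, from cumulative buckets.

    `bucket_counts` maps an upper bound in ms (math.inf for le="+Inf") to the
    cumulative count reported for that bucket. `threshold_ms` must be one of
    the bucket bounds.
    """
    total = bucket_counts[math.inf]
    if total == 0:
        return 0.0
    return (total - bucket_counts[threshold_ms]) / total


# Made-up cumulative counts for a single value of the 'function' key.
counts = {10.0: 900, 50.0: 960, 200.0: 990, 1000.0: 998, math.inf: 1000}
print(slow_call_fraction(counts))          # share slower than 1000 ms
print(slow_call_fraction(counts, 200.0))   # share slower than 200 ms
```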

SQL Monitoring

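| Name (type) | Key | Units | Additional keys | Description | Common description |
|---|---|---|---|---|---|
| ucsx_sql_bucket (Histogram) | le="10" | ms | 'nodeId', 'pid', 'operation', 'ccId', 'ccName' | Count of SQL requests with a duration of 10 ms or less | Metrics of raw SQL request duration. Key 'operation' can have the values 'update', 'insert', 'delete', 'query', 'rawSql' |
| ucsx_sql_bucket (Histogram) | le="50" | ms | 'nodeId', 'pid', 'operation', 'ccId', 'ccName' | Count of SQL requests with a duration of 50 ms or less | Metrics of raw SQL request duration. Key 'operation' can have the values 'update', 'insert', 'delete', 'query', 'rawSql' |
| ucsx_sql_bucket (Histogram) | le="200" | ms | 'nodeId', 'pid', 'operation', 'ccId', 'ccName' | Count of SQL requests with a duration of 200 ms or less | Metrics of raw SQL request duration. Key 'operation' can have the values 'update', 'insert', 'delete', 'query', 'rawSql' |
| ucsx_sql_bucket (Histogram) | le="1000" | ms | 'nodeId', 'pid', 'operation', 'ccId', 'ccName' | Count of SQL requests with a duration of 1000 ms or less | Metrics of raw SQL request duration. Key 'operation' can have the values 'update', 'insert', 'delete', 'query', 'rawSql' |
| ucsx_sql_bucket (Histogram) | le="+Inf" | ms | 'nodeId', 'pid', 'operation', 'ccId', 'ccName' | Count of all SQL requests | Metrics of raw SQL request duration. Key 'operation' can have the values 'update', 'insert', 'delete', 'query', 'rawSql' |
| ucsx_sql_sum (Histogram) | | | 'nodeId', 'pid', 'operation', 'ccId', 'ccName' | Total duration of all raw SQL requests | Metrics of raw SQL request duration. Key 'operation' can have the values 'update', 'insert', 'delete', 'query', 'rawSql' |
| ucsx_sql_count (Histogram) | | | 'nodeId', 'pid', 'operation', 'ccId', 'ccName' | Total number of all raw SQL requests | Metrics of raw SQL request duration. Key 'operation' can have the values 'update', 'insert', 'delete', 'query', 'rawSql' |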

Internal Cache Monitoring

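| Name (type) | Key | Units | Additional keys | Description | Common description |
|---|---|---|---|---|---|
| ucsx_cache (Counter) | method="set" | | 'nodeId', 'pid', 'key' | Count of writes to the cache | Metrics of the internal application cache |
| ucsx_cache (Counter) | method="get" | | 'nodeId', 'pid', 'key' | Count of successful reads from the cache | Metrics of the internal application cache |
| ucsx_cache (Counter) | method="delete" | | 'nodeId', 'pid', 'key' | Count of delete operations on the cache | Metrics of the internal application cache |
| ucsx_cache (Counter) | method="expired" | | 'nodeId', 'pid', 'key' | Count of unsuccessful reads from the cache due to data expiration | Metrics of the internal application cache |
| ucsx_cache (Counter) | method="remove" | | 'nodeId', 'pid', 'key' | Count of remove operations triggered by timer | Metrics of the internal application cache |
| ucsx_cache (Counter) | method="miss" | | 'nodeId', 'pid', 'key' | Count of unsuccessful reads from the cache due to data absence | Metrics of the internal application cache |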
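
A cache hit ratio follows from the read-related counters: 'get' counts successful reads, while 'miss' and 'expired' count unsuccessful ones. The sketch below assumes the counters have been collected into a dict keyed by the `method` value; the function name and the numbers are illustrative.

```python
def cache_hit_ratio(counts):
    """Share of cache reads that succeeded.

    `counts` maps the `method` key value to its counter reading. Successful
    reads are "get"; unsuccessful reads are "miss" plus "expired".
    """
    hits = counts.get("get", 0)
    failed = counts.get("miss", 0) + counts.get("expired", 0)
    reads = hits + failed
    return hits / reads if reads else 0.0


# Made-up counter readings for one cache 'key'.
print(cache_hit_ratio({"set": 500, "get": 900, "miss": 80, "expired": 20}))  # 0.9
```

Since these are cumulative counters, a ratio over a recent window would use the difference between two scrapes rather than the raw values.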

Elasticsearch Monitoring

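| Name (type) | Key | Units | Additional keys | Alarm condition | Description | Common description |
|---|---|---|---|---|---|---|
| ucsx_elastic_search_bucket (Histogram) | le="10" | ms | 'nodeId', 'pid', 'operation', 'ccId', 'ccName' | | Count of ES requests with a duration of 10 ms or less | Metrics of Elasticsearch request duration. Key 'operation' can have the values 'read', 'write' |
| ucsx_elastic_search_bucket (Histogram) | le="50" | ms | 'nodeId', 'pid', 'operation', 'ccId', 'ccName' | | Count of ES requests with a duration of 50 ms or less | Metrics of Elasticsearch request duration. Key 'operation' can have the values 'read', 'write' |
| ucsx_elastic_search_bucket (Histogram) | le="200" | ms | 'nodeId', 'pid', 'operation', 'ccId', 'ccName' | | Count of ES requests with a duration of 200 ms or less | Metrics of Elasticsearch request duration. Key 'operation' can have the values 'read', 'write' |
| ucsx_elastic_search_bucket (Histogram) | le="1000" | ms | 'nodeId', 'pid', 'operation', 'ccId', 'ccName' | | Count of ES requests with a duration of 1000 ms or less | Metrics of Elasticsearch request duration. Key 'operation' can have the values 'read', 'write' |
| ucsx_elastic_search_bucket (Histogram) | le="+Inf" | ms | 'nodeId', 'pid', 'operation', 'ccId', 'ccName' | | Count of all ES requests | Metrics of Elasticsearch request duration. Key 'operation' can have the values 'read', 'write' |
| ucsx_elastic_search_sum (Histogram) | | | 'nodeId', 'pid', 'operation', 'ccId', 'ccName' | | Total duration of all Elasticsearch requests | Metrics of Elasticsearch request duration. Key 'operation' can have the values 'read', 'write' |
| ucsx_elastic_search_count (Histogram) | | | 'nodeId', 'pid', 'operation', 'ccId', 'ccName' | | Total number of all Elasticsearch requests | Metrics of Elasticsearch request duration. Key 'operation' can have the values 'read', 'write' |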

Session Metrics

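| Name (type) | Key | Units | Additional keys | Description | Common description |
|---|---|---|---|---|---|
| ucsx_sessions (Gauge) | state="active" | counter | 'nodeId', 'pid', 'state', 'session' | Count of sessions that have had some activity within the configured time interval | Available session types are 'HTTP' and 'cometD' |
| ucsx_sessions (Gauge) | state="idle" | counter | 'nodeId', 'pid', 'state', 'session' | Count of sessions that have had no activity within the configured time interval but are still alive | Available session types are 'HTTP' and 'cometD' |
| ucsx_sessions (Gauge) | state="newPerMinute" | counter | 'nodeId', 'pid', 'state', 'session' | Count of sessions opened during the last minute | Available session types are 'HTTP' and 'cometD' |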