Digital Channels metrics and alerts

This topic is part of the manual Digital Channels Private Edition Guide for version Current of Digital Channels.


Find the metrics Digital Channels exposes and the alerts defined for Digital Channels.

Service: Digital Channels
CRD or annotations?: Both (ServiceMonitor and annotations)
Port: 4004
Endpoint/Selector: nexus.nexus.svc.cluster.local/metrics
Metrics update interval: 15 seconds
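
To spot-check what the endpoint returns without going through Prometheus, the scrape can be reproduced by hand. The following is a minimal sketch, not a Genesys-provided tool: it assumes the endpoint is served over plain HTTP on the listed port, that it returns the standard Prometheus text exposition format, and that the code runs somewhere the cluster-internal host name resolves. The names METRICS_URL, parse_metrics, and fetch_metrics are illustrative only.

```python
import re
from urllib.request import urlopen

# Assumption: plain HTTP on the port and path listed above; only reachable
# from inside the cluster (or through a port-forward).
METRICS_URL = "http://nexus.nexus.svc.cluster.local:4004/metrics"

# Matches `name{labels} value` or `name value`; a trailing timestamp is ignored.
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)')


def parse_metrics(text):
    """Return (name, labels_string, value) tuples from exposition-format text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = _SAMPLE.match(line)
        if match:
            name, labels, value = match.groups()
            samples.append((name, labels or "", float(value)))
    return samples


def fetch_metrics(url=METRICS_URL, timeout=5):
    """Fetch and parse the endpoint (hypothetical helper; needs cluster access)."""
    with urlopen(url, timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))
```

Because the metrics update interval is 15 seconds, polling more often than that is unlikely to show new values.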


Metrics

Digital Channels exposes many Genesys-defined metrics. You can query Prometheus directly to see all the available metrics. The metrics documented on this page are the ones most likely to be useful. Genesys does not commit to maintaining other currently available Digital Channels metrics that are not documented on this page.
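
One way to list the available metrics is to ask the Prometheus HTTP API for the values of the `__name__` label, restricted to a name prefix. The sketch below assumes a Prometheus server reachable at a base URL you supply (the host in the usage line is a placeholder, not a real address), and the helper names are illustrative. It uses the `nexus_` prefix by default; pass a different prefix for metrics that are named differently.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def metric_names_url(prometheus_base, prefix="nexus_"):
    """Build a label-values API URL limited to metric names with one prefix."""
    selector = '{__name__=~"%s.*"}' % prefix
    query = urlencode({"match[]": selector})
    return prometheus_base.rstrip("/") + "/api/v1/label/__name__/values?" + query


def list_metric_names(prometheus_base, prefix="nexus_", timeout=10):
    """Return the sorted metric names Prometheus currently holds for the prefix."""
    url = metric_names_url(prometheus_base, prefix)
    with urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    if payload.get("status") != "success":
        raise RuntimeError("Prometheus query failed: %r" % (payload,))
    return sorted(payload["data"])


# Usage (placeholder host):
# for name in list_metric_names("http://prometheus.example:9090"):
#     print(name)
```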

Metric and description Metric details Indicator of
nexus_errors_total

The total number of requests that resulted in an error.

Unit:

Type: Number
Label:
Sample value: 100

nexus_request_total

The total number of requests.

Unit:

Type: Number
Label:
Sample value: 1000

nexus_process_resident_memory_bytes

The total bytes of memory that Digital Channels has consumed.

Unit:

Type: Number
Label:
Sample value: 100000

nexus_redis_connections_established

The current number of established Redis connections.

Unit:

Type: Gauge
Label:
Sample value: 0

nexus_redis_connections_reconnecting

The current number of reconnecting Redis connections.

Unit:

Type: Gauge
Label:
Sample value: 0

nexus_redis_connections_ready

The current number of ready Redis connections.

Unit:

Type: Gauge
Label:
Sample value: 1

nexus_redis_duration_until_ready

The duration until Redis reaches the ready state.

Unit:

Type: Histogram
Label: 'le'
Sample value: 0, 1, 39

nexus_redis_errors_total

The total number of Redis connection errors.

Unit:

Type: Counter
Label:
Sample value: 0

nexus_db_connect_total

The total number of all database connection requests.

Unit:

Type: Counter
Label: 'db'
Sample value: 1252424, 1457770

nexus_db_disconnect_total

The total number of all database disconnection requests.

Unit:

Type: Counter
Label: 'db'
Sample value: 1252424, 1457770

nexus_db_request_total

The total number of all database requests sent.

Unit:

Type: Counter
Label: 'db'
Sample value: 4850730, 5056452

nexus_db_success_total

The total number of all database requests executed successfully.

Unit:

Type: Counter
Label: 'db', 'command'
Sample value: 2307896, 2126805, 1221394, 1450355

nexus_db_errors_total

The total number of all database errors.

Unit:

Type: Counter
Label: 'db', 'code'
Sample value: 131, 5, 4

nexus_db_request_duration_milliseconds

The database transaction duration.

Unit:

Type: histogram
Label: 'le', 'db', 'method'
Sample value: 2290844, 2306385, 2307241, 2307894

iwd_process_cpu_user_seconds_total

The total user CPU time spent, in seconds.

Unit:

Type: counter
Label:
Sample value: 1634045655571

nexus_process_cpu_system_seconds_total

The total system CPU time spent, in seconds.

Unit:

Type: counter
Label:
Sample value: 1634045655571

nexus_process_cpu_seconds_total

The total user and system CPU time spent, in seconds.

Unit:

Type: counter
Label:
Sample value: 1634045655571

nexus_process_start_time_seconds

The start time of the process since the Unix epoch, in seconds.

Unit:

Type: gauge
Label:
Sample value: 1633992102

nexus_process_resident_memory_bytes

The resident memory size, in bytes.

Unit:

Type: gauge
Label:
Sample value: 1634045655572

nexus_process_virtual_memory_bytes

The virtual memory size, in bytes.

Unit:

Type: gauge
Label:
Sample value: 1634045655572

nexus_process_heap_bytes

The process heap size, in bytes.

Unit:

Type: gauge
Label:
Sample value: 1634045655572

nexus_process_open_fds

The number of open file descriptors.

Unit:

Type: gauge
Label:
Sample value: 1634045655572

nexus_process_max_fds

The maximum number of open file descriptors.

Unit:

Type: gauge
Label:
Sample value: 197176

iwd_nodejs_eventloop_lag_seconds

The Node.js event loop lag, in seconds.

Unit:

Type: gauge
Label:
Sample value: 1634045655572

nexus_nodejs_active_handles

The number of active libuv handles, grouped by handle type. Every handle type is a C++ class name.

Unit:

Type: gauge
Label: 'type'
Sample value: 17, 1, 69

nexus_nodejs_active_handles_total

The total number of active libuv handles.

Unit:

Type: gauge
Label:
Sample value: 1634045655572

nexus_nodejs_active_requests

The number of active libuv requests, grouped by request type. Every request type is a C++ class name.

Unit:

Type: gauge
Label: 'type'
Sample value: 2

nexus_nodejs_active_requests_total

The total number of active libuv requests.

Unit:

Type: gauge
Label:
Sample value: 1634045655572

nexus_nodejs_heap_size_total_bytes

The process heap size from Node.js, in bytes.

Unit:

Type: gauge
Label:
Sample value: 1634045655572

nexus_nodejs_heap_size_used_bytes

The process heap size used from Node.js, in bytes.

Unit:

Type: gauge
Label:
Sample value: 1634045655572

nexus_nodejs_external_memory_bytes

The Node.js external memory size, in bytes.

Unit:

Type: gauge
Label:
Sample value: 1634045655572

nexus_nodejs_heap_space_size_total_bytes

The process heap space size total from Node.js, in bytes.

Unit:

Type: gauge
Label: 'space'
Sample value: 262144, 16777216, 130428928, 6721536

nexus_nodejs_heap_space_size_used_bytes

The process heap space size used from Node.js, in bytes.

Unit:

Type: gauge
Label: 'space'
Sample value: 32808, 1479672, 92634792, 4852384

nexus_nodejs_heap_space_size_available_bytes

The process heap space size available from Node.js, in bytes.

Unit:

Type: gauge
Label: 'space'
Sample value: 0, 6899976, 37040456, 1542496

nexus_nodejs_version_info

Node.js version information.

Unit:

Type: gauge
Label: 'version', 'major', 'minor', 'patch'
Sample value: 1
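
The histogram metrics above (nexus_redis_duration_until_ready and nexus_db_request_duration_milliseconds) carry an 'le' label, which in Prometheus histograms marks the upper bound of a cumulative bucket. The sketch below shows how a quantile can be estimated from such buckets by linear interpolation, similar in spirit to the PromQL histogram_quantile function. The bucket bounds and counts in the usage lines are invented for illustration; this page does not list the real bucket boundaries.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: (upper_bound, cumulative_count) pairs, one per `le` value,
    with the +Inf bucket given as math.inf.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations yet
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Quantile falls in the +Inf bucket: report the highest finite bound.
                return prev_bound
            if count == prev_count:
                return bound
            # Interpolate linearly inside the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical buckets: 20 requests took <= 5 ms, 60 took <= 10 ms, and so on.
example = [(5, 20), (10, 60), (25, 90), (math.inf, 100)]
median = estimate_quantile(0.5, example)
```

In practice the counts are cumulative since process start, so a quantile for a recent window is usually computed from the increase in each bucket over that window rather than from the raw values.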


Alerts

The following alerts are defined for Digital Channels.

Nexus error rate

Triggered when the error rate on this pod is greater than 20% for 15 minutes.

Severity: Critical
Based on: nexus_errors_total, nexus_request_total
Threshold: Greater than 20% for 15 minutes

Memory usage is above 3000 Mb

Triggered when the memory usage on this pod is above 3000 MB for 15 minutes.

Severity: Critical
Based on: nexus_process_resident_memory_bytes
Threshold: Above 3000 MB for 15 minutes
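
The exact rule expressions behind these alerts are not shown on this page. As a rough illustration of the conditions they describe, the sketch below computes an error rate from two readings of the counters and checks a window of memory samples against the threshold. It rests on several assumptions: that the error rate is the increase in nexus_errors_total divided by the increase in nexus_request_total, that the 3000 megabyte threshold means 3000 x 1024 x 1024 bytes, and that "for 15 minutes" means the condition holds for every sample in the window. All function and constant names are illustrative.

```python
ERROR_RATE_THRESHOLD = 0.20                   # 20%, from the alert description
MEMORY_THRESHOLD_BYTES = 3000 * 1024 * 1024   # assumption: 3000 MiB


def error_rate(errors_start, errors_end, requests_start, requests_end):
    """Fraction of requests that failed between two counter readings."""
    requests = requests_end - requests_start
    if requests <= 0:  # no traffic, or the counter reset after a restart
        return 0.0
    return max(errors_end - errors_start, 0) / requests


def error_rate_alert(errors_start, errors_end, requests_start, requests_end):
    """True if the error rate over the window exceeds the threshold."""
    rate = error_rate(errors_start, errors_end, requests_start, requests_end)
    return rate > ERROR_RATE_THRESHOLD


def memory_alert(resident_bytes_samples):
    """True if every sample in the window is above the memory threshold."""
    samples = list(resident_bytes_samples)
    return bool(samples) and all(s > MEMORY_THRESHOLD_BYTES for s in samples)


# Readings taken 15 minutes apart (made-up numbers): 300 errors in 1000 requests.
firing = error_rate_alert(100, 400, 1000, 2000)
```

Working from counter increases rather than raw totals matters because both counters grow for the life of the process; the sample values 100 and 1000 listed for the two metrics would give a lifetime ratio, not the rate over the last 15 minutes.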