Tenant Load Distribution Server (LDS) metrics and alerts

This topic is part of the manual Genesys Pulse Private Edition Guide for version Current of Reporting.


Find the metrics that Tenant Load Distribution Server (LDS) exposes and the alerts defined for it.

Service: Tenant Load Distribution Server (LDS)
CRD or annotations?: PodMonitor
Port: 9091
Endpoint/Selector:

selector:
  matchLabels:
    app.kubernetes.io/name: {{include "common.util.chart.name" . }}
    app.kubernetes.io/instance: {{include "common.util.chart.fullname" . }}
    service: {{.Release.Namespace }}
    servicename: {{include "common.util.chart.name" . }}
    tenant: {{.Values.tenant.sid }}

Endpoints to query: /metrics/

Metrics update interval: 30 seconds
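
The scrape settings above can be read together as a single PodMonitor resource. The following is a minimal sketch of what such a Helm template might look like, not the manifest shipped with the chart. The resource name and the port name `metrics` are hypothetical; the port name is assumed to refer to container port 9091. The selector labels, the `/metrics/` path, and the 30-second interval are taken from the values listed above.

```yaml
# Illustrative sketch only; the actual chart template may differ.
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: pulse-lds-podmonitor        # hypothetical resource name
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: {{include "common.util.chart.name" . }}
      app.kubernetes.io/instance: {{include "common.util.chart.fullname" . }}
      service: {{.Release.Namespace }}
      servicename: {{include "common.util.chart.name" . }}
      tenant: {{.Values.tenant.sid }}
  podMetricsEndpoints:
    - port: metrics                 # hypothetical port name, assumed to map to container port 9091
      path: /metrics/               # endpoint to query
      interval: 30s                 # metrics update interval
```

Because every label in `matchLabels` must match, a Pod that lacks the `tenant` label, for example, would not be scraped by a PodMonitor defined this way.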


Metrics

Each entry lists the metric name and description, the metric details, and what the metric is an indicator of.

pulse_monitor_check_duration_seconds

The duration in seconds of the last health check performed by Monitor Agent.

Unit: seconds
Type: Gauge
Label: tenant
Sample value:
Indicator of: Error

pulse_lds_uptime_seconds

The LDS container uptime in seconds.

Unit: seconds
Type: Gauge
Label: tenant
Sample value:
Indicator of: Error

pulse_lds_senders_number

The number of upstream servers to which the LDS is connected.

Unit:
Type: Gauge
Label: tenant
Sample value: 2
Indicator of: Error

pulse_lds_receivers_number

The number of clients connected to the LDS.

Unit:
Type: Gauge
Label: tenant
Sample value: 2
Indicator of: Error

pulse_lds_sender_connected_seconds

Duration in seconds of connection to the upstream server.

Unit: seconds
Type: Gauge
Label: tenant, sender
Sample value:
Indicator of: Error

pulse_lds_sender_disconnected_seconds

Duration in seconds of disconnection from the upstream server.

Unit: seconds
Type: Gauge
Label: tenant, sender
Sample value:
Indicator of: Error

pulse_lds_sender_registered_dns_number

The number of DNs registered on the upstream server.

Unit:
Type: Gauge
Label: tenant, sender
Sample value: 1000
Indicator of: Saturation

pulse_lds_sender_registration_errors_number

The number of failed registrations of DNs on the upstream server.

Unit:
Type: Gauge
Label: tenant, sender
Sample value: 0
Indicator of: Error

pulse_lds_receiver_connected_seconds

Duration in seconds of client connection to the LDS.

Unit: seconds
Type: Gauge
Label: tenant, receiver
Sample value:
Indicator of: Error

pulse_lds_receiver_registered_dns_number

The number of DNs registered by the client.

Unit:
Type: Gauge
Label: tenant, receiver
Sample value: 1000
Indicator of: Saturation

pulse_lds_receiver_registration_errors_number

The number of failed registrations of DNs received from the client.

Unit:
Type: Gauge
Label: tenant, receiver
Sample value: 0
Indicator of: Error


Alerts

Alerts are based on LDS and Kubernetes cluster metrics.

The following alerts are defined for Tenant Load Distribution Server (LDS).

pulse_lds_monitor_data_unavailable

Severity: Critical
Description: Pulse LDS Monitor Agents do not provide data.
Based on: pulse_monitor_check_duration_seconds, kube_statefulset_replicas
Threshold: for 15 minutes

pulse_lds_critical_nonrunning_instances

Severity: Critical
Description: Triggered when Pulse LDS instances are down.
Based on: kube_statefulset_status_replicas_ready, kube_statefulset_status_replicas
Threshold: for 15 minutes

pulse_lds_too_frequent_restarts

Severity: Critical
Description: Detected too frequent restarts of the LDS Pod container.
Based on: kube_pod_container_status_restarts_total
Threshold: 2 for 1 hour

pulse_lds_critical_cpu

Severity: Critical
Description: Detected critical CPU usage by the Pulse LDS Pod.
Based on: container_cpu_usage_seconds_total, kube_pod_container_resource_limits
Threshold: 90%

pulse_lds_critical_memory

Severity: Critical
Description: Detected critical memory usage by the Pulse LDS Pod.
Based on: container_memory_working_set_bytes, kube_pod_container_resource_limits
Threshold: 90%

pulse_lds_no_connected_senders

Severity: Critical
Description: Pulse LDS is not connected to upstream servers.
Based on: pulse_lds_senders_number
Threshold: for 15 minutes

pulse_lds_no_registered_dns

Severity: Critical
Description: No DNs are registered on Pulse LDS.
Based on: pulse_lds_sender_registered_dns_number
Threshold: for 30 minutes
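
To illustrate how a metric and a threshold combine into an alert, here is a hedged sketch of two of the alerts above written as a PrometheusRule. The alert names, source metrics, severities, and durations come from the definitions above. The `expr` expressions, the resource name, and the group name are assumptions made for illustration; the rules shipped with the product may be written differently.

```yaml
# Illustrative sketch only; expressions are assumed, not copied from the product.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: pulse-lds-alerts-sketch     # hypothetical resource name
spec:
  groups:
    - name: pulse-lds-sketch        # hypothetical group name
      rules:
        - alert: pulse_lds_no_connected_senders
          # Assumed expression: no upstream server connections reported.
          expr: pulse_lds_senders_number == 0
          for: 15m
          labels:
            severity: critical
          annotations:
            description: Pulse LDS is not connected to upstream servers.
        - alert: pulse_lds_no_registered_dns
          # Assumed expression: total DNs registered across all senders of a tenant is zero.
          expr: sum by (tenant) (pulse_lds_sender_registered_dns_number) == 0
          for: 30m
          labels:
            severity: critical
          annotations:
            description: No DNs are registered on Pulse LDS.
```

The `for` clause keeps an alert pending until the condition has held continuously for the stated duration, which corresponds to the "for 15 minutes" and "for 30 minutes" thresholds.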