Job Scheduler metrics and alerts

This topic is part of the manual Outbound (CX Contact) Private Edition Guide for version Current of Outbound (CX Contact).


Find the metrics that Job Scheduler (JS) exposes and the alerts defined for it.

Service: Job Scheduler
CRD or annotations: ServiceMonitor
Port: 3108
Endpoint/Selector: /metrics
Metrics update interval: 15 seconds
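
As an illustration of what a scrape of this endpoint involves, the following is a minimal Python sketch that fetches the endpoint and parses the Prometheus text exposition format into tuples. It uses only the standard library. The host name in `METRICS_URL` is a hypothetical placeholder; only the port (3108) and path (/metrics) come from the settings above. Escape sequences inside label values are left as-is, so treat this as a quick inspection aid rather than a full parser.

```python
import re
from urllib.request import urlopen

# Hypothetical in-cluster address: only the port (3108) and the path
# (/metrics) are taken from the scrape settings; the host is an assumption.
METRICS_URL = "http://cxc-job-scheduler:3108/metrics"

# One sample line: metric name, optional {labels}, then the value.
# A trailing timestamp, if present, is ignored.
_SAMPLE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse Prometheus text exposition into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = _SAMPLE.match(line)
        if match is None:  # skip anything that is not a sample line
            continue
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def scrape(url=METRICS_URL, timeout=5):
    """Fetch the endpoint once and return the parsed samples."""
    with urlopen(url, timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))
```

Calling `scrape()` every 15 seconds would mirror the update interval listed above; `parse_metrics` can also be applied to a saved copy of the endpoint output.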

Metrics

Metric and description Metric details Indicator of
cxc_js_jobs_executed_total

Total jobs executed.

Unit:

Type: Counter
Label: "'type', 'ccid', 'tenant_name'"
Sample value: 42

cxc_js_jobs_failed_total

Total failed jobs.

Unit:

Type: Counter
Label: "'type', 'ccid', 'tenant_name'"
Sample value: 42

cxc_js_jobs_success_total

Total successful jobs.

Unit:

Type: Counter
Label: "'type', 'ccid', 'tenant_name'"
Sample value: 42

cxc_js_jobs_nothing_to_do_total

Total jobs that ended with a "Nothing TO DO" result.

Unit:

Type: Counter
Label: "'type', 'ccid', 'tenant_name'"
Sample value: 42

cxc_js_jobs_run_now_total

Total jobs that were started manually.

Unit:

Type: Counter
Label: "'ccid', 'tenant_name'"
Sample value: 42

cxc_js_files_imported_total

Total files imported.

Unit:

Type: Counter
Label: "'action', 'ccid', 'tenant_name'"
Sample value: 42

cxc_js_jobs_ttl_exceeded_total

Total jobs that exceeded their TTL.

Unit:

Type: Counter
Label: "'type', 'ccid', 'tenant_name'"
Sample value: 42

cxc_js_jobs_running_total

Number of currently active jobs.

Unit:

Type: Gauge
Label: "'type', 'ccid', 'tenant_name'"
Sample value: 4.2

cxc_js_redis_connections

Count of active connections to Redis server.

Unit:

Type: Gauge
Label: n/a
Sample value: 4.2

cxc_js_job_duration_seconds

Job duration histogram.

Unit:

Type: Histogram
Label: "'type', 'ccid', 'tenant_name'"
Sample value: [1, 2, 3]

cxc_js_job_import_file_size_megabytes

Job import file size histogram.

Unit:

Type: Histogram
Label: "'action', 'ccid', 'tenant_name'"
Sample value: [1, 2, 3]

cxc_js_healthy_instance

Healthy instance.

Unit:

Type: Gauge
Label: n/a
Sample value: 4.2

cxc_js_request_count

Total requests by verb and code.

Unit:

Type: Counter
Label: "'method', 'path', 'code'"
Sample value: 42

cxc_js_request_latencies_ms

Request latencies histogram by verb, in milliseconds.

Unit:

Type: Histogram
Label: "'method', 'path', 'code'"
Sample value: [1, 2, 3]

cxc_js_request_out_count

Total outgoing requests by verb, destination, and code.

Unit:

Type: Counter
Label: "'method', 'destination', 'code'"
Sample value: 42

cxc_js_request_out_latencies_ms

Outgoing request latencies histogram by verb, destination, and code, in milliseconds.

Unit:

Type: Histogram
Label: "'method', 'destination', 'code'"
Sample value: [1, 2, 3]

cxc_js_healthy_tenants

Healthy tenants.

Unit:

Type: Gauge
Label: "'ccid', 'tenant_name'"
Sample value: 4.2

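Because the job counters are cumulative, a ratio such as "failed jobs out of executed jobs" is usually computed from the increase between two scrapes, not from the raw values. The sketch below shows one way to do that per label set in Python. It assumes that every failed job is also counted in cxc_js_jobs_executed_total, which this page does not state, and it treats a decreasing counter as a reset to zero (for example after a pod restart). The function names and the dictionary layout are illustrative only.

```python
def counter_increase(before, after):
    """Increase of a counter between two scrapes.

    If the value went down, the counter is assumed to have been reset,
    so everything counted since the reset is the increase.
    """
    return after - before if after >= before else after


def failure_ratios(prev, curr):
    """Failure ratio per label set between two scrapes.

    prev and curr map a (type, ccid, tenant_name) tuple to a pair
    (cxc_js_jobs_executed_total, cxc_js_jobs_failed_total).
    Label sets with no executed jobs in the interval are omitted.
    """
    ratios = {}
    for key, (executed_now, failed_now) in curr.items():
        executed_prev, failed_prev = prev.get(key, (0.0, 0.0))
        executed = counter_increase(executed_prev, executed_now)
        failed = counter_increase(failed_prev, failed_now)
        if executed > 0:
            ratios[key] = failed / executed
    return ratios
```

The same pattern applies to cxc_js_jobs_success_total or cxc_js_jobs_ttl_exceeded_total in place of the failed counter.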

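The histogram metrics above (for example cxc_js_job_duration_seconds) are exposed as cumulative bucket counts, so percentiles have to be estimated. The Python sketch below uses linear interpolation within the matching bucket, the same general approach as the Prometheus `histogram_quantile` function. The bucket boundaries used in the usage note are hypothetical, since this page does not list the configured buckets.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets is a list of (upper_bound, cumulative_count) pairs sorted by
    upper bound, with math.inf as the last bound. The lowest bucket is
    assumed to start at 0.
    """
    if not buckets or not 0 <= q <= 1:
        raise ValueError("need at least one bucket and 0 <= q <= 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations recorded
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            # Open-ended or empty bucket: fall back to the lower bound.
            if math.isinf(bound) or count == prev_count:
                return prev_bound
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound
```

For example, with hypothetical buckets `[(1, 10), (5, 30), (10, 40), (math.inf, 40)]`, `histogram_quantile(0.5, ...)` interpolates inside the 1 to 5 second bucket. The estimate is only as precise as the bucket layout allows.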
Alerts

The following alerts are defined for Job Scheduler.

CXC-JS-LatencyHigh

Triggered when the latency for Job Scheduler is above the defined threshold.

Severity: HIGH
Based on:
Threshold: 5000ms for 5m

CXC-CPUUsage

Triggered when the CPU utilization of a pod is beyond the threshold.

Severity: HIGH
Based on:
Threshold: 300% for 5m

CXC-MemoryUsage

Triggered when the memory utilization of a pod is beyond the threshold.

Severity: HIGH
Based on:
Threshold: 70% for 5m

CXC-PodNotReadyCount

Triggered when the number of pods ready for a CX Contact deployment is less than or equal to the threshold.

Severity: HIGH
Based on:
Threshold: 1 for 5m

CXC-PodRestartsCount

Triggered when the restart count for a pod is beyond the threshold.

Severity: HIGH
Based on:
Threshold: 1 for 5m

CXC-MemoryUsagePD

Triggered when the memory usage of a pod is above the critical threshold.

Severity: HIGH
Based on:
Threshold: 90% for 5m

CXC-PodRestartsCountPD

Triggered when the restart count is beyond the critical threshold.

Severity: HIGH
Based on:
Threshold: 5 for 5m

CXC-PodsNotReadyPD

Triggered when there are no pods ready for CX Contact deployment.

Severity: HIGH
Based on:
Threshold: 0 for 1m
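
Each threshold above pairs a value with a duration, such as 5000ms for 5m. One plausible reading, matching how Prometheus-style alert rules usually behave, is that the condition has to hold continuously for the whole duration before the alert fires. The Python sketch below illustrates that reading for an "above the threshold" alert. The function name, the sample layout, and the strict "greater than" comparison are assumptions; this page does not give the exact alert expressions.

```python
def breaches_for(samples, threshold, hold_seconds):
    """Return True if the value stays above threshold for hold_seconds.

    samples is a list of (timestamp_seconds, value) pairs in time order.
    A single sample at or below the threshold restarts the clock.
    """
    streak_start = None
    for timestamp, value in samples:
        if value > threshold:
            if streak_start is None:
                streak_start = timestamp  # first breaching sample of a streak
            if timestamp - streak_start >= hold_seconds:
                return True
        else:
            streak_start = None  # condition cleared, reset the streak
    return False
```

With the CXC-JS-LatencyHigh values, the call would be `breaches_for(samples, threshold=5000, hold_seconds=300)`, where `samples` holds latency readings in milliseconds.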