BDS metrics and alerts

This topic is part of the manual Billing Data Service Private Edition Guide for version Current of Billing Data Service.


Find the metrics BDS exposes and the alerts defined for BDS.

Service | CRD or annotations? | Port | Endpoint/Selector | Metrics update interval
BDS | n/a | n/a | n/a | n/a
Note: As a serverless component, BDS runs as a Kubernetes CronJob. By default, the job runs every 12 hours (twice a day) and pushes its metrics to the Prometheus Pushgateway.


Metrics

Billing Data Service (BDS) exposes a few metrics through the Prometheus Pushgateway for monitoring BDS performance and containers. Note that these service-monitoring metrics are distinct from the metrics BDS provides to monitor contact center activity, which are described in the Billing Data Server User's Guide.

The following system metrics are likely to be particularly useful.

  • kube_pod_container_status_restarts_total
  • kube_job_status_start_time
  • kube_job_status_failed

For information about standard system metrics, see System metrics.

In addition to the system metrics, BDS provides the following service-specific metrics for monitoring and alerting purposes:

Metric and description Metric details Indicator of
bds_pod_processing_start

The time at which the BDS cron job started processing.

Unit: Seconds

Type: Gauge
Label:
Sample value: 1638529175

bds_pod_processing_end

The time at which the BDS cron job finished processing.

Unit: Seconds

Type: Gauge
Label:
Sample value: 1638538924

bds_processing_exit_code

Exit code indicating whether the cron job started successfully. Returns 0 if the cron job started successfully; otherwise, returns a nonzero numeric value.

Unit:

Type: Gauge
Label:
Sample value: 0
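
The two timestamp gauges can be combined to estimate how long a BDS run took. The following is a minimal Python sketch, assuming that both values are Unix epoch timestamps in seconds (the sample values above are consistent with that reading). The helper names `to_utc` and `run_duration` are hypothetical and are not part of BDS.

```python
from datetime import datetime, timedelta, timezone


def to_utc(epoch_seconds):
    """Convert a gauge value (assumed to be Unix epoch seconds) to a UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def run_duration(start_seconds, end_seconds):
    """Return the elapsed time between processing start and processing end."""
    return timedelta(seconds=end_seconds - start_seconds)


if __name__ == "__main__":
    # Sample values taken from the metrics table above.
    start = 1638529175  # bds_pod_processing_start
    end = 1638538924    # bds_pod_processing_end
    print("started: ", to_utc(start).isoformat())
    print("ended:   ", to_utc(end).isoformat())
    print("duration:", run_duration(start, end))
```

With the sample values, the two timestamps are 9749 seconds apart, or roughly 2 hours and 42 minutes.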


Alerts

Billing Data Service does not define any alerts by default in the Helm charts. The following table shows sample alert configurations that you can create using the supported metrics.

Alert | Severity | Description | Based on | Threshold
BDS-ContainerRestartCount | | The container in which BDS runs did not start or is no longer responsive. | kube_pod_container_status_restarts_total | >0
BDS-JobStartTime | | The cron job did not start at the scheduled time or has been processing for a long time. | kube_job_status_start_time | < (time() - 44100)
BDS-JobFailStatus | | The cron job is in a failed state. | kube_job_status_failed | >0
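
To make the thresholds concrete, the following Python sketch evaluates the three sample conditions against metric values supplied by the caller. It is an illustration only, not how Prometheus evaluates rules. The function name `firing_alerts` and its parameters are hypothetical. Note that 44100 seconds is 12 hours and 15 minutes, which looks like the default 12-hour schedule plus a 15-minute grace period; that interpretation is an assumption.

```python
import time

# 44100 s = 12 h 15 min. Reading this as "12-hour schedule plus a
# 15-minute grace period" is an assumption, not stated in the source.
JOB_START_MAX_AGE_SECONDS = 44100


def firing_alerts(restarts_total, job_start_time, job_failed, now=None):
    """Return the names of the sample alerts whose thresholds are crossed.

    restarts_total -> kube_pod_container_status_restarts_total
    job_start_time -> kube_job_status_start_time (Unix epoch seconds)
    job_failed     -> kube_job_status_failed
    """
    now = time.time() if now is None else now
    firing = []
    if restarts_total > 0:
        firing.append("BDS-ContainerRestartCount")
    if job_start_time < now - JOB_START_MAX_AGE_SECONDS:
        firing.append("BDS-JobStartTime")
    if job_failed > 0:
        firing.append("BDS-JobFailStatus")
    return firing


if __name__ == "__main__":
    # A job that started 13 hours ago, with no restarts and no failures.
    thirteen_hours_ago = time.time() - 13 * 3600
    print(firing_alerts(0, thirteen_hours_ago, 0))
```

If you change the CronJob schedule from the 12-hour default, the 44100-second value would likely need to change with it.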