BDS metrics and alerts
Find the metrics BDS exposes and the alerts defined for BDS.
|Service||CRD or annotations?||Port||Endpoint/Selector||Metrics update interval|
|Note: As a serverless component, BDS is run via a Kubernetes CronJob. By default, the job runs every 12 hours (twice a day) and pushes information into the Prometheus Pushgateway.|
Billing Data Service (BDS) exposes a few metrics through the Prometheus Pushgateway for monitoring BDS performance and containers. Note that these service-monitoring metrics are distinct from the metrics BDS provides to monitor contact center activity, which are described in the Billing Data Server User's Guide.
The following system metrics are likely to be particularly useful.
For information about standard system metrics, see System metrics.
In addition to the system metrics, BDS provides the following service-specific metrics for monitoring and alerting:
|Metric and description||Metric details||Indicator of|
The time at which the BDS cron job started its process.
The time at which the BDS cron job ended its process.
Exit code indicating whether the cron job started successfully. Returns 0 if the cron job started successfully; otherwise returns a nonzero numeric value.
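As an illustration of how a job might report these three values, the following minimal sketch builds a payload in the Prometheus text exposition format and prepares a request for the Pushgateway endpoint `/metrics/job/<job_name>`. The metric names (`bds_job_start_time_seconds`, `bds_job_end_time_seconds`, `bds_job_exit_code`), the gateway URL, and the job name are hypothetical placeholders, not the names BDS actually uses. The request is only constructed, not sent.

```python
import urllib.request


def build_payload(start_ts, end_ts, exit_code):
    """Render three gauges in the Prometheus text exposition format.

    The metric names are hypothetical; substitute the names your
    deployment actually exposes.
    """
    lines = [
        "# TYPE bds_job_start_time_seconds gauge",
        f"bds_job_start_time_seconds {start_ts}",
        "# TYPE bds_job_end_time_seconds gauge",
        f"bds_job_end_time_seconds {end_ts}",
        "# TYPE bds_job_exit_code gauge",
        f"bds_job_exit_code {exit_code}",
    ]
    # The exposition format requires a trailing newline.
    return "\n".join(lines) + "\n"


def build_push_request(gateway_url, job_name, payload):
    """Prepare (but do not send) a PUT request for the Pushgateway.

    PUT replaces all metrics previously pushed for this job grouping.
    """
    url = f"{gateway_url.rstrip('/')}/metrics/job/{job_name}"
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


if __name__ == "__main__":
    # Assumed values for illustration only.
    body = build_payload(1700000000, 1700000600, 0)
    request = build_push_request("http://pushgateway:9091", "bds", body)
    print(request.get_method(), request.full_url)
    print(body, end="")
```

To actually push, the prepared request could be passed to `urllib.request.urlopen`, assuming the Pushgateway is reachable from where the job runs.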
Billing Data Service does not define any alerts by default in the Helm charts. The following table shows sample alert configurations that you can create using the supported metrics.
|Alert||Description||Based on||Threshold|
|BDS-ContainerRestartCount||The BDS container did not start or is no longer responsive.||kube_pod_container_status_restarts_total||>0|
|BDS-JobStartTime||The cron job did not start at the scheduled time or has been processing for a long time.||kube_job_status_start_time||< (time() - 44100)|
|BDS-JobFailStatus||The cron job has a failed status.||kube_job_status_failed||>0|
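To make the thresholds concrete, the sketch below evaluates the three sample conditions in plain Python. It is not how Prometheus evaluates rules; it only mirrors the comparisons in the table. Note that 44100 seconds equals 12 hours plus 15 minutes (43200 + 900), so reading the threshold as the default 12-hour schedule plus a 15-minute grace period is an assumption drawn from that arithmetic, and the function and constant names are hypothetical.

```python
import time

SCHEDULE_INTERVAL_S = 12 * 60 * 60  # default schedule: every 12 hours
GRACE_S = 15 * 60                   # assumed grace period: 15 minutes
START_TIME_THRESHOLD_S = SCHEDULE_INTERVAL_S + GRACE_S  # 44100 seconds


def firing_alerts(restarts_total, job_start_time, job_failed, now=None):
    """Return the names of the sample alerts whose condition holds.

    restarts_total: value of kube_pod_container_status_restarts_total
    job_start_time: value of kube_job_status_start_time (Unix seconds)
    job_failed:     value of kube_job_status_failed
    now:            current Unix time; defaults to time.time()
    """
    now = time.time() if now is None else now
    alerts = []
    if restarts_total > 0:
        alerts.append("BDS-ContainerRestartCount")
    # Mirrors: kube_job_status_start_time < (time() - 44100)
    if job_start_time < now - START_TIME_THRESHOLD_S:
        alerts.append("BDS-JobStartTime")
    if job_failed > 0:
        alerts.append("BDS-JobFailStatus")
    return alerts


if __name__ == "__main__":
    now = 1_000_000
    # Job started 13 hours ago: only the start-time condition holds.
    print(firing_alerts(0, now - 13 * 3600, 0, now=now))
```

If you change the CronJob schedule from the default, the start-time threshold would likely need to be adjusted to match the new interval.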