Monitoring your services
Learn how to set up monitoring for the cluster and your private edition services.
Enabling monitoring for your services
This topic describes how to enable monitoring for your services in OpenShift Container Platform and Google Kubernetes Engine (GKE).
OpenShift Container Platform
Enabling monitoring in OpenShift Container Platform allows cluster administrators, developers, and other users to specify how services and pods are monitored in projects. After you enable this feature, you can query metrics, review dashboards, and manage alerting rules and silences for your projects in the OpenShift Container Platform web console.
To enable monitoring of services, follow these steps.
- Edit the cluster-monitoring-config ConfigMap object.
$ oc -n openshift-monitoring edit configmap cluster-monitoring-config
- Set enableUserWorkload under data/config.yaml to true:
  apiVersion: v1
  kind: ConfigMap
  metadata:
    name: cluster-monitoring-config
    namespace: openshift-monitoring
  data:
    config.yaml: |
      enableUserWorkload: true
- Save the file to apply the changes. Monitoring your own services is enabled automatically.
- Optional: Use the following command to ensure the prometheus-user-workload pods are created.
$ oc -n openshift-user-workload-monitoring get pod
- Sample output (illustrative; pod names, counts, and ages will vary):
  NAME                                   READY   STATUS    RESTARTS   AGE
  prometheus-operator-6f7b748d5b-t7nbg   2/2     Running   0          3h
  prometheus-user-workload-0             4/4     Running   1          3h
  prometheus-user-workload-1             4/4     Running   1          3h
  thanos-ruler-user-workload-0           3/3     Running   0          3h
  thanos-ruler-user-workload-1           3/3     Running   0          3h
- For more details, refer to Enabling monitoring for user-defined projects.
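Once monitoring for user-defined projects is enabled, you can have Prometheus scrape one of your own services by creating a ServiceMonitor resource in the same project. The following is a minimal sketch using the standard monitoring.coreos.com/v1 schema; the name, namespace, labels, and port are hypothetical placeholders that must match your own Service definition.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-monitor        # hypothetical name
  namespace: my-project        # your user-defined project
spec:
  selector:
    matchLabels:
      app: example-app         # must match the labels on your Service
  endpoints:
  - port: web                  # named port on the Service
    path: /metrics
    interval: 30s

Apply the resource with oc apply -f in the project that contains the service; the user workload Prometheus instance discovers it automatically.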
GKE Platform
Google Cloud operations suite - Cloud Monitoring
Google Cloud's operations suite (formerly Stackdriver) provides a centralized capability for receiving events, logs, metrics, and traces from your GKE platform resources.
Cloud Monitoring tracks metrics, events, and metadata from the GKE platform, uptime probes, and services, and makes that data available through dashboards, charts, and alerts.
For more details, refer to https://cloud.google.com/monitoring/docs.
Enable cloud monitoring
Supported values for the --monitoring flag for the create and update commands:
Source | Value | Metrics collected |
---|---|---|
System | SYSTEM | Metrics from essential system components required for Kubernetes functionality. See a complete list of these Kubernetes metrics. |
Workload | WORKLOAD | Enables a fully managed pipeline capable of collecting Prometheus-style metrics exposed by any GKE workload. You must configure which metrics to collect by deploying a PodMonitor custom resource (see the sketch after this table). |
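As a rough illustration of what such a resource can look like, here is a minimal PodMonitor sketch using the Prometheus Operator schema (apiVersion monitoring.coreos.com/v1). The apiVersion and field names that GKE's managed pipeline expects may differ from this schema, and the name, namespace, labels, and port below are hypothetical, so check Google's documentation for the exact resource definition.

apiVersion: monitoring.coreos.com/v1   # GKE's managed pipeline may expect a different API group
kind: PodMonitor
metadata:
  name: example-podmonitor             # hypothetical name
  namespace: my-namespace              # hypothetical namespace
spec:
  selector:
    matchLabels:
      app: example-app                 # must match your pod labels
  podMetricsEndpoints:
  - port: metrics                      # named container port that exposes Prometheus metrics
    path: /metrics
    interval: 30s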
Console UI
To enable cloud monitoring through console UI, follow these steps:
- Navigate to Console UI.
- Select Clusters and then select the cluster instance.
- Under Features > Cloud Monitoring, click the Edit icon.
- Select Enable Cloud Monitoring and then select System and Workload from the drop-down list.
- Click SAVE CHANGES.
GCloud CLI
To enable cloud monitoring through the gcloud CLI, follow these steps:
- Log on to the existing cluster.
gcloud container clusters get-credentials <cluster instance> --zone <zone name> --project <project name>
- Configure the metrics to be sent to Cloud Monitoring by passing a comma-separated list of values to the gcloud container clusters update command with the --monitoring flag. Here is an example:
  gcloud container clusters update gke1 \
      --zone=us-west1-a \
      --monitoring=SYSTEM,WORKLOAD
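To confirm that the change took effect, you can inspect the cluster's monitoring configuration. The command below is a sketch; the monitoringConfig field name is an assumption about the current GKE API and may vary by gcloud version.

gcloud container clusters describe gke1 \
    --zone=us-west1-a \
    --format="value(monitoringConfig)"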
Google Cloud Monitoring API
The Google Cloud Monitoring API is provided with the Google Cloud operations suite so that you can customize your monitoring solution inside the GKE platform.
Cloud Monitoring reads this configuration to determine how it processes, manages, and responds to monitored events generated in the cluster.
For more details, refer to Introduction to the Cloud Monitoring API.
Summary of monitoring support
The service-level guides provide information about enabling monitoring for the respective services. Click the link in the “Included service” column in the summary below to go to the applicable page for service-specific information.
Service | Included service | CRD or annotations? | Port | Endpoint/Selector | Metrics update interval |
---|---|---|---|---|---|
Nexus | Nexus | Both ServiceMonitor and annotations | 4004 | nexus.nexus.svc.cluster.local/metrics | 15 seconds |
CX Contact | CX Contact API Aggregator | ServiceMonitor | 9102 | /metrics | 15 seconds |
CX Contact | CX Contact Campaign Manager | ServiceMonitor | 3106 | /metrics | 15 seconds |
CX Contact | CX Contact Compliance Manager | ServiceMonitor | 3107 | /metrics | 15 seconds |
CX Contact | CX Contact Dial Manager | ServiceMonitor | 3109 | /metrics | 15 seconds |
CX Contact | CX Contact Job Scheduler | ServiceMonitor | 3108 | /metrics | 15 seconds |
CX Contact | CX Contact List Builder | ServiceMonitor | 3104 | /metrics | 15 seconds |
CX Contact | CX Contact List Manager | ServiceMonitor | 3105 | /metrics | 15 seconds |
Designer | Designer Application Server | ServiceMonitor | 8081 | See selector details on the DAS metrics and alerts page | 10 seconds |
Designer | Designer | ServiceMonitor | 8888 | See selector details on the DES metrics and alerts page | 10 seconds |
Email Service | Email Service | Both or either; depends on harvester | Default is 4024 (overridden by values) | /iwd-email/v3/metrics | 15 seconds recommended; depends on harvester |
Genesys Authentication | Authentication Service | Annotations | 8081 | /prometheus | Real-time |
Genesys Authentication | Environment Service | Annotations | 8081 | /prometheus | Real-time |
Genesys Customer Experience Insights | Genesys CX Insights | ServiceMonitor | 8180 | See selector details on the GCXI metrics and alerts page | 15 minutes |
Genesys Customer Experience Insights | Reporting and Analytics Aggregates | PodMonitor and PrometheusRule | metrics: 9100, health: 9101 | See selector details on the RAA metrics and alerts page | metrics: several seconds, health: up to 3 minutes |
Genesys Info Mart | GIM Config Adapter | PodMonitor | 9249 | See selector details on the GCA metrics and alerts page | 30 seconds |
Genesys Info Mart | GIM | PodMonitor | 8249 | See selector details on the GIM metrics and alerts page | 30 seconds |
Genesys Info Mart | GIM Stream Processor | PodMonitor | 9249 | See selector details on the GSP metrics and alerts page | 30 seconds |
Genesys Pulse | Tenant Data Collection Unit (DCU) | PodMonitor | 9091 | See selector details on the Tenant Data Collection Unit (DCU) metrics and alerts page | 30 seconds |
Genesys Pulse | Tenant Load Distribution Server (LDS) | PodMonitor | 9091 | See selector details on the Tenant Load Distribution Server (LDS) metrics and alerts page | 30 seconds |
Genesys Pulse | Pulse Web Service | ServiceMonitor | 8090 | See selector details on the Pulse metrics and alerts page | 30 seconds |
Genesys Pulse | Tenant Permissions Service | | | | |
Genesys Voice Platform | Voice Platform Configuration Server | Service/Pod Monitoring Settings | Not applicable | See selector details on the Voice Platform Configuration Server metrics and alerts page | |
Genesys Voice Platform | Voice Platform Media Control Platform | Service/Pod Monitoring Settings | 9116, 8080, 8200 | See selector details on the Voice Platform Media Control Platform metrics and alerts page | |
Genesys Voice Platform | Voice Platform Service Discovery | Automatic | 9090 | See selector details on the Voice Platform Service Discovery metrics and alerts page | |
Genesys Voice Platform | Voice Platform Reporting Server | ServiceMonitor / PodMonitor | 9116 | See selector details on the Voice Platform Reporting Server metrics and alerts page | |
Genesys Voice Platform | Voice Platform Resource Manager | ServiceMonitor / PodMonitor | 9116, 8200 | See selector details on the Voice Platform Resource Manager metrics and alerts page | |
Interaction Server (IXN) | Interaction Server (IXN) | PodMonitor | 13131, 13133, 13139 | option ixnServer.ports.health - default port 13131 - Endpoint: /health/prometheus/all; option ixnNode.ports.default - default port 13133 - Endpoint: /metrics; option ixnVQNode.ports.health - default port 13139 - Endpoint: /metrics. Note: These options are references to ports that match endpoints. Use these options to perform the associated query. | Default |
Tenant Service | Tenant Service | PodMonitor | 15000 | /metrics (http://<pod address>:15000/metrics) | 30 seconds (Applicable for any metric(s) that Tenant Service exposes. The update interval is not a property of the metric; it is a property of the optional PodMonitor that you can create.) |
Voice Microservices | Agent State Service | PodMonitor | 11000 | http://<pod-ipaddress>:11000/metrics | 30 seconds |
Voice Microservices | Call State Service | Supports both CRD and annotations | 11900 | http://<pod-ipaddress>:11900/metrics | 30 seconds |
Voice Microservices | Config Service | Supports both CRD and annotations | 9100 | http://<pod-ipaddress>:9100/metrics | 30 seconds |
Voice Microservices | Dial Plan Service | Supports both CRD and annotations | 8800 | http://<pod-ipaddress>:8800/metrics | 30 seconds |
Voice Microservices | FrontEnd Service | Supports both CRD and annotations | 9101 | http://<pod-ipaddress>:9101/metrics | 30 seconds |
Voice Microservices | ORS | Supports both CRD and annotations | 11200 | http://<pod-ipaddress>:11200/metrics | 30 seconds |
Voice Microservices | Voice Registrar Service | Supports both CRD and annotations | 11500 | http://<pod-ipaddress>:11500/metrics | 30 seconds |
Voice Microservices | Voice RQ Service | Supports both CRD and annotations | 12000 | http://<pod-ipaddress>:12000/metrics | 30 seconds |
Voice Microservices | Voice SIP Cluster Service | Supports both CRD and annotations | 11300 | http://<pod-ipaddress>:11300/metrics | 30 seconds |
Voice Microservices | Voice SIP Proxy Service | Supports both CRD and annotations | 11400 | http://<pod-ipaddress>:11400/metrics | 30 seconds |
Voice Microservices | Voicemail | Supports both CRD and annotations | 8081 | http://<pod-ipaddress>:8081/metrics | 30 seconds |
WebRTC Media Service | WebRTC Gateway Service | PodMonitor | 10052 | /metrics | 30 seconds |
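Several of the services above use the annotations approach instead of, or in addition to, a ServiceMonitor or PodMonitor CRD. The sketch below assumes your Prometheus scrape configuration honors the widely used prometheus.io/* annotation convention (a convention, not a built-in Kubernetes feature); the pod name, image, port, and path are hypothetical.

apiVersion: v1
kind: Pod
metadata:
  name: example-pod                    # hypothetical pod
  annotations:
    prometheus.io/scrape: "true"       # opt this pod in to scraping
    prometheus.io/port: "8081"         # port where metrics are served
    prometheus.io/path: "/prometheus"  # metrics path; many setups default to /metrics
spec:
  containers:
  - name: app
    image: example/app:latest          # hypothetical image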
Sample Prometheus queries to collect metrics
For each query in Prometheus, you can view the results as a graph or as console output.
Query 1: kubelet_http_requests_total
Output: graph view in the Prometheus UI.
Query 2: sum(irate(sipproxy_requests_processed_self_total{pod=~"voice-sipproxy-0"}[5m])) by (pod,method)
Output: graph and console views in the Prometheus UI.
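As a further illustration of the rate-then-aggregate pattern used in Query 2, the same shape can be applied to the kubelet metric from Query 1. This is a sketch; the method label is an assumption about the metric's label set and may differ in your cluster.

sum(irate(kubelet_http_requests_total[5m])) by (method)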