Monitoring your services

From Genesys Documentation
Revision as of 18:03, December 20, 2021 by Jabashree.amudha@genesys.com (talk | contribs) (Published)
This topic is part of the manual Operations for version Current of Genesys Multicloud CX Private Edition.

Learn how to set up monitoring for the cluster and your private edition services.

Enabling monitoring for your services

This topic describes how to enable monitoring for your services in OpenShift Container Platform and GKE Platform.

OpenShift Container Platform

Enabling monitoring in OpenShift Container Platform allows cluster administrators, developers, and other users to specify how services and pods are monitored in projects. After you enable this feature, you can query metrics, review dashboards, and manage alerting rules and silences for your projects in the OpenShift Container Platform web console.

To enable monitoring of services, follow these steps.

  1. Edit the cluster-monitoring-config ConfigMap object.
    $ oc -n openshift-monitoring edit configmap cluster-monitoring-config
  2. Set enableUserWorkload under data/config.yaml to true.
     apiVersion: v1
     kind: ConfigMap
     metadata:
       name: cluster-monitoring-config
       namespace: openshift-monitoring
     data:
       config.yaml: |
         enableUserWorkload: true
    Save the file to apply the changes. Monitoring your own services is enabled automatically.
  3. Optional: Use the following command to verify that the prometheus-user-workload pods are created.
     $ oc -n openshift-user-workload-monitoring get pod
    For more details, refer to Enabling monitoring for user-defined projects.
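
The check in step 3 can also be scripted. As a minimal sketch, the following Python assumes you have captured the output of `oc -n openshift-user-workload-monitoring get pod -o json` and lists any pods that are not yet in the Running phase. The sample JSON and the pod names in it are illustrative assumptions, not output from a real cluster.

```python
from __future__ import annotations

import json


def pods_not_running(pod_list_json: str) -> list[str]:
    """Return the names of pods whose status.phase is not 'Running'."""
    pod_list = json.loads(pod_list_json)
    return [
        item["metadata"]["name"]
        for item in pod_list.get("items", [])
        if item.get("status", {}).get("phase") != "Running"
    ]


# Hypothetical, trimmed-down stand-in for the output of:
#   oc -n openshift-user-workload-monitoring get pod -o json
SAMPLE = json.dumps({
    "items": [
        {"metadata": {"name": "prometheus-user-workload-0"},
         "status": {"phase": "Running"}},
        {"metadata": {"name": "prometheus-user-workload-1"},
         "status": {"phase": "Pending"}},
    ]
})

if __name__ == "__main__":
    # An empty list means every pod in the namespace reports Running.
    print(pods_not_running(SAMPLE))
```

An empty result suggests the user workload monitoring pods are up; any names returned are worth inspecting with `oc describe pod`.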

GKE Platform

Google Cloud operations suite - Cloud Monitoring

Google Cloud's operations suite (formerly Stackdriver) provides a central place to receive events, logs, metrics, and traces from your GKE platform resources.

Cloud Monitoring tracks metrics, events, and metadata from the GKE platform, uptime probes, and services. Stackdriver ingests that data and makes it available through dashboards, charts, and alerts.

For more details, refer to https://cloud.google.com/monitoring/docs.

Enable cloud monitoring

Supported values for the --monitoring flag of the create and update commands:

| Source | Value | Metrics collected |
| --- | --- | --- |
| System | SYSTEM | Metrics from essential system components required for Kubernetes functionality. See a complete list of these Kubernetes metrics. |
| Workload | WORKLOAD | Enables a fully managed pipeline capable of collecting Prometheus-style metrics exposed by any GKE workload. You must configure which metrics to collect by deploying a PodMonitor custom resource. |
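
As a rough illustration of the PodMonitor requirement for workload metrics, the sketch below builds a manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f -` accepts. The `monitoring.gke.io/v1alpha1` API version, the `app: my-service` label, and the `metrics` port name are assumptions made for illustration; confirm them against the Google Cloud documentation and your own workload before use.

```python
import json


def gke_pod_monitor(name, match_labels, port_name, path="/metrics", interval="30s"):
    """Build a PodMonitor manifest for GKE workload metrics (assumed schema)."""
    return {
        # Assumption: verify the API group and version for your GKE release.
        "apiVersion": "monitoring.gke.io/v1alpha1",
        "kind": "PodMonitor",
        "metadata": {"name": name},
        "spec": {
            # Pods carrying these labels are selected for scraping.
            "selector": {"matchLabels": dict(match_labels)},
            "podMetricsEndpoints": [
                {"port": port_name, "path": path, "interval": interval}
            ],
        },
    }


# Hypothetical workload: pods labeled app=my-service expose a port named "metrics".
manifest = gke_pod_monitor("my-service-monitor", {"app": "my-service"}, "metrics")

if __name__ == "__main__":
    print(json.dumps(manifest, indent=2))
```

Piping the printed JSON to `kubectl apply -n <namespace> -f -` would create the resource in the namespace of the workload.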

Console UI

To enable cloud monitoring through the console UI, follow these steps:

  1. Navigate to Console UI.
  2. Select Clusters and then select the cluster instance.
  3. Under Features > Cloud Monitoring, click the Edit icon.
  4. Select Enable Cloud Monitoring and then select System and Workload from the drop-down list.
  5. Click SAVE CHANGES.

This section explains setting up Prometheus on a Kubernetes cluster for monitoring the Kubernetes cluster.

GCloud CLI

To enable cloud monitoring through the gcloud CLI, follow these steps:

  1. Get credentials for the existing cluster so that later commands can reach it.
    gcloud container clusters get-credentials <cluster instance> --zone <zone name> --project <project name>
  2. Configure the metrics to be sent to Cloud Monitoring by passing a comma-separated list of values to the --monitoring flag of the gcloud container clusters update command. Here is an example:
    gcloud container clusters update gke1 \
        --zone=us-west1-a \
        --monitoring=SYSTEM,WORKLOAD
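
If you script this step, a small helper can catch a malformed flag before the command runs. The sketch below is only an illustration: it accepts just the two values documented in this topic (SYSTEM and WORKLOAD), and the function name is hypothetical. It builds the argument list without executing anything.

```python
import shlex

# Only the values documented in this topic; gcloud may accept others.
ALLOWED_COMPONENTS = ("SYSTEM", "WORKLOAD")


def monitoring_update_command(cluster, zone, components):
    """Return the argument list for `gcloud container clusters update`."""
    unknown = [c for c in components if c not in ALLOWED_COMPONENTS]
    if not components or unknown:
        raise ValueError(f"unsupported --monitoring values: {unknown or 'none given'}")
    return [
        "gcloud", "container", "clusters", "update", cluster,
        f"--zone={zone}",
        # A single '=' followed by a comma-separated list.
        f"--monitoring={','.join(components)}",
    ]


command = monitoring_update_command("gke1", "us-west1-a", ["SYSTEM", "WORKLOAD"])

if __name__ == "__main__":
    # Print a copy-pasteable command line; pass `command` to subprocess.run to execute it.
    print(shlex.join(command))
```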

Google Cloud Monitoring API

The Google Cloud Monitoring API is provided with Google Cloud's operations suite and lets you customize your monitoring solution in the GKE platform.

Stackdriver reads this configuration to prescribe how it processes, manages, and responds to monitored events generated in the cluster.
For more details, refer to Introduction to the Cloud Monitoring API.
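
To give a feel for the API, here is a minimal sketch that builds the request URL for the `projects.timeSeries.list` REST method. The project ID `my-project`, the time window, and the choice of the `kubernetes.io/container/cpu/core_usage_time` metric type are assumptions for illustration. Authentication is left out: a real request also needs an OAuth 2.0 bearer token in the Authorization header.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

API_ROOT = "https://monitoring.googleapis.com/v3"


def time_series_list_url(project_id, metric_type, start_time, end_time):
    """Build a timeSeries.list URL for one metric type and an RFC 3339 time window."""
    params = {
        "filter": f'metric.type = "{metric_type}"',
        "interval.startTime": start_time,
        "interval.endTime": end_time,
    }
    return f"{API_ROOT}/projects/{project_id}/timeSeries?{urlencode(params)}"


# Hypothetical project and time window.
url = time_series_list_url(
    "my-project",
    "kubernetes.io/container/cpu/core_usage_time",
    "2021-12-20T00:00:00Z",
    "2021-12-20T01:00:00Z",
)

if __name__ == "__main__":
    print(url)
    # Decode the query string again to confirm what will be sent.
    print(parse_qs(urlsplit(url).query))
```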

Summary of monitoring support

The service-level guides provide information about enabling monitoring for the respective services. Click the link in the “Included service” column in the summary below to go to the applicable page for service-specific information.

| Service | Included service | CRD or annotations? | Port | Endpoint/Selector | Metrics update interval |
| --- | --- | --- | --- | --- | --- |
|  |  | Both: ServiceMonitor and annotations | 4004 | nexus.nexus.svc.cluster.local/metrics | 15 seconds |
| CX Contact | CX Contact API Aggregator | ServiceMonitor | 9102 | /metrics | 15 seconds |
| CX Contact | CX Contact Campaign Manager | ServiceMonitor | 3106 | /metrics | 15 seconds |
| CX Contact | CX Contact Compliance Manager | ServiceMonitor | 3107 | /metrics | 15 seconds |
| CX Contact | CX Contact Dial Manager | ServiceMonitor | 3109 | /metrics | 15 seconds |
| CX Contact | CX Contact Job Scheduler | ServiceMonitor | 3108 | /metrics | 15 seconds |
| CX Contact | CX Contact List Builder | ServiceMonitor | 3104 | /metrics | 15 seconds |
| CX Contact | CX Contact List Manager | ServiceMonitor | 3105 | /metrics | 15 seconds |
| Designer | Designer Application Server | ServiceMonitor | 8081 | See selector details on the DAS metrics and alerts page | 10 seconds |
| Designer | Designer | ServiceMonitor | 8888 | See selector details on the DES metrics and alerts page | 10 seconds |
| Email Service | Email Service | Both or either, depends on harvester | Default is 4024 (overridden by values) | /iwd-email/v3/metrics | 15 seconds recommended, depends on harvester |
| Genesys Authentication | Authentication Service | Annotations | 8081 | /prometheus | Real-time |
| Genesys Authentication | Environment Service | Annotations | 8081 | /prometheus | Real-time |
| Genesys Customer Experience Insights | Genesys CX Insights | ServiceMonitor | 8180 | See selector details on the GCXI metrics and alerts page | 15 minutes |
| Genesys Customer Experience Insights | Reporting and Analytics Aggregates | PodMonitor and PrometheusRule | metrics: 9100, health: 9101 | See selector details on the RAA metrics and alerts page | metrics: several seconds, health: up to 3 minutes |
| Genesys Info Mart | GIM Config Adapter | PodMonitor | 9249 | See selector details on the GCA metrics and alerts page | 30 seconds |
| Genesys Info Mart | GIM | PodMonitor | 8249 | See selector details on the GIM metrics and alerts page | 30 seconds |
| Genesys Info Mart | GIM Stream Processor | PodMonitor | 9249 | See selector details on the GSP metrics and alerts page | 30 seconds |
| Genesys Pulse | Tenant Data Collection Unit (DCU) | PodMonitor | 9091 | See selector details on the Tenant Data Collection Unit (DCU) metrics and alerts page | 30 seconds |
| Genesys Pulse | Tenant Load Distribution Server (LDS) | PodMonitor | 9091 | See selector details on the Tenant Load Distribution Server (LDS) metrics and alerts page | 30 seconds |
| Genesys Pulse | Pulse Web Service | ServiceMonitor | 8090 | See selector details on the Pulse metrics and alerts page | 30 seconds |
| Genesys Pulse | Tenant Permissions Service |  |  |  |  |
| Genesys Voice Platform | Voice Platform Configuration Server | Service/Pod Monitoring Settings | Not applicable | See selector details on the Voice Platform Configuration Server metrics and alerts page |  |
| Genesys Voice Platform | Voice Platform Media Control Platform | Service/Pod Monitoring Settings | 9116, 8080, 8200 | See selector details on the Voice Platform Media Control Platform metrics and alerts page |  |
| Genesys Voice Platform | Voice Platform Service Discovery | Automatic | 9090 | See selector details on the Voice Platform Service Discovery metrics and alerts page |  |
| Genesys Voice Platform | Voice Platform Reporting Server | ServiceMonitor / PodMonitor | 9116 | See selector details on the Voice Platform Reporting Server metrics and alerts page |  |
| Genesys Voice Platform | Voice Platform Resource Manager | ServiceMonitor / PodMonitor | 9116, 8200 | See selector details on the Voice Platform Resource Manager metrics and alerts page |  |
| Interaction Server (IXN) | Interaction Server (IXN) | PodMonitor | 13131, 13133, 13139 | option ixnServer.ports.health - default port 13131 - Endpoint: "/health/prometheus/all"; option ixnNode.ports.default - default port 13133 - Endpoint: "/metrics"; option ixnVQNode.ports.health - default port 13139 - Endpoint: "/metrics". Note: The above options are references to ports that match endpoints. Use these options to perform the associated query. | Default |
| Tenant Service | Tenant Service | PodMonitor | 15000 | /metrics (http://<pod address>:15000/metrics) | 30 seconds (Applicable for any metric(s) that Tenant Service exposes. The update interval is not a property of the metric; it is a property of the optional PodMonitor that you can create.) |
| Voice Microservices | Agent State Service | PodMonitor | 11000 | http://<pod-ipaddress>:11000/metrics | 30 seconds |
| Voice Microservices | Call State Service | Supports both CRD and annotations | 11900 | http://<pod-ipaddress>:11900/metrics | 30 seconds |
| Voice Microservices | Config Service | Supports both CRD and annotations | 9100 | http://<pod-ipaddress>:9100/metrics | 30 seconds |
| Voice Microservices | Dial Plan Service | Supports both CRD and annotations | 8800 | http://<pod-ipaddress>:8800/metrics | 30 seconds |
| Voice Microservices | FrontEnd Service | Supports both CRD and annotations | 9101 | http://<pod-ipaddress>:9101/metrics | 30 seconds |
| Voice Microservices | ORS | Supports both CRD and annotations | 11200 | http://<pod-ipaddress>:11200/metrics | 30 seconds |
| Voice Microservices | Voice Registrar Service | Supports both CRD and annotations | 11500 | http://<pod-ipaddress>:11500/metrics | 30 seconds |
| Voice Microservices | Voice RQ Service | Supports both CRD and annotations | 12000 | http://<pod-ipaddress>:12000/metrics | 30 seconds |
| Voice Microservices | Voice SIP Cluster Service | Supports both CRD and annotations | 11300 | http://<pod-ipaddress>:11300/metrics | 30 seconds |
| Voice Microservices | Voice SIP Proxy Service | Supports both CRD and annotations | 11400 | http://<pod-ipaddress>:11400/metrics | 30 seconds |
| Voice Microservices | Voicemail | Supports both CRD and annotations | 8081 | http://<pod-ipaddress>:8081/metrics | 30 seconds |
| WebRTC Media Service | WebRTC Gateway Service | PodMonitor | 10052 | /metrics | 30 seconds |
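
Before wiring a service into Prometheus, it can help to confirm that the port and endpoint from the summary respond as expected. The following sketch, written under a few assumptions, builds the scrape URL from a pod address, port, and path, and parses the Prometheus text format into a dictionary. The metric names in the sample are hypothetical, and the parser is deliberately simple: it assumes samples carry no trailing timestamp.

```python
from urllib.request import urlopen


def metrics_url(pod_ip, port, path="/metrics"):
    """Build a scrape URL such as http://<pod-ipaddress>:11000/metrics."""
    return f"http://{pod_ip}:{port}{path}"


def parse_metrics(text):
    """Parse Prometheus text-format lines into {series: value}.

    Simplification: assumes each sample line ends with the value (no timestamp).
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        series, _, value = line.rpartition(" ")
        samples[series.strip()] = float(value)
    return samples


def scrape(pod_ip, port, path="/metrics", timeout=5.0):
    """Fetch and parse an endpoint; requires network access to the pod."""
    with urlopen(metrics_url(pod_ip, port, path), timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))


# Hypothetical payload, used here instead of a live endpoint.
SAMPLE_METRICS = """\
# HELP example_requests_total Hypothetical counter.
# TYPE example_requests_total counter
example_requests_total{method="GET"} 42
example_up 1
"""

if __name__ == "__main__":
    print(metrics_url("10.0.0.5", 11000))
    print(parse_metrics(SAMPLE_METRICS))
```

From inside the cluster, `scrape("<pod-ipaddress>", 11000)` would return the samples exposed by the pod at that address.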

Sample Prometheus queries to collect metrics

For each query in Prometheus, you can view the results as a graph or as console output.

Query 1: kubelet_http_requests_total

Query 2: sum(irate(sipproxy_requests_processed_self_total{pod=~"voice-sipproxy-0"}[5m])) by (pod,method)

Query 3: node_cpu_utilisation:avg1m
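
The same queries can be sent to the Prometheus HTTP API instead of the web UI. The sketch below builds instant-query URLs for the `/api/v1/query` endpoint and extracts label and value pairs from a vector response. The base URL `http://prometheus.example:9090` and the sample response body are assumptions for illustration, not data from a real deployment.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

QUERIES = [
    "kubelet_http_requests_total",
    'sum(irate(sipproxy_requests_processed_self_total{pod=~"voice-sipproxy-0"}[5m])) by (pod,method)',
    "node_cpu_utilisation:avg1m",
]


def instant_query_url(base_url, promql):
    """Build a URL for the Prometheus instant-query endpoint."""
    query_string = urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{query_string}"


def vector_samples(response_body):
    """Extract (labels, value) pairs from an instant-query response of type vector."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    return [
        (item["metric"], float(item["value"][1]))
        for item in payload["data"]["result"]
    ]


# Hypothetical response body shaped like an instant-query result.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"pod": "voice-sipproxy-0", "method": "INVITE"},
             "value": [1640023380.0, "3.5"]},
        ],
    },
})

if __name__ == "__main__":
    for promql in QUERIES:
        print(instant_query_url("http://prometheus.example:9090", promql))
    print(vector_samples(SAMPLE_RESPONSE))
```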