System metrics
Find useful metrics provided by Kubernetes and other system resources to monitor the status and performance of the cluster and nodes.
Kubernetes and Node metrics
In addition to the service-defined metrics described in the service-level guides (see links here), standard Kubernetes and other system metrics are important for monitoring the status and performance of your clusters, nodes, and services.
Kubernetes metrics
For full information about all the cluster metrics Kubernetes provides, see the Kubernetes documentation. Genesys recommends that you pay attention to the following cluster-related metrics in particular.
Metric | Prometheus formula | Indicator of
{{PESystemMetric |
Metric=Pod Restarts | MetricDescription= | PrometheusFormula=increase(kube_pod_container_status_restarts_total{namespace="$namespace", pod=~"$service.*"}[1m]) | Type=Kubernetes | UsedFor=
}} {{PESystemMetric |
Metric=The cgroup's total memory | MetricDescription= | PrometheusFormula=sum(container_memory_usage_bytes{namespace="$namespace",pod=~"$service-.*", container!=""}) by (pod) | Type=Kubernetes | UsedFor=Memory
}} {{PESystemMetric |
Metric=The cgroup's CPU usage | MetricDescription= | PrometheusFormula=sum (rate (container_cpu_usage_seconds_total{namespace="$namespace",pod=~"$service-.*", container!="POD"}[1m])) by (pod) * 100 | Type=Kubernetes | UsedFor=CPU utilization
}} {{PESystemMetric |
Metric=Bytes transmitted over the network by the container | MetricDescription= | PrometheusFormula=rate(container_network_transmit_bytes_total{namespace="$namespace",pod=~"$service-.*", container!=""}[1m]) | Type=Kubernetes | UsedFor=
}} {{PESystemMetric |
Metric=Bytes received over the network by the container | MetricDescription= | PrometheusFormula=rate(container_network_receive_bytes_total{namespace="$namespace",pod=~"$service-.*", container!=""}[1m]) | Type=Kubernetes | UsedFor=
}}
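The formulas in the table above contain the tokens `$namespace` and `$service`, which look like dashboard-style variables rather than literal label values. The following is a minimal sketch, under that assumption, of how one might substitute concrete values and build a request URL for the Prometheus HTTP API instant-query endpoint (`/api/v1/query`). The helper names, the base URL, and the example namespace and service names are hypothetical and are not part of the Genesys documentation.

```python
from string import Template
from urllib.parse import urlencode

# Formula copied from the "The cgroup's CPU usage" row of the table above.
CPU_FORMULA = (
    'sum (rate (container_cpu_usage_seconds_total{namespace="$namespace",'
    'pod=~"$service-.*", container!="POD"}[1m])) by (pod) * 100'
)


def render_formula(formula: str, namespace: str, service: str) -> str:
    """Replace the $namespace and $service tokens with concrete values.

    Assumption: the tokens are simple placeholders, so string.Template
    substitution is enough. Label braces in PromQL are left untouched.
    """
    return Template(formula).substitute(namespace=namespace, service=service)


def instant_query_url(base_url: str, promql: str) -> str:
    """Build a URL for a Prometheus instant query (GET /api/v1/query)."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


if __name__ == "__main__":
    # "my-namespace", "my-service", and the host below are hypothetical values.
    query = render_formula(CPU_FORMULA, "my-namespace", "my-service")
    print(query)
    print(instant_query_url("http://prometheus.example:9090", query))
```

The same `render_formula` helper can be applied to any other formula in the table, because they all use the same two tokens.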
Node metrics
Genesys recommends that you pay attention to the following node-related metrics in particular.
Metric | Prometheus formula | Indicator of
{{PESystemMetric |
Metric=Process HEAP All | MetricDescription= | PrometheusFormula={SERVICE_NAME}_process_heap_bytes{pod=~"$pod",service="$service"} | Type=Node | UsedFor=Heap status
}} {{PESystemMetric |
Metric=Process CPU All | MetricDescription= | PrometheusFormula=sum(rate({SERVICE_NAME}_process_cpu_seconds_total{pod=~"$pod",service="$service"}[30s]) * 100) by (pod) | Type=Node | UsedFor=CPU utilization
}} {{PESystemMetric |
Metric=Process Memory All: resident memory | MetricDescription= | PrometheusFormula={SERVICE_NAME}_process_resident_memory_bytes{pod=~"$pod",service="$service"} | Type=Node | UsedFor=Memory
}} {{PESystemMetric |
Metric=Process Memory All: virtual memory | MetricDescription= | PrometheusFormula={SERVICE_NAME}_process_virtual_memory_bytes{pod=~"$pod",service="$service"} | Type=Node | UsedFor=Memory
}}
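The node formulas above begin with a `{SERVICE_NAME}` placeholder that has to be replaced with a service-specific metric prefix before the query can run. The sketch below shows one way to expand that placeholder together with the `$pod` and `$service` tokens. It assumes the prefix must be a valid classic Prometheus metric name (letters, digits, underscores, and colons, not starting with a digit). The function name and the example values are hypothetical. How a given service derives its prefix is not stated here, so the sketch only validates the prefix and does not transform it.

```python
import re
from string import Template

# Formula copied from the "Process Memory All: resident memory" row above.
RESIDENT_MEMORY_FORMULA = (
    '{SERVICE_NAME}_process_resident_memory_bytes'
    '{pod=~"$pod",service="$service"}'
)

# Characters allowed in a classic (unquoted) Prometheus metric name.
METRIC_NAME_RE = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")


def render_node_formula(formula: str, metric_prefix: str, pod: str, service: str) -> str:
    """Expand {SERVICE_NAME}, $pod, and $service in a node formula.

    Raises ValueError if the prefix could not form a valid metric name,
    for example because it contains a hyphen.
    """
    if not METRIC_NAME_RE.fullmatch(metric_prefix):
        raise ValueError(f"invalid metric name prefix: {metric_prefix!r}")
    expanded = formula.replace("{SERVICE_NAME}", metric_prefix)
    return Template(expanded).substitute(pod=pod, service=service)


if __name__ == "__main__":
    # "myservice" is a hypothetical prefix and service name.
    print(render_node_formula(RESIDENT_MEMORY_FORMULA, "myservice", "myservice-.*", "myservice"))
```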