Cargo query
Showing below up to 100 results in range #201 to #300.
Page | Alert | Severity | AlertDescription | BasedOn | Threshold |
---|---|---|---|---|---|
Draft:PEC-OU/Current/CXCPEGuide/CPLMMetrics | CXC-PodRestartsCount | HIGH | Triggered when the restart count for a pod is beyond the threshold. | | 1 for 5m |
Draft:PEC-OU/Current/CXCPEGuide/CPLMMetrics | CXC-PodRestartsCountPD | HIGH | Triggered when the restart count is beyond the critical threshold. | | 5 for 5m |
Draft:PEC-OU/Current/CXCPEGuide/CPLMMetrics | CXC-PodsNotReadyPD | HIGH | Triggered when there are no pods ready for a CX Contact deployment. | | 0 for 1m |
Draft:PEC-OU/Current/CXCPEGuide/DMMetrics | CXC-CPUUsage | HIGH | Triggered when the CPU utilization of a pod is beyond the threshold. | | 300% for 5m |
Draft:PEC-OU/Current/CXCPEGuide/DMMetrics | CXC-DM-LatencyHigh | HIGH | Triggered when the latency for dial manager is above the defined threshold. | | 5000ms for 5m |
Draft:PEC-OU/Current/CXCPEGuide/DMMetrics | CXC-MemoryUsage | HIGH | Triggered when the memory utilization of a pod is beyond the threshold. | | 70% for 5m |
Draft:PEC-OU/Current/CXCPEGuide/DMMetrics | CXC-MemoryUsagePD | HIGH | Triggered when the memory usage of a pod is above the critical threshold. | | 90% for 5m |
Draft:PEC-OU/Current/CXCPEGuide/DMMetrics | CXC-PodNotReadyCount | HIGH | Triggered when the number of pods ready for a CX Contact deployment is less than or equal to the threshold. | | 1 for 5m |
Draft:PEC-OU/Current/CXCPEGuide/DMMetrics | CXC-PodRestartsCount | HIGH | Triggered when the restart count for a pod is beyond the threshold. | | 1 for 5m |
Draft:PEC-OU/Current/CXCPEGuide/DMMetrics | CXC-PodRestartsCountPD | HIGH | Triggered when the restart count is beyond the critical threshold. | | 5 for 5m |
Draft:PEC-OU/Current/CXCPEGuide/DMMetrics | CXC-PodsNotReadyPD | HIGH | Triggered when there are no pods ready for a CX Contact deployment. | | 0 for 1m |
Draft:PEC-OU/Current/CXCPEGuide/JSMetrics | CXC-CPUUsage | HIGH | Triggered when the CPU utilization of a pod is beyond the threshold. | | 300% for 5m |
Draft:PEC-OU/Current/CXCPEGuide/JSMetrics | CXC-JS-LatencyHigh | HIGH | Triggered when the latency for job scheduler is above the defined threshold. | | 5000ms for 5m |
Draft:PEC-OU/Current/CXCPEGuide/JSMetrics | CXC-MemoryUsage | HIGH | Triggered when the memory utilization of a pod is beyond the threshold. | | 70% for 5m |
Draft:PEC-OU/Current/CXCPEGuide/JSMetrics | CXC-MemoryUsagePD | HIGH | Triggered when the memory usage of a pod is above the critical threshold. | | 90% for 5m |
Draft:PEC-OU/Current/CXCPEGuide/JSMetrics | CXC-PodNotReadyCount | HIGH | Triggered when the number of pods ready for a CX Contact deployment is less than or equal to the threshold. | | 1 for 5m |
Draft:PEC-OU/Current/CXCPEGuide/JSMetrics | CXC-PodRestartsCount | HIGH | Triggered when the restart count for a pod is beyond the threshold. | | 1 for 5m |
Draft:PEC-OU/Current/CXCPEGuide/JSMetrics | CXC-PodRestartsCountPD | HIGH | Triggered when the restart count is beyond the critical threshold. | | 5 for 5m |
Draft:PEC-OU/Current/CXCPEGuide/JSMetrics | CXC-PodsNotReadyPD | HIGH | Triggered when there are no pods ready for a CX Contact deployment. | | 0 for 1m |
Draft:PEC-OU/Current/CXCPEGuide/LBMetrics | CXC-CPUUsage | HIGH | Triggered when the CPU utilization of a pod is beyond the threshold. | | 300% for 5m |
Draft:PEC-OU/Current/CXCPEGuide/LBMetrics | CXC-LB-LatencyHigh | HIGH | Triggered when the latency for list builder is above the defined threshold. | | 5000ms for 5m |
Draft:PEC-OU/Current/CXCPEGuide/LBMetrics | CXC-MemoryUsage | HIGH | Triggered when the memory utilization of a pod is beyond the threshold. | | 70% for 5m |
Draft:PEC-OU/Current/CXCPEGuide/LBMetrics | CXC-MemoryUsagePD | HIGH | Triggered when the memory usage of a pod is above the critical threshold. | | 90% for 5m |
Draft:PEC-OU/Current/CXCPEGuide/LBMetrics | CXC-PodNotReadyCount | HIGH | Triggered when the number of pods ready for a CX Contact deployment is less than or equal to the threshold. | | 1 for 5m |
Draft:PEC-OU/Current/CXCPEGuide/LBMetrics | CXC-PodRestartsCount | HIGH | Triggered when the restart count for a pod is beyond the threshold. | | 1 for 5m |
Draft:PEC-OU/Current/CXCPEGuide/LBMetrics | CXC-PodRestartsCountPD | HIGH | Triggered when the restart count is beyond the critical threshold. | | 5 for 5m |
Draft:PEC-OU/Current/CXCPEGuide/LBMetrics | CXC-PodsNotReadyPD | HIGH | Triggered when there are no pods ready for a CX Contact deployment. | | 0 for 1m |
Draft:PEC-OU/Current/CXCPEGuide/LMMetrics | CXC-CPUUsage | HIGH | Triggered when the CPU utilization of a pod is beyond the threshold. | | 300% for 5m |
Draft:PEC-OU/Current/CXCPEGuide/LMMetrics | CXC-LM-LatencyHigh | HIGH | Triggered when the latency for list manager is above the defined threshold. | | 5000ms for 5m |
Draft:PEC-OU/Current/CXCPEGuide/LMMetrics | CXC-MemoryUsage | HIGH | Triggered when the memory utilization of a pod is beyond the threshold. | | 70% for 5m |
Draft:PEC-OU/Current/CXCPEGuide/LMMetrics | CXC-MemoryUsagePD | HIGH | Triggered when the memory usage of a pod is above the critical threshold. | | 90% for 5m |
Draft:PEC-OU/Current/CXCPEGuide/LMMetrics | CXC-PodNotReadyCount | HIGH | Triggered when the number of pods ready for a CX Contact deployment is less than or equal to the threshold. | | 1 for 5m |
Draft:PEC-OU/Current/CXCPEGuide/LMMetrics | CXC-PodRestartsCount | HIGH | Triggered when the restart count for a pod is beyond the threshold. | | 1 for 5m |
Draft:PEC-OU/Current/CXCPEGuide/LMMetrics | CXC-PodRestartsCountPD | HIGH | Triggered when the restart count is beyond the critical threshold. | | 5 for 5m |
Draft:PEC-OU/Current/CXCPEGuide/LMMetrics | CXC-PodsNotReadyPD | HIGH | Triggered when there are no pods ready for a CX Contact deployment. | | 0 for 1m |
Draft:PEC-OU/Current/CXCPEGuide/LMMetrics | cxc_list_manager_too_many_errors_from_auth | HIGH | Triggered when there are too many error responses from the auth service (list manager) for longer than the specified time threshold. | | 1m |
Draft:PEC-REP/Current/GCXIPEGuide/GCXIMetrics | gcxi__cluster__info | | This alert indicates problems with the cluster state. Applicable only if you have two or more nodes in a cluster. | gcxi__cluster__info | |
Draft:PEC-REP/Current/GCXIPEGuide/GCXIMetrics | gcxi__projects__status | | If the value of gcxi__projects__status is greater than 0, this alarm is set, indicating that reporting is not functioning properly. | gcxi__projects__status | > 0 |
Draft:PEC-REP/Current/GCXIPEGuide/RAAMetrics | raa-errors | '''Specified by''': raa. '''Recommended value''': warning | A nonzero value indicates that errors have been logged during the scrape interval. | gcxi_raa_error_count | >0 |
Draft:PEC-REP/Current/GCXIPEGuide/RAAMetrics | raa-health | '''Specified by''': raa. '''Recommended value''': severe | A zero value for a recent period (several scrape intervals) indicates that RAA is not operating. | gcxi_raa_health_level | '''Specified by''': raa. '''Recommended value''': 30m |
Draft:PEC-REP/Current/GCXIPEGuide/RAAMetrics | raa-long-aggregation | '''Specified by''': raa. '''Recommended value''': warning | Indicates that the average duration of aggregation queries specified by the hierarchy, level, and mediaType labels is greater than the deadlock-threshold. | gcxi_raa_aggregated_duration_ms / gcxi_raa_aggregated_count | Greater than the value (seconds) of raa.prometheusRule.alerts.longAggregation.thresholdSec in values.yaml. '''Recommended value''': 300 |
Draft:PEC-REP/Current/GIMPEGuide/GCAMetrics | GcaOOMKilled | Critical | Triggered when a GCA pod is restarted because of OOMKilled. | kube_pod_container_status_restarts_total and kube_pod_container_status_last_terminated_reason | 1 |
Draft:PEC-REP/Current/GIMPEGuide/GCAMetrics | GcaPodCrashLooping | Critical | Triggered when a GCA pod is crash looping. | kube_pod_container_status_restarts_total | The restart rate is greater than 0 for 5 minutes |
Draft:PEC-REP/Current/GIMPEGuide/GSPMetrics | GspFlinkJobDown | Critical | Triggered when the GSP Flink job is not running (the number of running jobs equals 0 or the metric is not available). | flink_jobmanager_numRunningJobs | For 5 minutes |
Draft:PEC-REP/Current/GIMPEGuide/GSPMetrics | GspNoTmRegistered | Critical | Triggered when there are no registered TaskManagers (or the metric is not available). | flink_jobmanager_numRegisteredTaskManagers | For 5 minutes |
Draft:PEC-REP/Current/GIMPEGuide/GSPMetrics | GspOOMKilled | Critical | Triggered when a GSP pod is restarted because of OOMKilled. | kube_pod_container_status_restarts_total | 0 |
Draft:PEC-REP/Current/GIMPEGuide/GSPMetrics | GspUnknownPerson | High | Triggered when GSP encounters unknown person(s). | flink_ | For 5 minutes |
Draft:PEC-REP/Current/PulsePEGuide/dcuMetrics | pulse_dcu_critical_col_connected_configservers | Critical | Pulse DCU Collector is not connected to ConfigServer. | pulse_collector_connection_status | for 15 minutes |
Draft:PEC-REP/Current/PulsePEGuide/dcuMetrics | pulse_dcu_critical_col_connected_dbservers | Critical | Pulse DCU Collector is not connected to DbServer. | pulse_collector_connection_status | for 15 minutes |
Draft:PEC-REP/Current/PulsePEGuide/dcuMetrics | pulse_dcu_critical_col_connected_statservers | Critical | Pulse DCU Collector is not connected to Stat Server. | pulse_collector_connection_status | for 15 minutes |
Draft:PEC-REP/Current/PulsePEGuide/dcuMetrics | pulse_dcu_critical_col_snapshot_writing | Critical | Pulse DCU Collector does not write snapshots. | pulse_collector_snapshot_writing_status | for 15 minutes |
Draft:PEC-REP/Current/PulsePEGuide/dcuMetrics | pulse_dcu_critical_cpu | Critical | Detected critical CPU usage by Pulse DCU Pod. | container_cpu_usage_seconds_total, kube_pod_container_resource_limits | 90% |
Draft:PEC-REP/Current/PulsePEGuide/dcuMetrics | pulse_dcu_critical_disk | Critical | Detected critical disk usage by Pulse DCU Pod. | kubelet_volume_stats_available_bytes, kubelet_volume_stats_capacity_bytes | 90% |
Draft:PEC-REP/Current/PulsePEGuide/dcuMetrics | pulse_dcu_critical_memory | Critical | Detected critical memory usage by Pulse DCU Pod. | container_memory_working_set_bytes, kube_pod_container_resource_limits | 90% |
Draft:PEC-REP/Current/PulsePEGuide/dcuMetrics | pulse_dcu_critical_nonrunning_instances | Critical | Triggered when Pulse DCU instances are down. | kube_statefulset_status_replicas_ready, kube_statefulset_status_replicas | for 15 minutes |
Draft:PEC-REP/Current/PulsePEGuide/dcuMetrics | pulse_dcu_critical_ss_connected_configservers | Critical | Pulse DCU Stat Server is not connected to ConfigServer. | pulse_statserver_server_connected_seconds | for 15 minutes |
Draft:PEC-REP/Current/PulsePEGuide/dcuMetrics | pulse_dcu_critical_ss_connected_ixnservers | Critical | Pulse DCU Stat Server is not connected to IxnServers. | pulse_statserver_server_connected_seconds | 2 |
Draft:PEC-REP/Current/PulsePEGuide/dcuMetrics | pulse_dcu_critical_ss_connected_tservers | Critical | Pulse DCU Stat Server is not connected to T-Servers. | pulse_statserver_server_connected_number | 2 |
Draft:PEC-REP/Current/PulsePEGuide/dcuMetrics | pulse_dcu_critical_ss_failed_dn_registrations | Critical | Detected critical DN registration failures on Pulse DCU Stat Server. | pulse_statserver_dn_failed, pulse_statserver_dn_registered | 0.5% |
Draft:PEC-REP/Current/PulsePEGuide/dcuMetrics | pulse_dcu_monitor_data_unavailable | Critical | Pulse DCU Monitor Agents do not provide data. | pulse_monitor_check_duration_seconds, kube_statefulset_replicas | for 15 minutes |
Draft:PEC-REP/Current/PulsePEGuide/dcuMetrics | pulse_dcu_too_frequent_restarts | Critical | Detected too frequent restarts of DCU Pod container. | kube_pod_container_status_restarts_total | 2 for 1 hour |
Draft:PEC-REP/Current/PulsePEGuide/ldsMetrics | pulse_lds_critical_cpu | Critical | Detected critical CPU usage by Pulse LDS Pod. | container_cpu_usage_seconds_total, kube_pod_container_resource_limits | 90% |
Draft:PEC-REP/Current/PulsePEGuide/ldsMetrics | pulse_lds_critical_memory | Critical | Detected critical memory usage by Pulse LDS Pod. | container_memory_working_set_bytes, kube_pod_container_resource_limits | 90% |
Draft:PEC-REP/Current/PulsePEGuide/ldsMetrics | pulse_lds_critical_nonrunning_instances | Critical | Triggered when Pulse LDS instances are down. | kube_statefulset_status_replicas_ready, kube_statefulset_status_replicas | for 15 minutes |
Draft:PEC-REP/Current/PulsePEGuide/ldsMetrics | pulse_lds_monitor_data_unavailable | Critical | Pulse LDS Monitor Agents do not provide data. | pulse_monitor_check_duration_seconds, kube_statefulset_replicas | for 15 minutes |
Draft:PEC-REP/Current/PulsePEGuide/ldsMetrics | pulse_lds_no_connected_senders | Critical | Pulse LDS is not connected to upstream servers. | pulse_lds_senders_number | for 15 minutes |
Draft:PEC-REP/Current/PulsePEGuide/ldsMetrics | pulse_lds_no_registered_dns | Critical | No DNs are registered on Pulse LDS. | pulse_lds_sender_registered_dns_number | for 30 minutes |
Draft:PEC-REP/Current/PulsePEGuide/ldsMetrics | pulse_lds_too_frequent_restarts | Critical | Detected too frequent restarts of LDS Pod container. | kube_pod_container_status_restarts_total | 2 for 1 hour |
Draft:PEC-REP/Current/PulsePEGuide/PulseMetrics | pulse_critical_5xx | Critical | Detected a critical rate of 5xx errors per second for the Pulse container. | http_server_requests_seconds_count | 15% |
Draft:PEC-REP/Current/PulsePEGuide/PulseMetrics | pulse_critical_cpu | Critical | Detected critical CPU usage by Pulse Pod. | container_cpu_usage_seconds_total, kube_pod_container_resource_limits | 90% |
Draft:PEC-REP/Current/PulsePEGuide/PulseMetrics | pulse_critical_hikari_cp | Critical | Detected critical Hikari connections pool usage by Pulse container. | hikaricp_connections_active, hikaricp_connections | 90% |
Draft:PEC-REP/Current/PulsePEGuide/PulseMetrics | pulse_critical_memory | Critical | Detected critical memory usage by Pulse Pod. | container_memory_working_set_bytes, kube_pod_container_resource_limits | 90% |
Draft:PEC-REP/Current/PulsePEGuide/PulseMetrics | pulse_critical_pulse_health | Critical | Detected a critically low number of healthy Pulse containers. | pulse_health_all_Boolean | 50% |
Draft:PEC-REP/Current/PulsePEGuide/PulseMetrics | pulse_critical_running_instances | Critical | Triggered when Pulse instances are down. | kube_deployment_status_replicas_available, kube_deployment_status_replicas | 75% |
Draft:PEC-REP/Current/PulsePEGuide/PulseMetrics | pulse_service_down | Critical | All Pulse instances are down. | up | for 15 minutes |
Draft:PEC-REP/Current/PulsePEGuide/PulseMetrics | pulse_too_frequent_restarts | Critical | Detected too frequent restarts of Pulse Pod container. | kube_pod_container_status_restarts_total | 2 for 1 hour |
Draft:PEC-REP/Current/PulsePEGuide/PulsePermissionsMetrics | pulse_permissions_critical_cpu | Critical | Detected critical CPU usage by Pulse Permissions Pod. | container_cpu_usage_seconds_total, kube_pod_container_resource_limits | 90% |
Draft:PEC-REP/Current/PulsePEGuide/PulsePermissionsMetrics | pulse_permissions_critical_memory | Critical | Detected critical memory usage by Pulse Permissions Pod. | container_memory_working_set_bytes, kube_pod_container_resource_limits | 90% |
Draft:PEC-REP/Current/PulsePEGuide/PulsePermissionsMetrics | pulse_permissions_critical_running_instances | Critical | Triggered when Pulse Permissions instances are down. | kube_deployment_status_replicas_available, kube_deployment_status_replicas | 75% |
Draft:PEC-REP/Current/PulsePEGuide/PulsePermissionsMetrics | pulse_permissions_too_frequent_restarts | Critical | Detected too frequent restarts of Permissions Pod container. | kube_pod_container_status_restarts_total | 2 for 1 hour |
Draft:STRMS/Current/STRMSPEGuide/ServiceMetrics | streams_GWS_AUTH_DOWN | critical | Unable to connect to GWS auth service | gws_auth_down | 10 seconds |
Draft:STRMS/Current/STRMSPEGuide/STRMSMetrics | streams_BATCH_LAG_TIME | warning | Message handling exceeds 2 seconds | | 30 seconds |
Draft:STRMS/Current/STRMSPEGuide/STRMSMetrics | streams_DOWN | critical | The number of running instances is 0 | sum(up) < 1 | 10 seconds |
Draft:STRMS/Current/STRMSPEGuide/STRMSMetrics | streams_ENDPOINT_CONNECTION_DOWN | warning | Unable to connect to a customer endpoint | endpoint_connection_down | 30 seconds |
Draft:STRMS/Current/STRMSPEGuide/STRMSMetrics | streams_ENGAGE_KAFKA_CONNECTION_DOWN | critical | Unable to connect to Engage Kafka | engage_kafka_main_connection_down | 10 seconds |
Draft:STRMS/Current/STRMSPEGuide/STRMSMetrics | streams_GWS_AUTH_DOWN | critical | Unable to connect to GWS auth service | gws_auth_down | 30 seconds |
Draft:STRMS/Current/STRMSPEGuide/STRMSMetrics | streams_GWS_CONFIG_DOWN | critical | Unable to connect to GWS config service | gws_config_down | |
Draft:STRMS/Current/STRMSPEGuide/STRMSMetrics | streams_GWS_ENV_DOWN | critical | Unable to connect to GWS environment service | gws_env_down | 30 seconds |
Draft:STRMS/Current/STRMSPEGuide/STRMSMetrics | streams_INIT_ERROR | critical | Aborted due to an initialization error, e.g., KAFKA_FQDN is not defined | application_streams_init_error > 0 | 10 seconds |
Draft:STRMS/Current/STRMSPEGuide/STRMSMetrics | streams_REDIS_DOWN | critical | Unable to connect to Redis | redis_connection_down | 10 seconds |
Draft:TLM/Current/TLMPEGuide/TLMMetrics | Http Errors Occurrences Exceeded Threshold | High | Triggered when the number of HTTP errors exceeds 500 responses in 5 minutes | telemetry_events{eventName=~"http_error_.*", eventName!="http_error_404"} | >500 in 5 minutes |
Draft:TLM/Current/TLMPEGuide/TLMMetrics | Telemetry CPU Utilization is Greater Than Threshold | High | Triggered when average CPU usage is more than 60% | node_cpu_seconds_total | >60% |
Draft:TLM/Current/TLMPEGuide/TLMMetrics | Telemetry Dependency Status | Low | Triggered when there is no connection to one of the dependent services (GAuth, Config, or Prometheus) | telemetry_dependency_status | <80 |
Draft:TLM/Current/TLMPEGuide/TLMMetrics | Telemetry GAuth Time Alert | High | Triggered when there is no connection to the GAuth service | telemetry_gws_auth_req_time | >10000 |
Draft:TLM/Current/TLMPEGuide/TLMMetrics | Telemetry Healthy Pod Count Alert | High | Triggered when the number of healthy pods drops to a critical level | kube_pod_container_status_ready | <2 |
Draft:TLM/Current/TLMPEGuide/TLMMetrics | Telemetry High Network Traffic | High | Triggered when network traffic exceeds 10MB/second for 5 minutes | node_network_transmit_bytes_total, node_network_receive_bytes_total | >10MBps |
Draft:TLM/Current/TLMPEGuide/TLMMetrics | Telemetry Memory Usage is Greater Than Threshold | High | Triggered when average memory usage is more than 60% | container_cpu_usage_seconds_total, kube_pod_container_resource_limits_cpu_cores | >60% |
Draft:UCS/Current/UCSPEGuide/UCSMetrics | ucsx_elasticsearch_health_status | critical | Triggered when there is no connection to Elasticsearch | ucsx_elasticsearch_health_status | 2 minutes |
Draft:UCS/Current/UCSPEGuide/UCSMetrics | ucsx_elasticsearch_slow_processing_time | critical | Triggered when Elasticsearch internal processing time > 500 ms | ucsx_elastic_search_sum, ucsx_elastic_search_count | 5 minutes |
Draft:UCS/Current/UCSPEGuide/UCSMetrics | ucsx_instance_high_cpu_utilization | warning | Triggered when average CPU usage is more than 80% | ucsx_performance | 5 minutes |
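Many rows above (for example, CXC-PodRestartsCount, GcaPodCrashLooping, and the pulse_*_too_frequent_restarts alerts) pair kube_pod_container_status_restarts_total with a value and a sustained-for duration. As a minimal sketch only, such a "1 for 5m" threshold might be expressed as a Prometheus Operator PrometheusRule along the following lines; the rule name, namespace, and label values are illustrative assumptions, not the shipped configuration.
<syntaxhighlight lang="yaml">
# Sketch of a restart-count alert. Only the metric name, severity, and the
# "1 for 5m" shape come from the table; everything else is illustrative.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-pod-restarts      # hypothetical name
  namespace: monitoring           # hypothetical namespace
spec:
  groups:
    - name: example.rules
      rules:
        - alert: CXC-PodRestartsCount
          # Fires when a container restarted more than once over the last
          # 5 minutes and the condition holds for 5 minutes; this is one
          # plausible reading of "1 for 5m", not the verified rule body.
          expr: increase(kube_pod_container_status_restarts_total[5m]) > 1
          for: 5m
          labels:
            severity: high
          annotations:
            summary: Pod restart count is beyond the threshold.
</syntaxhighlight>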
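The Pulse CPU and memory rows compare a usage metric against the matching kube_pod_container_resource_limits value at 90%. A sketch of that ratio for memory, assuming kube-state-metrics v2 or later (which exposes limits through a resource label; older releases used per-resource metrics such as kube_pod_container_resource_limits_memory_bytes):
<syntaxhighlight lang="yaml">
# Sketch of the 90%-of-limit pattern behind pulse_dcu_critical_memory.
# Label matching and aggregation are assumptions; adjust to your
# kube-state-metrics version.
- alert: pulse_dcu_critical_memory
  expr: |
    sum by (namespace, pod) (container_memory_working_set_bytes{container!=""})
      /
    sum by (namespace, pod) (kube_pod_container_resource_limits{resource="memory"})
      > 0.90
  for: 5m          # assumption: the table does not state a duration for this row
  labels:
    severity: critical
</syntaxhighlight>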
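The raa-long-aggregation row divides gcxi_raa_aggregated_duration_ms by gcxi_raa_aggregated_count to obtain a mean aggregation-query duration in milliseconds, then compares it with a threshold configured in seconds (raa.prometheusRule.alerts.longAggregation.thresholdSec in values.yaml). A sketch of that arithmetic, assuming both metrics are counters; the rate() window and the omission of the hierarchy, level, and mediaType grouping labels are assumptions:
<syntaxhighlight lang="yaml">
# Sketch: mean duration in ms = total ms / count; divide by 1000 to compare
# against the threshold in seconds (recommended value 300, per the table).
- alert: raa-long-aggregation
  expr: |
    ( rate(gcxi_raa_aggregated_duration_ms[10m])
        / rate(gcxi_raa_aggregated_count[10m]) ) / 1000
      > 300
  labels:
    severity: warning
</syntaxhighlight>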
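Finally, the streams_DOWN row states its expression directly (sum(up) < 1). Wrapped in a rule it might look like the following, where the job selector is a placeholder and the 10-second duration comes from the Threshold column:
<syntaxhighlight lang="yaml">
# Sketch only; the job label value is a hypothetical selector.
- alert: streams_DOWN
  expr: sum(up{job="streams"}) < 1
  for: 10s
  labels:
    severity: critical
  annotations:
    summary: The number of running instances is 0.
</syntaxhighlight>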