Control plane metrics with Prometheus
The Kubernetes API server exposes a number of metrics that are useful for monitoring and
analysis. These metrics are exposed internally through a metrics endpoint that refers to the
/metrics HTTP API. Like other endpoints, this endpoint is exposed on the Amazon EKS control plane.
Viewing the raw metrics
To view the raw metrics output, use kubectl with the --raw flag. This command allows you to
pass any HTTP path and returns the raw response.
kubectl get --raw /metrics
An example output is as follows.
[...]
# HELP rest_client_requests_total Number of HTTP requests, partitioned by status code, method, and host.
# TYPE rest_client_requests_total counter
rest_client_requests_total{code="200",host="127.0.0.1:21362",method="POST"} 4994
rest_client_requests_total{code="200",host="127.0.0.1:443",method="DELETE"} 1
rest_client_requests_total{code="200",host="127.0.0.1:443",method="GET"} 1.326086e+06
rest_client_requests_total{code="200",host="127.0.0.1:443",method="PUT"} 862173
rest_client_requests_total{code="404",host="127.0.0.1:443",method="GET"} 2
rest_client_requests_total{code="409",host="127.0.0.1:443",method="POST"} 3
rest_client_requests_total{code="409",host="127.0.0.1:443",method="PUT"} 8
# HELP ssh_tunnel_open_count Counter of ssh tunnel total open attempts
# TYPE ssh_tunnel_open_count counter
ssh_tunnel_open_count 0
# HELP ssh_tunnel_open_fail_count Counter of ssh tunnel failed open attempts
# TYPE ssh_tunnel_open_fail_count counter
ssh_tunnel_open_fail_count 0
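Because the full output is long, it can help to narrow it to a single metric. The following is a minimal sketch, not part of the original procedure: the helper name metric_lines is hypothetical, and it assumes the output follows the layout shown in the example above.

```shell
# metric_lines NAME: read Prometheus text-format output on stdin and print
# only the HELP, TYPE, and sample lines that belong to the metric NAME.
# (Hypothetical helper name, used here for illustration only.)
metric_lines() {
  # A sample line is NAME followed by "{" (labels) or a space (no labels).
  # The optional prefix also matches the "# HELP NAME" and "# TYPE NAME" lines.
  grep -E "^(# (HELP|TYPE) )?$1([{ ]|\$)"
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get --raw /metrics | metric_lines ssh_tunnel_open_count
```

Matching on the character that follows the name keeps ssh_tunnel_open_count from also selecting ssh_tunnel_open_fail_count or other metrics that share a prefix.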
This raw output returns verbatim what the API server exposes. These metrics are
represented in a Prometheus format, with one sample per line in the following form.
metric_name{tag="value"[,...]} value
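To make the line format concrete, the following sketch totals every sample of one metric across its label combinations. It is an illustration under stated assumptions rather than a definitive parser: the function name sum_metric is hypothetical, and it assumes each sample line ends with its value, with no trailing timestamp.

```shell
# sum_metric NAME: read Prometheus text-format output on stdin and print the
# sum of every sample of the metric NAME, across all label combinations.
# Assumption: each sample line ends with its value (no trailing timestamp).
# (Hypothetical helper name, used here for illustration only.)
sum_metric() {
  awk -v name="$1" '
    /^#/ { next }                  # skip "# HELP" and "# TYPE" lines
    {
      metric = $1
      sub(/[{].*/, "", metric)     # drop the {tag="value",...} part
      if (metric == name) total += $NF
    }
    END { printf "%.0f\n", total }
  '
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get --raw /metrics | sum_metric rest_client_requests_total
```

Values in scientific notation, such as 1.326086e+06, are converted numerically by awk. For the seven rest_client_requests_total samples in the example output, the function prints 2193267.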
While this endpoint is useful if you are looking for a specific metric, you typically want to analyze these metrics over time. To do this, you can deploy Prometheus into your cluster.
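One reason a single snapshot is of limited use is that counters such as rest_client_requests_total only ever increase, so the interesting quantity is how fast they grow. The sketch below computes a per-second rate from two samples. It is a simplified illustration: the function name counter_rate and the sample values in the usage comment are hypothetical, and unlike a monitoring system it does not handle counter resets.

```shell
# counter_rate PREV CURR SECONDS: print the per-second increase of a counter
# between two samples taken SECONDS apart.
# Assumptions: CURR >= PREV (no counter reset) and SECONDS is not zero.
# (Hypothetical helper name, used here for illustration only.)
counter_rate() {
  awk -v prev="$1" -v curr="$2" -v secs="$3" \
    'BEGIN { printf "%.2f\n", (curr - prev) / secs }'
}

# Example with made-up values: a counter that moved from 862173 to 862773
# over 60 seconds grew by 600, or 10 per second.
#   counter_rate 862173 862773 60
```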
Deploying Prometheus
This topic helps you deploy Prometheus into your cluster with Helm V3.
If you already have Helm installed, you can check your version with the helm version
command. Helm is a package manager for Kubernetes clusters. For more
information about Helm and how to install it, see Using Helm with Amazon EKS.
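If you want to script the version check, one option is a small shell helper like the following. This is a sketch under stated assumptions: the function name is_helm_v3 is hypothetical, and it assumes a version string in the form printed by helm version --short, such as v3.9.0+g7ceeda6 (an illustrative value, not taken from this topic).

```shell
# is_helm_v3 VERSION: succeed if VERSION names a Helm 3 release.
# Assumption: VERSION looks like the output of `helm version --short`,
# for example v3.9.0+g7ceeda6.
# (Hypothetical helper name, used here for illustration only.)
is_helm_v3() {
  case "$1" in
    v3.*) return 0 ;;
    *)    return 1 ;;
  esac
}

# Usage (requires Helm to be installed):
#   is_helm_v3 "$(helm version --short)" && echo "Helm V3 detected"
```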
After you configure Helm for your Amazon EKS cluster, you can use it to deploy Prometheus with the following steps.
To deploy Prometheus using Helm
1. Create a Prometheus namespace.
   kubectl create namespace prometheus
2. Add the prometheus-community chart repository.
   helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
3. Deploy Prometheus.
   helm upgrade -i prometheus prometheus-community/prometheus \
       --namespace prometheus \
       --set alertmanager.persistentVolume.storageClass="gp2",server.persistentVolume.storageClass="gp2"
   Note
   If you get the error Error: failed to download "stable/prometheus" (hint: running `helm repo update` may help) when executing this command, run helm repo update prometheus-community, and then try running the Step 3 command again.
   If you get the error Error: rendered manifests contain a resource that already exists, run helm uninstall your-release-name -n namespace, and then try running the Step 3 command again.
4. Verify that all of the Pods in the prometheus namespace are in the READY state.
   kubectl get pods -n prometheus
   An example output is as follows.
   NAME                                             READY   STATUS    RESTARTS   AGE
   prometheus-alertmanager-59b4c8c744-r7bgp         1/2     Running   0          48s
   prometheus-kube-state-metrics-7cfd87cf99-jkz2f   1/1     Running   0          48s
   prometheus-node-exporter-jcjqz                   1/1     Running   0          48s
   prometheus-node-exporter-jxv2h                   1/1     Running   0          48s
   prometheus-node-exporter-vbdks                   1/1     Running   0          48s
   prometheus-pushgateway-76c444b68c-82tnw          1/1     Running   0          48s
   prometheus-server-775957f748-mmht9               1/2     Running   0          48s
5. Use kubectl to port forward the Prometheus console to your local machine.
   kubectl --namespace=prometheus port-forward deploy/prometheus-server 9090
6. Point a web browser to http://localhost:9090 to view the Prometheus console.
7. Choose a metric from the - insert metric at cursor menu, then choose Execute. Choose the Graph tab to show the metric over time. The following image shows container_memory_usage_bytes over time.
8. From the top navigation bar, choose Status, then Targets.
   All of the Kubernetes endpoints that are connected to Prometheus using service discovery are displayed.
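When verifying the Pods, note that the READY column in the example output reads 1/2 for two Pods shortly after deployment, which means not all of their containers are ready yet. The following sketch lists such Pods from the kubectl get pods output. The function name not_ready_pods is hypothetical, and the sketch assumes the default column layout shown in the example output, with NAME first and READY second.

```shell
# not_ready_pods: read `kubectl get pods` output on stdin and print the name
# of each Pod whose READY column (ready/total containers) is incomplete.
# Assumption: default column layout, with NAME first and READY second.
# (Hypothetical helper name, used here for illustration only.)
not_ready_pods() {
  awk 'NR > 1 {                 # NR > 1 skips the header row
    split($2, ready, "/")
    if (ready[1] != ready[2]) print $1
  }'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get pods -n prometheus | not_ready_pods
```

An empty result means every listed Pod reports all of its containers as ready. For the example output, the function prints the prometheus-alertmanager and prometheus-server Pods.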
Store your Prometheus metrics in Amazon Managed Service for Prometheus
Amazon Managed Service for Prometheus is a Prometheus-compatible monitoring and alerting service that makes it easy to monitor containerized applications and infrastructure at scale. It is a fully managed service that automatically scales the ingestion, storage, querying, and alerting of your metrics. It also integrates with Amazon security services to enable fast and secure access to your data. You can use the open-source PromQL query language to query your metrics and alert on them.
For more information, see Getting started with Amazon Managed Service for Prometheus.