How to use Prometheus and Grafana to Monitor Kubernetes – Part 1

This is a detailed guide on how to monitor Kubernetes using Prometheus and Grafana. The first step is to understand the technologies involved. Let’s start with Kubernetes.
Kubernetes
If you are using Kubernetes or are looking to use it, you want to manage your containerized workloads and services with a portable, extensible, open-source platform that facilitates both automation and declarative configuration.
Kubernetes features include self-healing, storage orchestration, secret and configuration management, and service topology. Learn how to get started with Kubernetes by reading our recent blog post.
Prometheus
Prometheus is ideal for recording purely numeric time series. Whether you want to monitor a highly dynamic, service-oriented architecture or need machine-centric monitoring, it is a good fit. Its key features include:
- multiple modes of graphing
- a multi-dimensional data model
- no reliance on distributed storage
Comparing Prometheus with other Kubernetes monitoring tools
Some monitoring tools encode dimensions in dot-separated metric names. This approach becomes hard to work with when it is exposed to highly dimensional data, that is, data with many labels per metric, and it makes query-based aggregation even more difficult. Prometheus instead attaches dimensions to a metric as key-value labels.
For example, if you have 15 servers and you want to group them by error code, you end up with a huge number of independent metrics in the dot-separated format.
Nagios, Sensu, and some other tools are more suitable for network, memory, and CPU monitoring. If you want internal details of how your microservices behave, or you need to determine the causes of an incident and not just its symptoms, you should consider adopting white-box monitoring. Prometheus is the more appropriate tool for that.
Prometheus and InfluxDB are both time-series databases, but InfluxDB is better for event logging because of its nanosecond time resolution. Prometheus, on the other hand, has a very powerful query language and is more suitable for collecting and inspecting metrics. You can read more about this here.
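The difference can be sketched in a few lines of Python. This is only an illustration: the metric name `http_errors_total`, the server names, and the three error codes are assumptions made up for the example, and the dictionaries stand in for a real time-series store.

```python
from collections import defaultdict

# Hypothetical sample data: 15 servers and 3 error codes (assumed values).
SERVERS = [f"server{i:02d}" for i in range(1, 16)]
ERROR_CODES = ["400", "404", "500"]

# Dot-separated style: every (server, code) pair becomes its own metric name.
dotted = {f"http.errors.{s}.{c}": 1 for s in SERVERS for c in ERROR_CODES}

# Label style: one metric name, with the dimensions kept as key-value labels.
labeled = [
    ({"__name__": "http_errors_total", "server": s, "code": c}, 1)
    for s in SERVERS
    for c in ERROR_CODES
]


def sum_dotted_by_code(metrics):
    """Group dot-separated metrics by error code.

    The caller has to know that the code is the last name segment.
    """
    totals = defaultdict(int)
    for name, value in metrics.items():
        totals[name.split(".")[-1]] += value
    return dict(totals)


def sum_by(samples, label):
    """Group labeled samples by any label, without parsing metric names."""
    totals = defaultdict(int)
    for labels, value in samples:
        totals[labels[label]] += value
    return dict(totals)
```

Here `dotted` holds 45 independent metric names, while `labeled` holds 45 samples of a single metric. Grouping the labeled data by another dimension is just `sum_by(labeled, "server")`; in PromQL the equivalent grouping would be written as `sum by (code) (http_errors_total)`.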
Grafana
Introduction to monitoring Kubernetes
Two changes in technology created the need for a modern monitoring framework:
- Monitoring previously consisted of watching hosts, services, and networks; that has changed entirely. Developers are now involved in the CI/CD pipeline and debug operations on their own, so integrating application-level and business-level metrics has become a requirement for them. Engineers need more insight into every layer of the infrastructure and application stack, which means monitoring has morphed into observability.
- Traditional monitoring tools were not designed to handle a huge number of services, network addresses, exposed metrics, and volatile software entities. They were primarily built to scale to only dozens or hundreds of servers. With the explosion of VM and container usage, the ‘server’ count of a large team can easily jump to the thousands, even tens of thousands. To support high availability, monitoring, debugging, and logging on container-based infrastructure, new monitoring paradigms and applications were needed.
Cloud-native architectures bring a lot of technical complexity, a steep learning curve, and a paradigm shift. These complexities create problems in security, design, deployment, and everything else that concerns the observability and monitoring of Kubernetes.
Here are some of the reasons why Prometheus is well suited to the job:
Containers add new challenges to monitoring because they are operationally treated as black boxes (a ‘black box’ is an application or environment whose internal structure is unknown or opaque to whoever is monitoring it). Installing the kube-state-metrics utility helps address this. It exposes internal metrics on its ‘/metrics’ endpoint, for example the number of running replicas and the state of deployments.
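To make the ‘/metrics’ output more concrete, here is a minimal sketch that parses a few lines in the Prometheus text exposition format. The sample values, namespace, and deployment names are made up for illustration, and the parser is deliberately simplified: it ignores timestamps and escaped characters inside label values, so it is not a replacement for a real client library.

```python
import re

# Illustrative sample only; the values and deployment names are assumptions.
SAMPLE = """\
# HELP kube_deployment_status_replicas_available Available replicas per deployment.
# TYPE kube_deployment_status_replicas_available gauge
kube_deployment_status_replicas_available{namespace="default",deployment="web"} 3
kube_deployment_status_replicas_available{namespace="default",deployment="api"} 2
"""

# metric_name{label="value",...} value
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip anything this simplified parser cannot read
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples
```

Calling `parse_metrics(SAMPLE)` yields one tuple per deployment, each carrying the metric name, its labels as a dictionary, and the numeric value.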
Prometheus is easy to install and use. In most cases, all a service needs to do is expose a metrics port. If the service already presents an HTTP interface, the developer often only needs to add a path to it. If the service does not offer Prometheus-compatible metrics, you need to deploy a Prometheus exporter alongside it, often as a sidecar container in the same pod.
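As a rough sketch of what exposing a metrics path involves, the following uses only the Python standard library to serve a single counter in the Prometheus text format. The metric name `app_requests_total`, the port, and the helper names are assumptions for the example; a real service would more likely use an official Prometheus client library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counter that the application would increment.
REQUEST_COUNT = {"value": 0}


def render_metrics():
    """Render the current counter in the Prometheus text exposition format."""
    return (
        "# HELP app_requests_total Total requests handled by this process.\n"
        "# TYPE app_requests_total counter\n"
        f"app_requests_total {REQUEST_COUNT['value']}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the /metrics path is served; everything else is a 404.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_server(port=8000):
    """Create (but do not start) an HTTP server exposing /metrics."""
    return HTTPServer(("", port), MetricsHandler)

# To run it: make_server(8000).serve_forever()
```

Once the server is running, Prometheus can be pointed at `http://<host>:8000/metrics` as a scrape target.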
Classical, static monitoring systems cannot handle ephemeral entities that may start or stop reporting at any time. The auto-discovery mechanism of Prometheus deals with this easily.
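The idea behind auto discovery can be illustrated with a small simulation. This is a conceptual sketch only, not how Prometheus implements it: the pod dictionaries, the port, and the function names are hypothetical. In a real setup, discovery is configured through `kubernetes_sd_configs` in the Prometheus scrape configuration.

```python
def discover_targets(pods, port=8080, path="/metrics"):
    """Build scrape URLs for pods that are running and have an IP address."""
    return {
        f"http://{pod['ip']}:{port}{path}"
        for pod in pods
        if pod.get("phase") == "Running" and pod.get("ip")
    }


def reconcile(previous, current):
    """Report which scrape targets appeared and which disappeared."""
    return {
        "added": sorted(current - previous),
        "removed": sorted(previous - current),
    }


# Two snapshots of the same (hypothetical) cluster, taken at different times.
before = discover_targets([
    {"name": "web-1", "ip": "10.0.0.1", "phase": "Running"},
    {"name": "web-2", "ip": "10.0.0.2", "phase": "Running"},
])
after = discover_targets([
    {"name": "web-2", "ip": "10.0.0.2", "phase": "Running"},
    {"name": "web-3", "ip": "10.0.0.3", "phase": "Running"},
    {"name": "web-4", "ip": None, "phase": "Pending"},
])
changes = reconcile(before, after)
```

Between the two snapshots, `web-1` is dropped from the target list, `web-3` is picked up, and the pending `web-4` is ignored until it is running. No static host list has to be edited by hand.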
With technology like Kubernetes, the physical hosts are no longer the main point of reference. Instead, you organize monitoring around different groupings, for example namespaces, deployment versions, and microservice performance, which makes the system easier to reason about and to visualize.
Conclusion
