Postgres on Kubernetes: Using AWS EBS as a Volume for Data Persistence – Part I
Kubernetes is the most popular container orchestration platform right now, and its adoption is increasing at a rapid pace. For someone with a sound knowledge of Kubernetes, the management, monitoring, and deployment of containerized applications becomes far easier. With the wide adoption of microservices architecture, the need for a good orchestration platform became pressing, and Kubernetes has filled this gap. It contributes to the stability and performance of mission-critical applications, and many large companies have adopted it.
Kubernetes is versatile: you can use it for any kind of containerized application. Its use is not limited to microservices; monolithic applications can benefit from it as well, as long as they are deployed as containers. Kubernetes addresses many of the challenges of cloud deployment and auto-scaling, and the major cloud providers offer managed Kubernetes services to ease the transition. It can handle the deployment and scaling of both stateless and stateful applications. To summarize the scope of this Kubernetes tutorial, here are the four agenda items:
1. Difference between Stateless and Stateful Applications
2. Deploying a Stateful Postgres on a Kubernetes Cluster
3. Using an AWS-provided EBS Volume as Persistent Storage for Postgres
4. Running a sample Go application that connects to Postgres within the Cluster
We will be using standard Kubernetes terminology. Here are the prerequisites you should have in place before attempting this:
- A desire to learn Kubernetes and the eagerness to overcome the non-descriptive errors thrown by YAML files.
- A working Kubernetes cluster (an AWS-managed Kubernetes cluster is recommended)
- A sound understanding of the underlying Kubernetes concepts and terminology (Services, Pods, Containers, ConfigMaps, Persistent Volumes, Persistent Volume Claims, Storage Classes, etc.)
Let’s get started.
Stateless vs Stateful Applications
Before we plunge into this Kubernetes tutorial and the definitions of technical terms, consider two simple examples that illustrate stateless and stateful applications. A calculator is a stateless application because it does not store previously performed calculations, while a terminal or command-line tool is a stateful application because it keeps track of the history of executed commands. To put it in simpler terms, an application that does not need to store the client's session data for use in a later step is called a stateless application, and an application that stores the client's session data to use it in a later step or at the next login is called a stateful application.
Both kinds of application have pros and cons, and it is important for a software architect to decide which approach suits their application design and implementation. Typical examples of stateless applications are IoT and microservices applications, while stateful applications are those that rely on a database on the backend, such as banking systems.
Kubernetes is well suited for both types of applications. Monitoring, reliable deployments, and maximum uptime are always important, no matter what type of application architecture we are using. Kubernetes helps both stateless and stateful applications keep running with high uptime and minimal downtime during and after software updates and releases.
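The calculator and terminal examples can be sketched in a few lines of Go. This is only an illustration of the distinction, not code from the sample application; the names Add and Shell are made up for this sketch.

```go
package main

import "fmt"

// Add is stateless: its result depends only on its arguments,
// and nothing is remembered between calls (like the calculator).
func Add(a, b int) int {
	return a + b
}

// Shell is stateful: it keeps a history of every command it has run
// (like the terminal), so its state grows from call to call.
type Shell struct {
	history []string
}

// Run records the command and returns how many commands have been run so far.
func (s *Shell) Run(cmd string) int {
	s.history = append(s.history, cmd)
	return len(s.history)
}

// History returns a copy of the recorded commands.
func (s *Shell) History() []string {
	return append([]string(nil), s.history...)
}

func main() {
	// The same inputs always produce the same output.
	fmt.Println(Add(2, 3))
	fmt.Println(Add(2, 3))

	// The shell accumulates state across calls.
	sh := &Shell{}
	sh.Run("ls")
	sh.Run("pwd")
	fmt.Println(sh.History())
}
```

If the process holding a Shell value restarts, its history is gone unless it was written somewhere durable. That is the same problem a database container faces, and the reason persistent volumes matter for stateful workloads.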
Deploying Postgres on a Kubernetes Cluster (Data Persistence / Statefulness)
Let’s see how we can deploy a Postgres container on Kubernetes and ensure that its data persists across new deployments and reboots, or even when the pod is recreated. Here are the things we need in order to achieve this goal.
1. Postgres Docker Image
2. Persistent storage volume
3. ConfigMap for storing configuration data
4. A deployment file to actually spin up a pod
We will be using a publicly available Postgres Docker image from Docker Hub. For the sake of demonstration, we will use the image with the latest tag, but feel free to use any version of your choice. If you have a locally built Docker image for Postgres, you can use that instead. If you are using the Docker Hub Postgres image, you can pull it onto your Kubernetes cluster in advance with the docker pull command, but this is optional, as our deployment YAML file will take care of pulling the image automatically.
Let’s go ahead and create a ConfigMap to store the values that the Postgres image needs at startup. It needs the username, password, and database name, and we are going to supply these three values via this new ConfigMap. Here is the YAML file that creates the ConfigMap.
apiVersion: v1
kind: ConfigMap
metadata:
  name: pg-config
  labels:
    app: postgres
data:
  POSTGRES_DB: postgresdb
  POSTGRES_USER: root
  POSTGRES_PASSWORD: root123
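Store the above configuration in a file called “pg-configmap.yaml”, or whatever name you would like, and use kubectl to create this ConfigMap. If you are running this on a production setup, don’t forget to use a stronger login and password, and consider keeping the password in a Kubernetes Secret rather than a ConfigMap, since ConfigMap values are stored in plain text.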
kubectl apply -f pg-configmap.yaml
We can view the values stored in this newly created ConfigMap with:
kubectl describe configmap pg-config
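To see why these three keys are useful beyond starting the database, here is a minimal Go sketch of how a client could turn the same values into a Postgres connection URL. It assumes the application receives the keys of pg-config as environment variables. The function name PostgresURL and the address postgres:5432 are hypothetical, since no Service has been defined at this point.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
)

// PostgresURL builds a connection URL from the three keys stored in pg-config.
// host is an assumption for this sketch: no Service name has been defined yet.
func PostgresURL(cfg map[string]string, host string) string {
	u := url.URL{
		Scheme: "postgres",
		// UserPassword escapes characters that are not safe in a URL.
		User: url.UserPassword(cfg["POSTGRES_USER"], cfg["POSTGRES_PASSWORD"]),
		Host: host,
		Path: "/" + cfg["POSTGRES_DB"],
	}
	return u.String()
}

func main() {
	// Collect the values from the environment, as a container would see them
	// if the ConfigMap keys were injected as environment variables.
	cfg := map[string]string{}
	for _, key := range []string{"POSTGRES_DB", "POSTGRES_USER", "POSTGRES_PASSWORD"} {
		cfg[key] = os.Getenv(key)
	}

	// "postgres:5432" is a hypothetical in-cluster address used only for illustration.
	fmt.Println(PostgresURL(cfg, "postgres:5432"))
}
```

Keeping the URL construction in a function that takes a plain map makes it easy to check without a running cluster, and the same function works whether the values come from a ConfigMap or from a Secret.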
Our ConfigMap is all set now. The next step is to create a Persistent Volume (PV) and a Persistent Volume Claim (PVC). Since a container's filesystem is ephemeral by default, any data stored in it is lost when the pod is redeployed, unless we start the container with persistent storage attached.
First, we will create a Persistent Volume (PV), and then we will add a Persistent Volume Claim (PVC), which in simpler terms requests and uses the persistent volume. Note that a PV is bound to a single PVC at a time, although the same claim can be mounted by more than one pod.
Here is a YAML file that will create a persistent volume of size 5 GiB. This is what our pg-pv.yaml file looks like.
kind: PersistentVolume
apiVersion: v1
metadata:
  name: pg-pv-volume
  labels:
    type: local
    app: postgres
spec:
  storageClassName: manual
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  hostPath:
    path: "/home/data/"
Once applied, this creates a persistent volume of 5 GiB named pg-pv-volume. The volume is backed by the local path /home/data on the node, and the ReadWriteMany access mode allows it to be mounted with both read and write access.
kubectl apply -f pg-pv.yaml
You can view the details of this newly created Persistent Volume (PV) using the kubectl command.
kubectl describe pv
So far, we have a Postgres image, a ConfigMap, and a Persistent Volume. Next, we need to create a Persistent Volume Claim (PVC) to use this volume. Here is what our pg-pvc.yaml file looks like.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pg-claim
  labels:
    app: postgres
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
Running kubectl apply with this YAML file creates the PVC.
kubectl apply -f pg-pvc.yaml
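The claim above binds to our volume because the two manifests agree on storage class, access mode, and size. The following Go sketch models that matching rule in a deliberately simplified form. It is not the logic Kubernetes itself runs (real binding also considers things like label selectors and volume mode), and the Volume, Claim, and CanBind names are made up for illustration.

```go
package main

import "fmt"

// Volume mirrors only the fields of pg-pv.yaml that matter for this sketch.
type Volume struct {
	StorageClass string
	CapacityGi   int
	AccessModes  []string
}

// Claim mirrors only the fields of pg-pvc.yaml that matter for this sketch.
type Claim struct {
	StorageClass string
	RequestGi    int
	AccessModes  []string
}

// CanBind reports whether the volume could satisfy the claim under a
// simplified model: same storage class, enough capacity, and every
// requested access mode offered by the volume.
func CanBind(v Volume, c Claim) bool {
	if v.StorageClass != c.StorageClass {
		return false
	}
	if v.CapacityGi < c.RequestGi {
		return false
	}
	offered := make(map[string]bool)
	for _, mode := range v.AccessModes {
		offered[mode] = true
	}
	for _, mode := range c.AccessModes {
		if !offered[mode] {
			return false
		}
	}
	return true
}

func main() {
	pv := Volume{StorageClass: "manual", CapacityGi: 5, AccessModes: []string{"ReadWriteMany"}}
	pvc := Claim{StorageClass: "manual", RequestGi: 5, AccessModes: []string{"ReadWriteMany"}}
	fmt.Println(CanBind(pv, pvc)) // true: class, size, and access mode all match

	tooBig := pvc
	tooBig.RequestGi = 10
	fmt.Println(CanBind(pv, tooBig)) // false: the claim asks for more than 5Gi
}
```

If a claim stays in the Pending state after you apply it, a mismatch in one of these three fields is a good first thing to check.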
All the prerequisites are now in place, so let’s go ahead and create a Postgres deployment.
The following YAML file creates a Postgres deployment using the Docker Hub Postgres image. The deployment takes its login credentials from the ConfigMap we created above and uses the disk space configured in our PV and PVC for data persistence.
Here is our pg-deployment.yaml file.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
  labels:
    app: postgres
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres
          imagePullPolicy: "IfNotPresent"
          ports:
            - containerPort: 5432
          envFrom:
            - configMapRef:
                name: pg-config
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: postgredb
      volumes:
        - name: postgredb
          persistentVolumeClaim:
            claimName: pg-claim
The following command creates the Postgres deployment from this YAML file.
kubectl apply -f pg-deployment.yaml
Verify that the deployment is successful and healthy. The following command should show that our Postgres pod is up and running.
kubectl get pods
You can also inspect the logs of this running pod, for example with kubectl logs followed by the pod name, to confirm that the Postgres service is up.
So far in this Kubernetes tutorial, we have a Postgres container running inside the Kubernetes cluster, using a persistent volume for data storage. Our database is up and ready to accept connections from our application services.
In the second part of this Kubernetes tutorial, before we dive into how to connect a sample application to this Postgres instance, we are going to see how to use volume mounting for data persistence on a popular cloud service like AWS.