How to Write a Kubernetes Manifest for a Database Server and Mount a PVC to It
In the era of cloud-native applications, Kubernetes has emerged as the de facto platform for orchestrating containerized workloads. While stateless applications are relatively straightforward to manage, stateful applications like databases present unique challenges. Databases require persistent storage, stable network identities, and high availability—features that demand careful configuration in Kubernetes.
This guide provides an in-depth walkthrough of deploying a production-ready database (using PostgreSQL as an example) on Kubernetes. We’ll cover everything from foundational concepts like Persistent Volume Claims (PVCs) to advanced strategies for high availability, security, and disaster recovery. By the end, you’ll understand how to:
- Use StatefulSets for stable, scalable database deployments.
- Securely manage credentials with Kubernetes Secrets.
- Configure Storage Classes for cloud-optimized storage.
- Implement high availability and automated backups.
- Monitor database health with Prometheus and Grafana.
Table of Contents
- Understanding Kubernetes Storage: PVs, PVCs, and Storage Classes
- Why StatefulSets Trump Deployments for Databases
- Step-by-Step: Writing a PostgreSQL StatefulSet Manifest
- Securing Your Database: Secrets and Network Policies
- Optimizing Storage with Cloud-Specific Storage Classes
- High Availability: Anti-Affinity and Readiness Probes
- Disaster Recovery: Automated Backups to S3
- Monitoring with Prometheus and PostgreSQL Exporter
- Common Pitfalls and How to Avoid Them
1. Understanding Kubernetes Storage: PVs, PVCs, and Storage Classes
Persistent Volumes (PVs)
A Persistent Volume (PV) is a cluster-wide storage resource provisioned by an administrator or dynamically via a Storage Class. PVs abstract the underlying storage infrastructure (e.g., AWS EBS, GCP Persistent Disk) and decouple storage from pods.
Persistent Volume Claims (PVCs)
A Persistent Volume Claim (PVC) is a user’s request for storage. PVCs bind to PVs, allowing pods to consume storage without needing to know the specifics of the storage backend.
```yaml
# Example PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gp3-encrypted  # Uses a custom Storage Class
  resources:
    requests:
      storage: 50Gi
```
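To answer the title question directly: a pod mounts this claim by referencing it by name under `spec.volumes`, then mounting that volume into the container. A minimal sketch (the pod and volume names are illustrative):

```yaml
# Minimal pod that mounts the PVC above (names are illustrative)
apiVersion: v1
kind: Pod
metadata:
  name: postgres-test
spec:
  containers:
    - name: postgres
      image: postgres:15.3
      volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data  # PostgreSQL's default data directory
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: postgres-pvc  # must match the PVC's metadata.name
```

For a real database you would use a StatefulSet instead of a bare pod, as the rest of this guide shows, but the volume/volumeMount pairing is identical.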
Storage Classes
A Storage Class defines the “type” of storage (e.g., SSD, HDD) and provisioning parameters. Cloud providers offer pre-configured Storage Classes, but you can create custom ones for encryption, performance, or cost optimization.
```yaml
# AWS gp3 Storage Class with Encryption
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-encrypted
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  encrypted: "true"  # Enables EBS encryption
volumeBindingMode: WaitForFirstConsumer  # Delays provisioning until pod scheduling
```
2. Why StatefulSets Trump Deployments for Databases
Deployments: Designed for Stateless Workloads
- Pros: Simple scaling, rolling updates.
- Cons: Pods are ephemeral, no stable network identity, shared storage.
StatefulSets: Built for Stateful Workloads
- Stable Network Identity: Each pod gets a unique, sticky hostname (e.g., `postgres-0`, `postgres-1`).
- Ordered Scaling: Pods are created and terminated sequentially.
- Persistent Storage: Each pod gets its own PVC via `volumeClaimTemplates`.
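The per-pod storage behavior can be sketched as follows. For a StatefulSet named `postgres`, Kubernetes derives one PVC per pod from the template, named `<template>-<statefulset>-<ordinal>`:

```yaml
# Fragment of a StatefulSet spec (sketch)
volumeClaimTemplates:
  - metadata:
      name: postgres-data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 50Gi
# Resulting PVCs: postgres-data-postgres-0, postgres-data-postgres-1, ...
# Each pod re-binds to its own PVC across restarts and rescheduling.
```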
3. Step-by-Step: Writing a PostgreSQL StatefulSet Manifest
Step 1: Create a Headless Service
A headless Service (no ClusterIP) enables direct DNS resolution to individual pods.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: postgres-headless
spec:
  clusterIP: None
  selector:
    app: postgres
  ports:
    - port: 5432
```
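With this Service in place, each StatefulSet pod becomes resolvable as `<pod>.postgres-headless.<namespace>.svc.cluster.local`. A client pod could target a specific instance like this (the `default` namespace is assumed here):

```yaml
# Client container fragment (sketch; assumes the default namespace)
env:
  - name: DB_HOST
    value: postgres-0.postgres-headless.default.svc.cluster.local  # first pod, typically the primary
  - name: DB_PORT
    value: "5432"
```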
Step 2: Define the StatefulSet
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres-headless
  replicas: 3  # 3-node HA setup
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:15.3  # Pinned version
          envFrom:
            - secretRef:
                name: postgres-secrets  # Securely inject credentials
          env:
            - name: PGDATA
              # Use a subdirectory so initdb doesn't fail on the
              # volume's pre-existing lost+found directory
              value: /var/lib/postgresql/data/pgdata
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: postgres-data
              mountPath: /var/lib/postgresql/data
          livenessProbe:
            exec:
              command:
                - sh
                - -c
                - "pg_isready -U admin -d mydatabase"
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            exec:
              command:
                - sh
                - -c
                - "pg_isready -U admin -d mydatabase"
          resources:
            requests:
              memory: 2Gi
              cpu: 500m
            limits:
              memory: 4Gi
              cpu: "2"
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: postgres
              topologyKey: kubernetes.io/hostname  # Spread pods across nodes
  volumeClaimTemplates:
    - metadata:
        name: postgres-data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: gp3-encrypted
        resources:
          requests:
            storage: 50Gi
```
4. Securing Your Database: Secrets and Network Policies
Step 1: Create a Secret for Credentials
```bash
kubectl create secret generic postgres-secrets \
  --from-literal=POSTGRES_USER=admin \
  --from-literal=POSTGRES_PASSWORD='MySecurePassword!' \
  --from-literal=POSTGRES_DB=mydatabase
```

Note the single quotes around the password: in an interactive shell, an unquoted `!` can trigger history expansion.
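If you prefer a declarative manifest (e.g., for GitOps workflows), the same Secret can be expressed with `stringData`; just avoid committing real passwords to version control:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: postgres-secrets
type: Opaque
stringData:  # accepts plain text; Kubernetes stores it base64-encoded
  POSTGRES_USER: admin
  POSTGRES_PASSWORD: MySecurePassword!  # use a secret manager in production
  POSTGRES_DB: mydatabase
```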
Step 2: Restrict Access with Network Policies
Only allow traffic from your application pods.
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: postgres-allow-app
spec:
  podSelector:
    matchLabels:
      app: postgres
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: my-app  # Replace with your app's label
      ports:
        - protocol: TCP
          port: 5432
```
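A common companion is a default-deny policy, so that only explicitly allowed traffic reaches pods in the namespace (sketch):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}  # selects every pod in the namespace
  policyTypes:
    - Ingress     # no ingress rules listed, so all ingress is denied
```

With both policies applied, only pods labeled `app: my-app` can reach PostgreSQL; everything else is blocked by default.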
5. Optimizing Storage with Cloud-Specific Storage Classes
AWS Example: gp3 with Encryption
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-encrypted
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  encrypted: "true"
  iops: "10000"      # Customize IOPS for high-performance workloads
  throughput: "500"  # MB/s
volumeBindingMode: WaitForFirstConsumer
```
GCP Example: Regional Persistent Disk
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: regional-pd
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-balanced
  replication-type: regional-pd  # Replicates across zones
volumeBindingMode: WaitForFirstConsumer
```
6. High Availability: Anti-Affinity and Readiness Probes
Pod Anti-Affinity
Prevents scheduling multiple database pods on the same node:
```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: postgres
        topologyKey: kubernetes.io/hostname
```
Liveness and Readiness Probes
Ensure Kubernetes restarts unhealthy pods and routes traffic only to ready instances.
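Beyond the defaults used earlier, probe timing is worth tuning explicitly; a sketch with thresholds spelled out (the values are starting points, not recommendations):

```yaml
readinessProbe:
  exec:
    command: ["sh", "-c", "pg_isready -U admin -d mydatabase"]
  initialDelaySeconds: 10
  periodSeconds: 5
  timeoutSeconds: 3    # fail the check if pg_isready hangs
  failureThreshold: 3  # mark the pod NotReady after 3 consecutive failures
```

Keep liveness thresholds more forgiving than readiness: a failed readiness probe merely removes the pod from Service endpoints, while a failed liveness probe restarts the container.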
7. Disaster Recovery: Automated Backups to S3
CronJob for Daily Backups
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgres-backup
spec:
  schedule: "0 2 * * *"  # 2 AM daily
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: backup
              # The image must contain both pg_dump and the AWS CLI;
              # amazon/aws-cli alone does not ship pg_dump, so a custom
              # image (hypothetical name below) is typically required.
              image: my-backup-image:1.0
              command:
                - /bin/sh
                - -c
                - |
                  PGPASSWORD=$POSTGRES_PASSWORD pg_dump -h postgres-headless -U admin mydatabase \
                    | gzip \
                    | aws s3 cp - s3://my-bucket/backups/$(date +%Y-%m-%d).sql.gz
              envFrom:
                - secretRef:
                    name: postgres-secrets
              env:
                - name: AWS_ACCESS_KEY_ID
                  valueFrom:
                    secretKeyRef:
                      name: aws-credentials
                      key: access_key
                - name: AWS_SECRET_ACCESS_KEY
                  valueFrom:
                    secretKeyRef:
                      name: aws-credentials
                      key: secret_key
          restartPolicy: OnFailure
```
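Backups are only useful if you can restore them, so test the reverse path as well. A one-off restore could be sketched as a Job (the bucket name, date, and image are placeholders):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: postgres-restore
spec:
  template:
    spec:
      containers:
        - name: restore
          image: my-backup-image:1.0  # hypothetical image with psql + aws-cli
          command:
            - /bin/sh
            - -c
            - |
              aws s3 cp s3://my-bucket/backups/2024-01-01.sql.gz - \
                | gunzip \
                | PGPASSWORD=$POSTGRES_PASSWORD psql -h postgres-headless -U admin mydatabase
          envFrom:
            - secretRef:
                name: postgres-secrets
      restartPolicy: Never
```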
8. Monitoring with Prometheus and PostgreSQL Exporter
Deploy PostgreSQL Exporter
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres-exporter
  template:
    metadata:
      labels:
        app: postgres-exporter
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9187"
    spec:
      containers:
        - name: exporter
          image: prometheuscommunity/postgres-exporter  # pin a specific tag in production
          env:
            # Defined first so that $(POSTGRES_PASSWORD) below can expand
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-secrets
                  key: POSTGRES_PASSWORD
            - name: DATA_SOURCE_NAME
              value: "postgresql://admin:$(POSTGRES_PASSWORD)@postgres-headless:5432/mydatabase?sslmode=disable"
          ports:
            - containerPort: 9187
```
Setting Up Prometheus
To monitor your PostgreSQL database, you need to set up Prometheus to scrape metrics from the PostgreSQL Exporter. This involves configuring Prometheus to include the exporter in its scrape configuration.
```yaml
# Example Prometheus configuration
scrape_configs:
  - job_name: 'postgres'
    static_configs:
      - targets: ['postgres-exporter:9187']
```
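Once metrics are flowing, a minimal alerting rule can watch exporter health; `pg_up` is a standard postgres_exporter metric that reports whether the exporter can reach the database (rule file shown as a sketch):

```yaml
groups:
  - name: postgres
    rules:
      - alert: PostgresDown
        expr: pg_up == 0  # exporter cannot reach the database
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "PostgreSQL instance is unreachable"
```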
9. Common Pitfalls and How to Avoid Them
1. Not Using Resource Limits
Failing to set resource requests and limits can lead to resource starvation. Always define these in your StatefulSet.
```yaml
resources:
  requests:
    memory: "2Gi"
    cpu: "500m"
  limits:
    memory: "4Gi"
    cpu: "2"
```
2. Ignoring Liveness and Readiness Probes
Without these probes, Kubernetes cannot effectively manage pod health. Always include them in your manifests.
3. Hardcoding Secrets
Avoid hardcoding sensitive information in your manifests. Use Kubernetes Secrets to manage credentials securely.
4. Not Implementing Backups
Regular backups are crucial for disaster recovery. Implement automated backups using CronJobs as shown earlier.
5. Lack of Monitoring
Without monitoring, you cannot effectively manage your database’s health. Set up Prometheus and Grafana to visualize metrics.
Deploying a production-grade database on Kubernetes requires careful planning and execution. By leveraging StatefulSets, Kubernetes Secrets, and cloud-specific Storage Classes, you can create a resilient and secure database environment. Implementing high availability strategies, automated backups, and monitoring will further enhance your database’s reliability.
Next Steps:
- Experiment with the provided YAML manifests in a test environment.
- Explore additional features like horizontal pod autoscaling for your database.
- Consider integrating with tools like Grafana for advanced monitoring and alerting.
By following the best practices outlined in this guide, you can ensure that your database deployment on Kubernetes is robust, secure, and ready for production workloads.