Thursday, 12 December 2024

Master Node Management in Kubernetes: Cordon and Uncordon Explained

In Kubernetes, the master node (called the control plane node in current releases) runs the components that manage cluster operations. Workloads like pods generally run on worker nodes, but there are scenarios where you need to manage scheduling on the master node itself. Two essential commands for this are cordon and uncordon, which control whether new pods can be scheduled on a node.

This blog post will explain what cordoning and uncordoning mean and how you can use these commands to manage your Kubernetes master node efficiently.

What Is Cordoning and Uncordoning?

  • Cordon: This action marks a node as unschedulable, preventing any new pods from being scheduled on it. However, existing pods on the node will continue to run.

  • Uncordon: This reverses the cordon operation, making the node schedulable again. New pods can then be scheduled on the node.

These commands are especially useful during maintenance tasks or when troubleshooting node issues.
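
Under the hood, cordoning sets the node's spec.unschedulable field to true, and uncordoning clears it. As a minimal sketch, the helper below reads that field to tell whether a node is cordoned. The function name is_cordoned is illustrative and not part of kubectl, and the sketch assumes kubectl is already configured for your cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit 0) when the node is cordoned.
# It reads .spec.unschedulable, which is "true" on a cordoned node and
# is normally absent (empty output) on a schedulable one.
is_cordoned() {
  local node="$1"
  local flag
  # Return 2 if kubectl itself fails (e.g. the node does not exist).
  flag="$(kubectl get node "$node" -o jsonpath='{.spec.unschedulable}')" || return 2
  [ "$flag" = "true" ]
}
```

For example, `if is_cordoned master-node; then echo "cordoned"; fi` prints a message only when scheduling is disabled on master-node.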

Scenarios for Cordoning a Master Node

  1. Cluster Maintenance:
    If you need to perform upgrades or maintenance on the master node, cordoning ensures no new workloads are scheduled on it.

  2. Debugging Node Issues:
    Marking a master node unschedulable keeps new pods off it while you debug. Pods already running on the node are not affected.

  3. Draining Workloads:
    Cordoning is often the first step before draining a node during cluster updates or scaling down.

Cordon and Uncordon Commands

Cordon the Master Node

To mark the master node as unschedulable, use the following command:

kubectl cordon <node-name>

Example:

kubectl cordon master-node

Output:

node/master-node cordoned

This prevents the scheduler from placing new pods on the master-node.

Uncordon the Master Node

To make the node schedulable again, use:

kubectl uncordon <node-name>

Example:

kubectl uncordon master-node

Output:

node/master-node uncordoned

This command allows the scheduler to assign new workloads to the node.

Draining a Node (Optional Step)

If you need to move existing workloads off the master node (e.g., for upgrades), you can drain it. Draining cordons the node first and then evicts the pods running on it:

kubectl drain <node-name> --ignore-daemonsets --force --delete-emptydir-data

Options Explained:

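  • --ignore-daemonsets: Lets the drain proceed when DaemonSet-managed pods are present. Those pods are left in place rather than evicted.
  • --force: Lets the drain proceed when the node has pods that are not managed by a controller. Those pods are deleted and will not be recreated.
  • --delete-emptydir-data: Lets the drain proceed when pods use emptyDir volumes. The data in those volumes is lost when the pods are removed.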

Example:

kubectl drain master-node --ignore-daemonsets --force --delete-emptydir-data
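
Before draining, it can help to see what is actually running on the node and what the drain would do. The sketch below wraps two kubectl invocations for that purpose. The function names pods_on_node and preview_drain are illustrative. The preview uses kubectl's client-side dry run, which should report the planned actions without evicting anything.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list every pod bound to the given node, in all namespaces.
pods_on_node() {
  kubectl get pods --all-namespaces --field-selector "spec.nodeName=$1" -o wide
}

# Hypothetical helper: preview a drain without cordoning or evicting anything.
# --force is left out here so unmanaged pods are reported rather than silently accepted.
preview_drain() {
  kubectl drain "$1" --ignore-daemonsets --delete-emptydir-data --dry-run=client
}
```

Running `pods_on_node master-node` before and after a drain is a quick way to confirm which pods were moved and which (such as DaemonSet pods) stayed behind.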

Verifying Node Status

After cordoning or uncordoning, verify the node status with:

kubectl get nodes

The STATUS column shows Ready,SchedulingDisabled for a cordoned node and plain Ready once the node is uncordoned, assuming the node is otherwise healthy.

Example Output:

NAME           STATUS                     ROLES           AGE   VERSION
master-node    Ready,SchedulingDisabled   control-plane   12d   v1.28.0
worker-node1   Ready                      <none>          12d   v1.28.0
worker-node2   Ready                      <none>          12d   v1.28.0
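
In a larger cluster it is easy to forget a node that was left cordoned. As a rough sketch, the helper below filters the same kubectl get nodes output for rows whose STATUS contains SchedulingDisabled. The function name cordoned_nodes is illustrative, and the sketch assumes the default column layout shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the names of all nodes that are currently cordoned.
# With --no-headers, column 1 is NAME and column 2 is STATUS.
cordoned_nodes() {
  kubectl get nodes --no-headers | awk '$2 ~ /SchedulingDisabled/ { print $1 }'
}
```

Calling `cordoned_nodes` after a maintenance window gives a short list of nodes that still need an uncordon.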

Automating the Process

You can automate the cordon and uncordon process using a shell script for maintenance tasks.

Example Script

#!/bin/bash
# Stop on the first failed command or unset variable.
set -euo pipefail

NODE_NAME="master-node"

echo "Cordoning $NODE_NAME..."
kubectl cordon "$NODE_NAME"

echo "Performing maintenance tasks..."
# Add your maintenance commands here

echo "Uncordoning $NODE_NAME..."
kubectl uncordon "$NODE_NAME"

echo "Maintenance complete. $NODE_NAME is now schedulable."

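Run the script to cordon the node, perform the maintenance, and uncordon it in one pass. If a step fails partway through, check whether the node was left cordoned and uncordon it once it is safe to do so.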
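
One possible variation, sketched below, wraps the maintenance step so that the node is uncordoned even when that step fails. The function name with_cordon and the way it takes the maintenance command as arguments are assumptions made for illustration, not something kubectl provides.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: cordon a node, run a command, then always uncordon.
# Usage: with_cordon <node-name> <command> [args...]
with_cordon() {
  local node="$1"; shift
  kubectl cordon "$node" || return 1

  # Run the maintenance command and remember its exit status
  # instead of aborting, so the uncordon below still happens.
  local status=0
  "$@" || status=$?

  kubectl uncordon "$node" || return 1
  return "$status"
}
```

For example, `with_cordon master-node ./maintenance.sh` (where maintenance.sh stands in for your own script) returns the maintenance script's exit status after the node has been made schedulable again.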

Best Practices for Master Node Management

  1. Avoid Running Workloads on the Master Node: Unless explicitly required, workloads should be limited to worker nodes to ensure master node stability.

  2. Label Master Nodes Correctly: Use labels like node-role.kubernetes.io/control-plane (which replaces the older node-role.kubernetes.io/master label in current releases) to distinguish master nodes and apply scheduling rules accordingly.

  3. Plan Maintenance in Advance: Notify your team and schedule maintenance during off-peak hours to minimize disruption.

  4. Monitor Node Health: Continuously monitor node health metrics to detect potential issues early.
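
To check the first two practices on a live cluster, you can inspect node labels and taints. The helpers below are a sketch. The function names are illustrative, and the label key assumes a cluster that uses the node-role.kubernetes.io/control-plane convention.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list nodes that carry the control-plane role label.
control_plane_nodes() {
  kubectl get nodes -l node-role.kubernetes.io/control-plane -o name
}

# Hypothetical helper: show each node next to the keys of its taints,
# which is what actually keeps ordinary workloads off a node.
node_taints() {
  kubectl get nodes -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'
}
```

If `control_plane_nodes` prints nothing, the cluster may be using a different labeling scheme, so treat the output as a starting point rather than proof.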

Cordoning and uncordoning are straightforward yet powerful commands that help manage node availability in a Kubernetes cluster. Whether you’re isolating the master node for maintenance or re-enabling scheduling after resolving an issue, these commands are critical for efficient cluster management.
