Scheduling in Kubernetes

In Kubernetes, scheduling is the process of matching Pods to Nodes that can provide the resources those Pods require. It is one of the core features of Kubernetes.

Pod priority, preemption, and priority classes all change how Pods are scheduled.

Prerequisites

Kubernetes Cluster
kubectl

Pod Scheduling Concepts

Pod Priority

Pod priority is a scheduling feature that lets Kubernetes compare the priority values of pending Pods, so that higher-priority Pods are scheduled before lower-priority ones.

Two main concepts in Pod priority are:

Pod Preemption
Pod Priority Class

Pod Preemption

Pod preemption allows a Kubernetes cluster to evict (preempt) lower-priority Pods from a node when a higher-priority Pod is waiting to be scheduled and no node has enough free resources for it.

Pod Priority Class

To give a Pod a priority, the Pod must reference a priority class. Priority is defined with the PriorityClass object, and the value of the PriorityClass determines the priority of every Pod that uses it.

✍️

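The value of a user-defined PriorityClass can be any 32-bit integer up to one billion (1000000000), including negative numbers. Values above one billion are reserved for the built-in system classes. A larger number means a higher priority for the Pods.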

apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  priorityClassName: high-priority-apps

Example of a Pod that references a priority class

Kubernetes Pod Priority Value

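A Pod sets its priority by naming a PriorityClass in priorityClassName. If the PriorityClass sets preemptionPolicy to Never, Pods of that class are still placed ahead of lower-priority Pods in the scheduling queue, but they cannot preempt (evict) other Pods.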

ℹ️

The default behaviour of PriorityClass is to use the PreemptLowerPriority policy.

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority-apps
value: 1000000
globalDefault: false
description: "Priority Class for Backends"

Example of a PriorityClass object
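
For contrast with the default policy, a non-preempting class could look roughly like the sketch below. The name high-priority-nonpreempting and the description are hypothetical and chosen only for illustration; preemptionPolicy and its Never value are standard PriorityClass fields.

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority-nonpreempting   # hypothetical name
value: 1000000
# Pods of this class queue ahead of lower-priority Pods,
# but never evict Pods that are already running.
preemptionPolicy: Never
globalDefault: false
description: "High priority without preemption (illustrative)"
```

Pods that reference this class wait until enough resources become free on their own, which can suit batch-style workloads that should run soon but must not disrupt existing Pods.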

Kubernetes Pod Priority Value & Class Example

The example below has a PriorityClass object and a Pod that uses the particular priority class.

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority-apps
  labels:
    class: high-priority
value: 1000000
preemptionPolicy: PreemptLowerPriority
globalDefault: false
description: "Critical Applications"
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    env: dev
spec:
  # priorityClassName is a Pod-level field, so it belongs under spec, not under a container
  priorityClassName: high-priority-apps
  containers:
  - name: web
    image: nginx:latest
    imagePullPolicy: IfNotPresent

Priority Class YAML Manifest Example

High Priority Classes

There are two high-priority classes available by default on a Kubernetes cluster. They can instruct the scheduler to give Pods with those classes the highest priority available during Pod scheduling.

1. system-cluster-critical

This class has a default value of 2000000000. Add-on Pods such as coredns and metrics server typically use this priority class.

2. system-node-critical

This class has a default value of 2000001000. Pods like etcd, kube-apiserver, and controller manager use this priority class.

ℹ️

These classes should be used only on Pods that are critical to the functioning of the cluster.
Unnecessarily applying these classes to less important pods is not recommended.

How Does Kubernetes Pod Priority & Preemption Work?

When a Pod is created with priorityClassName set, the priority admission controller looks up the named PriorityClass and writes its integer value into the Pod's priority field.
Pending Pods wait in the scheduling queue ordered by priority, so high-priority Pods are placed ahead of low-priority Pods.
When no node has the resources required to run a high-priority Pod, the preemption process starts. The scheduler looks for a node where evicting one or more lower-priority Pods would make room for the pending Pod.
Evicted Pods get a graceful termination period, which is 30 seconds by default.
If evicting lower-priority Pods would still not let the high-priority Pod fit on any node, no Pods are evicted and the high-priority Pod stays pending.
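
To make the steps above more concrete, here is a hedged sketch of the fields one would expect on a pending high-priority Pod after admission and after the scheduler has picked a node to preempt on. It is an illustrative excerpt, not captured output. The node name worker-1 is hypothetical; priority, preemptionPolicy, and nominatedNodeName are standard Pod fields.

```yaml
# Illustrative excerpt of a Pod object (not real cluster output)
spec:
  priorityClassName: high-priority-apps
  priority: 1000000                        # filled in by the admission controller from the PriorityClass
  preemptionPolicy: PreemptLowerPriority   # copied from the PriorityClass
status:
  phase: Pending
  # Set by the scheduler while lower-priority Pods on that node are terminating.
  nominatedNodeName: worker-1              # hypothetical node name
```

The nominated node is not a guarantee: the Pod may still end up on a different node if one becomes available first.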

Pod Priority FAQs

1. What is a Kubernetes DaemonSet Priority?

DaemonSet Pods are prioritized in the same way as any other Pods. To keep DaemonSets stable and avoid eviction during a resource crunch, set a higher-priority class in the DaemonSet's Pod template.
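
As a minimal sketch, the priority class goes in the Pod template of the DaemonSet. The DaemonSet name, label, and image below are placeholders invented for this example; high-priority-apps is the class from the manifest earlier in this article.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-agent                # hypothetical name
spec:
  selector:
    matchLabels:
      app: log-agent
  template:
    metadata:
      labels:
        app: log-agent
    spec:
      # Applies to every Pod the DaemonSet creates
      priorityClassName: high-priority-apps
      containers:
      - name: agent
        image: registry.example.com/log-agent:1.0   # placeholder image
```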

2. What Is the Relation Between Pod QoS, Pod Priority & Preemption?

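These are two separate mechanisms. During a resource shortage on a node, the kubelet first considers the Quality of Service (QoS) class and then the Pod priority value when choosing which Pods to evict. The scheduler's preemption logic, by contrast, starts up when high-priority Pods are waiting to be scheduled, and it selects victims by Pod priority without considering QoS.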
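
The two settings can be combined on one Pod. The sketch below assumes the high-priority-apps class from earlier in this article; the Pod name and the resource figures are arbitrary examples. Because requests equal limits for both CPU and memory in its only container, the Pod falls into the Guaranteed QoS class.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: payments-api             # hypothetical name
spec:
  priorityClassName: high-priority-apps   # used by the scheduler for queueing and preemption
  containers:
  - name: web
    image: nginx:latest
    resources:
      # requests == limits for CPU and memory gives the Guaranteed QoS class
      requests:
        cpu: "500m"
        memory: "256Mi"
      limits:
        cpu: "500m"
        memory: "256Mi"
```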

3. What Is the Significance of Pod Priority?

When we deploy apps in Kubernetes, there are certain applications, such as logging agents, databases, and payment provider services, that we do not want killed during a resource crunch.

To keep these Pods and DaemonSets available, we need to create a hierarchy of Pod tiers with priorities. During a resource crunch in the cluster, lower-priority Pods are removed first, either preempted by the scheduler or evicted by the kubelet, to accommodate the higher-priority, mission-critical Pods.
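
One possible way to express such a hierarchy is a small set of PriorityClass objects. The tier names and values below are assumptions made for illustration, not a recommended standard; only their relative order matters. At most one class in a cluster should set globalDefault to true.

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: tier-critical            # hypothetical: payment services, databases
value: 1000000
globalDefault: false
description: "Mission-critical workloads"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: tier-standard            # hypothetical: regular applications
value: 100000
globalDefault: true              # used by Pods that set no priorityClassName
description: "Default tier for ordinary workloads"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: tier-batch               # hypothetical: jobs that can be interrupted
value: 1000
globalDefault: false
description: "Low-priority, interruptible workloads"
```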

Summarizing

The Kubernetes scheduler checks for Pods with no nodes assigned and finds the best node for a particular Pod to run on. The scheduler will assign a node to the Pod based on the different concepts that apply to it, like Pod Priority, Pod Priority Class, QoS, etc.

In this article, we learned about the different scheduling concepts in Kubernetes. We also learned how to set Pod priority and priority classes, and why setting up Pod priority in a cluster matters.

Thank you for reading. Stay tuned for another article on Kubernetes runtime security and logging. Please comment below if you have any questions.