Despite the innovation in the tech space from mainframes to servers to virtualization to cloud to Kubernetes, one thing holds true – resources are resources. Memory is memory. CPU is CPU. Storage is storage. These are resources that engineers still have to think and care about because, regardless of where you’re running workloads, these resources aren’t unlimited.
Engineers need to think about resource optimization from both a performance perspective and a cost savings perspective. Otherwise, application stacks are either going to perform poorly or they’re going to have far more resources than they actually need.
In Kubernetes there are a few key ways to configure resource optimization.
What Are Requests
When configuring resource optimization in Kubernetes, the work typically breaks down into three categories:
- Requests
- Limits
- Quotas
Let’s start out with requests.
Requests set a minimum guaranteed resource amount. In the Kubernetes Manifest below, take a look at the resources section. It contains requests for both memory and CPU.
What this means is that each container in this Nginx deployment is guaranteed at least 64Mi of memory and 250m of CPU (a quarter of a CPU core). This is the minimum requirement for the workload to run properly.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: test
spec:
  selector:
    matchLabels:
      app: nginxdeployment
  replicas: 2
  template:
    metadata:
      namespace: webapp
      labels:
        app: nginxdeployment
    spec:
      containers:
      - name: nginxdeployment
        image: nginx:latest
        resources:
          requests:
            memory: "64Mi"
            cpu: "250m"
        ports:
        - containerPort: 80
What Are Limits
Next, there are limits. Limits specify the absolute maximum amount of CPU and memory that a container can use. Like requests, they are set per container, not per Deployment.
In the Kubernetes Manifest example below, the resources section contains a memory limit, but why not a CPU limit?
Because setting CPU limits in Kubernetes is a bad practice.
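When you set a limit on, for example, memory, the Pod will use the memory as needed, up to that limit. Once it’s done using the memory, the memory goes back into the pool of available resources on the node. A container that tries to exceed its memory limit is terminated (OOM-killed) rather than slowed down. CPU limits behave differently.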
If a process has CPU limits assigned, it will be throttled whenever it attempts to consume more CPU cycles per time slice than it has been limited to. Throttling means that even if there are unused CPU cycles on the node (there are resources in “the pool”), the throttled process cannot be assigned additional CPU cycles during the throttled time slice. Effectively, this can mean that some of the node’s “pool of available CPU resources” sits idle rather than being used to run the ready-to-run throttled process, even when no other process wants or needs those CPU cycles. This wastes the node’s resources.
The throttled process isn’t directly taking anything away from other workloads by having limits. However, the presence of limits on the workload prevents available and unused node CPU resources from being used to run it, and that is what is wasteful.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: test
spec:
  selector:
    matchLabels:
      app: nginxdeployment
  replicas: 2
  template:
    metadata:
      namespace: webapp
      labels:
        app: nginxdeployment
    spec:
      containers:
      - name: nginxdeployment
        image: nginx:latest
        resources:
          limits:
            memory: "128Mi"
        ports:
        - containerPort: 80
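To tie the requests and limits sections together, here is a minimal sketch of a single container spec that carries both requests and a memory limit while leaving the CPU limit off, in line with the reasoning above. The values are the ones already used in this article. The Deployment name and label nginx-combined are hypothetical, and the right numbers for a real workload have to come from observing its actual usage.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-combined        # hypothetical name for illustration
  namespace: test
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-combined     # hypothetical label
  template:
    metadata:
      labels:
        app: nginx-combined
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        resources:
          requests:
            memory: "64Mi"    # guaranteed minimum, used for scheduling
            cpu: "250m"       # a quarter of a CPU core
          limits:
            memory: "128Mi"   # hard ceiling; exceeding it gets the container killed
            # no cpu limit: the container may use idle CPU cycles on the node
        ports:
        - containerPort: 80
```

With this shape, each replica reserves 250m of CPU and 64Mi of memory, can burst above 250m when the node has spare cycles, and is capped at 128Mi of memory.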
What Are Quotas
Resource Quotas cap the total requests and limits that all Pods in a particular Namespace can claim. The example below shows that for the webapp Namespace, the memory requests of all Pods combined cannot exceed 512Mi, and the memory limits of all Pods combined cannot exceed 1,112Mi. These are aggregate ceilings for the entire Namespace, not a guarantee that the capacity is available. If you try to create a Pod that would push the Namespace over its quota, the API server rejects the request and the Pod is never created. For a Deployment, that means its ReplicaSet cannot create the missing Pods until enough quota frees up.
apiVersion: v1
kind: Namespace
metadata:
  name: webapp
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: memorylimit
  namespace: webapp
spec:
  hard:
    requests.memory: 512Mi
    limits.memory: 1112Mi
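As a worked example of how this quota gets consumed, the sketch below places a Deployment in the webapp Namespace using the same per-container values as earlier in the article (64Mi requested, 128Mi limit). The Deployment name nginx-quota-demo is hypothetical, and the arithmetic in the comments assumes nothing else is running in the Namespace. Because the quota tracks requests.memory and limits.memory, Pods in this Namespace generally need to declare both values to be admitted.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-quota-demo      # hypothetical name for illustration
  namespace: webapp           # the Namespace the quota applies to
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-quota-demo
  template:
    metadata:
      labels:
        app: nginx-quota-demo
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        resources:
          requests:
            memory: "64Mi"    # 2 replicas x 64Mi = 128Mi of the 512Mi request quota
          limits:
            memory: "128Mi"   # 2 replicas x 128Mi = 256Mi of the 1112Mi limit quota
```

At these sizes the request quota allows 512Mi / 64Mi = 8 Pods and the limit quota allows 1112Mi / 128Mi ≈ 8.7, so the Namespace tops out at eight of these Pods. A ninth would exceed both quotas and be rejected.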
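Here’s another example of a Resource Quota where hard limits are set on CPU, memory, and how many Pods are allowed to run inside of the Namespace. In a quota, the plain cpu and memory keys mean the same thing as requests.cpu and requests.memory, so they cap total requests rather than total limits.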
apiVersion: v1
kind: ResourceQuota
metadata:
  name: memorylimit
  namespace: test
spec:
  hard:
    cpu: "5"
    memory: 10Gi
    pods: "10"
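The same quota can also be written with explicit requests.* keys, which leaves room for a separate ceiling on memory limits. The sketch below is an assumed variant of the example above, not a recommendation: the name compute-quota and the limits.memory value are made up for illustration.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota         # hypothetical name
  namespace: test
spec:
  hard:
    requests.cpu: "5"         # total CPU requests across all Pods
    requests.memory: 10Gi     # total memory requests across all Pods
    limits.memory: 20Gi       # assumed value: total memory limits across all Pods
    pods: "10"                # at most 10 non-terminal Pods in the Namespace
    # limits.cpu is deliberately left out so Pods are not forced to set CPU limits
```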
What Happens When They Aren’t Set
As you go through all of the work necessary to ensure that proper limits, requests, and quotas are set, you may ask yourself a question – what happens if they aren’t set?
The two short answers are:
- Your application will perform poorly.
- You’ll be spending way too much money for no reason.
If you, for example, set a limit too low or a request too low, the Pod and the overall application stack will perform poorly. If you don’t set proper limits, requests, and quotas, the Pod and application stack may use far more resources than they actually need, resulting in unnecessary spending.