
Impacts Of Not Setting Requests, Limits, and Quotas

Despite the innovation in the tech space from mainframes to servers to virtualization to cloud to Kubernetes, one thing holds true – resources are resources. Memory is memory. CPU is CPU. Storage is storage. These are resources that engineers still have to think and care about because, regardless of where you’re running workloads, these resources aren’t unlimited.

Engineers need to think about resource optimization from both a performance perspective and a cost-savings perspective. Otherwise, application stacks are either going to perform poorly or they’re going to have far more resources than they actually need.

In Kubernetes, there are a few key ways to configure resource optimization.

What Are Requests

Resource configuration for a typical Kubernetes workload breaks down into three categories:

  1. Requests
  2. Limits
  3. Quotas

Let’s start out with requests.

Requests set a minimum guaranteed amount of a resource for a container. In the Kubernetes Manifest below, look at the resources block. It contains requests for both memory and CPU.

What this means is each container in this Nginx Deployment is guaranteed at least 64Mi of memory and 250m of CPU (a quarter of one core), and the scheduler only places a Pod on a node that has that much unreserved capacity. Set requests to the minimum the workload needs to run properly.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: test
spec:
  selector:
    matchLabels:
      app: nginxdeployment
  replicas: 2
  template:
    metadata:
      labels:
        app: nginxdeployment
    spec:
      containers:
      - name: nginxdeployment
        image: nginx:latest
        resources:
          requests:
            memory: "64Mi"
            cpu: "250m"
        ports:
        - containerPort: 80

What Are Limits

Next, there are limits. Limits specify the absolute maximum amount of CPU and memory that a container can use.

In the Kubernetes Manifest example below, the resources block contains a memory limit, but why not a CPU limit?

Because setting CPU limits in Kubernetes is a bad practice.

When you set a limit on, for example, memory, the Pod will use the memory as needed, up to that limit. Once it’s done using the memory, the memory goes back into the pool of available resources on the node. A container that tries to use more memory than its limit is terminated (OOM-killed) rather than slowed down.

CPU limits behave differently. If a process has CPU limits assigned, it will be throttled whenever it attempts to consume more CPU cycles per time slice than it has been limited to. Throttling means that even if there are unused CPU cycles on the node (there are resources in “the pool”), the throttled process cannot be assigned additional CPU cycles during the throttled time slice. Effectively, some of the node’s “pool of available CPU resources” might sit idle rather than being used to run the ready-to-run throttled process, even when no other process wants or needs those CPU cycles. This is wasteful of the node’s resources.

The throttled process isn’t directly taking anything away from other workloads by having limits. However, the presence of limits on the workload prevents available and unused node CPU resources from being used to run the limited workload, and this is what is unnecessary and wasteful.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: test
spec:
  selector:
    matchLabels:
      app: nginxdeployment
  replicas: 2
  template:
    metadata:
      labels:
        app: nginxdeployment
    spec:
      containers:
      - name: nginxdeployment
        image: nginx:latest
        resources:
          limits:
            memory: "128Mi"
        ports:
        - containerPort: 80
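
The two manifests above can be combined. The following is a minimal sketch, not a tuned configuration: it reuses the request and limit values from this article, and the comments on the CPU limit assume the default 100ms CPU scheduling period.

```yaml
# Sketch only: requests from the first manifest plus the memory limit
# from the second. All values are the ones used in this article.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: test
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginxdeployment
  template:
    metadata:
      labels:
        app: nginxdeployment
    spec:
      containers:
      - name: nginxdeployment
        image: nginx:latest
        resources:
          requests:
            memory: "64Mi"   # reserved for the container at scheduling time
            cpu: "250m"      # a quarter of one CPU core
          limits:
            memory: "128Mi"  # the container is OOM-killed if it goes past this
            # No cpu limit on purpose. For contrast, adding `cpu: "500m"` here
            # would cap the container at 50ms of CPU time per 100ms period
            # (assuming the default period), whether or not cores sit idle.
        ports:
        - containerPort: 80
```

With this shape, the container always gets its requested share of CPU under contention and can still borrow idle CPU cycles, while memory stays bounded.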

What Are Quotas

Resource Quotas cap the total requests or limits that all Pods in a particular Namespace can claim. The example below shows that for the webapp Namespace, the memory requests of all Pods combined cannot exceed 512Mi and their memory limits combined cannot exceed 1,112Mi. A quota is a ceiling on what the Namespace can claim, not a guaranteed minimum. Once a quota covers memory requests and limits, every Pod in the Namespace has to declare them (or inherit defaults from a LimitRange). If you try to create a Pod that would push the Namespace over its quota, the API server rejects it and the Pod is never created. A Deployment’s ReplicaSet keeps retrying until enough quota frees up.

apiVersion: v1
kind: Namespace
metadata:
  name: webapp
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: memorylimit
  namespace: webapp
spec:
  hard:
    requests.memory: 512Mi
    limits.memory: 1112Mi
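
To see how Pods count against this quota, here is a small sketch of a Pod that fits inside it. The Pod name `quota-demo` is hypothetical, and the memory values are borrowed from the Deployment examples earlier in the article.

```yaml
# Sketch only: a Pod that fits inside the webapp quota above.
apiVersion: v1
kind: Pod
metadata:
  name: quota-demo        # hypothetical name, for illustration
  namespace: webapp
spec:
  containers:
  - name: nginx
    image: nginx:latest
    resources:
      requests:
        memory: "64Mi"    # counts against requests.memory (512Mi)
      limits:
        memory: "128Mi"   # counts against limits.memory (1112Mi)
```

Eight Pods of this size add up to 512Mi of requests and 1,024Mi of limits, which still fits. A ninth would bring requests to 576Mi, so it would be rejected on the requests.memory quota even though the limits quota still has room.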

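Here’s another example of a Resource Quota where hard limits are set on total CPU requests, total memory requests, and how many Pods are allowed to run inside of the Namespace. In a quota, the plain cpu and memory keys count requests, the same as requests.cpu and requests.memory.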

apiVersion: v1
kind: ResourceQuota
metadata:
  name: memorylimit
  namespace: test
spec:
  hard:
    cpu: "5"
    memory: 10Gi
    pods: "10"
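
The same quota can be written with the prefixed key names, which makes it more obvious that requests are what is being counted. This is a sketch of an equivalent spelling, and the name `compute-quota` is hypothetical.

```yaml
# Sketch only: equivalent to the quota above, using the prefixed keys.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota     # hypothetical name, for illustration
  namespace: test
spec:
  hard:
    requests.cpu: "5"       # same meaning as `cpu: "5"`
    requests.memory: 10Gi   # same meaning as `memory: 10Gi`
    pods: "10"              # at most 10 non-terminated Pods in the Namespace
```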

What Happens When They Aren’t Set

As you go through all of the work necessary to ensure that proper limits, requests, and quotas are set, you may ask yourself a question – what happens if they aren’t set?

The two short answers are:

  1. Your application will perform poorly.
  2. You’ll be spending way too much money for no reason.

If you, for example, set a limit too low or a request too low, the Pod and the overall application stack will perform poorly. If you don’t set proper limits, requests, and quotas, the Pod and application stack may use far more resources than they actually need, and you end up spending money unnecessarily.
