Introduction
Kubernetes is about scaling, but that doesn't mean we get auto scaling out-of-the-box – we must activate and configure some additional components. In this article I want to show you a working example of simple service scaling on Kubernetes using the Horizontal Pod Autoscaler. After reaching maximum cluster capacity we will automatically add more workers to our cluster using Cluster Autoscaler. Everything will run on Amazon EKS – managed Kubernetes on AWS.
As always, the full source code can be found on my GitHub: https://github.com/jakubbujny/article-scale-containers-and-eks
Simple deployment creating load
To test auto scaling we need a simple service which will create a big load for us, so scaling can be triggered. For that task we are going to use the following Python code:
import flask
import uuid
import hashlib

app = flask.Flask(__name__)

@app.route("/")
def hello():
    for i in range(0, 800000):
        hashlib.sha224(uuid.uuid4().hex.upper()[0:6].encode()).hexdigest()
    return "Done"

app.run(host="0.0.0.0", threaded=True)
The code is really simple – it creates a web endpoint using the Flask framework. A GET request on "/" triggers a long loop which calculates a lot of SHA hashes from random UUIDs, which should take about 5–10 seconds and consume a lot of CPU during that time.
To avoid building our own Docker image (and creating a Docker registry, which would complicate the example) we can use a simple trick: take the image jazzdd/alpine-flask:python3, which is available on Docker Hub and has Python and Flask installed. We can then write our Python file in the container's "command" section and run it – see the full YAML below:
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  namespace: default
  name: microservice
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: microservice
    spec:
      containers:
        - name: microservice
          image: jazzdd/alpine-flask:python3
          command: ["sh"]
          args: ["-c", "printf \"import flask\\nimport uuid\\nimport hashlib\\napp = flask.Flask(__name__)\\n@app.route(\\\"/\\\")\\ndef hello():\\n    for i in range(0,800000):\\n        hashlib.sha224(uuid.uuid4().hex.upper()[0:6].encode()).hexdigest()\\n    return \\\"Done\\\"\\napp.run(host=\\\"0.0.0.0\\\", threaded=True)\" > script.py && python3 script.py"]
          ports:
            - name: http-port
              containerPort: 5000
          resources:
            requests:
              cpu: 200m
The important thing here is the resources request block, which says that on a 1-CPU-core machine (which we are going to use in this article) we can fit 5 microservice PODs (200m x 5 = 1000m = 1 CPU); reaching that number means the particular node is at full capacity. Reaching cluster capacity will be the trigger for Cluster Autoscaler.
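As a quick sanity check you can apply the deployment and see how much of the node's allocatable CPU is already requested – a minimal sketch, assuming you saved the YAML above as microservice-deployment.yaml (the file name is my assumption):

# Apply the deployment above (file name is an assumption)
kubectl apply -f microservice-deployment.yaml

# Show per-node CPU/memory requests vs. allocatable capacity;
# with 200m requested per POD, a 1-core node fills up after 5 microservice PODs
kubectl describe nodes | grep -A 5 "Allocated resources"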
Horizontal Pod Autoscaler
Horizontal scaling in the Kubernetes world means adding more pods to a particular deployment. To achieve that, the Horizontal Pod Autoscaler can be used, but we need to note one important thing: in the newest Kubernetes versions, metrics-server needs to be installed to use HPA – Heapster is deprecated and shouldn't be used anymore.
To test that on minikube you just need to type:
minikube addons enable metrics-server
To deploy metrics-server on EKS you need to clone the following repository: https://github.com/kubernetes-incubator/metrics-server and then issue the command:
kubectl apply -f metrics-server/deploy/1.8+/
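To verify that metrics-server is actually serving metrics – a quick check, not part of the original manifests – you can query the metrics API after the pods start; it may take a minute before data appears:

# metrics-server registers itself as an aggregated API service
kubectl get apiservice v1beta1.metrics.k8s.io

# If it works, these return CPU/memory figures instead of an error
kubectl top nodes
kubectl top pods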
To activate HPA for our microservice we need to apply the following YAML file:
---
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  namespace: default
  name: microservice
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: microservice
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        targetAverageUtilization: 50
targetAverageUtilization: 50 means that Kubernetes will try to keep the average CPU usage per POD at half of the CPU requested by our microservice (50% * 200m = 100m). E.g. when we have a single POD consuming 200m of CPU, Kubernetes will create a new POD so the 200m can be divided between 2 PODs (100m and 100m).
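You can watch the HPA doing this calculation live – a sketch of how to observe it (the exact column layout differs between kubectl versions):

# TARGETS shows current/target average CPU utilization, e.g. 160%/50%,
# and REPLICAS shows the resulting POD count
kubectl get hpa microservice --watch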
AWS EKS and Cluster Autoscaler
Disclaimer – why use Cluster Autoscaler instead of an ASG scaling trigger based on CPU?
From Cluster Autoscaler FAQ:
“Cluster Autoscaler makes sure that all pods in the cluster have a place to run, no matter if there is any CPU load or not. Moreover, it tries to ensure that there are no unneeded nodes in the cluster.
CPU-usage-based (or any metric-based) cluster/node group autoscalers don’t care about pods when scaling up and down. As a result, they may add a node that will not have any pods, or remove a node that has some system-critical pods on it, like kube-dns. Usage of these autoscalers with Kubernetes is discouraged.”
For the EKS deployment we are going to use a modified version of the EKS setup from my previous article.
Cluster Autoscaler is a component which will be installed in the EKS cluster. It will watch the Kubernetes API and make requests to the AWS API to scale the worker nodes' ASG. That means the node on which Cluster Autoscaler resides needs a proper IAM policy which allows containers on that node to perform operations on the ASG.
resource "aws_iam_role_policy" "for-autoscaler" { name = "for-autoscaler" policy = <<POLICY { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "autoscaling:DescribeAutoScalingGroups", "autoscaling:DescribeAutoScalingInstances", "autoscaling:SetDesiredCapacity", "autoscaling:DescribeTags", "autoscaling:TerminateInstanceInAutoScalingGroup" ], "Resource": "*" } ] } POLICY role = "${aws_iam_role.eks-node.name}" }
That policy should probably be limited in the Resource section, but we will leave * to simplify the example.
We put some additional tags on the ASG so Cluster Autoscaler can use them:
tag {
  key                 = "k8s.io/cluster-autoscaler/enabled"
  value               = "whatever"
  propagate_at_launch = false
}

tag {
  key                 = "kubernetes.io/cluster/eks"
  value               = "owned"
  propagate_at_launch = true
}
We must also set up security groups to allow port 443 communication from the cluster control plane to the nodes, as mentioned in this issue: https://github.com/kubernetes-incubator/metrics-server/issues/45
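If you are not managing security groups in Terraform, the same rule can be opened ad hoc with the AWS CLI – a sketch, where both group IDs are placeholders you need to look up yourself:

# Allow the control-plane security group to reach the nodes on 443
# (both IDs below are placeholders, not values from this article)
aws ec2 authorize-security-group-ingress \
  --group-id <node-security-group-id> \
  --protocol tcp \
  --port 443 \
  --source-group <control-plane-security-group-id>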
For Cluster Autoscaler we will slightly modify the example deployment from here: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
We need to modify the tags which Cluster Autoscaler will use to discover the ASG to be scaled:
- --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,kubernetes.io/cluster/eks
Add an env variable with the region in which we are operating:
env:
  - name: "AWS_REGION"
    value: eu-west-1
Change the certificate mount to the one required on EKS:
volumeMounts:
  - name: ssl-certs
    mountPath: /etc/ssl/certs/ca-bundle.crt
    readOnly: true
Cluster Autoscaler is now ready to use and will scale the worker nodes up or down via the ASG, between 1 and 10 instances.
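When testing, it's handy to follow the Cluster Autoscaler logs to see its scale-up and scale-down decisions. Assuming you kept the deployment name cluster-autoscaler from the example manifest, something like this should work:

# Scale-up decisions and ASG API calls show up in these logs
kubectl -n kube-system logs -f deployment/cluster-autoscaler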
Testing
The last step is to create a load balancer attached to the microservice and test auto scaling by making some requests to create load.
apiVersion: v1
kind: Service
metadata:
  name: microservice
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
  labels:
    app: microservice
spec:
  ports:
    - port: 80
      targetPort: http-port
  selector:
    app: microservice
  type: LoadBalancer
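After applying that Service, Kubernetes asks AWS for an NLB, which takes a minute or two to provision. You can grab its DNS name like this (the file name is my assumption):

# Apply the Service above (file name is an assumption)
kubectl apply -f microservice-service.yaml

# EXTERNAL-IP will show the NLB DNS name once it's provisioned
kubectl get svc microservice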
Now you can simply open the load balancer endpoint at the root path in a web browser and hit F5 a few times to generate load, or use a script like:
while true; do sleep 2; timeout 1 curl http://<elb_id>.elb.eu-west-1.amazonaws.com/; done
After that you should see in Kubernetes that HPA scaled your containers up and reached the maximum node capacity. After a while, Cluster Autoscaler should scale the AWS ASG and add a new worker node so HPA can complete the POD scaling.
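One way to watch the whole chain in action while the load script runs – plain kubectl, nothing specific to this setup:

# HPA scaling the PODs...
kubectl get hpa microservice

# ...PODs stuck in Pending when the node is full...
kubectl get pods -l app=microservice -o wide

# ...and Cluster Autoscaler adding a new node
kubectl get nodes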