Configuring Cloud Native Compiler Autoscaling
Since the Cloud Native Compiler (CNC) service uses a large amount of resources (the recommendation is 4 CNC vCores for every JVM vCore, so a fleet of JVMs using 100 vCores in total calls for roughly 400 CNC vCores), it is imperative to configure autoscaling correctly. The Kubernetes Horizontal Pod Autoscaler (HPA) automatically increases or decreases the number of pods in a replication controller, deployment, replica set, or stateful set based on observed CPU utilization.
The gateway, compile-broker, and cache components already contain instructions for creating a Kubernetes HPA. If the autoscaler detects unused nodes, it deletes them. If a replication controller, deployment, or replica set tries to start a container and cannot do so due to a lack of resources, the autoscaler determines which service is needed and adds it to the Kubernetes cluster. For more information, see https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/.
Horizontal Pod Autoscaler (HPA)
In order to use HPA autoscaling, you need to install the Metrics Server component in Kubernetes.
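If Metrics Server is not already present in your cluster, a minimal sketch of installing it (assuming the cluster can reach GitHub; check the metrics-server release notes for any flags your environment needs, such as kubelet TLS settings) is:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

# Confirm that the metrics API is serving data before relying on the HPAs
kubectl top nodes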
Each of the base/02-infrastructure.yaml, base/03-cache.yaml, and base/04-compile-broker.yaml files contains the HPA instructions:
# Dependence on k8s metric-server
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: compile-broker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: compile-broker
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
Uncomment these sections, then set minReplicas and maxReplicas accordingly. You can also tweak the averageUtilization property to scale out the compile-broker more aggressively.
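As a sketch, assuming you apply the manifests directly with kubectl in the default namespace (if you deploy with kustomize or another tool, adjust the command accordingly), you can apply the edited file and confirm that the autoscaler is active:

kubectl apply -f base/04-compile-broker.yaml

# TARGETS shows current vs. target CPU utilization once Metrics Server is reporting
kubectl get hpa compile-broker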
Note: The compile-broker component is fairly fast to start, but the cache component takes several minutes to fully synchronize with the other existing cache nodes and be ready to respond to requests.
The gateway component's autoscaling is less aggressive than that of the other components because gateways going up and down directly affect the connected VMs. It is therefore necessary to detect higher load over a period of time before a new gateway instance is added.
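A minimal sketch of how such a delay can be expressed, assuming an HPA named gateway targeting a gateway Deployment on a Kubernetes version that supports the behavior field of autoscaling/v2beta2 (the names, replica counts, and thresholds here are illustrative, not the values shipped with CNC):

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: gateway
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: gateway
  minReplicas: 1
  maxReplicas: 10
  behavior:
    scaleUp:
      # Only scale up when the load has been sustained for the whole window
      stabilizationWindowSeconds: 300
      policies:
      # Add at most one gateway pod per minute to limit churn for connected VMs
      - type: Pods
        value: 1
        periodSeconds: 60
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70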