Sizing and Scaling your Optimizer Hub Installation

For Optimizer Hub to complete JIT compilations in time, the installation must be sized correctly. You scale Optimizer Hub by specifying the minimum and maximum number of vCores to allocate to the service. The Helm chart then automatically sizes the individual Optimizer Hub components.

Service Scaling

Optimizer Hub can be configured to run one or more services; see Configuring the Active Optimizer Hub Services. Different scaling approaches are required depending on which services are selected.

Cloud Native Compiler (CNC)

The CNC service must be able to autoscale rapidly to handle resource demands effectively. From time to time, depending on the number of applications starting, it needs a large amount of resources to perform all requested compilations in time. It must therefore scale up as demand requires, and also scale down quickly when the resources are no longer needed, because keeping them always on is prohibitively expensive.

ReadyNow Orchestrator (RNO)

When Optimizer Hub is configured in RNO-only mode (using values-disable-compiler.yaml, see Configuring the Active Optimizer Hub Services), it doesn’t need to scale. The predefined sizing can handle full RNO functionality.

Combined Services

When both CNC and RNO are enabled but you only use RNO, the Optimizer Hub service may, in rare cases of extreme traffic, scale up additional instances.

How Optimizer Hub Scales

A critical metric for determining whether your Cloud Native Compiler is responding to compilation requests in time is the Time to Clear Optimization Backlog (TCOB).

Time to Clear Optimization Backlog graph

When you start a Java program, there is a burst of compilation activity as a large number of optimization requests are put on the compilation queue. Eventually, the compiler catches up with the optimization backlog, and all new compilation requests are started within 2 seconds of being put on the compilation queue. The TCOB measures, for each individual JVM, how long it took from the start of the compilation activity until the optimization backlog was cleared and all requests were started within 2 seconds.
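The TCOB definition above can be illustrated with a small Python sketch. This is not how Optimizer Hub computes the metric internally; it is a minimal model under stated assumptions. The `CompileRequest` type and its fields are hypothetical, and the sketch assumes the backlog counts as cleared when the last request that waited longer than 2 seconds is started.

```python
from dataclasses import dataclass

# From the text: once the backlog is cleared, requests start within 2 seconds.
THRESHOLD_SECONDS = 2.0


@dataclass(frozen=True)
class CompileRequest:
    """Hypothetical record of one compilation request from a single JVM."""
    enqueued_at: float  # seconds since JVM start when the request was queued
    started_at: float   # seconds since JVM start when compilation began


def time_to_clear_backlog(requests, threshold=THRESHOLD_SECONDS):
    """Return an illustrative TCOB in seconds for one JVM.

    Assumption: the backlog is cleared at the moment the last request
    that waited longer than `threshold` is started.
    """
    if not requests:
        return 0.0
    first_enqueue = min(r.enqueued_at for r in requests)
    late = [r for r in requests if r.started_at - r.enqueued_at > threshold]
    if not late:
        # Every request started within the threshold: no backlog built up.
        return 0.0
    cleared_at = max(r.started_at for r in late)
    return cleared_at - first_enqueue


# Two requests wait in the startup burst; the third starts promptly.
burst = [
    CompileRequest(enqueued_at=0.0, started_at=5.0),
    CompileRequest(enqueued_at=1.0, started_at=8.0),
    CompileRequest(enqueued_at=10.0, started_at=10.5),
]
tcob = time_to_clear_backlog(burst)
```

In this example the last late request starts 8 seconds after the first request was queued, so the sketch reports a TCOB of 8 seconds for that JVM.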

Grafana dashboard showing the number of local fallback JVMs

By default, Optimizer Hub is configured to use autoscaling. You control autoscaling by specifying the minimum and maximum number of vCores for the entire Optimizer Hub installation. The minimum size of an Optimizer Hub installation, including a management-gateway pod and one compile-broker pod, is 39 vCores. If you want more compilation capacity, increase minVCores.

The maxVCores setting defines the upper limit: the Optimizer Hub service will not scale beyond this number of vCores, regardless of how much load it is under.

These values can be defined by overriding the default values in your values-override.yaml file.

simpleSizing:
  vCores: 39
  minVCores: 39
  maxVCores: 113

The Optimizer Hub service uses the minimum and maximum number of vCores to adjust the sizing of the instance as it tries to meet your timeToClearOptimizationBacklog limit for all the JVMs that request compilations.
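As a rough illustration of how the two bounds constrain scaling, the following Python sketch clamps a requested capacity to the configured range. The operator's actual decision logic is not described here, so the function names and the idea of a single "desired" vCore figure are assumptions made for the example; only the 39-vCore minimum and the sample values 39 and 113 come from the text.

```python
# Smallest installation size stated in the text.
MIN_INSTALL_VCORES = 39


def validate_sizing(min_vcores, max_vcores):
    """Reject bounds that contradict the sizing rules described above."""
    if min_vcores < MIN_INSTALL_VCORES:
        raise ValueError(f"minVCores must be at least {MIN_INSTALL_VCORES}")
    if max_vcores < min_vcores:
        raise ValueError("maxVCores must not be smaller than minVCores")


def bounded_target(desired_vcores, min_vcores, max_vcores):
    """Clamp a hypothetical desired capacity to [minVCores, maxVCores]."""
    validate_sizing(min_vcores, max_vcores)
    return max(min_vcores, min(desired_vcores, max_vcores))


# With the sample values shown earlier (minVCores 39, maxVCores 113):
idle = bounded_target(10, 39, 113)    # load below the floor
burst = bounded_target(200, 39, 113)  # load above the ceiling
```

Under these assumptions, capacity never drops below minVCores when idle and never exceeds maxVCores during a burst, whatever the load requests.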

Note: Optimizer Hub uses a custom Kubernetes operator to scale and does not use Kubernetes Horizontal Pod Autoscalers.

Scaling API

The Scaling API allows you to instruct Optimizer Hub to temporarily increase the minimum number of vCores between a start and an end timestamp. You can call this API multiple times; Optimizer Hub takes all the given timestamps, including any overlaps, into account when starting and stopping the extra resources.
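One way to picture how overlapping requests could combine is to merge the requested time windows into the periods during which extra resources would be running. The Python sketch below does only that; it does not call the Scaling API, whose endpoints and parameters are not shown here. The `merge_windows` function and the sample timestamps are hypothetical.

```python
from datetime import datetime


def merge_windows(windows):
    """Merge overlapping (start, end) windows into disjoint periods.

    Illustrative only: models the idea that overlapping scale-up
    requests result in one continuous period of extra resources.
    """
    merged = []
    for start, end in sorted(windows):
        if end <= start:
            raise ValueError("a window must end after it starts")
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous period: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Three hypothetical requests; the first two overlap.
requests = [
    (datetime(2024, 1, 15, 9, 0), datetime(2024, 1, 15, 10, 0)),
    (datetime(2024, 1, 15, 9, 30), datetime(2024, 1, 15, 11, 0)),
    (datetime(2024, 1, 15, 14, 0), datetime(2024, 1, 15, 15, 0)),
]
periods = merge_windows(requests)
```

Here the two morning requests collapse into a single period from 09:00 to 11:00, and the afternoon request stays separate.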