Monitoring Cloud Native Compiler

You can monitor your Cloud Native Compiler using the standard Kubernetes monitoring tools: Prometheus and Grafana. Cloud Native Compiler service components are already configured to expose key metrics for scraping by Prometheus.

In production systems, you will likely want to use your existing Prometheus and Grafana instances to monitor Cloud Native Compiler. If you are only evaluating Cloud Native Compiler, you may prefer to install a separate instance of Prometheus and Grafana just to monitor your test instance. The prime-cnc Helm chart contains a basic setup of Prometheus and Grafana.

Installing Prometheus and Grafana

To install a pre-configured monitoring stack along with CNC, enable it in the installation values file and then perform the installation as described in the general installation guide. The stack is installed in the same namespace as CNC. No further setup is needed, because the stack comes with a pre-configured CNC dashboard.

monitoring:
  enabled: true

Note
If you are installing to minikube, you do not need to change anything else. If you are installing to a cluster, it is a good idea to configure a nodeSelector so that Prometheus and Grafana are installed on an infrastructure node. See values-aws.yaml for an example.
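
As a minimal sketch of the step above, the following writes a values override file that enables the bundled monitoring stack. The file name values-override.yaml and the function name write_monitoring_values are illustrative assumptions; the only setting taken from this guide is monitoring.enabled. Pass the resulting file to Helm as described in the general installation guide.

```shell
# Write a Helm values override that turns on the bundled Prometheus/Grafana stack.
# The file name is a hypothetical choice; only `monitoring.enabled` comes from this guide.
write_monitoring_values() {
  values_file=${1:-values-override.yaml}
  cat > "$values_file" <<'EOF'
monitoring:
  enabled: true
EOF
}

write_monitoring_values values-override.yaml
cat values-override.yaml
```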

Connecting to Grafana

  1. Find the IP address of any node in the cluster and the Grafana port.

$ kubectl describe pods grafana -n compiler | grep Node:
Node: ip-10-21-81-225.us-west-2.compute.internal/10.21.81.225
$ kubectl describe service grafana -n compiler | grep NodePort:
NodePort: service 31761/TCP

  2. Access the Grafana UI in a web browser at http://<node-ip>:<node-port>, using the IP address and port from step 1.

  3. To view CNC metrics, open the Prime-CNC dashboard.

Note
Grafana does not have persistent storage configured, so any changes you make to it will not survive a pod restart.
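
The lookup in step 1 can be turned into a URL with a small helper. This is a sketch only: the function name grafana_url is hypothetical, and the http scheme assumes Grafana is served without TLS. It takes the "Node:" and "NodePort:" lines printed by the two kubectl commands and extracts the IP address and port from them.

```shell
# Build the Grafana URL from the two lines printed by the commands in step 1.
# $1: the "Node:" line, $2: the "NodePort:" line.
# Assumption: Grafana is reachable over plain HTTP on the NodePort.
grafana_url() (
  node_line=$1
  port_line=$2
  ip=${node_line##*/}                                    # text after the last "/"
  port=$(printf '%s\n' "$port_line" | awk '{print $3}')  # third field, e.g. "31761/TCP"
  port=${port%%/*}                                       # drop the "/TCP" suffix
  printf 'http://%s:%s\n' "$ip" "$port"
)

grafana_url 'Node: ip-10-21-81-225.us-west-2.compute.internal/10.21.81.225' \
            'NodePort: service 31761/TCP'
# prints http://10.21.81.225:31761
```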

Retrieving Cloud Native Compiler Logs

All Cloud Native Compiler components, including third-party ones, log some information to stdout. These logs are very important for diagnosing problems.

You can extract individual logs with the following command:

 
kubectl -n <my-namespace> logs <pod>

However, by default Kubernetes keeps only the last 10 MB of logs for every container, so in a cluster under load important diagnostic information can quickly be overwritten by newer log entries.
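
Until log aggregation is in place, one way to reduce the risk of losing diagnostics is to snapshot the logs of every pod to local files as soon as a problem appears. The sketch below is an illustration, not an official tool: the function name save_pod_logs and the default output directory ./cnc-logs are assumptions, while the kubectl flags used (-o name, --all-containers, --timestamps) are standard kubectl options.

```shell
# Save the current logs of every pod in a namespace to one file per pod.
# $1: namespace, $2: output directory (hypothetical default: ./cnc-logs).
save_pod_logs() (
  namespace=$1
  out_dir=${2:-./cnc-logs}
  mkdir -p "$out_dir" || exit 1
  # `-o name` prints one "pod/<name>" entry per line.
  for pod in $(kubectl -n "$namespace" get pods -o name); do
    # --all-containers includes every container in the pod;
    # --timestamps prefixes each log line with its time.
    kubectl -n "$namespace" logs "$pod" --all-containers=true --timestamps=true \
      > "$out_dir/${pod#pod/}.log"
  done
)

# Usage: save_pod_logs <my-namespace> ./cnc-logs
```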

You should configure log aggregation for all CNC components, so that logs are moved to persistent storage and can be retrieved when an issue needs to be analyzed. You can use any log aggregation tool. One suggested option is Loki, whose logs you can query with the logcli tool.

Here are some common commands you can run to retrieve logs:

  • Find the host and port where Loki is listening and export them for logcli

     
    export LOKI_ADDR=http://<ip-address>:<port>
  • Get logs of all pods in the selected namespace

     
    logcli query --since 24h --forward --limit=10000 '{namespace="zvm-dev-3606"}'
  • Get logs of a single application in the selected namespace

     
    logcli query --since 24h --forward --limit=10000 '{namespace="zvm-dev-3606",app="compile-broker"}'
  • Get logs of a single pod in the selected namespace

     
    logcli query --since 24h --forward --limit=10000 '{namespace="zvm-dev-3606",pod="compile-broker-5fd956f44f-d5hb2"}'
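
The queries above return every log line for the selected streams. LogQL also lets you append a line filter such as |= "text" to keep only lines containing a given substring. The helper below is a hedged sketch of how such a query string could be assembled. The function name cnc_log_query and the search term ERROR are illustrative assumptions, and the namespace and app values are the sample values used above.

```shell
# Build a LogQL query: a stream selector plus an optional line filter.
# $1: namespace, $2: app label value, $3: optional substring to match.
cnc_log_query() (
  selector="{namespace=\"$1\", app=\"$2\"}"
  if [ -n "${3:-}" ]; then
    printf '%s |= "%s"\n' "$selector" "$3"
  else
    printf '%s\n' "$selector"
  fi
)

# Usage (requires LOKI_ADDR to be set as shown above):
#   logcli query --since 24h --forward --limit=10000 \
#     "$(cnc_log_query zvm-dev-3606 compile-broker ERROR)"
cnc_log_query zvm-dev-3606 compile-broker ERROR
# prints {namespace="zvm-dev-3606", app="compile-broker"} |= "ERROR"
```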

Extracting Compilation Artifacts

Cloud Native Compiler stores a record of every compilation request and its processing result in a Compilation Index. It also stores logs of compiler engine executions. By default, these logs only include information about failed compilations.

You can retrieve information from the Compilation Index using a special REST endpoint on the gateway:

  • /testing/compilations: returns JSON containing all compilations performed in the life of the server

  • /testing/compilations?vmid=<VM_ID>: returns the compilation history for a single VM identified by VM_ID

  • /testing/diagnostic-dump: returns a ZIP archive containing both compilation metadata and compiler engine logs (including crashdumps and coredumps if present) for all compilations that the service ever performed

  • /testing/diagnostic-dump?vmid=<VM_ID>: the same as above, but only for a single VM identified by VM_ID

It is recommended to query these endpoints only with the vmid=<VM_ID> parameter, because returning information for the entire history of the service can degrade the performance of your Cloud Native Compiler service.
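
To make the per-VM recommendation easy to follow, the sketch below builds the endpoint URL with the vmid parameter always included. It is an illustration under stated assumptions: the function name cnc_testing_url, the gateway address http://gateway.example:8080, and the value my-vm-id are all hypothetical placeholders, and the helper does not URL-encode the VM_ID.

```shell
# Build a per-VM URL for one of the gateway's /testing endpoints.
# $1: gateway base URL (hypothetical, e.g. http://gateway.example:8080)
# $2: endpoint name: "compilations" or "diagnostic-dump"
# $3: VM_ID (used as-is, not URL-encoded)
cnc_testing_url() {
  printf '%s/testing/%s?vmid=%s\n' "${1%/}" "$2" "$3"
}

# Usage, with GATEWAY and VM_ID set for your environment:
#   curl -fsS "$(cnc_testing_url "$GATEWAY" compilations "$VM_ID")" -o compilations.json
#   curl -fsS "$(cnc_testing_url "$GATEWAY" diagnostic-dump "$VM_ID")" -o diagnostic-dump.zip
cnc_testing_url http://gateway.example:8080 compilations my-vm-id
# prints http://gateway.example:8080/testing/compilations?vmid=my-vm-id
```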