Visit Azul.com Support

Tuning and Troubleshooting Compilers

Table of Contents

Looking for Zing? The Azul Zing Virtual Machine is now Azul Zulu Prime Builds of OpenJDK and part of Azul Platform Prime. Learn more.

If you have an application performance problem and you suspect that you have a problem with the Azul Zulu Prime Builds of OpenJDK (Azul Zulu Prime JVM) runtime compiler, C1, C2, or Falcon, you will need to collect information about what is executing. The first step is to collect the high-level information about what is running.

  • GC log file and GC Log File Analyzer

  • ZVision and ZVRobot

  • Compiler-specific metric collection

  • Taking advantage of the GC fixup phase

GC log file and GC Log File Analyzer

The Azul Zulu Prime JVM Garbage Collector (GC) log file contains information about not only the garbage collections but also system performance. Enable printing of the GC log file by adding the command-line option: -Xloggc:<filename>_%t_%p.out. Use $JAVA_HOME/bin/java -jar $JAVA_HOME/etc/gcLogAnalyser*jar –f <gcLogFile> to read the GC log file and check the performance information collected and displayed in the Process and System sets of graphs.

Naturally, you can also check the overhead of garbage collections and the impact that safepoint pauses have on application throughput. In the GC and Safepoints - Pause Duration graph, check the frequency and duration of the compiler runtime’s Deoptimize Pause time. These pauses indicate that the compiler runtime had been executing native code generated from bytecode, but an assumption made at compile time was violated. This required that the method’s execution abandon that compiled version of the method. The execution resumes in interpreted or a C1 compiled version of the method.

ZVision and ZVRobot

ZVision and ZVRobot work with Azul Zulu Prime JVM to expose the lower level details about the runtime – both the JVM’s and the application’s.

The questions to be answered:

  1. What methods and functions are executing?

  2. How much CPU time is consumed executing those methods?

  3. What is the call chain to the methods consuming most of the CPU time?

The first step is to add the command line options to the command line for the ZVM:

  -XX:ARTAPort=<port-number> -XX:+UseTickProfiler 

You can connect directly with the ZVM using the ZVision server (run the $JAVA_HOME/bin/zvision program and then connect with your browser to the machine and port for ZVision).

Alternatively, you can use ZVRobot to collect the same information.

To set up ZVRobot, extract the contents of the etc/ZVRobot.zip file to a directory:

  • ZVRobot.prop – the property file that instructs ZVRobot to collect metrics.

  • ZVRobot-<version>.jar – the ZVRobot archive.

  • README– overall instructions on how to run ZVRobot.

  • startsample.sh – simplified script for running ZVRobot.

Each of the files, README, ZVRobot.prop, and startsample.sh contain instructions on how to modify files and run ZVRobot.

Modify the ZVRobot.prop file for your environment. First, the host and ARTAPort for the ZVM you want to monitor.

Set the following properties:

 ZVRobotVars.HOST=YouMustSet_TheZingVM_Host_IPAddress_In_ZVRobot.prop
ZVRobotVars.PORT=YouMustSet_TheZingVM_Host_PortNumber_In_ZVRobot.prop
[source, shell,opts="user,check"] 

To start, use the default values for the following:

  • ZVRobotVars.THSIZE=10000 – Number of data sample sizes to collect; mostly for the thread stacks number of threads.

  • ZVRobotVars.SECONDS=60 – Seconds between samples.

  • ZVRobotVars.COUNT=-1 – Continue collecting until the ZVRobot process is killed.

The following captures additional information for the top FETCHCOUNT entries in the tick profile page. The additional information for each is the machine instruction page that contains the instructions generated by the compiler for each method. In addition, the machine instruction page is annotated with the CPU use at each instruction (ticks).

The third page for each method is the “callers of this method” page, displayed as a call graph sorted on the CPU consumption on each edge. You may want to increase FETCHCOUNT to 10 to get more information about the top methods executing.

Here is what is in the ZVRobot.props file:

 ZVRobotVars.TickProfilePage.FETCHCOUNT=5 

You will not be using this:

 ZVRobotVars.PollOpsPage.FETCHCOUNT=5 

Further down in the file we have the pages that will be collected, represented as URIs to the servlets that provide the XML pages.

Use these, as is:

 ZVRobotPage.CONFIG_PROCESS=http://${HOST}:${PORT}/config/process ${SECONDS}
ZVRobotPage.ENVIRONMENT_VARIABLES=http://${HOST}:${PORT}/config/environment_variables ${SECONDS} 1 

The most important one for this analysis:

 ZVRobotPage.TICKS_PROFILE_DEFAULTVIEW=http://${HOST}:${PORT}/ticks/profile ${SECONDS} ${COUNT} 

Selecting these depends on how invasive you want to be for the information.

The stack trace is very expensive. You can comment them out or reduce the frequency by replacing ${SECONDS} with 120, 180, … or even larger values.

 ZVRobotPage.THREADS_LIST=http://${HOST}:${PORT}/threads/list?byname=&bystatus=any&stride=${THSIZE} ${SECONDS} ${COUNT} ZVRobotPage.THREADS_STACK_TRACE=http://${HOST}:${PORT}/threads/stack_trace?detail=low&byname=&bystatus=any&stride=${THSIZE} ${SECONDS} ${COUNT}
[source, shell,opts="user,check"] 

The lock information is cumulative so you do not need to have fine granularity.

 ZVRobotPage.THREADS_MONITOR_CONTENTION=http://${HOST}:${PORT}/monitors/contention ${SECONDS} ${COUNT} 

This information is in the GC log file, so you can comment them out as shown:

 # ZVRobotPage.MEMORY_SUMMARY=http://${HOST}:${PORT}/memory/summary ${SECONDS} ${COUNT}
# ZVRobotPage.MEMORY_GC_SUMMARY=http://${HOST}:${PORT}/memory/gc_summary ${SECONDS} ${COUNT}
# ZVRobotPage.MEMORY_GC_HISTORY=http://${HOST}:${PORT}/memory/gc_history ${SECONDS} ${COUNT} 

Compiler-specific Metric Collection

The Azul Zulu Prime JVM provides a number of logging options to collect metrics about the behavior of the compilers.

To organize your data collection efforts, the first step is to supply the options to add the metrics to a file. The logging command line options provide you that capability. Here is the set of options to use to monitor the activity of the compilers (both Tier 1 and Tier 2) and write that information to a log file:

 -XX:+PrintCompilation -XX:+PrintCompileDateStamps -XX:-DisplayVMOutput -XX:+LogVMOutput -XX:LogFile=compile_%t_%p.log 

Explanation:

  • -XX:+PrintCompilation – Prints compiler output about the methods being compiled and installed in the CodeCache

  • -XX:+PrintCompileDateStamps – Optional displays of the date stamp; the time stamp is automatically added

  • -XX:-DisplayVMOutput – Does not display all output on the tty

  • -XX:+LogVMOutput – Saves the VM output to the file specified with –XX:LogFile

  • -XX:LogFile=compile_%t_%p.log – Writes to the file with the name in the format: compile_<date>_<time>_pid<PID>.log

An example of the output file’s name: compile_2017-07-30_17-32-03_pid20137.log

A sample of the compiler logging to the file:

 <writer thread='20166'/>
2017-07-30T17:32:03.437-0700: 0.437: 20 3 java.lang.String::hashCode ()I (55 score) (55 bytes) 0.437879
<writer thread='20166'/>
2017-07-30T17:32:03.510-0700: 0.510: 20 3 installed at 0x3000ca50 with size 0x12d 0.510984 

The following describes the information in the line printed when the compiler starts compiling the method:

Date and time: 2017-07-30T17:32:03.437-0700:

Logging elapsed time stamp: 0.437:

Overall compile sequence number: 20

Additional information about the compile:

Tier compiler (C1, C2, or Falcon): 3

Class: java.lang.String::hashCode

Signature: ()I

Compiler accounting information: (55 score) (55 bytes)

Accurate elapsed time stamp: 0.437879

Explanation of “Additional information about the compile”:

The encodings in this field specify attributes about the method being compiled:

% On-stack-replacement compile

! Method includes exception handling

The compilers make assumptions about the class hierarchy and the values of frequently used variables, if possible, to provide higher levels of optimization. At the beginning of an application’s execution, the compilers try to compile methods quickly to improve performance, but sometimes these assumptions are invalidated. The compiler runtime must recompile the method with the new environmental assumptions. The task of switching execution from the compiled code for that method to either the interpreter or a method with a lower level of optimization is called deoptimization.

To track deoptimizations, use the command line option:

 -XX:+TraceDeoptimization 

Inlining called methods in the caller is one of the most beneficial optimizations done by the compilers. Call overhead to the called methods is eliminated and the inlining provides a larger block of code over which optimizations can be performed.

To see the inlining decisions made by the compiler, use the command line option:

 -XX:+PrintInlining 

A warning – the output generated when using this option can be very verbose.

Taking Advantage of the GC Fixup Phase

The Garbage Collector has an additional (optional) fixup phase that fixes all references in the heap to point to their proper targets, shifting more of the fixup work to the collector, and allowing generated code to potentially avoid certain checks between collections.

Without this optional fixup pass, references are still fixed, but the fixup is done lazily, and will completed by the next GC cycle. To enable the fixup pass, use the -XX:+GPGCDoFixupPasses command-line option.

The JIT compiler has an optional optimization mode called "FixupBasedLvbCodeVersioning", which attempts to take advantage of the optional GC fixup phase. This mode can be enabled by using the -XX:+FixupBasedLvbCodeVersioning command-line option. When applied, generated code may avoid certain checks when GC cycles are not running, resulting in faster code. Setting -XX:+FixupBasedLvbCodeVersioning implicitly sets -XX:+GPGCDoFixupPasses.