Tuning and Troubleshooting Compilers
If you have an application performance problem and you suspect that you have a problem with the Azul Zing Builds of OpenJDK (Zing) runtime compiler, C1, C2, or Falcon, you will need to collect information about what is executing. The first step is to collect the high-level information about what is running.
-
GC log file and GC Log File Analyzer
-
ZVision and ZVRobot
-
Compiler-specific metric collection
-
Taking advantage of the GC fixup phase
GC log file and GC Log File Analyzer
The Zing Garbage Collector (GC) log file contains information about not only the garbage collections but also system performance. Enable printing of the GC log file by adding the command-line option: -Xloggc:<filename>_%t_%p.out
. Use $JAVA_HOME/bin/java -jar $JAVA_HOME/etc/gcLogAnalyser*jar –f <gcLogFile>
to read the GC log file and check the performance information collected and displayed in the Process and System sets of graphs.
Naturally, you can also check the overhead of garbage collections and the impact that safepoint pauses have on application throughput. In the GC and Safepoints - Pause Duration graph, check the frequency and duration of the compiler runtime’s Deoptimize Pause
time. These pauses indicate that the compiler runtime had been executing native code generated from bytecode, but an assumption made at compile time was violated. This required that the method’s execution abandon that compiled version of the method. The execution resumes in interpreted or a C1 compiled version of the method.
ZVision and ZVRobot
Note
|
As of Azul Platform Prime 23.08.0.0, ZVision and ZVRobot components have been deprecated and are no longer supported. The information in this section is only informative and is planned for removal. |
ZVision and ZVRobot work with Zing to expose the lower level details about the runtime – both the JVM’s and the application’s.
The questions to be answered:
-
What methods and functions are executing?
-
How much CPU time is consumed executing those methods?
-
What is the call chain to the methods consuming most of the CPU time?
The first step is to add the command line options to the command line for the ZVM:
-XX:ARTAPort=<port-number> -XX:+UseTickProfiler
You can connect directly with the ZVM using the ZVision server (run the $JAVA_HOME/bin/zvision
program and then connect with your browser to the machine and port for ZVision).
Alternatively, you can use ZVRobot to collect the same information.
To set up ZVRobot, extract the contents of the etc/ZVRobot.zip
file to a directory:
-
ZVRobot.prop
– the property file that instructs ZVRobot to collect metrics. -
ZVRobot-<version>.jar
– the ZVRobot archive. -
README
– overall instructions on how to run ZVRobot. -
startsample.sh
– simplified script for running ZVRobot.
Each of the files, README
, ZVRobot.prop
, and startsample.sh
contain instructions on how to modify files and run ZVRobot.
Modify the ZVRobot.prop
file for your environment. First, the host and ARTAPort for the ZVM you want to monitor.
Set the following properties:
ZVRobotVars.HOST=YouMustSet_TheZingVM_Host_IPAddress_In_ZVRobot.prop
ZVRobotVars.PORT=YouMustSet_TheZingVM_Host_PortNumber_In_ZVRobot.prop
To start, use the default values for the following:
-
ZVRobotVars.THSIZE=10000
– Number of data sample sizes to collect; mostly for the thread stacks number of threads. -
ZVRobotVars.SECONDS=60
– Seconds between samples. -
ZVRobotVars.COUNT=-1
– Continue collecting until the ZVRobot process is killed.
The following captures additional information for the top FETCHCOUNT
entries in the tick profile page. The additional information for each is the machine instruction page that contains the instructions generated by the compiler for each method. In addition, the machine instruction page is annotated with the CPU use at each instruction (ticks).
The third page for each method is the “callers of this method” page, displayed as a call graph sorted on the CPU consumption on each edge. You may want to increase FETCHCOUNT
to 10 to get more information about the top methods executing.
Here is what is in the ZVRobot.props file:
ZVRobotVars.TickProfilePage.FETCHCOUNT=5
You will not be using this:
ZVRobotVars.PollOpsPage.FETCHCOUNT=5
Further down in the file we have the pages that will be collected, represented as URIs to the servlets that provide the XML pages.
Use these, as is:
ZVRobotPage.CONFIG_PROCESS=http://${HOST}:${PORT}/config/process ${SECONDS}
ZVRobotPage.ENVIRONMENT_VARIABLES=http://${HOST}:${PORT}/config/environment_variables ${SECONDS} 1
The most important one for this analysis:
ZVRobotPage.TICKS_PROFILE_DEFAULTVIEW=http://${HOST}:${PORT}/ticks/profile ${SECONDS} ${COUNT}
Selecting these depends on how invasive you want to be for the information.
The stack trace is very expensive. You can comment them out or reduce the frequency by replacing ${SECONDS}
with 120, 180, … or even larger values.
ZVRobotPage.THREADS_LIST=http://${HOST}:${PORT}/threads/list?byname=&bystatus=any&stride=${THSIZE} ${SECONDS} ${COUNT} ZVRobotPage.THREADS_STACK_TRACE=http://${HOST}:${PORT}/threads/stack_trace?detail=low&byname=&bystatus=any&stride=${THSIZE} ${SECONDS} ${COUNT}
The lock information is cumulative so you do not need to have fine granularity.
ZVRobotPage.THREADS_MONITOR_CONTENTION=http://${HOST}:${PORT}/monitors/contention ${SECONDS} ${COUNT}
This information is in the GC log file, so you can comment them out as shown:
# ZVRobotPage.MEMORY_SUMMARY=http://${HOST}:${PORT}/memory/summary ${SECONDS} ${COUNT}
# ZVRobotPage.MEMORY_GC_SUMMARY=http://${HOST}:${PORT}/memory/gc_summary ${SECONDS} ${COUNT}
# ZVRobotPage.MEMORY_GC_HISTORY=http://${HOST}:${PORT}/memory/gc_history ${SECONDS} ${COUNT}
Compiler-specific Metric Collection
Zing provides a number of logging options to collect metrics about the behavior of the compilers.
To organize your data collection efforts, the first step is to supply the options to add the metrics to a file. The logging command line options provide you that capability. Here is the set of options to use to monitor the activity of the compilers (both Tier 1 and Tier 2) and write that information to a log file:
-XX:+PrintCompilation -XX:+PrintCompileDateStamps -XX:-DisplayVMOutput -XX:+LogVMOutput -XX:LogFile=compile_%t_%p.log
Explanation:
-
-XX:+PrintCompilation
– Prints compiler output about the methods being compiled and installed in the CodeCache -
-XX:+PrintCompileDateStamps
– Optional displays of the date stamp; the time stamp is automatically added -
-XX:-DisplayVMOutput
– Does not display all output on the tty -
-XX:+LogVMOutput
– Saves the VM output to the file specified with–XX:LogFile
-
-XX:LogFile=compile_%t_%p.log
– Writes to the file with the name in the format:compile_<date>_<time>_pid<PID>.log
An example of the output file’s name: compile_2017-07-30_17-32-03_pid20137.log
A sample of the compiler logging to the file:
<writer thread='20166'/>
2017-07-30T17:32:03.437-0700: 0.437: 20 3 java.lang.String::hashCode ()I (55 score) (55 bytes) 0.437879
<writer thread='20166'/>
2017-07-30T17:32:03.510-0700: 0.510: 20 3 installed at 0x3000ca50 with size 0x12d 0.510984
The following describes the information in the line printed when the compiler starts compiling the method:
Date and time: 2017-07-30T17:32:03.437-0700:
Logging elapsed time stamp: 0.437:
Overall compile sequence number: 20
Additional information about the compiler:
Tier compiler (C1, C2, or Falcon): 3
Class: java.lang.String::hashCode
Signature: ()I
Compiler accounting information: (55 score) (55 bytes)
Accurate elapsed time stamp: 0.437879
Explanation of “Additional information about the compile”:
The encodings in this field specify attributes about the method being compiled:
% On-stack-replacement compile
! Method includes exception handling
The compilers make assumptions about the class hierarchy and the values of frequently used variables, if possible, to provide higher levels of optimization. At the beginning of an application’s execution, the compilers try to compile methods quickly to improve performance, but sometimes these assumptions are invalidated. The compiler runtime must recompile the method with the new environmental assumptions. The task of switching execution from the compiled code for that method to either the interpreter or a method with a lower level of optimization is called deoptimization.
To track deoptimizations, use the command line option:
-XX:+TraceDeoptimization
Inlining called methods in the caller is one of the most beneficial optimizations done by the compilers. Call overhead to the called methods is eliminated and the inlining provides a larger block of code over which optimizations can be performed.
To see the inlining decisions made by the compiler, use the command line option:
-XX:+PrintInlining
A warning – the output generated when using this option can be very verbose.
Taking Advantage of the GC Fixup Phase
The Garbage Collector has an additional (optional) fixup phase that fixes all references in the heap to point to their proper targets, shifting more of the fixup work to the collector, and allowing generated code to potentially avoid certain checks between collections.
Without this optional fixup pass, references are still fixed, but the fixup is done lazily, and will completed by the next GC cycle. To enable the fixup pass, use the -XX:+GPGCDoFixupPasses
command-line option.