Comparing the performance of a Java application on different JVMs is not a simple task. The first question to ask yourself is, "what does performance mean for my application?" Typically, the answer will be either latency (response time) or throughput. Measuring latency is harder than measuring throughput, because many more factors can affect where latency occurs, and measurements can fail to account for all of them.
Let's see how you can assess the impact your platform has on application latency when comparing the Zing JVM with your existing one.
Running the DaCapo suite's h2 benchmark yields the following results for Oracle's HotSpot JVM (on the left) and Azul's Zing JVM (on the right). The DaCapo benchmark suite consists of a set of open source benchmarks with non-trivial workloads, designed to avoid the problems of microbenchmarks. The h2 benchmark runs a JDBCbench-like in-memory workload that executes a number of transactions against a model of a banking application.
These graphs were generated using the jHiccup tool. jHiccup is an open source tool designed to measure the platform pauses ("hiccups") that applications experience. The tool captures the aggregate effects of the Java Virtual Machine (JVM), operating system, the hypervisor (if used) and hardware on application stalls and response time.
Refer to Using jHiccup to Identify Pauses and Compare JVM Runtime Overhead for more information about using jHiccup and the HdrHistogramAnalyzer tool.
jHiccup adds a single thread to the application's execution environment that spends most of its time asleep, so that it is minimally intrusive on the application's performance. Every millisecond (the interval is configurable) the jHiccup thread wakes up and records the difference between the current time and the time at which it expected to wake up. Since the jHiccup thread has no relationship to the application code, only the platform's impact is measured; the application itself does not directly affect the measurement.
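The mechanism is simple enough to sketch in a few lines of plain Java. The sketch below is not jHiccup's actual implementation (jHiccup records its samples into an HdrHistogram, among other refinements); it is only an illustration of the idea, with class and method names of my own choosing: a thread sleeps for the sampling interval and records how much later than requested it actually woke up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class HiccupSketch {

    /**
     * Sleep for intervalMs repeatedly, recording how late each wake-up was.
     * Any lateness beyond the requested interval is a "hiccup" caused by
     * the platform (JVM, OS, hypervisor, hardware), not by application code.
     */
    static List<Long> recordHiccups(long intervalMs, int samples) throws InterruptedException {
        List<Long> hiccupsMs = new ArrayList<>(samples);
        for (int i = 0; i < samples; i++) {
            long before = System.nanoTime();
            Thread.sleep(intervalMs);
            long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - before);
            // Lateness = actual elapsed time minus the time we asked to sleep.
            hiccupsMs.add(Math.max(0, elapsedMs - intervalMs));
        }
        return hiccupsMs;
    }

    public static void main(String[] args) throws InterruptedException {
        List<Long> hiccups = recordHiccups(1, 100);
        long max = hiccups.stream().mapToLong(Long::longValue).max().orElse(0);
        System.out.println("Worst hiccup observed: " + max + " ms");
    }
}
```

Because the measuring thread does no application work, any lateness it observes can only come from the platform underneath, which is exactly the quantity the graphs plot.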
There are two graphs for each set of results. The upper ones show how 'late' the jHiccup thread was when it woke up; the lateness, in milliseconds, is shown on the y-axis. The x-axis records the length of time since the application started. An ideally performing platform with little or no impact on application performance would give a horizontal line very close to zero. The higher the points on the graph, the worse the application is being impacted by the underlying platform.
The lower graphs show the impact of the platform on the application in terms of percentiles, i.e. how much impact there is for a given percentage of the time. For example, the Oracle HotSpot graph shows that for 0.01% of the time (the 99.99th percentile) the application will experience an effect greater than or equal to one second (1000ms). If your application runs continuously for 24 hours, then for nearly nine seconds in total you will have to wait over a second to get a result, in addition to any time taken for the application to do its work.
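The arithmetic behind that "nearly nine seconds" figure is straightforward. The helper below (the class and method names are mine, not from any library) converts a percentile into the wall-clock time spent at or above the corresponding value: 0.01% of a 24-hour run (86,400 seconds) is about 8.64 seconds.

```java
public class PercentileMath {

    /**
     * Time (in seconds) during a run of runSeconds for which the
     * measured value is at or above the given percentile's value.
     * E.g. the 99.99th percentile leaves 0.01% of the run above it.
     */
    static double secondsAtOrAbove(double runSeconds, double percentile) {
        return runSeconds * (100.0 - percentile) / 100.0;
    }

    public static void main(String[] args) {
        // 24 hours = 86,400 seconds; 0.01% of that is about 8.64 seconds.
        System.out.println(secondsAtOrAbove(86_400, 99.99));
    }
}
```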
Clearly, there is a significant difference between the results from the HotSpot and Zing JVMs. Because the graphs have been normalised to use the same scaling on the y-axis, it is hard to see what values are associated with the Zing JVM as they are so close to zero. To make the data for the Zing JVM easier to read, we can use a smaller scale on the y-axis, as shown below.
Now we can see that most of the hiccups are in the 10-20ms range with one or two peaks hitting 30ms and the longest at approximately 43ms. Similarly, the percentile graph shows that for 99.99% of the time you will only need to wait a maximum of 20ms in addition to the application's processing time to get a result.
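The percentile graphs are derived from the recorded samples in essentially this way (a simplified sketch with names of my own choosing; jHiccup itself uses HdrHistogram for this): sort the samples and index into them by rank.

```java
import java.util.Arrays;

public class PercentileSketch {

    /**
     * Return the value at the given percentile of the samples,
     * using the nearest-rank method on a sorted copy.
     */
    static long valueAtPercentile(long[] samplesMs, double percentile) {
        long[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        // Nearest-rank: ceil(p/100 * N), converted to a zero-based index.
        int rank = (int) Math.ceil(percentile / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    public static void main(String[] args) {
        // Hypothetical hiccup samples in milliseconds, for illustration only.
        long[] hiccups = {0, 1, 1, 2, 2, 3, 5, 8, 20, 43};
        System.out.println(valueAtPercentile(hiccups, 90.0)); // prints 20
    }
}
```

A real histogram avoids storing and sorting every sample, but the percentile it reports means the same thing: the value that the stated fraction of samples did not exceed.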
Two tools are being used here: jHiccup to generate the results and HdrHistogramAnalyzer to display the graphs.
To assess the relative performance of your application on your existing JVM and on the Zing JVM, generate two sets of data in the same way as shown above.