Visit Azul.com Support

Safepoint Profiler

Table of Contents

Looking for Zing? The Azul Zing Virtual Machine is now Azul Zulu Prime Builds of OpenJDK and part of Azul Platform Prime. Learn more.

The Safepoint Profiler is a lightweight profiling tool that is built into the Azul Zulu Prime Builds of OpenJDK (Azul Zulu Prime JVM). It reports information about threads that are slow to reach safepoint. This is referred to as Time To Safepoint (TTSP).

All JVM’s use safepoints to bring all of the application threads to a known state so the JVM can perform certain operations. Safepoints are used during Garbage Collection, during JIT compilation, for Thread Dumps, and many other operations. When a safepoint call is issued all of the application threads should "come to safepoint" as fast as possible. Threads that have come to safepoint block until the JVM releases them. Once all of the threads are at safepoint, the JVM performs the operation — GC, compile, thread dump, etc. — and then releases all the threads to run again. But when one or more application threads take a long time to come to safepoint, all of the other threads, which are now blocked, have to wait for the tardy thread(s).

Azul Zulu Prime JVM’s Safepoint Profiler will identify those tardy threads and help you to figure out why they are slow to reach safepoint.

Safepoint Profiler Overview

Safepoint Profiler is an in-process profiling tool that collects stacks of the threads that were late to safepoint at frequent intervals asynchronously and also collects other system related information like the number of cores assigned to the thread. These logs can be collected and analyzed later to find the reason for the high TTSP issue.

Safepoint Profiler is designed to have the minimal possible impact on the Azul Zulu Prime JVM performance. The tool uses Linux signals to collect asynchronous stack traces of “slow” threads. The waiting VMThread sends RealTime signals (the “TTSP signal”) to slow Java threads after exceeding the profiling threshold (tunable by flags). System level attributes are also recorded for the slow threads (wchan, thread state, etc). These attributes are stored in private buffers assigned to slow Java threads. When a slow JavaThread receives the TTSP signal, it records its stacks into its buffer. At the end of safepoint, the VMThread walks all the Java threads in the process and prints out the buffers of the corresponding thread if available.

To make TTSP overhead as lightweight as possible, there is no need to wait until slow Java threads respond to the signal that was sent by the VMThread. Since the Linux kernel sometimes delays delivering signals to a thread, it is entirely possible for a slow JavaThread to receive the signal after the whole safepoint is complete. An additional stack recording mechanism called "synchronous stack" serves as an extra stack that indicates the path from where the JavaThread gave up the jvm_lock (while reaching Safepoint). This way ensures there is at least one stack even if the signal delivery mechanism was not fast enough and didn’t record any asynchronous stack.

Synchronous stack can be identified among the other stacks of a thread by looking at the top two frames in the StackTrace. It would indicate the path through which a JavaThread gave up the jvm_lock.

 
#0 0x00000000209f41d5 SafepointProfilerBuf::record_sync_stack(JavaThread*) #1 0x0000000020a35d68 JavaThread::safepoint_profiler_record_sync_stack() #2 0x000000002067a1c6 JavaThread::jvm_unlock_self() #3 0x00000000206bdd62 JavaThread::poll_at_safepoint() #4 0x0000000020a36fab JavaThread::poll_at_safepoint_static(JavaThread*) #5 0x00000000300068ba StubRoutines::safepoint_handler #6 0x000000003000b43a ContendedClass.run()V #7 0x00000000319a65ca interpreter #8 0x0000000030003035 StubRoutines::call_stub

Configuring Safepoint Profiler

Use the following flags to configure the Safepoint Profiler:

Configuring option Description

-XX:+SafepointWaitTimeProfiler

This is the flag that controls the turning on and off whole profiling mechanism.

-XX:SafepointWaitTimeProfilerLog=<log file>

By default the profiling information is logged onto the stderr, this flag redirects the output to a file.

-XX:SafepointProfilerThreshold

This is a threshold in percentage of safepoint/ checkpoint that controls when to activate profiling. So, for the current default value of SafepointTimoutDelay 50000, we would start profiling the threads once we cross 1000 ms (2% of timeout). The default value of this flag is 2.

-XX:SafepointProfilerInterval

This is the interval in percentage of safepoint/ checkpoint that controls the frequency at which you would like to collect multiple stack samples for a TTSP issue. The default value is 10 and minimum value is 5%.

-XX:SafepointProfilerThresholdMS

Specifies the number of milliseconds, which the VMThread waits before sending the first, second, third, fourth, and fifth backtrace request for each laggard thread. Using this flag replaces the use of SafepointProfilerThreshold, which applies to both checkpoints and safepoints. After the fifth backtrace without reaching [safe/check] point, subsequent wait intervals are computed by using the SafepointProfilerInterval value.

Understanding Safepoint Profiler Output

To review a sample output of the Safepoint Profiler, run the command:

 
$taskset -c 0 java -XX:SafepointTimeoutDelay=10000 -XX:CheckpointTimeoutDelay=10000 -XX:SafepointProfilerThreshold=1 -XX:SafepointProfilerInterval=1 -XX:-DieOnSafepointTimeout <Test program> This command prints the following details: Detected TTSP issue: start: 31.259 wait: 17.526 Dumping stack for thread 0x0000440005e00000 "Thread-1" id: 30703 prio: 5 os_prio: 0 sched: SCHED_OTHER allowed_cpus: 000000000ffffff lock_release: 31.276 last_cpu: 20 cpu_time: 17563 0 0x00000000209f41d5 SafepointProfilerBuf::record_sync_stack(JavaThread*) 1 0x0000000020a35d68 JavaThread::safepoint_profiler_record_sync_stack() 2 0x000000002067a1c6 JavaThread::jvm_unlock_self() 3 0x00000000206bdd62 JavaThread::poll_at_safepoint() 4 0x0000000020a36fab JavaThread::poll_at_safepoint_static(JavaThread*) 5 0x00000000300068ba StubRoutines::safepoint_handler 6 0x000000003000b43a ContendedClass.run()V 7 0x00000000319a65ca interpreter 8 0x0000000030003035 StubRoutines::call_stub signal_sent: 31.271 signal_responded: 31.276 state: R wchan: 0 last_cpu: 20 cpu_time: 17563 0 0x000000003000b382 ContendedClass.run()V 1 0x00000000319a65ca interpreter 2 0x0000000030003035 StubRoutines::call_stub

The following table describes each field of the output.

Field Description

Detected TTSP issue: start: 31.259 wait: 17.526

This is the header that indicates the start of the logging for specific TTSP issue: * start – The VM uptime, this time is used to match against the GC log file to find the corresponding SPS. * wait – The actual wait time in milliseconds as seen in the SPS line.

Dumping stack for thread 0x0000440005e00000

This is the header for each thread that had TTSP and whose stack will be logged.

"Thread-1" id: 30703 prio: 5 os_prio: 0 sched: SCHED_OTHER allowed_cpus: 000000000ffffff

Thread-1 – The thread name.

  • id – LWP of the thread.

  • prio – Java level priority

  • os_prio – OS level priority

  • sched – Scheduler chosen for the thread.

  • allowed_cpus – The number of the CPUs that the thread is allowed to run. Each bit represents a CPU.

lock_release: 31.276

This line is printed only for the synchronously collected stack. The time here means the time this JavaThread actually gave up the lock. But this does not mean that all the threads are at safepoint at this point in time, since there could be other threads that would have taken longer to reach.

signal_sent: 31.271 signal_responded: 31.276

Applicable to only asynchronously collected stacks.

  • signal sent – The time when VMThread issued a signal to this thread to record its stack.

  • signal responded – The time at which the thread actually responded (received) the signal to dump the stack.

If there is a huge difference in the time stamps of these fields, this could possibly indicate the following two among other unknown possibilities:

  • The thread was not scheduled hence the kernel couldn’t deliver the signal immediately. This could possibly hint at starvation issues.

  • The thread was in uninterrupted mode, so the signal was in the pending signal list. Examples: Page fault handler.

state: R wchan: 0

This line is applicable only for asynchronously collected stacks:

  • state – State of the thread when the signal was sent from the VMThread.

  • wchan – Waiting channel state if the thread was waiting for something.

last_cpu: 20 cpu_time: 17563

last cpu – The last CPU the thread was known to be running on.

cpu time – Prints the clock CPU time of the thread.