Coordinated Restore at Checkpoint Usage Guidelines
Your application may be ready for CRaC out of the box. The simplest way to find out is to try generating a checkpoint. If you get a CheckpointException (which describes the problematic state in the application), or if you want more control, see the "Implementing the CRaC Resource" section.
Using CRaC in Your Application
Adding the CRaC API
The org.crac library is designed to make CRaC adoption smooth. Add it as a dependency to build an application that uses the CRaC API and can run on Java runtimes both with and without a CRaC implementation.
You can find the library in the Maven Repository.
Maven
<dependency>
  <groupId>org.crac</groupId>
  <artifactId>crac</artifactId>
  <version>${crac.version}</version>
</dependency>
Functionality
At runtime, org.crac uses reflection to detect a CRaC implementation. When one is available, all requests to org.crac are passed to that implementation. Otherwise, requests are forwarded to a dummy implementation.
The dummy implementation allows an application to run, but not to use CRaC:
- Resources can be registered for notification.
- A checkpoint request fails with an exception.
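The detection step described above can be sketched in plain Java. This is a minimal illustration and not the actual org.crac source: the class name CracDetector is hypothetical, and the candidate class names jdk.crac.Core and javax.crac.Core are assumptions about what a CRaC-enabled runtime might expose.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of reflection-based detection of a CRaC implementation. */
final class CracDetector {

    // Assumption: class names a CRaC-enabled runtime might provide.
    static final List<String> CANDIDATES = List.of("javax.crac.Core", "jdk.crac.Core");

    /** Returns the first candidate class name that can be loaded, if any. */
    static Optional<String> detect(List<String> candidates) {
        for (String name : candidates) {
            try {
                // Load without initializing; we only want to know whether it exists.
                Class.forName(name, false, CracDetector.class.getClassLoader());
                return Optional.of(name);
            } catch (ClassNotFoundException | LinkageError e) {
                // Not present on this runtime; try the next candidate.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String result = detect(CANDIDATES)
                .map(name -> "CRaC implementation found: " + name)
                .orElse("No CRaC implementation found; a dummy implementation would be used");
        System.out.println(result);
    }
}
```

On a regular JDK the sketch reports that no implementation was found, which corresponds to the dummy-implementation path.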
Implementing the CRaC Resource
To use the API, identify all classes in your code that act as "resources": classes that must be notified when a checkpoint is about to be made and when a restore has happened. The API provides an eponymous interface, Resource, which these classes must implement. It has only two methods, beforeCheckpoint() and afterRestore(), which the JVM uses as callbacks.
package my.app;

import org.crac.Context;
import org.crac.Core;
import org.crac.Resource;

public class MyClass implements Resource {

    public MyClass() {
        // Register with the global Context so the JVM calls the callbacks below.
        Core.getGlobalContext().register(this);
    }

    @Override
    public void beforeCheckpoint(Context<? extends Resource> context) {
        /* ... */
    }

    @Override
    public void afterRestore(Context<? extends Resource> context) {
        /* ... */
    }
}
The CRaC JavaDoc is available here.
Example use case:
- If a class reads configuration from a file, the file must be closed in the beforeCheckpoint() method.
- In the afterRestore() method, the file can be opened again to check for configuration updates.
The same applies to network connections, and you can also use the methods to deal with a sudden change in the system clock, which might impact things like cache timeouts.
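The configuration-file use case can be sketched as follows. To keep the example runnable without the org.crac dependency, it declares a local stand-in interface, CheckpointAware, that only mirrors the two callback names; this interface, the ConfigFile class, and the demo() helper are hypothetical. In a real application the class would implement org.crac.Resource, whose callbacks also receive a Context argument, and would register itself with a Context.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in that mirrors the callback names of org.crac.Resource. */
interface CheckpointAware {
    void beforeCheckpoint() throws Exception;
    void afterRestore() throws Exception;
}

/** Keeps a configuration file open, except while a checkpoint is taken. */
final class ConfigFile implements CheckpointAware {
    private final Path path;
    private BufferedReader reader;   // open handle; must be closed before a checkpoint
    private String firstLine;

    ConfigFile(Path path) throws IOException {
        this.path = path;
        open();
    }

    private void open() throws IOException {
        reader = Files.newBufferedReader(path);
        firstLine = reader.readLine();
    }

    String firstLine() { return firstLine; }

    boolean isOpen() { return reader != null; }

    @Override
    public void beforeCheckpoint() throws IOException {
        if (reader != null) {
            reader.close();          // release the file before the checkpoint
            reader = null;
        }
    }

    @Override
    public void afterRestore() throws IOException {
        open();                      // reopen and pick up configuration updates
    }

    /** Simulates one checkpoint/restore cycle by calling the callbacks directly. */
    static String demo() {
        try {
            Path file = Files.createTempFile("app", ".conf");
            Files.writeString(file, "mode=a\n");
            ConfigFile config = new ConfigFile(file);
            String before = config.firstLine();

            config.beforeCheckpoint();
            String state = config.isOpen() ? "open" : "closed";

            Files.writeString(file, "mode=b\n");   // configuration changes in the meantime
            config.afterRestore();
            String after = config.firstLine();

            config.beforeCheckpoint();             // close again before cleaning up
            Files.delete(file);
            return before + " -> " + state + " -> " + after;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints "mode=a -> closed -> mode=b"
    }
}
```

The same shape would apply to a network connection: drop it in the first callback and re-establish it in the second.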
All Resources in the application must be registered with the JVM, which can be achieved by obtaining a CRaC Context and using its register() method. Although you can create your own Context, the simplest way is to use the global Context obtained via the Core class’s static getGlobalContext() method.
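It’s important to register the Resources in the right order, because the registration order determines the order of the callbacks. For the global Context, the CRaC JavaDoc specifies that the beforeCheckpoint methods are called in the reverse order of registration, and that the afterRestore methods are called in the opposite order again, that is, in the order of registration. This approach simplifies things if there is a particular sequence in which things need to be prepared for a checkpoint; when restoring, there is a predictable inverse sequence.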
The API also provides the Core.checkpointRestore() method to create a checkpoint programmatically from within your application. Add this call at the point in your program flow where you want the checkpoint to be created. The method returns when the restore is completed. A checkpoint can also be requested with jcmd <PID or your_app.jar> JDK.checkpoint at any moment during the lifetime of your application, as described in Generating a Checkpoint further on this page.
Running an Application With CRaC
Currently, the full CRaC functionality is only available on Linux/x64 and Linux/ARM64, in versions 17, 21, and 22 of Azul Zulu Builds of OpenJDK. This means that, for now, you can run an application that uses the CRaC API on any system thanks to the org.crac dependency, but the CRaC functionality in the JVM can be fully used only on the specified operating systems.
As you can see in the January 2024 Release Notes, downloads of Zulu with CRaC support are also available for Windows and macOS, but only for development purposes. These runtimes let you simulate the CRaC functionality: when you request a checkpoint, it is created and immediately restored without dumping the checkpoint to disk. This enables you to develop and test the CRaC functionality on these platforms, so you can deploy your application with confidence on Linux.
Note: Some VMs, like Parallels, cannot run foreign CPU instructions, and you need a build matching your CPU. Some other virtualization environments, like WSL, do not provide the complete feature set of the Linux kernel, which limits the CRaC functionality in the current version.
Detailed instructions per type of environment:
Running CRaC on Linux
Versions 17, 21, and 22 of Azul Zulu Builds of OpenJDK with CRaC functionality are available as of the April 2023 release. Only the bundles with -crac- in the name have the integrated CRaC functionality.
Download a Runtime
On the download section of the Azul website, you can use the "Java Package" > "JDK CRaC" filter.
Note: The JDK archive should be extracted with sudo.
$ sudo tar zxf <jdk>.tar.gz
Generating a Checkpoint
Start the JVM with the additional flag -XX:CRaCCheckpointTo so that it is prepared to create a checkpoint:
java -XX:CRaCCheckpointTo=$HOME/crac-image/ -jar my_app.jar
Note: This eventually generates a set of files in the given directory, whose cumulative size is roughly the size of the JVM resident memory.
Then you can trigger the checkpoint from outside the JVM, using jcmd with JDK.checkpoint:
jcmd my_app.jar JDK.checkpoint
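If the checkpoint is triggered from a Java-based test harness or build step rather than from a shell, the same jcmd invocation can be started with ProcessBuilder. The sketch below is only an illustration: the CheckpointTrigger class and its method names are hypothetical, and it assumes jcmd is on the PATH of the process that runs it.

```java
import java.io.IOException;
import java.util.List;

/** Hypothetical helper that runs "jcmd <pid-or-name> JDK.checkpoint". */
final class CheckpointTrigger {

    /** Builds the jcmd command line shown above for the given PID or main class/JAR name. */
    static List<String> checkpointCommand(String pidOrName) {
        return List.of("jcmd", pidOrName, "JDK.checkpoint");
    }

    /** Runs the command, forwarding its output, and returns the exit code. */
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: CheckpointTrigger <pid-or-name>");
            return;
        }
        System.exit(run(checkpointCommand(args[0])));
    }
}
```

The target JVM must have been started with -XX:CRaCCheckpointTo, as described above, for the request to produce an image.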
Running CRaC in a Container (Docker)
Creating a Docker Image
- You need an application in a runnable JAR file.
- Check which Zulu Docker Image version you want to use on Docker Hub.
- Create a Dockerfile, based on the following minimum file (using Zulu 21 in this example):
  FROM azul/zulu-openjdk:21-jdk-crac-latest
  COPY build/libs/my_app.jar /opt/app/my_app.jar
- Build the Docker image with:
  docker build -t my_app_on_crac .
Starting an Application in a Docker Container
- Run a Docker container with:
  docker run -it --cap-add=CHECKPOINT_RESTORE --cap-add=SYS_PTRACE --rm --name my_app_on_crac \
    -v $PWD/crac-files:/opt/crac-files my_app_on_crac \
    java -XX:CRaCCheckpointTo=/opt/crac-files -jar /opt/app/my_app.jar
  Note: To restore without additional capabilities (see below), you should make Java the PID 1 process. See Using the CRaCMinPid Option for details of the impact on the resulting PID.
- Leave the shell window open and the application running.
To avoid possible PID conflicts on restore, see the Resolving PID Conflicts chapter.
Creating the Checkpoint
- Open another shell window.
- In this window run:
  docker exec my_app_on_crac jcmd PID-OR-NAME JDK.checkpoint
  Note: You can find the PID or NAME to provide to jcmd by executing just jcmd in the container.
- If everything is OK, you see in the first shell window that the checkpoint was created and your application was closed.
Creating a Docker Image with Checkpoint
- Create a Dockerfile with the checkpoint:
  FROM my_app_on_crac
  COPY crac-files /opt/crac-files
- Build the Docker image with:
  docker build -t my_app_on_crac_restore .
Run the Docker Container From the Checkpoint
- Run:
  docker run -it --rm my_app_on_crac_restore java \
    -XX:CRaCRestoreFrom=/opt/crac-files
- Your application now starts much faster from the saved checkpoint.
Note: You can also run the Docker container on macOS or Windows, as long as the machine you are running it on has an x64 CPU architecture (Intel/AMD).
Running CRaC on Windows or macOS
You can test your CRaC application on Windows or macOS with one of the following approaches:
- Run your application in a containerized or virtualized Linux system, for example by following the Docker approach described above. On macOS, for instance, you can use tools such as Docker, Podman, Parallels, or VirtualBox.
- Use an Azul Zulu Build of OpenJDK with CRaC support for development purposes, available for Windows and macOS. These builds provide a simulated checkpoint/restore mechanism to be used for development and testing.
Using Image Compression on Checkpoint
CRaC has built-in compression for checkpoints. Use it when you need to save disk space in return for a relatively small increase in restore time.
To enable image compression, you must start the JVM with the additional option -XX:+CRaCImageCompression, for example:
java -XX:CRaCCheckpointTo=cr-dir -XX:+CRaCImageCompression -jar my_app.jar
A checkpoint created with this option can be restored without any additional options:
java -XX:CRaCRestoreFrom=cr-dir
Example With Compression
You can use the following test application to compare the impact of the compression:
public class TestCracCompression {
    public static void main(String[] args) throws InterruptedException {
        int cnt = 0;
        while (true) {
            System.out.println(cnt++);
            Thread.sleep(1000);
        }
    }
}
This is the output of the checkpoint creation without compression:
$ java -XX:CRaCCheckpointTo=cr TestCracCompression.java
...
$ jcmd -l | grep TestCracCompression
52794 jdk.compiler/com.sun.tools.javac.launcher.SourceLauncher TestCracCompression.java
$ jcmd 52794 JDK.checkpoint
52794:
CR: Checkpoint ...
$ ls -sh cr/pages-1.img
49M cr/pages-1.img
$ java -XX:CRaCRestoreFrom=cr
This is the output of the checkpoint creation with compression:
$ java -XX:CRaCCheckpointTo=cr -XX:+CRaCImageCompression TestCracCompression.java
...
$ jcmd -l | grep TestCracCompression
53091 jdk.compiler/com.sun.tools.javac.launcher.SourceLauncher TestCracCompression.java
$ jcmd 53091 JDK.checkpoint
53091:
CR: Checkpoint ...
$ ls -sh cr/pages-1.comp.img
12M cr/pages-1.comp.img
$ java -XX:CRaCRestoreFrom=cr
As you can see in the output above, the size of the memory pages file in the checkpoint for this test application drops from 49 MB to 12 MB.
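The ls commands above look at a single pages file. To compare the total on-disk size of two checkpoint directories, the file sizes can be summed, as in the following sketch. The CheckpointSize class is hypothetical and uses only standard java.nio.file calls; the default directory name cr is taken from the example above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

/** Hypothetical helper that totals the size of all files in a checkpoint directory. */
final class CheckpointSize {

    /** Sums the sizes, in bytes, of all regular files under the given directory. */
    static long totalBytes(Path dir) {
        try (Stream<Path> files = Files.walk(dir)) {
            return files.filter(Files::isRegularFile)
                    .mapToLong(file -> {
                        try {
                            return Files.size(file);
                        } catch (IOException e) {
                            throw new UncheckedIOException(e);
                        }
                    })
                    .sum();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Directory to inspect; defaults to "cr" as used in the example above.
        Path dir = Path.of(args.length > 0 ? args[0] : "cr");
        if (!Files.isDirectory(dir)) {
            System.out.println("No such directory: " + dir);
            return;
        }
        System.out.printf("%s: %.1f MiB%n", dir, totalBytes(dir) / (1024.0 * 1024.0));
    }
}
```

Running it once against a directory created without compression and once against one created with -XX:+CRaCImageCompression gives a directory-level comparison.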
Example Code
Within the CRaC project on GitHub, a fully documented "Step-by-step CRaC support for a Jetty app" is provided.