Coordinated Restore at Checkpoint Exceptions
Checkpoint Exceptions
While a checkpoint is being created, the system can throw an exception if an open file or socket prevents the checkpoint from being created. When a checkpoint is created successfully, the JVM stops running (default behavior). If a checkpoint exception is thrown, the application continues to run.
The possible exceptions can be found in the sources of the project on GitHub.
CheckpointOpenSocketException
- The checkpoint cannot be created because there is a listening socket.
- The exception may be reported several times, once for each socket.
- The exception message specifies the socket's local and remote addresses and ports.
CheckpointOpenSocketException: tcp6 localAddr :: localPort 8080 remoteAddr :: remotePort 0
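Because the exception can be reported once per socket, a log may contain many such lines. The following is a minimal sketch of a helper that lists the distinct local ports involved. The function name `local_ports` is hypothetical, and the pattern assumes that every message follows the layout of the single example shown above; other message layouts would need a different pattern.

```shell
# local_ports (hypothetical helper): read log lines on stdin and print the
# localPort value of every CheckpointOpenSocketException line.
# Assumes the message layout shown in the example above.
local_ports() {
    # Keep only matching lines and reduce each one to its localPort number,
    # then sort numerically and drop duplicates.
    sed -n 's/.*CheckpointOpenSocketException:.* localPort \([0-9][0-9]*\) .*/\1/p' \
        | sort -un
}
```

For example, `local_ports < app.log` (where `app.log` is a placeholder for the application's own log file) prints one port per line. Each port can then be looked up with a tool such as `ss -ltn` to find the socket that has to be closed before the checkpoint.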
Restore Exceptions
Additional Mappings Needed for Large Applications
The JVM may need more virtual memory areas than the Linux default of 65530 to restore a checkpoint of a large application. A value of 10,000,000 (ten million) is recommended for systems with a RAM size of up to 2.5 TBytes (2560 GBytes). To set the recommended value, complete the following steps:
Add the following line to the file /etc/sysctl.conf:
vm.max_map_count=10000000
To activate the setting without a reboot, run:
sudo sysctl -p
To check the setting on any system, type:
cat /proc/sys/vm/max_map_count
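The check can also be scripted, for example as a step before a restore. The following is a minimal sketch: the function name `check_max_map_count` and its optional file argument are illustrative assumptions, and the threshold is the recommended value given above.

```shell
# Recommended value from the text above.
RECOMMENDED=10000000

# check_max_map_count [FILE] (hypothetical helper)
# Reads the current limit from FILE (default: /proc/sys/vm/max_map_count)
# and reports whether it reaches the recommended value.
# Returns 0 if the limit is sufficient, 1 if it is too low,
# and 2 if the value cannot be read.
check_max_map_count() {
    file="${1:-/proc/sys/vm/max_map_count}"
    current=$(cat "$file") || return 2
    if [ "$current" -ge "$RECOMMENDED" ]; then
        echo "OK: vm.max_map_count=$current"
    else
        echo "LOW: vm.max_map_count=$current (recommended: $RECOMMENDED)"
        return 1
    fi
}
```

Calling `check_max_map_count` without an argument reads the live value on a Linux system. The optional file argument exists only so that the function can be tried against a sample file.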