Hello,

Upon filing a PR [1] with some initial support for OpenJDK CRaC [2][3] I was directed here to raise a KIP (I don't have the permissions on the wiki/JIRA to create the KIP page yet, though).

In a nutshell, CRaC intends to provide a way to checkpoint (snapshot) and persist a running Java application and later restore it, possibly on a different computer. This can be used to significantly speed up the boot process (from seconds or minutes to tens of milliseconds), or for live replication or migration of an already warmed-up application. This is not entirely transparent to the application: the application can register to be notified when a checkpoint or restore is happening, and sometimes has to assist to prevent unexpected state after restore, e.g. by closing network connections and files.
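
To make the notification idea concrete, here is a minimal, self-contained sketch in Java. The Resource interface and the Registry and Client classes are hypothetical stand-ins written for this illustration only (they are neither the org.crac API nor Kafka code), and the checkpoint itself is merely simulated by a method call:

```java
import java.util.ArrayList;
import java.util.List;

public class CheckpointSketch {

    /** Hypothetical callback interface; a stand-in, not the org.crac API. */
    interface Resource {
        void beforeCheckpoint();
        void afterRestore();
    }

    /** Hypothetical registry that notifies registered resources. */
    static final class Registry {
        private final List<Resource> resources = new ArrayList<>();

        void register(Resource resource) {
            resources.add(resource);
        }

        /** Simulates a checkpoint immediately followed by a restore. */
        void simulateCheckpointAndRestore() {
            for (Resource r : resources) {
                r.beforeCheckpoint();
            }
            // A real checkpoint would persist the process here and resume it later.
            for (Resource r : resources) {
                r.afterRestore();
            }
        }
    }

    /** Hypothetical client that owns a network connection. */
    static final class Client implements Resource {
        private boolean connected;
        private int reconnects;

        void connect() {
            connected = true;
        }

        boolean isConnected() {
            return connected;
        }

        int reconnects() {
            return reconnects;
        }

        @Override
        public void beforeCheckpoint() {
            // Drop the connection so that no open socket ends up in the snapshot.
            connected = false;
        }

        @Override
        public void afterRestore() {
            // Re-establish the connection in the restored process.
            connect();
            reconnects++;
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        Client client = new Client();
        client.connect();
        registry.register(client);
        registry.simulateCheckpointAndRestore();
        System.out.println("connected=" + client.isConnected()
                + ", reconnects=" + client.reconnects());
    }
}
```

The point of the sketch is only the shape of the contract: the component that owns the connection is the one that closes it before the checkpoint and reopens it after the restore.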

CRaC is not integrated into the mainline JDK yet; a JEP is being prepared, and users are welcome to try out our builds. However, even when this gets into the JDK we can't expect users to jump onto the latest release immediately; therefore we provide a facade package org.crac [4] that delegates to the implementation if it is present in the running JDK, and provides a no-op implementation otherwise.
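
The delegate-or-no-op idea can be illustrated roughly as follows. This is only a sketch of the general pattern, not the actual org.crac code: the CheckpointApi interface and the select method are hypothetical, and the class name jdk.crac.Core used in main is my assumption about what such a probe might look for:

```java
public class FacadeSketch {

    /** Hypothetical minimal API exposed by a facade. */
    interface CheckpointApi {
        boolean isSupported();
    }

    /**
     * Picks a delegating implementation if the named class can be loaded,
     * and a no-op implementation otherwise.
     */
    static CheckpointApi select(String implementationClassName) {
        try {
            // Probe for the class without running its static initializers.
            Class.forName(implementationClassName, false,
                    FacadeSketch.class.getClassLoader());
            // A real facade would forward calls to the implementation here.
            return () -> true;
        } catch (ClassNotFoundException e) {
            // No implementation in this JDK: fall back to doing nothing.
            return () -> false;
        }
    }

    public static void main(String[] args) {
        // Assumed class name; on a JDK without CRaC the no-op branch is taken.
        CheckpointApi api = select("jdk.crac.Core");
        System.out.println("checkpoint support present: " + api.isSupported());
    }
}
```

With this pattern the application codes against a single API and pays only for the class lookup at startup when no implementation is available.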

With or without the implementation, the support for CRaC in the application should be designed to have minimal impact on performance (a few extra objects, some volatile reads...). On the other hand, the checkpoint operation itself can be non-trivial in this respect. Therefore the main consideration should be the maintenance cost: keeping a small JAR in the dependencies and some extra code in networking and persistence.

The support for CRaC does not have to be all-in for all components; maybe it does not make sense to snapshot a Broker. My PR was for Kafka Clients because the open network connections need to be handled in a web application (in my case I am enabling CRaC in the Quarkus Super Heroes [5] demo). The PR does not handle all possible client-side uses; since I am not familiar with Kafka, I am following a whack-a-mole strategy.

It is possible that the C/R could be handled in a different layer, e.g. in the Quarkus integration code. However, our intent is to push the changes as low in the technology stack as possible, to reach the most users without duplicating maintenance efforts. Also, having the support higher up can be fragile and break encapsulation.

Thank you for your consideration, I hope that you'll appreciate our attempt to innovate the Java ecosystem.

Radim Vansa

PS: I'd appreciate it if someone could give me the permissions on the wiki to create a proper KIP! Username: rvansa (both Confluence and JIRA).

[1] https://github.com/apache/kafka/pull/13619

[2] https://wiki.openjdk.org/display/crac

[3] https://github.com/openjdk/crac

[4] https://github.com/CRaC/org.crac

[5] https://quarkus.io/quarkus-workshops/super-heroes/
