Cassandra Java Driver and OpenJDK CRaC

Radim Vansa Thu, 06 Mar 2025 15:26:27 -0800

Hi all,

I would like to make applications using the Cassandra Java Driver, particularly those built with Spring Boot, Quarkus or similar frameworks, work with the OpenJDK CRaC project [1]. I've already created a patch for Spring Boot [2], but the Spring folks think that these changes are too dependent on driver internals and suggested contributing the support to Cassandra directly.

The patch involves closing all connections before checkpoint and re-establishing them after restore. I have implemented that through sending a `NodeStateEvent -> FORCED_DOWN` on the bus for all connected nodes. As a follow-up I could develop some way to inform the session about a new topology, e.g. if the cluster addresses change.

Before jumping into implementing a PR I would like to ask what you think is the best approach to do this. I can think of two ways:


1) Native CRaC support

The driver would have a dependency on `org.crac:crac` [3]; this is a small (13kB) library that provides the interfaces and a dummy no-op implementation if the target JVM does not support CRaC. Then `DefaultSession` would register an `org.crac.Resource` implementation that would handle the checkpoint. This has the advantage of providing the best fan-out into any project consuming the driver without any further work.
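To illustrate the shape of option 1, here is a minimal sketch, not the actual patch. To keep it compilable without the dependency, a local `CheckpointResource` interface stands in for `org.crac.Resource` (simplified: the real callbacks also take a `Context` argument and are registered via `Core.getGlobalContext().register(...)`). `SketchConnections` and `SessionCheckpointResource` are hypothetical names and do not exist in the driver.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-in mirroring the shape of org.crac.Resource (assumption: simplified,
// the real interface passes a Context argument to both callbacks).
interface CheckpointResource {
    void beforeCheckpoint() throws Exception;
    void afterRestore() throws Exception;
}

// Hypothetical stand-in for the driver's connection handling; not a driver class.
final class SketchConnections {
    private final List<String> knownNodes;
    private final List<String> connectedNodes = new ArrayList<>();

    SketchConnections(List<String> knownNodes) {
        this.knownNodes = new ArrayList<>(knownNodes);
        this.connectedNodes.addAll(knownNodes);
    }

    // Drop every open connection so no sockets are open at checkpoint time.
    void closeAll() {
        connectedNodes.clear();
    }

    // Re-establish connections to all nodes known before the checkpoint.
    void reconnectAll() {
        connectedNodes.clear();
        connectedNodes.addAll(knownNodes);
    }

    int openCount() {
        return connectedNodes.size();
    }
}

// What the session would register: close before checkpoint, reconnect after restore.
final class SessionCheckpointResource implements CheckpointResource {
    private final SketchConnections connections;

    SessionCheckpointResource(SketchConnections connections) {
        this.connections = connections;
    }

    @Override
    public void beforeCheckpoint() {
        connections.closeAll();
    }

    @Override
    public void afterRestore() {
        connections.reconnectAll();
    }
}

final class NativeCracSketchDemo {
    public static void main(String[] args) {
        List<String> nodes = new ArrayList<>();
        nodes.add("10.0.0.1");
        nodes.add("10.0.0.2");
        SketchConnections connections = new SketchConnections(nodes);
        SessionCheckpointResource resource = new SessionCheckpointResource(connections);

        resource.beforeCheckpoint();
        System.out.println("open before checkpoint: " + connections.openCount());
        resource.afterRestore();
        System.out.println("open after restore: " + connections.openCount());
    }
}
```

With the real library the JVM (or the no-op implementation) would invoke the callbacks, so applications would not call them directly.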


2) Exposing neutral methods

To save frameworks from relying on internals, `DefaultSession` would expose `.suspend()` and `.resume()` methods that would implement the connection cut-off without importing any dependency. After upgrading to the latest release, frameworks could use these methods in a way that suits them. I wouldn't add those methods to the `CqlSession` interface (as that would be a breaking change) but only to `DefaultSession`.
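A rough sketch of how option 2 might behave, under my own assumptions: `SuspendableSessionSketch` and its nested `NodeState` enum are hypothetical stand-ins, not the driver's `DefaultSession` or its node state type, and making the calls idempotent is a design choice of the sketch rather than something decided above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a session exposing dependency-free suspend()/resume().
final class SuspendableSessionSketch {
    // Stand-in for the node states the session tracks; only two are modelled here.
    enum NodeState { UP, FORCED_DOWN }

    private final Map<String, NodeState> nodes = new LinkedHashMap<>();
    private boolean suspended;

    SuspendableSessionSketch(String... addresses) {
        for (String address : addresses) {
            nodes.put(address, NodeState.UP);
        }
    }

    // Cut off every node, analogous to sending FORCED_DOWN for all connected nodes.
    synchronized void suspend() {
        if (suspended) {
            return; // already suspended, nothing to do
        }
        nodes.replaceAll((address, state) -> NodeState.FORCED_DOWN);
        suspended = true;
    }

    // Bring the previously known nodes back up after restore.
    synchronized void resume() {
        if (!suspended) {
            return; // not suspended, nothing to do
        }
        nodes.replaceAll((address, state) -> NodeState.UP);
        suspended = false;
    }

    synchronized boolean isSuspended() {
        return suspended;
    }

    // Number of nodes currently considered UP.
    synchronized long upCount() {
        return nodes.values().stream().filter(s -> s == NodeState.UP).count();
    }
}

final class NeutralMethodsSketchDemo {
    public static void main(String[] args) {
        SuspendableSessionSketch session = new SuspendableSessionSketch("10.0.0.1", "10.0.0.2");
        session.suspend();
        System.out.println("up while suspended: " + session.upCount());
        session.resume();
        System.out.println("up after resume: " + session.upCount());
    }
}
```

A framework would then call `suspend()` and `resume()` from its own checkpoint hooks, so the CRaC dependency stays on the framework side.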

Would Cassandra accept either of these, to let people checkpoint (snapshot) their applications and restore them within tens of milliseconds? Naturally it is possible to close the session object completely and create a new one, but the ideal solution would require no application changes beyond a dependency upgrade.

Btw. I am aware that there is an inherent race between a possible topology change and the shutdown of current nodes (and I am listening for hints that would let us prevent that), but it is reasonable to expect that users will checkpoint the application in a quiescent state. And if a topology update breaks the checkpoint, it is always possible to try again.


Thank you for your opinions and ideas!

Radim Vansa


[1] https://wiki.openjdk.org/display/crac

[2] https://github.com/spring-projects/spring-boot/pull/44505

[3] https://mvnrepository.com/artifact/org.crac/crac/1.5.0
