Hello,

With Pulsar 2.8.0 we have the Exclusive Producer, which allows you to use Pulsar as a consistent write-ahead log for replicated state machines.
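
To make the "write-ahead log for replicated state machines" idea concrete, here is a minimal sketch in plain Java. It does not use the Pulsar client at all: InMemoryLog is a hypothetical stand-in for a topic, and MapReplica is a hypothetical replica that only changes its state by applying log entries in order. There is no fencing of concurrent writers here, which is the part the Exclusive Producer would take care of in a real setup.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for a topic used as a write-ahead log. */
final class InMemoryLog {
    private final List<String[]> entries = new ArrayList<>();

    /** Appends a (key, value) entry and returns its offset. */
    synchronized int append(String key, String value) {
        entries.add(new String[] {key, value});
        return entries.size() - 1;
    }

    synchronized int size() {
        return entries.size();
    }

    synchronized String[] get(int offset) {
        return entries.get(offset);
    }
}

/** Hypothetical replica: rebuilds a map by applying log entries in order. */
final class MapReplica {
    private final InMemoryLog log;
    private final Map<String, String> state = new HashMap<>();
    private int nextOffset = 0;

    MapReplica(InMemoryLog log) {
        this.log = log;
    }

    /** Applies every entry not yet seen; a null value means "delete". */
    void catchUp() {
        while (nextOffset < log.size()) {
            String[] entry = log.get(nextOffset++);
            if (entry[1] == null) {
                state.remove(entry[0]);
            } else {
                state.put(entry[0], entry[1]);
            }
        }
    }

    /** Writes go to the log first; local state changes only via catchUp(). */
    void put(String key, String value) {
        log.append(key, value);
        catchUp();
    }

    void remove(String key) {
        log.append(key, null);
        catchUp();
    }

    /** Reads local state, optionally catching up with the log first. */
    String get(String key, boolean latest) {
        if (latest) {
            catchUp();
        }
        return state.get(key);
    }
}
```

Two replicas sharing the same log converge to the same map once they have both applied the same prefix of entries; a read that does not ask for the latest value may return stale data.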

I have needed a couple of times to build some replicated state storage on top of Pulsar, and I would like to share some thoughts.

We can provide a simple built-in mechanism to share some "state" across several instances of an application without adding a database or other components to the architecture:
- metadata
- dynamic configuration
- task assignments
- key-value database

In general we can provide an API to handle a shared distributed Java Object: each client can access the Object and mutate the State, ensuring consistency.

I have drafted a small API to build such an abstraction:

    import java.util.List;
    import java.util.concurrent.CompletableFuture;
    import java.util.function.Function;

    /**
     * @param <V> the state type
     * @param <O> the operation type
     */
    public interface PulsarDatabase<V, O> {

        /**
         * Read from the current state.
         * @param reader a function that accesses current state and returns a value
         * @param latest ensure that the value is the latest
         * @param <K> the returned data type
         * @return a handle to the result of the operation
         */
        <K> CompletableFuture<K> read(Function<V, K> reader, boolean latest);

        /**
         * Execute a mutation on the state.
         * The operationsGenerator generates a list of mutations to be
         * written to the log, the operationApplier function
         * is executed to mutate the state after each successful write
         * to the log. Finally the reader function can read from
         * the current state before releasing the write lock.
         * @param operationsGenerator generates a list of mutations
         * @param operationApplier apply each mutation to the current state
         * @param reader read from the state while inside the write lock
         * @param <K> the returned data type
         * @return a handle to the completion of the operation
         */
        <K> CompletableFuture<K> write(Function<V, List<O>> operationsGenerator,
                                       Function<V, K> reader);
    }

Using this simple abstraction it is easy to build, for instance, a distributed Java "Map" like this:
https://github.com/eolivelli/pulsar-db/blob/main/src/main/java/org/apache/pulsar/db/PulsarMap.java

I believe that we should add this feature to the Pulsar Client API. Maybe we can start by adding it to the pulsar-adapters module, as it can be loosely coupled with the core Pulsar Client.

Building distributed data structures on top of that API is simple, but the underlying implementation of the core API is not straightforward, because there are many edge cases to deal with. If we provide some recipes that are available out-of-the-box, we will unleash the secret power of the Exclusive Producer and we will allow more applications to migrate to Pulsar or to choose Pulsar as their storage backbone.

You can find the code here: https://github.com/eolivelli/pulsar-db
It is only a proof-of-concept, but it is already usable.

If there is interest in this, I will be happy to draft a PIP and also to send the implementation to the pulsar-adapters repository.

Best regards
Enrico