https://github.com/apache/pulsar/pull/22009

===========

# PIP-335: Supporty Oxia metadata store plugin

# Motivation

Oxia is a scalable metadata store and coordination system that can be used
as the core infrastructure
to build large scale distributed systems.

Oxia was created with the primary goal of providing an alternative Pulsar
to replace ZooKeeper as the
long term preferred metadata store, overcoming all the current limitations
in terms of metadata
access throughput and data set size.

# Goals

Add a Pulsar MetadataStore plugin that uses Oxia client SDK.

Users will be able to start a Pulsar cluster using just Oxia, without any
ZooKeeper involved.

## Not in Scope

It's not in the scope of this proposal to change any default behavior or
configuration of Pulsar.

# Detailed Design

## Design & Implementation Details

Oxia semantics and client SDK were already designed with Pulsar and
MetadataStore plugin API in mind, so
there is not much integration work that needs to be done here.

Just a few notes:
 1. Oxia client already provides support for transparent batching of read
and write operations,
    so there will be no use of the batching logic in
`AbstractBatchedMetadataStore`
 2. Oxia does not treat keys as a walkable file-system like interface, with
directories and files. Instead
    all the keys are independent. Though Oxia sorting of keys is aware of
'/' and provides efficient key
    range scanning operations to identify the first level children of a
given key
 3. Oxia, unlike ZooKeeper, doesn't require the parent path of a key to
exist. eg: we can create `/a/b/c` key
    without `/a/b` and `/a` existing.
    In the Pulsar integration for Oxia we're forcing to create all parent
keys when they are not there. This
    is due to several places in BookKeeper access where it does not create
the parent keys, though it will
    later make `getChildren()` operations on the parents.

## Other notes

Unlike in the ZooKeeper implementation, the notification of events is
guaranteed in Oxia, because the Oxia
client SDK will use the transaction offset after server reconnections and
session restarted events. This
will ensure that brokers cache will always be properly invalidated. We will
then be able to remove the
current 5minutes automatic cache refresh which is in place to prevent the
ZooKeeper missed watch issue.

# Links

Oxia: https://github.com/streamnative/oxia
Oxia Java Client SDK: https://github.com/streamnative/oxia-java


--
Matteo Merli
<matteo.me...@gmail.com>

Reply via email to