Hi Sijie,

I agree with Raman. Apart from PIP-43 and PIP-44, which ease schema
management, in my opinion we should also loosen the coupling between a
topic and its schema (or, more precisely, the *type of data* on the topic),
which is 1 to 1 as of now.

   1. The schemas (or schema versions of one data type) could be grouped
   into what Kafka calls a *subject*.
   2. Schema compatibility should then be checked only among schemas in
   the same subject.
   3. One topic can be associated with multiple schema subjects, each with
   its own evolution path.
   4. Similarly, one subject can be associated with multiple topics.
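
To make the four points concrete, here is a rough sketch of the bookkeeping
I have in mind. It is not a Pulsar or Kafka API: the SubjectRegistry class,
its method names, and the toy compatibility rule (a new version must keep
every field of the previous one) are all assumptions made up for
illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SubjectRegistry:
    """Hypothetical in-memory registry; not a Pulsar or Kafka API."""

    # subject name -> ordered list of schema versions (sets of field names)
    versions: dict = field(default_factory=dict)
    # topic name -> set of subject names allowed on that topic
    topic_subjects: dict = field(default_factory=dict)

    def register(self, subject, fields):
        """Add a schema version, checking it only against the same subject."""
        history = self.versions.setdefault(subject, [])
        new_fields = frozenset(fields)
        # Toy compatibility rule (an assumption): a new version must keep
        # every field of the latest version of this subject.
        if history and not history[-1] <= new_fields:
            raise ValueError(f"incompatible with latest version of {subject!r}")
        history.append(new_fields)
        return len(history)  # 1-based version number within the subject

    def associate(self, topic, subject):
        """Link a topic to a subject; the relation is many-to-many."""
        self.topic_subjects.setdefault(topic, set()).add(subject)

    def subjects_for(self, topic):
        """Return the subjects currently associated with a topic."""
        return self.topic_subjects.get(topic, set())
```

Because the check in register() only looks at the history of one subject,
customerCreated and customerAddressChanged can evolve independently even
when both are associated with the same topic.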

*Use case:*
This feature would be handy when messages of different business models
must be consumed in a strict order while each model also evolves along its
own path. For example, an event sourcing system could have events such as
customerCreated, customerAddressChanged, and customerInvoicePaid that must
be processed in order.
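
As a small illustration of this use case, the sketch below tags each event
on a single ordered topic with its own subject. The subject_name helper and
its "<topic>-<event type>" convention are hypothetical (loosely modelled on
naming a subject after both the topic and the record type), and the topic
name and payloads are made-up sample data.

```python
def subject_name(topic, event_type):
    """Hypothetical naming rule: one subject per (topic, event type) pair."""
    return f"{topic}-{event_type}"


# Events for one customer, in the order they must be consumed.
events = [
    ("customerCreated", {"id": 1, "name": "Ada"}),
    ("customerAddressChanged", {"id": 1, "address": "1 Main St"}),
    ("customerInvoicePaid", {"id": 1, "invoice": 42}),
]

topic = "customer-events"  # made-up topic name

# Each message carries its subject so a consumer can pick the right schema,
# while the single topic preserves the ordering across event types.
messages = [(subject_name(topic, etype), payload) for etype, payload in events]
```

The list order stands in for the ordering guarantee of the topic; the
subject attached to each message is what lets each event type be checked
against only its own schema history.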

The ideas presented above are picked from here
<https://martin.kleppmann.com/2018/01/18/event-types-in-kafka-topic.html>.

Regards,
Shivji Kumar Jha
http://www.shivjijha.com/
+91 8884075512


On Wed, Apr 15, 2020 at 2:27 AM Sijie Guo <guosi...@gmail.com> wrote:

> Hi Raman,
>
> The schema compatibility strategies were already there prior to PIP-43.
>
> PIP-44 enhances the schema compatibility strategy support.
>
> Both changes have already landed in the 2.5.0 release.
>
> Did you see any issues when you tried out this feature?
>
> - Sijie
>
> On Tue, Apr 14, 2020 at 8:35 AM rocketra...@gmail.com <
> rocketra...@gmail.com>
> wrote:
>
> > Now that PIP-43 is released in 2.5.0, I wanted to follow up on the
> > messages below.
> >
> > What remains to be done in Pulsar to support multiple different types
> > on one topic? Yi indicates below that PIP-43 sets the stage for this,
> > but that the schema compatibility implementation would still need some
> > work.
> >
> > Would this require another PIP, or just an issue to track the work?
> >
> > Regards,
> > Raman
> >
> > On 2019/09/16 01:32:39, Yi Tang <ssnailt...@gmail.com> wrote:
> > > Hi Raman,
> > >
> > > It's a great and important feature, I think. This PIP requires the
> > > compatibility check from bottom registry only and doesn't touch the
> > > implementation detail. I think we should address this feature in the
> > > future, and this PIP provides the essential ability to implement it.
> > >
> > > Thanks,
> > > Yi
> > >
> > > On Sun, Sep 15, 2019 at 10:36 PM rocketra...@gmail.com <rocketra...@gmail.com> wrote:
> > >
> > > > I see a mention of compatibility in the PIP but with no details. The
> > > > docs about schema compatibility state this:
> > > >
> > > > > Consequently, those events need to go in the same Pulsar partition
> > > > > to maintain order. This application can use ALWAYS_COMPATIBLE to
> > > > > allow different kinds of events co-exist in the same topic.
> > > >
> > > > With this PIP, this limitation can be relaxed and schema
> > > > compatibility can be strengthened, since each type of message on a
> > > > topic can have its own schema, and compatibility can then be checked
> > > > against only other schemas for the same type. Kafka does this via the
> > > > concept of "subjects" in the schema registry. Subjects default to just
> > > > the topic name (plus a "-key" or "-value" suffix, since keys and
> > > > values can both have their own schemas), but can also include (via an
> > > > injectable strategy) the message type. Compatibility is managed at the
> > > > subject level.
> > > >
> > > > Is this something that should be addressed in this PIP, or in future
> > > > follow-on work? This is critical to supporting ordering across
> > > > different message types, with schema compatibility verification by
> > > > Pulsar.
> > > >
> > > > Regards,
> > > > Raman
> > > >
> > > >
> > > >
> > > > On 2019/09/03 05:12:32, 唐谊 <ssnailt...@gmail.com> wrote:
> > > > > Hi all;
> > > > >
> > > > > > I am drafting a proposal to let a producer send messages with
> > > > > > different schemas.
> > > > >
> > > > > ## Motivation
> > > > > > For now, a Pulsar producer can only produce messages of one schema
> > > > > > type, which is determined by the user when the producer is created,
> > > > > > or by fetching the latest schema version from the registry if the
> > > > > > AUTO_PRODUCE_BYTES type is specified. The schema, however, can be
> > > > > > updated by an external system after the producer has started, which
> > > > > > would lead to inconsistency between the message payload and the
> > > > > > schema version metadata. Also, some scenarios, such as replicating
> > > > > > from Kafka, require a single producer to replicate messages of
> > > > > > different schemas from one Kafka partition to one Pulsar partition
> > > > > > to guarantee ordering and avoid duplicates.
> > > > >
> > > > > > I propose that each message indicate its associated schema
> > > > > > itself. For more detail, see:
> > > > > >
> > > > > > https://gist.github.com/yittg/56c6dedf7509f634ec7effc4f6f3631d#file-pip-md
> > > > >
> > > > > Looking forward to any feedback.
> > > > >
> > > > > Thanks,
> > > > > Yi
> > > > >
> > > >
> > >
> >
>
