Hello Pulsar community,
I would like to start a discussion about adding a new Message.getSchema() API.

This is a Google Doc with the contents of the PIP
https://docs.google.com/document/d/1VWi5LHP44V31nP4bCui9d5RXwH6xc_phrUes6tvNguk

This feature is particularly needed on the Consumer side, when you are
using Schema.AUTO_CONSUME().

When you use Schema.AUTO_CONSUME(), Pulsar downloads the Schema from
the Schema Registry.
The message is a Message<GenericRecord>, but we recently introduced
GenericObject, so it now works with every Schema: BYTES, primitives,
KeyValue and Structures (AVRO, JSON, Protobuf).
Just use GenericObject.getNativeObject() to access the decoded Java Object.

Currently there is no way to get the actual schema of each message; in
fact, each message can have a Schema different from that of the other
messages in the same topic.

Main requirements for the Schema instance returned by Message.getSchema():
- it must represent the actual schema for the message
- it must return an accurate SchemaInfo for the message
- it must return a Native Schema (like a native AVRO Schema) for the message
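To make the requirements above concrete, here is a minimal, self-contained
sketch of how consumer code might use a per-message schema. It does not use
the Pulsar client: SchemaInfoStub, SchemaStub and MessageStub are hypothetical
stand-ins that only mirror the shape of the proposed API, and modelling the
native schema as an Optional is an assumption of this sketch. The actual API
is the one in the PIP and the PR.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class PerMessageSchemaSketch {

    /** Hypothetical stand-in for schema metadata (name and type only). */
    public static final class SchemaInfoStub {
        final String name;
        final String type;

        public SchemaInfoStub(String name, String type) {
            this.name = name;
            this.type = type;
        }
    }

    /** Hypothetical stand-in for a schema: metadata plus an optional native schema. */
    public static final class SchemaStub {
        private final SchemaInfoStub info;
        private final Object nativeSchema; // null when there is no native schema (e.g. STRING)

        public SchemaStub(SchemaInfoStub info, Object nativeSchema) {
            this.info = info;
            this.nativeSchema = nativeSchema;
        }

        public SchemaInfoStub getSchemaInfo() {
            return info;
        }

        // Assumption of this sketch: absence of a native schema is an empty Optional.
        public Optional<Object> getNativeSchema() {
            return Optional.ofNullable(nativeSchema);
        }
    }

    /** Hypothetical stand-in for a received message that carries its own schema. */
    public static final class MessageStub {
        private final Object nativeObject;
        private final SchemaStub schema;

        public MessageStub(Object nativeObject, SchemaStub schema) {
            this.nativeObject = nativeObject;
            this.schema = schema;
        }

        public SchemaStub getSchema() {
            return schema;
        }

        public Object getNativeObject() {
            return nativeObject;
        }
    }

    /** Describes one message using only the schema attached to that message. */
    public static String describe(MessageStub msg) {
        SchemaStub schema = msg.getSchema();
        SchemaInfoStub info = schema.getSchemaInfo();
        String nativePart = schema.getNativeSchema()
                .map(s -> "native schema " + s)
                .orElse("no native schema");
        return info.type + " '" + info.name + "' (" + nativePart + ") -> "
                + msg.getNativeObject();
    }

    public static void main(String[] args) {
        // Two versions of a structured schema plus a primitive one, as could
        // coexist on a single topic.
        SchemaStub v1 = new SchemaStub(new SchemaInfoStub("user-v1", "AVRO"), "{name}");
        SchemaStub v2 = new SchemaStub(new SchemaInfoStub("user-v2", "AVRO"), "{name,age}");
        SchemaStub text = new SchemaStub(new SchemaInfoStub("text", "STRING"), null);

        List<MessageStub> topic = Arrays.asList(
                new MessageStub("alice", v1),
                new MessageStub("bob/42", v2),
                new MessageStub("hello", text));

        for (MessageStub msg : topic) {
            System.out.println(describe(msg));
        }
    }
}
```

The point of the sketch is that describe() never consults a topic-level
schema: everything it needs comes from the message itself, which is what
the three requirements ask for.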

This is the PR with the implementation
https://github.com/apache/pulsar/pull/10476

Best regards
Enrico Olivelli
