+1

- Mattison

On Wed, 11 Mar 2026 at 22:11, Lari Hotari <[email protected]> wrote:

> +1
>
> -Lari
>
> On Wed, 11 Mar 2026 at 01:22, PengHui Li <[email protected]> wrote:
> >
> > Hi all,
> >
> > I'd like to propose deprecating the legacy Jackson JsonSchema format
> > support for SchemaType.JSON and enforcing strict Avro schema validation
> > by default.
> >
> > Background
> >
> > In Pulsar 2.0, JSONSchema originally used Jackson's JsonSchemaGenerator
> > to produce schema definitions in the JSON Schema Draft standard (e.g.,
> > {"type":"object","properties":{...}}). In Pulsar 2.1 (commit 1893323bc2,
> > PR #2071), we standardized on Avro schema format for all structured
> > schemas, including SchemaType.JSON. The schema definition stored in
> > SchemaInfo.schema was changed to Avro format (e.g.,
> > {"type":"record","fields":[...]}), while the message payload remains
> > plain JSON.
> >
> > To maintain backward compatibility with schemas created during the 2.0
> > era, fallback logic was added in several places to accept the old
> > Jackson format:
> >
> > - StructSchemaDataValidator: falls back to Jackson JsonSchema parsing
> > when Avro parsing fails
> > - JsonSchemaCompatibilityCheck: silently allows mixed old/new format
> > combinations
> > - ProducerImpl: sends old format to brokers below protocol v13
> >
> > The Problem
> >
> > This fallback is too lenient. It accepts any valid JSON as a schema
> > definition for SchemaType.JSON, not just the legacy Jackson format. This
> > has caused real issues for non-Java clients (e.g., the Rust client) where
> > users accidentally register a JSON Schema Draft 2020-12 definition:
> >
> > 1. The broker's StructSchemaDataValidator accepts it (Avro parse fails →
> > Jackson fallback succeeds because it accepts any JSON)
> > 2. The broker's compatibility check allows it (empty block for
> > Avro→JsonSchema or JsonSchema→JsonSchema path)
> > 3. But when a Java consumer uses AutoConsumeSchema or GenericJsonSchema,
> > it fails with "SchemaParseException: Type not supported: object" because
> > AvroBaseStructSchema strictly requires Avro format, with no fallback
> >
> > The result is that the broker stores a schema that no Java consumer can
> > read.
> >
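To make the three steps above concrete, here is a rough Python sketch of the
difference between a lenient validator and a strict one. It is not the broker
code: looks_like_avro() is a hypothetical, heavily simplified stand-in for a
real Avro parser (it only inspects the top-level shape), and the function
names and sample definitions are assumptions made for illustration.

```python
import json

# Simplified stand-in for a real Avro parser (assumption): it only looks at
# the top-level "type" of a JSON object and the basic shape of a record.
AVRO_TYPES = {
    "null", "boolean", "int", "long", "float", "double", "bytes", "string",
    "record", "enum", "array", "map", "fixed",
}


def looks_like_avro(definition: str) -> bool:
    try:
        node = json.loads(definition)
    except json.JSONDecodeError:
        return False
    if not isinstance(node, dict):
        return False
    kind = node.get("type")
    if kind == "record":
        # A record needs at least a name and a list of fields.
        return isinstance(node.get("name"), str) and isinstance(node.get("fields"), list)
    # "object" (JSON Schema Draft) is not an Avro type, so it is rejected here.
    return isinstance(kind, str) and kind in AVRO_TYPES


def lenient_validate(definition: str) -> bool:
    """Avro check first, then a fallback that accepts any parseable JSON."""
    if looks_like_avro(definition):
        return True
    try:
        json.loads(definition)
        return True
    except json.JSONDecodeError:
        return False


def strict_validate(definition: str) -> bool:
    """No fallback: only Avro-shaped definitions pass."""
    return looks_like_avro(definition)


avro_def = '{"type":"record","name":"User","fields":[{"name":"id","type":"long"}]}'
draft_def = (
    '{"$schema":"https://json-schema.org/draft/2020-12/schema",'
    '"type":"object","properties":{"id":{"type":"integer"}}}'
)

for label, definition in [("avro", avro_def), ("json-schema-draft", draft_def)]:
    print(label, lenient_validate(definition), strict_validate(definition))
```

With these sample inputs, the Draft 2020-12 definition passes the lenient
path but is rejected by the strict one, which is the mismatch between broker
and Java consumer described above.
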
> > Proposal
> >
> >   1. Add a broker configuration (e.g.,
> > schemaJsonAllowLegacyJacksonFormat, default false) to control whether
> > the old Jackson JsonSchema format is accepted for SchemaType.JSON.
> >   2. When disabled (default), both StructSchemaDataValidator and
> > JsonSchemaCompatibilityCheck will strictly require valid Avro schema
> > format for SchemaType.JSON, consistent with what the consumer side
> > already requires.
> >   3. When enabled, the current backward-compatible behavior is preserved
> > for users who still have topics with legacy 2.0-era schemas.
> >   4. Document clearly that schema_data for SchemaType.JSON must be an
> > Avro schema definition, which is important for non-Java client
> > implementations that construct schema definitions manually.
> >
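For the last point, a client that builds schema_data by hand could produce an
Avro record definition along these lines. This is only a minimal sketch: the
helper avro_record_schema(), the record name "User", the namespace "example"
and the field list are all made up for illustration and are not part of any
Pulsar client API.

```python
import json


def avro_record_schema(name, fields, namespace=None):
    """Build an Avro record schema definition as a JSON string.

    fields is a list of (field_name, avro_type) pairs, where avro_type is any
    JSON-serializable Avro type such as "string", "long" or a union list.
    """
    schema = {"type": "record", "name": name}
    if namespace:
        schema["namespace"] = namespace
    schema["fields"] = [{"name": n, "type": t} for n, t in fields]
    return json.dumps(schema)


# Hypothetical payload type: {"name": "...", "age": 42}
schema_data = avro_record_schema(
    "User",
    [("name", "string"), ("age", "int")],
    namespace="example",
)
print(schema_data)
```

The point of the sketch is the shape: the top level is
{"type":"record","name":...,"fields":[...]} rather than
{"type":"object","properties":{...}}.
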
> > Impact
> >
> > - The legacy Jackson format has been superseded since Pulsar 2.1 (2018).
> > Any active topics with old-format schemas have likely been migrated or
> > recreated by now.
> > - The Java client's JSONSchema.of() has been generating Avro format since
> > 2.1, so Java producers are unaffected.
> > - Non-Java clients will get a clear error at producer registration time
> > instead of a confusing consumer-side failure.
> > - Users who genuinely need the old format can opt in via the
> > configuration flag.
> >
> > Looking forward to your thoughts.
> >
> > Thanks,
> > Penghui
>
