+1

Thanks,
Baodi Shi

mattison chao <[email protected]> wrote on Thu, Mar 12, 2026 at 00:33:
>
> +1
>
> - Mattison
>
> On Wed, 11 Mar 2026 at 22:11, Lari Hotari <[email protected]> wrote:
>
> > +1
> >
> > -Lari
> >
> > On Wed, 11 Mar 2026 at 01:22, PengHui Li <[email protected]> wrote:
> > >
> > > Hi all,
> > >
> > > I'd like to propose deprecating the legacy Jackson JsonSchema format
> > > support for SchemaType.JSON and enforcing strict Avro schema validation
> > > by default.
> > >
> > > Background
> > >
> > > In Pulsar 2.0, JSONSchema originally used Jackson's JsonSchemaGenerator
> > > to produce schema definitions in the JSON Schema Draft standard (e.g.,
> > > {"type":"object","properties":{...}}). In Pulsar 2.1 (commit 1893323bc2,
> > > PR #2071), we standardized on Avro schema format for all structured
> > > schemas, including SchemaType.JSON. The schema definition
> > > stored in SchemaInfo.schema was changed to Avro format (e.g.,
> > > {"type":"record","fields":[...]}), while the message payload remains
> > > plain JSON.
> > >
> > > To maintain backward compatibility with schemas created during the 2.0
> > > era, fallback logic was added in several places to accept the old Jackson
> > > format:
> > >
> > > - StructSchemaDataValidator — falls back to Jackson JsonSchema parsing
> > > when Avro parsing fails
> > > - JsonSchemaCompatibilityCheck — silently allows mixed old/new format
> > > combinations
> > > - ProducerImpl — sends old format to brokers below protocol v13
> > >
> > > The Problem
> > >
> > > This fallback is too lenient. It accepts any valid JSON as a schema
> > > definition for SchemaType.JSON, not just the legacy Jackson format. This
> > > has caused real issues for non-Java clients (e.g., the Rust client) where
> > > users accidentally register a JSON Schema Draft 2020-12 definition:
> > >
> > > 1. The broker's StructSchemaDataValidator accepts it (Avro parse fails →
> > > Jackson fallback succeeds because it accepts any JSON)
> > > 2. The broker's compatibility check allows it (empty block for
> > > Avro→JsonSchema or JsonSchema→JsonSchema path)
> > > 3. But when a Java consumer uses AutoConsumeSchema or GenericJsonSchema,
> > > it fails with SchemaParseException: Type not supported: object because
> > > AvroBaseStructSchema strictly requires Avro format — no fallback
> > >
> > > The result is that the broker stores a schema that no Java consumer can
> > > read.
> > >
> > > Proposal
> > >
> > >   1. Add a broker configuration (e.g., schemaJsonAllowLegacyJacksonFormat,
> > > default false) to control whether the old Jackson JsonSchema format is
> > > accepted for SchemaType.JSON.
> > >   2. When disabled (default), both StructSchemaDataValidator and
> > > JsonSchemaCompatibilityCheck will strictly require valid Avro schema
> > > format for SchemaType.JSON, consistent with what the consumer side already
> > > requires.
> > >   3. When enabled, the current backward-compatible behavior is preserved
> > > for users who still have topics with legacy 2.0-era schemas.
> > >   4. Document clearly that schema_data for SchemaType.JSON must be an
> > > Avro schema definition, which is important for non-Java client implementations
> > > that construct schema definitions manually.
> > >
> > > Impact
> > >
> > > - The legacy Jackson format has been superseded since Pulsar 2.1 (2018).
> > > Any active topics with old-format schemas have likely been migrated or
> > > recreated by now.
> > > - The Java client's JSONSchema.of() has been generating Avro format since
> > > 2.1, so Java producers are unaffected.
> > > - Non-Java clients will get a clear error at producer registration time
> > > instead of a confusing consumer-side failure.
> > > - Users who genuinely need the old format can opt in via the
> > > configuration flag.
> > >
> > > Looking forward to your thoughts.
> > >
> > > Thanks,
> > > Penghui
> >

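For readers following the thread, the difference between the current lenient path and the proposed strict default can be sketched roughly as below. This is only an illustration in Python, not the broker's Java code: the function names, the `allow_legacy` parameter (standing in for the proposed schemaJsonAllowLegacyJacksonFormat flag), and the `InvalidSchemaError` type are all hypothetical, and the "Avro" check is a simplified top-level shape test rather than a real Avro parser.

```python
import json


class InvalidSchemaError(ValueError):
    """Hypothetical error raised when schema_data is rejected for SchemaType.JSON."""


def looks_like_avro_record(schema_data: str) -> bool:
    # Simplified stand-in for a real Avro parser: it only checks the
    # top-level shape of a record schema (type, name, fields).
    try:
        doc = json.loads(schema_data)
    except json.JSONDecodeError:
        return False
    return (
        isinstance(doc, dict)
        and doc.get("type") == "record"
        and isinstance(doc.get("name"), str)
        and isinstance(doc.get("fields"), list)
    )


def validate_json_schema_data(schema_data: str, allow_legacy: bool = False) -> str:
    """Return the name of the path that accepted the schema, or raise."""
    if looks_like_avro_record(schema_data):
        return "avro"
    if allow_legacy:
        # Lenient fallback as described in the thread: any well-formed
        # JSON document is accepted, not only the legacy Jackson format.
        try:
            json.loads(schema_data)
        except json.JSONDecodeError as exc:
            raise InvalidSchemaError("schema_data is not valid JSON") from exc
        return "legacy-fallback"
    # Strict mode (the proposed default): reject at registration time.
    raise InvalidSchemaError("SchemaType.JSON requires an Avro schema definition")


# An Avro record definition, as generated for SchemaType.JSON since 2.1.
avro_def = '{"type":"record","name":"User","fields":[{"name":"id","type":"long"}]}'

# A JSON Schema Draft 2020-12 definition, as a non-Java client might send by mistake.
draft_def = (
    '{"$schema":"https://json-schema.org/draft/2020-12/schema",'
    '"type":"object","properties":{"id":{"type":"integer"}}}'
)
```

With `allow_legacy=True` the Draft 2020-12 definition is accepted through the fallback, which mirrors the situation where the broker stores a schema that Java consumers cannot parse. With the default `allow_legacy=False` the same definition raises `InvalidSchemaError` when it is registered, so the producer sees the failure instead of the consumer.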