+1

Thanks,
Baodi Shi

On Thu, 12 Mar 2026 at 00:33, mattison chao wrote:
>
> +1
>
> - Mattison
>
> On Wed, 11 Mar 2026 at 22:11, Lari Hotari wrote:
>
> > +1
> >
> > -Lari
> >
> > On Wed, 11 Mar 2026 at 01:22, PengHui Li wrote:
> > >
> > > Hi all,
> > >
> > > I'd like to propose deprecating the legacy Jackson JsonSchema format
> > > support for SchemaType.JSON and enforcing strict Avro schema
> > > validation by default.
> > >
> > > Background
> > >
> > > In Pulsar 2.0, JSONSchema originally used Jackson's
> > > JsonSchemaGenerator to produce schema definitions in the JSON Schema
> > > Draft standard (e.g., {"type":"object","properties":{...}}). In
> > > Pulsar 2.1 (commit 1893323bc2, PR #2071), we standardized on Avro
> > > schema format for all structured schemas, including SchemaType.JSON.
> > > The schema definition stored in SchemaInfo.schema was changed to
> > > Avro format (e.g., {"type":"record","fields":[...]}), while the
> > > message payload remains plain JSON.
> > >
> > > To maintain backward compatibility with schemas created during the
> > > 2.0 era, fallback logic was added in several places to accept the
> > > old Jackson format:
> > >
> > > - StructSchemaDataValidator: falls back to Jackson JsonSchema
> > >   parsing when Avro parsing fails
> > > - JsonSchemaCompatibilityCheck: silently allows mixed old/new format
> > >   combinations
> > > - ProducerImpl: sends old format to brokers below protocol v13
> > >
> > > The Problem
> > >
> > > This fallback is too lenient. It accepts any valid JSON as a schema
> > > definition for SchemaType.JSON, not just the legacy Jackson format.
> > > This has caused real issues for non-Java clients (e.g., the Rust
> > > client) where users accidentally register a JSON Schema Draft
> > > 2020-12 definition:
> > >
> > > 1. The broker's StructSchemaDataValidator accepts it (Avro parse
> > >    fails → Jackson fallback succeeds because it accepts any JSON)
> > > 2. The broker's compatibility check allows it (empty block for the
> > >    Avro→JsonSchema or JsonSchema→JsonSchema path)
> > > 3. But when a Java consumer uses AutoConsumeSchema or
> > >    GenericJsonSchema, it fails with "SchemaParseException: Type not
> > >    supported: object" because AvroBaseStructSchema strictly requires
> > >    Avro format, with no fallback
> > >
> > > The result is that the broker stores a schema that no Java consumer
> > > can read.
> > >
> > > Proposal
> > >
> > > 1. Add a broker configuration (e.g.,
> > >    schemaJsonAllowLegacyJacksonFormat, default false) to control
> > >    whether the old Jackson JsonSchema format is accepted for
> > >    SchemaType.JSON.
> > > 2. When disabled (default), both StructSchemaDataValidator and
> > >    JsonSchemaCompatibilityCheck will strictly require valid Avro
> > >    schema format for SchemaType.JSON, consistent with what the
> > >    consumer side already requires.
> > > 3. When enabled, the current backward-compatible behavior is
> > >    preserved for users who still have topics with legacy 2.0-era
> > >    schemas.
> > > 4. Document clearly that schema_data for SchemaType.JSON must be an
> > >    Avro schema definition, which is important for non-Java client
> > >    implementations that construct schema definitions manually.
> > >
> > > Impact
> > >
> > > - The legacy Jackson format has been superseded since Pulsar 2.1
> > >   (2018). Any active topics with old-format schemas have likely been
> > >   migrated or recreated by now.
> > > - The Java client's JSONSchema.of() has been generating Avro format
> > >   since 2.1, so Java producers are unaffected.
> > > - Non-Java clients will get a clear error at producer registration
> > >   time instead of a confusing consumer-side failure.
> > > - Users who genuinely need the old format can opt in via the
> > >   configuration flag.
> > >
> > > Looking forward to your thoughts.
> > >
> > > Thanks,
> > > Penghui
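
For non-Java client authors following the thread, the difference between the two schema formats and the proposed strict default can be sketched roughly as follows. This is a minimal illustration in Python, not Pulsar's code. The function names, the allow_legacy_jackson_format parameter (a stand-in for the proposed schemaJsonAllowLegacyJacksonFormat setting), and the structural check are all assumptions made for the example; a real broker would run a full Avro parser rather than this rough shape test.

```python
import json

# Avro-style definition: what SchemaType.JSON is expected to carry since 2.1.
AVRO_STYLE = (
    '{"type":"record","name":"User",'
    '"fields":[{"name":"id","type":"long"}]}'
)

# JSON Schema Draft-style definition: the shape the lenient fallback lets through.
DRAFT_STYLE = '{"type":"object","properties":{"id":{"type":"integer"}}}'


def looks_like_avro_record(definition):
    """Rough structural check only (hypothetical helper, not an Avro parser)."""
    return (
        isinstance(definition, dict)
        and definition.get("type") == "record"
        and isinstance(definition.get("name"), str)
        and isinstance(definition.get("fields"), list)
    )


def validate_json_schema_data(schema_data, allow_legacy_jackson_format=False):
    """Mimic the proposed behavior: strict by default, lenient only on opt-in.

    Returns "avro" or "legacy-fallback"; raises ValueError when rejected.
    """
    definition = json.loads(schema_data)  # invalid JSON raises ValueError too
    if looks_like_avro_record(definition):
        return "avro"
    if allow_legacy_jackson_format:
        # Mirrors the current lenient path: any valid JSON is accepted.
        return "legacy-fallback"
    raise ValueError("schema_data for SchemaType.JSON must be an Avro schema")


if __name__ == "__main__":
    print(validate_json_schema_data(AVRO_STYLE))
    print(validate_json_schema_data(DRAFT_STYLE, allow_legacy_jackson_format=True))
    try:
        validate_json_schema_data(DRAFT_STYLE)
    except ValueError as err:
        print("rejected:", err)
```

With the strict default, the Draft-style definition is rejected at registration time, which is the point where the proposal wants non-Java clients to see the error.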
