+1 - Mattison
On Wed, 11 Mar 2026 at 22:11, Lari Hotari <[email protected]> wrote:

> +1
>
> -Lari
>
> On Wed, 11 Mar 2026 at 01:22, PengHui Li <[email protected]> wrote:
> >
> > Hi all,
> >
> > I'd like to propose deprecating the legacy Jackson JsonSchema format
> > support for SchemaType.JSON and enforcing strict Avro schema
> > validation by default.
> >
> > Background
> >
> > In Pulsar 2.0, JSONSchema originally used Jackson's
> > JsonSchemaGenerator to produce schema definitions in the JSON Schema
> > Draft standard (e.g., {"type":"object","properties":{...}}). In
> > Pulsar 2.1 (commit 1893323bc2, PR #2071), we standardized on Avro
> > schema format for all structured schemas, including SchemaType.JSON.
> > The schema definition stored in SchemaInfo.schema was changed to Avro
> > format (e.g., {"type":"record","fields":[...]}), while the message
> > payload remains plain JSON.
> >
> > To maintain backward compatibility with schemas created during the
> > 2.0 era, fallback logic was added in several places to accept the old
> > Jackson format:
> >
> > - StructSchemaDataValidator: falls back to Jackson JsonSchema parsing
> >   when Avro parsing fails
> > - JsonSchemaCompatibilityCheck: silently allows mixed old/new format
> >   combinations
> > - ProducerImpl: sends old format to brokers below protocol v13
> >
> > The Problem
> >
> > This fallback is too lenient. It accepts any valid JSON as a schema
> > definition for SchemaType.JSON, not just the legacy Jackson format.
> > This has caused real issues for non-Java clients (e.g., the Rust
> > client) where users accidentally register a JSON Schema Draft 2020-12
> >
> > 1. The broker's StructSchemaDataValidator accepts it (Avro parse
> >    fails → Jackson fallback succeeds because it accepts any JSON)
> > 2. The broker's compatibility check allows it (empty block for
> >    Avro→JsonSchema or JsonSchema→JsonSchema path)
> > 3. But when a Java consumer uses AutoConsumeSchema or
> >    GenericJsonSchema, it fails with SchemaParseException: Type not
> >    supported: object because AvroBaseStructSchema strictly requires
> >    Avro format, with no fallback
> >
> > The result is that the broker stores a schema that no Java consumer
> > can read.
> >
> > Proposal
> >
> > 1. Add a broker configuration (e.g.,
> >    schemaJsonAllowLegacyJacksonFormat, default false) to control
> >    whether the old Jackson JsonSchema format is accepted for
> >    SchemaType.JSON.
> > 2. When disabled (default), both StructSchemaDataValidator and
> >    JsonSchemaCompatibilityCheck will strictly require valid Avro
> >    schema format for SchemaType.JSON, consistent with what the
> >    consumer side already requires.
> > 3. When enabled, the current backward-compatible behavior is
> >    preserved for users who still have topics with legacy 2.0-era
> >    schemas.
> > 4. Document clearly that schema_data for SchemaType.JSON must be an
> >    Avro schema definition, which is important for non-Java client
> >    implementations that construct schema definitions manually.
> >
> > Impact
> >
> > - The legacy Jackson format has been superseded since Pulsar 2.1
> >   (2018). Any active topics with old-format schemas have likely been
> >   migrated or recreated by now.
> > - The Java client's JSONSchema.of() has been generating Avro format
> >   since 2.1, so Java producers are unaffected.
> > - Non-Java clients will get a clear error at producer registration
> >   time instead of a confusing consumer-side failure.
> > - Users who genuinely need the old format can opt in via the
> >   configuration flag.
> >
> > Looking forward to your thoughts.
> >
> > Thanks,
> > Penghui
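
For anyone skimming the thread, the gate described in the proposal could look roughly like the sketch below. This is an illustration under stated assumptions, not the broker's actual code: the class and method names are hypothetical, and looksLikeAvroRecord is a crude regex stand-in for a real Avro parse (it does not validate JSON and would be fooled by a nested "type":"record"). The boolean parameter mirrors the proposed schemaJsonAllowLegacyJacksonFormat setting.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of the proposed strict/lenient gate for SchemaType.JSON.
public class JsonSchemaFormatGate {

    // Crude stand-in for a real Avro parse: it only looks for a
    // "type":"record" pair anywhere in the definition.
    private static final Pattern AVRO_RECORD =
            Pattern.compile("\"type\"\\s*:\\s*\"record\"");

    static boolean looksLikeAvroRecord(String schemaJson) {
        return AVRO_RECORD.matcher(schemaJson).find();
    }

    // allowLegacyJacksonFormat mirrors the proposed broker setting.
    // When false (the proposed default), only Avro-style definitions pass.
    // When true, the lenient fallback is kept and anything is accepted,
    // which is the behavior the thread describes as too permissive.
    static boolean accept(String schemaJson, boolean allowLegacyJacksonFormat) {
        if (looksLikeAvroRecord(schemaJson)) {
            return true;
        }
        return allowLegacyJacksonFormat;
    }

    public static void main(String[] args) {
        String avro = "{\"type\":\"record\",\"name\":\"User\",\"fields\":[]}";
        String draft = "{\"type\":\"object\",\"properties\":{}}";

        System.out.println("avro, strict:   " + accept(avro, false));
        System.out.println("draft, strict:  " + accept(draft, false));
        System.out.println("draft, lenient: " + accept(draft, true));
    }
}
```

With the flag off, the JSON Schema Draft style definition is rejected at registration time, which is the point where the proposal wants non-Java clients to see the error.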
