Re: [PR] feat: Add variant support description to RFC-99 [hudi]

via GitHub Sat, 04 Apr 2026 21:44:29 -0700


yihua commented on code in PR #18274:
URL: https://github.com/apache/hudi/pull/18274#discussion_r3036400524



##########
rfc/rfc-99/rfc-99.md:
##########
@@ -194,5 +195,444 @@ SQL Extensions needs to be added to define the table in a 
hudi type native way.
 
 TODO: There is an open question regarding the need to maintain type ids to 
track schema evolution and how it would interplay with NBCC. 
 
+---
+
+## Variant Type Implementation
+
+This section documents the implementation of the VARIANT type in Hudi, which 
provides first-class support for
+semi-structured data (e.g., JSON). The Variant type is implemented following 
Spark 4.0's native VariantType
+specification.
+
+### Overview
+
+The Variant type enables Hudi to store and query semi-structured data 
efficiently. It is particularly useful for:
+
+- Schema-on-read flexibility for evolving data structures
+- Storing JSON-like data without requiring predefined schemas
+
+### Motivation
+
+Hudi readers and writers should be able to handle datasets with variant types.

Review Comment:
   🤖 The motivation section is quite brief. It might be worth adding a sentence 
or two about why the Variant type matters specifically for Hudi (e.g., how it 
interacts with Hudi's merge/compaction semantics, or why storing JSON-as-string 
is insufficient for Hudi's indexing and data skipping). Right now it reads more 
like a general Variant primer than a Hudi-specific design rationale.



##########
rfc/rfc-99/rfc-99.md:
##########
@@ -194,5 +195,444 @@ SQL Extensions needs to be added to define the table in a 
hudi type native way.
 
 TODO: There is an open question regarding the need to maintain type ids to 
track schema evolution and how it would interplay with NBCC. 
 
+---
+
+## Variant Type Implementation

Review Comment:
   🤖 The Avro mapping changed from `string or union` to `record of 2 byte 
fields + variant logical type`. Have you considered what happens when existing 
tables that stored Variant as `string` (under the old mapping) need to be read 
with the new code path? Is there a compatibility concern here, or was the old 
mapping never implemented?



##########
rfc/rfc-99/rfc-99.md:
##########
@@ -194,5 +195,444 @@ SQL Extensions needs to be added to define the table in a 
hudi type native way.
 
 TODO: There is an open question regarding the need to maintain type ids to 
track schema evolution and how it would interplay with NBCC. 
 
+---
+
+## Variant Type Implementation
+
+This section documents the implementation of the VARIANT type in Hudi, which 
provides first-class support for
+semi-structured data (e.g., JSON). The Variant type is implemented following 
Spark 4.0's native VariantType
+specification.
+
+### Overview
+
+The Variant type enables Hudi to store and query semi-structured data 
efficiently. It is particularly useful for:
+
+- Schema-on-read flexibility for evolving data structures
+- Storing JSON-like data without requiring predefined schemas
+
+### Motivation
+
+Hudi readers and writers should be able to handle datasets with variant types.
+This will allow users to work with semi-structured data more easily.
+
+The variant type is now formally defined in Parquet and engines like Spark 
have full support for this type.
+Users with semi-structured data are otherwise forced to use strings or byte 
arrays to store this data.
+
+### What is the VARIANT Type?
+
+The `VARIANT` type is a new data type designed to store semi-structured data 
(like JSON) efficiently.
+Unlike storing JSON as a plain string, `VARIANT` uses an optimized binary 
encoding that allows for fast navigation
+and element extraction without needing to parse the entire document.
+It offers the flexibility of a schema-less design (like JSON) with performance 
closer to structured columns.
+
+### Storage Modes: Shredded vs. Unshredded
+
+- **Unshredded** (Binary Blob):
+    - The entire JSON structure is encoded into binary metadata and value 
blobs.
+    - **Pros**: Fast write speed; handles completely dynamic/random schemas 
easily.
+    - **Cons**: To read a single field (e.g., `user.id`), the engine must load 
the entire binary blob.
+- **Shredded** (Columnar Optimization):
+    - The engine identifies common paths in the data (e.g., `v.a` or `v.c`) 
and extracts them into separate, native
+      Parquet columns (e.g., Int32, Decimal).
+    - **Pros**: Massive performance gain for queries. If you query `SELECT 
v:a`, the engine reads only the specific
+      Int32 column and skips the rest of the binary data (Columnar Pruning).
+    - **Cons**: Higher write overhead to analyze and split the data, prone to 
read jitter if there are large variation 
+    - in shredding output.
+
+### Architecture
+
+Variant support is built on a **layered architecture** with version-specific 
adapters:
+
+```
+┌────────────────────────────────────────────────────┐
+│            Application Layer (Spark SQL)           │
+│    SELECT parse_json('{"a": 1}') as data           │
+└────────────────────────────────────────────────────┘
+                        │
+                        ▼
+┌────────────────────────────────────────────────────┐
+│              Spark Version Adapters                │
+│  ┌──────────────────┐  ┌────────────────────────┐  │
+│  │ BaseSpark3Adapter│  │   BaseSpark4Adapter    │  │
+│  │ (No Variant)     │  │   (Full Variant)       │  │
+│  └──────────────────┘  └────────────────────────┘  │
+└────────────────────────────────────────────────────┘
+                        │
+                        ▼
+┌────────────────────────────────────────────────────┐
+│             HoodieSchema.Variant                   │
+│     (Avro Logical Type + Record Schema)            │
+└────────────────────────────────────────────────────┘
+                        │
+                        ▼
+┌────────────────────────────────────────────────────┐
+│              Parquet Storage                       │
+│    GROUP { value: BINARY, metadata: BINARY }       │
+└────────────────────────────────────────────────────┘
+```
+
+### Variant Schema Definition
+
+The `HoodieSchema.Variant` class in `hudi-common` defines the Variant type:
+
+```java
+public static class Variant extends HoodieSchema {
+  private static final String VARIANT_METADATA_FIELD = "metadata";
+  private static final String VARIANT_VALUE_FIELD = "value";
+  private static final String VARIANT_TYPED_VALUE_FIELD = "typed_value";
+
+  private final boolean isShredded;
+  private final Option<HoodieSchema> typedValueSchema;
+}
+```
+
+#### Two Storage Modes
+
+1. **Unshredded Variant** (Default):
+    - Created with: `HoodieSchema.createVariant()`
+    - Structure: Record with two REQUIRED binary fields
+    - Fields: `metadata` (BYTES, REQUIRED), `value` (BYTES, REQUIRED)
+    - Use case: Simple semi-structured data storage
+
+2. **Shredded Variant**:
+    - Created with: `HoodieSchema.createVariantShredded(typedValueSchema)`
+    - Structure: Record with `typed_value` group containing extracted typed 
columns
+    - Fields: `value` (BYTES, OPTIONAL), `metadata` (BYTES, REQUIRED), 
`typed_value` (GROUP, OPTIONAL)
+    - Use case: Frequently accessed fields are extracted into native typed 
columns for columnar reads
+
+#### How a Reader Distinguishes the Two Modes
+
+Both modes use the same `VARIANT` logical type annotation in the Parquet 
footer — the annotation does not change.
+The reader inspects the Parquet file schema to determine which mode was used:
+
+- If the Variant group contains only `metadata` and `value`, it is 
**unshredded**.
+- If the Variant group also contains a `typed_value` child group, it is 
**shredded**.
+
+In the shredded case, `value` becomes OPTIONAL because when all fields are 
fully shredded, the binary blob
+may be `null` — the content is entirely represented in the typed columns. The 
reader falls back to `value`
+only for fields that were not shredded.
+
+#### Custom Avro Logical Type
+
+Variant uses a custom Avro logical type for identification:
+
+```java
+public static class VariantLogicalType extends LogicalType {
+  private static final String VARIANT_LOGICAL_TYPE_NAME = "variant";
+}
+```
+
+### On-Disk Representation (Parquet)
+
+Hudi's on-disk Variant representation intentionally aligns with the
+[Parquet Variant 
spec](https://github.com/apache/parquet-format/blob/master/VariantEncoding.md).
+The spec defines a Variant as a group annotated with the `VARIANT` logical 
type. Hudi writes Variant columns using
+this exact layout in both modes:
+
+#### Unshredded
+
+```
+optional group variant_column (VARIANT) {
+  required binary metadata;
+  required binary value;
+}
+```
+
+The entire JSON value is encoded into the `value` blob. Both fields are 
REQUIRED — every row carries the full binary
+representation.
+
+#### Shredded
+
+```
+optional group variant_column (VARIANT) {
+  required binary metadata;
+  optional binary value;
+  optional group typed_value {
+    optional group a {
+      optional binary value;
+      optional int32 typed_value;
+    }
+    optional group b {
+      optional binary value;
+      optional binary typed_value (STRING);
+    }
+    optional group c {
+      optional binary value;
+      optional int64 typed_value;
+    }
+  }
+}
+```
+
+Each child under `typed_value` corresponds to a shredded field. The child's 
`typed_value` column holds the extracted
+native-typed value (e.g., `int32`, `string`, `int64`), while the child's 
`value` column is a fallback binary blob for
+rows where the field's type does not match the shredded type. The top-level 
`value` becomes OPTIONAL — it is `null`
+when all fields in the row are fully covered by shredded columns, and 
populated only for fields that were not shredded.
+
+**How a reader uses this**: The Parquet file schema in the footer tells the 
reader which mode was used. If the Variant
+group contains only `metadata` and `value`, it is unshredded. If a 
`typed_value` child group is present, the reader
+knows which fields have been shredded and can read them as native typed 
columns — enabling column pruning, predicate
+pushdown, and data skipping without touching the binary blob.
+
+The `VARIANT` annotation is what allows readers to recognize the column as a 
Variant and expose semantic operations
+(e.g., `variant_get`, JSON path access). Without it, the group is 
indistinguishable from an ordinary
+`Struct<metadata: Binary, value: Binary>`.
+
+**Alignment with the Parquet spec**: Hudi does not diverge from the Parquet 
Variant spec. The annotation, field names,
+field types, and repetition levels all follow the spec exactly. This means any 
Parquet reader that implements the
+Variant spec can read Hudi-written Variant columns natively, and vice versa.
+
+**Reader compatibility**: Engines that do not yet implement the Parquet 
Variant spec (e.g., Spark 3.5) will ignore the
+`VARIANT` annotation and fall back to reading the column as a plain struct of 
binary fields.
+The data remains physically accessible, but users lose Variant query functions 
and must manually parse
+the binary encoding. See [variant-appendix.md](variant-appendix.md) for 
detailed backward compatibility findings.
+
+**Migration path**: If a future Parquet spec revision changes the Variant 
physical layout, Hudi can introduce a
+table property (e.g., `hoodie.variant.parquet.format.version`) to control 
which format is written, and the reader
+can inspect the Parquet footer to determine which layout to expect. (This is 
outside the scope of this RFC for now)
+
+#### Binary Format
+
+The Variant binary encoding follows the
+[Parquet Variant Binary Encoding 
spec](https://github.com/apache/parquet-format/blob/master/VariantEncoding.md):
+
+| Component    | Description                                                   
      |
+|--------------|---------------------------------------------------------------------|
+| **metadata** | Dictionary of field names and type information for efficient 
access |
+| **value**    | Binary encoding of the actual data (scalars, objects, arrays) 
      |
+
+Example for `{"updated": true, "new_field": 123}`:
+
+```
+Metadata Bytes: [0x01, 0x02, 0x00, 0x07, 0x10, "updated", "new_field"]
+Value Bytes:   [0x02, 0x02, 0x01, 0x00, 0x01, 0x00, 0x03, 0x04, 0x0C, 0x7B]
+```
+
+The metadata contains a dictionary of all field names, while the value 
contains references to these fields plus the
+actual data values.
+
+### Schema Evolution Support
+
+Variant types provide **schema-on-read** flexibility:
+
+| Aspect                   | Behavior                                          
                |
+|--------------------------|-------------------------------------------------------------------|
+| Adding new fields        | ✅ Supported - New JSON fields can be added 
without schema changes |
+| Removing fields          | ✅ Supported - Missing fields return null on read  
                |
+| Type changes within JSON | ✅ Supported - Variant can store any 
JSON-compatible type          |
+| Table schema evolution   | ✅ Supported - Variant column can be added to 
existing tables      |
+| Hudi schema evolution    | ✅ Supported - Works with Hudi's standard schema 
evolution         |
+
+**Important**: The schema flexibility is within the Variant column itself. The 
table-level schema (including the Variant
+column definition) still follows Hudi's standard schema evolution rules.
+
+### Column Statistics and Indexing
+
+Variant statistics and indexing capabilities depend on whether a field is 
accessed from the unshredded binary blob
+or from a shredded (extracted) typed column. With shredding, extracted fields 
become regular typed Parquet columns
+that automatically leverage all existing Hudi metadata infrastructure.
+
+#### Unshredded Variant Column
+
+| Feature            | Status | Notes                                          
                    |
+|--------------------|--------|--------------------------------------------------------------------|
+| Value/null counts  | ✅      | Standard column-level counts via MDT 
`column_stats` partition      |
+| Min/max bounds     | ❌      | Binary blob has no meaningful sort order       
                    |
+| Data skipping      | ❌      | Requires comparable min/max bounds             
                    |
+| Bloom filter       | ❌      | Bloom filters are used for record key lookups 
only                 |
+| Partition stats    | ❌      | Variant columns are not used as partition keys 
                    |
+| Predicate pushdown | ⚠️     | Structural predicates only (`IS NULL`, `IS NOT 
NULL`)              |
+
+#### Shredded Variant Fields
+
+| Feature            | Status | Notes                                          
                                                                        |
+|--------------------|--------|------------------------------------------------------------------------------------------------------------------------|
+| Min/max bounds     | ✅      | Shredded columns are regular typed columns; 
`HoodieTableMetadataUtil.isColumnTypeSupported()` accepts all primitives   |
+| Data skipping      | ✅      | `DataSkippingUtils` translates filters on 
shredded fields to min/max range checks                                      |
+| Expression index   | ✅      | Index over `variant_get()` expressions enables 
stats collection without full shredding                                 |
+| Bloom filter       | ❌      | Bloom filters are used for record key lookups 
only                                                                     |

Review Comment:
   🤖 Could you elaborate on how Variant columns interact with Hudi's record 
merging in MOR tables? The Avro serialization section shows the schema, but 
doesn't discuss merge semantics — e.g., when a log record updates only some 
fields inside the Variant, is the entire binary blob replaced, or is there 
field-level merging? This seems important for users to understand the write 
amplification trade-offs.



##########
rfc/rfc-99/rfc-99.md:
##########
@@ -194,5 +195,444 @@ SQL Extensions needs to be added to define the table in a 
hudi type native way.
 
 TODO: There is an open question regarding the need to maintain type ids to 
track schema evolution and how it would interplay with NBCC. 
 
+---
+
+## Variant Type Implementation
+
+This section documents the implementation of the VARIANT type in Hudi, which 
provides first-class support for
+semi-structured data (e.g., JSON). The Variant type is implemented following 
Spark 4.0's native VariantType
+specification.
+
+### Overview
+
+The Variant type enables Hudi to store and query semi-structured data 
efficiently. It is particularly useful for:
+
+- Schema-on-read flexibility for evolving data structures
+- Storing JSON-like data without requiring predefined schemas
+
+### Motivation
+
+Hudi readers and writers should be able to handle datasets with variant types.
+This will allow users to work with semi-structured data more easily.
+
+The variant type is now formally defined in Parquet and engines like Spark 
have full support for this type.
+Users with semi-structured data are otherwise forced to use strings or byte 
arrays to store this data.
+
+### What is the VARIANT Type?
+
+The `VARIANT` type is a new data type designed to store semi-structured data 
(like JSON) efficiently.
+Unlike storing JSON as a plain string, `VARIANT` uses an optimized binary 
encoding that allows for fast navigation
+and element extraction without needing to parse the entire document.
+It offers the flexibility of a schema-less design (like JSON) with performance 
closer to structured columns.
+
+### Storage Modes: Shredded vs. Unshredded
+
+- **Unshredded** (Binary Blob):
+    - The entire JSON structure is encoded into binary metadata and value 
blobs.
+    - **Pros**: Fast write speed; handles completely dynamic/random schemas 
easily.
+    - **Cons**: To read a single field (e.g., `user.id`), the engine must load 
the entire binary blob.
+- **Shredded** (Columnar Optimization):
+    - The engine identifies common paths in the data (e.g., `v.a` or `v.c`) 
and extracts them into separate, native
+      Parquet columns (e.g., Int32, Decimal).
+    - **Pros**: Massive performance gain for queries. If you query `SELECT 
v:a`, the engine reads only the specific
+      Int32 column and skips the rest of the binary data (Columnar Pruning).
+    - **Cons**: Higher write overhead to analyze and split the data, prone to 
read jitter if there are large variation 
+    - in shredding output.
+
+### Architecture
+
+Variant support is built on a **layered architecture** with version-specific 
adapters:
+
+```
+┌────────────────────────────────────────────────────┐
+│            Application Layer (Spark SQL)           │
+│    SELECT parse_json('{"a": 1}') as data           │
+└────────────────────────────────────────────────────┘
+                        │
+                        ▼
+┌────────────────────────────────────────────────────┐
+│              Spark Version Adapters                │
+│  ┌──────────────────┐  ┌────────────────────────┐  │
+│  │ BaseSpark3Adapter│  │   BaseSpark4Adapter    │  │
+│  │ (No Variant)     │  │   (Full Variant)       │  │
+│  └──────────────────┘  └────────────────────────┘  │
+└────────────────────────────────────────────────────┘
+                        │
+                        ▼
+┌────────────────────────────────────────────────────┐
+│             HoodieSchema.Variant                   │
+│     (Avro Logical Type + Record Schema)            │
+└────────────────────────────────────────────────────┘
+                        │
+                        ▼
+┌────────────────────────────────────────────────────┐
+│              Parquet Storage                       │
+│    GROUP { value: BINARY, metadata: BINARY }       │
+└────────────────────────────────────────────────────┘
+```
+
+### Variant Schema Definition
+
+The `HoodieSchema.Variant` class in `hudi-common` defines the Variant type:
+
+```java
+public static class Variant extends HoodieSchema {
+  private static final String VARIANT_METADATA_FIELD = "metadata";
+  private static final String VARIANT_VALUE_FIELD = "value";
+  private static final String VARIANT_TYPED_VALUE_FIELD = "typed_value";
+
+  private final boolean isShredded;
+  private final Option<HoodieSchema> typedValueSchema;
+}
+```
+
+#### Two Storage Modes
+
+1. **Unshredded Variant** (Default):
+    - Created with: `HoodieSchema.createVariant()`
+    - Structure: Record with two REQUIRED binary fields
+    - Fields: `metadata` (BYTES, REQUIRED), `value` (BYTES, REQUIRED)
+    - Use case: Simple semi-structured data storage
+
+2. **Shredded Variant**:
+    - Created with: `HoodieSchema.createVariantShredded(typedValueSchema)`
+    - Structure: Record with `typed_value` group containing extracted typed 
columns
+    - Fields: `value` (BYTES, OPTIONAL), `metadata` (BYTES, REQUIRED), 
`typed_value` (GROUP, OPTIONAL)
+    - Use case: Frequently accessed fields are extracted into native typed 
columns for columnar reads
+
+#### How a Reader Distinguishes the Two Modes
+
+Both modes use the same `VARIANT` logical type annotation in the Parquet 
footer — the annotation does not change.
+The reader inspects the Parquet file schema to determine which mode was used:
+
+- If the Variant group contains only `metadata` and `value`, it is 
**unshredded**.
+- If the Variant group also contains a `typed_value` child group, it is 
**shredded**.
+
+In the shredded case, `value` becomes OPTIONAL because when all fields are 
fully shredded, the binary blob
+may be `null` — the content is entirely represented in the typed columns. The 
reader falls back to `value`
+only for fields that were not shredded.
+
+#### Custom Avro Logical Type
+
+Variant uses a custom Avro logical type for identification:
+
+```java
+public static class VariantLogicalType extends LogicalType {
+  private static final String VARIANT_LOGICAL_TYPE_NAME = "variant";
+}
+```
+
+### On-Disk Representation (Parquet)
+
+Hudi's on-disk Variant representation intentionally aligns with the
+[Parquet Variant 
spec](https://github.com/apache/parquet-format/blob/master/VariantEncoding.md).
+The spec defines a Variant as a group annotated with the `VARIANT` logical 
type. Hudi writes Variant columns using
+this exact layout in both modes:
+
+#### Unshredded
+
+```
+optional group variant_column (VARIANT) {
+  required binary metadata;
+  required binary value;
+}
+```
+
+The entire JSON value is encoded into the `value` blob. Both fields are 
REQUIRED — every row carries the full binary
+representation.
+
+#### Shredded
+
+```
+optional group variant_column (VARIANT) {
+  required binary metadata;
+  optional binary value;
+  optional group typed_value {
+    optional group a {
+      optional binary value;
+      optional int32 typed_value;
+    }
+    optional group b {
+      optional binary value;
+      optional binary typed_value (STRING);
+    }
+    optional group c {
+      optional binary value;
+      optional int64 typed_value;
+    }
+  }
+}
+```
+
+Each child under `typed_value` corresponds to a shredded field. The child's 
`typed_value` column holds the extracted
+native-typed value (e.g., `int32`, `string`, `int64`), while the child's 
`value` column is a fallback binary blob for
+rows where the field's type does not match the shredded type. The top-level 
`value` becomes OPTIONAL — it is `null`
+when all fields in the row are fully covered by shredded columns, and 
populated only for fields that were not shredded.
+
+**How a reader uses this**: The Parquet file schema in the footer tells the 
reader which mode was used. If the Variant
+group contains only `metadata` and `value`, it is unshredded. If a 
`typed_value` child group is present, the reader
+knows which fields have been shredded and can read them as native typed 
columns — enabling column pruning, predicate
+pushdown, and data skipping without touching the binary blob.
+
+The `VARIANT` annotation is what allows readers to recognize the column as a 
Variant and expose semantic operations
+(e.g., `variant_get`, JSON path access). Without it, the group is 
indistinguishable from an ordinary
+`Struct<metadata: Binary, value: Binary>`.
+
+**Alignment with the Parquet spec**: Hudi does not diverge from the Parquet 
Variant spec. The annotation, field names,
+field types, and repetition levels all follow the spec exactly. This means any 
Parquet reader that implements the
+Variant spec can read Hudi-written Variant columns natively, and vice versa.
+
+**Reader compatibility**: Engines that do not yet implement the Parquet 
Variant spec (e.g., Spark 3.5) will ignore the
+`VARIANT` annotation and fall back to reading the column as a plain struct of 
binary fields.
+The data remains physically accessible, but users lose Variant query functions 
and must manually parse
+the binary encoding. See [variant-appendix.md](variant-appendix.md) for 
detailed backward compatibility findings.
+
+**Migration path**: If a future Parquet spec revision changes the Variant 
physical layout, Hudi can introduce a
+table property (e.g., `hoodie.variant.parquet.format.version`) to control 
which format is written, and the reader
+can inspect the Parquet footer to determine which layout to expect. (This is 
outside the scope of this RFC for now)
+
+#### Binary Format
+
+The Variant binary encoding follows the
+[Parquet Variant Binary Encoding 
spec](https://github.com/apache/parquet-format/blob/master/VariantEncoding.md):
+
+| Component    | Description                                                   
      |
+|--------------|---------------------------------------------------------------------|
+| **metadata** | Dictionary of field names and type information for efficient 
access |
+| **value**    | Binary encoding of the actual data (scalars, objects, arrays) 
      |
+
+Example for `{"updated": true, "new_field": 123}`:
+
+```
+Metadata Bytes: [0x01, 0x02, 0x00, 0x07, 0x10, "updated", "new_field"]
+Value Bytes:   [0x02, 0x02, 0x01, 0x00, 0x01, 0x00, 0x03, 0x04, 0x0C, 0x7B]
+```
+
+The metadata contains a dictionary of all field names, while the value 
contains references to these fields plus the
+actual data values.
+
+### Schema Evolution Support
+
+Variant types provide **schema-on-read** flexibility:
+
+| Aspect                   | Behavior                                          
                |
+|--------------------------|-------------------------------------------------------------------|
+| Adding new fields        | ✅ Supported - New JSON fields can be added 
without schema changes |
+| Removing fields          | ✅ Supported - Missing fields return null on read  
                |
+| Type changes within JSON | ✅ Supported - Variant can store any 
JSON-compatible type          |
+| Table schema evolution   | ✅ Supported - Variant column can be added to 
existing tables      |
+| Hudi schema evolution    | ✅ Supported - Works with Hudi's standard schema 
evolution         |
+
+**Important**: The schema flexibility is within the Variant column itself. The 
table-level schema (including the Variant
+column definition) still follows Hudi's standard schema evolution rules.
+
+### Column Statistics and Indexing
+
+Variant statistics and indexing capabilities depend on whether a field is 
accessed from the unshredded binary blob
+or from a shredded (extracted) typed column. With shredding, extracted fields 
become regular typed Parquet columns
+that automatically leverage all existing Hudi metadata infrastructure.
+
+#### Unshredded Variant Column
+
+| Feature            | Status | Notes                                          
                    |
+|--------------------|--------|--------------------------------------------------------------------|
+| Value/null counts  | ✅      | Standard column-level counts via MDT 
`column_stats` partition      |
+| Min/max bounds     | ❌      | Binary blob has no meaningful sort order       
                    |
+| Data skipping      | ❌      | Requires comparable min/max bounds             
                    |
+| Bloom filter       | ❌      | Bloom filters are used for record key lookups 
only                 |
+| Partition stats    | ❌      | Variant columns are not used as partition keys 
                    |
+| Predicate pushdown | ⚠️     | Structural predicates only (`IS NULL`, `IS NOT 
NULL`)              |
+
+#### Shredded Variant Fields
+
+| Feature            | Status | Notes                                          
                                                                        |
+|--------------------|--------|------------------------------------------------------------------------------------------------------------------------|
+| Min/max bounds     | ✅      | Shredded columns are regular typed columns; 
`HoodieTableMetadataUtil.isColumnTypeSupported()` accepts all primitives   |
+| Data skipping      | ✅      | `DataSkippingUtils` translates filters on 
shredded fields to min/max range checks                                      |
+| Expression index   | ✅      | Index over `variant_get()` expressions enables 
stats collection without full shredding                                 |
+| Bloom filter       | ❌      | Bloom filters are used for record key lookups 
only                                                                     |
+| Partition stats    | ✅      | Applicable if a shredded field is used as a 
partition key                                                              |
+| Predicate pushdown | ✅      | Full pushdown support: equality, range, and 
IN-list predicates                                                         |
+| Column pruning     | ✅      | Read only the shredded columns needed; skip 
the binary blob entirely                                                   |
+
+**Recommendation**: Enable shredding for frequently accessed fields to unlock 
full MDT column stats, data skipping,
+and predicate pushdown. For lighter-weight optimization, use expression 
indexes over `variant_get()` expressions to
+collect per-field statistics without materializing shredded columns.
+
+### Usage Guide
+
+#### Spark 4.0+ (Native Support)
+
+```sql
+-- Create table with Variant column
+CREATE TABLE events
+(
+    id      STRING,
+    ts      TIMESTAMP,
+    payload VARIANT
+) USING hudi
+OPTIONS (
+    primaryKey = 'id',
+    preCombineField = 'ts'
+);
+
+-- Insert with parse_json
+INSERT INTO events
+VALUES ('1', current_timestamp(), parse_json('{"event": "click", "page": 
"/home"}')),
+       ('2', current_timestamp(), parse_json('{"event": "purchase", "amount": 
99.99}'));
+
+-- Query Variant data
+SELECT id, payload:event, payload:amount
+FROM events;
+
+-- Update Variant column
+UPDATE events
+SET payload = parse_json('{"event": "click", "page": "/products"}')
+WHERE id = '1';
+
+-- Works with both COW and MOR tables
+CREATE TABLE events_mor
+(
+    id      STRING,
+    ts      TIMESTAMP,
+    payload VARIANT
+) USING hudi
+TBLPROPERTIES (
+    'type' = 'mor',
+    'primaryKey' = 'id',
+    'preCombineField' = 'ts'
+);
+```
+
+#### Spark 3.x (Backward Compatibility Read)
+
+Spark 3.x does not support VariantType natively, but can read Variant tables 
as struct:
+
+```sql
+-- Reading Spark 4.0 Variant table in Spark 3.x
+-- Variant column appears as: STRUCT<value: BINARY, metadata: BINARY>
+
+SELECT id, cast(payload.value as string)
+FROM events;
+```
+
+**Limitations in Spark 3.x**:
+
+- Cannot write Variant data
+- Variant column reads as raw struct with binary fields
+- No helper functions like `parse_json()` available
+
+### Cross-Engine Compatibility
+
+Engines that do not support the Variant type can still read all non-Variant 
columns in the table. The Variant column's
+on-disk layout is a plain Parquet struct (`value BINARY, metadata BINARY`), so 
engines that lack Variant-aware logic
+can either project it as a struct of binary fields or simply exclude it from 
their column projection. This means adding
+a Variant column to a table does not break reads from older or third-party 
engines — they continue to access every
+other column as before.
+
+#### Flink Integration
+
+| Operation                            | Support                            |
+|--------------------------------------|------------------------------------|
+| Reading Spark-written Variant tables | ✅ Supported                        |
+| Variant representation in Flink      | `ROW<value BYTES, metadata BYTES>` |
+| Writing Variant from Flink           | ❌ Not yet implemented              |
+
+Example Flink query reading Variant data:
+
+```sql
+-- Flink sees Variant as ROW type
+SELECT id, variant_col.value, variant_col.metadata
+FROM hudi_variant_table;
+```
+
+#### Avro Serialization (MOR Tables)
+
+For MOR tables, Variant data is serialized to Avro for log files:
+
+```
+{
+  "type": "record",
+  "logicalType": "variant",
+  "fields": [
+    {"name": "value", "type": "bytes"},
+    {"name": "metadata", "type": "bytes"}
+  ]
+}
+```
+
+### Backward Compatibility
+
+The implementation ensures backward compatibility through:
+
+1. **Storage Format**: On-disk layout is a plain Parquet struct (`value 
BINARY, metadata BINARY`), readable by any Parquet-compatible engine
+2. **Logical Type Annotation**: Avro logical type allows newer versions to 
recognize Variant semantics
+3. **Graceful Degradation**: Older readers see Variant as `STRUCT<value: 
BINARY, metadata: BINARY>`
+
+| Scenario                          | Behavior                  |
+|-----------------------------------|---------------------------|
+| Spark 4.0 writes, Spark 4.0 reads | Full Variant support      |
+| Spark 4.0 writes, Spark 3.x reads | Struct with binary fields |
+| Spark 4.0 writes, Flink reads     | ROW with binary fields    |
+| Spark 3.x writes Variant          | ❌ Not supported           |
+
+### Limitations and Constraints
+
+1. **Spark Version Dependency**:
+    - Write support requires Spark 4.0+
+    - Spark 3.x limited to read-only access with degraded experience
+
+2. **Storage Overhead**:
+    - Metadata stored redundantly per row
+    - No column-level compression optimizations for Variant content
+
+3. **Query Performance**:
+    - Predicate pushdown on unshredded Variant content is limited to 
structural predicates (`IS NULL`, `IS NOT NULL`)
+    - Accessing fields from the unshredded blob requires loading the full 
binary value
+    - Shredded fields and expression indexes enable full predicate pushdown, 
data skipping, and column pruning
+
+4. **Shredded Variant**:
+    - `typed_value` field is defined for extracted typed columns within the 
Variant group
+    - Shredded fields unlock full MDT column stats and data skipping via 
`DataSkippingUtils`
+
+5. **Functions**:
+    - `parse_json()` - Spark 4.0+ only
+    - JSON path access (`payload:field`) - Spark 4.0+ only
+    - No UDFs for Variant manipulation in Spark 3.x
+
+### Key Implementation Files
+
+| File                                                      | Description      
                                |
+|-----------------------------------------------------------|--------------------------------------------------|
+| `hudi-common/.../HoodieSchema.java`                       | Core Variant 
schema definition with logical type |
+| `hudi-spark3-common/.../BaseSpark3Adapter.scala`          | Spark 3 adapter 
(no Variant support)             |
+| `hudi-spark4-common/.../BaseSpark4Adapter.scala`          | Spark 4 adapter 
with full Variant API            |
+| `hudi-spark-client/.../HoodieRowParquetWriteSupport.java` | Variant Parquet 
writing                          |
+| `hudi-spark-client/.../HoodieSparkSchemaConverters.scala` | Schema 
conversion for Variant                    |
+| `hudi-spark4.0.x/.../AvroSerializer.scala`                | Spark to Avro 
Variant conversion                 |
+| `hudi-spark4.0.x/.../AvroDeserializer.scala`              | Avro to Spark 
Variant conversion                 |
+
+### Test Coverage
+
+| Test                                      | Description                      
                    |
+|-------------------------------------------|------------------------------------------------------|

Review Comment:
   🤖 The RFC doesn't have an explicit "Alternatives Considered" section. It 
would strengthen the proposal to briefly discuss why a custom Avro logical type 
was chosen over alternatives (e.g., reusing an existing union type, or a raw 
bytes approach with metadata in table properties). Even a short paragraph would 
help future readers understand the design rationale.



##########
rfc/rfc-99/rfc-99.md:
##########
@@ -194,5 +195,444 @@ SQL Extensions needs to be added to define the table in a 
hudi type native way.
 
 TODO: There is an open question regarding the need to maintain type ids to 
track schema evolution and how it would interplay with NBCC. 
 
+---
+
+## Variant Type Implementation
+
+This section documents the implementation of the VARIANT type in Hudi, which 
provides first-class support for
+semi-structured data (e.g., JSON). The Variant type is implemented following 
Spark 4.0's native VariantType
+specification.
+
+### Overview
+
+The Variant type enables Hudi to store and query semi-structured data 
efficiently. It is particularly useful for:
+
+- Schema-on-read flexibility for evolving data structures
+- Storing JSON-like data without requiring predefined schemas
+
+### Motivation
+
+Hudi readers and writers should be able to handle datasets with variant types.
+This will allow users to work with semi-structured data more easily.
+
+The variant type is now formally defined in Parquet and engines like Spark 
have full support for this type.
+Users with semi-structured data are otherwise forced to use strings or byte 
arrays to store this data.
+
+### What is the VARIANT Type?
+
+The `VARIANT` type is a new data type designed to store semi-structured data 
(like JSON) efficiently.
+Unlike storing JSON as a plain string, `VARIANT` uses an optimized binary 
encoding that allows for fast navigation
+and element extraction without needing to parse the entire document.
+It offers the flexibility of a schema-less design (like JSON) with performance 
closer to structured columns.
+
+### Storage Modes: Shredded vs. Unshredded
+
+- **Unshredded** (Binary Blob):
+    - The entire JSON structure is encoded into binary metadata and value 
blobs.
+    - **Pros**: Fast write speed; handles completely dynamic/random schemas 
easily.
+    - **Cons**: To read a single field (e.g., `user.id`), the engine must load 
the entire binary blob.
+- **Shredded** (Columnar Optimization):
+    - The engine identifies common paths in the data (e.g., `v.a` or `v.c`) 
and extracts them into separate, native
+      Parquet columns (e.g., Int32, Decimal).
+    - **Pros**: Massive performance gain for queries. If you query `SELECT 
v:a`, the engine reads only the specific
+      Int32 column and skips the rest of the binary data (Columnar Pruning).
+    - **Cons**: Higher write overhead to analyze and split the data, prone to 
read jitter if there are large variation 
+    - in shredding output.
+
+### Architecture
+
+Variant support is built on a **layered architecture** with version-specific 
adapters:
+
+```
+┌────────────────────────────────────────────────────┐
+│            Application Layer (Spark SQL)           │
+│    SELECT parse_json('{"a": 1}') as data           │
+└────────────────────────────────────────────────────┘
+                        │
+                        ▼
+┌────────────────────────────────────────────────────┐
+│              Spark Version Adapters                │
+│  ┌──────────────────┐  ┌────────────────────────┐  │
+│  │ BaseSpark3Adapter│  │   BaseSpark4Adapter    │  │
+│  │ (No Variant)     │  │   (Full Variant)       │  │
+│  └──────────────────┘  └────────────────────────┘  │
+└────────────────────────────────────────────────────┘
+                        │
+                        ▼
+┌────────────────────────────────────────────────────┐
+│             HoodieSchema.Variant                   │
+│     (Avro Logical Type + Record Schema)            │
+└────────────────────────────────────────────────────┘
+                        │
+                        ▼
+┌────────────────────────────────────────────────────┐
+│              Parquet Storage                       │
+│    GROUP { value: BINARY, metadata: BINARY }       │
+└────────────────────────────────────────────────────┘
+```
+
+### Variant Schema Definition
+
+The `HoodieSchema.Variant` class in `hudi-common` defines the Variant type:
+
+```java
+public static class Variant extends HoodieSchema {
+  private static final String VARIANT_METADATA_FIELD = "metadata";
+  private static final String VARIANT_VALUE_FIELD = "value";
+  private static final String VARIANT_TYPED_VALUE_FIELD = "typed_value";
+
+  private final boolean isShredded;
+  private final Option<HoodieSchema> typedValueSchema;
+}
+```
+
+#### Two Storage Modes
+
+1. **Unshredded Variant** (Default):
+    - Created with: `HoodieSchema.createVariant()`
+    - Structure: Record with two REQUIRED binary fields
+    - Fields: `metadata` (BYTES, REQUIRED), `value` (BYTES, REQUIRED)
+    - Use case: Simple semi-structured data storage
+
+2. **Shredded Variant**:
+    - Created with: `HoodieSchema.createVariantShredded(typedValueSchema)`
+    - Structure: Record with `typed_value` group containing extracted typed 
columns
+    - Fields: `value` (BYTES, OPTIONAL), `metadata` (BYTES, REQUIRED), 
`typed_value` (GROUP, OPTIONAL)
+    - Use case: Frequently accessed fields are extracted into native typed 
columns for columnar reads
+
+#### How a Reader Distinguishes the Two Modes
+
+Both modes use the same `VARIANT` logical type annotation in the Parquet 
footer — the annotation does not change.
+The reader inspects the Parquet file schema to determine which mode was used:
+
+- If the Variant group contains only `metadata` and `value`, it is 
**unshredded**.
+- If the Variant group also contains a `typed_value` child group, it is 
**shredded**.
+
+In the shredded case, `value` becomes OPTIONAL because when all fields are 
fully shredded, the binary blob
+may be `null` — the content is entirely represented in the typed columns. The 
reader falls back to `value`
+only for fields that were not shredded.
+
+#### Custom Avro Logical Type
+
+Variant uses a custom Avro logical type for identification:
+
+```java
+public static class VariantLogicalType extends LogicalType {
+  private static final String VARIANT_LOGICAL_TYPE_NAME = "variant";
+}
+```
+
+### On-Disk Representation (Parquet)
+
+Hudi's on-disk Variant representation intentionally aligns with the
+[Parquet Variant 
spec](https://github.com/apache/parquet-format/blob/master/VariantEncoding.md).
+The spec defines a Variant as a group annotated with the `VARIANT` logical 
type. Hudi writes Variant columns using
+this exact layout in both modes:
+
+#### Unshredded
+
+```
+optional group variant_column (VARIANT) {
+  required binary metadata;
+  required binary value;
+}
+```
+
+The entire JSON value is encoded into the `value` blob. Both fields are 
REQUIRED — every row carries the full binary
+representation.
+
+#### Shredded
+
+```
+optional group variant_column (VARIANT) {
+  required binary metadata;
+  optional binary value;
+  optional group typed_value {
+    optional group a {
+      optional binary value;
+      optional int32 typed_value;
+    }
+    optional group b {
+      optional binary value;
+      optional binary typed_value (STRING);
+    }
+    optional group c {
+      optional binary value;
+      optional int64 typed_value;
+    }
+  }
+}
+```
+
+Each child under `typed_value` corresponds to a shredded field. The child's 
`typed_value` column holds the extracted
+native-typed value (e.g., `int32`, `string`, `int64`), while the child's 
`value` column is a fallback binary blob for
+rows where the field's type does not match the shredded type. The top-level 
`value` becomes OPTIONAL — it is `null`
+when all fields in the row are fully covered by shredded columns, and 
populated only for fields that were not shredded.
+
+**How a reader uses this**: The Parquet file schema in the footer tells the 
reader which mode was used. If the Variant
+group contains only `metadata` and `value`, it is unshredded. If a 
`typed_value` child group is present, the reader
+knows which fields have been shredded and can read them as native typed 
columns — enabling column pruning, predicate
+pushdown, and data skipping without touching the binary blob.
+
+The `VARIANT` annotation is what allows readers to recognize the column as a 
Variant and expose semantic operations
+(e.g., `variant_get`, JSON path access). Without it, the group is 
indistinguishable from an ordinary
+`Struct<metadata: Binary, value: Binary>`.
+
+**Alignment with the Parquet spec**: Hudi does not diverge from the Parquet 
Variant spec. The annotation, field names,
+field types, and repetition levels all follow the spec exactly. This means any 
Parquet reader that implements the
+Variant spec can read Hudi-written Variant columns natively, and vice versa.
+
+**Reader compatibility**: Engines that do not yet implement the Parquet 
Variant spec (e.g., Spark 3.5) will ignore the
+`VARIANT` annotation and fall back to reading the column as a plain struct of 
binary fields.
+The data remains physically accessible, but users lose Variant query functions 
and must manually parse
+the binary encoding. See [variant-appendix.md](variant-appendix.md) for 
detailed backward compatibility findings.
+
+**Migration path**: If a future Parquet spec revision changes the Variant 
physical layout, Hudi can introduce a
+table property (e.g., `hoodie.variant.parquet.format.version`) to control 
which format is written, and the reader
+can inspect the Parquet footer to determine which layout to expect. (This is 
outside the scope of this RFC for now)
+
+#### Binary Format
+
+The Variant binary encoding follows the
+[Parquet Variant Binary Encoding 
spec](https://github.com/apache/parquet-format/blob/master/VariantEncoding.md):
+
+| Component    | Description                                                   
      |
+|--------------|---------------------------------------------------------------------|
+| **metadata** | Dictionary of field names and type information for efficient 
access |
+| **value**    | Binary encoding of the actual data (scalars, objects, arrays) 
      |
+
+Example for `{"updated": true, "new_field": 123}`:
+
+```
+Metadata Bytes: [0x01, 0x02, 0x00, 0x07, 0x10, "updated", "new_field"]
+Value Bytes:   [0x02, 0x02, 0x01, 0x00, 0x01, 0x00, 0x03, 0x04, 0x0C, 0x7B]
+```
+
+The metadata contains a dictionary of all field names, while the value 
contains references to these fields plus the
+actual data values.
+
+### Schema Evolution Support
+
+Variant types provide **schema-on-read** flexibility:
+
+| Aspect                   | Behavior                                          
                |
+|--------------------------|-------------------------------------------------------------------|
+| Adding new fields        | ✅ Supported - New JSON fields can be added 
without schema changes |
+| Removing fields          | ✅ Supported - Missing fields return null on read  
                |
+| Type changes within JSON | ✅ Supported - Variant can store any 
JSON-compatible type          |
+| Table schema evolution   | ✅ Supported - Variant column can be added to 
existing tables      |
+| Hudi schema evolution    | ✅ Supported - Works with Hudi's standard schema 
evolution         |
+
+**Important**: The schema flexibility is within the Variant column itself. The 
table-level schema (including the Variant
+column definition) still follows Hudi's standard schema evolution rules.
+
+### Column Statistics and Indexing
+
+Variant statistics and indexing capabilities depend on whether a field is 
accessed from the unshredded binary blob
+or from a shredded (extracted) typed column. With shredding, extracted fields 
become regular typed Parquet columns
+that automatically leverage all existing Hudi metadata infrastructure.
+
+#### Unshredded Variant Column
+
+| Feature            | Status | Notes                                          
                    |
+|--------------------|--------|--------------------------------------------------------------------|
+| Value/null counts  | ✅      | Standard column-level counts via MDT 
`column_stats` partition      |
+| Min/max bounds     | ❌      | Binary blob has no meaningful sort order       
                    |
+| Data skipping      | ❌      | Requires comparable min/max bounds             
                    |
+| Bloom filter       | ❌      | Bloom filters are used for record key lookups 
only                 |
+| Partition stats    | ❌      | Variant columns are not used as partition keys 
                    |
+| Predicate pushdown | ⚠️     | Structural predicates only (`IS NULL`, `IS NOT 
NULL`)              |
+
+#### Shredded Variant Fields
+
+| Feature            | Status | Notes                                          
                                                                        |
+|--------------------|--------|------------------------------------------------------------------------------------------------------------------------|
+| Min/max bounds     | ✅      | Shredded columns are regular typed columns; 
`HoodieTableMetadataUtil.isColumnTypeSupported()` accepts all primitives   |
+| Data skipping      | ✅      | `DataSkippingUtils` translates filters on 
shredded fields to min/max range checks                                      |
+| Expression index   | ✅      | Index over `variant_get()` expressions enables 
stats collection without full shredding                                 |
+| Bloom filter       | ❌      | Bloom filters are used for record key lookups 
only                                                                     |
+| Partition stats    | ✅      | Applicable if a shredded field is used as a 
partition key                                                              |
+| Predicate pushdown | ✅      | Full pushdown support: equality, range, and 
IN-list predicates                                                         |
+| Column pruning     | ✅      | Read only the shredded columns needed; skip 
the binary blob entirely                                                   |
+
+**Recommendation**: Enable shredding for frequently accessed fields to unlock 
full MDT column stats, data skipping,
+and predicate pushdown. For lighter-weight optimization, use expression 
indexes over `variant_get()` expressions to
+collect per-field statistics without materializing shredded columns.
+
+### Usage Guide
+
+#### Spark 4.0+ (Native Support)
+
+```sql
+-- Create table with Variant column
+CREATE TABLE events
+(
+    id      STRING,
+    ts      TIMESTAMP,
+    payload VARIANT
+) USING hudi
+OPTIONS (
+    primaryKey = 'id',
+    preCombineField = 'ts'
+);
+
+-- Insert with parse_json
+INSERT INTO events
+VALUES ('1', current_timestamp(), parse_json('{"event": "click", "page": 
"/home"}')),
+       ('2', current_timestamp(), parse_json('{"event": "purchase", "amount": 
99.99}'));
+
+-- Query Variant data
+SELECT id, payload:event, payload:amount
+FROM events;
+
+-- Update Variant column
+UPDATE events
+SET payload = parse_json('{"event": "click", "page": "/products"}')
+WHERE id = '1';
+
+-- Works with both COW and MOR tables
+CREATE TABLE events_mor
+(
+    id      STRING,
+    ts      TIMESTAMP,
+    payload VARIANT
+) USING hudi
+TBLPROPERTIES (
+    'type' = 'mor',
+    'primaryKey' = 'id',
+    'preCombineField' = 'ts'
+);
+```
+
+#### Spark 3.x (Backward Compatibility Read)
+
+Spark 3.x does not support VariantType natively, but can read Variant tables 
as struct:
+
+```sql
+-- Reading Spark 4.0 Variant table in Spark 3.x
+-- Variant column appears as: STRUCT<value: BINARY, metadata: BINARY>
+
+SELECT id, cast(payload.value as string)
+FROM events;
+```
+
+**Limitations in Spark 3.x**:
+
+- Cannot write Variant data
+- Variant column reads as raw struct with binary fields
+- No helper functions like `parse_json()` available
+
+### Cross-Engine Compatibility
+
+Engines that do not support the Variant type can still read all non-Variant 
columns in the table. The Variant column's
+on-disk layout is a plain Parquet struct (`value BINARY, metadata BINARY`), so 
engines that lack Variant-aware logic
+can either project it as a struct of binary fields or simply exclude it from 
their column projection. This means adding
+a Variant column to a table does not break reads from older or third-party 
engines — they continue to access every
+other column as before.
+
+#### Flink Integration
+
+| Operation                            | Support                            |
+|--------------------------------------|------------------------------------|
+| Reading Spark-written Variant tables | ✅ Supported                        |
+| Variant representation in Flink      | `ROW<value BYTES, metadata BYTES>` |
+| Writing Variant from Flink           | ❌ Not yet implemented              |
+
+Example Flink query reading Variant data:
+
+```sql
+-- Flink sees Variant as ROW type
+SELECT id, variant_col.value, variant_col.metadata
+FROM hudi_variant_table;
+```
+
+#### Avro Serialization (MOR Tables)
+
+For MOR tables, Variant data is serialized to Avro for log files:
+
+```
+{
+  "type": "record",
+  "logicalType": "variant",
+  "fields": [
+    {"name": "value", "type": "bytes"},
+    {"name": "metadata", "type": "bytes"}
+  ]
+}
+```
+
+### Backward Compatibility
+
+The implementation ensures backward compatibility through:
+
+1. **Storage Format**: On-disk layout is a plain Parquet struct (`value 
BINARY, metadata BINARY`), readable by any Parquet-compatible engine
+2. **Logical Type Annotation**: Avro logical type allows newer versions to 
recognize Variant semantics
+3. **Graceful Degradation**: Older readers see Variant as `STRUCT<value: 
BINARY, metadata: BINARY>`
+

Review Comment:
   🤖 The "Limitations and Constraints" section mentions metadata is stored 
redundantly per row but doesn't discuss the storage overhead implications 
quantitatively. For tables with millions of rows where the Variant schema is 
stable, this could be significant. Have you considered whether Hudi could 
deduplicate metadata at the file group or row group level, or is this strictly 
a Parquet-spec constraint that Hudi can't optimize around?



##########
rfc/rfc-99/rfc-99.md:
##########
@@ -194,5 +195,444 @@ SQL Extensions needs to be added to define the table in a 
hudi type native way.
 
 TODO: There is an open question regarding the need to maintain type ids to 
track schema evolution and how it would interplay with NBCC. 
 
+---
+
+## Variant Type Implementation
+
+This section documents the implementation of the VARIANT type in Hudi, which 
provides first-class support for
+semi-structured data (e.g., JSON). The Variant type is implemented following 
Spark 4.0's native VariantType
+specification.
+
+### Overview
+
+The Variant type enables Hudi to store and query semi-structured data 
efficiently. It is particularly useful for:
+
+- Schema-on-read flexibility for evolving data structures
+- Storing JSON-like data without requiring predefined schemas
+
+### Motivation
+
+Hudi readers and writers should be able to handle datasets with variant types.
+This will allow users to work with semi-structured data more easily.
+
+The variant type is now formally defined in Parquet and engines like Spark 
have full support for this type.
+Users with semi-structured data are otherwise forced to use strings or byte 
arrays to store this data.
+
+### What is the VARIANT Type?
+
+The `VARIANT` type is a new data type designed to store semi-structured data 
(like JSON) efficiently.
+Unlike storing JSON as a plain string, `VARIANT` uses an optimized binary 
encoding that allows for fast navigation
+and element extraction without needing to parse the entire document.
+It offers the flexibility of a schema-less design (like JSON) with performance 
closer to structured columns.
+
+### Storage Modes: Shredded vs. Unshredded
+
+- **Unshredded** (Binary Blob):
+    - The entire JSON structure is encoded into binary metadata and value 
blobs.
+    - **Pros**: Fast write speed; handles completely dynamic/random schemas 
easily.
+    - **Cons**: To read a single field (e.g., `user.id`), the engine must load 
the entire binary blob.
+- **Shredded** (Columnar Optimization):
+    - The engine identifies common paths in the data (e.g., `v.a` or `v.c`) 
and extracts them into separate, native
+      Parquet columns (e.g., Int32, Decimal).
+    - **Pros**: Massive performance gain for queries. If you query `SELECT 
v:a`, the engine reads only the specific
+      Int32 column and skips the rest of the binary data (Columnar Pruning).
+    - **Cons**: Higher write overhead to analyze and split the data, prone to 
read jitter if there are large variation 
+    - in shredding output.
+
+### Architecture
+
+Variant support is built on a **layered architecture** with version-specific 
adapters:
+
+```
+┌────────────────────────────────────────────────────┐
+│            Application Layer (Spark SQL)           │
+│    SELECT parse_json('{"a": 1}') as data           │
+└────────────────────────────────────────────────────┘
+                        │
+                        ▼
+┌────────────────────────────────────────────────────┐
+│              Spark Version Adapters                │
+│  ┌──────────────────┐  ┌────────────────────────┐  │
+│  │ BaseSpark3Adapter│  │   BaseSpark4Adapter    │  │
+│  │ (No Variant)     │  │   (Full Variant)       │  │
+│  └──────────────────┘  └────────────────────────┘  │
+└────────────────────────────────────────────────────┘
+                        │
+                        ▼
+┌────────────────────────────────────────────────────┐
+│             HoodieSchema.Variant                   │
+│     (Avro Logical Type + Record Schema)            │
+└────────────────────────────────────────────────────┘
+                        │
+                        ▼
+┌────────────────────────────────────────────────────┐
+│              Parquet Storage                       │
+│    GROUP { value: BINARY, metadata: BINARY }       │
+└────────────────────────────────────────────────────┘
+```
+
+### Variant Schema Definition
+
+The `HoodieSchema.Variant` class in `hudi-common` defines the Variant type:
+
+```java
+public static class Variant extends HoodieSchema {
+  private static final String VARIANT_METADATA_FIELD = "metadata";
+  private static final String VARIANT_VALUE_FIELD = "value";
+  private static final String VARIANT_TYPED_VALUE_FIELD = "typed_value";
+
+  private final boolean isShredded;
+  private final Option<HoodieSchema> typedValueSchema;
+}
+```
+
+#### Two Storage Modes
+
+1. **Unshredded Variant** (Default):
+    - Created with: `HoodieSchema.createVariant()`
+    - Structure: Record with two REQUIRED binary fields
+    - Fields: `metadata` (BYTES, REQUIRED), `value` (BYTES, REQUIRED)
+    - Use case: Simple semi-structured data storage
+
+2. **Shredded Variant**:
+    - Created with: `HoodieSchema.createVariantShredded(typedValueSchema)`
+    - Structure: Record with `typed_value` group containing extracted typed 
columns
+    - Fields: `value` (BYTES, OPTIONAL), `metadata` (BYTES, REQUIRED), 
`typed_value` (GROUP, OPTIONAL)
+    - Use case: Frequently accessed fields are extracted into native typed 
columns for columnar reads
+
+#### How a Reader Distinguishes the Two Modes
+
+Both modes use the same `VARIANT` logical type annotation in the Parquet 
footer — the annotation does not change.
+The reader inspects the Parquet file schema to determine which mode was used:
+
+- If the Variant group contains only `metadata` and `value`, it is 
**unshredded**.
+- If the Variant group also contains a `typed_value` child group, it is 
**shredded**.
+
+In the shredded case, `value` becomes OPTIONAL because when all fields are 
fully shredded, the binary blob
+may be `null` — the content is entirely represented in the typed columns. The 
reader falls back to `value`
+only for fields that were not shredded.
+
+#### Custom Avro Logical Type
+
+Variant uses a custom Avro logical type for identification:
+
+```java
+public static class VariantLogicalType extends LogicalType {
+  private static final String VARIANT_LOGICAL_TYPE_NAME = "variant";
+}
+```
+
+### On-Disk Representation (Parquet)
+
+Hudi's on-disk Variant representation intentionally aligns with the
+[Parquet Variant 
spec](https://github.com/apache/parquet-format/blob/master/VariantEncoding.md).
+The spec defines a Variant as a group annotated with the `VARIANT` logical 
type. Hudi writes Variant columns using
+this exact layout in both modes:
+
+#### Unshredded
+
+```
+optional group variant_column (VARIANT) {
+  required binary metadata;
+  required binary value;
+}
+```
+
+The entire JSON value is encoded into the `value` blob. Both fields are 
REQUIRED — every row carries the full binary
+representation.
+
+#### Shredded
+
+```
+optional group variant_column (VARIANT) {
+  required binary metadata;
+  optional binary value;
+  optional group typed_value {
+    optional group a {
+      optional binary value;
+      optional int32 typed_value;
+    }
+    optional group b {
+      optional binary value;
+      optional binary typed_value (STRING);
+    }
+    optional group c {
+      optional binary value;
+      optional int64 typed_value;
+    }
+  }
+}
+```
+
+Each child under `typed_value` corresponds to a shredded field. The child's 
`typed_value` column holds the extracted
+native-typed value (e.g., `int32`, `string`, `int64`), while the child's 
`value` column is a fallback binary blob for
+rows where the field's type does not match the shredded type. The top-level 
`value` becomes OPTIONAL — it is `null`
+when all fields in the row are fully covered by shredded columns, and 
populated only for fields that were not shredded.
+
+**How a reader uses this**: The Parquet file schema in the footer tells the 
reader which mode was used. If the Variant
+group contains only `metadata` and `value`, it is unshredded. If a 
`typed_value` child group is present, the reader
+knows which fields have been shredded and can read them as native typed 
columns — enabling column pruning, predicate
+pushdown, and data skipping without touching the binary blob.
+
+The `VARIANT` annotation is what allows readers to recognize the column as a 
Variant and expose semantic operations
+(e.g., `variant_get`, JSON path access). Without it, the group is 
indistinguishable from an ordinary
+`Struct<metadata: Binary, value: Binary>`.
+
+**Alignment with the Parquet spec**: Hudi does not diverge from the Parquet 
Variant spec. The annotation, field names,
+field types, and repetition levels all follow the spec exactly. This means any 
Parquet reader that implements the
+Variant spec can read Hudi-written Variant columns natively, and vice versa.
+
+**Reader compatibility**: Engines that do not yet implement the Parquet 
Variant spec (e.g., Spark 3.5) will ignore the
+`VARIANT` annotation and fall back to reading the column as a plain struct of 
binary fields.
+The data remains physically accessible, but users lose Variant query functions 
and must manually parse
+the binary encoding. See [variant-appendix.md](variant-appendix.md) for 
detailed backward compatibility findings.
+
+**Migration path**: If a future Parquet spec revision changes the Variant 
physical layout, Hudi can introduce a
+table property (e.g., `hoodie.variant.parquet.format.version`) to control 
which format is written, and the reader
+can inspect the Parquet footer to determine which layout to expect. (This is 
outside the scope of this RFC for now)
+
+#### Binary Format
+
+The Variant binary encoding follows the
+[Parquet Variant Binary Encoding 
spec](https://github.com/apache/parquet-format/blob/master/VariantEncoding.md):
+
+| Component    | Description                                                   
      |
+|--------------|---------------------------------------------------------------------|
+| **metadata** | Dictionary of field names and type information for efficient 
access |
+| **value**    | Binary encoding of the actual data (scalars, objects, arrays) 
      |
+
+Example for `{"updated": true, "new_field": 123}`:
+
+```
+Metadata Bytes: [0x01, 0x02, 0x00, 0x07, 0x10, "updated", "new_field"]
+Value Bytes:   [0x02, 0x02, 0x01, 0x00, 0x01, 0x00, 0x03, 0x04, 0x0C, 0x7B]
+```
+
+The metadata contains a dictionary of all field names, while the value 
contains references to these fields plus the
+actual data values.
+
+### Schema Evolution Support
+
+Variant types provide **schema-on-read** flexibility:
+
+| Aspect                   | Behavior                                          
                |
+|--------------------------|-------------------------------------------------------------------|
+| Adding new fields        | ✅ Supported - New JSON fields can be added 
without schema changes |
+| Removing fields          | ✅ Supported - Missing fields return null on read  
                |
+| Type changes within JSON | ✅ Supported - Variant can store any 
JSON-compatible type          |
+| Table schema evolution   | ✅ Supported - Variant column can be added to 
existing tables      |
+| Hudi schema evolution    | ✅ Supported - Works with Hudi's standard schema 
evolution         |
+
+**Important**: The schema flexibility is within the Variant column itself. The 
table-level schema (including the Variant
+column definition) still follows Hudi's standard schema evolution rules.
+
+### Column Statistics and Indexing
+
+Variant statistics and indexing capabilities depend on whether a field is 
accessed from the unshredded binary blob
+or from a shredded (extracted) typed column. With shredding, extracted fields 
become regular typed Parquet columns
+that automatically leverage all existing Hudi metadata infrastructure.
+
+#### Unshredded Variant Column
+
+| Feature            | Status | Notes                                          
                    |
+|--------------------|--------|--------------------------------------------------------------------|
+| Value/null counts  | ✅      | Standard column-level counts via MDT 
`column_stats` partition      |
+| Min/max bounds     | ❌      | Binary blob has no meaningful sort order       
                    |
+| Data skipping      | ❌      | Requires comparable min/max bounds             
                    |
+| Bloom filter       | ❌      | Bloom filters are used for record key lookups 
only                 |
+| Partition stats    | ❌      | Variant columns are not used as partition keys 
                    |
+| Predicate pushdown | ⚠️     | Structural predicates only (`IS NULL`, `IS NOT 
NULL`)              |
+
+#### Shredded Variant Fields
+
+| Feature            | Status | Notes                                          
                                                                        |
+|--------------------|--------|------------------------------------------------------------------------------------------------------------------------|
+| Min/max bounds     | ✅      | Shredded columns are regular typed columns; 
`HoodieTableMetadataUtil.isColumnTypeSupported()` accepts all primitives   |
+| Data skipping      | ✅      | `DataSkippingUtils` translates filters on 
shredded fields to min/max range checks                                      |
+| Expression index   | ✅      | Index over `variant_get()` expressions enables 
stats collection without full shredding                                 |
+| Bloom filter       | ❌      | Bloom filters are used for record key lookups 
only                                                                     |
+| Partition stats    | ✅      | Applicable if a shredded field is used as a 
partition key                                                              |
+| Predicate pushdown | ✅      | Full pushdown support: equality, range, and 
IN-list predicates                                                         |
+| Column pruning     | ✅      | Read only the shredded columns needed; skip 
the binary blob entirely                                                   |
+
+**Recommendation**: Enable shredding for frequently accessed fields to unlock 
full MDT column stats, data skipping,
+and predicate pushdown. For lighter-weight optimization, use expression 
indexes over `variant_get()` expressions to
+collect per-field statistics without materializing shredded columns.
+
+### Usage Guide
+
+#### Spark 4.0+ (Native Support)
+
+```sql
+-- Create table with Variant column
+CREATE TABLE events
+(
+    id      STRING,
+    ts      TIMESTAMP,
+    payload VARIANT
+) USING hudi
+OPTIONS (
+    primaryKey = 'id',
+    preCombineField = 'ts'
+);
+
+-- Insert with parse_json
+INSERT INTO events
+VALUES ('1', current_timestamp(), parse_json('{"event": "click", "page": 
"/home"}')),
+       ('2', current_timestamp(), parse_json('{"event": "purchase", "amount": 
99.99}'));
+
+-- Query Variant data
+SELECT id, payload:event, payload:amount
+FROM events;
+
+-- Update Variant column
+UPDATE events
+SET payload = parse_json('{"event": "click", "page": "/products"}')
+WHERE id = '1';
+
+-- Works with both COW and MOR tables
+CREATE TABLE events_mor
+(
+    id      STRING,
+    ts      TIMESTAMP,
+    payload VARIANT
+) USING hudi
+TBLPROPERTIES (
+    'type' = 'mor',
+    'primaryKey' = 'id',
+    'preCombineField' = 'ts'
+);
+```
+
+#### Spark 3.x (Backward Compatibility Read)
+
+Spark 3.x does not support VariantType natively, but can read Variant tables 
as struct:
+
+```sql
+-- Reading Spark 4.0 Variant table in Spark 3.x
+-- Variant column appears as: STRUCT<value: BINARY, metadata: BINARY>
+
+SELECT id, cast(payload.value as string)
+FROM events;
+```
+
+**Limitations in Spark 3.x**:
+
+- Cannot write Variant data
+- Variant column reads as raw struct with binary fields
+- No helper functions like `parse_json()` available
+
+### Cross-Engine Compatibility
+
+Engines that do not support the Variant type can still read all non-Variant 
columns in the table. The Variant column's
+on-disk layout is a plain Parquet struct (`value BINARY, metadata BINARY`), so 
engines that lack Variant-aware logic
+can either project it as a struct of binary fields or simply exclude it from 
their column projection. This means adding
+a Variant column to a table does not break reads from older or third-party 
engines — they continue to access every
+other column as before.
+
+#### Flink Integration
+
+| Operation                            | Support                            |
+|--------------------------------------|------------------------------------|
+| Reading Spark-written Variant tables | ✅ Supported                        |
+| Variant representation in Flink      | `ROW<value BYTES, metadata BYTES>` |
+| Writing Variant from Flink           | ❌ Not yet implemented              |
+
+Example Flink query reading Variant data:
+
+```sql
+-- Flink sees Variant as ROW type
+SELECT id, variant_col.value, variant_col.metadata
+FROM hudi_variant_table;
+```
+
+#### Avro Serialization (MOR Tables)
+
+For MOR tables, Variant data is serialized to Avro for log files:
+
+```
+{
+  "type": "record",
+  "logicalType": "variant",
+  "fields": [
+    {"name": "value", "type": "bytes"},
+    {"name": "metadata", "type": "bytes"}
+  ]
+}
+```
+
+### Backward Compatibility
+
+The implementation ensures backward compatibility through:
+
+1. **Storage Format**: On-disk layout is a plain Parquet struct (`value 
BINARY, metadata BINARY`), readable by any Parquet-compatible engine
+2. **Logical Type Annotation**: Avro logical type allows newer versions to 
recognize Variant semantics
+3. **Graceful Degradation**: Older readers see Variant as `STRUCT<value: 
BINARY, metadata: BINARY>`
+
+| Scenario                          | Behavior                  |
+|-----------------------------------|---------------------------|
+| Spark 4.0 writes, Spark 4.0 reads | Full Variant support      |
+| Spark 4.0 writes, Spark 3.x reads | Struct with binary fields |
+| Spark 4.0 writes, Flink reads     | ROW with binary fields    |
+| Spark 3.x writes Variant          | ❌ Not supported           |
+
+### Limitations and Constraints
+
+1. **Spark Version Dependency**:
+    - Write support requires Spark 4.0+
+    - Spark 3.x limited to read-only access with degraded experience
+
+2. **Storage Overhead**:
+    - Metadata stored redundantly per row
+    - No column-level compression optimizations for Variant content
+
+3. **Query Performance**:
+    - Predicate pushdown on unshredded Variant content is limited to 
structural predicates (`IS NULL`, `IS NOT NULL`)
+    - Accessing fields from the unshredded blob requires loading the full 
binary value
+    - Shredded fields and expression indexes enable full predicate pushdown, 
data skipping, and column pruning
+
+4. **Shredded Variant**:
+    - `typed_value` field is defined for extracted typed columns within the 
Variant group
+    - Shredded fields unlock full MDT column stats and data skipping via 
`DataSkippingUtils`
+
+5. **Functions**:
+    - `parse_json()` - Spark 4.0+ only
+    - JSON path access (`payload:field`) - Spark 4.0+ only
+    - No UDFs for Variant manipulation in Spark 3.x
+
+### Key Implementation Files
+
+| File                                                      | Description      
                                |
+|-----------------------------------------------------------|--------------------------------------------------|
+| `hudi-common/.../HoodieSchema.java`                       | Core Variant 
schema definition with logical type |
+| `hudi-spark3-common/.../BaseSpark3Adapter.scala`          | Spark 3 adapter 
(no Variant support)             |

Review Comment:
   🤖 I notice there's no discussion of how Variant interacts with Hudi's 
clustering and compaction strategies. For shredded Variants, does clustering 
need to be aware of the shredding schema to maintain optimal column 
co-location? And during compaction of MOR tables, are there any special 
considerations for merging Variant log records?



##########
rfc/rfc-99/rfc-99.md:
##########
@@ -194,5 +195,444 @@ SQL Extensions needs to be added to define the table in a 
hudi type native way.
 
 TODO: There is an open question regarding the need to maintain type ids to 
track schema evolution and how it would interplay with NBCC. 
 
+---
+
+## Variant Type Implementation
+
+This section documents the implementation of the VARIANT type in Hudi, which 
provides first-class support for
+semi-structured data (e.g., JSON). The Variant type is implemented following 
Spark 4.0's native VariantType
+specification.
+
+### Overview
+
+The Variant type enables Hudi to store and query semi-structured data 
efficiently. It is particularly useful for:
+
+- Schema-on-read flexibility for evolving data structures
+- Storing JSON-like data without requiring predefined schemas
+
+### Motivation
+
+Hudi readers and writers should be able to handle datasets with variant types.
+This will allow users to work with semi-structured data more easily.
+
+The variant type is now formally defined in Parquet and engines like Spark 
have full support for this type.
+Users with semi-structured data are otherwise forced to use strings or byte 
arrays to store this data.
+
+### What is the VARIANT Type?
+
+The `VARIANT` type is a new data type designed to store semi-structured data 
(like JSON) efficiently.
+Unlike storing JSON as a plain string, `VARIANT` uses an optimized binary 
encoding that allows for fast navigation
+and element extraction without needing to parse the entire document.
+It offers the flexibility of a schema-less design (like JSON) with performance 
closer to structured columns.
+
+### Storage Modes: Shredded vs. Unshredded
+
+- **Unshredded** (Binary Blob):
+    - The entire JSON structure is encoded into binary metadata and value 
blobs.
+    - **Pros**: Fast write speed; handles completely dynamic/random schemas 
easily.
+    - **Cons**: To read a single field (e.g., `user.id`), the engine must load 
the entire binary blob.
+- **Shredded** (Columnar Optimization):
+    - The engine identifies common paths in the data (e.g., `v.a` or `v.c`) 
and extracts them into separate, native
+      Parquet columns (e.g., Int32, Decimal).
+    - **Pros**: Massive performance gain for queries. If you query `SELECT 
v:a`, the engine reads only the specific
+      Int32 column and skips the rest of the binary data (Columnar Pruning).
+    - **Cons**: Higher write overhead to analyze and split the data, prone to 
read jitter if there are large variation 
+    - in shredding output.
+
+### Architecture
+
+Variant support is built on a **layered architecture** with version-specific 
adapters:
+
+```
+┌────────────────────────────────────────────────────┐
+│            Application Layer (Spark SQL)           │
+│    SELECT parse_json('{"a": 1}') as data           │
+└────────────────────────────────────────────────────┘
+                        │
+                        ▼
+┌────────────────────────────────────────────────────┐
+│              Spark Version Adapters                │
+│  ┌──────────────────┐  ┌────────────────────────┐  │
+│  │ BaseSpark3Adapter│  │   BaseSpark4Adapter    │  │
+│  │ (No Variant)     │  │   (Full Variant)       │  │
+│  └──────────────────┘  └────────────────────────┘  │
+└────────────────────────────────────────────────────┘
+                        │
+                        ▼
+┌────────────────────────────────────────────────────┐
+│             HoodieSchema.Variant                   │
+│     (Avro Logical Type + Record Schema)            │
+└────────────────────────────────────────────────────┘
+                        │
+                        ▼
+┌────────────────────────────────────────────────────┐
+│              Parquet Storage                       │
+│    GROUP { value: BINARY, metadata: BINARY }       │
+└────────────────────────────────────────────────────┘
+```
+
+### Variant Schema Definition
+
+The `HoodieSchema.Variant` class in `hudi-common` defines the Variant type:
+
+```java
+public static class Variant extends HoodieSchema {
+  private static final String VARIANT_METADATA_FIELD = "metadata";
+  private static final String VARIANT_VALUE_FIELD = "value";
+  private static final String VARIANT_TYPED_VALUE_FIELD = "typed_value";
+
+  private final boolean isShredded;
+  private final Option<HoodieSchema> typedValueSchema;
+}
+```
+
+#### Two Storage Modes
+
+1. **Unshredded Variant** (Default):
+    - Created with: `HoodieSchema.createVariant()`
+    - Structure: Record with two REQUIRED binary fields
+    - Fields: `metadata` (BYTES, REQUIRED), `value` (BYTES, REQUIRED)
+    - Use case: Simple semi-structured data storage
+
+2. **Shredded Variant**:
+    - Created with: `HoodieSchema.createVariantShredded(typedValueSchema)`
+    - Structure: Record with `typed_value` group containing extracted typed 
columns
+    - Fields: `value` (BYTES, OPTIONAL), `metadata` (BYTES, REQUIRED), 
`typed_value` (GROUP, OPTIONAL)
+    - Use case: Frequently accessed fields are extracted into native typed 
columns for columnar reads
+
+#### How a Reader Distinguishes the Two Modes
+
+Both modes use the same `VARIANT` logical type annotation in the Parquet 
footer — the annotation does not change.
+The reader inspects the Parquet file schema to determine which mode was used:
+
+- If the Variant group contains only `metadata` and `value`, it is 
**unshredded**.
+- If the Variant group also contains a `typed_value` child group, it is 
**shredded**.
+
+In the shredded case, `value` becomes OPTIONAL because when all fields are 
fully shredded, the binary blob
+may be `null` — the content is entirely represented in the typed columns. The 
reader falls back to `value`
+only for fields that were not shredded.
+
+#### Custom Avro Logical Type
+
+Variant uses a custom Avro logical type for identification:
+
+```java
+public static class VariantLogicalType extends LogicalType {
+  private static final String VARIANT_LOGICAL_TYPE_NAME = "variant";
+}
+```
+
+### On-Disk Representation (Parquet)
+
+Hudi's on-disk Variant representation intentionally aligns with the
+[Parquet Variant 
spec](https://github.com/apache/parquet-format/blob/master/VariantEncoding.md).
+The spec defines a Variant as a group annotated with the `VARIANT` logical 
type. Hudi writes Variant columns using
+this exact layout in both modes:
+
+#### Unshredded
+
+```
+optional group variant_column (VARIANT) {
+  required binary metadata;
+  required binary value;
+}
+```
+
+The entire JSON value is encoded into the `value` blob. Both fields are 
REQUIRED — every row carries the full binary
+representation.
+
+#### Shredded
+
+```
+optional group variant_column (VARIANT) {
+  required binary metadata;
+  optional binary value;
+  optional group typed_value {
+    optional group a {
+      optional binary value;
+      optional int32 typed_value;
+    }
+    optional group b {
+      optional binary value;
+      optional binary typed_value (STRING);
+    }
+    optional group c {
+      optional binary value;
+      optional int64 typed_value;
+    }
+  }
+}
+```
+
+Each child under `typed_value` corresponds to a shredded field. The child's 
`typed_value` column holds the extracted
+native-typed value (e.g., `int32`, `string`, `int64`), while the child's 
`value` column is a fallback binary blob for
+rows where the field's type does not match the shredded type. The top-level 
`value` becomes OPTIONAL — it is `null`
+when all fields in the row are fully covered by shredded columns, and 
populated only for fields that were not shredded.
+
+**How a reader uses this**: The Parquet file schema in the footer tells the 
reader which mode was used. If the Variant
+group contains only `metadata` and `value`, it is unshredded. If a 
`typed_value` child group is present, the reader
+knows which fields have been shredded and can read them as native typed 
columns — enabling column pruning, predicate
+pushdown, and data skipping without touching the binary blob.
+
+The `VARIANT` annotation is what allows readers to recognize the column as a 
Variant and expose semantic operations
+(e.g., `variant_get`, JSON path access). Without it, the group is 
indistinguishable from an ordinary
+`Struct<metadata: Binary, value: Binary>`.
+
+**Alignment with the Parquet spec**: Hudi does not diverge from the Parquet 
Variant spec. The annotation, field names,
+field types, and repetition levels all follow the spec exactly. This means any 
Parquet reader that implements the
+Variant spec can read Hudi-written Variant columns natively, and vice versa.
+
+**Reader compatibility**: Engines that do not yet implement the Parquet 
Variant spec (e.g., Spark 3.5) will ignore the
+`VARIANT` annotation and fall back to reading the column as a plain struct of 
binary fields.
+The data remains physically accessible, but users lose Variant query functions 
and must manually parse
+the binary encoding. See [variant-appendix.md](variant-appendix.md) for 
detailed backward compatibility findings.
+
+**Migration path**: If a future Parquet spec revision changes the Variant 
physical layout, Hudi can introduce a
+table property (e.g., `hoodie.variant.parquet.format.version`) to control 
which format is written, and the reader
+can inspect the Parquet footer to determine which layout to expect. (This is 
outside the scope of this RFC for now)
+
+#### Binary Format
+
+The Variant binary encoding follows the
+[Parquet Variant Binary Encoding 
spec](https://github.com/apache/parquet-format/blob/master/VariantEncoding.md):
+
+| Component    | Description                                                   
      |
+|--------------|---------------------------------------------------------------------|
+| **metadata** | Dictionary of field names and type information for efficient 
access |
+| **value**    | Binary encoding of the actual data (scalars, objects, arrays) 
      |
+
+Example for `{"updated": true, "new_field": 123}`:
+
+```
+Metadata Bytes: [0x01, 0x02, 0x00, 0x07, 0x10, "updated", "new_field"]
+Value Bytes:   [0x02, 0x02, 0x01, 0x00, 0x01, 0x00, 0x03, 0x04, 0x0C, 0x7B]
+```
+
+The metadata contains a dictionary of all field names, while the value 
contains references to these fields plus the
+actual data values.
+
+### Schema Evolution Support
+
+Variant types provide **schema-on-read** flexibility:
+
+| Aspect                   | Behavior                                          
                |
+|--------------------------|-------------------------------------------------------------------|
+| Adding new fields        | ✅ Supported - New JSON fields can be added 
without schema changes |
+| Removing fields          | ✅ Supported - Missing fields return null on read  
                |
+| Type changes within JSON | ✅ Supported - Variant can store any 
JSON-compatible type          |
+| Table schema evolution   | ✅ Supported - Variant column can be added to 
existing tables      |
+| Hudi schema evolution    | ✅ Supported - Works with Hudi's standard schema 
evolution         |
+
+**Important**: The schema flexibility is within the Variant column itself. The 
table-level schema (including the Variant
+column definition) still follows Hudi's standard schema evolution rules.
+
+### Column Statistics and Indexing
+
+Variant statistics and indexing capabilities depend on whether a field is 
accessed from the unshredded binary blob
+or from a shredded (extracted) typed column. With shredding, extracted fields 
become regular typed Parquet columns
+that automatically leverage all existing Hudi metadata infrastructure.
+
+#### Unshredded Variant Column
+
+| Feature            | Status | Notes                                          
                    |
+|--------------------|--------|--------------------------------------------------------------------|
+| Value/null counts  | ✅      | Standard column-level counts via MDT 
`column_stats` partition      |
+| Min/max bounds     | ❌      | Binary blob has no meaningful sort order       
                    |
+| Data skipping      | ❌      | Requires comparable min/max bounds             
                    |
+| Bloom filter       | ❌      | Bloom filters are used for record key lookups 
only                 |
+| Partition stats    | ❌      | Variant columns are not used as partition keys 
                    |
+| Predicate pushdown | ⚠️     | Structural predicates only (`IS NULL`, `IS NOT 
NULL`)              |
+
+#### Shredded Variant Fields
+
+| Feature            | Status | Notes                                          
                                                                        |
+|--------------------|--------|------------------------------------------------------------------------------------------------------------------------|
+| Min/max bounds     | ✅      | Shredded columns are regular typed columns; 
`HoodieTableMetadataUtil.isColumnTypeSupported()` accepts all primitives   |
+| Data skipping      | ✅      | `DataSkippingUtils` translates filters on 
shredded fields to min/max range checks                                      |
+| Expression index   | ✅      | Index over `variant_get()` expressions enables 
stats collection without full shredding                                 |
+| Bloom filter       | ❌      | Bloom filters are used for record key lookups 
only                                                                     |
+| Partition stats    | ✅      | Applicable if a shredded field is used as a 
partition key                                                              |
+| Predicate pushdown | ✅      | Full pushdown support: equality, range, and 
IN-list predicates                                                         |
+| Column pruning     | ✅      | Read only the shredded columns needed; skip 
the binary blob entirely                                                   |
+
+**Recommendation**: Enable shredding for frequently accessed fields to unlock 
full MDT column stats, data skipping,
+and predicate pushdown. For lighter-weight optimization, use expression 
indexes over `variant_get()` expressions to
+collect per-field statistics without materializing shredded columns.
+
+### Usage Guide
+
+#### Spark 4.0+ (Native Support)
+
+```sql
+-- Create table with Variant column
+CREATE TABLE events
+(
+    id      STRING,
+    ts      TIMESTAMP,
+    payload VARIANT
+) USING hudi
+OPTIONS (
+    primaryKey = 'id',
+    preCombineField = 'ts'
+);
+

Review Comment:
   🤖 For Flink integration, reading is supported but writing is not. Is there a 
risk that Flink jobs reading Variant as `ROW<value BYTES, metadata BYTES>` 
could inadvertently write those rows back (e.g., in a streaming upsert 
pipeline) and corrupt the Variant semantics? Should there be a guard or 
validation to prevent this?



##########
rfc/rfc-99/rfc-99.md:
##########
@@ -209,4 +209,299 @@ SQL Extensions needs to be added to define the table in a 
hudi type native way.
 
 TODO: There is an open question regarding the need to maintain type ids to 
track schema evolution and how it would interplay with NBCC. 
 
-The main implementation change would require replacing the Avro schema 
references with the new type system. 
+The main implementation change would require replacing the Avro schema 
references with the new type system.
+
+---
+
+## Variant Type Implementation
+
+This section documents the implementation of the VARIANT type in Hudi, which 
provides first-class support for semi-structured data (e.g., JSON). The Variant 
type is implemented following Spark 4.0's native VariantType specification.
+
+### Overview
+
+The Variant type enables Hudi to store and query semi-structured data 
efficiently. It is particularly useful for:
+- Schema-on-read flexibility for evolving data structures
+- Storing JSON-like data without requiring predefined schemas
+
+### Architecture
+
+Variant support is built on a **layered architecture** with version-specific 
adapters:
+
+```
+┌────────────────────────────────────────────────────┐
+│            Application Layer (Spark SQL)           │
+│    SELECT parse_json('{"a": 1}') as data           │
+└────────────────────────────────────────────────────┘
+                        │
+                        ▼
+┌────────────────────────────────────────────────────┐
+│              Spark Version Adapters                │
+│  ┌──────────────────┐  ┌────────────────────────┐  │
+│  │ BaseSpark3Adapter│  │   BaseSpark4Adapter    │  │
+│  │ (No Variant)     │  │   (Full Variant)       │  │
+│  └──────────────────┘  └────────────────────────┘  │
+└────────────────────────────────────────────────────┘
+                        │
+                        ▼
+┌────────────────────────────────────────────────────┐
+│             HoodieSchema.Variant                   │
+│     (Avro Logical Type + Record Schema)            │
+└────────────────────────────────────────────────────┘
+                        │
+                        ▼
+┌────────────────────────────────────────────────────┐
+│              Parquet Storage                       │
+│    GROUP { value: BINARY, metadata: BINARY }       │
+└────────────────────────────────────────────────────┘
+```
+
+### Variant Schema Definition
+
+The `HoodieSchema.Variant` class in `hudi-common` defines the Variant type:
+
+```java
+public static class Variant extends HoodieSchema {
+    private static final String VARIANT_METADATA_FIELD = "metadata";
+    private static final String VARIANT_VALUE_FIELD = "value";
+    private static final String VARIANT_TYPED_VALUE_FIELD = "typed_value";
+
+    private final boolean isShredded;

Review Comment:
   🤖 Good question. Based on the RFC text, readers distinguish shredded vs 
unshredded by inspecting the per-file Parquet schema (presence of `typed_value` 
child group). Since each file is self-describing, a table could in theory 
contain a mix of shredded and unshredded files without breaking reads. However, 
the table-level Avro schema (used for MOR log files and schema evolution 
tracking) would need to change, so it's worth calling out explicitly whether 
Hudi treats this as a compatible or incompatible schema change — and what the 
migration path looks like.



##########
rfc/rfc-99/rfc-99.md:
##########
@@ -194,5 +195,444 @@ SQL Extensions needs to be added to define the table in a 
hudi type native way.
 
 TODO: There is an open question regarding the need to maintain type ids to 
track schema evolution and how it would interplay with NBCC. 
 
+---
+
+## Variant Type Implementation
+
+This section documents the implementation of the VARIANT type in Hudi, which 
provides first-class support for
+semi-structured data (e.g., JSON). The Variant type is implemented following 
Spark 4.0's native VariantType
+specification.
+
+### Overview
+
+The Variant type enables Hudi to store and query semi-structured data 
efficiently. It is particularly useful for:
+
+- Schema-on-read flexibility for evolving data structures
+- Storing JSON-like data without requiring predefined schemas
+
+### Motivation
+
+Hudi readers and writers should be able to handle datasets with variant types.
+This will allow users to work with semi-structured data more easily.
+
+The variant type is now formally defined in Parquet and engines like Spark 
have full support for this type.
+Users with semi-structured data are otherwise forced to use strings or byte 
arrays to store this data.
+
+### What is the VARIANT Type?
+
+The `VARIANT` type is a new data type designed to store semi-structured data 
(like JSON) efficiently.
+Unlike storing JSON as a plain string, `VARIANT` uses an optimized binary 
encoding that allows for fast navigation
+and element extraction without needing to parse the entire document.
+It offers the flexibility of a schema-less design (like JSON) with performance 
closer to structured columns.
+
+### Storage Modes: Shredded vs. Unshredded
+
+- **Unshredded** (Binary Blob):
+    - The entire JSON structure is encoded into binary metadata and value 
blobs.
+    - **Pros**: Fast write speed; handles completely dynamic/random schemas 
easily.
+    - **Cons**: To read a single field (e.g., `user.id`), the engine must load 
the entire binary blob.
+- **Shredded** (Columnar Optimization):
+    - The engine identifies common paths in the data (e.g., `v.a` or `v.c`) 
and extracts them into separate, native
+      Parquet columns (e.g., Int32, Decimal).
+    - **Pros**: Massive performance gain for queries. If you query `SELECT 
v:a`, the engine reads only the specific
+      Int32 column and skips the rest of the binary data (Columnar Pruning).
+    - **Cons**: Higher write overhead to analyze and split the data, prone to 
read jitter if there are large variation 
+    - in shredding output.
+
+### Architecture
+
+Variant support is built on a **layered architecture** with version-specific 
adapters:
+
+```
+┌────────────────────────────────────────────────────┐
+│            Application Layer (Spark SQL)           │
+│    SELECT parse_json('{"a": 1}') as data           │
+└────────────────────────────────────────────────────┘
+                        │
+                        ▼
+┌────────────────────────────────────────────────────┐
+│              Spark Version Adapters                │
+│  ┌──────────────────┐  ┌────────────────────────┐  │
+│  │ BaseSpark3Adapter│  │   BaseSpark4Adapter    │  │
+│  │ (No Variant)     │  │   (Full Variant)       │  │
+│  └──────────────────┘  └────────────────────────┘  │
+└────────────────────────────────────────────────────┘
+                        │
+                        ▼
+┌────────────────────────────────────────────────────┐
+│             HoodieSchema.Variant                   │
+│     (Avro Logical Type + Record Schema)            │
+└────────────────────────────────────────────────────┘
+                        │
+                        ▼
+┌────────────────────────────────────────────────────┐
+│              Parquet Storage                       │
+│    GROUP { value: BINARY, metadata: BINARY }       │
+└────────────────────────────────────────────────────┘
+```
+
+### Variant Schema Definition
+
+The `HoodieSchema.Variant` class in `hudi-common` defines the Variant type:
+
+```java
+public static class Variant extends HoodieSchema {
+  private static final String VARIANT_METADATA_FIELD = "metadata";
+  private static final String VARIANT_VALUE_FIELD = "value";
+  private static final String VARIANT_TYPED_VALUE_FIELD = "typed_value";
+
+  private final boolean isShredded;
+  private final Option<HoodieSchema> typedValueSchema;
+}
+```
+
+#### Two Storage Modes
+
+1. **Unshredded Variant** (Default):
+    - Created with: `HoodieSchema.createVariant()`
+    - Structure: Record with two REQUIRED binary fields
+    - Fields: `metadata` (BYTES, REQUIRED), `value` (BYTES, REQUIRED)
+    - Use case: Simple semi-structured data storage
+
+2. **Shredded Variant**:
+    - Created with: `HoodieSchema.createVariantShredded(typedValueSchema)`
+    - Structure: Record with `typed_value` group containing extracted typed 
columns
+    - Fields: `value` (BYTES, OPTIONAL), `metadata` (BYTES, REQUIRED), 
`typed_value` (GROUP, OPTIONAL)
+    - Use case: Frequently accessed fields are extracted into native typed 
columns for columnar reads
+
+#### How a Reader Distinguishes the Two Modes
+
+Both modes use the same `VARIANT` logical type annotation in the Parquet 
footer — the annotation does not change.
+The reader inspects the Parquet file schema to determine which mode was used:
+
+- If the Variant group contains only `metadata` and `value`, it is 
**unshredded**.
+- If the Variant group also contains a `typed_value` child group, it is 
**shredded**.
+
+In the shredded case, `value` becomes OPTIONAL because when all fields are 
fully shredded, the binary blob
+may be `null` — the content is entirely represented in the typed columns. The 
reader falls back to `value`
+only for fields that were not shredded.
+
+#### Custom Avro Logical Type
+
+Variant uses a custom Avro logical type for identification:
+
+```java
+public static class VariantLogicalType extends LogicalType {
+  private static final String VARIANT_LOGICAL_TYPE_NAME = "variant";
+}
+```
+
+### On-Disk Representation (Parquet)
+
+Hudi's on-disk Variant representation intentionally aligns with the
+[Parquet Variant 
spec](https://github.com/apache/parquet-format/blob/master/VariantEncoding.md).
+The spec defines a Variant as a group annotated with the `VARIANT` logical 
type. Hudi writes Variant columns using
+this exact layout in both modes:
+
+#### Unshredded
+
+```
+optional group variant_column (VARIANT) {
+  required binary metadata;
+  required binary value;
+}
+```
+
+The entire JSON value is encoded into the `value` blob. Both fields are 
REQUIRED — every row carries the full binary
+representation.
+
+#### Shredded
+
+```
+optional group variant_column (VARIANT) {
+  required binary metadata;
+  optional binary value;
+  optional group typed_value {
+    optional group a {
+      optional binary value;
+      optional int32 typed_value;
+    }
+    optional group b {
+      optional binary value;
+      optional binary typed_value (STRING);
+    }
+    optional group c {
+      optional binary value;
+      optional int64 typed_value;
+    }
+  }
+}
+```
+
+Each child under `typed_value` corresponds to a shredded field. The child's 
`typed_value` column holds the extracted
+native-typed value (e.g., `int32`, `string`, `int64`), while the child's 
`value` column is a fallback binary blob for
+rows where the field's type does not match the shredded type. The top-level 
`value` becomes OPTIONAL — it is `null`
+when all fields in the row are fully covered by shredded columns, and 
populated only for fields that were not shredded.

Review Comment:
   🤖 The schema evolution table covers flexibility *within* the Variant column, 
but doesn't address evolving between unshredded and shredded modes for the same 
column. Balaji's review comment raised this — it would be good to document 
explicitly whether switching modes is a backward-compatible operation and what 
the migration path looks like (e.g., does it require rewriting all file 
groups?).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] feat: Add variant support description to RFC-99 [hudi]

Reply via email to