This is an automated email from the ASF dual-hosted git repository.
alamb pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/parquet-format.git
The following commit(s) were added to refs/heads/master by this push:
new 96656a5 GH-561: variant schema examples to use (VARIANT(1)) (#562)
96656a5 is described below
commit 96656a543a2165d57cc1c9abefaad7f9aeb563a5
Author: Steve Loughran <[email protected]>
AuthorDate: Sat Apr 4 10:44:04 2026 +0100
GH-561: variant schema examples to use (VARIANT(1)) (#562)
* GH-561 variant schema examples use (VARIANT) rather than (VARIANT(1))
Fix schema examples and in logical types doc declare that the version
number is required.
* Update LogicalTypes.md
Co-authored-by: Andrew Lamb <[email protected]>
---------
Co-authored-by: Andrew Lamb <[email protected]>
---
LogicalTypes.md | 3 ++-
VariantEncoding.md | 4 ++--
VariantShredding.md | 8 ++++----
3 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/LogicalTypes.md b/LogicalTypes.md
index 78fdf29..f3375e5 100644
--- a/LogicalTypes.md
+++ b/LogicalTypes.md
@@ -571,7 +571,8 @@ type `binary`, which is also called `BYTE_ARRAY` in the Parquet thrift definitio
The `VARIANT` annotated group can be used to store either an unshredded Variant
value, or a shredded Variant value.
-* The Variant group must be annotated with the `VARIANT` logical type.
+* The Variant group must be annotated with the `VARIANT` logical type, with the version number
+ included in the declaration.
* Both fields `value` and `metadata` must be of type `binary` (called `BYTE_ARRAY`
in the Parquet thrift definition).
* The `metadata` field is required and must be a valid Variant metadata component,
diff --git a/VariantEncoding.md b/VariantEncoding.md
index d393d43..b78c02e 100644
--- a/VariantEncoding.md
+++ b/VariantEncoding.md
@@ -53,7 +53,7 @@ A Variant value in Parquet is represented by a group with 2 fields, named `value
This is the expected unshredded representation in Parquet:
```
-optional group variant_name (VARIANT) {
+optional group variant_name (VARIANT(1)) {
required binary metadata;
required binary value;
}
@@ -61,7 +61,7 @@ optional group variant_name (VARIANT) {
This is an example representation of a shredded Variant in Parquet:
```
-optional group shredded_variant_name (VARIANT) {
+optional group shredded_variant_name (VARIANT(1)) {
required binary metadata;
optional binary value;
optional int64 typed_value;
diff --git a/VariantShredding.md b/VariantShredding.md
index cfc05fd..bbebbdd 100644
--- a/VariantShredding.md
+++ b/VariantShredding.md
@@ -44,7 +44,7 @@ When `typed_value` is present, readers **must** reconstruct shredded values acco
For example, a Variant field, `measurement` may be shredded as long values by
adding `typed_value` with type `int64`:
```
-required group measurement (VARIANT) {
+required group measurement (VARIANT(1)) {
required binary metadata;
optional binary value;
optional int64 typed_value;
@@ -128,7 +128,7 @@ However, at least one of the two fields must be present.
For example, a `tags` Variant may be shredded as a list of strings using the
following definition:
```
-optional group tags (VARIANT) {
+optional group tags (VARIANT(1)) {
required binary metadata;
optional binary value;
optional group typed_value (LIST) { # must be optional to allow a null list
@@ -174,7 +174,7 @@ As a result, reads when a field is defined in both `value` and a `typed_value` s
For example, a Variant `event` field may shred `event_type` (`string`) and
`event_ts` (`timestamp`) columns using the following definition:
```
-optional group event (VARIANT) {
+optional group event (VARIANT(1)) {
required binary metadata;
optional binary value; # a variant, expected to be an object
optional group typed_value { # shredded fields for the variant object
@@ -229,7 +229,7 @@ The `typed_value` associated with any Variant `value` field can be any shredded
For example, the `event` object above may also shred sub-fields as object
(`location`) or array (`tags`).
```
-optional group event (VARIANT) {
+optional group event (VARIANT(1)) {
required binary metadata;
optional binary value;
optional group typed_value {
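
The substitution this commit applies to the schema examples can be sketched as a small script for checking other schema text. This is a minimal illustration and not part of the commit: the function names `find_unversioned_variants` and `add_variant_version` are hypothetical, and the sketch assumes the annotation is written exactly as `(VARIANT)` with no internal whitespace.

```python
from __future__ import annotations

import re

# Matches a bare "(VARIANT)" annotation, i.e. one without a version argument.
# "(VARIANT(1))" does not match because "VARIANT" is followed by "(" there.
_BARE_VARIANT = re.compile(r"\(VARIANT\)")


def find_unversioned_variants(schema_text: str) -> list[int]:
    """Return the 1-based line numbers that contain a bare (VARIANT)."""
    return [
        lineno
        for lineno, line in enumerate(schema_text.splitlines(), start=1)
        if _BARE_VARIANT.search(line)
    ]


def add_variant_version(schema_text: str, version: int = 1) -> str:
    """Rewrite each bare (VARIANT) annotation to (VARIANT(<version>))."""
    return _BARE_VARIANT.sub(f"(VARIANT({version}))", schema_text)


if __name__ == "__main__":
    old_schema = (
        "optional group variant_name (VARIANT) {\n"
        "  required binary metadata;\n"
        "  required binary value;\n"
        "}"
    )
    print(find_unversioned_variants(old_schema))
    print(add_variant_version(old_schema))
```

Running the rewrite a second time leaves the text unchanged, since an annotation that already carries a version no longer matches the pattern.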