aihuaxu commented on code in PR #3258:
URL: https://github.com/apache/parquet-java/pull/3258#discussion_r2268635129
##########
parquet-avro/src/main/java/org/apache/parquet/avro/AvroRecordConverter.java:
##########
@@ -396,7 +408,7 @@ private static Converter newConverter(
return newStringConverter(schema, model, parent, validator);
case RECORD:
if (type.getLogicalTypeAnnotation() instanceof LogicalTypeAnnotation.VariantLogicalTypeAnnotation) {
- return new AvroVariantConverter(parent, type.asGroupType(), schema, model);
+ return new AvroVariantConverter(parent, type.asGroupType(), VARIANT_SCHEMA, model);
Review Comment:
We read the Avro schema from the `parquet.avro.schema` property of the Parquet file.
If that schema has no `value` field, then
[AvroVariantConverter.java#L51](https://github.com/apache/parquet-java/blob/master/parquet-avro/src/main/java/org/apache/parquet/avro/AvroVariantConverter.java#L51)
will error out.
The schema in `parquet.avro.schema` appears to correctly represent the
write schema, so we need this change to pass in a record of value +
metadata for the variant logical type.
```
aixu@K7YJWY4PK6 variant % parquet2 meta case-111.parquet
File path: case-111.parquet
Created by: parquet-mr version 1.16.0-SNAPSHOT (build ee34713e4d906d61f95d2b09145945638b2e2296)
Properties:
parquet.avro.schema:
{"type":"record","name":"table","fields":[{"name":"id","type":"int"},{"name":"var","type":["null",{"type":"record","name":"var","fields":[{"name":"metadata","type":"bytes"},{"name":"value","type":["null","bytes"],"default":null},{"name":"typed_value","type":["null","string"],"default":null}]}],"default":null}]}
writer.model.name: avro
Schema:
message table {
required int32 id = 1;
optional group var (VARIANT(1)) = 2 {
required binary metadata;
optional binary value;
optional binary typed_value (STRING);
}
}
```
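To make the failure mode concrete, here is a minimal stand-alone sketch in plain Java with no Avro or Parquet dependency. The class `VariantFieldCheck` and its methods are hypothetical names used only for illustration; they model a record schema as a list of field names and are not part of parquet-java. The shredded-only field list in `main` is likewise an assumed example, not taken from a real file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: models record schemas as lists of field names.
public class VariantFieldCheck {
    // Fields the variant converter is assumed to look up, per the discussion above.
    static final List<String> CONVERTER_FIELDS = Arrays.asList("metadata", "value");

    // Returns the converter fields that are absent from the file's record schema.
    static List<String> missingFields(List<String> fileFields) {
        List<String> missing = new ArrayList<>();
        for (String name : CONVERTER_FIELDS) {
            if (!fileFields.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    // Always hands the converter a fixed metadata + value record,
    // regardless of which fields the file's write schema happens to contain.
    static List<String> fieldsForConverter(List<String> fileFields) {
        return CONVERTER_FIELDS;
    }

    public static void main(String[] args) {
        // Field names of the "var" record in the dump above.
        List<String> dumped = Arrays.asList("metadata", "value", "typed_value");
        // Assumed example of a write schema that has no "value" field.
        List<String> noValue = Arrays.asList("metadata", "typed_value");

        System.out.println("missing in dumped file schema: " + missingFields(dumped));
        System.out.println("missing in schema without value: " + missingFields(noValue));
        System.out.println("passed to converter: " + fieldsForConverter(noValue));
    }
}
```

With the file schema passed through unchanged, the second case is the one that would fail on the field lookup; with a fixed value + metadata record, the converter sees the same fields in both cases.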
Let me know if I misunderstand `parquet.avro.schema`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]