aihuaxu commented on code in PR #3258:
URL: https://github.com/apache/parquet-java/pull/3258#discussion_r2268635129


##########
parquet-avro/src/main/java/org/apache/parquet/avro/AvroRecordConverter.java:
##########
@@ -396,7 +408,7 @@ private static Converter newConverter(
         return newStringConverter(schema, model, parent, validator);
       case RECORD:
         if (type.getLogicalTypeAnnotation() instanceof LogicalTypeAnnotation.VariantLogicalTypeAnnotation) {
-          return new AvroVariantConverter(parent, type.asGroupType(), schema, model);
+          return new AvroVariantConverter(parent, type.asGroupType(), VARIANT_SCHEMA, model);

Review Comment:
   We read the Avro schema from the `parquet.avro.schema` property of the Parquet file. 
If that schema does not have a `value` field, then 
[AvroVariantConverter.java#L51](https://github.com/apache/parquet-java/blob/master/parquet-avro/src/main/java/org/apache/parquet/avro/AvroVariantConverter.java#L51) 
will error out.  
   The schema in `parquet.avro.schema` appears to correctly represent the 
write schema, so we need this change to pass in a record of `value` + 
`metadata` for the variant logical type.
   ```
   aixu@K7YJWY4PK6 variant % parquet2 meta case-111.parquet
   File path: case-111.parquet
   Created by: parquet-mr version 1.16.0-SNAPSHOT (build ee34713e4d906d61f95d2b09145945638b2e2296)
   Properties:
    parquet.avro.schema: {"type":"record","name":"table","fields":[{"name":"id","type":"int"},{"name":"var","type":["null",{"type":"record","name":"var","fields":[{"name":"metadata","type":"bytes"},{"name":"value","type":["null","bytes"],"default":null},{"name":"typed_value","type":["null","string"],"default":null}]}],"default":null}]}
     writer.model.name: avro
   Schema:
   message table {
    required int32 id = 1;
    optional group var (VARIANT(1)) = 2 {
     required binary metadata;
     optional binary value;
     optional binary typed_value (STRING);
    }
   }
   ```
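
   To make the failure mode concrete, here is a minimal sketch in plain Java. It 
does not use the Avro or Parquet APIs: a record schema is modeled as a list of 
field names, and `VariantSchemaSketch`, `VARIANT_FIELDS`, and `requirePosition` 
are hypothetical names for illustration only. It assumes the converter resolves 
`value` by name, and that a fully shredded writer schema may omit `value`; both 
are assumptions, not a description of the actual converter code.

   ```java
   import java.util.List;

   public class VariantSchemaSketch {
       // Hypothetical stand-in for a fixed reader-side record: metadata + value.
       static final List<String> VARIANT_FIELDS = List.of("metadata", "value");

       // Mimics a by-name field lookup: returns the position, or throws if absent.
       static int requirePosition(List<String> fields, String name) {
           int pos = fields.indexOf(name);
           if (pos < 0) {
               throw new IllegalStateException("Schema has no field named '" + name + "'");
           }
           return pos;
       }

       public static void main(String[] args) {
           // Assumed shape of a writer schema for a fully shredded column (no `value`).
           List<String> shreddedFileFields = List.of("metadata", "typed_value");

           // The fixed record always resolves both fields.
           System.out.println("value position: " + requirePosition(VARIANT_FIELDS, "value"));

           // The writer schema does not, so a lookup keyed on `value` fails.
           try {
               requirePosition(shreddedFileFields, "value");
           } catch (IllegalStateException e) {
               System.out.println("lookup failed: " + e.getMessage());
           }
       }
   }
   ```

   The point of the sketch is only that the lookup succeeds regardless of how the 
file was shredded when the converter is handed a fixed `value` + `metadata` 
record.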
   
   Let me know if I misunderstand `parquet.avro.schema`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

