davidradl commented on code in PR #26507: URL: https://github.com/apache/flink/pull/26507#discussion_r2078014721
########## flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/AvroRowDataSerializationSchema.java:
##########
@@ -139,4 +143,14 @@ public boolean equals(Object o) {
     public int hashCode() {
         return Objects.hash(nestedSchema, rowType);
     }
+
+    @Override
+    public TypeInformation<GenericRecord> getProducedType() {
+        if (schema == null) {
+            throw new IllegalStateException(
+                    "The produced type is not available before the schema is initialized.");
+        } else {
+            return new GenericRecordAvroTypeInfo(schema);
+        }
+    }

Review Comment:
   I see that the [Avro deserialization](https://github.com/apache/flink/blob/ddd65bd03749b740ea978570509f2286cda161db/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/AvroRowDataDeserializationSchema.java#L153) returns the type as `TypeInformation<RowData>`. I can see that the [Parquet Avro](https://github.com/apache/flink/blob/ddd65bd03749b740ea978570509f2286cda161db/flink-formats/flink-parquet/src/main/java/org/apache/flink/formats/parquet/avro/AvroParquetRecordFormat.java#L129) and Orc code also return a type. It would be good for what we do here to be consistent with the Parquet Avro case.

   I also see a comment at the top of this class saying to keep it in step with `{@link AvroRowDataDeserializationSchema}` and the schema converter `{@link AvroSchemaConverter}`. AvroSchemaConverter does deal with type info.

   It would be useful for me to understand how lineage calls this and why we end up in this class. Maybe a lineage unit test bringing in an Avro schema and a Parquet Avro schema would be helpful.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org