codope commented on a change in pull request #5077:
URL: https://github.com/apache/hudi/pull/5077#discussion_r832172618



##########
File path: 
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java
##########
@@ -1042,22 +1046,27 @@ public static void aggregateColumnStats(IndexedRecord 
record, Schema schema,
 
     schema.getFields().forEach(field -> {
       Map<String, Object> columnStats = 
columnToStats.getOrDefault(field.name(), new HashMap<>());
-      final String fieldVal = getNestedFieldValAsString((GenericRecord) 
record, field.name(), true, consistentLogicalTimestampEnabled);
+      final Object fieldVal = getNestedFieldVal((GenericRecord) record, 
field.name(), true, consistentLogicalTimestampEnabled);

Review comment:
       > Instead we can directly access the field value and perform logical 
type conversion by invoking convertValueForSpecificDataTypes directly.
   
   > As such my proposal is to limit min/max stats only for primitive types.
   
   Will do.
   
   > `fieldVal` we're getting should be of type `Comparable<?>`, otherwise we 
set it to null (since there's no plausible order we can impose on these
   
   > I don't think we need this method. Instead we should make sure at the 
place where we collect min/max stats that they're proper Java objects, and 
simply treat min/max as Comparable
   
   How do you get a `Comparable<?>` given the record is an avro 
`GenericRecord`? I can refactor in a way such that we don't have to convert 
while merging stats, and hide the conversion. Still, we need to infer the java 
type from avro type and do the conversion. What I am thinking is to add another 
method, say `convertValueToNativeJavaType` in `HoodieAvroUtils` which will
   ```
   Comparable<?> convertValueToNativeJavaType(Schema fieldSchema, Object val) {
     Object v = convertValueForSpecificDataTypes(fieldSchema, val)
     return convertToNativeJavaType(fieldSchema, v);
   }
   
   and then in HoodieTableMetadataUtil#aggregateColumnStats
   
   Comparable<?> fieldVal = convertValueToNativeJavaType(field.schema(), 
genericRecord.get(field.name()));
   ```
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to