rahil-c commented on code in PR #5786:
URL: https://github.com/apache/hudi/pull/5786#discussion_r918250071
##########
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeRecordReaderUtils.java:
##########
@@ -189,7 +190,13 @@ public static Writable avroToArrayWritable(Object value, Schema schema) {
         Writable[] recordValues = new Writable[schema.getFields().size()];
         int recordValueIndex = 0;
         for (Schema.Field field : schema.getFields()) {
-          recordValues[recordValueIndex++] = avroToArrayWritable(record.get(field.name()), field.schema());
+          Object fieldValue = null;
+          try {
+            fieldValue = record.get(field.name());
+          } catch (AvroRuntimeException e) {
+            LOG.debug("Field:" + field.name() + "not found in Schema:" + schema.toString());
Review Comment:
From my understanding, we want to catch this exception rather than throw it.
Previously, with avro 1.8.2, a lookup for a field that was not found did not
throw and execution simply continued, so this code path never hit a problem.
We are now targeting `<avro.version>1.10.2</avro.version>`, and in that
version a lookup for a missing field throws an exception, which breaks
several tests.
This change is also covered by the following test:
```
@Test
public void testAvroToArrayWritable() throws IOException {
  Schema schema = SchemaTestUtil.getEvolvedSchema();
  GenericRecord record = SchemaTestUtil.generateAvroRecordFromJson(schema, 1, "100", "100", false);
  ArrayWritable aWritable = (ArrayWritable) HoodieRealtimeRecordReaderUtils.avroToArrayWritable(record, schema);
  assertEquals(schema.getFields().size(), aWritable.get().length);

  // In some queries, generic records that Hudi gets are just part of the full records.
  // Here test the case that some fields are missing in the record.
  Schema schemaWithMetaFields = HoodieAvroUtils.addMetadataFields(schema);
  ArrayWritable aWritable2 = (ArrayWritable) HoodieRealtimeRecordReaderUtils.avroToArrayWritable(record, schemaWithMetaFields);
  assertEquals(schemaWithMetaFields.getFields().size(), aWritable2.get().length);
}
```
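
For anyone who wants to see the pattern in isolation, here is a minimal sketch that runs without Avro or Hudi on the classpath. `StrictRecord`, `FieldNotFoundException`, and `MissingFieldDemo` are hypothetical stand-ins I made up for illustration: `StrictRecord.get` throws for an unknown field name, which is my assumption of how the stricter lookup behaves, and `getOrNull` has the same shape as the try/catch in the diff. It is not the actual Hudi or Avro code.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical exception standing in for AvroRuntimeException.
final class FieldNotFoundException extends RuntimeException {
    FieldNotFoundException(String message) {
        super(message);
    }
}

// Hypothetical record whose get(String) throws for unknown field names,
// imitating the stricter lookup described in the comment above.
final class StrictRecord {
    private final Map<String, Object> values = new HashMap<>();

    StrictRecord put(String name, Object value) {
        values.put(name, value);
        return this;
    }

    Object get(String name) {
        if (!values.containsKey(name)) {
            throw new FieldNotFoundException("No such field: " + name);
        }
        return values.get(name);
    }
}

final class MissingFieldDemo {
    // Same shape as the try/catch in the diff: a missing field yields null.
    static Object getOrNull(StrictRecord record, String fieldName) {
        try {
            return record.get(fieldName);
        } catch (FieldNotFoundException e) {
            return null;
        }
    }

    // Reads every field of a (possibly wider) schema from the record,
    // so the output always has one slot per schema field.
    static Object[] project(StrictRecord record, List<String> schemaFields) {
        Object[] out = new Object[schemaFields.size()];
        int i = 0;
        for (String name : schemaFields) {
            out[i++] = getOrNull(record, name);
        }
        return out;
    }

    public static void main(String[] args) {
        // The record only carries "id"; the schema also lists a meta field.
        StrictRecord record = new StrictRecord().put("id", 100);
        List<String> widerSchema = Arrays.asList("_hoodie_commit_time", "id");
        System.out.println(Arrays.toString(project(record, widerSchema)));
    }
}
```

The point of the sketch is that the output array keeps one slot per schema field even when the record is narrower than the schema, which is what the second assertion in the test checks.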