the-other-tim-brown commented on code in PR #13411:
URL: https://github.com/apache/hudi/pull/13411#discussion_r2152308434


##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataPayload.java:
##########
@@ -376,19 +380,44 @@ public Option<IndexedRecord> combineAndGetUpdateValue(IndexedRecord oldRecord, S
   }
 
   @Override
-  public Option<IndexedRecord> getInsertValue(Schema schemaIgnored, Properties propertiesIgnored) throws IOException {
+  public Option<IndexedRecord> getInsertValue(Schema schema, Properties propertiesIgnored) throws IOException {
     if (key == null || this.isDeletedRecord) {
       return Option.empty();
     }
 
-    HoodieMetadataRecord record = new HoodieMetadataRecord(key, type, filesystemMetadata, bloomFilterMetadata,
-        columnStatMetadata, recordIndexMetadata, secondaryIndexMetadata);
-    return Option.of(record);
+    if (schema == null || HOODIE_METADATA_SCHEMA == schema) {
+      // If the schema is same or none is provided, we can return the record directly
+      HoodieMetadataRecord record = new HoodieMetadataRecord(key, type, filesystemMetadata, bloomFilterMetadata,
+          columnStatMetadata, recordIndexMetadata, secondaryIndexMetadata);
+      return Option.of(record);
+    } else {
+      // Otherwise, the assumption is that the schema required contains the metadata fields so we construct a new GenericRecord with these fields
+      GenericData.Record record = new GenericData.Record(schema);
+      int offset = schema.getField("key").pos();
+      record.put(offset, key);
+      record.put(offset + 1, type);
+      if (filesystemMetadata != null) {

Review Comment:
   Right now we're relying on an optimization that assumes the schema is always
cached, which lets us compare with `==`. If we can be sure all code paths for
the metadata table are tested, then we can hunt down any places that skip the
cache. Otherwise, when the schema is not the same instance, we can't tell
whether that is because it came from a different code path or because it has
the metadata fields added.
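
   To illustrate the ambiguity, here is a minimal sketch in plain Java. It does
not use Hudi or Avro: `SimpleSchema`, `cached`, and `classify` are hypothetical
stand-ins, and the sketch only assumes that schema equality is structural while
`==` checks instance identity. It shows that an identity check alone lumps
"equal schema built outside the cache" together with "schema that has extra
metadata fields", whereas a fallback to `equals` can tell them apart.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

public class SchemaIdentityDemo {

  // Hypothetical stand-in for a schema: a name plus an ordered list of field names.
  static final class SimpleSchema {
    final String name;
    final List<String> fields;

    SimpleSchema(String name, List<String> fields) {
      this.name = name;
      this.fields = List.copyOf(fields);
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof SimpleSchema)) {
        return false;
      }
      SimpleSchema other = (SimpleSchema) o;
      return name.equals(other.name) && fields.equals(other.fields);
    }

    @Override
    public int hashCode() {
      return Objects.hash(name, fields);
    }
  }

  // Hypothetical schema cache: callers that go through it share one instance per schema.
  private static final Map<String, SimpleSchema> CACHE = new ConcurrentHashMap<>();

  static SimpleSchema cached(String name, List<String> fields) {
    return CACHE.computeIfAbsent(name + fields, k -> new SimpleSchema(name, fields));
  }

  enum Match { SAME_INSTANCE, EQUAL_BUT_UNCACHED, DIFFERENT }

  // Identity first (cheap), then structural equality to separate the two "not ==" cases.
  static Match classify(SimpleSchema expected, SimpleSchema actual) {
    if (expected == actual) {
      return Match.SAME_INSTANCE;
    }
    if (expected.equals(actual)) {
      return Match.EQUAL_BUT_UNCACHED;
    }
    return Match.DIFFERENT;
  }

  public static void main(String[] args) {
    SimpleSchema base = cached("HoodieMetadata", List.of("key", "type"));
    SimpleSchema viaCache = cached("HoodieMetadata", List.of("key", "type"));
    SimpleSchema bypassedCache = new SimpleSchema("HoodieMetadata", List.of("key", "type"));
    SimpleSchema withMetaFields =
        new SimpleSchema("HoodieMetadata", List.of("_hoodie_commit_time", "key", "type"));

    System.out.println(classify(base, viaCache));
    System.out.println(classify(base, bypassedCache));
    System.out.println(classify(base, withMetaFields));
  }
}
```

   With only `==`, the second and third cases are indistinguishable. The
`equals` fallback costs a structural comparison, so it is only worth paying on
the path where the identity check has already failed.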



