This is an automated email from the ASF dual-hosted git repository.

sivabalan pushed a commit to branch release-0.10.1-rc1
in repository https://gitbox.apache.org/repos/asf/hudi.git

commit f91b3f348f3d283f28cc5b12751ce1d3952e684a
Author: Manoj Govindassamy <[email protected]>
AuthorDate: Tue Jan 4 13:41:33 2022 -0800

    [HUDI-3141] Metadata merged log record reader - avoiding NullPointerException when fetching records by keys (#4505)
    
    - HoodieMetadataMergedLogRecordReader#getRecordsByKeys() and its parent class
      methods are not thread safe. When multiple queries come in to get log records
      by keys, they all operate on the same log record reader instance provided by
      HoodieBackedTableMetadata#openReadersIfNeeded(), and they trip over each other
      as they clear/put/get the same class member records.
    
    - The fix is to serialize mutation of the class member records by making
      HoodieMetadataMergedLogRecordReader#getRecordsByKeys() a synchronized method,
      so that concurrent log record readers no longer run into an NPE.
---
 .../apache/hudi/metadata/HoodieMetadataMergedLogRecordReader.java    | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataMergedLogRecordReader.java b/hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataMergedLogRecordReader.java
index e635eea..01c8d05 100644
--- a/hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataMergedLogRecordReader.java
+++ b/hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataMergedLogRecordReader.java
@@ -120,7 +120,10 @@ public class HoodieMetadataMergedLogRecordReader extends HoodieMergedLogRecordSc
     return Collections.singletonList(Pair.of(key, Option.ofNullable((HoodieRecord) records.get(key))));
   }
 
-  public List<Pair<String, Option<HoodieRecord<HoodieMetadataPayload>>>> getRecordsByKeys(List<String> keys) {
+  public synchronized List<Pair<String, Option<HoodieRecord<HoodieMetadataPayload>>>> getRecordsByKeys(List<String> keys) {
+    // Following operations have to be atomic, otherwise concurrent
+    // readers would race with each other and could crash when
+    // processing log block records as part of scan.
     records.clear();
     scan(Option.of(keys));
     List<Pair<String, Option<HoodieRecord<HoodieMetadataPayload>>>> metadataRecords = new ArrayList<>();
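
The race described in the commit message can be illustrated with a small standalone sketch. It is not Hudi code: SharedBufferReader, its String-valued records map, the simplified scan() and the countMissing() helper are all hypothetical stand-ins, assumed here only to show why clear/scan/get on a shared member must run as one atomic step when several callers share one reader instance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a reader that reuses one member map for every lookup.
class SharedBufferReader {
  // Shared scratch state, playing the role of the "records" class member.
  private final Map<String, String> records = new HashMap<>();

  // Hypothetical scan: fills the shared map with one entry per requested key.
  private void scan(List<String> keys) {
    for (String key : keys) {
      records.put(key, "value-of-" + key);
    }
  }

  // synchronized turns clear + scan + get into one atomic step per caller.
  // Without it, another thread could clear the map between scan() and get().
  public synchronized List<String> getRecordsByKeys(List<String> keys) {
    records.clear();
    scan(keys);
    List<String> result = new ArrayList<>();
    for (String key : keys) {
      result.add(records.get(key));
    }
    return result;
  }

  // Runs several threads against one shared reader and counts null lookups.
  static int countMissing(int threads, int iterations) {
    SharedBufferReader reader = new SharedBufferReader();
    AtomicInteger missing = new AtomicInteger();
    List<Thread> workers = new ArrayList<>();
    for (int t = 0; t < threads; t++) {
      final String prefix = "t" + t + "-";
      Thread worker = new Thread(() -> {
        for (int i = 0; i < iterations; i++) {
          List<String> keys = Arrays.asList(prefix + "a", prefix + "b");
          for (String value : reader.getRecordsByKeys(keys)) {
            if (value == null) {
              missing.incrementAndGet();
            }
          }
        }
      });
      workers.add(worker);
      worker.start();
    }
    for (Thread worker : workers) {
      try {
        worker.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting for workers", e);
      }
    }
    return missing.get();
  }

  public static void main(String[] args) {
    System.out.println("missing lookups: " + countMissing(4, 1000));
  }
}
```

Because every access to the shared map happens inside the synchronized method, each caller only ever reads entries it has just written, so no lookup comes back null. Removing the synchronized keyword from this sketch leaves the HashMap open to concurrent clear/put/get, which is the kind of interleaving the commit message describes.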
