xushiyan commented on code in PR #5282:
URL: https://github.com/apache/hudi/pull/5282#discussion_r846771115


##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieLogFileReader.java:
##########
@@ -471,7 +470,8 @@ public void remove() {
   private static FSDataInputStream getFSDataInputStream(FileSystem fs,
                                                         HoodieLogFile logFile,
                                                         int bufferSize) throws IOException {
-    FSDataInputStream fsDataInputStream = fs.open(logFile.getPath(), bufferSize);
+    String escapePathName = PartitionPathEncodeUtils.unescapePathName(logFile.getPath().toString());
+    FSDataInputStream fsDataInputStream = fs.open(new Path(escapePathName), bufferSize);

Review Comment:
   This has a performance impact: the unescape scans every character of the
path. Also, calling `logFile.getPath()` constructs a `Path` object, and L474
then constructs another `Path`, which is costly. There should be a deeper fix:
we shouldn't have to unescape every `logFile.getPath()` result just to read a
log file.
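
To illustrate the cost concern, here is a minimal sketch in plain Java. It is
not Hudi code: `PathUnescapeSketch` and `unescapeIfNeeded` are hypothetical
names, and the decoding only loosely mirrors what a `%XX` path unescape does.
It skips the character-by-character rebuild when the path contains no `%`,
which reduces the per-read cost but is not the deeper fix asked for above.

```java
public final class PathUnescapeSketch {

    private PathUnescapeSketch() {
    }

    // Hypothetical helper (an assumption, not a Hudi API): decodes %XX
    // sequences, but only when the input actually contains a '%'.
    public static String unescapeIfNeeded(String path) {
        // Fast path: nothing to decode, so return the same instance and
        // avoid allocating a new string.
        if (path.indexOf('%') < 0) {
            return path;
        }
        StringBuilder sb = new StringBuilder(path.length());
        int i = 0;
        while (i < path.length()) {
            char c = path.charAt(i);
            if (c == '%' && i + 2 < path.length()) {
                int hi = Character.digit(path.charAt(i + 1), 16);
                int lo = Character.digit(path.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0) {
                    // Valid %XX escape: emit the decoded character.
                    sb.append((char) (hi * 16 + lo));
                    i += 3;
                    continue;
                }
            }
            // Not an escape (or a malformed one): copy the character as-is.
            sb.append(c);
            i++;
        }
        return sb.toString();
    }
}
```

With this shape, a caller could reuse the existing path object whenever the
returned string is the same instance as the input, and build a new one only
when something was actually decoded.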



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org