xushiyan commented on code in PR #5282:
URL: https://github.com/apache/hudi/pull/5282#discussion_r846771115


##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieLogFileReader.java:
##########
@@ -471,7 +470,8 @@ public void remove() {
   private static FSDataInputStream getFSDataInputStream(FileSystem fs, HoodieLogFile logFile, int bufferSize) throws IOException {
-    FSDataInputStream fsDataInputStream = fs.open(logFile.getPath(), bufferSize);
+    String escapePathName = PartitionPathEncodeUtils.unescapePathName(logFile.getPath().toString());
+    FSDataInputStream fsDataInputStream = fs.open(new Path(escapePathName), bufferSize);

Review Comment:
   This has a performance impact: the unescape inspects every character, and calling `logFile.getPath()` constructs a Path object before L474 constructs another one, which is costly. This needs a deeper fix: we shouldn't have to unescape every `logFile.getPath()` just to read the file.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org