Yida Wu created IMPALA-13992:
--------------------------------

             Summary: Incorrect file path in logging while spilling to remote filesystem
                 Key: IMPALA-13992
                 URL: https://issues.apache.org/jira/browse/IMPALA-13992
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
            Reporter: Yida Wu
            Assignee: Yida Wu

In [https://github.com/apache/impala/blob/ef8f8ca27b52f7fd842a7a887d5c9a8db9831f79/be/src/runtime/io/disk-io-mgr.cc#L280C31-L280C36], when spilling data to a remote filesystem, a write range is used to first write the buffer to local storage. However, the file path set in the write range may incorrectly point to the remote file path, as assigned in [https://github.com/apache/impala/blob/ef8f8ca27b52f7fd842a7a887d5c9a8db9831f79/be/src/runtime/tmp-file-mgr.cc#L1987].

The write itself works correctly, because it uses the writer's configured file rather than the path stored in the write range. The logging, however, takes its file path from the write range, so the logs misleadingly indicate that the data is being written directly to the remote file.

The difficulty is that the write range is used in two different modes: as a local buffer before the data is uploaded to remote storage, or as a purely local file. This makes it easy to confuse which path applies. This logic should be reviewed to ensure that logging accurately reflects the file that is actually written.
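
One possible direction, sketched below under loud assumptions: the types and functions here (WriteMode, WriteRangeSketch, ActualWritePath, DescribeWrite) are hypothetical stand-ins invented for illustration and are not Impala's real classes or APIs. The sketch only shows the idea of deriving the logged path from the file that is actually written, while still mentioning the remote destination when the range serves as a local buffer.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-ins; these are NOT Impala's real classes.
enum class WriteMode { kLocalFile, kRemoteBuffer };

struct WriteRangeSketch {
  WriteMode mode;
  // Path stored in the write range. In kRemoteBuffer mode this is assumed
  // to hold the remote file path, which is what the issue describes.
  std::string range_path;
  // Local file the writer actually opens when buffering for a remote upload.
  // Unused (may be empty) in kLocalFile mode.
  std::string local_buffer_path;
};

// Returns the path the bytes are physically written to.
inline const std::string& ActualWritePath(const WriteRangeSketch& r) {
  if (r.mode == WriteMode::kRemoteBuffer) {
    // A buffering range must know which local file backs it.
    assert(!r.local_buffer_path.empty());
    return r.local_buffer_path;
  }
  return r.range_path;
}

// Builds a log message that names the actual write target first and, for
// buffered remote writes, the eventual remote destination as extra context.
inline std::string DescribeWrite(const WriteRangeSketch& r) {
  std::string msg = "writing to " + ActualWritePath(r);
  if (r.mode == WriteMode::kRemoteBuffer) {
    msg += " (local buffer for remote file " + r.range_path + ")";
  }
  return msg;
}
```

With this shape, a caller would log DescribeWrite(range) instead of the raw range path, so a buffered remote spill reports the local file and a purely local spill reports its own path. Whether the mode belongs on the write range or on the writer is a design question this sketch does not settle.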