Yida Wu created IMPALA-13992:
--------------------------------

             Summary: Incorrect file path in logging while spilling to remote 
filesystem
                 Key: IMPALA-13992
                 URL: https://issues.apache.org/jira/browse/IMPALA-13992
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
            Reporter: Yida Wu
            Assignee: Yida Wu


In 
[https://github.com/apache/impala/blob/ef8f8ca27b52f7fd842a7a887d5c9a8db9831f79/be/src/runtime/io/disk-io-mgr.cc#L280C31-L280C36],
 when spilling data to a remote filesystem, a write range is used to first 
write the buffer to local storage. However, the file path set in the write 
range may incorrectly point to the remote file path, as assigned in 
[https://github.com/apache/impala/blob/ef8f8ca27b52f7fd842a7a887d5c9a8db9831f79/be/src/runtime/tmp-file-mgr.cc#L1987].

The write itself works correctly because it uses the writer's configured file 
rather than the path from the write range. The logging, however, relies on the 
file path from the write range, so the logs misleadingly indicate that the 
data is being written directly to the remote file.

The difficulty is that the write range is used in two different modes: either 
as a buffer before uploading to remote storage, or as a purely local file. 
This dual use can lead to confusion. The logic should be reviewed to ensure 
that logging accurately reflects the file actually being written.
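
The mismatch described above can be illustrated with a minimal C++ sketch. It 
is not Impala's code: WriteRangeSketch, WriterSketch, SpillMode, both functions, 
the message text, and the example paths are all hypothetical names chosen for 
illustration. The first function builds the log message from the range's path, 
as the report describes; the second shows one possible way to report the file 
the writer actually uses.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for illustration only; these are NOT Impala's
// actual write range or writer classes.
enum class SpillMode { kLocalOnly, kLocalBufferForRemote };

struct WriteRangeSketch {
  std::string path;  // Path recorded on the range; may be the remote path.
};

struct WriterSketch {
  SpillMode mode;
  std::string local_path;  // File the writer actually writes to.
};

// Mirrors the behavior described in the report: the message is built from
// the range's path, even when the bytes go to a local buffer file.
std::string LogMessageFromRange(const WriteRangeSketch& range) {
  return "Writing to " + range.path;
}

// One possible alternative (an assumption, not the project's fix): report
// the writer's file, and mention the remote target only in buffer mode.
std::string LogMessageFromWriter(const WriterSketch& writer,
                                 const WriteRangeSketch& range) {
  assert(!writer.local_path.empty());
  std::string msg = "Writing to " + writer.local_path;
  if (writer.mode == SpillMode::kLocalBufferForRemote) {
    msg += " (local buffer for " + range.path + ")";
  }
  return msg;
}
```

With a range whose path is a remote location and a writer in buffer mode, the 
first function names the remote file while the second names the local buffer 
file, which is the distinction the logging would need to preserve.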



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
