On Thursday, April 4, 2024 11:28:13 PM CEST Robert Elz wrote:
> Yes, in cases where temp files are actually needed, using mmap() is a
> very minor gain indeed - the buffering cost might be saved, but sorting
> a large file is a cpu costly endeavour (lots of comparisons, lots of times
> even with the best sorting algorithms available) so when temp files are
> needed in the first place (large input files) the saving is likely to be
> a few ms in an operation which takes minutes of cpu time (or more).
> Not worth the bother.

I quite disagree here. Using mmap for the temp files with an appropriate madvise can minimize data copies (by using the VFS cache directly) as well as reduce the cache footprint (by evicting pages once they are used up). That's especially important for storage layers like NVMe that can use a significant part of the main memory bandwidth. I don't think it helps for the original input though, especially given the associated problems with concurrent writes or truncations.

Joerg
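
To make the temp-file point concrete, here is a rough sketch of the access pattern I mean. It is not taken from any sort(1) implementation: the helper name sum_temp_run, the /tmp path template, and the 16-page chunk size are all made up for illustration, and summing bytes merely stands in for a merge pass that consumes a run front to back. MADV_SEQUENTIAL and MADV_DONTNEED are only hints, and how aggressively pages are actually released differs between kernels.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: spill `len` bytes to an anonymous temp file, then
 * consume them through a read-only mapping, advising the kernel as we go.
 * Returns the byte sum, or -1 on error.
 */
static int64_t
sum_temp_run(const unsigned char *data, size_t len)
{
	char path[] = "/tmp/sortrun.XXXXXX";	/* assumed location */
	int fd = mkstemp(path);
	if (fd == -1)
		return -1;
	unlink(path);		/* file disappears once fd and mapping are gone */

	/* Write the run out (a real sort would write sorted records). */
	for (size_t off = 0; off < len; ) {
		ssize_t n = write(fd, data + off, len - off);
		if (n <= 0) {
			close(fd);
			return -1;
		}
		off += (size_t)n;
	}
	if (len == 0) {		/* mmap() rejects zero-length mappings */
		close(fd);
		return 0;
	}

	long ps = sysconf(_SC_PAGESIZE);
	if (ps <= 0)
		ps = 4096;	/* fallback guess */

	/* Map the file: reads come straight from the cache pages. */
	unsigned char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);		/* the mapping keeps the file alive */
	if (p == MAP_FAILED)
		return -1;
	assert((uintptr_t)p % (size_t)ps == 0);	/* mmap returns page-aligned */

	/* Hint: one front-to-back pass, so read ahead aggressively. */
	(void)madvise(p, len, MADV_SEQUENTIAL);

	size_t chunk = 16 * (size_t)ps;		/* arbitrary chunk size */
	int64_t sum = 0;
	for (size_t start = 0; start < len; start += chunk) {
		size_t n = len - start < chunk ? len - start : chunk;
		for (size_t i = 0; i < n; i++)
			sum += p[start + i];
		/* Hint: this range is used up and may be dropped. */
		(void)madvise(p + start, n, MADV_DONTNEED);
	}

	munmap(p, len);
	return sum;
}
```

The mapping is read-only, so dropping consumed ranges cannot lose data; at worst a page is faulted back in from the file. Errors from madvise are deliberately ignored, since the calls are advisory.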