Re: Using mmap(2) in sort(1) instead of temp files

2024-04-06 Thread Brett Lymn
On Fri, Apr 05, 2024 at 07:36:42AM -0400, Mouse wrote:
>
> Well, I'm not the one (putatively) doing the work. But my answers to
> that are:
>

Me neither.

> (1) Small sorts are not the issue, IMO. Even a speedup as great as
> halving the time taken is not enough to worry about when it's on a p

Re: Using mmap(2) in sort(1) instead of temp files

2024-04-05 Thread Mouse
>> (4) Are there still incoherencies between mmap and read/write
>> access? At one time there were, [...]

> This bug was fixed nearly a quarter century ago, in November 2000,
> with the merge of the unified buffer cache.

Ah, I recall UBC being brought in.

> I think using any version of NetBSD re

Re: Using mmap(2) in sort(1) instead of temp files

2024-04-05 Thread Taylor R Campbell
> Date: Fri, 5 Apr 2024 07:36:42 -0400 (EDT)
> From: Mouse
>
> (4) Are there still incoherencies between mmap and read/write access?
> At one time there were, and I never got a good handle on what needed to
> be done to avoid them.

This bug was fixed nearly a quarter century ago, in November 200

Re: Using mmap(2) in sort(1) instead of temp files

2024-04-05 Thread Mouse
>> [...]

> Why not stat the input file and decide to use in memory iff the file
> is small enough? This way sort will handle large sorts on small
> memory machines automatically.

Well, I'm not the one (putatively) doing the work. But my answers to
that are:

(1) Small sorts are not the issue, IM

Re: Using mmap(2) in sort(1) instead of temp files

2024-04-05 Thread Brett Lymn
On Thu, Apr 04, 2024 at 02:38:02PM +0200, Martin Husemann wrote:
>
> Since the original comment hints at "instead of temp files" it is pretty
> clear that the second variant is meant. This avoids all file system operations
> and if the machine you run on has enough free memory it might not even
>

Re: Using mmap(2) in sort(1) instead of temp files

2024-04-04 Thread Joerg Sonnenberger
On Thursday, April 4, 2024 11:28:13 PM CEST Robert Elz wrote:
> Yes, in cases where temp files are actually needed, using mmap() is a
> very minor gain indeed - the buffering cost might be saved, but sorting
> a large file is a cpu costly endeavour (lots of comparisons, lots of times
> even with th

Re: Using mmap(2) in sort(1) instead of temp files

2024-04-04 Thread Robert Elz
    Date:        Thu, 4 Apr 2024 10:05:17 -0400 (EDT)
    From:        Mouse
    Message-ID:  <202404041405.kaa01...@stone.rodents-montreal.org>

  | Actually, if you mmap it PROT_WRITE and MAP_PRIVATE, you could go right
  | ahead. But that'll cost RAM or swap space when the COW fault happens.
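
For readers following along, here is a minimal sketch of the private-mapping idea quoted above. It assumes a regular, non-empty input file on a POSIX system. The names map_private() and cmp_byte() are hypothetical and are not taken from sort(1). The point it illustrates is only that writes to a MAP_PRIVATE mapping stay in the process's own copy-on-write pages and never reach the file.

```c
#define _POSIX_C_SOURCE 200809L	/* ask for POSIX declarations under strict -std */

#include <sys/mman.h>
#include <sys/stat.h>

#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: map the whole file at `path` privately and writably.
 * Returns the mapping and stores its length in *lenp, or returns NULL on
 * failure or for an empty file.  The file itself is opened read-only;
 * that is sufficient for MAP_PRIVATE, because stores go to private copies.
 */
static void *
map_private(const char *path, size_t *lenp)
{
	struct stat st;
	void *p;
	int fd;

	assert(lenp != NULL);
	if ((fd = open(path, O_RDONLY)) == -1)
		return NULL;
	if (fstat(fd, &st) == -1 || st.st_size <= 0) {
		close(fd);
		return NULL;
	}
	p = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
	    MAP_PRIVATE, fd, 0);
	close(fd);		/* the mapping survives closing the descriptor */
	if (p == MAP_FAILED)
		return NULL;
	*lenp = (size_t)st.st_size;
	return p;
}

/* Hypothetical comparator: order single bytes, for use with qsort(3). */
static int
cmp_byte(const void *a, const void *b)
{
	unsigned char x = *(const unsigned char *)a;
	unsigned char y = *(const unsigned char *)b;

	return (x > y) - (x < y);
}
```

A caller could then run qsort(p, len, 1, cmp_byte) directly on the returned mapping and munmap(p, len) when done. Every page the sort stores to triggers a COW fault and is then backed by RAM or swap, which is the cost the quoted message points out.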

Re: Using mmap(2) in sort(1) instead of temp files

2024-04-04 Thread Mouse
>> Given the issues about using mmap, can anybody suggest how I should
>> proceed with the implementation, or if I should at all?

> There are two potential ways where mmap(2) could help improve the speed
> of sort:
> - If you know the input file name, use a read-only mmap() of that file
>   and

Re: Using mmap(2) in sort(1) instead of temp files

2024-04-04 Thread Martin Husemann
On Thu, Apr 04, 2024 at 12:02:30PM +, Ice Cream wrote:
> Given the issues about using mmap, can anybody suggest how
> I should proceed with the implementation, or if I should at all?

There are two potential ways where mmap(2) could help improve the speed
of sort:

- If you know the input file