On Fri, 2012-03-16 at 11:25 +0100, Andres Freund wrote:
> > I take that back. There was something wrong with my test -- fadvise
> > helps, but it only takes it from ~10s to ~6.5s. Not quite as good as I
> > hoped.
> Thats surprising. I wouldn't expect such a big difference between fadvise +
> sync_file_range. Rather strange.

I discussed this with my colleague who knows Linux internals, and he
pointed me directly at the problem. fadvise and sync_file_range in this
case are both trying to put the data in the I/O scheduler queue (still
in the kernel, not on the device). The difference is that fadvise
doesn't wait, and sync_file_range does (keep in mind, this is waiting to
get into a queue to go to the device, not waiting for the device to
write it or even receive it).

He indicated that 4096 is a normal number that one might use for the
queue size. But on my workstation at home (Ubuntu 11.10), the queue is
only 128. I bumped it up to 256 and now fadvise is just as fast!

This won't be a problem on production systems, but that doesn't help us
a lot. People setting up a production system don't care about 6.5
seconds of set-up time anyway. Casual users and developers do (the
latter problem can be solved with the --nosync switch, but the former
problem is the one we're discussing).

So, it looks like fadvise is the "right" thing to do, but I expect we'll
get some widely varying results from actual users. Then again, maybe
casual users don't care much about ~10s for initdb anyway? It's a fairly
rare operation for everyone except developers.

Regards,
	Jeff Davis

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
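
For anyone who wants to compare their own machine against the 128 / 256 /
4096 figures above, here is a minimal sketch. It assumes that the "queue
size" in question is the Linux block layer's request queue depth, exposed
in sysfs as /sys/block/<device>/queue/nr_requests; that mapping is an
assumption and is not stated in the message. The function name and the
"sda" device name are placeholders for illustration only.

```python
from pathlib import Path


def read_nr_requests(device: str, sysfs_root: str = "/sys/block") -> int:
    """Return the request queue size reported for a block device.

    Assumption: the queue size discussed in the thread corresponds to
    <sysfs_root>/<device>/queue/nr_requests. `device` is a kernel block
    device name such as "sda" (a placeholder here, not from the thread).
    """
    path = Path(sysfs_root) / device / "queue" / "nr_requests"
    return int(path.read_text().strip())


if __name__ == "__main__":
    import sys

    # "sda" is only a placeholder default; pass your own device name.
    dev = sys.argv[1] if len(sys.argv) > 1 else "sda"
    print(f"{dev}: nr_requests = {read_nr_requests(dev)}")
```

Under the same assumption, the value can be raised by writing a new number
to that file as root. That change lasts only until the next reboot.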