On Wed, Nov 15, 2017 at 10:35 AM, Andres Freund <and...@anarazel.de> wrote:
>> I realize you're sort of joking here, but I think it's necessary to
>> care about fairness between pieces of code.
>
> Indeed I kinda was.

When I posted v1 of parallel CREATE INDEX, it followed the hash join
model of giving workMem (maintenance_work_mem) to every worker. Robert
suggested that my comparison with a serial case was therefore not
representative, since I was using much more memory. I actually changed
the patch to use a single maintenance_work_mem share for the entire
operation afterwards, which seemed to work better. And, it made very
little difference to performance for my original benchmark in the end,
so I was arguably wasting memory in v1.

>> I mean, the very first version of this patch that Thomas submitted was
>> benchmarked by Rafia and had phenomenally good performance
>> characteristics. That turned out to be because it wasn't respecting
>> work_mem; you can often do a lot better with more memory, and
>> generally you can't do nearly as well with less. To make comparisons
>> meaningful, they have to be comparisons between algorithms that use
>> the same amount of memory. And it's not just about testing. If we
>> add an algorithm that will run twice as fast with equal memory but
>> only allow it half as much, it will probably never get picked and the
>> whole patch is a waste of time.

The contrast with the situation with Thomas and his hash join patch is
interesting. Hash join is *much* more sensitive to the availability of
memory than a sort operation is.

> I don't really have a good answer to "but what should we otherwise do",
> but I'm doubtful this is quite the right answer.

I think that the work_mem model should be replaced by something that
centrally budgets memory. It would make sense to be less generous with
sorts and more generous with hash joins when memory is in short supply,
for example, and a model like this can make that possible. The work_mem
model has always forced users to be far too conservative. Workloads are
very complicated, and always having users target the worst case leaves
a lot to be desired.

--
Peter Geoghegan