On Fri, Nov 17, 2017 at 3:31 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Thu, Nov 16, 2017 at 11:50 AM, Serge Rielau <se...@rielau.com> wrote:
>> Just as you have, we have also considered holistic provisioning of
>> work_mem across all consumers, but we find that to be too complex.
>> Having an “emergency fund” in shared memory is also an option, but I
>> find it too limiting.
>
> I agree.
>
> I think this is basically a planning problem. For example, say we
> wanted to have work_mem_per_query instead of work_mem_per_node. There
> is an obvious design: consider memory use as an independent dimension
> of merit during path generation and comparison (less is better).
> Discard candidate paths whose memory use exceeds the
> work_mem_per_query budget unless there are no other alternatives. At
> the end of planning, pick the cheapest path that survived the
> memory-budget filter. Now, this has the problem that it would make
> planning more expensive (because we'd hang on to more paths for
> longer), but it solves a lot of other problems. If there's no memory
> pressure, we can use memory like mad even when it doesn't save much,
> but when we have to pick between using more memory for one part of the
> plan and using more memory for another part of the plan, the choice
> that does the best job reducing overall execution time will win.
> Awesome.
>
> We could also do more localized variants of this that don't provide
> hard guarantees but do tend to avoid squandering resources. I don't
> think that we can directly incorporate memory use into cost, because
> that will distort the costs of higher-level nodes in the plan tree;
> cost needs to mean execution time. However, what we could do is
> refuse to replace a more expensive path in a relation's path list with
> a cheaper one when the savings are small and the cheaper path uses a
> lot more memory. That way, you wouldn't replace a nested loop that
> costs a million units with a hash join that costs 999,999 units but
> uses a GB of RAM; you'd save the hash join for cases where we think it
> will help significantly.
>
> Yet another thing we could do is to try to get nodes to voluntarily
> use less than work_mem when possible. This is particularly an issue
> for sorts. A 2-batch hash join is so much more expensive than a
> single-batch hash join that it's almost never going to make sense
> unless we have no realistic alternative, although I suppose a 64-batch
> hash join might not be that different from a 32-batch hash join. But
> for sorts, given all Peter's work in this area, I bet there are a lot
> of sorts that could budget a quarter or less of work_mem and really
> not be hurt very much. It depends somewhat on how fast and how
> contended your I/O is, though, which we don't have an especially good
> way to model. I'm starting to wonder if that sort_mem GUC might be a
> good idea... use that for sorts, and keep work_mem for everything
> else.
>
> If we really want to be able to dynamically react to changing memory
> conditions, what we need is a forest of plans for a given query rather
> than just one. Pick plan A if memory is limited, otherwise pick B.
> Or use admission control.

FWIW, the lack of a per-connection and/or global memory limit for
work_mem is a major PITA when running shared and/or large-scale setups.
Currently we are doing a poor job with the work_mem parameter, because
we don't have a good way to let our customers increase it without also
giving them the ability to shoot themselves in the foot. Even a simple
parameter limiting the global total number of work_mem buffers would
help here.
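To make the memory-budget path filter quoted above a bit more concrete,
here is a minimal C sketch of "pick the cheapest path that fits the
per-query memory budget, falling back to the cheapest path overall only
when nothing fits". All names here (CandidatePath, choose_final_path,
work_mem_per_query_kb) are hypothetical and are not part of the actual
PostgreSQL planner:

#include <stddef.h>

/*
 * Hypothetical sketch only: CandidatePath, choose_final_path and
 * work_mem_per_query_kb are made up for illustration and do not exist
 * in the real planner code.
 */
typedef struct CandidatePath
{
    double      cost;       /* estimated total execution cost */
    double      mem_kb;     /* estimated peak memory use, in kB */
} CandidatePath;

/*
 * At the end of planning, prefer the cheapest path whose memory
 * estimate fits the per-query budget; fall back to the cheapest path
 * overall only when no candidate fits (no other alternative exists).
 */
static CandidatePath *
choose_final_path(CandidatePath **paths, int npaths,
                  double work_mem_per_query_kb)
{
    CandidatePath *best_within_budget = NULL;
    CandidatePath *best_overall = NULL;

    for (int i = 0; i < npaths; i++)
    {
        CandidatePath *p = paths[i];

        if (best_overall == NULL || p->cost < best_overall->cost)
            best_overall = p;

        if (p->mem_kb <= work_mem_per_query_kb &&
            (best_within_budget == NULL ||
             p->cost < best_within_budget->cost))
            best_within_budget = p;
    }

    return best_within_budget ? best_within_budget : best_overall;
}

The more localized rule ("don't swap in a slightly cheaper path that
uses a lot more memory") would presumably live in the path-comparison
logic during path generation rather than in this final selection step.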
--
Vladimir Rusinov
PostgreSQL SRE, Google Ireland

Google Ireland Ltd., Gordon House, Barrow Street, Dublin 4, Ireland
Registered in Dublin, Ireland
Registration Number: 368047