On Fri, Nov 17, 2017 at 3:31 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Thu, Nov 16, 2017 at 11:50 AM, Serge Rielau <se...@rielau.com> wrote:
>> Just as you have, we have also considered holistic provisioning of
>> work_mem across all consumers, but we find that to be too complex.
>> Having an “emergency fund” in shared memory is also an option, but I
>> find it too limiting.
>
> I agree.
>
> I think this is basically a planning problem. For example, say we
> wanted to have work_mem_per_query instead of work_mem_per_node. There
> is an obvious design: consider memory use as an independent dimension
> of merit during path generation and comparison (less is better).
> Discard candidate paths whose memory use exceeds the
> work_mem_per_query budget unless there are no other alternatives. At
> the end of planning, pick the cheapest path that survived the
> memory-budget filter. Now, this has the problem that it would make
> planning more expensive (because we'd hang on to more paths for
> longer), but it solves a lot of other problems. If there's no memory
> pressure, we can use memory like mad even when it doesn't save much,
> but when we have to pick between using more memory for one part of the
> plan and using more memory for another part of the plan, the choice
> that does the best job reducing overall execution time will win.
> Awesome.
>
> We could also do more localized variants of this that don't provide
> hard guarantees but do tend to avoid squandering resources. I don't
> think that we can directly incorporate memory use into cost, because
> that will distort the costs of higher-level nodes in the plan tree;
> cost needs to mean execution time. However, what we could do is
> refuse to replace a more expensive path in a relation's path list with
> a cheaper one when the savings are small and the cheaper path uses a
> lot more memory. That way, you wouldn't replace a nested loop that
> costs a million units with a hash join that costs 999,999 units but
> uses a GB of RAM; you'd save the hash join for cases where we think it
> will help significantly.
>
> Yet another thing we could do is to try to get nodes to voluntarily
> use less than work_mem when possible. This is particularly an issue
> for sorts. A 2-batch hash join is so much more expensive than a
> single-batch hash join that it's almost never going to make sense
> unless we have no realistic alternative, although I suppose a 64-batch
> hash join might not be that different from a 32-batch hash join. But
> for sorts, given all Peter's work in this area, I bet there are a lot
> of sorts that could budget a quarter or less of work_mem and really
> not be hurt very much. It depends somewhat on how fast and how
> contended your I/O is, though, which we don't have an especially good
> way to model. I'm starting to wonder if that sort_mem GUC might be a
> good idea... use that for sorts, and keep work_mem for everything
> else.
>
> If we really want to be able to dynamically react to changing memory
> conditions, what we need is a forest of plans for a given query rather
> than just one. Pick plan A if memory is limited, otherwise pick B.
> Or use admission control.

FWIW, the lack of a per-connection and/or global memory limit for
work_mem is a major PITA when running shared and/or large-scale setups.
Currently we are doing a poor job with the work_mem parameter, because
we don't have a good way to let our customers increase it without also
giving them the ability to shoot themselves in the foot. Even a simple
parameter limiting the global total number of work_mem buffers would
help here.
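To make the memory-budget path filter quoted above a bit more concrete,
here is a minimal C sketch of "pick the cheapest path that fits the
per-query memory budget, falling back to the cheapest path overall only
when nothing fits". All names here (CandidatePath, choose_final_path,
work_mem_per_query_kb) are hypothetical and are not part of the actual
PostgreSQL planner:

#include <stddef.h>

/*
 * Hypothetical sketch only: CandidatePath, choose_final_path and
 * work_mem_per_query_kb are made up for illustration and do not exist
 * in the real planner code.
 */
typedef struct CandidatePath
{
    double      cost;       /* estimated total execution cost */
    double      mem_kb;     /* estimated peak memory use, in kB */
} CandidatePath;

/*
 * At the end of planning, prefer the cheapest path whose memory
 * estimate fits the per-query budget; fall back to the cheapest path
 * overall only when no candidate fits (no other alternative exists).
 */
static CandidatePath *
choose_final_path(CandidatePath **paths, int npaths,
                  double work_mem_per_query_kb)
{
    CandidatePath *best_within_budget = NULL;
    CandidatePath *best_overall = NULL;

    for (int i = 0; i < npaths; i++)
    {
        CandidatePath *p = paths[i];

        if (best_overall == NULL || p->cost < best_overall->cost)
            best_overall = p;

        if (p->mem_kb <= work_mem_per_query_kb &&
            (best_within_budget == NULL ||
             p->cost < best_within_budget->cost))
            best_within_budget = p;
    }

    return best_within_budget ? best_within_budget : best_overall;
}

The more localized rule ("don't swap in a slightly cheaper path that
uses a lot more memory") would presumably live in the path-comparison
logic during path generation rather than in this final selection step.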
--
Vladimir Rusinov
PostgreSQL SRE, Google Ireland

Google Ireland Ltd., Gordon House, Barrow Street, Dublin 4, Ireland
Registered in Dublin, Ireland
Registration Number: 368047