Hi,

On 06/14/15 21:43, Jeff Davis wrote:
This patch tracks memory usage (at the block level) for all memory contexts. Individual palloc()s aren't tracked; only the new blocks allocated to the memory context with malloc().
I see it's adding the new field as int64 - wouldn't a Size be more appropriate, considering that's what we use in mctx.h and aset.c?
It also adds a new function, MemoryContextMemAllocated() which can either retrieve the total allocated for the context, or it can recurse and sum up the allocations for all subcontexts as well.

This is intended to be used by HashAgg in an upcoming patch that will bound the memory and spill to disk.

Previous discussion here:

http://www.postgresql.org/message-id/1407012053.15301.53.camel@jeff-desktop

Previous concerns:

* There was a slowdown reported of around 1-3% (depending on the exact version of the patch) on an IBM power machine when doing an index rebuild. The results were fairly noisy for me, but it seemed to hold up. See http://www.postgresql.org/message-id/CA +Tgmobnu7XEn1gRdXnFo37P79bF=qLt46=37ajp3cro9db...@mail.gmail.com

* Adding a struct field to MemoryContextData may be bad for the CPU caching behavior, and may be the cause of the above slowdown.
* Previous versions of the patch updated the parent contexts' allocations as well, which avoided the need to recurse when querying the total allocation. That seemed to penalize cases that didn't need to track the allocation. We discussed trying to "opt-in" to this behavior, but it seemed more awkward than helpful. Now, the patch only updates the allocation of the current context, and querying means recursing through the child contexts.
I don't think the opt-in idea itself was awkward; the problem was rather the particular APIs that we came up with, especially when combined with the 'context inheritance' thing.
I still think the opt-in approach and updating accounting for the parent contexts was the best one, because it (a) minimizes impact in cases that don't use the accounting, and (b) makes finding the current amount of memory cheap ...
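To make (a) and (b) concrete, here is a minimal sketch of what I mean by opt-in accounting with parent updates. It is a toy model, not code from any version of the patch: OptCtx, the opt_* functions and the rule that a child of a tracking context always tracks too are all made up for illustration, and the real MemoryContextData looks different.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a memory context; NOT PostgreSQL's MemoryContextData. */
typedef struct OptCtx
{
    struct OptCtx *parent;  /* NULL for a top-level context */
    int         track;      /* nonzero if accounting is enabled (opt-in) */
    size_t      total;      /* block bytes of this context plus its descendants */
} OptCtx;

/*
 * Initialize a context. Assumption for this sketch: a child of a tracking
 * context always tracks too, so a tracked subtree never has gaps.
 */
static void
opt_init(OptCtx *ctx, OptCtx *parent, int track)
{
    ctx->parent = parent;
    ctx->track = track || (parent != NULL && parent->track);
    ctx->total = 0;
}

/* Called whenever a block of 'size' bytes is obtained for ctx. */
static void
opt_block_allocated(OptCtx *ctx, size_t size)
{
    OptCtx     *c;

    /* A context that did not opt in pays only for the first loop test. */
    for (c = ctx; c != NULL && c->track; c = c->parent)
        c->total += size;
}

/* Called whenever a block of 'size' bytes belonging to ctx is released. */
static void
opt_block_freed(OptCtx *ctx, size_t size)
{
    OptCtx     *c;

    for (c = ctx; c != NULL && c->track; c = c->parent)
    {
        assert(c->total >= size);
        c->total -= size;
    }
}

/* Reading the total is a single field access, no recursion needed. */
static size_t
opt_total(const OptCtx *ctx)
{
    return ctx->total;
}
```

In this model the cost moves to the allocation path of the contexts that asked for accounting (one addition per tracked ancestor), while everyone else pays a single branch and the query is O(1).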
* There was a concern that, if MemoryContextMemAllocated needs to recurse to the child contexts, it will be too slow for HashAgg of array_agg, which creates a child context per group. That was solved with http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b419865a814abbca12bdd6eef6a3d5ed67f432e1
I wouldn't say this was "solved" - we have fixed one particular example of such an aggregate implementation, because it was causing OOM issues with many groups, but there may be other custom aggregates using the same pattern.
Granted, built-in aggregates are probably more critical than aggregates provided by extensions, but I wouldn't dare to mark this solved ...
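For illustration, a rough sketch of why the per-group pattern hurts when the total has to be computed by recursion. Again this is a toy model and not the patch: RecCtx, rec_init, rec_mem_allocated and the 'visited' counter are hypothetical names, and only the general shape (a per-context counter plus a list of children) is meant to resemble the approach described above.

```c
#include <stddef.h>

/* Toy stand-in for a memory context; NOT PostgreSQL's MemoryContextData. */
typedef struct RecCtx
{
    struct RecCtx *firstchild;  /* head of the list of child contexts */
    struct RecCtx *nextchild;   /* next sibling under the same parent */
    size_t      mem_allocated;  /* block bytes owned by this context only */
} RecCtx;

/* Initialize a context and link it under 'parent' (may be NULL). */
static void
rec_init(RecCtx *ctx, RecCtx *parent)
{
    ctx->firstchild = NULL;
    ctx->nextchild = NULL;
    ctx->mem_allocated = 0;
    if (parent != NULL)
    {
        ctx->nextchild = parent->firstchild;
        parent->firstchild = ctx;
    }
}

/*
 * Return the bytes allocated in ctx, optionally including all descendants.
 * If 'visited' is not NULL, it is incremented once per context touched,
 * which makes the cost of the query visible.
 */
static size_t
rec_mem_allocated(const RecCtx *ctx, int recurse, size_t *visited)
{
    size_t      total = ctx->mem_allocated;
    const RecCtx *child;

    if (visited != NULL)
        (*visited)++;

    if (recurse)
    {
        for (child = ctx->firstchild; child != NULL; child = child->nextchild)
            total += rec_mem_allocated(child, 1, visited);
    }

    return total;
}
```

With one child context per group, every recursive query touches every group's context, so checking the limit after each new group becomes quadratic in the number of groups. An aggregate from an extension that still creates a context per group would hit exactly this.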
My general answer to the performance concerns is that they aren't a reason to block this patch, unless someone has a suggestion about how to fix them. Adding one field to a struct and a few arithmetic operations for each malloc() or realloc() seems reasonable to me.
I'm not buying this, sorry. While I agree that we should not expect the memory accounting to be entirely free, we should be very careful about the overhead, especially if we're dropping the opt-in and thus imposing the overhead on everyone.
But the "performance concerns are not a reason to block this patch" approach seems wrong. With any other patch a 3% regression would be considered a serious issue IMNSHO.
The current state, where HashAgg just blows up the memory, is just not reasonable, and we need to track the memory to fix that problem. Others have also mentioned that we might want to use this mechanism to track memory for other operators, like Sort or HashJoin, which might be simpler and more accurate.
Dropping the existing memory accounting implementations and keeping just this new solution would be nice, but only if we agree that the performance impact is acceptable. We already have an accounting solution for each of those places, so I don't think the unification alone outweighs the regression.
regards

--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services