On 19 January 2016 at 17:14, Robert Haas <robertmh...@gmail.com> wrote:
> On Mon, Jan 18, 2016 at 12:34 PM, Robert Haas <robertmh...@gmail.com> > wrote: > > On Mon, Jan 18, 2016 at 12:09 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > >> Robert Haas <robertmh...@gmail.com> writes: > >>> Yeah. The API contract for an expanded object took me quite a while > >>> to puzzle out, but it seems to be this: if somebody hands you an R/W > >>> pointer to an expanded object, you're entitled to assume that you can > >>> "take over" that object and mutate it however you like. But the > >>> object might be in some other memory context, so you have to move it > >>> into your own memory context. > >> > >> Only if you intend to keep it --- for example, a function that is > mutating > >> and returning an object isn't required to move it somewhere else, if the > >> input is R/W, and I think it generally shouldn't. > >> > >> In the context here, I'd think it is the responsibility of nodeAgg.c > >> not individual datatype functions to make sure that expanded objects > >> live where it wants them to. > > > > That's how I did it in my prototype, but the problem with that is that > > spinning up a memory context for every group sucks when there are many > > groups with only a small number of elements each - hence the 50% > > regression that David Rowley observed. If we're going to use expanded > > objects here, which seems like a good idea in principle, that's going > > to have to be fixed somehow. We're flogging the heck out of malloc by > > repeatedly creating a context, doing one or two allocations in it, and > > then destroying the context. > > > > I think that, in general, the memory context machinery wasn't really > > designed to manage lots of small contexts. The overhead of spinning > > up a new context for just a few allocations is substantial. That > > matters in some other situations too, I think - I've commonly seen > > AllocSetContextCreate taking several percent of runtime in profiles. > > But here it's much exacerbated. I'm not sure whether it's better to > > attack that problem at the root and try to make AllocSetContextCreate > > cheaper, or whether we should try to figure out some change to the > > expanded-object machinery to address the issue. > > Here is a patch that helps a good deal. I changed things so that when > we create a context, we always allocate at least 1kB. If that's more > than we need for the node itself and the name, then we use the rest of > the space as a sort of keeper block. I think there's more that can be > done to improve this, but I'm wimping out for now because it's late > here. > > I suspect something like this is a good idea even if we don't end up > using it for aggregate transition functions. Thanks for writing this up, but I think the key issue is not addressed, which is the memory context per aggregate group just being a performance killer. I ran the test query again with the attached patch and the hashagg query still took 300 seconds on my VM with 4GB ram. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services