On 09/14/2017 04:21 PM, Simon Riggs wrote: > On 14 August 2017 at 01:35, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote: >> Hi, >> >> Attached is a rebased version of the Generational context, originally >> submitted with SlabContext (which was already committed into Pg 10). >> >> The main change is that I've abandoned the pattern of defining a Data >> structure and then a pointer typedef, i.e. >> >> typedef struct GenerationContextData { ... } GenerationContextData; >> typedef struct GenerationContextData *GenerationContext; >> >> Now it's just >> >> typedef struct GenerationContext { ... } GenerationContext; >> >> mostly because SlabContext was committed like that, and because Andres was >> complaining about this code pattern ;-) >> >> Otherwise the design is the same as repeatedly discussed before. >> >> To show that this is still valuable change (even after SlabContext and >> adding doubly-linked list to AllocSet), I've repeated the test done by >> Andres in [1] using the test case described in [2], that is >> >> -- generate data >> SELECT COUNT(*) FROM (SELECT test1() >> FROM generate_series(1, 50000)) foo; >> >> -- benchmark (measure time and VmPeak) >> SELECT COUNT(*) FROM (SELECT * >> FROM pg_logical_slot_get_changes('test', NULL, >> NULL, 'include-xids', '0')) foo; >> >> with different values passed to the first step (instead of the 50000). The >> VmPeak numbers look like this: >> >> N master patched >> -------------------------------------- >> 100000 1155220 kB 361604 kB >> 200000 2020668 kB 434060 kB >> 300000 2890236 kB 502452 kB >> 400000 3751592 kB 570816 kB >> 500000 4621124 kB 639168 kB >> >> and the timing (on assert-enabled build): >> >> N master patched >> -------------------------------------- >> 100000 1103.182 ms 412.734 ms >> 200000 2216.711 ms 820.438 ms >> 300000 3320.095 ms 1223.576 ms >> 400000 4584.919 ms 1621.261 ms >> 500000 5590.444 ms 2113.820 ms >> >> So it seems it's still a significant improvement, both in terms of memory >> usage and timing. Admittedly, this is a single test, so ideas of other >> useful test cases are welcome. > > This all looks good. > > What I think this needs is changes to > src/backend/utils/mmgr/README > which decribe the various options that now exist (normal?, slab) and > will exist (generational) > > Don't really care about the name, as long as its clear when to use it > and when not to use it. > > This description of generational seems wrong: "When the allocated > chunks have similar lifespan, this works very well and is extremely > cheap." > They don't need the same lifespan they just need to be freed in groups > and in the order they were allocated. >
Agreed, should be described differently. What matters is (mostly) FIFO pattern of the palloc/pfree requests, which allows us to release the memory. > > For this patch specifically, we need additional comments in > reorderbuffer.c to describe the memory allocation pattern in that > module so that it is clear that the choice of allocator is useful > and appropriate, possibly with details of how that testing was > performed, so it can be re-tested later or tested on a variety of > platforms. > Including details about the testing into reorderbuffer.c does not seem very attractive to me. I don't recall any other place describing the tests in detail. Why not to discuss the details here, and then include a link to this thread in the commit message? In any case, I doubt anyone will repeat the testing on a variety of platforms (and I don't have any such comprehensive test suite anyway). What will likely happen is someone bumping into a poorly performing corner case, we will analyze it and fix it as usual. > > Particularly in reorderbuffer, surely we will almost immediately > reuse chunks again, so is it worth issuing free() and then malloc() > again soon after? Does that cause additional overhead we could also > avoid? Could we possibly keep the last/few free'd chunks around > rather than re-malloc? > I haven't seen anything like that in tests I've done. The malloc/free overhead is negligible thanks as our allocators significantly reduce the number of calls to those functions. If we happen to run into such case, it shouldn't be difficult to keep a few empty blocks. But perhaps we can leave that as a future optimization. > > Seems like we should commit this soon. > Thanks. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers