On Mon, Aug 22, 2011 at 6:45 PM, Jim Nasby <j...@nasby.net> wrote:
> Something that would be really nice to fix is our reliance on a fixed
> size of shared memory, and I'm wondering if this could be an opportunity
> to start in a new direction. My thought is that we could maintain two
> distinct shared memory snapshots and alternate between them. That would
> allow us to actually resize them as needed. We would still need something
> like what you suggest to allow for adding to the list without locking,
> but with this scheme we wouldn't need to worry about extra locking when
> taking a snapshot since we'd be doing that in a new segment that no one
> is using yet.
>
> The downside is such a scheme does add non-trivial complexity on top of
> what you proposed. I suspect it would be much better if we had a separate
> mechanism for dealing with shared memory requirements (shalloc?). But if
> it's just not practical to make a generic shared memory manager it would
> be good to start thinking about ways we can work around fixed shared
> memory size issues.
Well, the system I'm proposing is actually BETTER than having two
distinct shared memory snapshots. For example, right now we cache up to
64 subxids per backend. I'm imagining that going away and using that
memory for the ring buffer. Out of the box, that would imply a ring
buffer of 64 * 103 = 6592 slots. If the average snapshot lists 100
XIDs, you could rewrite the snapshot dozens of times before the buffer
wraps around, which is obviously a lot more than two. Even if
subtransactions are being heavily used and each snapshot lists 1000
XIDs, you still have enough space to rewrite the snapshot several times
over before wraparound occurs. Of course, at some point the snapshot
gets too big and you have to switch to retaining only the toplevel
XIDs, which is more or less the equivalent of what happens under the
current implementation when any single transaction's subxid cache
overflows. (There's a rough sketch of the kind of ring buffer I mean at
the end of this mail.)

With respect to a general-purpose shared memory allocator, I think
there are cases where that would be useful to have, but I don't think
there are as many of them as people seem to think. I wouldn't choose to
implement this using a general-purpose allocator even if we had it,
both because it's undesirable to allow this or any other subsystem to
consume an arbitrary amount of memory (nor can it be allowed to fail...
especially in the abort path) and because a ring buffer is almost
certainly faster than a general-purpose allocator. We have enough
trouble with palloc overhead already.

That having been said, I do think there are cases where a shared memory
allocator would be nice to have... and it wouldn't surprise me if I end
up working on something along those lines in the next year or so. It
turns out that memory management is a major issue in lock-free
programming: you can't assume that it's safe to recycle an object once
the last pointer to it has been removed from shared memory, because
someone may have fetched the pointer just before you removed it and
still be using it to examine the object. An allocator with some
built-in capabilities for handling such problems seems like it might be
very useful....
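To make the ring buffer idea concrete, here is a minimal sketch of the
sort of thing I'm imagining. It's untested, all of the names are
invented for illustration rather than taken from any patch, and a real
version would need memory barriers around the loads and stores of
"head":

#include <stdint.h>
#include <stdbool.h>

typedef uint32_t TransactionId;         /* as in c.h */

/* Reuse the space currently spent on 64 subxids for each of 103 procs. */
#define SNAPSHOT_RING_SLOTS (64 * 103)  /* = 6592 */

typedef struct SnapshotRing
{
    uint64_t        head;               /* next slot to write; never wraps */
    TransactionId   xids[SNAPSHOT_RING_SLOTS];
} SnapshotRing;

/*
 * Writer: copy the snapshot's XIDs into the ring, clobbering the oldest
 * data, and return the position at which they begin.  Nothing is ever
 * allocated or freed here.  Assumes nxids <= SNAPSHOT_RING_SLOTS; a
 * bigger snapshot falls back to toplevel XIDs only, as described above.
 */
static uint64_t
ring_publish_snapshot(SnapshotRing *ring, const TransactionId *xids,
                      int nxids)
{
    uint64_t    start = ring->head;
    int         i;

    for (i = 0; i < nxids; i++)
        ring->xids[(start + i) % SNAPSHOT_RING_SLOTS] = xids[i];

    ring->head = start + nxids;         /* publish after the data */
    return start;
}

/*
 * Reader: copy out the snapshot published at 'start', then check whether
 * the writer lapped us while we were copying.  If it did, the copy may
 * be garbage, and the caller retries against the newest snapshot.
 */
static bool
ring_read_snapshot(const SnapshotRing *ring, uint64_t start, int nxids,
                   TransactionId *dest)
{
    int         i;

    for (i = 0; i < nxids; i++)
        dest[i] = ring->xids[(start + i) % SNAPSHOT_RING_SLOTS];

    return ring->head - start <= SNAPSHOT_RING_SLOTS;
}

The point is that the writer never blocks on readers and never
allocates anything: a reader that gets overtaken simply retries, which
should be rare as long as snapshots stay small relative to the buffer.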
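As for the recycling problem, the classic illustration is a hazard
pointer scheme: each reader advertises the pointer it is about to
dereference, and the writer declines to free an object while anyone is
still advertising it. Again, this is a toy version for illustration,
using C11 atomics rather than anything in our tree, and in real life
the slot array would live in shared memory:

#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_BACKENDS 8                  /* toy fixed limit */

/* One "I am examining this object" slot per backend. */
static _Atomic(void *) hazard_slot[MAX_BACKENDS];

/*
 * Reader: advertise the pointer before using it, then recheck that it
 * is still the one published in shared memory.  If it changed, the
 * object we advertised may already have been recycled, so retry.
 */
static void *
hazard_acquire(int my_slot, _Atomic(void *) *shared)
{
    void       *p;

    do
    {
        p = atomic_load(shared);
        atomic_store(&hazard_slot[my_slot], p);
    } while (p != atomic_load(shared));

    return p;
}

static void
hazard_release(int my_slot)
{
    atomic_store(&hazard_slot[my_slot], NULL);
}

/*
 * Writer: 'obj' has already been unlinked from shared memory, so no new
 * reader can reach it; free it once nobody still advertises it.  A real
 * implementation would queue it and recheck later instead of spinning.
 */
static void
hazard_retire(void *obj)
{
    bool        busy;
    int         i;

    do
    {
        busy = false;
        for (i = 0; i < MAX_BACKENDS; i++)
            if (atomic_load(&hazard_slot[i]) == obj)
                busy = true;
    } while (busy);

    free(obj);
}

The expensive part is the scan in hazard_retire, but that's off the hot
path; acquire and release are just a couple of loads and stores. An
allocator that baked in that sort of deferred reclamation is the kind
of thing I have in mind.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company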