On Wed, Dec 10, 2014 at 7:34 AM, Heikki Linnakangas
<hlinnakan...@vmware.com> wrote:
>>> 1. I don't really see why resource owners shouldn't be traversed like
>>> that. They are clearly intended to form a hierarchy, and there's
>>> existing code that recurses through the hierarchy from a given level
>>> downward. What's ugly about that?
>
> I can't exactly put my finger on it, but it just feels wrong from a
> modularity point of view. Their purpose is to make sure that we don't
> leak resources on abort, by allowing an easy "release everything"
> operation. They're not designed for finding objects based on some other
> criteria.
>
> There is similar double bookkeeping of many other things that are tracked
> by resource owners. Heavy-weight locks are tracked by LOCALLOCK structs,
> buffer pins in the PrivateRefCount array, etc. Those things existed
> before resource owners were invented, but if we were starting from
> scratch, that design would still make sense, as different objects have
> different access criteria. fd.c needs to be able to find the
> least-recently-used open file, for example, and you need to find the
> snapshot with the lowest xmin.
I think all of these are performance trade-offs rather than questions of
style.  LOCALLOCK structs and private buffer pin tracking existed before
resource owners because we need to do lookups of locks by lock tag or
buffers by buffer number frequently enough that, if we had to grovel
through all the locks or buffer pins in the system without any indexing,
it would be slow.  We *could* do it that way now that we have resource
owners, but it's pretty obvious that it would suck.  That's less obvious
here.  If we throw all of the registered snapshots into a binary heap,
finding the lowest xmin will be cheaper every time we do it, but we'll
have the overhead of maintaining that binary heap even if it never gets
used.

> I think you're confusing the N and the N above. It's true that deleting
> a resource owner is O(M), where M is the number of children of that
> resource owner. It does not follow that releasing N resource owners is
> O(N^2), where N is the number of resource owners released. Calling
> ResourceOwnerDelete(x) will only visit each resource owner in that tree
> once, so it's O(N), where N is the total number of resource owners in
> the tree.

Not if the portals are dropped in separate commands with an unpredictable
ordering.  Then you have N separate calls to ResourceOwnerDelete.

> I did some testing of the worst case scenario. The attached script first
> creates a lot of cursors, then a lot of savepoints, and finally closes
> the cursors in FIFO order. When the number of savepoints is high enough,
> this actually segfaults with your patch, because you run out of stack
> space when recursing into the subxact resource owners. That's hardly
> this patch's fault; I'm actually surprised it doesn't crash without it,
> because we recurse into all resource owners in ResourceOwnerRelease too.
> Apparently the subxacts are closed in LIFO order at commit, but there
> might be other cases where you could trigger that. In any case, a
> stack-depth check would be nice.

Thanks for the test case; that's helpful.  I added a stack depth check,
but it doesn't help much:

ERROR: stack depth limit exceeded
HINT: Increase the configuration parameter "max_stack_depth" (currently
2048kB), after ensuring the platform's stack depth limit is adequate.
STATEMENT: CLOSE cur0;
ERROR: stack depth limit exceeded
HINT: Increase the configuration parameter "max_stack_depth" (currently
2048kB), after ensuring the platform's stack depth limit is adequate.
WARNING: AbortSubTransaction while in ABORT state
ERROR: stack depth limit exceeded
HINT: Increase the configuration parameter "max_stack_depth" (currently
2048kB), after ensuring the platform's stack depth limit is adequate.
WARNING: AbortSubTransaction while in ABORT state
ERROR: stack depth limit exceeded
HINT: Increase the configuration parameter "max_stack_depth" (currently
2048kB), after ensuring the platform's stack depth limit is adequate.
WARNING: AbortSubTransaction while in ABORT state
ERROR: stack depth limit exceeded
HINT: Increase the configuration parameter "max_stack_depth" (currently
2048kB), after ensuring the platform's stack depth limit is adequate.
PANIC: ERRORDATA_STACK_SIZE exceeded

That's at 100k subtransactions.  I took some performance data for lower
numbers of subtransactions:

-- at 1000 subtransactions --
master        0.156s  0.157s  0.154s
advance-xmin  0.177s  0.175s  0.177s

-- at 10000 subtransactions --
master        0.458s  0.457s  0.456s
advance-xmin  0.639s  0.637s  0.633s

I guess this bears some further thought.
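
To make that trade-off concrete, here's a minimal, self-contained sketch
of the heap-based bookkeeping described above.  Everything in it (the
fixed-size array, the function names, the xmin values) is made up for
illustration; it's not the patch and not PostgreSQL's actual snapshot
manager:

/*
 * Illustration only: a binary min-heap of xmins, so that the lowest xmin
 * is always available in O(1) and each register/unregister costs O(log n).
 * The alternative is a flat list that is O(1) to maintain but needs a
 * full O(n) scan every time we want to know the minimum.
 */
#include <stdint.h>
#include <stdio.h>

typedef uint32_t TransactionId;

#define MAX_SNAPSHOTS 1024

static TransactionId heap[MAX_SNAPSHOTS];	/* smallest xmin at heap[0] */
static int	heap_size = 0;

static void
heap_swap(int a, int b)
{
	TransactionId tmp = heap[a];

	heap[a] = heap[b];
	heap[b] = tmp;
}

/* O(log n): pay this every time a snapshot is registered. */
static void
heap_insert(TransactionId xmin)
{
	int			i = heap_size++;

	heap[i] = xmin;
	while (i > 0 && heap[(i - 1) / 2] > heap[i])
	{
		heap_swap(i, (i - 1) / 2);
		i = (i - 1) / 2;
	}
}

/* O(log n): pay this when the snapshot holding the minimum goes away. */
static void
heap_remove_min(void)
{
	int			i = 0;

	heap[0] = heap[--heap_size];
	for (;;)
	{
		int			left = 2 * i + 1;
		int			right = 2 * i + 2;
		int			smallest = i;

		if (left < heap_size && heap[left] < heap[smallest])
			smallest = left;
		if (right < heap_size && heap[right] < heap[smallest])
			smallest = right;
		if (smallest == i)
			break;
		heap_swap(i, smallest);
		i = smallest;
	}
}

int
main(void)
{
	/* Register a few snapshots in arbitrary order. */
	heap_insert(1042);
	heap_insert(1007);
	heap_insert(1130);

	/* The advertised xmin is just the top of the heap: O(1) to read. */
	printf("lowest xmin = %u\n", (unsigned) heap[0]);	/* 1007 */

	/* Unregister the oldest snapshot; the next-lowest xmin surfaces. */
	heap_remove_min();
	printf("lowest xmin = %u\n", (unsigned) heap[0]);	/* 1042 */
	return 0;
}

The point is just that heap[0] is free to read, but every register and
unregister pays the sifting cost whether or not anyone ever asks for the
minimum, which is exactly the overhead I'm worried about above.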
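
On the stack-depth side, here's an equally hypothetical sketch (made-up
FakeOwner type, made-up depth limit, not the real resource owner code) of
the failure mode: each nested subtransaction adds one level to the chain
of resource owners, a recursive walk needs one stack frame per level, and
a depth check only turns the eventual stack overflow into a clean error:

#include <stdio.h>
#include <stdlib.h>

/* Stand-in for a resource owner: one child per nested subtransaction. */
typedef struct FakeOwner
{
	struct FakeOwner *firstchild;
	int			nsnapshots;		/* stand-in for the resources it tracks */
} FakeOwner;

#define MAX_WALK_DEPTH 10000	/* crude stand-in for a stack depth limit */

/*
 * Recursively total the resources below 'owner', erroring out rather than
 * silently smashing the stack when the chain is too deep.
 */
static int
walk_owners(FakeOwner *owner, int depth)
{
	int			total;

	if (depth > MAX_WALK_DEPTH)
	{
		fprintf(stderr, "ERROR: stack depth limit exceeded\n");
		exit(1);
	}
	total = owner->nsnapshots;
	if (owner->firstchild)
		total += walk_owners(owner->firstchild, depth + 1);
	return total;
}

int
main(void)
{
	FakeOwner  *top = calloc(1, sizeof(FakeOwner));
	FakeOwner  *cur = top;

	/* Build a chain 100,000 levels deep, like 100k nested savepoints. */
	for (int i = 0; i < 100000; i++)
	{
		cur->nsnapshots = 1;
		cur->firstchild = calloc(1, sizeof(FakeOwner));
		cur = cur->firstchild;
	}

	/* Fails at MAX_WALK_DEPTH with a clean error instead of segfaulting. */
	printf("total = %d\n", walk_owners(top, 0));
	return 0;
}

That's all the check buys us, of course: the deep case still fails, it
just fails with an ERROR instead of a segfault, which matches the output
above.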
I certainly don't like the fact that this makes the whole system crap out
at a lower number of subtransactions than presently.  The actual
performance numbers don't bother me very much; I'm comfortable with the
possibility that closing a cursor will be some modest percentage slower
if you've got thousands of active savepoints.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company