On 12/09/2014 10:35 PM, Robert Haas wrote:
> On Mon, Dec 8, 2014 at 9:31 AM, Robert Haas <robertmh...@gmail.com> wrote:
>> On Mon, Dec 8, 2014 at 4:56 AM, Heikki Linnakangas
>> <hlinnakan...@vmware.com> wrote:
>>> I don't immediately see the problem either, but I have to say that
>>> grovelling through all the resource owners seems ugly anyway. Resource
>>> owners are not meant to be traversed like that. And there could be a
>>> lot of them, and you have to visit every one of them. That could get
>>> expensive if there are a lot of resource owners.
>
> 1. I don't really see why resource owners shouldn't be traversed like
> that. They are clearly intended to form a hierarchy, and there's
> existing code that recurses through the hierarchy from a given level
> downward. What's ugly about that?
I can't exactly put my finger on it, but it just feels wrong from a modularity point of view. The purpose of resource owners is to make sure that we don't leak resources on abort, by providing an easy "release everything" operation. They are not designed for finding objects based on some other criteria.
There is similar double bookkeeping of many other things that are tracked by resource owners. Heavy-weight locks are tracked by LOCALLOCK structs, buffer pins in the PrivateRefCount array, and so on. Those things existed before resource owners were invented, but if we were starting from scratch, that design would still make sense, because different objects have different access criteria. fd.c needs to be able to find the least-recently-used open file, for example, and here you need to find the snapshot with the lowest xmin.
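To illustrate what I mean by dedicated bookkeeping: if snapmgr.c kept the registered snapshots' xmins in a small ordered structure of its own, the value to advertise would always be at hand, with no resource owner traversal at all. A standalone sketch, not real backend code - TransactionId is modeled as a plain uint32 and xid wraparound is ignored for simplicity:

#include <stdint.h>
#include <stdio.h>

typedef uint32_t TransactionId;

#define MAX_SNAPSHOTS 1024

static TransactionId heap[MAX_SNAPSHOTS];
static int heap_size = 0;

/* binary min-heap insert: O(log n) */
static void
heap_insert(TransactionId xmin)
{
    int     i = heap_size++;

    /* sift up: move larger parents down until xmin fits */
    while (i > 0 && heap[(i - 1) / 2] > xmin)
    {
        heap[i] = heap[(i - 1) / 2];
        i = (i - 1) / 2;
    }
    heap[i] = xmin;
}

/* the xmin we would advertise is always at the top: O(1) */
static TransactionId
lowest_xmin(void)
{
    return heap[0];
}

int
main(void)
{
    heap_insert(100);
    heap_insert(95);
    heap_insert(120);
    printf("advertise xmin = %u\n", lowest_xmin());     /* prints 95 */
    return 0;
}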
> Upthread, I suggested keeping a tally of the number of snapshots with
> the advertised xmin and recomputing the xmin to advertise only when it
> reaches 0. This patch doesn't implement that optimization, but it does
> have code that aborts the traversal of the resource owner hierarchy as
> soon as we see an xmin that will preclude advancing our advertised
> xmin. Releasing N resource owners could therefore cost O(N^2) in the
> worst case, but note that releasing N resource owners is *already* an
> O(N^2) operation in the worst case, because the list of children of a
> particular parent resource owner is singly linked, and thus deleting a
> resource owner is O(N). It's been that way for an awfully long time
> without anyone complaining, probably because (a) it's not particularly
> common to have large numbers of cursors open simultaneously and (b)
> even if you do have that, the constant factor is pretty low.
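(For concreteness, I read the tally scheme as something like the following standalone sketch; all the names here are mine, not the patch's.)

#include <stdint.h>
#include <stdio.h>

typedef uint32_t TransactionId;

static TransactionId snap_xmins[1024];  /* xmins of live snapshots */
static int      nsnaps = 0;
static TransactionId advertised_xmin = 0;
static int      advertised_count = 0;   /* live snapshots sharing advertised_xmin */

/* full O(n) rescan, done only when the tally hits zero */
static void
recompute_advertised_xmin(void)
{
    advertised_count = 0;
    for (int i = 0; i < nsnaps; i++)
    {
        if (advertised_count == 0 || snap_xmins[i] < advertised_xmin)
        {
            advertised_xmin = snap_xmins[i];
            advertised_count = 1;
        }
        else if (snap_xmins[i] == advertised_xmin)
            advertised_count++;
    }
}

static void
register_snapshot(TransactionId xmin)
{
    snap_xmins[nsnaps++] = xmin;
    if (advertised_count == 0 || xmin < advertised_xmin)
    {
        advertised_xmin = xmin;
        advertised_count = 1;
    }
    else if (xmin == advertised_xmin)
        advertised_count++;
}

static void
release_snapshot(int i)
{
    TransactionId xmin = snap_xmins[i];

    snap_xmins[i] = snap_xmins[--nsnaps];
    /* only when the last holder of the advertised xmin goes away do we rescan */
    if (xmin == advertised_xmin && --advertised_count == 0)
        recompute_advertised_xmin();
}

int
main(void)
{
    register_snapshot(90);
    register_snapshot(90);
    register_snapshot(100);
    release_snapshot(0);        /* another 90 remains: no rescan */
    release_snapshot(1);        /* last 90 gone: rescan finds 100 */
    printf("advertise xmin = %u\n", advertised_xmin);   /* prints 100 */
    return 0;
}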
I think you're conflating two different Ns there. It's true that deleting a single resource owner is O(M), where M is the number of children of its parent: that's the cost of unlinking it from the singly-linked child list. It does not follow that releasing N resource owners is O(N^2), where N is the number of resource owners released. Calling ResourceOwnerDelete(x) visits each resource owner in that tree only once, so it's O(N), where N is the total number of resource owners in the tree.
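To make the counting concrete, here's a toy standalone model of the tree shape resowner.c uses (singly linked child lists); this is not the real code, just the structure of the argument:

#include <stdlib.h>

typedef struct ResOwner
{
    struct ResOwner *parent;
    struct ResOwner *firstchild;    /* head of child list */
    struct ResOwner *nextchild;     /* next sibling; list is singly linked */
} ResOwner;

/*
 * Unlinking an arbitrary owner must walk its parent's child list to find
 * it: that's the O(M) part, M being the number of children of the parent.
 */
static void
unlink_from_parent(ResOwner *owner)
{
    ResOwner  **link = &owner->parent->firstchild;

    while (*link != owner)
        link = &(*link)->nextchild;
    *link = owner->nextchild;
}

/*
 * Deleting a whole subtree visits each node exactly once, and always
 * removes the current head of its parent's child list, so the sibling
 * walk above terminates immediately: O(N) total, not O(N^2).
 */
static void
delete_tree(ResOwner *owner)
{
    while (owner->firstchild != NULL)
        delete_tree(owner->firstchild);
    if (owner->parent != NULL)
        unlink_from_parent(owner);
    free(owner);
}

int
main(void)
{
    ResOwner   *root = calloc(1, sizeof(ResOwner));

    for (int i = 0; i < 3; i++)
    {
        ResOwner   *child = calloc(1, sizeof(ResOwner));

        child->parent = root;
        child->nextchild = root->firstchild;
        root->firstchild = child;
    }
    delete_tree(root);          /* four nodes, four visits */
    return 0;
}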
I did some testing of the worst-case scenario. The attached script first creates a lot of cursors, then a lot of savepoints, and finally closes the cursors in FIFO order. When the number of savepoints is high enough, this actually segfaults with your patch, because you run out of stack space while recursing into the subxact resource owners. That's hardly this patch's fault; I'm actually surprised it doesn't crash without it, because we recurse into all resource owners in ResourceOwnerRelease too. Apparently the subxacts are closed in LIFO order at commit, but there might be other cases where you could trigger that. In any case, a stack-depth check would be nice; see the sketch below.
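Concretely, I mean the usual backend pattern: call check_stack_depth() from miscadmin.h at the top of the recursive function, so we fail with the normal "stack depth limit exceeded" error instead of overrunning the C stack. Untested sketch; ResourceOwnerTraverse is a placeholder name for whatever function does the recursion, and it would have to live in resowner.c, where the child links are visible:

#include "postgres.h"
#include "miscadmin.h"          /* check_stack_depth() */

static void
ResourceOwnerTraverse(ResourceOwner owner)
{
    ResourceOwner child;

    /*
     * Bail out with the standard "stack depth limit exceeded" error
     * before the recursion can overflow the C stack.
     */
    check_stack_depth();

    /* ... look at this owner's snapshots here ... */

    for (child = owner->firstchild; child != NULL; child = child->nextchild)
        ResourceOwnerTraverse(child);
}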
- Heikki
Attachment: cursors.sh (Bourne shell script)