On 01/11/2022 02:15, Andres Freund wrote:
On 2022-10-31 10:51:36 +0100, Heikki Linnakangas wrote:
These are functions where quite a lot of things happen between the
ResourceOwnerEnlarge and ResourceOwnerRemember calls. It's important that
there are no unrelated ResourceOwnerRemember() calls in the code in
between, otherwise the entry reserved by the ResourceOwnerEnlarge() call
might be used up by the intervening ResourceOwnerRemember() and not be
available at the intended ResourceOwnerRemember() call anymore. The longer
the code path between them is, the harder it is to verify that.

This seems to work towards a future where only one kind of resource can be
reserved ahead of time. That doesn't strike me as great.

True. It is enough for all the current callers AFAICS, though. I don't remember wanting to reserve multiple resources at the same time. Usually, the distance between the Enlarge and Remember calls is very short. If it's not, it's error-prone anyway, so we should try to keep the distance short.

While we're at it, who says that it's enough to reserve space for only one resource of a kind either? For example, what if you want to pin two buffers?

If we really need to support that, I propose something like this:

/*
 * Reserve a slot in the resource owner.
 *
 * This is separate from actually inserting an entry because if we run out
 * of memory, it's critical to do so *before* acquiring the resource.
 */
ResOwnerReservation *
ResourceOwnerReserve(ResourceOwner owner)

/*
 * Remember that an object is owner by a ResourceOwner
 *
 * 'reservation' is a slot in the resource owner that was pre-reserved
 * by ResOwnerReservation.
 */
void
ResOwnerRemember(ResOwnerReservaton *reservation, Datum value, ResourceOwnerFuncs *kind)


Instead of having a separate array/hash for each resource kind, use a
single array and hash to hold all kinds of resources. This makes it
possible to introduce new resource "kinds" without having to modify the
ResourceOwnerData struct. In particular, this makes it possible for
extensions to register custom resource kinds.

As a goal I like this.

However, I'm not quite sold on the implementation. Two main worries:

1) As far as I can tell, the way ResourceOwnerReleaseAll() now works seems to
    assume that within a phase the reset order does not matter. I don't think
    that's a good assumption. I e.g. have a patch to replace InProgressBuf with
    resowner handling, and in-progress IO errors should be processed before
    before pins are released.

Hmm. Currently, you're not supposed to hold any resources at commit. You get warnings about resource leaks if a resource owner is not empty on ResourceOwnerReleaseAll(). On abort, does the order matter? I'm not familiar with your InProgressBuf patch, but I guess you could handle the in-progress IO errors in ReleaseBuffer().

If we do need to worry about release order, perhaps add a "priority" or "phase" to each resource kind, and release them in priority order. We already have before- and after-locks as phases, but we could generalize that.

However, I feel that trying to enforce a particular order moves the goalposts. If we need that, let's add it as a separate patch later.

2) There's quite a few resource types where we actually don't need an entry in
    an array, because we can instead add a dlist_node to the resource -
    avoiding memory overhead and making removal cheap. I have a few pending
    patches that use that approach, and this doesn't really provide a path for
    that anymore.

Is that materially better than using the array? The fast path with an array is very fast. If it is better, perhaps we should bite the bullet and require a dlist node and use that mechanism for all resource types?

I did try out the benchmark from
https://postgr.es/m/20221029200025.w7bvlgvamjfo6z44%40awork3.anarazel.de and
the patches performed well, slightly better than my approach of allocating
some initial memory for each resarray.

Thank you, glad to hear that!

- Heikki



Reply via email to