On Wed, Sep 24, 2025 at 08:37:36PM +0200, Martin Wilck wrote: > On Tue, 2025-09-23 at 18:12 -0400, Benjamin Marzinski wrote: > > Instead of open-coding mostly the same work, just make > > mpath_prout_rel() > > call check_holding_reservation(). > > > > Signed-off-by: Benjamin Marzinski <[email protected]> > > --- > > libmpathpersist/mpath_persist_int.c | 86 ++++++++++++--------------- > > -- > > 1 file changed, 35 insertions(+), 51 deletions(-) > > > > diff --git a/libmpathpersist/mpath_persist_int.c > > b/libmpathpersist/mpath_persist_int.c > > index 33c9e57a..9c71011f 100644 > > --- a/libmpathpersist/mpath_persist_int.c > > +++ b/libmpathpersist/mpath_persist_int.c > > @@ -639,12 +639,42 @@ static int preempt_all(struct multipath *mpp, > > int rq_servact, int rq_scope, > > noisy, false); > > } > > > > +static int reservation_key_matches(struct multipath *mpp, uint8_t > > *key, > > + unsigned int *type) > > +{ > > + struct prin_resp resp = {{{.prgeneration = 0}}}; > > + int status; > > + > > + status = mpath_prin_activepath(mpp, MPATH_PRIN_RRES_SA, > > &resp, 0); > > + if (status != MPATH_PR_SUCCESS) { > > + condlog(0, "%s: pr in read reservation command > > failed.", mpp->wwid); > > + return YNU_UNDEF; > > + } > > + if (!resp.prin_descriptor.prin_readresv.additional_length) > > + return YNU_NO; > > + if (memcmp(key, resp.prin_descriptor.prin_readresv.key, 8) > > == 0) { > > + if (type) > > + *type = > > resp.prin_descriptor.prin_readresv.scope_type & > > + MPATH_PR_TYPE_MASK; > > + return YNU_YES; > > + } > > + return YNU_NO; > > +} > > + > > +static bool check_holding_reservation(struct multipath *mpp, > > unsigned int *type) > > +{ > > + if (get_be64(mpp->reservation_key) && > > + get_prhold(mpp->alias) == PR_SET && > > + reservation_key_matches(mpp, (uint8_t *)&mpp- > > >reservation_key, type) == YNU_YES) > > + return true; > > + return false; > > +} > > + > > static int mpath_prout_rel(struct multipath *mpp,int rq_servact, int > > rq_scope, > > unsigned int rq_type, > > struct prout_param_descriptor * paramp, > > int noisy) > > { > > int i, j; > > - int num = 0; > > struct pathgroup *pgp = NULL; > > struct path *pp = NULL; > > int active_pathcount = 0; > > @@ -652,9 +682,8 @@ static int mpath_prout_rel(struct multipath > > *mpp,int rq_servact, int rq_scope, > > int rc; > > int count = 0; > > int status = MPATH_PR_SUCCESS; > > - struct prin_resp resp = {{{.prgeneration = 0}}}; > > bool all_threads_failed; > > - unsigned int scope_type; > > + unsigned int res_type; > > > > if (!mpp) > > return MPATH_PR_DMMP_ERROR; > > @@ -733,27 +762,13 @@ static int mpath_prout_rel(struct multipath > > *mpp,int rq_servact, int rq_scope, > > return status; > > } > > > > - status = mpath_prin_activepath (mpp, MPATH_PRIN_RRES_SA, > > &resp, noisy); > > - if (status != MPATH_PR_SUCCESS){ > > - condlog (0, "%s: pr in read reservation command > > failed.", mpp->wwid); > > - return MPATH_PR_OTHER; > > - } > > - > > - num = resp.prin_descriptor.prin_readresv.additional_length / > > 8; > > - if (num == 0){ > > - condlog (2, "%s: Path holding reservation is > > released.", mpp->wwid); > > - return MPATH_PR_SUCCESS; > > - } > > - if (!get_be64(mpp->reservation_key) || > > - memcmp(&mpp->reservation_key, > > resp.prin_descriptor.prin_readresv.key, 8)) { > > + if (!check_holding_reservation(mpp, &res_type)) { > > The old code didn't call check_prhold() while > check_holding_reservation() does. > > Is it possible that get_prhold() would return a "false negative" and > thus we'd wrongly skip releasing the reservation?
At the point where we check this, we've already done the RELEASE and gotten a success, but the reservation is still held. This means that either an unavailable path on this device is holding it, or another device that is using the same key is holding it. These checks are to determine whether or not we PREEMPT the key to force-clear the reservation. What we are adding here is a requirement that multipathd says we are holding the reservation. I'm not sure how we would get a false negative. When multipathd first starts tracking a device (when it starts up, creates a new device, or is reconfigured) it assumes that there aren't multiple devices holding the same key. So if it sees that its configured key is holding the reservation, it assumes that this device is holding the reservation. It will only unset the holding state if it notices that the devices key is not registered, or it is told to drop the reservation. Once it is started the only way it will set the holding state on a device is when libmpathpersist tells it to, but that should always happen when libmpathpersist grabs a reservation. Now, this code is admittedly complex, and there are lots of corner cases that I might not have thought through which could produce a false negative, and it is always possible that libmpathpersist fails to communicate with multipathd to set prhold. Also, I don't think we really need to worry much about another device holding a duplicate key when we are explicitly asked to release the reservation (that means whoever is issuing this command also thinks we own the device). So I'm fine with backing this change out. I was on the fence about it myself. If I do, check_holding_reservation() and reservation_key_matches() will still need to be moved earlier in the code for the next patch. That patch also makes changes to check_holding_reservation(). Do you prefer moving the functions in a separate patch from editting them. Or are you fine with doing both in the same patch, as long as I call it out. -Ben > Martin
