On Sun, 2025-08-24 at 23:00 +0200, Martin Wilck wrote: > On Thu, 2025-07-10 at 14:11 -0400, Benjamin Marzinski wrote: > > When a reservation key is changed from one non-zero value to > > another, > > libmpathpersist checks if the old key is still holding the > > reservation, > > and preempts it if it is. This was only safe if two nodes never > > share > > the same key. If a node uses the same key as another node that is > > holding the reservation, and switches keys so that it no longer > > matches, > > it will end up preempting the reservation. This is clearly > > unexpected > > behavior, and it can come about by a simple accident of registering > > a > > device with the wrong key, and then immediately fixing it. > > > > To handle this, add code to track if a device is the reservation > > holder > > to multipathd. multipathd now has three new commands "getprhold", > > "setprhold", and "unsetprhold". These commands work like the > > equivalent > > *prstatus commands. libmpathpersist calls setprhold on RESERVE > > commands > > and PREEMPT commands when the preempted key is holding the > > reservation > > and unsetprhold on RELEASE commands. Also, calling unsetprflag > > causes > > prhold to be unset as well, so CLEAR commands and REGISTER commands > > with > > a 0x0 service action key will also unset prhold. libmpathpersist() > > will > > also unset prhold if it notices that the device cannot be holding a > > reservation in preempt_missing_path(). > > > > When a new multipath device is created, its initial prhold state is > > PR_UNKNOWN until it checks the current reservation, just like with > > prflag. If multipathd ever finds that a device's registration has > > been > > preempted or cleared in update_map_pr(), it unsets prhold, just > > like > > with prflag. > > > > Now, before libmpathpersist preempts a reservation when changing > > keys > > it also checks if multipathd thinks that the device is holding > > the reservation. If it does not, then libmpathpersist won't preempt > > the key. It will assume that another node is holding the > > reservation > > with the same key. > > > > I should note that this safety check only stops a node not holding > > the > > reservation from preempting the node holding the reservation. If > > the > > node holding the reservation changes its key, but it fails to > > change > > the > > resevation key, because that path is down or gone, it will still > > issue > > the preempt to move the reservation to a usable path, even if > > another > > node is using the same key. This will remove the registrations for > > that > > other node. It also will not work correctly if multipathd stops > > tracking > > a device for some reason. It's only a best-effort safety check. > > > > Signed-off-by: Benjamin Marzinski <bmarz...@redhat.com> > > --- > > libmpathpersist/mpath_persist_int.c | 87 ++++++++++++++++++++++++- > > -- > > -- > > libmpathpersist/mpath_updatepr.c | 19 ++++++- > > libmpathpersist/mpathpr.h | 2 + > > libmultipath/structs.h | 1 + > > multipathd/callbacks.c | 3 + > > multipathd/cli.c | 4 +- > > multipathd/cli.h | 3 + > > multipathd/cli_handlers.c | 48 ++++++++++++++++ > > multipathd/main.c | 33 ++++++++++- > > 9 files changed, 182 insertions(+), 18 deletions(-) > > > > Two questions all the way down. > > > > > +static void check_prhold(struct multipath *mpp, struct path *pp) > > +{ > > + struct prin_resp resp = {0}; > > + int status; > > + > > + if (mpp->prflag == PR_UNSET) { > > + mpp->prhold = PR_UNSET; > > + return; > > + } > > + if (mpp->prflag != PR_SET || mpp->prhold != PR_UNKNOWN) > > + return; > > How can this situation come to pass? prflag must be PR_UNKNOWN. > Shouldn't we reset prhold to PR_UNKNOWN as well?
Forget this remark. I was confused. > > > + > > + status = prin_do_scsi_ioctl(pp->dev, MPATH_PRIN_RRES_SA, > > &resp, 0); > > + if (status != MPATH_PR_SUCCESS) { > > + condlog (0, "%s: pr in read reservation command > > failed: %d", > > + mpp->wwid, status); > > + return; > > + } > > + mpp->prhold = PR_UNSET; > > + if (!resp.prin_descriptor.prin_readresv.additional_length) > > + return; > > + > > + if (memcmp(&mpp->reservation_key, > > resp.prin_descriptor.prin_readresv.key, 8) == 0) > > + mpp->prhold = PR_SET; > > +} > > + > > static void mpath_pr_event_handle(struct path *pp) > > { > > struct multipath *mpp = pp->mpp; > > @@ -4233,8 +4259,13 @@ static void mpath_pr_event_handle(struct > > path > > *pp) > > return; > > } > > > > - if (update_map_pr(mpp, pp) != MPATH_PR_SUCCESS) > > + if (update_map_pr(mpp, pp) != MPATH_PR_SUCCESS) { > > + if (mpp->prflag == PR_UNSET) > > + mpp->prhold = PR_UNSET; > > Perhaps we should move setting prhold to update_map_pr()? You fixed this in your 2nd set. Martin