On Mon, 2025-09-22 at 20:38 +0200, Martin Wilck wrote:
> On Thu, 2025-09-11 at 13:56 -0400, Benjamin Marzinski wrote:
> > On Wed, Sep 10, 2025 at 09:41:57PM +0200, Xose Vazquez Perez wrote:
> > > Source:
> > > https://dl.acronis.com/u/software-defined/html/AcronisCyberInfrastructure_3_5_users_guide_en-US/accessing-iscsi/accessing-iscsi-targets-from-linux.html
> > >
> > > Cc: Martin Wilck <[email protected]>
> > > Cc: Benjamin Marzinski <[email protected]>
> > > Cc: Christophe Varoqui <[email protected]>
> > > Cc: DM_DEVEL-ML <[email protected]>
> > > Signed-off-by: Xose Vazquez Perez <[email protected]>
> > > ---
> > >  libmultipath/hwtable.c | 15 +++++++++++++++
> > >  1 file changed, 15 insertions(+)
> > >
> > > diff --git a/libmultipath/hwtable.c b/libmultipath/hwtable.c
> > > index 1a78c36d..12e10577 100644
> > > --- a/libmultipath/hwtable.c
> > > +++ b/libmultipath/hwtable.c
> > > @@ -1371,6 +1371,21 @@ static struct hwentry default_hw[] = {
> > >  		.pgpolicy = GROUP_BY_SERIAL,
> > >  		.no_path_retry = 30,
> > >  	},
> > > +	/*
> > > +	 * Acronis
> > > +	 */
> > > +	{
> > > +		// Cyber Infrastructure
> > > +		.vendor = "VSTORAGE",
> > > +		.product = "VSTOR-DISK",
> > > +		.prio_name = PRIO_ALUA,
> > > +		.pgpolicy = GROUP_BY_NODE_NAME,
> > > +		.detect_prio = DETECT_PRIO_OFF,
> > > +		.features = "2 pg_init_retries 50",
> > > +		.pgfailback = -FAILBACK_FOLLOWOVER,
> >
> > I'm not sure about the pgfailback setting. the FAILBACK_FOLLOWOVER
> > mode is not something that is really needed for a specific array
> > type. It works like FAILOVER_IMMEDIATE, but it's designed to work so
> > that if multiple nodes are using the same multipath device, and one
> > of them loses access to the highest priority pathgroup but the
> > others don't, multipath won't keep switching pathgroups back and
> > forth. The nodes that can see the higher priority pathgroup won't
> > automatically switch back to it, since they weren't the ones that
> > switched away from it. So the setting is more for a specific use
> > case, and not a specific array type.
> >
> > On the other hand, it is what the vendor tested, and with a single
> > machine accessing the multipath device, you usually only switch away
> > from the highest priority pathgroup because you lost access to all
> > the paths in it.
>
> Not necessarily. path_group_prio_update() calculates the average of
> the path priorities in the group. With ALUA and groups of 6+ paths, an
> optimized group with one healthy path will have a lower prio (8) than
> a non-optimized group with all healthy paths (10). In such a case it
> could happen that multipathd switches to the non-optimized group and
> never switches back.

Sorry, this example was incorrect. Only paths in UP or GHOST states are
counted toward the PG prio. In this example the optimized group would
still have p = 50, and what I described would not occur.

The example would be correct if the 5 non-"healthy" paths in the first
PG were in standby aka GHOST state (resulting in
p = (50 + 5 * 1) / 6 = 55 / 6 = 9, which is below the 10 of the
non-optimized group). But that's a very different scenario, and highly
theoretical.

> I would suggest setting FAILBACK_IMMEDIATE instead. It's well
> documented that FOLLOWOVER is only for cluster environments.

Despite the wrong example, I still think FAILBACK_IMMEDIATE makes more
sense as a general default. We can add a comment about the vendor
recommendations.

Martin
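
For illustration, the averaging discussed above can be sketched roughly
as follows. This is a minimal sketch, not the actual
path_group_prio_update() code: the type and function names are
hypothetical, the standby priority of 1 is inferred from the 55 / 6
figure, and returning 0 for a group with no usable paths is an
assumption made only to keep the example well defined.

```c
#include <stddef.h>

/* Hypothetical path states for this sketch; not libmultipath's own enum. */
enum sketch_state { SK_DOWN, SK_UP, SK_GHOST };

struct sketch_path {
	enum sketch_state state;
	int prio;	/* thread values: 50 optimized, 10 non-optimized, 1 standby */
};

/*
 * Average the priorities of the paths that are in UP or GHOST state.
 * Paths in any other state are skipped entirely, so they neither add to
 * the sum nor to the divisor. Integer division truncates the result.
 */
int sketch_group_prio(const struct sketch_path *paths, size_t n)
{
	int sum = 0;
	int counted = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (paths[i].state != SK_UP && paths[i].state != SK_GHOST)
			continue;
		sum += paths[i].prio;
		counted++;
	}

	/* Assumption: a group with no usable paths gets priority 0. */
	return counted ? sum / counted : 0;
}
```

With one optimized path up and five failed paths, only the single
usable path is counted and the group keeps p = 50. With one optimized
path up and five standby (GHOST) paths, all six are counted and the
group drops to 55 / 6 = 9.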
