No problem Ethan. I just sent a v2 of the patch with a more elaborate commit message. You can ignore that as we can continue to use this thread for the discussion. My comments are inline
On 7/8/13 2:04 PM, "Ethan Jackson" <et...@nicira.com> wrote: >Sorry it took me so long to get back to you on this, been busy with >the holiday and what not. > >> Thanks for the quick response. The primary driver for this patch is >> providing backwards compatibility. >> >> In the upcoming XenServer release we are updating OpenVSwitch from 1.4.2 >> to 1.9.x. We found this change in LACP behavior while testing 1.9.x. We >> would like to continue to provide the 1.4.2 behavior for XenServer >> customers upgrading to the version containing 1.9.x. By making it >> configurable, it allows for changing it as needed. The default in the >> patch is to provide the current 1.9.x behavior. The default from >>XenServer >> commands is to provide 1.4.2 behavior so there is no change on upgrade. > >I understand that OVS 1.4.2 had different behavior, but this doesn't >really answer my original question. Does this feature really make >sense? Do you think there exists real XenServer customers who >explicitly configured LACP on their hypervisors but not on their >upstream switches? Even if there are some, wouldn't a release note to >the effect of "make sure LACP is off unless you need it" would be >sufficient? What's the actual use case beyond "it once worked >differently"? We do not have any known usecases from XenServer customers where this is needed. We also do not expect a customer to explicitly configure this way. However, we want to avoid the case of the customer having configured this way with proper connectivity that breaks on upgrade. > >Assuming there's a satisfying response to the above, we'd want the >default to be the current 1.9 behavior. You can set the option in >XenServer if you think it's appropriate there. The default in this patch going into openvswitch code base is 1.9 behavior. > >> We internally had discussed options to make the fallback method as >> Active-Backup or user-configurable. In the interest of keeping the >>change >> small and retaining 1.4.2 behavior we reverted to SLB with a flag check. >> Fallback to SLB in this patch is mostly existing (1.4.2) code. > >Active backup is significantly safer than SLB, so if we're falling >back as a convenience to users, I feel pretty strongly that it's a >better choice. If they want SLB they can configure it through the >proper channels. I believe we do not have any objections to making the fallback as Active-Backup. We can re-work the patch to make AB as the fallback. Thanks, Ravi. > >Ethan > >> On 6/28/13 1:04 PM, "Ethan Jackson" <et...@nicira.com> wrote: >> >>>This doesn't constitute a full review, but I have a couple of comments. >>> >>>Can you describe a realistic situation in which a user configured >>>their hypervisor to run LACP, the negotiation failed, and the correct >>>behavior is to forward traffic despite a clear problem in their >>>network config? At best, the upstream switch is dead or missing, at >>>worst it's completely misconfigured and we could be doing more harm >>>than good. In short, I don't understand the use case for this option. >>> >>>Assuming we decide to optionally fall back. I'd prefer we fall back >>>to active-backup instead of SLB which is far more dangerous. >>> >>>Please add your signed-off-by to this patch. >>> >>>Ethan >>> >>> >>> >>>On Fri, Jun 28, 2013 at 9:58 AM, Ravi Kondamuru >>><ravi.kondam...@citrix.com> wrote: >>>> Commit bdebeece5 (lacp: Require successful LACP negotiations when >>>> configured.) makes successful LACP negotiation mandatory. >>>> This change makes it configurable. The user has the option to >>>> set other-config:lacp-fallback-slb to true if pre 1.5.0 behavior >>>> is needed. The switch will fallback to balance-slb when LACP >>>> negotiation fails or is unsupported by the partner switch. The >>>> default is false, requiring a successful LACP negotiation (which is >>>> the current behavior). >>>> >>>> Signed-off-by: Dominic Curran <dominic.cur...@citrix.com> >>>> --- >>>> lib/bond.c | 41 +++++++++++++++++++++++++++-------------- >>>> lib/bond.h | 1 + >>>> lib/lacp.c | 15 +++++++++++++-- >>>> lib/lacp.h | 1 + >>>> vswitchd/bridge.c | 17 +++++++++++++++++ >>>> vswitchd/vswitch.xml | 19 +++++++++++++++++-- >>>> 6 files changed, 76 insertions(+), 18 deletions(-) >>>> >>>> diff --git a/lib/bond.c b/lib/bond.c >>>> index 68ac068..6365cd0 100644 >>>> --- a/lib/bond.c >>>> +++ b/lib/bond.c >>>> @@ -103,6 +103,7 @@ struct bond { >>>> >>>> /* Legacy compatibility. */ >>>> long long int next_fake_iface_update; /* LLONG_MAX if disabled. >>>>*/ >>>> + bool lacp_fallback_slb; /* fallback to balance-slb on lacp >>>>failure >>>>*/ >>>> >>>> /* Tag set saved for next bond_run(). This tag set is a kluge >>>>for >>>>cases >>>> * where we can't otherwise provide revalidation feedback to the >>>>client. >>>> @@ -238,6 +239,11 @@ bond_reconfigure(struct bond *bond, const struct >>>>bond_settings *s) >>>> bond->updelay = s->up_delay; >>>> bond->downdelay = s->down_delay; >>>> >>>> + if (bond->lacp_fallback_slb != s->lacp_fallback_slb_cfg) { >>>> + bond->lacp_fallback_slb = s->lacp_fallback_slb_cfg; >>>> + revalidate = true; >>>> + } >>>> + >>>> if (bond->rebalance_interval != s->rebalance_interval) { >>>> bond->rebalance_interval = s->rebalance_interval; >>>> revalidate = true; >>>> @@ -463,8 +469,9 @@ bond_wait(struct bond *bond) >>>> static bool >>>> may_send_learning_packets(const struct bond *bond) >>>> { >>>> - return bond->lacp_status == LACP_DISABLED >>>> - && (bond->balance == BM_SLB || bond->balance == BM_AB) >>>> + return ((bond->lacp_status == LACP_DISABLED >>>> + && (bond->balance == BM_SLB || bond->balance == BM_AB)) >>>> + || (bond->lacp_fallback_slb && bond->lacp_status == >>>>LACP_CONFIGURED)) >>>> && bond->active_slave; >>>> } >>>> >>>> @@ -546,10 +553,12 @@ bond_check_admissibility(struct bond *bond, >>>>const >>>>void *slave_, >>>> * packets to them. >>>> * >>>> * If LACP is configured, but LACP negotiations have been >>>>unsuccessful, we >>>> - * drop all incoming traffic. */ >>>> + * drop all incoming traffic except if lacp_fallback_slb is true. >>>>*/ >>>> switch (bond->lacp_status) { >>>> case LACP_NEGOTIATED: return slave->enabled ? BV_ACCEPT : >>>>BV_DROP; >>>> - case LACP_CONFIGURED: return BV_DROP; >>>> + case LACP_CONFIGURED: >>>> + if (!bond->lacp_fallback_slb) >>>> + return BV_DROP; >>>> case LACP_DISABLED: break; >>>> } >>>> >>>> @@ -578,9 +587,11 @@ bond_check_admissibility(struct bond *bond, const >>>>void *slave_, >>>> >>>> case BM_TCP: >>>> /* TCP balanced bonds require successful LACP negotiated. >>>>Based on the >>>> - * above check, LACP is off on this bond. Therfore, we drop >>>>all >>>> - * incoming traffic. */ >>>> - return BV_DROP; >>>> + * above check, LACP is off or lacp_fallback_slb is true on >>>>this bond. >>>> + * If lacp_fallback_slb is true fall through to BM_SLB case >>>>else, we >>>> + * drop all incoming traffic. */ >>>> + if (!bond->lacp_fallback_slb) >>>> + return BV_DROP; >>>> >>>> case BM_SLB: >>>> /* Drop all packets for which we have learned a different >>>>input port, >>>> @@ -1359,9 +1370,9 @@ choose_output_slave(const struct bond *bond, >>>>const struct flow *flow, >>>> { >>>> struct bond_entry *e; >>>> >>>> - if (bond->lacp_status == LACP_CONFIGURED) { >>>> + if (bond->lacp_status == LACP_CONFIGURED && >>>>!bond->lacp_fallback_slb) { >>>> /* LACP has been configured on this bond but negotiations >>>>were >>>> - * unsuccussful. Drop all traffic. */ >>>> + * unsuccussful and lacp_fallback_slb not true. Drop all >>>>traffic. */ >>>> return NULL; >>>> } >>>> >>>> @@ -1370,13 +1381,15 @@ choose_output_slave(const struct bond *bond, >>>>const struct flow *flow, >>>> return bond->active_slave; >>>> >>>> case BM_TCP: >>>> - if (bond->lacp_status != LACP_NEGOTIATED) { >>>> - /* Must have LACP negotiations for TCP balanced bonds. */ >>>> + if (bond->lacp_status == LACP_NEGOTIATED) { >>>> + if (wc) >>>> + flow_mask_hash_fields(wc, >>>>NX_HASH_FIELDS_SYMMETRIC_L4); >>>> + } else if (!((bond->lacp_status == LACP_CONFIGURED) >>>> + && bond->lacp_fallback_slb)) { >>>> + /* Must have LACP negotiations for TCP balanced bonds or >>>>LACP >>>> + * Configured with lacp_fallback_slb set to true. */ >>>> return NULL; >>>> } >>>> - if (wc) { >>>> - flow_mask_hash_fields(wc, NX_HASH_FIELDS_SYMMETRIC_L4); >>>> - } >>>> /* Fall Through. */ >>>> case BM_SLB: >>>> if (wc) { >>>> diff --git a/lib/bond.h b/lib/bond.h >>>> index 306cf42..f466383 100644 >>>> --- a/lib/bond.h >>>> +++ b/lib/bond.h >>>> @@ -54,6 +54,7 @@ struct bond_settings { >>>> >>>> /* Legacy compatibility. */ >>>> bool fake_iface; /* Update fake stats for netdev >>>>'name'? */ >>>> + bool lacp_fallback_slb_cfg; /* fallback to balance-slb on >>>>lacp failure */ >>>> }; >>>> >>>> /* Program startup. */ >>>> diff --git a/lib/lacp.c b/lib/lacp.c >>>> index 8bc115d..e2b6e91 100644 >>>> --- a/lib/lacp.c >>>> +++ b/lib/lacp.c >>>> @@ -100,6 +100,7 @@ struct lacp { >>>> bool fast; /* True if using fast probe interval. */ >>>> bool negotiated; /* True if LACP negotiations were >>>>successful. */ >>>> bool update; /* True if lacp_update() needs to be >>>>called. */ >>>> + bool fallback_slb; /* fallback to balance-slb on lacp >>>>failure */ >>>> }; >>>> >>>> struct slave { >>>> @@ -238,6 +239,11 @@ lacp_configure(struct lacp *lacp, const struct >>>>lacp_settings *s) >>>> >>>> lacp->active = s->active; >>>> lacp->fast = s->fast; >>>> + >>>> + if (lacp->fallback_slb != s->fallback_slb_cfg) { >>>> + lacp->fallback_slb = s->fallback_slb_cfg; >>>> + lacp->update = true; >>>> + } >>>> } >>>> >>>> /* Returns true if 'lacp' is configured in active mode, false if >>>>'lacp' is >>>> @@ -366,7 +372,9 @@ slave_may_enable__(struct slave *slave) >>>> { >>>> /* The slave may be enabled if it's attached to an aggregator and >>>>its >>>> * partner is synchronized.*/ >>>> - return slave->attached && (slave->partner.state & >>>>LACP_STATE_SYNC); >>>> + return slave->attached && (slave->partner.state & LACP_STATE_SYNC >>>> + || (slave->lacp && slave->lacp->fallback_slb >>>> + && slave->status == LACP_DEFAULTED)); >>>> } >>>> >>>> /* This function should be called before enabling 'slave_' to send or >>>>receive >>>> @@ -483,6 +491,8 @@ lacp_update_attached(struct lacp *lacp) >>>> } >>>> >>>> if (slave->status == LACP_DEFAULTED) { >>>> + if (lacp->fallback_slb) >>>> + slave->attached = true; >>>> continue; >>>> } >>>> >>>> @@ -499,7 +509,8 @@ lacp_update_attached(struct lacp *lacp) >>>> >>>> if (lead) { >>>> HMAP_FOR_EACH (slave, node, &lacp->slaves) { >>>> - if (lead->partner.key != slave->partner.key >>>> + if ((lacp->fallback_slb && slave->status == >>>>LACP_DEFAULTED) >>>> + || lead->partner.key != slave->partner.key >>>> || !eth_addr_equals(lead->partner.sys_id, >>>> slave->partner.sys_id)) { >>>> slave->attached = false; >>>> diff --git a/lib/lacp.h b/lib/lacp.h >>>> index 399b39e..21f2290 100644 >>>> --- a/lib/lacp.h >>>> +++ b/lib/lacp.h >>>> @@ -35,6 +35,7 @@ struct lacp_settings { >>>> uint16_t priority; /* System priority. */ >>>> bool active; /* Active or passive mode? */ >>>> bool fast; /* Fast or slow probe interval. >>>>*/ >>>> + bool fallback_slb_cfg; /* fallback to BM_SLB on lacp >>>>failure */ >>>> }; >>>> >>>> void lacp_init(void); >>>> diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c >>>> index 4ac2b26..4c9e698 100644 >>>> --- a/vswitchd/bridge.c >>>> +++ b/vswitchd/bridge.c >>>> @@ -234,6 +234,7 @@ static struct lacp_settings >>>>*port_configure_lacp(struct port *, >>>> struct lacp_settings >>>>*); >>>> static void port_configure_bond(struct port *, struct bond_settings >>>>*); >>>> static bool port_is_synthetic(const struct port *); >>>> +static bool set_lacp_fallback_slb_cfg(struct port *); >>>> >>>> static void reconfigure_system_stats(const struct ovsrec_open_vswitch >>>>*); >>>> static void run_system_stats(void); >>>> @@ -3264,6 +3265,17 @@ enable_lacp(struct port *port, bool *activep) >>>> } >>>> } >>>> >>>> +static bool set_lacp_fallback_slb_cfg(struct port *port) >>>> +{ >>>> + const char *lacp_fallback_s; >>>> + >>>> + lacp_fallback_s = smap_get(&port->cfg->other_config, >>>>"lacp-fallback-slb"); >>>> + if (lacp_fallback_s && !strcmp(lacp_fallback_s, "true")) >>>> + return true; >>>> + >>>> + return false; >>>> +} >>>> + >>>> static struct lacp_settings * >>>> port_configure_lacp(struct port *port, struct lacp_settings *s) >>>> { >>>> @@ -3302,6 +3314,9 @@ port_configure_lacp(struct port *port, struct >>>>lacp_settings *s) >>>> >>>> lacp_time = smap_get(&port->cfg->other_config, "lacp-time"); >>>> s->fast = lacp_time && !strcasecmp(lacp_time, "fast"); >>>> + >>>> + s->fallback_slb_cfg = set_lacp_fallback_slb_cfg(port); >>>> + >>>> return s; >>>> } >>>> >>>> @@ -3390,6 +3405,8 @@ port_configure_bond(struct port *port, struct >>>>bond_settings *s) >>>> >>>> s->fake_iface = port->cfg->bond_fake_iface; >>>> >>>> + s->lacp_fallback_slb_cfg = set_lacp_fallback_slb_cfg(port); >>>> + >>>> LIST_FOR_EACH (iface, port_elem, &port->ifaces) { >>>> netdev_set_miimon_interval(iface->netdev, miimon_interval); >>>> } >>>> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml >>>> index 12780d6..2e894c5 100644 >>>> --- a/vswitchd/vswitch.xml >>>> +++ b/vswitchd/vswitch.xml >>>> @@ -925,7 +925,9 @@ >>>> >>>> <p> >>>> The following modes require the upstream switch to support >>>>802.3ad with >>>> - successful LACP negotiation: >>>> + successful LACP negotiation. If LACP negotiation fails and >>>> + other-config:lacp-fallback-slb is true, then >>>><code>balance-slb</code> >>>> + style flow hashing is used: >>>> </p> >>>> >>>> <dl> >>>> @@ -1015,7 +1017,8 @@ >>>> in LACP negotiations initiated by a remote switch, but not >>>>allowed to >>>> initiate such negotiations themselves. If LACP is enabled >>>>on a port >>>> whose partner switch does not support LACP, the bond will >>>>be >>>> - disabled. Defaults to <code>off</code> if unset. >>>> + disabled, unless other-config:lacp-fallback-slb is set to >>>>true. >>>> + Defaults to <code>off</code> if unset. >>>> </column> >>>> >>>> <column name="other_config" key="lacp-system-id"> >>>> @@ -1043,6 +1046,18 @@ >>>> rate of once every 30 seconds. >>>> </p> >>>> </column> >>>> + >>>> + <column name="other_config" key="lacp-fallback-slb" >>>> + type='{"type": "boolean"}'> >>>> + <p> >>>> + Determines the behavior of openvswitch bond in LACP mode. >>>>If >>>> + the partner switch does not support LACP, setting this >>>>option >>>> + to <code>true</code> allows openvswitch to fallback to >>>> + balance-slb. If the option is set to <code>false</code>, >>>>the >>>> + bond will be disabled. In both the cases, once the >>>>partner >>>>switch >>>> + is configured to LACP mode then the bond will use LACP. >>>> + </p> >>>> + </column> >>>> </group> >>>> >>>> <group title="Rebalancing Configuration"> >>>> -- >>>> 1.7.2.5 >>>> >>>> _______________________________________________ >>>> dev mailing list >>>> dev@openvswitch.org >>>> http://openvswitch.org/mailman/listinfo/dev >> _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev