Jay Vosburgh wrote: > Moni Shoua <[EMAIL PROTECTED]> wrote: > >> Delay sending a gratuitous_arp when LINK_STATE_LINKWATCH_PENDING bit >> in dev->state field is on. This improves the chances for the arp packet to >> be transmitted. > > Under what circumstances were you seeing problems that delaying > the gratuitous ARP until linkwatch is done improves things? Is this > really an IB thing, or did you experience problems here over regular > ethernet? >
I tried to figure out what is the difference in the state/flags of the device when grat. ARP send succeeds and when it fails. I found exact correlation with the LINK_STATE_LINKWATCH_PENDING bit on. I don't think that this is an IB issue but I can't be sure. I didn't run tests for Ethernet. >> Signed-off-by: Moni Shoua <[EMAIL PROTECTED]> >> --- >> drivers/net/bonding/bond_main.c | 25 +++++++++++++++++++++---- >> drivers/net/bonding/bonding.h | 1 + >> 2 files changed, 22 insertions(+), 4 deletions(-) >> >> Index: net-2.6/drivers/net/bonding/bond_main.c >> =================================================================== >> --- net-2.6.orig/drivers/net/bonding/bond_main.c 2007-07-25 >> 15:33:25.000000000 +0300 >> +++ net-2.6/drivers/net/bonding/bond_main.c 2007-07-26 18:42:59.296296622 >> +0300 >> @@ -1134,8 +1134,13 @@ void bond_change_active_slave(struct bon >> if (new_active && !bond->do_set_mac_addr) >> memcpy(bond->dev->dev_addr, new_active->dev->dev_addr, >> new_active->dev->addr_len); >> - >> - bond_send_gratuitous_arp(bond); >> + if (bond->curr_active_slave && >> + test_bit(__LINK_STATE_LINKWATCH_PENDING, >> &bond->curr_active_slave->dev->state)){ >> + dprintk("delaying gratuitous arp on >> %s\n",bond->curr_active_slave->dev->name); >> + bond->send_grat_arp=1; >> + }else{ >> + bond_send_gratuitous_arp(bond); >> + } > > Style issues throughout the patch series: many lines are too > long, many things are all smashed together, e.g., "}else{" instead of > "} else {", "send_grat_arp=1" instead of "send_grat_arp = 1", and so on. > OK thanks. I'll fix and repost. >> } >> } >> >> @@ -2120,6 +2125,15 @@ void bond_mii_monitor(struct net_device >> * program could monitor the link itself if needed. >> */ >> >> + if (bond->send_grat_arp) { >> + if (bond->curr_active_slave && >> test_bit(__LINK_STATE_LINKWATCH_PENDING, >> &bond->curr_active_slave->dev->state)) >> + dprintk("Needs to send gratuitous arp but not >> yet\n",__FUNCTION__); >> + else { >> + dprintk("sending delayed gratuitous arp on >> ond->curr_active_slave->dev->name\n"); >> + bond_send_gratuitous_arp(bond); >> + bond->send_grat_arp=0; >> + } >> + } > > >> read_lock(&bond->curr_slave_lock); >> oldcurrent = bond->curr_active_slave; >> read_unlock(&bond->curr_slave_lock); >> @@ -2513,6 +2527,7 @@ static void bond_send_gratuitous_arp(str >> struct slave *slave = bond->curr_active_slave; >> struct vlan_entry *vlan; >> struct net_device *vlan_dev; >> + int i; >> >> dprintk("bond_send_grat_arp: bond %s slave %s\n", bond->dev->name, >> slave ? slave->dev->name : "NULL"); >> @@ -2520,8 +2535,9 @@ static void bond_send_gratuitous_arp(str >> return; >> >> if (bond->master_ip) { >> - bond_arp_send(slave->dev, ARPOP_REPLY, bond->master_ip, >> - bond->master_ip, 0); >> + for (i=0;i<3;i++) >> + bond_arp_send(slave->dev, ARPOP_REPLY, bond->master_ip, >> + bond->master_ip, 0); >> } > > If you delay the grat ARP until linkwatch is done, why is it > also necessary to shotgun several ARPs instead of one? Why are the ARPs > sent for VLANs not also shotgunned in a similar fashion? Besides the linkwatch issue I also noticed that on rare occasions, grat. ARPs that found their way to the slave's xmit function were not xmitted. The 3 times send is just an another attempt to improve chances. I'd like to emphasize here that with IB slaves, grat. ARP is much more crucial to a successful change of slaves and that was my focus. > If shotgunning like this really is useful, would it not make > more sense to space them out a bit, e.g., over successive monitor > passes? > I guess you are right about that. >> list_for_each_entry(vlan, &bond->vlan_list, vlan_list) { >> @@ -4331,6 +4347,7 @@ static int bond_init(struct net_device * >> bond->current_arp_slave = NULL; >> bond->primary_slave = NULL; >> bond->dev = bond_dev; >> + bond->send_grat_arp=0; >> INIT_LIST_HEAD(&bond->vlan_list); >> >> /* Initialize the device entry points */ >> Index: net-2.6/drivers/net/bonding/bonding.h >> =================================================================== >> --- net-2.6.orig/drivers/net/bonding/bonding.h 2007-07-25 >> 15:20:10.000000000 +0300 >> +++ net-2.6/drivers/net/bonding/bonding.h 2007-07-26 18:42:43.652087660 >> +0300 >> @@ -203,6 +203,7 @@ struct bonding { >> struct vlan_group *vlgrp; >> struct packet_type arp_mon_pt; >> s8 do_set_mac_addr; >> + int send_grat_arp; > > This need not be a full int, and (this applies to > do_set_mac_addr, also) could probably be squeezed into gaps already > existing within the struct bonding somewhere. Thanks. Will be fixed. > > -J > > --- > -Jay Vosburgh, IBM Linux Technology Center, [EMAIL PROTECTED] > _______________________________________________ > general mailing list > [EMAIL PROTECTED] > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general > - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html