Jay Vosburgh wrote:
> Moni Shoua <[EMAIL PROTECTED]> wrote:
> 
>> Delay sending a gratuitous_arp when LINK_STATE_LINKWATCH_PENDING bit
>> in dev->state field is on. This improves the chances for the arp packet to
>> be transmitted.
> 
>       Under what circumstances were you seeing problems that delaying
> the gratuitous ARP until linkwatch is done improves things?  Is this
> really an IB thing, or did you experience problems here over regular
> ethernet?
> 

I tried to figure out what is the difference in the state/flags of the device 
when 
grat. ARP send succeeds and when it fails. I found exact correlation with the 
LINK_STATE_LINKWATCH_PENDING bit on.
I don't think that this is an IB issue but I can't be sure. I didn't run tests
for Ethernet.

>> Signed-off-by: Moni Shoua <[EMAIL PROTECTED]>
>> ---
>> drivers/net/bonding/bond_main.c |   25 +++++++++++++++++++++----
>> drivers/net/bonding/bonding.h   |    1 +
>> 2 files changed, 22 insertions(+), 4 deletions(-)
>>
>> Index: net-2.6/drivers/net/bonding/bond_main.c
>> ===================================================================
>> --- net-2.6.orig/drivers/net/bonding/bond_main.c     2007-07-25 
>> 15:33:25.000000000 +0300
>> +++ net-2.6/drivers/net/bonding/bond_main.c  2007-07-26 18:42:59.296296622 
>> +0300
>> @@ -1134,8 +1134,13 @@ void bond_change_active_slave(struct bon
>>              if (new_active && !bond->do_set_mac_addr)
>>                      memcpy(bond->dev->dev_addr,  new_active->dev->dev_addr,
>>                              new_active->dev->addr_len);
>> -
>> -            bond_send_gratuitous_arp(bond);
>> +            if (bond->curr_active_slave &&
>> +                    test_bit(__LINK_STATE_LINKWATCH_PENDING, 
>> &bond->curr_active_slave->dev->state)){
>> +                    dprintk("delaying gratuitous arp on 
>> %s\n",bond->curr_active_slave->dev->name);
>> +                    bond->send_grat_arp=1;
>> +            }else{
>> +                    bond_send_gratuitous_arp(bond);
>> +            }
> 
>       Style issues throughout the patch series: many lines are too
> long, many things are all smashed together, e.g., "}else{" instead of
> "} else {", "send_grat_arp=1" instead of "send_grat_arp = 1", and so on.
> 
OK thanks. I'll fix and repost.
>>      }
>> }
>>
>> @@ -2120,6 +2125,15 @@ void bond_mii_monitor(struct net_device 
>>       * program could monitor the link itself if needed.
>>       */
>>
>> +    if (bond->send_grat_arp) {
>> +            if (bond->curr_active_slave && 
>> test_bit(__LINK_STATE_LINKWATCH_PENDING, 
>> &bond->curr_active_slave->dev->state))
>> +                    dprintk("Needs to send gratuitous arp but not 
>> yet\n",__FUNCTION__);
>> +            else {
>> +                    dprintk("sending delayed gratuitous arp on 
>> ond->curr_active_slave->dev->name\n");
>> +                    bond_send_gratuitous_arp(bond);
>> +                    bond->send_grat_arp=0;
>> +            }
>> +    }
> 
> 
>>      read_lock(&bond->curr_slave_lock);
>>      oldcurrent = bond->curr_active_slave;
>>      read_unlock(&bond->curr_slave_lock);
>> @@ -2513,6 +2527,7 @@ static void bond_send_gratuitous_arp(str
>>      struct slave *slave = bond->curr_active_slave;
>>      struct vlan_entry *vlan;
>>      struct net_device *vlan_dev;
>> +    int i;
>>
>>      dprintk("bond_send_grat_arp: bond %s slave %s\n", bond->dev->name,
>>                              slave ? slave->dev->name : "NULL");
>> @@ -2520,8 +2535,9 @@ static void bond_send_gratuitous_arp(str
>>              return;
>>
>>      if (bond->master_ip) {
>> -            bond_arp_send(slave->dev, ARPOP_REPLY, bond->master_ip,
>> -                              bond->master_ip, 0);
>> +            for (i=0;i<3;i++)
>> +                    bond_arp_send(slave->dev, ARPOP_REPLY, bond->master_ip,
>> +                                      bond->master_ip, 0);
>>      }
> 
>       If you delay the grat ARP until linkwatch is done, why is it
> also necessary to shotgun several ARPs instead of one?  Why are the ARPs
> sent for VLANs not also shotgunned in a similar fashion?
Besides the linkwatch issue I also noticed that on rare occasions, grat. ARPs
that found their way to the slave's xmit function were not xmitted.
The 3 times send is just an another attempt to improve chances.

I'd like to emphasize here that with IB slaves, grat. ARP is much more crucial 
to 
a successful change of slaves and that was my focus.

>       If shotgunning like this really is useful, would it not make
> more sense to space them out a bit, e.g., over successive monitor
> passes?
> 
I guess you are right about that.
>>      list_for_each_entry(vlan, &bond->vlan_list, vlan_list) {
>> @@ -4331,6 +4347,7 @@ static int bond_init(struct net_device *
>>      bond->current_arp_slave = NULL;
>>      bond->primary_slave = NULL;
>>      bond->dev = bond_dev;
>> +    bond->send_grat_arp=0;
>>      INIT_LIST_HEAD(&bond->vlan_list);
>>
>>      /* Initialize the device entry points */
>> Index: net-2.6/drivers/net/bonding/bonding.h
>> ===================================================================
>> --- net-2.6.orig/drivers/net/bonding/bonding.h       2007-07-25 
>> 15:20:10.000000000 +0300
>> +++ net-2.6/drivers/net/bonding/bonding.h    2007-07-26 18:42:43.652087660 
>> +0300
>> @@ -203,6 +203,7 @@ struct bonding {
>>      struct   vlan_group *vlgrp;
>>      struct   packet_type arp_mon_pt;
>>      s8       do_set_mac_addr;
>> +    int      send_grat_arp;
> 
>       This need not be a full int, and (this applies to
> do_set_mac_addr, also) could probably be squeezed into gaps already
> existing within the struct bonding somewhere.
Thanks. Will be fixed.
> 
>       -J
> 
> ---
>       -Jay Vosburgh, IBM Linux Technology Center, [EMAIL PROTECTED]
> _______________________________________________
> general mailing list
> [EMAIL PROTECTED]
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> 
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
> 


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to