Andi,

On Mon, Sep 04, 2000 at 10:45:06AM +0200, Andi Kleen wrote:
> On Mon, Sep 04, 2000 at 10:22:42AM +0800, Andrey Savochkin wrote:
> > Andi, there may be two reasons for this behavior:
> > 1. the skb that triggered the ARP request had a.b.c.1 as its source, either
> >    because
> >    a) the socket had been bound to that address, or
> >    b) the preferred source in the routing table is wrong;
> > 2. the request source address was selected based on the interface address
> >    list, and that produced a wrong result.
> > I would say that case 1b is the least likely.
> > If the reason for this behavior is 1a or 2, it's a kernel bug in my opinion.
> 
> The preferred source address is usually 0 in the fib.

I can't say for sure; I never add routing table entries without prefsrc :-)

> 
> The problem is likely that ip_route_output_slow() never passes the
> daddr into inet_select_addr(), so it does not even have the necessary
> information.
> 
> I'm not sure if it is worth fixing though.

Well, let's put aside the preferred source.
What if the application explicitly bound the socket to a.b.c.1?
The ARP request to d.e.f.2 will still have a.b.c.1 as the source.
Do you still think that it's ok?
d.e.f.2 may not have a clue that a.b.c.1 is reachable over the same medium.
I think that it should be fixed.
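
To make the failure concrete, here is a minimal user-space sketch of the kind
of check a responder might apply to the sender address of an incoming ARP
request.  This is an assumption about one possible responder policy, not a
description of any particular stack; struct onlink_prefix and
sender_is_onlink() are hypothetical names used only for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical: one prefix the responder believes is directly reachable
 * on this link.  Values are in host byte order. */
struct onlink_prefix {
    uint32_t prefix;
    uint32_t mask;
};

/* Return 1 if sender_ip lies inside any prefix the responder treats as
 * on-link, 0 otherwise.  A responder applying such a check would not
 * answer a request whose sender address fails it. */
static int sender_is_onlink(const struct onlink_prefix *p, size_t n,
                            uint32_t sender_ip)
{
    for (size_t i = 0; i < n; i++) {
        if ((sender_ip & p[i].mask) == p[i].prefix)
            return 1;
    }
    return 0;
}
```

With only the d.e.f.0 prefix configured on d.e.f.2, a request sent from
a.b.c.1 fails this check, while the same request sent from an address in
d.e.f.0 passes.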

From the IP point of view, a configuration where not all nodes are aware of the
topology is not so good, but it is acceptable and working (using asymmetric
paths for the two directions).  However, this ARP issue makes the configuration
completely non-working.  Packets just can't go because link-level address
resolution doesn't work.

So, I think that we have to be sure that we use the "best" address for this
destination.
What about unconditionally using inet_select_addr(), or a fib_select_addr()
that is based on prefsrc and falls back to inet_select_addr()?
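
The second option can be sketched as a small user-space model.  This is only
an illustration under stated assumptions, not the kernel code: struct
model_route, struct model_ifaddr, model_select_addr() and model_arp_source()
are hypothetical stand-ins for the FIB, the interface address list,
inet_select_addr() and fib_select_addr() respectively, and the IP4() macro is
just a helper for writing addresses.

```c
#include <stdint.h>
#include <stddef.h>

/* Helper for writing IPv4 addresses in host byte order. */
#define IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical routing entry (stand-in for a FIB entry). */
struct model_route {
    uint32_t prefix;   /* network prefix */
    uint32_t mask;     /* contiguous netmask */
    int      oif;      /* output interface index */
    uint32_t prefsrc;  /* preferred source, 0 if not set */
};

/* Hypothetical interface address (stand-in for the address list). */
struct model_ifaddr {
    int      ifindex;
    uint32_t addr;
    uint32_t mask;
};

/* Fallback: prefer an address on the interface whose subnet contains dst,
 * otherwise the first address on the interface, otherwise 0. */
static uint32_t model_select_addr(const struct model_ifaddr *ifa, size_t n,
                                  int ifindex, uint32_t dst)
{
    uint32_t first = 0;

    for (size_t i = 0; i < n; i++) {
        if (ifa[i].ifindex != ifindex)
            continue;
        if (((ifa[i].addr ^ dst) & ifa[i].mask) == 0)
            return ifa[i].addr;
        if (first == 0)
            first = ifa[i].addr;
    }
    return first;
}

/* Longest-prefix match restricted to the given interface.  Use the route's
 * prefsrc when it is set; otherwise fall back to the address list. */
static uint32_t model_arp_source(const struct model_route *rt, size_t nr,
                                 const struct model_ifaddr *ifa, size_t ni,
                                 int ifindex, uint32_t dst)
{
    const struct model_route *best = NULL;

    for (size_t i = 0; i < nr; i++) {
        if (rt[i].oif != ifindex)
            continue;
        if ((dst & rt[i].mask) != rt[i].prefix)
            continue;
        /* For contiguous masks, a larger value means a longer prefix. */
        if (best == NULL || rt[i].mask > best->mask)
            best = &rt[i];
    }
    if (best != NULL && best->prefsrc != 0)
        return best->prefsrc;
    return model_select_addr(ifa, ni, ifindex, dst);
}
```

The point of the model is that the source address depends only on the
destination and the interface, never on the source address of the packet that
triggered the ARP request.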

        Andrey
--- linux-lt-2.4.0-test7.prev/include/net/route.h.route Thu Aug  3 16:47:58 2000
+++ linux-lt-2.4.0-test7.prev/include/net/route.h       Fri Aug 25 15:31:09 2000
@@ -106,6 +106,7 @@
 extern void            ip_rt_send_redirect(struct sk_buff *skb);
 
 extern unsigned                inet_addr_type(u32 addr);
+extern u32             fib_select_addr(struct net_device *, u32 dst, int scope);
 extern void            ip_rt_multicast_event(struct in_device *);
 extern int             ip_rt_ioctl(unsigned int cmd, void *arg);
 extern void            ip_rt_get_source(u8 *src, struct rtable *rt);
--- linux-lt-2.4.0-test7.prev/net/ipv4/arp.c.route      Thu Aug 10 11:42:11 2000
+++ linux-lt-2.4.0-test7.prev/net/ipv4/arp.c    Fri Aug 25 15:31:09 2000
@@ -330,10 +330,7 @@
        u32 target = *(u32*)neigh->primary_key;
        int probes = atomic_read(&neigh->probes);
 
-       if (skb && inet_addr_type(skb->nh.iph->saddr) == RTN_LOCAL)
-               saddr = skb->nh.iph->saddr;
-       else
-               saddr = inet_select_addr(dev, target, RT_SCOPE_LINK);
+       saddr = fib_select_addr(dev, target, RT_SCOPE_LINK);
 
        if ((probes -= neigh->parms->ucast_probes) < 0) {
                if (!(neigh->nud_state&NUD_VALID))
--- linux-lt-2.4.0-test7.prev/net/ipv4/fib_frontend.c.route     Thu Dec 23 11:55:38 1999
+++ linux-lt-2.4.0-test7.prev/net/ipv4/fib_frontend.c   Fri Aug 25 15:31:10 2000
@@ -30,6 +30,7 @@
 #include <linux/in.h>
 #include <linux/inet.h>
 #include <linux/netdevice.h>
+#include <linux/inetdevice.h>
 #include <linux/if_arp.h>
 #include <linux/proc_fs.h>
 #include <linux/skbuff.h>
@@ -180,6 +201,26 @@
        return ret;
 }
 
+u32 fib_select_addr(struct net_device *dev, u32 dst, int scope)
+{
+       struct rt_key           key;
+       struct fib_result       res;
+       u32                     ret;
+
+       memset(&key, 0, sizeof(key));
+       key.src = dst;
+       key.dst = dst;
+       key.oif = dev->ifindex;
+       key.scope = scope;
+       
+       if (fib_lookup(&key, &res) == 0) {
+               ret = FIB_RES_PREFSRC(res);
+               fib_res_put(&res);
+       } else
+               ret = inet_select_addr(dev, dst, scope);
+       return ret;
+}
+
 /* Given (packet source, input interface) and optional (dst, oif, tos):
    - (main) check, that source is valid i.e. not broadcast or our local
      address.
