On Monday 07 November 2005 18:45, Thomas Graf wrote:
> I'm sorry, I don't get the point. I assume the above is not a carrier
> failure but something else to be detected by a keepalive protocol.
No, it's a carrier failure. Try this; eth1 is an Ethernet interface
without the cable connected:

dose:~ # ip addr add 192.168.200.1/24 dev eth1
dose:~ # ip link set eth1 up
dose:~ # ifconfig eth1 | fgrep UP
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
dose:~ # ip route show 192.168.200.0/24
192.168.200.0/24 dev eth1  proto kernel  scope link  src 192.168.200.1
dose:~ # ping 192.168.200.10
PING 192.168.200.10 (192.168.200.10) 56(84) bytes of data.
From 192.168.200.1: icmp_seq=2 Destination Host Unreachable
dose:~ # arp
192.168.200.10 (incomplete) eth1

Even though the interface is not IFF_RUNNING and queueing is therefore
disabled, the kernel created a route pointing to it and uses it.

> Isn't it the routing daemon's fault when preferring a route
> which has the IFF_RUNNING flag cleared? I'm sorry for not getting it. ;-)

As the kernel maintains this connected route, userspace is IMHO not
responsible. Quagga could add its routes with a higher metric
(administrative distance in Cisco terms), and the kernel should use them
as soon as the connected route becomes unavailable due to carrier failure
(or dormant for wireless interfaces). And now, if we support this, we'd
need one dormant state with the route disabled and one with it enabled.

Btw, adding a userspace workaround would be dangerous. If the routing
daemon crashes while a link is down, the connected route that userspace
removed would not come back, leaving the router unreachable on this
interface. *Very* bad.

> > OTOH, we don't need to be completely atomic as the
> > netif_carrier_*-functions already require driver controlled
> > synchronisation. We just need to make sure that the caches are coherent
> > before linkwatch kernel thread runs.
>
> The reason we now have netif_carrier_* instead of |= IFF_RUNNING as it
> used to be is exactly atomicity. Since you write to oper_state from
> various locations the operation must be atomic to avoid corruption.
No, it's because the IFF_* flags are a bit field that can only be changed
from process context, protected by rtnl/dev_base_lock, unlike
__LINK_STATE_NO_CARRIER (or an operational state field), which must be
accessible from interrupts.

> > No, !netif_running() is ADMIN down, mostly representing the IFF_UP flag.
>
> Right but the RFC specifies that admin. down implies operational down:

Indeed, but we should not set OPER_DOWN if and only if admin is down ;-)

> This also describes why your scheme cannot work, we have to memorize
> the status of for example carrier state.

I can't see that requirement in the RFC, and devices are normally
initialized in their open() "methods".

> We're not far apart, the difference is for the additional L3 disabled
> dormant state which I don't understand yet and secondly that I continue
> to keep the current link states with the addition of a new dormant state
> and then translate it into the RFC2863 operational status.

OK, so I hope I finally managed to make my point clear about the
different dormant states ;-)

Stefan
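
For illustration only, the translation from link state to an RFC 2863
operational status discussed in this thread could be sketched in C roughly
as follows. This is not kernel code: struct link_inputs,
derive_oper_status() and connected_route_usable() are hypothetical names,
and treating every dormant link as "route disabled" is an assumption that
covers only one of the two dormant variants argued for above.

```c
#include <stdbool.h>

/* ifOperStatus values, numbered as in RFC 2863 (subset only). */
enum oper_status {
    OPER_UP = 1,
    OPER_DOWN = 2,
    OPER_DORMANT = 5
};

/* Hypothetical inputs for one device; not the kernel's actual fields. */
struct link_inputs {
    bool admin_up; /* administratively up, roughly the IFF_UP flag */
    bool carrier;  /* driver currently reports a carrier */
    bool dormant;  /* link waits for an external event, e.g. wireless */
};

/* Derive the operational status from the inputs above. */
enum oper_status derive_oper_status(const struct link_inputs *in)
{
    if (!in->admin_up)
        return OPER_DOWN;    /* admin down implies operational down */
    if (!in->carrier)
        return OPER_DOWN;    /* carrier failure, e.g. cable unplugged */
    if (in->dormant)
        return OPER_DORMANT;
    return OPER_UP;
}

/* Assumption: the connected route is only used while status is up. */
bool connected_route_usable(const struct link_inputs *in)
{
    return derive_oper_status(in) == OPER_UP;
}
```

With these assumptions, the eth1 case from the transcript (admin up, no
carrier) yields OPER_DOWN, so the connected route would be skipped and a
route with a higher metric could take over.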