I suspect that the netifd changes are related, since that looks like the 
only relevant area of major activity in the past month when this began 
happening. Then again, the timing-sensitive nature means that the 
underlying problem may have been present for a while, and only exposed 
with the recent netifd changes.

I've noticed that it's possible for the MAC address to change during a 
DHCP client transaction when OpenWrt is configured to obtain a DHCP lease 
on a bridged interface. In my example, I have a WNDR3700 configured as 
follows in /etc/config/network:

config interface lan
        option ifname   eth0.1
        option type     bridge
        option proto    dhcp

wlan0 and wlan1 are also configured to join this bridged interface, both 
configured in /etc/config/wireless with:

config wifi-iface
        option network  lan

eth0.1's MAC address is c6:xx:xx:xx:xx:01 (with the locally-administered 
[LA] bit set). wlan0's is c4:xx:xx:xx:xx:01, and wlan1's is 
c4:xx:xx:xx:xx:03. On the DHCP server (also OpenWrt running dnsmasq), I 
observe the DHCP transaction when the WNDR3700 boots:

Jun 26 12:21:57 gw1 daemon.info dnsmasq-dhcp[2285]: DHCPDISCOVER(br-lan) c6:xx:xx:xx:xx:01
Jun 26 12:21:57 gw1 daemon.info dnsmasq-dhcp[2285]: DHCPOFFER(br-lan) 192.168.1.211 c6:xx:xx:xx:xx:01
Jun 26 12:21:57 gw1 daemon.info dnsmasq-dhcp[2285]: DHCPREQUEST(br-lan) 192.168.1.211 c4:xx:xx:xx:xx:01
Jun 26 12:21:57 gw1 daemon.info dnsmasq-dhcp[2285]: DHCPACK(br-lan) 192.168.1.211 c4:xx:xx:xx:xx:01

Note that the MAC address changes from having the LA bit set 
(c6:xx:xx:xx:xx:01) when the client sends its DHCPDISCOVER to not having 
the LA bit set (c4:xx:xx:xx:xx:01) when it sends its DHCPREQUEST. This is 
because the br-lan interface begins life taking the MAC address of its 
only initial underlying interface, eth0.1, which has the LA bit set. When 
wlan0 is added to this bridge, the bridge winds up taking that interface's 
MAC address (c4:xx:xx:xx:xx:01) instead. I have observed that this 
reproducibly happens in the middle of the DHCP transaction intended to 
assign the interface's address. The logs from the client side confirm 
this:

Dec 31 19:00:12 ap2 kern.debug kernel: [   12.890000] ar71xx: pll_reg 0xb8050010: 0x11110000
Dec 31 19:00:12 ap2 kern.info kernel: [   12.890000] eth0: link up (1000Mbps/Full duplex)
Dec 31 19:00:12 ap2 kern.info kernel: [   12.890000] device eth0.1 entered promiscuous mode
Dec 31 19:00:12 ap2 kern.info kernel: [   12.900000] device eth0 entered promiscuous mode
Dec 31 19:00:12 ap2 kern.info kernel: [   12.920000] br-lan: port 1(eth0.1) entered forwarding state
Dec 31 19:00:12 ap2 kern.info kernel: [   12.920000] br-lan: port 1(eth0.1) entered forwarding state
Dec 31 19:00:13 ap2 daemon.notice netifd: lan (628): udhcpc (v1.19.4) started
Dec 31 19:00:13 ap2 daemon.notice netifd: lan (628): Sending discover...
Dec 31 19:00:14 ap2 kern.info kernel: [   14.700000] device wlan0 entered promiscuous mode
Dec 31 19:00:14 ap2 kern.info kernel: [   14.920000] br-lan: port 1(eth0.1) entered forwarding state
Dec 31 19:00:14 ap2 kern.info kernel: [   14.960000] br-lan: port 2(wlan0) entered forwarding state
Dec 31 19:00:14 ap2 kern.info kernel: [   14.960000] br-lan: port 2(wlan0) entered forwarding state
Dec 31 19:00:15 ap2 daemon.notice netifd: lan (628): Sending select for 192.168.1.211...
Dec 31 19:00:15 ap2 daemon.notice netifd: lan (628): Lease of 192.168.1.211 obtained, lease time 43200
Dec 31 19:00:15 ap2 daemon.notice netifd: Interface 'lan' is now up
Jun 26 12:21:58 ap2 kern.info kernel: [   16.960000] br-lan: port 2(wlan0) entered forwarding state
Jun 26 12:21:58 ap2 kern.info kernel: [   17.480000] device wlan1 entered promiscuous mode
Jun 26 12:22:01 ap2 kern.info kernel: [   20.250000] br-lan: port 3(wlan1) entered forwarding state
Jun 26 12:22:01 ap2 kern.info kernel: [   20.260000] br-lan: port 3(wlan1) entered forwarding state
Jun 26 12:22:03 ap2 kern.info kernel: [   22.260000] br-lan: port 3(wlan1) entered forwarding state
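
The address flip described above can be illustrated with a minimal Python 
sketch. It rests on an assumption that is not stated in this message: that 
the Linux bridge, when no address has been set explicitly, uses the 
numerically lowest MAC address among its ports. The masked "xx" octets are 
replaced with 00 purely as placeholders, and the function names are made 
up for this example.

```python
def parse_mac(text):
    """Convert 'c6:00:00:00:00:01' into a tuple of six integers."""
    octets = tuple(int(part, 16) for part in text.split(":"))
    if len(octets) != 6:
        raise ValueError("expected six octets: %r" % text)
    return octets


def is_locally_administered(mac):
    """The LA bit is bit 0x02 of the first octet."""
    return bool(parse_mac(mac)[0] & 0x02)


def bridge_mac(port_macs):
    """Assumed model: the bridge uses the lowest MAC among its ports."""
    return min(port_macs, key=parse_mac)


# Placeholder addresses: the real middle octets are masked in the logs.
ETH0_1 = "c6:00:00:00:00:01"
WLAN0 = "c4:00:00:00:00:01"
WLAN1 = "c4:00:00:00:00:03"

print(is_locally_administered(ETH0_1))     # LA bit set on eth0.1
print(is_locally_administered(WLAN0))      # LA bit clear on wlan0
print(bridge_mac([ETH0_1]))                # bridge with eth0.1 only
print(bridge_mac([ETH0_1, WLAN0]))         # after wlan0 joins
print(bridge_mac([ETH0_1, WLAN0, WLAN1]))  # after wlan1 joins
```

Under that assumption, adding wlan1 later would not change the address 
again, since wlan0's address is still the lowest of the three.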

Using a variable hardware address for a DHCP transaction is no good. 
Having the hardware address be unpredictable is also a problem. I 
discovered this problem when debugging why a static DHCP address 
assignment ("config host" in /etc/config/dhcp on the server) was not 
effective. If I use c6:xx:xx:xx:xx:01 on the server, then the client won't 
be able to DHCPREQUEST the desired address once it begins using MAC address 
c4:xx:xx:xx:xx:01. If I use c4:xx:xx:xx:xx:01 on the server, then the 
server won't send a DHCPOFFER for the desired address because the 
DHCPDISCOVER will have a different MAC address.
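
The dilemma can be sketched with a deliberately simplified toy model in 
Python. This is not how dnsmasq is implemented; it only assumes that a 
server maps a reserved MAC address to a fixed address and gives everyone 
else an address from the pool. DESIRED is a hypothetical reserved address, 
POOL is the dynamic address seen in the log above, and the masked "xx" 
octets are replaced with 00 as placeholders.

```python
DESIRED = "192.168.1.10"   # hypothetical static assignment
POOL = "192.168.1.211"     # dynamic address from the server log

C6 = "c6:00:00:00:00:01"   # bridge MAC at DHCPDISCOVER time (placeholder)
C4 = "c4:00:00:00:00:01"   # bridge MAC at DHCPREQUEST time (placeholder)


def address_for(mac, static_hosts):
    """Toy lookup: reserved MACs get their fixed address, others the pool."""
    return static_hosts.get(mac, POOL)


def gets_desired(static_hosts, discover_mac, request_mac):
    """True only if both phases of the exchange map to DESIRED."""
    offered = address_for(discover_mac, static_hosts)
    entitled = address_for(request_mac, static_hosts)
    return offered == DESIRED and entitled == DESIRED


# Reservation keyed on either MAC fails when the MAC changes mid-exchange.
print(gets_desired({C6: DESIRED}, discover_mac=C6, request_mac=C4))
print(gets_desired({C4: DESIRED}, discover_mac=C6, request_mac=C4))

# With a stable MAC the same reservation works.
print(gets_desired({C4: DESIRED}, discover_mac=C4, request_mac=C4))
```

In this model neither reservation yields the desired address as long as 
the MAC address differs between the two phases.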

Ultimately, it may just be a bad idea to have a bridge's MAC address 
change once established, at least as long as the bridge still contains an 
underlying interface that "owns" the MAC address it's using.
_______________________________________________
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel
