On Fri, Jul 29, 2016 at 7:12 AM, David Ahern <d...@cumulusnetworks.com> wrote:
> On 7/28/16 10:20 PM, David Miller wrote:
>>
>> From: John Stultz <john.stu...@linaro.org>
>> Date: Thu, 28 Jul 2016 21:18:16 -0700
>>
>>> After moving my HiKey tree to pre-v4.8-rc, I noticed when using
>>> Android that I was getting routing errors after toggling networking on
>>> and off (or entering suspend). Wifi associated, but I got some
>>> routing errors in the logcat and the connection manager wouldn't
>>> detect a valid network.
>>>
>>> Not being able to figure out exactly what was going wrong from the
>>> userspace side, I bisected (manually rebasing a 70 patch stack each
>>> step :P) it down and it seems that the commit: 153380ec4b9b
>>> ("fib_rules: Added NLM_F_EXCL support to fib_nl_newrule") is causing
>>> the problem.
>>>
>>> Reverting that patch seems to make things work again.
>>>
>>> I'm no networking guru, but I'm happy to help debug this further if
>>> folks can walk me through it a bit.
>>
>>
>> It simply sounds like Android's userspace is specifying NLM_F_EXCL when
>> it shouldn't be during FIB rule netlink operations.
>>
>
> I take Android userspace inserts the same rule multiple times? (ip rule ls)

With the patch reverted, and the system working, I see:

# ip rule ls
0:      from all lookup local
10000:  from all fwmark 0xc0000/0xd0000 lookup legacy_system
13000:  from all fwmark 0x10063/0x1ffff lookup local_network
13000:  from all fwmark 0x10065/0x1ffff lookup wlan0
14000:  from all oif wlan0 lookup wlan0
14000:  from all oif wlan0 lookup wlan0
15000:  from all fwmark 0x0/0x10000 lookup legacy_system
16000:  from all fwmark 0x0/0x10000 lookup legacy_network
17000:  from all fwmark 0x0/0x10000 lookup local_network
19000:  from all fwmark 0x64/0x1ffff lookup wlan0
19000:  from all fwmark 0x65/0x1ffff lookup wlan0
22000:  from all fwmark 0x0/0xffff lookup wlan0
32000:  from all unreachable

With the patch applied, and after toggling wifi, when I see the problem:

# ip rule ls
0:      from all lookup local
10000:  from all fwmark 0xc0000/0xd0000 lookup legacy_system
13000:  from all fwmark 0x10063/0x1ffff lookup local_network
13000:  from all fwmark 0x10065/0x1ffff lookup wlan0
14000:  from all oif wlan0 lookup wlan0
15000:  from all fwmark 0x0/0x10000 lookup legacy_system
16000:  from all fwmark 0x0/0x10000 lookup legacy_network
17000:  from all fwmark 0x0/0x10000 lookup local_network
19000:  from all fwmark 0x64/0x1ffff lookup wlan0
32000:  from all unreachable

> If so and multiple components expect to manage their own 'copy' of the rule
> they will need to remove the NLM_F_EXCL flag.

Adding more networky Android folks to the CC.

thanks
-john
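
In case it helps anyone comparing the two `ip rule ls` listings by eye, here is a rough Python sketch that diffs them as multisets, so that a duplicated rule counts twice. It is only an illustration: the helper names (parse_rules, missing_rules) are made up for this mail, and the rule text is copied from the two listings in this message.

```python
from collections import Counter

# Rules seen with the patch reverted (working system).
WORKING = """\
0: from all lookup local
10000: from all fwmark 0xc0000/0xd0000 lookup legacy_system
13000: from all fwmark 0x10063/0x1ffff lookup local_network
13000: from all fwmark 0x10065/0x1ffff lookup wlan0
14000: from all oif wlan0 lookup wlan0
14000: from all oif wlan0 lookup wlan0
15000: from all fwmark 0x0/0x10000 lookup legacy_system
16000: from all fwmark 0x0/0x10000 lookup legacy_network
17000: from all fwmark 0x0/0x10000 lookup local_network
19000: from all fwmark 0x64/0x1ffff lookup wlan0
19000: from all fwmark 0x65/0x1ffff lookup wlan0
22000: from all fwmark 0x0/0xffff lookup wlan0
32000: from all unreachable
"""

# Rules seen with the patch applied, after toggling wifi.
BROKEN = """\
0: from all lookup local
10000: from all fwmark 0xc0000/0xd0000 lookup legacy_system
13000: from all fwmark 0x10063/0x1ffff lookup local_network
13000: from all fwmark 0x10065/0x1ffff lookup wlan0
14000: from all oif wlan0 lookup wlan0
15000: from all fwmark 0x0/0x10000 lookup legacy_system
16000: from all fwmark 0x0/0x10000 lookup legacy_network
17000: from all fwmark 0x0/0x10000 lookup local_network
19000: from all fwmark 0x64/0x1ffff lookup wlan0
32000: from all unreachable
"""


def parse_rules(listing):
    """Count each rule line, normalizing whitespace so duplicates match."""
    return Counter(
        " ".join(line.split()) for line in listing.splitlines() if line.strip()
    )


def missing_rules(before, after):
    """Return rules present in `before` but absent (or fewer) in `after`."""
    diff = parse_rules(before) - parse_rules(after)
    # Sort by the numeric priority in front of the first colon.
    return sorted(diff.elements(), key=lambda rule: int(rule.split(":")[0]))


if __name__ == "__main__":
    for rule in missing_rules(WORKING, BROKEN):
        print(rule)
```

Run against the listings above, this reports the second copy of the 14000 "oif wlan0" rule, the 19000 rule for fwmark 0x65, and the 22000 rule as missing in the failing case. It says nothing about why they are missing; it only makes the difference easier to see.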