On 16/12/17 12:40, Nikolay Aleksandrov wrote:
> On 16/12/17 11:29, Nikolay Aleksandrov wrote:
>> On 16/12/17 11:17, Nikolay Aleksandrov wrote:
>>> On 16/12/17 02:37, Andrei Vagin wrote:
>>>> Hi,
>>>>
>>>> We run criu tests for linux-next and today we get this bug:
>>>>
>>>> The kernel version is 4.15.0-rc3-next-20171215
>>>>
>>>> [ 235.397328] BUG: unable to handle kernel NULL pointer dereference
>>>> at 000000000000000c
>>>> [ 235.398624] IP: fdb_find_rcu+0x3c/0x130
>>> [snip]
>>>
>>> Hi,
>>> Thanks for the report, I've missed the changelink before dev creation case
>>> when I did
>>
>> err, s/changelink/br_stp_change_bridge_id/
>> the other options are set after register_netdevice, this is the only one
>> changed before
>>
>>> the rhashtable conversion, some of the options do fdb lookups as part of
>>> their routine
>>> but we don't have the table initialized yet at that point.
>>> I'll send a fix after some testing.
>>>
>>> Thanks,
>>> Nik
>>>
>>>
>>
>
> We need to fix this in -net, it has a memory leak that has existed since the
> introduction of br_stp_change_bridge_id() before register_netdevice because
> it adds an fdb entry which never gets deleted if an error happens, also the
> notifications for that fdb entry come with ifindex = 0 because the bridge
> netdev doesn't exist yet. All of that looks wrong, I'll send a fix for -net
> to move the bridge id change after the netdev register and cleanup any
> bridge fdbs on error.
>
> The commit with that change is:
> 30313a3d5794 ("bridge: Handle IFLA_ADDRESS correctly when creating bridge
> device")
> Before the changelink in while doing newlink in bridge was possible, this
> would happen only on netdev register fail, but now it is much easier to
> trigger (as below) since changelink can fail if called with wrong arguments.
>

One more thing before sending the patch, the actual commit that introduced the
fdb insert in br_stp_change_bridge_id() is:

commit a4b816d8ba1c
Author: Toshiaki Makita <makita.toshi...@lab.ntt.co.jp>
Date:   Fri Feb 7 16:48:21 2014 +0900

    bridge: Change local fdb entries whenever mac address of bridge device changes

So that is what we need to fix.
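
To make the ordering problem concrete, here is a rough user-space sketch in C. It is not kernel code and not the actual patch: every toy_* name and the struct are hypothetical stand-ins, and the model only tracks how many fdb entries are left behind. It assumes, as described above, that the bridge id change inserts a local fdb entry and that nothing removes that entry when a later step fails.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a bridge device; NOT the kernel's struct net_bridge. */
struct toy_bridge {
    bool registered;   /* models a successful register_netdevice() */
    int  fdb_entries;  /* local fdb entries currently held */
};

/* Models register_netdevice(); 'fail' forces the error path. */
static int toy_register(struct toy_bridge *br, bool fail)
{
    if (fail)
        return -1;
    br->registered = true;
    return 0;
}

/* Models br_stp_change_bridge_id(): it inserts a local fdb entry. */
static void toy_change_bridge_id(struct toy_bridge *br)
{
    br->fdb_entries++;
}

/* Models changelink, which can fail when called with wrong arguments. */
static int toy_changelink(struct toy_bridge *br, bool bad_args)
{
    (void)br;
    return bad_args ? -2 : 0;
}

/* Models the cleanup of bridge fdbs on error. */
static void toy_fdb_flush(struct toy_bridge *br)
{
    br->fdb_entries = 0;
}

/* Problematic ordering: bridge id change before the device is registered. */
static int toy_newlink_early_id(struct toy_bridge *br, bool reg_fail,
                                bool bad_args)
{
    int err;

    toy_change_bridge_id(br);        /* fdb entry added, device not there yet */
    err = toy_register(br, reg_fail);
    if (err)
        return err;                  /* entry is never deleted */
    err = toy_changelink(br, bad_args);
    if (err)
        br->registered = false;      /* device goes away, entry stays */
    return err;
}

/* Proposed ordering: register first, then change the id, flush on error. */
static int toy_newlink_late_id(struct toy_bridge *br, bool reg_fail,
                               bool bad_args)
{
    int err;

    err = toy_register(br, reg_fail);
    if (err)
        return err;                  /* nothing inserted yet, nothing to undo */
    toy_change_bridge_id(br);
    err = toy_changelink(br, bad_args);
    if (err) {
        toy_fdb_flush(br);           /* clean up bridge fdbs on error */
        br->registered = false;
    }
    return err;
}
```

Calling toy_newlink_early_id() with bad_args set leaves fdb_entries at 1 after the failure, while toy_newlink_late_id() with the same arguments leaves it at 0. The sketch only covers the leak; the NULL pointer dereference in linux-next comes from the fdb table not being initialized at that point, which this model does not represent.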