Hi Felix,

I applied the new ubus patch and it solves the netifd crash. As the traces below show, the ubus requests are now deferred: netifd_reload is only called after config_init_all has finished.
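To illustrate the deferral behaviour described above, here is a minimal, self-contained C sketch. It is not the ubus patch and not netifd code: every name in it (parse_config, request_reload, run_pending, demo_startup and the flags) is hypothetical, and the "incoming reload" is simulated by a direct function call rather than a real ubus invoke. It only shows the general idea of remembering a reload request that arrives while the config is being parsed and running it once parsing is done.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* All names are hypothetical; this is not netifd or libubus code. */

bool config_busy;     /* true while the config is being parsed */
bool reload_pending;  /* a reload request arrived while busy */
int  parse_count;     /* number of completed config parses */
int  nested_parses;   /* parses started while another one was running */

void request_reload(void);

/* Stand-in for config_init_all(). If reload_arrives is true, a reload
 * request is delivered in the middle of parsing, as a ubus invoke could. */
void parse_config(bool reload_arrives)
{
    if (config_busy)
        nested_parses++;          /* this would be the crash scenario */

    config_busy = true;
    printf("parse_config : Enter\n");

    if (reload_arrives)
        request_reload();         /* simulated "ubus call network reload" */

    printf("parse_config : Exit\n");
    config_busy = false;
    parse_count++;
}

/* Stand-in for netifd_reload(): defer instead of re-entering the parser. */
void request_reload(void)
{
    if (config_busy) {
        reload_pending = true;    /* remember the request, handle it later */
        return;
    }
    printf("reload : Enter\n");
    parse_config(false);
    printf("reload : Exit\n");
}

/* Run any reload that was deferred; call this once parsing has finished. */
void run_pending(void)
{
    while (reload_pending) {
        reload_pending = false;
        request_reload();
    }
}

/* Startup sequence: parse once, then handle whatever was deferred. */
int demo_startup(void)
{
    parse_config(true);
    run_pending();
    return parse_count;
}
```

Calling demo_startup() from a main() prints the Enter/Exit pairs strictly one after the other, which is the same ordering the traces below show for config_init_all and netifd_reload.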
Thanks for the patch,
Hans

Jun 25 09:53:53 OpenWrt daemon.notice netifd: config_init_all : Enter
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Create interface 'loopback'
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Create simple device 'lo'
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Initialize device 'lo'
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Network device 'lo' is now present
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Add user for device 'lo', refcount=1
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Interface 'loopback', available=1
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Add user for device 'lo', refcount=2
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Create new device 'br-lan' (Bridge)
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Initialize device 'br-lan'
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Create interface 'lan'
...
Jun 25 09:53:54 OpenWrt daemon.notice netifd: config_init_all : Exit
Jun 25 09:53:54 OpenWrt daemon.notice netifd: netifd_reload : Enter
Jun 25 09:53:54 OpenWrt daemon.notice netifd: config_init_all : Enter
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Device 'br-lan': config applied
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Update interface 'lan'
....
Jun 25 09:53:54 OpenWrt daemon.notice netifd: netifd_reload : Exit
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Network device 'eth1' link is up
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Bridge 'br-lan' link is up
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Interface 'lan' has link connectivity
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Interface 'lan' is setting up now
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Queue hotplug handler for interface 'lan', event 'ifup'
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Call hotplug handler for interface 'lan', event 'ifup' (br-lan)
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Interface 'lan' is now up

On Tue, Jun 24, 2014 at 5:36 PM, Felix Fietkau <n...@openwrt.org> wrote:
> Hi Hans,
>
> thanks for testing. I uploaded a new patch (same URL), which uses a
> uloop timer to defer processing of incoming invoke msgs.
> Note that this changes the ubus context data structure and thus affects
> everything that depends on ubus, so it's better to reflash after rebuilding.
>
> - Felix
>
> On 2014-06-24 16:11, Hans Dedecker wrote:
>> Hi,
>>
>> Applied the ubus patch but netifd_reload is still called while
>> config_init_all is processing the config and thus leading to a crash
>> when netifd_reload is done
>>
>> Added extra traces in netifd which confirms this :
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: config_init_all : Enter
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Create interface 'loopback'
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Create simple device 'lo'
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Initialize device 'lo'
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Remove a route from device lo
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Remove a route from device lo
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Remove a route from device lo
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Network device 'lo' is now present
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Add user for device 'lo', refcount=1
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Interface 'loopback', available=1
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Add user for device 'lo', refcount=2
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: netifd_reload : Enter
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: config_init_all : Enter
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Update interface 'loopback'
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Create new device 'br-lan' (Bridge)
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Initialize device 'br-lan'
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Create interface 'lan'
>> .....
>> Jun 24 16:00:45 OpenWrt daemon.notice netifd: config_init_all : Exit
>> Jun 24 16:00:45 OpenWrt daemon.notice netifd: netifd_reload : Exit
>>
>>
>> Hans
>> On Tue, Jun 24, 2014 at 2:05 PM, Felix Fietkau <n...@openwrt.org> wrote:
>>> On 2014-06-24 12:46, Hans Dedecker wrote:
>>>> Netifd is crashing when a network reload (ubus call network reload)
>>>> is handled during the parsing of the network config in the function
>>>> config_init_all (called from main) at startup.
>>>> As a ubus_invoke function call is issued when the interfaces are created,
>>>> ubus will also process the pending ubus calls, in this case the network
>>>> reload, during the invoke.
>>>> As netifd_reload calls config_init_all again, the network config will be
>>>> parsed again; on return from netifd_reload the original config_init_all
>>>> function call will continue but will crash, as the references held to
>>>> interface/device/etc ... lists are not correct anymore.
>>>> This potential problem has always been present, but due to the
>>>> netifd_reload timing behavior change in netifd commit
>>>> 5db02763d61785529bef538f196c180e968b7c26 this problem can easily be
>>>> triggered.
>>>> To solve the issue I was thinking about deferring the network reload while
>>>> the function config_init_all is parsing the config.
>>>> Any opinion on whether this is the correct way to go, or any other
>>>> alternatives?
>>> Please try applying this patch to ubus:
>>> http://nbd.name/libubus-req-defer.patch
>>>
>>> It should ensure that no invoke will be processed while netifd is busy
>>> with registering/unregistering objects or sending notify calls.
>>>
>>> - Felix
>> _______________________________________________
>> openwrt-devel mailing list
>> openwrt-devel@lists.openwrt.org
>> https://lists.openwrt.org/cgi-bin/mailman/listinfo/openwrt-devel