Hi Felix,

I applied the new ubus patch and it fixes the netifd crash. As the traces
below show, the ubus requests are now deferred: netifd_reload is only called
after config_init_all has finished.

Thanks for the patch,
Hans

Jun 25 09:53:53 OpenWrt daemon.notice netifd: config_init_all : Enter
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Create interface 'loopback'
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Create simple device 'lo'
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Initialize device 'lo'
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Network device 'lo' is now present
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Add user for device 'lo', refcount=1
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Interface 'loopback', available=1
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Add user for device 'lo', refcount=2
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Create new device 'br-lan' (Bridge)
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Initialize device 'br-lan'
Jun 25 09:53:53 OpenWrt daemon.notice netifd: Create interface 'lan'
...
Jun 25 09:53:54 OpenWrt daemon.notice netifd: config_init_all : Exit
Jun 25 09:53:54 OpenWrt daemon.notice netifd: netifd_reload : Enter
Jun 25 09:53:54 OpenWrt daemon.notice netifd: config_init_all : Enter
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Device 'br-lan': config applied
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Update interface 'lan'
....
Jun 25 09:53:54 OpenWrt daemon.notice netifd: netifd_reload : Exit
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Network device 'eth1' link is up
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Bridge 'br-lan' link is up
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Interface 'lan' has link connectivity
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Interface 'lan' is setting up now
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Queue hotplug handler for interface 'lan', event 'ifup'
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Call hotplug handler for interface 'lan', event 'ifup' (br-lan)
Jun 25 09:53:54 OpenWrt daemon.notice netifd: Interface 'lan' is now up


On Tue, Jun 24, 2014 at 5:36 PM, Felix Fietkau <n...@openwrt.org> wrote:
> Hi Hans,
>
> thanks for testing. I uploaded a new patch (same URL), which uses a
> uloop timer to defer processing of incoming invoke msgs.
> Note that this changes the ubus context data structure and thus affects
> everything that depends on ubus, so it's better to reflash after rebuilding.
>
> - Felix
>
> On 2014-06-24 16:11, Hans Dedecker wrote:
>> Hi,
>>
>> I applied the ubus patch, but netifd_reload is still called while
>> config_init_all is processing the config, which leads to a crash
>> once netifd_reload returns.
>>
>> I added extra traces in netifd, which confirm this:
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: config_init_all : Enter
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Create interface
>> 'loopback'
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Create simple device
>> 'lo'
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Initialize device 'lo'
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Remove a route from
>> device lo
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Remove a route from
>> device lo
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Remove a route from
>> device lo
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Network device 'lo' is
>> now present
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Add user for device
>> 'lo', refcount=1
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Interface 'loopback',
>> available=1
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Add user for device
>> 'lo', refcount=2
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: netifd_reload : Enter
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: config_init_all : Enter
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Update interface
>> 'loopback'
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Create new device
>> 'br-lan' (Bridge)
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Initialize device
>> 'br-lan'
>> Jun 24 16:00:44 OpenWrt daemon.notice netifd: Create interface 'lan'
>> .....
>> Jun 24 16:00:45 OpenWrt daemon.notice netifd: config_init_all : Exit
>> Jun 24 16:00:45 OpenWrt daemon.notice netifd: netifd_reload : Exit
>>
>>
>> Hans
>> On Tue, Jun 24, 2014 at 2:05 PM, Felix Fietkau <n...@openwrt.org> wrote:
>>> On 2014-06-24 12:46, Hans Dedecker wrote:
>>>> Netifd crashes when a network reload (ubus call network reload) is
>>>> handled while the network config is being parsed in config_init_all
>>>> (called from main) at startup.
>>>> Because a ubus_invoke call is issued when the interfaces are created,
>>>> ubus also processes any pending ubus calls during that invoke, in this
>>>> case the network reload.
>>>> netifd_reload calls config_init_all again, so the network config is
>>>> parsed a second time. When netifd_reload returns, the original
>>>> config_init_all call continues but crashes, because the references it
>>>> holds to the interface/device/etc. lists are no longer valid.
>>>> This potential problem has always been present, but since the
>>>> netifd_reload timing change in netifd commit
>>>> 5db02763d61785529bef538f196c180e968b7c26 it can easily be triggered.
>>>> To solve the issue, I was thinking about deferring the network reload
>>>> while config_init_all is parsing the config.
>>>> Any opinion on whether this is the correct approach, or are there
>>>> alternatives?
>>> Please try applying this patch to ubus:
>>> http://nbd.name/libubus-req-defer.patch
>>>
>>> It should ensure that no invoke will be processed while netifd is busy
>>> with registering/unregistering objects or sending notify calls.
>>>
>>> - Felix
>>
_______________________________________________
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/cgi-bin/mailman/listinfo/openwrt-devel
