Re: [LEDE-DEV] [PATCH] ubus cli: wait_for: fix race causing false timeouts

Alexandru Ardelean Fri, 07 Oct 2016 05:20:14 -0700

On Fri, Oct 7, 2016 at 3:09 PM, Felix Fietkau <n...@nbd.name> wrote:
> On 2016-10-07 13:57, Zefir Kurtisi wrote:
>> In ubus_cli_wait_for() there is a critical section between
>> initially checking for the requested services and the
>> following handling of 'ubus.object.add' events.
>>
>> In our system we let procd (re)start services and synchronize
>> inter-service dependencies by using 'ubus wait_for' in the
>> initscripts' service_started() functions. There we observe
>> that 'wait_for' randomly is waiting for the full timeout
>> and returning UBUS_STATUS_TIMEOUT, even if the service it
>> is waiting for is already up and running.
>>
>> This happens when the service is started in the critical
>> section mentioned above. This commit adds periodic lookup
>> for the requested services while waiting for the 'add' event
>> and with that fixes the observed failure.
>>
>> Signed-off-by: Zefir Kurtisi <zefir.kurt...@neratec.com>
> Instead of introducing yet another timer, wouldn't it also be possible
> to close this race window by registering the event handler before
> attempting the lookup?
>
> - Felix
>
> _______________________________________________
> Lede-dev mailing list
> Lede-dev@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/lede-dev


I've also seen this race.
I tried something like this:
https://github.com/commodo/ubus/commit/8c3986caaa7cd2c12f2b8907ceea54c5bdce3bd2

But I never got around to testing it enough to see whether the race
goes away completely, so I never pushed it upstream.

@Zefir, maybe you could try it?
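
To make the ordering Felix describes concrete, here is a toy model of the
race. It is only a sketch under stated assumptions: it does not use libubus,
it is not the contents of the commit linked above, and every name in it
(sim_bus, sim_lookup, sim_listen, sim_add_object, wait_lookup_first,
wait_listen_first) is hypothetical. It assumes an 'add' event is delivered
only to handlers that are already registered when the object appears.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SIM_MAX_OBJECTS 8

/* Hypothetical stand-in for the bus state; not a libubus structure. */
struct sim_bus {
	const char *objects[SIM_MAX_OBJECTS];
	size_t n_objects;
	bool listening;      /* is an 'add' handler registered? */
	const char *wanted;  /* object name the waiter cares about */
	bool seen_add;       /* set when the handler sees the wanted object */
};

/* One-shot check: is the object on the bus right now? */
static bool sim_lookup(const struct sim_bus *bus, const char *name)
{
	for (size_t i = 0; i < bus->n_objects; i++)
		if (strcmp(bus->objects[i], name) == 0)
			return true;
	return false;
}

/* Register the 'add' handler for one object name. */
static void sim_listen(struct sim_bus *bus, const char *name)
{
	bus->wanted = name;
	bus->listening = true;
}

/* A service appears. The event reaches only an already registered handler. */
static void sim_add_object(struct sim_bus *bus, const char *name)
{
	assert(bus->n_objects < SIM_MAX_OBJECTS);
	bus->objects[bus->n_objects++] = name;
	if (bus->listening && strcmp(bus->wanted, name) == 0)
		bus->seen_add = true;
}

/*
 * Racy order: lookup, then register. The service starts in between,
 * so the lookup misses it and the event is gone before we listen.
 * Returns true if the waiter notices the service (false models a timeout).
 */
static bool wait_lookup_first(struct sim_bus *bus, const char *name)
{
	bool found = sim_lookup(bus, name);
	sim_add_object(bus, name);   /* service starts inside the window */
	sim_listen(bus, name);
	return found || bus->seen_add;
}

/*
 * Suggested order: register, then lookup. A service that starts at the
 * same point is now caught by the handler (and by the lookup as well).
 */
static bool wait_listen_first(struct sim_bus *bus, const char *name)
{
	sim_listen(bus, name);
	sim_add_object(bus, name);   /* service starts at the same point */
	bool found = sim_lookup(bus, name);
	return found || bus->seen_add;
}
```

In this model, wait_lookup_first() on an empty bus misses the service and
would run into the timeout, while wait_listen_first() notices it. Whether
the real ubus event delivery behaves the same way is exactly what would
need testing.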

Thanks
Alex

