05/10/2022 22:33, Mattias Rönnblom:
> On 2022-10-05 21:14, David Marchand wrote:
> > Hello,
> > 
> > The service_autotest unit test has been failing randomly.
> > This is not something new.
> > We have been fixing this unit test and the service code, here and there.
> > For some time we were "fine": the failures were rare.
> > 
> > But recently (for the last two weeks at least), it started failing more
> > frequently in UNH lab.
> > 
> > The symptoms are linked to places where the unit test code is "waiting
> > for some time":
> > 
> > -  service_lcore_attr_get:
> > + TestCase [ 5] : service_lcore_attr_get failed
> > EAL: Test assert service_lcore_attr_get line 422 failed: Service lcore
> > not stopped after waiting.
> > 
> > 
> > -  service_may_be_active:
> > + TestCase [15] : service_may_be_active failed
> > ...
> > EAL: Test assert service_may_be_active line 960 failed: Error: Service
> > not stopped after 100ms
> > 
> > Ideas?
> > 
> > 
> > Thanks.
> 
> Do you run the test suite in a controlled environment? I.e., one where 
> you can trust that the lcore threads aren't interrupted for long periods 
> of time.
> 
> 100 ms is not a long time if a SCHED_OTHER lcore thread competes for the 
> CPU with other threads.

You mean the tests cannot be interrupted?
Then it looks very fragile.
Could you please help make it more robust?
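
One possible direction, as a rough sketch only: instead of sleeping for a
fixed 100 ms and then asserting, the test could poll the condition until a
much more generous deadline expires. A preempted lcore thread then only makes
the test slower, not failed. The code below is plain C/POSIX, not the actual
test code; wait_until(), wait_pred_t and flag_is_set() are hypothetical names
made up for illustration, and the timeout values are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical predicate type: returns true once the awaited state is reached. */
typedef bool (*wait_pred_t)(void *arg);

/* Monotonic clock in milliseconds, unaffected by wall-clock adjustments. */
static uint64_t now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000u + (uint64_t)ts.tv_nsec / 1000000u;
}

/*
 * Poll pred(arg) every poll_ms milliseconds until it returns true or
 * timeout_ms milliseconds have elapsed. Returns true if the condition
 * was observed, false on timeout.
 */
static bool wait_until(wait_pred_t pred, void *arg,
		       uint64_t timeout_ms, uint64_t poll_ms)
{
	const uint64_t deadline = now_ms() + timeout_ms;

	for (;;) {
		if (pred(arg))
			return true;
		if (now_ms() >= deadline)
			return pred(arg); /* one last check after the deadline */

		struct timespec nap = {
			.tv_sec = (time_t)(poll_ms / 1000u),
			.tv_nsec = (long)((poll_ms % 1000u) * 1000000u),
		};
		nanosleep(&nap, NULL);
	}
}

/* Example predicate (hypothetical): true once the flag pointed to by arg is set. */
static bool flag_is_set(void *arg)
{
	return *(volatile bool *)arg;
}
```

In the test, the predicate would check whatever "service lcore stopped" means
there, and the timeout could be several seconds rather than 100 ms, since the
loop returns as soon as the condition holds.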

