RE: [PATCH v3] test/service: fix spurious failures by extending timeout

Van Haaren, Harry Tue, 31 Jan 2023 09:25:05 -0800

> -----Original Message-----
> From: David Marchand <david.march...@redhat.com>
> Sent: Thursday, January 26, 2023 9:30 AM
> To: Van Haaren, Harry <harry.van.haa...@intel.com>
> Cc: dev@dpdk.org; dpdk...@iol.unh.edu; c...@dpdk.org;
> honnappa.nagaraha...@arm.com; mattias.ronnblom
> <mattias.ronnb...@ericsson.com>; tho...@monjalon.net; Morten Brørup
> <m...@smartsharesystems.com>; Tyler Retzlaff <roret...@linux.microsoft.com>;
> Aaron Conole <acon...@redhat.com>
> Subject: Re: [PATCH v3] test/service: fix spurious failures by extending 
> timeout
> 
> Hello Harry,


Hi David,

> On Thu, Oct 6, 2022 at 9:33 PM David Marchand <david.march...@redhat.com>
> wrote:
> >
> > On Thu, Oct 6, 2022 at 3:27 PM Morten Brørup <m...@smartsharesystems.com> wrote:
> > > > This commit extends the timeout for service_may_be_active()
> > > > from 100ms to 1000ms. Local testing on an idle and loaded system
> > > > (compiling DPDK with all cores) always completes after 1 ms.
> > > >
> > > > The wait time for a service-lcore to finish is also extended
> > > > from 100ms to 1000ms.
> > > >
> > > > The same timeout waiting code was duplicated in two tests, and
> > > > is now refactored to a standalone function avoiding duplication.
> > > >
> > > > Reported-by: David Marchand <david.march...@redhat.com>
> > > > Suggested-by: Mattias Ronnblom <mattias.ronnb...@ericsson.com>
> > > > Signed-off-by: Harry van Haaren <harry.van.haa...@intel.com>
> > > Acked-by: Morten Brørup <m...@smartsharesystems.com>
> > Reviewed-by: Mattias Rönnblom <mattias.ronnb...@ericsson.com>
> >
> > Ok, let's see if the situation gets better with this.
> > Applied, thanks.
> 
> I took a look at the January failures at UNH.
> 
> Downloads/dpdk_31608e4db568_2023-01-03_06-58-00_NA/out/testlog.txt:EAL:
> Test assert service_lcore_attr_get line 422 failed: Service lcore not
> stopped after waiting.
> Extending the timeout just made it less likely.

Aha, okay.

<snip>
> The timeout approach just does not have its place in a functional test.
> Either this test is rewritten, or it must go to the performance tests
> list so that we stop getting false positives.
> Can you work on this?

I'll investigate various approaches on Thursday and reply here with suggested 
next steps.

Regards, -Harry
