Hi Aaron,

> -----Original Message-----
> From: Aaron Conole <acon...@redhat.com>
> Sent: Monday, November 25, 2019 10:54 PM
> To: Thomas Monjalon <tho...@monjalon.net>
> Cc: Van Haaren, Harry <harry.van.haa...@intel.com>; Amber, Kumar
> <kumar.am...@intel.com>; dev@dpdk.org; Wang, Yipeng1
> <yipeng1.w...@intel.com>; Yigit, Ferruh <ferruh.yi...@intel.com>; Thakur,
> Sham Singh <sham.singh.tha...@intel.com>; David Marchand
> <dmarc...@redhat.com>
> Subject: Re: [dpdk-dev] [PATCH v3] hash: added a new API to hash to query
> key id
> 
> Aaron Conole <acon...@redhat.com> writes:
> 
> > Thomas Monjalon <tho...@monjalon.net> writes:
> >
> >>> From: Aaron Conole <acon...@redhat.com>
> >>> > -       if (!service_valid(id))
> >>> > +       if (id >= RTE_SERVICE_NUM_MAX || !service_valid(id))
> >>
> >> Why not adding this check in service_valid()?
> >
> > I think the best fix is to use SERVICE_VALID_GET_OR_ERR_RET() in these
> > places.  For this, I at least want to try and show that there aren't any
> > further errors.  And my test loop has been running for a while now
> > without any more errors or segfaults, so I guess it's okay to build a
> > proper patch.
> 
> This popped up:
> 
> EAL: Test assert service_lcore_en_dis_able line 487 failed: Ex-service core
> function call had no effect.
> 
> So I'll spend some time in this area, it seems.


The diff below makes it 100% reproducible here, failing every time.

It looks like the main thread is returning before the service thread
has finished.

The rte_eal_mp_wait_lcore() call does not seem to wait on the
service core, which allows the main thread to read the
"service_remote_launch_flag" value as 0 (before the service thread
writes it to 1).

Adding a delay between the service launch and the write of the flag
makes this issue much more likely to occur, so I have confidence in
the description above.
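
To illustrate the suspected ordering outside of DPDK, here is a rough
sketch using plain pthreads. It is only an analogy, not the DPDK code:
delayed_writer(), read_flag() and launch_flag are names made up for
this example, and pthread_join() merely stands in for a wait that
really does block until the worker is done.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical stand-in for service_remote_launch_flag. */
static atomic_int launch_flag;

/* Hypothetical stand-in for the launched function, with a 100 ms
 * delay before the write, mirroring the debug delay in the diff. */
static void *delayed_writer(void *arg)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = 100 * 1000 * 1000 };

	(void)arg;
	nanosleep(&ts, NULL);
	atomic_store(&launch_flag, 1);
	return NULL;
}

/* Launch the worker and return the flag value the caller observes.
 * wait_first != 0: block until the worker has finished, then read.
 * wait_first == 0: read straight away, join afterwards for cleanup. */
static int read_flag(int wait_first)
{
	pthread_t worker;
	int rc, seen;

	atomic_store(&launch_flag, 0);
	rc = pthread_create(&worker, NULL, delayed_writer, NULL);
	assert(rc == 0);
	(void)rc;

	if (wait_first)
		pthread_join(worker, NULL);

	seen = atomic_load(&launch_flag);

	if (!wait_first)
		pthread_join(worker, NULL);

	return seen;
}
```

read_flag(1) always observes 1 because the join completes before the
read. read_flag(0) will normally observe 0, since the read happens
long before the delayed write, which is the same pattern as the test
failure if the wait returns early. Older toolchains may need -pthread
when building this.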

What I'm not clear on (yet) is why rte_eal_mp_wait_lcore() isn't waiting...

-H


diff --git a/app/test/test_service_cores.c b/app/test/test_service_cores.c
index 9fe38f5e0..846ad00d1 100644
--- a/app/test/test_service_cores.c
+++ b/app/test/test_service_cores.c
@@ -445,6 +445,7 @@ static int
 service_remote_launch_func(void *arg)
 {
        RTE_SET_USED(arg);
+       rte_delay_ms(100);
        service_remote_launch_flag = 1;
        return 0;
 }
