Hi Pavan,

> -----Original Message-----
> From: Pavan Nikhilesh Bhagavatula <pbhagavat...@marvell.com>
> Sent: Friday, October 25, 2019 12:26 PM
> To: Gavin Hu (Arm Technology China) <gavin...@arm.com>;
> jer...@marvell.com
> Cc: dev@dpdk.org; nd <n...@arm.com>
> Subject: RE: [dpdk-dev] [PATCH] event/octeontx2: use wfe while waiting for
> head
> 
> Hi Gavin,
> 
> >-----Original Message-----
> >From: dev <dev-boun...@dpdk.org> On Behalf Of Gavin Hu (Arm
> >Technology China)
> >Sent: Thursday, October 24, 2019 9:23 PM
> >To: Pavan Nikhilesh Bhagavatula <pbhagavat...@marvell.com>; Jerin
> >Jacob Kollanukkaran <jer...@marvell.com>
> >Cc: dev@dpdk.org; nd <n...@arm.com>
> >Subject: Re: [dpdk-dev] [PATCH] event/octeontx2: use wfe while
> >waiting for head
> >
> >Hi Pavan,
> >
> >> -----Original Message-----
> >> From: pbhagavat...@marvell.com <pbhagavat...@marvell.com>
> >> Sent: Thursday, October 24, 2019 12:13 AM
> >> To: Gavin Hu (Arm Technology China) <gavin...@arm.com>;
> >> jer...@marvell.com; Pavan Nikhilesh <pbhagavat...@marvell.com>
> >> Cc: dev@dpdk.org
> >> Subject: [dpdk-dev] [PATCH] event/octeontx2: use wfe while waiting
> >for
> >> head
> >>
> >> From: Pavan Nikhilesh <pbhagavat...@marvell.com>
> >>
> >> Use wfe to save power while waiting for tag to become head.
> >>
> >> SSO signals EVENTI to allow cores to exit from wfe when they
> >> are waiting for specific operations in which one of them is
> >> setting HEAD bit in GWS_TAG.
> >>
> >> Signed-off-by: Pavan Nikhilesh <pbhagavat...@marvell.com>
> >> ---
> >>  drivers/event/octeontx2/otx2_worker.h | 30 ++++++++++++++++++++++++---
> >>  1 file changed, 27 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/event/octeontx2/otx2_worker.h
> >> b/drivers/event/octeontx2/otx2_worker.h
> >> index 4e971f27c..7a55caca5 100644
> >> --- a/drivers/event/octeontx2/otx2_worker.h
> >> +++ b/drivers/event/octeontx2/otx2_worker.h
> >> @@ -226,10 +226,34 @@ otx2_ssogws_swtag_wait(struct otx2_ssogws *ws)
> >>  }
> >>
> >>  static __rte_always_inline void
> >> -otx2_ssogws_head_wait(struct otx2_ssogws *ws, const uint8_t wait_flag)
> >> +otx2_ssogws_head_wait(struct otx2_ssogws *ws)
> >>  {
> >> -  while (wait_flag && !(otx2_read64(ws->tag_op) & BIT_ULL(35)))
> >> +#ifdef RTE_ARCH_ARM64
> >> +  uint64_t tag;
> >> +
> >> +  asm volatile (
> >> +                  "       ldr %[tag], [%[tag_op]]         \n"
> >"ldxr" should be used here: an exclusive load is required to "monitor" the
> >location, so that a write to the location clears the exclusive monitor and
> >a wake-up event is generated implicitly.
> 
> As I have mentioned in the commit log:
> "SSO signals EVENTI to allow cores to exit from wfe when they
> are waiting for specific operations in which one of them is
> setting HEAD bit in GWS_TAG."
If you have other expected wake-up sources, that is OK. Just curious: is this
signal sent explicitly to make the core exit WFE?
Also, which has the shorter latency: the implicit event (clearing of the
exclusive monitor) or the explicit signal?
/Gavin
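
To make the comparison concrete, here is a rough sketch of the exclusive-load
variant I had in mind. It is only an illustration, not a proposed patch:
head_wait_sketch and HEAD_BIT are made-up names, the pointer stands in for
ws->tag_op, and it assumes the location is ordinary memory that the exclusive
monitor can track. Whether that holds for the GWS_TAG register mapping is
exactly the open question. On non-arm64 builds it falls back to a plain spin.

```c
#include <stdint.h>

/* Hypothetical name: bit 35 is the HEAD bit tested in the patch. */
#define HEAD_BIT (UINT64_C(1) << 35)

/*
 * Sketch only: wait until bit 35 of *tag_op is set.
 * On arm64, ldxr arms the exclusive monitor so that a later write to the
 * location clears it and generates the event that wakes the core from wfe.
 * Assumes *tag_op is memory the monitor is able to track.
 */
static inline void
head_wait_sketch(const volatile uint64_t *tag_op)
{
#if defined(__aarch64__)
	uint64_t tag;

	asm volatile(
		"	ldxr %[tag], [%[tag_op]]	\n"
		"	tbnz %[tag], 35, done%=		\n"
		"	sevl				\n"
		"rty%=:	wfe				\n"
		"	ldxr %[tag], [%[tag_op]]	\n"
		"	tbz %[tag], 35, rty%=		\n"
		"done%=:				\n"
		: [tag] "=&r" (tag)
		: [tag_op] "r" (tag_op)
		: "memory");
#else
	/* Portable fallback: busy-wait for the HEAD bit. */
	while (!(*tag_op & HEAD_BIT))
		;
#endif
}
```

If the SSO already sends an explicit event when HEAD is set, the plain ldr in
your patch is sufficient and the monitor is not needed.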
> 
> The address need not be tracked by the global monitor.
> 
> >You can find more explanation here:
> >https://urldefense.proofpoint.com/v2/url?u=http-
> >3A__inbox.dpdk.org_dev_AM0PR08MB5363F9D1BA158B66B803EA068F
> >6B0-
> >40AM0PR08MB5363.eurprd08.prod.outlook.com_&d=DwIFAg&c=nKjW
> >ec2b6R0mOyPaz7xtfQ&r=1cjuAHrGh745jHNmj2fD85sUMIJ2IPIDsIJzo6F
> >N6Z0&m=JMzT-4V2megNsFYxaO0V2wE0-
> >GlK9UPUvE1K0pPA9aQ&s=JajU2VklhV_jFE0WKAZ076KjjWymIC-
> >iTiJXU0Vwxr4&e=
> >/Gavin
> >> +                  "       tbnz %[tag], 35, done%=         \n"
> >> +                  "       sevl                            \n"
> >> +                  "rty%=: wfe                             \n"
> >> +                  "       ldr %[tag], [%[tag_op]]         \n"
> >> +                  "       tbz %[tag], 35, rty%=           \n"
> >> +                  "done%=:                                \n"
> >> +                  : [tag] "=&r" (tag)
> >> +                  : [tag_op] "r" (ws->tag_op)
> >> +                  );
> >> +#else
> >> +  /* Wait for the HEAD to be set */
> >> +  while (!(otx2_read64(ws->tag_op) & BIT_ULL(35)))
> >>            ;
> >> +#endif
> >> +}
> >> +
> >> +static __rte_always_inline void
> >> +otx2_ssogws_order(struct otx2_ssogws *ws, const uint8_t wait_flag)
> >> +{
> >> +  if (wait_flag)
> >> +          otx2_ssogws_head_wait(ws);
> >>
> >>    rte_cio_wmb();
> >What ordering does this barrier try to keep? If there is a write followed
> >by a wait for some kind of response, should this barrier move before
> >otx2_ssogws_head_wait?
> 
> The barrier is used to flush out write buffer to LLC (octeontx2 point of
> coherence) so
> that NIX Tx picks up all the modifications done to the packet.
Looking at the otx2_ssogws_event_tx function, it seems that at the point of
rte_cio_wmb only the header has been written so far. Is that right?
If so, should the barrier be delayed until after the whole packet is written
and before the submission?
And if NIX does not fall within the SMP configuration, should it be rte_io_wmb
instead?
/Gavin
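
The ordering I am asking about looks roughly like the sketch below. It only
shows where the barrier sits relative to the writes and the submission. All
names (pkt_sketch, doorbell_sketch, submit_sketch) are invented for the
example, and the C11 release fence is a stand-in: it is not equivalent to
rte_cio_wmb or rte_io_wmb with respect to what a device observes.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packet layout, for illustration only. */
struct pkt_sketch {
	uint8_t hdr[16];
	uint8_t payload[48];
};

/* Hypothetical stand-in for the submission (doorbell) write. */
static _Atomic uint32_t doorbell_sketch;

static inline void
submit_sketch(struct pkt_sketch *p, const uint8_t *hdr, const uint8_t *payload)
{
	/* 1. Finish every write to the packet, header and payload. */
	memcpy(p->hdr, hdr, sizeof(p->hdr));
	memcpy(p->payload, payload, sizeof(p->payload));

	/* 2. Barrier after the last packet write (stand-in fence). */
	atomic_thread_fence(memory_order_release);

	/* 3. Only then signal the submission. */
	atomic_store_explicit(&doorbell_sketch, 1, memory_order_relaxed);
}
```

If all packet modifications are already complete before the barrier in the
current code, then the placement is fine and only the choice of barrier type
remains.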
> >>  }
> >> @@ -258,7 +282,7 @@ otx2_ssogws_event_tx(struct otx2_ssogws *ws, struct rte_event ev[],
> >>
> >>    /* Perform header writes before barrier for TSO */
> >>    otx2_nix_xmit_prepare_tso(m, flags);
> >> -  otx2_ssogws_head_wait(ws, !ev->sched_type);
> >> +  otx2_ssogws_order(ws, !ev->sched_type);
> >>    otx2_ssogws_prepare_pkt(txq, m, cmd, flags);
> >>
> >>    if (flags & NIX_TX_MULTI_SEG_F) {
> >> --
> >> 2.17.1
