On Mon, Jul 13, 2020 at 10:55:30AM +0100, Bruce Richardson wrote:
> On Mon, Jul 13, 2020 at 07:15:19AM +0000, Cheng Jiang wrote:
> > Added a flag which controls whether the rte_ioat_enqueue_copy
> > and rte_ioat_completed_copies functions process handle
> > parameters, to improve performance when the handle
> > parameters are not needed. This is targeting the
> > 20.11 release.
> > 
> > Signed-off-by: Cheng Jiang <cheng1.ji...@intel.com>
> > ---
> >  drivers/raw/ioat/ioat_rawdev.c     |  1 +
> >  drivers/raw/ioat/rte_ioat_rawdev.h | 14 +++++++++++---
> >  2 files changed, 12 insertions(+), 3 deletions(-)
> > 
> <snip>
> > @@ -208,6 +213,11 @@ rte_ioat_completed_copies(int dev_id, uint8_t max_copies,
> >     if (count > max_copies)
> >             count = max_copies;
> >  
> > +   ioat->next_read = read + count;
> > +   ioat->completed += count;
> > +   if (!ioat->hdls_enable)
> > +           return count;
> > +
> >     for (; i < count - 1; i += 2, read += 2) {
> >             __m128i hdls0 = _mm_load_si128(&ioat->hdls[read & mask]);
> >             __m128i hdls1 = _mm_load_si128(&ioat->hdls[(read + 1) & mask]);
> > @@ -223,8 +233,6 @@ rte_ioat_completed_copies(int dev_id, uint8_t max_copies,
> >             dst_hdls[i] = hdls[1];
> >     }
> >  
> > -   ioat->next_read = read;
> > -   ioat->completed += count;
> >     return count;
> >  }
> 
> This change I think may cause problems if we ever want to have one thread
> enqueuing and another taking completions. The next_read and completed
> counters should really only be updated after we have finished reading the
> completed handles array. Therefore, for safety, I think it might be better
> to keep the updates in their original places and put an "end:" label before
> them. Then the "return count" in the middle of the function can be "goto
> end;"
> 
A further suggestion on the changes to this function: if we are not
actually returning completion handles, then there is no need to limit the
count to "max_copies". Therefore move the "return count" or "goto end"
above the preceding length check (the "count > max_copies" test), and
update the doxygen comments to indicate that max_copies is ignored if
"hdls_enable" is false, and that the final two parameters can also be
NULL when calling the function in this case.
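
To make the two suggestions concrete, here is a rough, simplified sketch of
how the function could be laid out. It is not the driver code: the struct,
the "sketch_" names and the "hw_done" parameter (standing in for the
completion count read from hardware) are made up for illustration, and the
SSE handle copy is replaced by a plain loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_RING_SIZE 8 /* hypothetical ring size; must be a power of two */

/* Hypothetical, simplified stand-in for the rawdev private data. */
struct sketch_ioat {
	uint16_t next_read;  /* next ring slot to hand back */
	uint64_t completed;  /* running count of completed copies */
	int hdls_enable;     /* the flag added by the patch */
	struct { uintptr_t src, dst; } hdls[SKETCH_RING_SIZE];
};

/* hw_done stands in for the number of completions reported by hardware. */
static int
sketch_completed_copies(struct sketch_ioat *ioat, uint16_t hw_done,
		uint8_t max_copies, uintptr_t *src_hdls, uintptr_t *dst_hdls)
{
	const uint16_t mask = SKETCH_RING_SIZE - 1;
	uint16_t read = ioat->next_read;
	uint16_t count = hw_done;
	uint16_t i;

	assert((SKETCH_RING_SIZE & mask) == 0); /* power-of-two ring */

	/* Handles disabled: max_copies and both arrays are ignored. */
	if (!ioat->hdls_enable) {
		read += count;
		goto end;
	}

	/* The length check only matters when handles are returned. */
	if (count > max_copies)
		count = max_copies;

	for (i = 0; i < count; i++, read++) {
		src_hdls[i] = ioat->hdls[read & mask].src;
		dst_hdls[i] = ioat->hdls[read & mask].dst;
	}

end:
	/* Counters are updated only after the handle ring has been read. */
	ioat->next_read = read;
	ioat->completed += count;
	return count;
}
```

Note that this only shows the ordering within one thread. If enqueue and
completion really do end up on different threads, the counter updates
would also need suitable memory ordering, which the sketch does not
attempt.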

/Bruce
