On 2023-05-12 15:56, Morten Brørup wrote:
From: Mattias Rönnblom [mailto:mattias.ronnb...@ericsson.com]
Sent: Friday, 12 May 2023 15.15

On 2023-05-12 13:59, Jerin Jacob wrote:
On Thu, May 11, 2023 at 2:00 PM Mattias Rönnblom
<mattias.ronnb...@ericsson.com> wrote:

Use non-burst event enqueue and dequeue calls from burst enqueue and
dequeue only when the burst size is compile-time constant (and equal
to one).

Signed-off-by: Mattias Rönnblom <mattias.ronnb...@ericsson.com>

---

v3: Actually include the change v2 claimed to contain.
v2: Wrap builtin call in __extension__, to avoid compiler warnings if
      application is compiled with -pedantic. (Morten Brørup)
---
   lib/eventdev/rte_eventdev.h | 4 ++--
   1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
index a90e23ac8b..a471caeb6d 100644
--- a/lib/eventdev/rte_eventdev.h
+++ b/lib/eventdev/rte_eventdev.h
@@ -1944,7 +1944,7 @@ __rte_event_enqueue_burst(uint8_t dev_id, uint8_t port_id,
           * Allow zero cost non burst mode routine invocation if application
           * requests nb_events as const one
           */
-       if (nb_events == 1)
+       if (__extension__(__builtin_constant_p(nb_events)) && nb_events == 1)

The "why" part is not clear from the commit message. Is this to avoid
reading nb_events if it is a built-in constant?

The __builtin_constant_p() is introduced to avoid having the compiler
generate a conditional branch and two different code paths in case
nb_events is a run-time variable.

In particular, this matters if nb_events is a run-time variable and varies
between 1 and some larger value.

I should have mentioned this in the commit message.
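
For illustration, the pattern described above can be sketched outside DPDK roughly as follows. This is a minimal sketch, not the eventdev code: `enqueue_single()`, `enqueue_burst()`, `enqueue()`, `demo()` and the two counters are hypothetical stand-ins, and it assumes GCC or Clang. Whether a literal 1 is recognised as a compile-time constant inside the inlined wrapper depends on the compiler and optimisation level, so the sketch only asserts the case that holds everywhere: a run-time 1 takes the burst path.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's single-event and burst callbacks;
 * the counters only record which path was taken. */
static int single_calls;
static int burst_calls;

static uint16_t enqueue_single(const int *ev)
{
	(void)ev;
	single_calls++;
	return 1;
}

static uint16_t enqueue_burst(const int *ev, uint16_t nb_events)
{
	(void)ev;
	burst_calls++;
	return nb_events;
}

/* Same shape as the patched check: the single-event path is selected only
 * when the compiler can prove at compile time that nb_events is 1. */
static inline __attribute__((always_inline)) uint16_t
enqueue(const int *ev, uint16_t nb_events)
{
	if (__extension__(__builtin_constant_p(nb_events)) && nb_events == 1)
		return enqueue_single(ev);
	else
		return enqueue_burst(ev, nb_events);
}

void demo(void)
{
	int ev[4] = {0};
	volatile uint16_t n = 1; /* run-time value, never a compile-time constant */
	int single_before = single_calls;
	int burst_before = burst_calls;

	/* A run-time 1 goes to the burst callback: no branch on nb_events. */
	assert(enqueue(ev, n) == 1);
	assert(single_calls == single_before);
	assert(burst_calls == burst_before + 1);

	/* A literal 1 may use either path, depending on optimisation level;
	 * the result is the same in both cases. */
	assert(enqueue(ev, 1) == 1);
	assert(single_calls + burst_calls == single_before + burst_before + 2);
}
```

With optimisation enabled, the call with the literal 1 would typically collapse into a direct call to the single-event stand-in, while the run-time call keeps only the burst call.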

A very slight performance improvement. It also makes the code better
match the comment, imo. Zero cost for const one enqueues, but no impact
on non-compile-time-constant-length enqueues.

Feel free to ignore.

If so, the check should be the following, right?

if (__extension__((__builtin_constant_p(nb_events)) && nb_events == 1) || nb_events == 1)

@Mattias: You missed the second part of this comparison, also catching 
nb_events == 1 with non-constant nb_events.


I didn't comment on that code snippet since it was based on a misconception of the intention of my patch.

@Jerin: Such a change has no effect, compared to the original code.
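
To spell out why: the suggested condition has the form (A && B) || B, which reduces to B, that is, the original `nb_events == 1`. A small sketch that checks this exhaustively, with the result of `__builtin_constant_p()` modelled as a plain boolean (the function names are hypothetical, chosen for this illustration only):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The check before the patch. */
static bool original_check(uint16_t nb_events)
{
	return nb_events == 1;
}

/* The suggested check; is_const models the __builtin_constant_p() result. */
static bool suggested_check(bool is_const, uint16_t nb_events)
{
	return (is_const && nb_events == 1) || nb_events == 1;
}

/* Count the inputs for which the two checks disagree. */
static unsigned int count_mismatches(void)
{
	unsigned int mismatches = 0;

	for (int c = 0; c <= 1; c++)
		for (uint32_t n = 0; n <= UINT16_MAX; n++)
			if (suggested_check(c != 0, (uint16_t)n) !=
			    original_check((uint16_t)n))
				mismatches++;

	return mismatches;
}

void verify_equivalence(void)
{
	/* The two checks agree for every uint16_t value and both flag values. */
	assert(count_mismatches() == 0);
}
```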


At least, that was my original intention in the code.

@Jerin: Mattias implemented exactly what the comment says.

Perhaps only the comment should be updated, not the code.

Is nb_events likely to be non-constant 1, and are there benefits to calling 
either of the non-burst functions in those cases, vs. the branch cost of this 
comparison (which Mattias' patch gets rid of)?


I think the main worry would be the cost of branch mispredictions in
case of alternating enqueue sizes (between 1 and some other size).

If there is a performance upside to calling single-event enqueue in a scenario where all enqueues are *run-time variable* and 1 (which I find unlikely, but well within the realm of possibility), the next question would be: OK, but how about for two events? Three? Four? And so on.




                  return (fp_ops->enqueue)(port, ev);
          else
                  return fn(port, ev, nb_events);
@@ -2200,7 +2200,7 @@ rte_event_dequeue_burst(uint8_t dev_id, uint8_t port_id, struct rte_event ev[],
           * Allow zero cost non burst mode routine invocation if application
           * requests nb_events as const one
           */
-       if (nb_events == 1)
+       if (__extension__(__builtin_constant_p(nb_events)) && nb_events == 1)
                  return (fp_ops->dequeue)(port, ev, timeout_ticks);
          else
                  return (fp_ops->dequeue_burst)(port, ev, nb_events,
--
2.34.1

