On 2023-05-12 15:56, Morten Brørup wrote:
From: Mattias Rönnblom [mailto:mattias.ronnb...@ericsson.com]
Sent: Friday, 12 May 2023 15.15
On 2023-05-12 13:59, Jerin Jacob wrote:
On Thu, May 11, 2023 at 2:00 PM Mattias Rönnblom
<mattias.ronnb...@ericsson.com> wrote:
Use non-burst event enqueue and dequeue calls from burst enqueue and
dequeue only when the burst size is compile-time constant (and equal
to one).
Signed-off-by: Mattias Rönnblom <mattias.ronnb...@ericsson.com>
---
v3: Actually include the change v2 claimed to contain.
v2: Wrap builtin call in __extension__, to avoid compiler warnings if
application is compiled with -pedantic. (Morten Brørup)
---
lib/eventdev/rte_eventdev.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
index a90e23ac8b..a471caeb6d 100644
--- a/lib/eventdev/rte_eventdev.h
+++ b/lib/eventdev/rte_eventdev.h
@@ -1944,7 +1944,7 @@ __rte_event_enqueue_burst(uint8_t dev_id, uint8_t port_id,
 * Allow zero cost non burst mode routine invocation if application
 * requests nb_events as const one
 */
- if (nb_events == 1)
+ if (__extension__(__builtin_constant_p(nb_events)) && nb_events == 1)
"Why" part is not clear from the commit message. Is this to avoid
nb_events read if it is built-in const?
The __builtin_constant_p() is introduced to avoid having the compiler
generate a conditional branch and two different code paths in case
nb_events is a run-time variable.
In particular, this matters if nb_events is a run-time variable and varies
between 1 and some larger value.
I should have mentioned this in the commit message.
A very slight performance improvement. It also makes the code better
match the comment, imo. Zero cost for const one enqueues, but no impact
on non-compile-time-constant-length enqueues.
Feel free to ignore.
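The dispatch described above can be sketched in isolation. This is a minimal sketch, not the DPDK code: struct event, enqueue_single(), enqueue_burst(), enqueue() and the two call counters are hypothetical stand-ins, and only __builtin_constant_p() and __extension__ are the real GCC/Clang constructs. Whether a literal 1 is recognized as constant depends on the wrapper being inlined, so it may differ between optimization levels.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_event. */
struct event { uint64_t u64; };

static unsigned int single_calls; /* times the single-event path ran */
static unsigned int burst_calls;  /* times the burst path ran */

/* Hypothetical single-event callback: always enqueues one event. */
static uint16_t enqueue_single(const struct event *ev)
{
    assert(ev != NULL);
    single_calls++;
    return 1;
}

/* Hypothetical burst callback: pretends all events were enqueued. */
static uint16_t enqueue_burst(const struct event ev[], uint16_t nb_events)
{
    assert(ev != NULL);
    burst_calls++;
    return nb_events;
}

/*
 * Wrapper using the check from the patch. When nb_events is not a
 * compile-time constant, __builtin_constant_p() folds to 0 and the
 * compiler can drop the comparison and the single-event path entirely.
 */
static inline uint16_t enqueue(const struct event ev[], uint16_t nb_events)
{
    if (__extension__(__builtin_constant_p(nb_events)) && nb_events == 1)
        return enqueue_single(ev);
    else
        return enqueue_burst(ev, nb_events);
}
```

A call such as enqueue(ev, n) with n read from a volatile variable always takes the burst path, even when n happens to be 1 at run time. A call such as enqueue(ev, 1) may take the single-event path when the compiler inlines the wrapper.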
If so, the check should be the following. Right?
if (__extension__((__builtin_constant_p(nb_events)) && nb_events == 1)
|| nb_events == 1)
@Mattias: You missed the second part of this comparison, also catching
nb_events == 1 with non-constant nb_events.
I didn't comment on that code snippet since it was based on a
misconception of the intention of my patch.
@Jerin: Such a change has no effect, compared to the original code.
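That the suggested condition reduces to the original one can be checked by brute force. The following is a rough sketch under stated assumptions: the is_const parameter is a hypothetical model of the result of __builtin_constant_p(nb_events), and all function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Original code: single-event path whenever nb_events is 1. */
static bool original_check(uint16_t nb_events)
{
    return nb_events == 1;
}

/* Suggested variant; is_const models __builtin_constant_p(nb_events). */
static bool suggested_check(uint16_t nb_events, bool is_const)
{
    return (is_const && nb_events == 1) || nb_events == 1;
}

/* Patched code: single-event path only for a compile-time constant 1. */
static bool patched_check(uint16_t nb_events, bool is_const)
{
    return is_const && nb_events == 1;
}

/* Count inputs where the suggested variant differs from the original. */
static unsigned int count_mismatches(void)
{
    unsigned int mismatches = 0;

    for (uint16_t n = 0; n < 8; n++) {
        for (int c = 0; c <= 1; c++) {
            if (suggested_check(n, c != 0) != original_check(n))
                mismatches++;
        }
    }
    return mismatches;
}
```

The count is zero because (A && B) || B is equivalent to B. Only the patched form differs: it rejects a run-time value of 1, which is the point of the patch.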
At least, it was my original intention in the code.
@Jerin: Mattias implemented exactly what the comment says.
Perhaps only the comment should be updated, not the code.
Is nb_events likely to be non-constant 1, and are there benefits to calling
either of the non-burst functions in those cases, vs. the branch cost of this
comparison (which Mattias' patch gets rid of)?
I think the main worry would be the cost of branch mispredictions in
case of alternating enqueue sizes (between 1 and some other size).
If there is a performance upside to calling single-event enqueue in a
scenario where all enqueues are *run-time variable* and 1 (which I find
unlikely, but well within the realm of possibility), the next
question would be: OK, but how about for two events? Three? Four? Etc.
return (fp_ops->enqueue)(port, ev);
else
return fn(port, ev, nb_events);
@@ -2200,7 +2200,7 @@ rte_event_dequeue_burst(uint8_t dev_id, uint8_t port_id, struct rte_event ev[],
 * Allow zero cost non burst mode routine invocation if application
 * requests nb_events as const one
 */
- if (nb_events == 1)
+ if (__extension__(__builtin_constant_p(nb_events)) && nb_events == 1)
return (fp_ops->dequeue)(port, ev, timeout_ticks);
else
return (fp_ops->dequeue_burst)(port, ev, nb_events,
--
2.34.1