> From: Tomasz Duszynski [mailto:tduszyn...@marvell.com]
> Sent: Sunday, 8 January 2023 16.41
> 
> >From: Morten Brørup <m...@smartsharesystems.com>
> >Sent: Thursday, January 5, 2023 11:08 PM
> >
> >> From: Tomasz Duszynski [mailto:tduszyn...@marvell.com]
> >> Sent: Thursday, 5 January 2023 22.14
> >>
> >> Hi Morten,
> >>
> >> A few comments inline.
> >>
> >> >From: Morten Brørup <m...@smartsharesystems.com>
> >> >Sent: Wednesday, December 14, 2022 11:41 AM
> >> >
> >> >External Email
> >> >
> >> >----------------------------------------------------------------------
> >> >+CC: Mattias, see my comment below about per-thread constructor for
> >> >this
> >> >
> >> >> From: Tomasz Duszynski [mailto:tduszyn...@marvell.com]
> >> >> Sent: Wednesday, 14 December 2022 10.39
> >> >>
> >> >> Hello Morten,
> >> >>
> >> >> Thanks for review. Answers inline.
> >> >>
[...]
> >> >> > > +{
> >> >> > > +	int lcore_id = rte_lcore_id();
> >> >> > > +	struct rte_pmu_event_group *group;
> >> >> > > +	int ret;
> >> >> > > +
> >> >> > > +	if (!rte_pmu)
> >> >> > > +		return 0;
> >> >> > > +
> >> >> > > +	group = &rte_pmu->group[lcore_id];
> >> >> > > +	if (!group->enabled) {
> >> >
> >> >Optimized: if (unlikely(!group->enabled)) {
> >> >
> >>
> >> Compiler will optimize the branch itself correctly. Extra hint is not
> >> required.
> >
> >I haven't reviewed the output from this, so I'll take your word for it.
> >I suggested the unlikely() because I previously tested some very simple
> >code, and it optimized for taking the "if":
> >
> >void testb(bool b)
> >{
> >	if (!b)
> >		exit(1);
> >
> >	exit(99);
> >}
> >
> >I guess I should experiment with more realistic code, and update my
> >optimization notes!
> >
> 
> I think this may be too simple to draw far-reaching conclusions from.
> The compiler will make the fall-through path more likely. If I recall
> correctly, the Intel Optimization Reference Manual has some more info
> on this.

IIRC, the Intel Optimization Reference Manual discusses branch optimization for assembler, not C.

> 
> Let's take a different example.
> 
> int main(int argc, char *argv[])
> {
> 	int *p;
> 
> 	p = malloc(sizeof(*p));
> 	if (!p)
> 		return 1;
> 	*p = atoi(argv[1]);
> 	if (*p < 0)
> 		return 2;
> 	free(p);
> 
> 	return 0;
> }
> 
> If compiled with -O3 and disassembled, I got the output below.
> 
> 00000000000010a0 <main>:
>     10a0:	f3 0f 1e fa          	endbr64
>     10a4:	55                   	push   %rbp
>     10a5:	bf 04 00 00 00       	mov    $0x4,%edi
>     10aa:	53                   	push   %rbx
>     10ab:	48 89 f3             	mov    %rsi,%rbx
>     10ae:	48 83 ec 08          	sub    $0x8,%rsp
>     10b2:	e8 d9 ff ff ff       	call   1090 <malloc@plt>
>     10b7:	48 85 c0             	test   %rax,%rax
>     10ba:	74 31                	je     10ed <main+0x4d>
>     10bc:	48 8b 7b 08          	mov    0x8(%rbx),%rdi
>     10c0:	ba 0a 00 00 00       	mov    $0xa,%edx
>     10c5:	31 f6                	xor    %esi,%esi
>     10c7:	48 89 c5             	mov    %rax,%rbp
>     10ca:	e8 b1 ff ff ff       	call   1080 <strtol@plt>
>     10cf:	49 89 c0             	mov    %rax,%r8
>     10d2:	b8 02 00 00 00       	mov    $0x2,%eax
>     10d7:	45 85 c0             	test   %r8d,%r8d
>     10da:	78 0a                	js     10e6 <main+0x46>
>     10dc:	48 89 ef             	mov    %rbp,%rdi
>     10df:	e8 8c ff ff ff       	call   1070 <free@plt>
>     10e4:	31 c0                	xor    %eax,%eax
>     10e6:	48 83 c4 08          	add    $0x8,%rsp
>     10ea:	5b                   	pop    %rbx
>     10eb:	5d                   	pop    %rbp
>     10ec:	c3                   	ret
>     10ed:	b8 01 00 00 00       	mov    $0x1,%eax
>     10f2:	eb f2                	jmp    10e6 <main+0x46>
> 
> Looking at both 10ba and 10da suggests that the code was laid out in a
> way that avoids jumping on the expected path. Also, potentially less
> frequently executed code (at 10ed) is pushed further down in memory,
> hence optimizing cache line usage.

In my notes, I have (ptr == NULL) marked as considered unlikely, but (int == 0) marked as considered likely. Since group->enabled is bool, I guessed the compiler would treat it like int and consider (!group->enabled) as likely. Like in your example here, I also have (int < 0) marked as considered unlikely.

> 
> That said, each and every scenario needs analysis on its own.

Agree. Theory is good, validation is better. ;-)

> 
> >You could add the unlikely() for readability purposes. ;-)
> >
> 
> Sure. That won't hurt performance.

I think we are both in agreement about the intentions here, so I won't hold you back with further academic discussions at this point. I might resume the discussion with your next patch version, though. ;-)