<snip>

>
> As a general remark consider writing all of the tbl entries including
> tbl8 with atomic_store. Now "lpm->tbl8[j] = new_tbl8_entry;" looks like
>
> 1e9: 44 88 9c 47 40 01 00    mov %r11b,0x2000140(%rdi,%rax,2) <-write first byte
> 1f0: 02
> 1f1: 48 83 c0 01             add $0x1,%rax
> 1f5: 42 88 8c 47 41 01 00    mov %cl,0x2000141(%rdi,%r8,2) <-write second byte
> 1fc: 02
>
> This may cause an incorrect nexthop to be returned. If the byte with valid
> flag is updated first, the old (and maybe invalid) next hop could be returned.

+1
It is surprising that the compiler is not generating a single 32b store. As you mentioned, a 'relaxed' __atomic_store_n should be good.

>
> Please evaluate performance drop after.
>
> --
> Regards,
> Vladimir
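
For reference, a minimal sketch of the relaxed store discussed above. It assumes a 32-bit entry and GCC/Clang __atomic builtins. The struct layout and the names tbl_entry, tbl_entry_store and tbl_entry_load are hypothetical and are not the actual rte_lpm definitions. Because the entry is a struct rather than an integer, the sketch uses the generic __atomic_store form; __atomic_store_n would need the entry viewed as a uint32_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit table entry, for illustration only
 * (not the real rte_lpm layout). */
struct tbl_entry {
	uint32_t next_hop : 24;
	uint32_t valid    : 1;
	uint32_t reserved : 7;
};

/* The whole entry must fit in one naturally aligned 32-bit word. */
_Static_assert(sizeof(struct tbl_entry) == sizeof(uint32_t),
	       "entry must be exactly 32 bits");

/* Publish a new entry as one atomic access, so a concurrent reader
 * cannot observe a mix of old and new bytes. */
static inline void
tbl_entry_store(struct tbl_entry *slot, struct tbl_entry new_entry)
{
	assert(slot != NULL);
	__atomic_store(slot, &new_entry, __ATOMIC_RELAXED);
}

/* Read an entry as one atomic access. */
static inline struct tbl_entry
tbl_entry_load(struct tbl_entry *slot)
{
	struct tbl_entry e;

	assert(slot != NULL);
	__atomic_load(slot, &e, __ATOMIC_RELAXED);
	return e;
}
```

Relaxed ordering only guarantees that the entry itself is not torn. It does not order this store against writes to other entries, so any such ordering requirement would need a stronger memory order or a separate barrier.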