> On Mon, 29 May 2023 20:25:01 +0530 > <pbhagavat...@marvell.com> wrote: > > > + return (k1->id_key_len != k2->id_key_len) || > > + (k1->key_len == IPV4_KEYLEN ? k1->src_dst[0] != k2->src_dst[0] : > > + rte_hash_k32_cmp_eq(k1, k2, > 32)); > > If you make another version, one small comment. > Breaking this into a couple of if statements would make reading easier > for human readers. Compiler doesn't care.
I have modified the above code to if (k1->id_key_len != k2->id_key_len) return 1; if (k1->key_len == IPV4_KEYLEN) return k1->src_dst[0] != k2->src_dst[0]; else return rte_hash_k32_cmp_eq(k1, k2, 32); But upon remeasuring performance I see a performance loss of 1.2% Compiler(GCC 10) generates additional branches with the above code. I have also profiled the ip_reassembly application with and without the changes and see lot of additional branch misses Current implementation: ============== Branch Metrics ============== Branch MPKI : 0.159 Branch PKI : 156.566 Branch Mis-prediction Rate : 0.101 INST_RETIRED : ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 9.493B BR_RETIRED : ▇▇▇▇▇▇▇ 1.486B BR_MIS_PRED_RETIRED: ▏ 1.508M BR_IMMED_SPEC : ▇▇▇▇▇▇▇ 1.395B BR_RETURN_SPEC : ▏ 105.203M BR_INDIRECT_SPEC : ▏ 106.044M Modified implementation: ============== Branch Metrics ============== Branch MPKI : 0.282 Branch PKI : 156.566 Branch Mis-prediction Rate : 0.180 INST_RETIRED : ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 9.444B BR_RETIRED : ▇▇▇▇▇▇▇ 1.479B BR_MIS_PRED_RETIRED: ▏ 2.662M BR_IMMED_SPEC : ▇▇▇▇▇▇▇ 1.388B BR_RETURN_SPEC : ▏ 104.518M BR_INDIRECT_SPEC : ▏ 105.354M I will retain the current implementation in the next patch. Thanks, Pavan.