> On Mon, 29 May 2023 20:25:01 +0530
> <pbhagavat...@marvell.com> wrote:
> 
> > +   return (k1->id_key_len != k2->id_key_len) ||
> > +          (k1->key_len == IPV4_KEYLEN ? k1->src_dst[0] != k2->src_dst[0] :
> > +                                        rte_hash_k32_cmp_eq(k1, k2,
> 32));
> 
> If you make another version, one small comment.
> Breaking this into a couple of if statements would make reading easier
> for human readers. Compiler doesn't care.

I have modified the above code to 

       if (k1->id_key_len != k2->id_key_len)
               return 1;
       if (k1->key_len == IPV4_KEYLEN)
               return k1->src_dst[0] != k2->src_dst[0];
       else
               return rte_hash_k32_cmp_eq(k1, k2, 32);

But upon remeasuring performance I see a performance loss of 1.2%
Compiler(GCC 10) generates additional branches with the above code.

I have also profiled the ip_reassembly application with and without the changes 
and see lot of 
additional branch misses


Current implementation:

==============
Branch Metrics
==============
Branch MPKI                                                  : 0.159            
 
Branch PKI                                                   : 156.566          
 
Branch Mis-prediction Rate                                   : 0.101            
 

INST_RETIRED       : ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 9.493B
BR_RETIRED         : ▇▇▇▇▇▇▇ 1.486B
BR_MIS_PRED_RETIRED: ▏ 1.508M
BR_IMMED_SPEC      : ▇▇▇▇▇▇▇ 1.395B
BR_RETURN_SPEC     : ▏ 105.203M
BR_INDIRECT_SPEC   : ▏ 106.044M

Modified implementation:

==============
Branch Metrics
==============
Branch MPKI                                                  : 0.282            
 
Branch PKI                                                   : 156.566          
 
Branch Mis-prediction Rate                                   : 0.180            
 

INST_RETIRED       : ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 9.444B
BR_RETIRED         : ▇▇▇▇▇▇▇ 1.479B
BR_MIS_PRED_RETIRED: ▏ 2.662M
BR_IMMED_SPEC      : ▇▇▇▇▇▇▇ 1.388B
BR_RETURN_SPEC     : ▏ 104.518M
BR_INDIRECT_SPEC   : ▏ 105.354M


I will retain the current implementation in the next patch.

Thanks,
Pavan.

Reply via email to