On Fri, 16 Feb 2018 15:44:13 +0000
alangordonde...@gmail.com wrote:

> -	qindex = tc_index * 4;
> -
> -	pipe->wrr_tokens[qindex] = (grinder->wrr_tokens[0] & grinder->wrr_mask[0])
> -		>> RTE_SCHED_WRR_SHIFT;
> -	pipe->wrr_tokens[qindex + 1] = (grinder->wrr_tokens[1] & grinder->wrr_mask[1])
> -		>> RTE_SCHED_WRR_SHIFT;
> -	pipe->wrr_tokens[qindex + 2] = (grinder->wrr_tokens[2] & grinder->wrr_mask[2])
> -		>> RTE_SCHED_WRR_SHIFT;
> -	pipe->wrr_tokens[qindex + 3] = (grinder->wrr_tokens[3] & grinder->wrr_mask[3])
> -		>> RTE_SCHED_WRR_SHIFT;
> +	uint32_t q;
> +	uint8_t tokens;
> +
> +	qindex = tc_index * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS;
> +	for (q = 0; q < RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS; q++) {
> +		tokens = (grinder->wrr_tokens[q] & grinder->wrr_mask[q]) >>
> +			RTE_SCHED_WRR_SHIFT;
> +		pipe->wrr_tokens[qindex + q] = tokens;

You could use a #pragma to tell the compiler to unroll the loop, which should make it as fast as the original.
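
A rough sketch of that suggestion follows. It is not taken from the patch: the demo_* structs and DEMO_* constants are hypothetical stand-ins for the scheduler's grinder and pipe types, and their sizes and the shift value are assumptions. The pragma spellings are compiler-specific ("#pragma GCC unroll N" on GCC 8 and later, "#pragma unroll" on Clang), so they are guarded by compiler checks.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the scheduler definitions; values are assumptions. */
#define DEMO_QUEUES_PER_TC 4
#define DEMO_TCS_PER_PIPE  4
#define DEMO_WRR_SHIFT     3

struct demo_grinder {
	uint16_t wrr_tokens[DEMO_QUEUES_PER_TC];
	uint16_t wrr_mask[DEMO_QUEUES_PER_TC];
};

struct demo_pipe {
	uint8_t wrr_tokens[DEMO_TCS_PER_PIPE * DEMO_QUEUES_PER_TC];
};

/* Copy the masked, shifted WRR tokens of one traffic class back to the pipe. */
static void
demo_save_wrr_tokens(struct demo_pipe *pipe, const struct demo_grinder *grinder,
		     uint32_t tc_index)
{
	uint32_t qindex, q;
	uint8_t tokens;

	assert(tc_index < DEMO_TCS_PER_PIPE);
	qindex = tc_index * DEMO_QUEUES_PER_TC;

	/* Unroll hint only; the literal 4 must match DEMO_QUEUES_PER_TC. */
#if defined(__clang__)
#pragma unroll
#elif defined(__GNUC__) && (__GNUC__ >= 8)
#pragma GCC unroll 4
#endif
	for (q = 0; q < DEMO_QUEUES_PER_TC; q++) {
		tokens = (grinder->wrr_tokens[q] & grinder->wrr_mask[q]) >>
			DEMO_WRR_SHIFT;
		pipe->wrr_tokens[qindex + q] = tokens;
	}
}
```

The pragma is a hint, not a guarantee, and with a compile-time constant trip count of four the optimizer may already unroll the loop on its own at higher optimization levels. Comparing the generated assembly with and without the pragma would confirm whether it is needed.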