Hi, while reading the DPDK DLB eventdev driver code, I found that it uses the cldemote instruction in dlb_recv_qe(), but I don't understand why it is used there. As I understand it, cldemote is a hint asking the CPU to move a cache line to a more distant cache level (from the core's private caches toward the shared cache), which helps accelerate core-to-core communication. But who will use the memory at cache_line_base afterwards?
>static __rte_always_inline int dlb_recv_qe(struct dlb_port *qm_port, struct dlb_dequeue_qe *qe, uint8_t *offset)
>{
>
>	cq_addr = dlb_port[qm_port->id][PORT_TYPE(qm_port)].cq_base;
>	cq_addr = &cq_addr[qm_port->cq_idx];
>	cache_line_base = (void *)(((uintptr_t)cq_addr) & ~0x3F);
>	*offset = ((uintptr_t)cq_addr & 0x30) >> 4;
>	/* Load the next CQ cache line from memory. Pack these reads as tight
>	 * as possible to reduce the chance that DLB invalidates the line while
>	 * the CPU is reading it. Read the cache line backwards to ensure that
>	 * if QE[N] (N > 0) is valid, then QEs[0:N-1] are too.
>	 *
>	 * (Valid QEs start at &qe[offset])
>	 */
>	qes[3] = _mm_load_si128((__m128i *)&cache_line_base[6]);
>	qes[2] = _mm_load_si128((__m128i *)&cache_line_base[4]);
>	qes[1] = _mm_load_si128((__m128i *)&cache_line_base[2]);
>	qes[0] = _mm_load_si128((__m128i *)&cache_line_base[0]);
>
>	/* Evict the cache line ASAP */
>	rte_cldemote(cache_line_base);

Thanks.
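P.S. In case it helps anyone answering, this is how I read the address arithmetic in the quoted function. It is only a standalone sketch of the two mask operations, not driver code: the helper names line_base() and qe_slot() and the two size macros are my own, and I am assuming 64-byte cache lines holding four 16-byte QEs, which is what the 0x3F and 0x30 masks seem to imply.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions (not taken from the driver headers): 64-byte cache lines,
 * 16-byte dequeue QEs, hence four QE slots per line. */
#define CACHE_LINE_SIZE 64u
#define QE_SIZE 16u

static_assert(CACHE_LINE_SIZE / QE_SIZE == 4, "expect four QEs per cache line");

/* Hypothetical helper: round an address down to the start of its cache
 * line. Equivalent to the `& ~0x3F` in the quoted code. */
static inline uintptr_t line_base(uintptr_t addr)
{
	return addr & ~(uintptr_t)(CACHE_LINE_SIZE - 1);
}

/* Hypothetical helper: index (0..3) of the 16-byte slot that `addr` falls
 * in within its cache line. Equivalent to `(addr & 0x30) >> 4`. */
static inline uint8_t qe_slot(uintptr_t addr)
{
	return (uint8_t)((addr & 0x30) >> 4);
}
```

So for a CQ entry at address 0x1030, line_base() gives 0x1000 and qe_slot() gives 3, while the next entry at 0x1040 starts a new line with slot 0. If that reading is right, rte_cldemote() is always applied to the whole 64-byte line containing the current CQ entry, not to a single QE.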