John Fastabend <john.fastab...@gmail.com> writes:

> Jesper Dangaard Brouer wrote:
>> On Wed, 27 Jan 2021 13:20:50 +0100
>> Maciej Fijalkowski <maciej.fijalkow...@intel.com> wrote:
>> 
>> > On Wed, Jan 27, 2021 at 10:41:44AM +0100, Toke Høiland-Jørgensen wrote:
>> > > John Fastabend <john.fastab...@gmail.com> writes:
>> > >   
>> > > > Hangbin Liu wrote:  
>> > > >> From: Jesper Dangaard Brouer <bro...@redhat.com>
>> > > >> 
>> > > >> This changes the devmap XDP program support to run the program when the
>> > > >> bulk queue is flushed instead of before the frame is enqueued. This has
>> > > >> a couple of benefits:
>> > > >> 
>> > > >> - It "sorts" the packets by destination devmap entry, and then runs the
>> > > >>   same BPF program on all the packets in sequence. This ensures that we
>> > > >>   keep the XDP program and destination device properties hot in I-cache.
>> > > >> 
>> > > >> - It makes the multicast implementation simpler because it can just
>> > > >>   enqueue packets using bq_enqueue() without having to deal with the
>> > > >>   devmap program at all.
>> > > >> 
>> > > >> The drawback is that if the devmap program drops the packet, the enqueue
>> > > >> step is redundant. However, arguably this is mostly visible in a
>> > > >> micro-benchmark, and with more mixed traffic the I-cache benefit should
>> > > >> win out. The performance impact of just this patch is as follows:
>> > > >> 
>> > > >> The bq_xmit_all's logic is also refactored and error label is removed.
>> > > >> When bq_xmit_all() is called from bq_enqueue(), another packet will
>> > > >> always be enqueued immediately after, so clearing dev_rx, xdp_prog and
>> > > >> flush_node in bq_xmit_all() is redundant. Let's move the clear to
>> > > >> __dev_flush(), and only check them once in bq_enqueue() since they are
>> > > >> all modified together.
>> > > >> 
>> > > >> By using xdp_redirect_map in sample/bpf and send pkts via pktgen cmd:
>> > > >> ./pktgen_sample03_burst_single_flow.sh -i eno1 -d $dst_ip -m $dst_mac -t 10 -s 64
>> > > >> 
>> > > >> There are about +/- 0.1M deviation for native testing, the performance
>> > > >> improved for the base-case, but some drop back with xdp devmap prog attached.
>> > > >> 
>> > > >> Version          | Test                           | Generic | Native | Native + 2nd xdp_prog
>> > > >> 5.10 rc6         | xdp_redirect_map   i40e->i40e  |    2.0M |   9.1M |  8.0M
>> > > >> 5.10 rc6         | xdp_redirect_map   i40e->veth  |    1.7M |  11.0M |  9.7M
>> > > >> 5.10 rc6 + patch | xdp_redirect_map   i40e->i40e  |    2.0M |   9.5M |  7.5M
>> > > >> 5.10 rc6 + patch | xdp_redirect_map   i40e->veth  |    1.7M |  11.6M |  9.1M
>> > > >>   
>> > > >
>> > > > [...]
>
> Acked-by: John Fastabend <john.fastab...@gmail.com>
>
>> > > >>  static void bq_xmit_all(struct xdp_dev_bulk_queue *bq, u32 flags)
>> > > >>  {
>> > > >>       struct net_device *dev = bq->dev;
>> > > >> -     int sent = 0, drops = 0, err = 0;
>> > > >> +     unsigned int cnt = bq->count;
>> > > >> +     int drops = 0, err = 0;
>> > > >> +     int to_send = cnt;
>> > > >> +     int sent = cnt;
>> > > >>       int i;
>> > > >>  
>> > > >> -     if (unlikely(!bq->count))
>> > > >> +     if (unlikely(!cnt))
>> > > >>               return;
>> > > >>  
>> > > >> -     for (i = 0; i < bq->count; i++) {
>> > > >> +     for (i = 0; i < cnt; i++) {
>> > > >>               struct xdp_frame *xdpf = bq->q[i];
>> > > >>  
>> > > >>               prefetch(xdpf);
>> > > >>       }
>> > > >>  
>> > > >> -     sent = dev->netdev_ops->ndo_xdp_xmit(dev, bq->count, bq->q, flags);
>> > > >> +     if (bq->xdp_prog) {
>> > > >> +             to_send = dev_map_bpf_prog_run(bq->xdp_prog, bq->q, cnt, dev);
>> > > >> +             if (!to_send) {
>> > > >> +                     sent = 0;
>> > > >> +                     goto out;
>> > > >> +             }
>> > > >> +             drops = cnt - to_send;
>> > > >> +     }  
>> > > >
>> > > > I might be missing something about how *bq works here. What happens when
>> > > > dev_map_bpf_prog_run returns to_send < cnt?
>> > > >
>> > > > So I read this as it will send [0, to_send] and [to_send, cnt] will be
>> > > > dropped? How do we know the bpf prog would have dropped the set,
>> > > > [to_send+1, cnt]?  
>> > 
>> > You know that via recalculation of 'drops' value after you returned from
>> > dev_map_bpf_prog_run() which later on is provided onto trace_xdp_devmap_xmit.
>> > 
>> > > 
>> > > Because dev_map_bpf_prog_run() compacts the array:
>> > > 
>> > > +                case XDP_PASS:
>> > > +                        err = xdp_update_frame_from_buff(&xdp, xdpf);
>> > > +                        if (unlikely(err < 0))
>> > > +                                xdp_return_frame_rx_napi(xdpf);
>> > > +                        else
>> > > +                                frames[nframes++] = xdpf;
>> > > +                        break;  
>> > 
>> > To expand this a little, 'frames' array is reused and 'nframes' above is
>> > the value that is returned and we store it onto 'to_send' variable.
>> > 
>
> In the morning with coffee looks good to me. Thanks Toke, Jesper.

Haha, yeah, coffee does tend to help, doesn't it? You're welcome :)

-Toke
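
For anyone reading this in the archive later, the compaction discussed
above can be sketched in plain userspace C. This is only an illustration
and not the kernel code: struct frame, enum verdict, run_prog() and
prog_run_compact() are hypothetical stand-ins for struct xdp_frame, the
XDP verdicts, the devmap program and dev_map_bpf_prog_run(). Frames that
pass are written back to the front of the same array, so the returned
count plays the role of 'to_send' and no surviving frame is left in the
tail.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct xdp_frame; only the field needed here. */
struct frame {
	int len;
};

/* Hypothetical verdicts, loosely mirroring XDP_DROP and XDP_PASS. */
enum verdict { VERDICT_DROP, VERDICT_PASS };

/* Hypothetical per-frame program: pass only frames of at least 64 bytes. */
static enum verdict run_prog(const struct frame *f)
{
	return f->len >= 64 ? VERDICT_PASS : VERDICT_DROP;
}

/*
 * Run the program over frames[0..n) and compact the survivors to the
 * front of the same array. Returns the number of frames kept, so the
 * caller can send frames[0..kept) and count n - kept as drops.
 */
static size_t prog_run_compact(struct frame **frames, size_t n)
{
	size_t nframes = 0;

	for (size_t i = 0; i < n; i++) {
		if (run_prog(frames[i]) == VERDICT_PASS)
			frames[nframes++] = frames[i];
		/* A dropped frame would be returned to its allocator here. */
	}
	return nframes;
}
```

With four queued frames of lengths 64, 10, 128 and 20, the sketch keeps
the first and third frame in slots 0 and 1 and returns 2, which
corresponds to 'drops = cnt - to_send' being 2 in the patch.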
