[PATCH net-next v1 0/7] virtnet_net: prepare for af-xdp
This patch set prepares for supporting af-xdp zerocopy.
There is no feature change in this patch set.
I only want to reduce the number of patches in the final patch set,
so I split these patches out.

#1-#3 add an independent directory for virtio-net
#4-#7 do some refactoring; the sub-functions will be used by the subsequent
      commits

Thanks.

v1:
    1. resend for the new net-next merge window

Xuan Zhuo (7):
  virtio_net: independent directory
  virtio_net: move core structures to virtio_net.h
  virtio_net: add prefix virtnet to all struct inside virtio_net.h
  virtio_net: separate virtnet_rx_resize()
  virtio_net: separate virtnet_tx_resize()
  virtio_net: separate receive_mergeable
  virtio_net: separate receive_buf

 MAINTAINERS                                   |   2 +-
 drivers/net/Kconfig                           |   9 +-
 drivers/net/Makefile                          |   2 +-
 drivers/net/virtio/Kconfig                    |  12 +
 drivers/net/virtio/Makefile                   |   8 +
 drivers/net/virtio/virtnet.h                  | 248
 .../{virtio_net.c => virtio/virtnet_main.c}   | 536 ++
 7 files changed, 454 insertions(+), 363 deletions(-)
 create mode 100644 drivers/net/virtio/Kconfig
 create mode 100644 drivers/net/virtio/Makefile
 create mode 100644 drivers/net/virtio/virtnet.h
 rename drivers/net/{virtio_net.c => virtio/virtnet_main.c} (94%)

--
2.32.0.3.g01195cf9f
[PATCH net-next v1 1/7] virtio_net: independent directory
Create a separate directory for virtio-net. AF_XDP support will be added
later, then a separate xsk.c file will be added, so we should create a
directory for virtio-net.

Signed-off-by: Xuan Zhuo
Acked-by: Jason Wang
---
 MAINTAINERS                                         |  2 +-
 drivers/net/Kconfig                                 |  9 +
 drivers/net/Makefile                                |  2 +-
 drivers/net/virtio/Kconfig                          | 12
 drivers/net/virtio/Makefile                         |  8
 drivers/net/{virtio_net.c => virtio/virtnet_main.c} |  0
 6 files changed, 23 insertions(+), 10 deletions(-)
 create mode 100644 drivers/net/virtio/Kconfig
 create mode 100644 drivers/net/virtio/Makefile
 rename drivers/net/{virtio_net.c => virtio/virtnet_main.c} (100%)

diff --git a/MAINTAINERS b/MAINTAINERS
index 27367ad339ea..e426fdbaacb8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -23776,7 +23776,7 @@ F:	Documentation/devicetree/bindings/virtio/
 F:	Documentation/driver-api/virtio/
 F:	drivers/block/virtio_blk.c
 F:	drivers/crypto/virtio/
-F:	drivers/net/virtio_net.c
+F:	drivers/net/virtio/
 F:	drivers/vdpa/
 F:	drivers/virtio/
 F:	include/linux/vdpa.h
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 9920b3a68ed1..b80793a0bd17 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -443,14 +443,7 @@ config VETH
 	  When one end receives the packet it appears on its pair and vice
 	  versa.
 
-config VIRTIO_NET
-	tristate "Virtio network driver"
-	depends on VIRTIO
-	select NET_FAILOVER
-	select DIMLIB
-	help
-	  This is the virtual network driver for virtio. It can be used with
-	  QEMU based VMMs (like KVM or Xen). Say Y or M.
+source "drivers/net/virtio/Kconfig"
 
 config NLMON
 	tristate "Virtual netlink monitoring device"
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 13743d0e83b5..505385d7f6b7 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -32,7 +32,7 @@ obj-$(CONFIG_NET_TEAM) += team/
 obj-$(CONFIG_TUN) += tun.o
 obj-$(CONFIG_TAP) += tap.o
 obj-$(CONFIG_VETH) += veth.o
-obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
+obj-$(CONFIG_VIRTIO_NET) += virtio/
 obj-$(CONFIG_VXLAN) += vxlan/
 obj-$(CONFIG_GENEVE) += geneve.o
 obj-$(CONFIG_BAREUDP) += bareudp.o
diff --git a/drivers/net/virtio/Kconfig b/drivers/net/virtio/Kconfig
new file mode 100644
index ..e162535ca213
--- /dev/null
+++ b/drivers/net/virtio/Kconfig
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# virtio-net device configuration
+#
+config VIRTIO_NET
+	tristate "Virtio network driver"
+	depends on VIRTIO
+	select NET_FAILOVER
+	select DIMLIB
+	help
+	  This is the virtual network driver for virtio. It can be used with
+	  QEMU based VMMs (like KVM or Xen). Say Y or M.
diff --git a/drivers/net/virtio/Makefile b/drivers/net/virtio/Makefile
new file mode 100644
index ..c4602337c78c
--- /dev/null
+++ b/drivers/net/virtio/Makefile
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for the virtio network device drivers.
+#
+
+obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
+
+virtio_net-y := virtnet_main.o
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio/virtnet_main.c
similarity index 100%
rename from drivers/net/virtio_net.c
rename to drivers/net/virtio/virtnet_main.c
--
2.32.0.3.g01195cf9f
[PATCH net-next v1 3/7] virtio_net: add prefix virtnet to all struct inside virtio_net.h
We moved some structures to the header file, but these structures are not
prefixed with virtnet. This patch adds the virtnet prefix to them.

Signed-off-by: Xuan Zhuo
Acked-by: Jason Wang
---
 drivers/net/virtio/virtnet.h      |  12 ++--
 drivers/net/virtio/virtnet_main.c | 110 +++---
 2 files changed, 61 insertions(+), 61 deletions(-)

diff --git a/drivers/net/virtio/virtnet.h b/drivers/net/virtio/virtnet.h
index e46db9491605..d4cc4ddb0786 100644
--- a/drivers/net/virtio/virtnet.h
+++ b/drivers/net/virtio/virtnet.h
@@ -51,8 +51,8 @@ struct virtnet_rq_dma {
 };
 
 /* Internal representation of a send virtqueue */
-struct send_queue {
-	/* Virtqueue associated with this send _queue */
+struct virtnet_sq {
+	/* Virtqueue associated with this virtnet_sq */
 	struct virtqueue *vq;
 
 	/* TX: fragments + linear part + virtio header */
@@ -72,8 +72,8 @@ struct send_queue {
 };
 
 /* Internal representation of a receive virtqueue */
-struct receive_queue {
-	/* Virtqueue associated with this receive_queue */
+struct virtnet_rq {
+	/* Virtqueue associated with this virtnet_rq */
 	struct virtqueue *vq;
 
 	struct napi_struct napi;
@@ -144,8 +144,8 @@ struct virtnet_info {
 	struct virtio_device *vdev;
 	struct virtqueue *cvq;
 	struct net_device *dev;
-	struct send_queue *sq;
-	struct receive_queue *rq;
+	struct virtnet_sq *sq;
+	struct virtnet_rq *rq;
 	unsigned int status;
 
 	/* Max # of queue pairs supported by the device */
diff --git a/drivers/net/virtio/virtnet_main.c b/drivers/net/virtio/virtnet_main.c
index faff5d719440..fc28fadf944e 100644
--- a/drivers/net/virtio/virtnet_main.c
+++ b/drivers/net/virtio/virtnet_main.c
@@ -275,7 +275,7 @@ static struct xdp_frame *ptr_to_xdp(void *ptr)
 	return (struct xdp_frame *)((unsigned long)ptr & ~VIRTIO_XDP_FLAG);
 }
 
-static void __free_old_xmit(struct send_queue *sq, bool in_napi,
+static void __free_old_xmit(struct virtnet_sq *sq, bool in_napi,
 			    struct virtnet_sq_free_stats *stats)
 {
 	unsigned int len;
@@ -344,7 +344,7 @@ skb_vnet_common_hdr(struct sk_buff *skb)
  * private is used to chain pages for big packets, put the whole
  * most recent used list in the beginning for reuse
  */
-static void give_pages(struct receive_queue *rq, struct page *page)
+static void give_pages(struct virtnet_rq *rq, struct page *page)
 {
 	struct page *end;
 
@@ -354,7 +354,7 @@ static void give_pages(struct receive_queue *rq, struct page *page)
 	rq->pages = page;
 }
 
-static struct page *get_a_page(struct receive_queue *rq, gfp_t gfp_mask)
+static struct page *get_a_page(struct virtnet_rq *rq, gfp_t gfp_mask)
 {
 	struct page *p = rq->pages;
 
@@ -368,7 +368,7 @@ static struct page *get_a_page(struct receive_queue *rq, gfp_t gfp_mask)
 }
 
 static void virtnet_rq_free_buf(struct virtnet_info *vi,
-				struct receive_queue *rq, void *buf)
+				struct virtnet_rq *rq, void *buf)
 {
 	if (vi->mergeable_rx_bufs)
 		put_page(virt_to_head_page(buf));
@@ -483,7 +483,7 @@ static struct sk_buff *virtnet_build_skb(void *buf, unsigned int buflen,
 
 /* Called from bottom half context */
 static struct sk_buff *page_to_skb(struct virtnet_info *vi,
-				   struct receive_queue *rq,
+				   struct virtnet_rq *rq,
 				   struct page *page, unsigned int offset,
 				   unsigned int len, unsigned int truesize,
 				   unsigned int headroom)
@@ -581,7 +581,7 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
 	return skb;
 }
 
-static void virtnet_rq_unmap(struct receive_queue *rq, void *buf, u32 len)
+static void virtnet_rq_unmap(struct virtnet_rq *rq, void *buf, u32 len)
 {
 	struct page *page = virt_to_head_page(buf);
 	struct virtnet_rq_dma *dma;
@@ -610,7 +610,7 @@ static void virtnet_rq_unmap(struct receive_queue *rq, void *buf, u32 len)
 	put_page(page);
 }
 
-static void *virtnet_rq_get_buf(struct receive_queue *rq, u32 *len, void **ctx)
+static void *virtnet_rq_get_buf(struct virtnet_rq *rq, u32 *len, void **ctx)
 {
 	void *buf;
 
@@ -621,7 +621,7 @@ static void *virtnet_rq_get_buf(struct receive_queue *rq, u32 *len, void **ctx)
 	return buf;
 }
 
-static void virtnet_rq_init_one_sg(struct receive_queue *rq, void *buf, u32 len)
+static void virtnet_rq_init_one_sg(struct virtnet_rq *rq, void *buf, u32 len)
 {
 	struct virtnet_rq_dma *dma;
 	dma_addr_t addr;
@@ -641,7 +641,7 @@ static void virtnet_rq_init_one_sg(struct receive_queue *rq, void *buf, u32 len)
 	rq->sg[0].length = len;
 }
 
-static void *virtnet_rq_alloc(struct receive_queue *rq, u32 size, gfp_t gfp)
+static void *virtnet_rq_alloc(struct virtnet_rq *rq, u32 size, gfp_t gfp)
[PATCH net-next v1 5/7] virtio_net: separate virtnet_tx_resize()
This patch separates two sub-functions from virtnet_tx_resize():

* virtnet_tx_pause
* virtnet_tx_resume

Then the subsequent virtnet_tx_reset() can share these two functions.

Signed-off-by: Xuan Zhuo
Acked-by: Jason Wang
---
 drivers/net/virtio/virtnet.h      |  2 ++
 drivers/net/virtio/virtnet_main.c | 35 +--
 2 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/drivers/net/virtio/virtnet.h b/drivers/net/virtio/virtnet.h
index b0f2ae4bd1c4..b56ebc7fcdcc 100644
--- a/drivers/net/virtio/virtnet.h
+++ b/drivers/net/virtio/virtnet.h
@@ -239,4 +239,6 @@ struct virtnet_info {
 
 void virtnet_rx_pause(struct virtnet_info *vi, struct virtnet_rq *rq);
 void virtnet_rx_resume(struct virtnet_info *vi, struct virtnet_rq *rq);
+void virtnet_tx_pause(struct virtnet_info *vi, struct virtnet_sq *sq);
+void virtnet_tx_resume(struct virtnet_info *vi, struct virtnet_sq *sq);
 #endif
diff --git a/drivers/net/virtio/virtnet_main.c b/drivers/net/virtio/virtnet_main.c
index a2bf576e644c..285443da040c 100644
--- a/drivers/net/virtio/virtnet_main.c
+++ b/drivers/net/virtio/virtnet_main.c
@@ -2416,12 +2416,11 @@ static int virtnet_rx_resize(struct virtnet_info *vi,
 	return err;
 }
 
-static int virtnet_tx_resize(struct virtnet_info *vi,
-			     struct virtnet_sq *sq, u32 ring_num)
+void virtnet_tx_pause(struct virtnet_info *vi, struct virtnet_sq *sq)
 {
 	bool running = netif_running(vi->dev);
 	struct netdev_queue *txq;
-	int err, qindex;
+	int qindex;
 
 	qindex = sq - vi->sq;
 
@@ -2442,10 +2441,17 @@ static int virtnet_tx_resize(struct virtnet_info *vi,
 	netif_stop_subqueue(vi->dev, qindex);
 
 	__netif_tx_unlock_bh(txq);
+}
 
-	err = virtqueue_resize(sq->vq, ring_num, virtnet_sq_free_unused_buf);
-	if (err)
-		netdev_err(vi->dev, "resize tx fail: tx queue index: %d err: %d\n", qindex, err);
+void virtnet_tx_resume(struct virtnet_info *vi, struct virtnet_sq *sq)
+{
+	bool running = netif_running(vi->dev);
+	struct netdev_queue *txq;
+	int qindex;
+
+	qindex = sq - vi->sq;
+
+	txq = netdev_get_tx_queue(vi->dev, qindex);
 
 	__netif_tx_lock_bh(txq);
 	sq->reset = false;
@@ -2454,6 +2460,23 @@ static int virtnet_tx_resize(struct virtnet_info *vi,
 
 	if (running)
 		virtnet_napi_tx_enable(vi, sq->vq, &sq->napi);
+}
+
+static int virtnet_tx_resize(struct virtnet_info *vi, struct virtnet_sq *sq,
+			     u32 ring_num)
+{
+	int qindex, err;
+
+	qindex = sq - vi->sq;
+
+	virtnet_tx_pause(vi, sq);
+
+	err = virtqueue_resize(sq->vq, ring_num, virtnet_sq_free_unused_buf);
+	if (err)
+		netdev_err(vi->dev, "resize tx fail: tx queue index: %d err: %d\n", qindex, err);
+
+	virtnet_tx_resume(vi, sq);
+
 	return err;
 }
 
--
2.32.0.3.g01195cf9f
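The pause/resume split can be pictured with a small userspace model. This is only a sketch under stated assumptions, not kernel code: struct toy_queue and the toy_* functions are hypothetical names invented here, and the "ring operation" is reduced to updating an integer. It shows why carving out the two helpers lets a second operation (a reset) share the bracketing that resize already needs, and that the queue is resumed even when the operation fails, as in virtnet_tx_resize() above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a TX queue; not the kernel's struct virtnet_sq. */
struct toy_queue {
	bool paused;		/* true between toy_pause() and toy_resume() */
	unsigned int ring_num;	/* current ring size */
	unsigned int resets;	/* number of completed resets */
};

/* Stop traffic on the queue so the ring can be modified safely. */
void toy_pause(struct toy_queue *q)
{
	assert(!q->paused);
	q->paused = true;
}

/* Let traffic flow again once the ring operation has finished. */
void toy_resume(struct toy_queue *q)
{
	assert(q->paused);
	q->paused = false;
}

/* Resize becomes "pause, operate, resume"; it resumes even on error. */
int toy_resize(struct toy_queue *q, unsigned int ring_num)
{
	int err = 0;

	toy_pause(q);
	if (ring_num == 0)
		err = -1;	/* toy failure: reject an empty ring */
	else
		q->ring_num = ring_num;
	toy_resume(q);

	return err;
}

/* A second ring operation reuses the same two helpers. */
int toy_reset(struct toy_queue *q)
{
	toy_pause(q);
	q->resets++;
	toy_resume(q);

	return 0;
}
```

A caller would initialise a struct toy_queue and call toy_resize() or toy_reset(); in both cases the queue is left unpaused afterwards, whether or not the operation succeeded.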
[PATCH net-next v1 4/7] virtio_net: separate virtnet_rx_resize()
This patch separates two sub-functions from virtnet_rx_resize():

* virtnet_rx_pause
* virtnet_rx_resume

Then the subsequent reset rx for xsk can share these two functions.

Signed-off-by: Xuan Zhuo
Acked-by: Jason Wang
---
 drivers/net/virtio/virtnet.h      |  3 +++
 drivers/net/virtio/virtnet_main.c | 29 +
 2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/drivers/net/virtio/virtnet.h b/drivers/net/virtio/virtnet.h
index d4cc4ddb0786..b0f2ae4bd1c4 100644
--- a/drivers/net/virtio/virtnet.h
+++ b/drivers/net/virtio/virtnet.h
@@ -236,4 +236,7 @@ struct virtnet_info {
 
 	u64 device_stats_cap;
 };
+
+void virtnet_rx_pause(struct virtnet_info *vi, struct virtnet_rq *rq);
+void virtnet_rx_resume(struct virtnet_info *vi, struct virtnet_rq *rq);
 #endif
diff --git a/drivers/net/virtio/virtnet_main.c b/drivers/net/virtio/virtnet_main.c
index fc28fadf944e..a2bf576e644c 100644
--- a/drivers/net/virtio/virtnet_main.c
+++ b/drivers/net/virtio/virtnet_main.c
@@ -2378,28 +2378,41 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	return NETDEV_TX_OK;
 }
 
-static int virtnet_rx_resize(struct virtnet_info *vi,
-			     struct virtnet_rq *rq, u32 ring_num)
+void virtnet_rx_pause(struct virtnet_info *vi, struct virtnet_rq *rq)
 {
 	bool running = netif_running(vi->dev);
-	int err, qindex;
-
-	qindex = rq - vi->rq;
 
 	if (running) {
 		napi_disable(&rq->napi);
 		cancel_work_sync(&rq->dim.work);
 	}
+}
 
-	err = virtqueue_resize(rq->vq, ring_num, virtnet_rq_unmap_free_buf);
-	if (err)
-		netdev_err(vi->dev, "resize rx fail: rx queue index: %d err: %d\n", qindex, err);
+void virtnet_rx_resume(struct virtnet_info *vi, struct virtnet_rq *rq)
+{
+	bool running = netif_running(vi->dev);
 
 	if (!try_fill_recv(vi, rq, GFP_KERNEL))
 		schedule_delayed_work(&vi->refill, 0);
 
 	if (running)
 		virtnet_napi_enable(rq->vq, &rq->napi);
+}
+
+static int virtnet_rx_resize(struct virtnet_info *vi,
+			     struct virtnet_rq *rq, u32 ring_num)
+{
+	int err, qindex;
+
+	qindex = rq - vi->rq;
+
+	virtnet_rx_pause(vi, rq);
+
+	err = virtqueue_resize(rq->vq, ring_num, virtnet_rq_unmap_free_buf);
+	if (err)
+		netdev_err(vi->dev, "resize rx fail: rx queue index: %d err: %d\n", qindex, err);
+
+	virtnet_rx_resume(vi, rq);
 	return err;
 }
 
--
2.32.0.3.g01195cf9f
[PATCH net-next v1 2/7] virtio_net: move core structures to virtio_net.h
Move the definitions of some core structures (send_queue, receive_queue,
virtnet_info) and of the related structures into the virtio_net.h file, so
that they can be used by the other C source files.

Signed-off-by: Xuan Zhuo
Acked-by: Jason Wang
---
 drivers/net/virtio/virtnet.h      | 239 ++
 drivers/net/virtio/virtnet_main.c | 235 +
 2 files changed, 241 insertions(+), 233 deletions(-)
 create mode 100644 drivers/net/virtio/virtnet.h

diff --git a/drivers/net/virtio/virtnet.h b/drivers/net/virtio/virtnet.h
new file mode 100644
index ..e46db9491605
--- /dev/null
+++ b/drivers/net/virtio/virtnet.h
@@ -0,0 +1,239 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+#ifndef __VIRTIO_NET_H__
+#define __VIRTIO_NET_H__
+
+#include
+#include
+
+/* RX packet size EWMA. The average packet size is used to determine the packet
+ * buffer size when refilling RX rings. As the entire RX ring may be refilled
+ * at once, the weight is chosen so that the EWMA will be insensitive to short-
+ * term, transient changes in packet size.
+ */
+DECLARE_EWMA(pkt_len, 0, 64)
+
+struct virtnet_sq_stats {
+	struct u64_stats_sync syncp;
+	u64_stats_t packets;
+	u64_stats_t bytes;
+	u64_stats_t xdp_tx;
+	u64_stats_t xdp_tx_drops;
+	u64_stats_t kicks;
+	u64_stats_t tx_timeouts;
+	u64_stats_t stop;
+	u64_stats_t wake;
+};
+
+struct virtnet_rq_stats {
+	struct u64_stats_sync syncp;
+	u64_stats_t packets;
+	u64_stats_t bytes;
+	u64_stats_t drops;
+	u64_stats_t xdp_packets;
+	u64_stats_t xdp_tx;
+	u64_stats_t xdp_redirects;
+	u64_stats_t xdp_drops;
+	u64_stats_t kicks;
+};
+
+struct virtnet_interrupt_coalesce {
+	u32 max_packets;
+	u32 max_usecs;
+};
+
+/* The dma information of pages allocated at a time. */
+struct virtnet_rq_dma {
+	dma_addr_t addr;
+	u32 ref;
+	u16 len;
+	u16 need_sync;
+};
+
+/* Internal representation of a send virtqueue */
+struct send_queue {
+	/* Virtqueue associated with this send _queue */
+	struct virtqueue *vq;
+
+	/* TX: fragments + linear part + virtio header */
+	struct scatterlist sg[MAX_SKB_FRAGS + 2];
+
+	/* Name of the send queue: output.$index */
+	char name[16];
+
+	struct virtnet_sq_stats stats;
+
+	struct virtnet_interrupt_coalesce intr_coal;
+
+	struct napi_struct napi;
+
+	/* Record whether sq is in reset state. */
+	bool reset;
+};
+
+/* Internal representation of a receive virtqueue */
+struct receive_queue {
+	/* Virtqueue associated with this receive_queue */
+	struct virtqueue *vq;
+
+	struct napi_struct napi;
+
+	struct bpf_prog __rcu *xdp_prog;
+
+	struct virtnet_rq_stats stats;
+
+	/* The number of rx notifications */
+	u16 calls;
+
+	/* Is dynamic interrupt moderation enabled? */
+	bool dim_enabled;
+
+	/* Used to protect dim_enabled and inter_coal */
+	struct mutex dim_lock;
+
+	/* Dynamic Interrupt Moderation */
+	struct dim dim;
+
+	u32 packets_in_napi;
+
+	struct virtnet_interrupt_coalesce intr_coal;
+
+	/* Chain pages by the private ptr. */
+	struct page *pages;
+
+	/* Average packet length for mergeable receive buffers. */
+	struct ewma_pkt_len mrg_avg_pkt_len;
+
+	/* Page frag for packet buffer allocation. */
+	struct page_frag alloc_frag;
+
+	/* RX: fragments + linear part + virtio header */
+	struct scatterlist sg[MAX_SKB_FRAGS + 2];
+
+	/* Min single buffer size for mergeable buffers case. */
+	unsigned int min_buf_len;
+
+	/* Name of this receive queue: input.$index */
+	char name[16];
+
+	struct xdp_rxq_info xdp_rxq;
+
+	/* Record the last dma info to free after new pages is allocated. */
+	struct virtnet_rq_dma *last_dma;
+};
+
+/* This structure can contain rss message with maximum settings for indirection table and keysize
+ * Note, that default structure that describes RSS configuration virtio_net_rss_config
+ * contains same info but can't handle table values.
+ * In any case, structure would be passed to virtio hw through sg_buf split by parts
+ * because table sizes may be differ according to the device configuration.
+ */
+#define VIRTIO_NET_RSS_MAX_KEY_SIZE	40
+#define VIRTIO_NET_RSS_MAX_TABLE_LEN	128
+struct virtio_net_ctrl_rss {
+	u32 hash_types;
+	u16 indirection_table_mask;
+	u16 unclassified_queue;
+	u16 indirection_table[VIRTIO_NET_RSS_MAX_TABLE_LEN];
+	u16 max_tx_vq;
+	u8 hash_key_length;
+	u8 key[VIRTIO_NET_RSS_MAX_KEY_SIZE];
+};
+
+struct virtnet_info {
+	struct virtio_device *vdev;
+	struct virtqueue *cvq;
+	struct net_device *dev;
+	struct send_queue *sq;
+	struct receive_queue *rq;
+	unsigned int status;
+
+	/* Ma
[PATCH net-next v1 6/7] virtio_net: separate receive_mergeable
This commit splits receive_mergeable(), moving the logic that appends a
frag to the skb into an independent function. A subsequent commit will
reuse it.

Signed-off-by: Xuan Zhuo
Acked-by: Jason Wang
---
 drivers/net/virtio/virtnet.h      |  4 ++
 drivers/net/virtio/virtnet_main.c | 77 +++
 2 files changed, 51 insertions(+), 30 deletions(-)

diff --git a/drivers/net/virtio/virtnet.h b/drivers/net/virtio/virtnet.h
index b56ebc7fcdcc..c6ef54160ddc 100644
--- a/drivers/net/virtio/virtnet.h
+++ b/drivers/net/virtio/virtnet.h
@@ -241,4 +241,8 @@ void virtnet_rx_pause(struct virtnet_info *vi, struct virtnet_rq *rq);
 void virtnet_rx_resume(struct virtnet_info *vi, struct virtnet_rq *rq);
 void virtnet_tx_pause(struct virtnet_info *vi, struct virtnet_sq *sq);
 void virtnet_tx_resume(struct virtnet_info *vi, struct virtnet_sq *sq);
+struct sk_buff *virtnet_skb_append_frag(struct sk_buff *head_skb,
+					struct sk_buff *curr_skb,
+					struct page *page, void *buf,
+					int len, int truesize);
 #endif
diff --git a/drivers/net/virtio/virtnet_main.c b/drivers/net/virtio/virtnet_main.c
index 285443da040c..6cc99d9b768b 100644
--- a/drivers/net/virtio/virtnet_main.c
+++ b/drivers/net/virtio/virtnet_main.c
@@ -1557,6 +1557,49 @@ static struct sk_buff *receive_mergeable_xdp(struct net_device *dev,
 	return NULL;
 }
 
+struct sk_buff *virtnet_skb_append_frag(struct sk_buff *head_skb,
+					struct sk_buff *curr_skb,
+					struct page *page, void *buf,
+					int len, int truesize)
+{
+	int num_skb_frags;
+	int offset;
+
+	num_skb_frags = skb_shinfo(curr_skb)->nr_frags;
+	if (unlikely(num_skb_frags == MAX_SKB_FRAGS)) {
+		struct sk_buff *nskb = alloc_skb(0, GFP_ATOMIC);
+
+		if (unlikely(!nskb))
+			return NULL;
+
+		if (curr_skb == head_skb)
+			skb_shinfo(curr_skb)->frag_list = nskb;
+		else
+			curr_skb->next = nskb;
+		curr_skb = nskb;
+		head_skb->truesize += nskb->truesize;
+		num_skb_frags = 0;
+	}
+
+	if (curr_skb != head_skb) {
+		head_skb->data_len += len;
+		head_skb->len += len;
+		head_skb->truesize += truesize;
+	}
+
+	offset = buf - page_address(page);
+	if (skb_can_coalesce(curr_skb, num_skb_frags, page, offset)) {
+		put_page(page);
+		skb_coalesce_rx_frag(curr_skb, num_skb_frags - 1,
+				     len, truesize);
+	} else {
+		skb_add_rx_frag(curr_skb, num_skb_frags, page,
+				offset, len, truesize);
+	}
+
+	return curr_skb;
+}
+
 static struct sk_buff *receive_mergeable(struct net_device *dev,
 					 struct virtnet_info *vi,
 					 struct virtnet_rq *rq,
@@ -1606,8 +1649,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 	if (unlikely(!curr_skb))
 		goto err_skb;
 	while (--num_buf) {
-		int num_skb_frags;
-
 		buf = virtnet_rq_get_buf(rq, &len, &ctx);
 		if (unlikely(!buf)) {
 			pr_debug("%s: rx error: %d buffers out of %d missing\n",
@@ -1632,34 +1673,10 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 			goto err_skb;
 		}
 
-		num_skb_frags = skb_shinfo(curr_skb)->nr_frags;
-		if (unlikely(num_skb_frags == MAX_SKB_FRAGS)) {
-			struct sk_buff *nskb = alloc_skb(0, GFP_ATOMIC);
-
-			if (unlikely(!nskb))
-				goto err_skb;
-			if (curr_skb == head_skb)
-				skb_shinfo(curr_skb)->frag_list = nskb;
-			else
-				curr_skb->next = nskb;
-			curr_skb = nskb;
-			head_skb->truesize += nskb->truesize;
-			num_skb_frags = 0;
-		}
-		if (curr_skb != head_skb) {
-			head_skb->data_len += len;
-			head_skb->len += len;
-			head_skb->truesize += truesize;
-		}
-		offset = buf - page_address(page);
-		if (skb_can_coalesce(curr_skb, num_skb_frags, page, offset)) {
-			put_page(page);
-			skb_coalesce_rx_frag(curr_skb, num_skb_frags - 1,
-					     len, truesize);
-		} else {
-			skb_add_rx_frag(curr_skb, num_skb_frags, page,
-
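The chaining rule inside virtnet_skb_append_frag() can be illustrated with a userspace model. This is a simplified sketch, not the driver code: struct toy_buf, TOY_MAX_FRAGS and the toy_* functions are hypothetical, the model uses a single next pointer where the kernel distinguishes frag_list on the head from next on later skbs, and page coalescing and truesize accounting are left out. What it keeps is the core idea: when the current buffer's frag array is full a new buffer is chained, and the head's length is bumped by hand only when the frag lands in a chained buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX_FRAGS 2	/* tiny on purpose; stands in for MAX_SKB_FRAGS */

/* Hypothetical stand-in for an skb: a fixed frag array plus a chain link. */
struct toy_buf {
	int frag_len[TOY_MAX_FRAGS];
	int nr_frags;
	int len;		/* on the head: bytes in the whole chain */
	struct toy_buf *next;
};

/*
 * Append one fragment of @len bytes. Returns the buffer that received the
 * fragment (the new "current" buffer), or NULL if allocation failed.
 */
struct toy_buf *toy_append_frag(struct toy_buf *head, struct toy_buf *curr,
				int len)
{
	if (curr->nr_frags == TOY_MAX_FRAGS) {
		struct toy_buf *nbuf = calloc(1, sizeof(*nbuf));

		if (!nbuf)
			return NULL;
		curr->next = nbuf;
		curr = nbuf;
	}

	/* Adding to curr below only updates curr; keep the head in sync. */
	if (curr != head)
		head->len += len;

	curr->frag_len[curr->nr_frags++] = len;
	curr->len += len;

	return curr;
}

/* Free every chained buffer; the head itself is owned by the caller. */
void toy_free_chain(struct toy_buf *head)
{
	struct toy_buf *b = head->next;

	while (b) {
		struct toy_buf *next = b->next;

		free(b);
		b = next;
	}
	head->next = NULL;
}
```

A receive loop would call toy_append_frag() once per incoming buffer, feeding the returned pointer back in as curr, and treat a NULL return as the error path, which is how receive_mergeable() is expected to use the real helper.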
[PATCH net-next v1 7/7] virtio_net: separate receive_buf
This commit splits receive_buf(), wrapping the logic that handles the skb
in an independent function, virtnet_receive_done(). A subsequent commit
will reuse it.

Signed-off-by: Xuan Zhuo
Acked-by: Jason Wang
---
 drivers/net/virtio/virtnet_main.c | 56 ++-
 1 file changed, 32 insertions(+), 24 deletions(-)

diff --git a/drivers/net/virtio/virtnet_main.c b/drivers/net/virtio/virtnet_main.c
index 6cc99d9b768b..68b90ee788bd 100644
--- a/drivers/net/virtio/virtnet_main.c
+++ b/drivers/net/virtio/virtnet_main.c
@@ -1721,32 +1721,11 @@ static void virtio_skb_set_hash(const struct virtio_net_hdr_v1_hash *hdr_hash,
 	skb_set_hash(skb, __le32_to_cpu(hdr_hash->hash_value), rss_hash_type);
 }
 
-static void receive_buf(struct virtnet_info *vi, struct virtnet_rq *rq,
-			void *buf, unsigned int len, void **ctx,
-			unsigned int *xdp_xmit,
-			struct virtnet_rq_stats *stats)
+static void virtnet_receive_done(struct virtnet_info *vi, struct virtnet_rq *rq,
+				 struct sk_buff *skb)
 {
-	struct net_device *dev = vi->dev;
-	struct sk_buff *skb;
 	struct virtio_net_common_hdr *hdr;
-
-	if (unlikely(len < vi->hdr_len + ETH_HLEN)) {
-		pr_debug("%s: short packet %i\n", dev->name, len);
-		DEV_STATS_INC(dev, rx_length_errors);
-		virtnet_rq_free_buf(vi, rq, buf);
-		return;
-	}
-
-	if (vi->mergeable_rx_bufs)
-		skb = receive_mergeable(dev, vi, rq, buf, ctx, len, xdp_xmit,
-					stats);
-	else if (vi->big_packets)
-		skb = receive_big(dev, vi, rq, buf, len, stats);
-	else
-		skb = receive_small(dev, vi, rq, buf, ctx, len, xdp_xmit, stats);
-
-	if (unlikely(!skb))
-		return;
+	struct net_device *dev = vi->dev;
 
 	hdr = skb_vnet_common_hdr(skb);
 	if (dev->features & NETIF_F_RXHASH && vi->has_rss_hash_report)
@@ -1776,6 +1755,35 @@ static void receive_buf(struct virtnet_info *vi, struct virtnet_rq *rq,
 	dev_kfree_skb(skb);
 }
 
+static void receive_buf(struct virtnet_info *vi, struct virtnet_rq *rq,
+			void *buf, unsigned int len, void **ctx,
+			unsigned int *xdp_xmit,
+			struct virtnet_rq_stats *stats)
+{
+	struct net_device *dev = vi->dev;
+	struct sk_buff *skb;
+
+	if (unlikely(len < vi->hdr_len + ETH_HLEN)) {
+		pr_debug("%s: short packet %i\n", dev->name, len);
+		DEV_STATS_INC(dev, rx_length_errors);
+		virtnet_rq_free_buf(vi, rq, buf);
+		return;
+	}
+
+	if (vi->mergeable_rx_bufs)
+		skb = receive_mergeable(dev, vi, rq, buf, ctx, len, xdp_xmit,
+					stats);
+	else if (vi->big_packets)
+		skb = receive_big(dev, vi, rq, buf, len, stats);
+	else
+		skb = receive_small(dev, vi, rq, buf, ctx, len, xdp_xmit, stats);
+
+	if (unlikely(!skb))
+		return;
+
+	virtnet_receive_done(vi, rq, skb);
+}
+
 /* Unlike mergeable buffers, all buffers are allocated to the
  * same size, except for the headroom. For this reason we do
  * not need to use mergeable_len_to_ctx here - it is enough
--
2.32.0.3.g01195cf9f
Re: [PATCH net-next v1 0/7] virtnet_net: prepare for af-xdp
On Thu, May 30, 2024 at 03:26:42PM +0800, Xuan Zhuo wrote:
> This patch set prepares for supporting af-xdp zerocopy.
> There is no feature change in this patch set.
> I just want to reduce the patch num of the final patch set,
> so I split the patch set.
>
> #1-#3 add independent directory for virtio-net
> #4-#7 do some refactor, the sub-functions will be used by the subsequent
> commits
>
> Thanks.
>
> v1:
> 1. resend for the new net-next merge window

What I said at the time is

I am fine adding xsk in a new file or just adding in same file working
on a split later.

Given this was a year ago and all we keep seeing is "prepare" patches,
I am inclined to say do it in the reverse order: add
af-xdp first then do the split when it's clear there is not
a lot of code sharing going on.

>
> Xuan Zhuo (7):
>   virtio_net: independent directory
>   virtio_net: move core structures to virtio_net.h
>   virtio_net: add prefix virtnet to all struct inside virtio_net.h
>   virtio_net: separate virtnet_rx_resize()
>   virtio_net: separate virtnet_tx_resize()
>   virtio_net: separate receive_mergeable
>   virtio_net: separate receive_buf
>
>  MAINTAINERS                                   |   2 +-
>  drivers/net/Kconfig                           |   9 +-
>  drivers/net/Makefile                          |   2 +-
>  drivers/net/virtio/Kconfig                    |  12 +
>  drivers/net/virtio/Makefile                   |   8 +
>  drivers/net/virtio/virtnet.h                  | 248
>  .../{virtio_net.c => virtio/virtnet_main.c}   | 536 ++
>  7 files changed, 454 insertions(+), 363 deletions(-)
>  create mode 100644 drivers/net/virtio/Kconfig
>  create mode 100644 drivers/net/virtio/Makefile
>  create mode 100644 drivers/net/virtio/virtnet.h
>  rename drivers/net/{virtio_net.c => virtio/virtnet_main.c} (94%)
>
> --
> 2.32.0.3.g01195cf9f
Re: [PATCH net-next v1 0/7] virtnet_net: prepare for af-xdp
On Thu, 30 May 2024 03:55:35 -0400, "Michael S. Tsirkin" wrote:
> On Thu, May 30, 2024 at 03:26:42PM +0800, Xuan Zhuo wrote:
> > This patch set prepares for supporting af-xdp zerocopy.
> > There is no feature change in this patch set.
> > I just want to reduce the patch num of the final patch set,
> > so I split the patch set.
> >
> > #1-#3 add independent directory for virtio-net
> > #4-#7 do some refactor, the sub-functions will be used by the subsequent
> > commits
> >
> > Thanks.
> >
> > v1:
> > 1. resend for the new net-next merge window
>
> What I said at the time is
>
> I am fine adding xsk in a new file or just adding in same file working
> on a split later.
>
> Given this was a year ago and all we keep seeing is "prepare" patches,
> I am inclined to say do it in the reverse order: add
> af-xdp first then do the split when it's clear there is not
> a lot of code sharing going on.

If everything is done in one patch set, that may be OK. But we have about
14 commits for af-xdp. If that patch set also includes these commits, we
will exceed 15 (net-next limits the number of commits in one patch set).

I separated these patches from the final patch set because I think these
commits can exist independently even without af-xdp.

Whether the final xsk code should use a separate file is something we can
look at in future patches. If you think we can merge it into one file, I
am also OK with that, although other drivers currently use separate files.

So if you think this patch set itself is fine, then I hope we can merge
this first.

Thanks.

> >
> >
> > Xuan Zhuo (7):
> >   virtio_net: independent directory
> >   virtio_net: move core structures to virtio_net.h
> >   virtio_net: add prefix virtnet to all struct inside virtio_net.h
> >   virtio_net: separate virtnet_rx_resize()
> >   virtio_net: separate virtnet_tx_resize()
> >   virtio_net: separate receive_mergeable
> >   virtio_net: separate receive_buf
> >
> >  MAINTAINERS                                   |   2 +-
> >  drivers/net/Kconfig                           |   9 +-
> >  drivers/net/Makefile                          |   2 +-
> >  drivers/net/virtio/Kconfig                    |  12 +
> >  drivers/net/virtio/Makefile                   |   8 +
> >  drivers/net/virtio/virtnet.h                  | 248
> >  .../{virtio_net.c => virtio/virtnet_main.c}   | 536 ++
> >  7 files changed, 454 insertions(+), 363 deletions(-)
> >  create mode 100644 drivers/net/virtio/Kconfig
> >  create mode 100644 drivers/net/virtio/Makefile
> >  create mode 100644 drivers/net/virtio/virtnet.h
> >  rename drivers/net/{virtio_net.c => virtio/virtnet_main.c} (94%)
> >
> > --
> > 2.32.0.3.g01195cf9f
>
Re: [PATCH net v3 2/2] virtio_net: fix a spurious deadlock issue
On Tue, 2024-05-28 at 21:41 +0800, Heng Qi wrote:
> When the following snippet is run, lockdep will report a deadlock[1].
>
>   /* Acquire all queues dim_locks */
>   for (i = 0; i < vi->max_queue_pairs; i++)
>           mutex_lock(&vi->rq[i].dim_lock);
>
> There's no deadlock here because the vq locks are always taken
> in the same order, but lockdep can not figure it out. So refactoring
> the code to alleviate the problem.
>
> [1]
>
> WARNING: possible recursive locking detected
> 6.9.0-rc7+ #319 Not tainted
>
> ethtool/962 is trying to acquire lock:
>
> but task is already holding lock:
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
>
> lock(&vi->rq[i].dim_lock);
> lock(&vi->rq[i].dim_lock);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
>
> 3 locks held by ethtool/962:
> #0: 82dbaab0 (cb_lock){}-{3:3}, at: genl_rcv+0x19/0x40
> #1: 82dad0a8 (rtnl_mutex){+.+.}-{3:3}, at:
> ethnl_default_set_doit+0xbe/0x1e0
>
> stack backtrace:
> CPU: 6 PID: 962 Comm: ethtool Not tainted 6.9.0-rc7+ #319
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> Call Trace:
>
> dump_stack_lvl+0x79/0xb0
> check_deadlock+0x130/0x220
> __lock_acquire+0x861/0x990
> lock_acquire.part.0+0x72/0x1d0
> ? lock_acquire+0xf8/0x130
> __mutex_lock+0x71/0xd50
> virtnet_set_coalesce+0x151/0x190
> __ethnl_set_coalesce.isra.0+0x3f8/0x4d0
> ethnl_set_coalesce+0x34/0x90
> ethnl_default_set_doit+0xdd/0x1e0
> genl_family_rcv_msg_doit+0xdc/0x130
> genl_family_rcv_msg+0x154/0x230
> ? __pfx_ethnl_default_set_doit+0x10/0x10
> genl_rcv_msg+0x4b/0xa0
> ? __pfx_genl_rcv_msg+0x10/0x10
> netlink_rcv_skb+0x5a/0x110
> genl_rcv+0x28/0x40
> netlink_unicast+0x1af/0x280
> netlink_sendmsg+0x20e/0x460
> __sys_sendto+0x1fe/0x210
> ? find_held_lock+0x2b/0x80
> ? do_user_addr_fault+0x3a2/0x8a0
> ? __lock_release+0x5e/0x160
> ? do_user_addr_fault+0x3a2/0x8a0
> ? lock_release+0x72/0x140
> ? do_user_addr_fault+0x3a7/0x8a0
> __x64_sys_sendto+0x29/0x30
> do_syscall_64+0x78/0x180
> entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> Fixes: 4d4ac2ececd3 ("virtio_net: Add a lock for per queue RX coalesce")
> Signed-off-by: Heng Qi

This would have deserved a changelog after the commit message.

The patch LGTM (for obvious reasons ;), but it deserves an explicit ack
from Jason and/or Michael

Cheers,

Paolo
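Why lockdep reports this even though the locks are always taken in the same order can be shown with a toy model. Lockdep validates locking at the level of lock classes rather than individual lock instances, and all the per-queue dim_lock mutexes belong to one class, so holding one while taking another looks like recursive locking unless a nesting annotation is supplied. The sketch below is a userspace illustration under stated assumptions: struct toy_lock, struct toy_task and the toy_* functions are hypothetical, and the checker is far simpler than lockdep. The diff of this patch is not part of this thread excerpt; taking the locks one at a time is assumed here only as one possible way to avoid the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_HELD 8

/* Hypothetical lock: only its class matters to the toy checker. */
struct toy_lock {
	int class_id;
};

/* Per-task record of held locks, loosely like lockdep's held-lock stack. */
struct toy_task {
	const struct toy_lock *held[TOY_MAX_HELD];
	size_t depth;
	bool warned;
};

/* Flag a same-class acquisition while another lock of that class is held. */
void toy_acquire(struct toy_task *t, const struct toy_lock *l)
{
	assert(t->depth < TOY_MAX_HELD);
	for (size_t i = 0; i < t->depth; i++)
		if (t->held[i]->class_id == l->class_id)
			t->warned = true;
	t->held[t->depth++] = l;
}

/* The toy only supports releasing the most recently acquired lock. */
void toy_release(struct toy_task *t, const struct toy_lock *l)
{
	assert(t->depth > 0 && t->held[t->depth - 1] == l);
	t->depth--;
}

/* Pattern from the report: hold every queue lock at once (n <= TOY_MAX_HELD). */
bool toy_hold_all(const struct toy_lock *locks, size_t n)
{
	struct toy_task t = { .depth = 0, .warned = false };

	for (size_t i = 0; i < n; i++)
		toy_acquire(&t, &locks[i]);
	for (size_t i = n; i > 0; i--)
		toy_release(&t, &locks[i - 1]);

	return t.warned;
}

/* Alternative: never hold two locks of the same class together. */
bool toy_one_at_a_time(const struct toy_lock *locks, size_t n)
{
	struct toy_task t = { .depth = 0, .warned = false };

	for (size_t i = 0; i < n; i++) {
		toy_acquire(&t, &locks[i]);
		/* per-queue work would go here */
		toy_release(&t, &locks[i]);
	}

	return t.warned;
}
```

With four locks of the same class, toy_hold_all() trips the checker while toy_one_at_a_time() does not, although neither pattern can actually deadlock when every caller walks the queues in index order.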
Re: [PATCH net v3 2/2] virtio_net: fix a spurious deadlock issue
On Thu, 30 May 2024 10:34:07 +0200, Paolo Abeni wrote:
> On Tue, 2024-05-28 at 21:41 +0800, Heng Qi wrote:
> > When the following snippet is run, lockdep will report a deadlock[1].
> >
> >   /* Acquire all queues dim_locks */
> >   for (i = 0; i < vi->max_queue_pairs; i++)
> >           mutex_lock(&vi->rq[i].dim_lock);
> >
> > There's no deadlock here because the vq locks are always taken
> > in the same order, but lockdep can not figure it out. So refactoring
> > the code to alleviate the problem.
> >
> > [1]
> >
> > WARNING: possible recursive locking detected
> > 6.9.0-rc7+ #319 Not tainted
> >
> > ethtool/962 is trying to acquire lock:
> >
> > but task is already holding lock:
> >
> > other info that might help us debug this:
> > Possible unsafe locking scenario:
> >
> > CPU0
> >
> > lock(&vi->rq[i].dim_lock);
> > lock(&vi->rq[i].dim_lock);
> >
> > *** DEADLOCK ***
> >
> > May be due to missing lock nesting notation
> >
> > 3 locks held by ethtool/962:
> > #0: 82dbaab0 (cb_lock){}-{3:3}, at: genl_rcv+0x19/0x40
> > #1: 82dad0a8 (rtnl_mutex){+.+.}-{3:3}, at:
> > ethnl_default_set_doit+0xbe/0x1e0
> >
> > stack backtrace:
> > CPU: 6 PID: 962 Comm: ethtool Not tainted 6.9.0-rc7+ #319
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> > rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> > Call Trace:
> >
> > dump_stack_lvl+0x79/0xb0
> > check_deadlock+0x130/0x220
> > __lock_acquire+0x861/0x990
> > lock_acquire.part.0+0x72/0x1d0
> > ? lock_acquire+0xf8/0x130
> > __mutex_lock+0x71/0xd50
> > virtnet_set_coalesce+0x151/0x190
> > __ethnl_set_coalesce.isra.0+0x3f8/0x4d0
> > ethnl_set_coalesce+0x34/0x90
> > ethnl_default_set_doit+0xdd/0x1e0
> > genl_family_rcv_msg_doit+0xdc/0x130
> > genl_family_rcv_msg+0x154/0x230
> > ? __pfx_ethnl_default_set_doit+0x10/0x10
> > genl_rcv_msg+0x4b/0xa0
> > ? __pfx_genl_rcv_msg+0x10/0x10
> > netlink_rcv_skb+0x5a/0x110
> > genl_rcv+0x28/0x40
> > netlink_unicast+0x1af/0x280
> > netlink_sendmsg+0x20e/0x460
> > __sys_sendto+0x1fe/0x210
> > ? find_held_lock+0x2b/0x80
> > ? do_user_addr_fault+0x3a2/0x8a0
> > ? __lock_release+0x5e/0x160
> > ? do_user_addr_fault+0x3a2/0x8a0
> > ? lock_release+0x72/0x140
> > ? do_user_addr_fault+0x3a7/0x8a0
> > __x64_sys_sendto+0x29/0x30
> > do_syscall_64+0x78/0x180
> > entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >
> > Fixes: 4d4ac2ececd3 ("virtio_net: Add a lock for per queue RX coalesce")
> > Signed-off-by: Heng Qi
>
> This would have deserved a changelog after the commit message.

I put the changelog in the cover letter, but I can send a new RESEND
version with a changelog in this patch if you want :)

>
> The patch LGTM (for obvious reasons ;), but it deserves an explicit ack
> from Jason and/or Michael

Thanks.

>
> Cheers,
>
> Paolo
>
Re: [PATCH net v3 2/2] virtio_net: fix a spurious deadlock issue
On Tue, May 28, 2024 at 09:41:16PM +0800, Heng Qi wrote: > When the following snippet is run, lockdep will report a deadlock[1]. > > /* Acquire all queues dim_locks */ > for (i = 0; i < vi->max_queue_pairs; i++) > mutex_lock(&vi->rq[i].dim_lock); > > There's no deadlock here because the vq locks are always taken > in the same order, but lockdep can not figure it out. So refactoring > the code to alleviate the problem. > > [1] > > WARNING: possible recursive locking detected > 6.9.0-rc7+ #319 Not tainted > > ethtool/962 is trying to acquire lock: > > but task is already holding lock: > > other info that might help us debug this: > Possible unsafe locking scenario: > > CPU0 > > lock(&vi->rq[i].dim_lock); > lock(&vi->rq[i].dim_lock); > > *** DEADLOCK *** > > May be due to missing lock nesting notation > > 3 locks held by ethtool/962: > #0: 82dbaab0 (cb_lock){}-{3:3}, at: genl_rcv+0x19/0x40 > #1: 82dad0a8 (rtnl_mutex){+.+.}-{3:3}, at: > ethnl_default_set_doit+0xbe/0x1e0 > > stack backtrace: > CPU: 6 PID: 962 Comm: ethtool Not tainted 6.9.0-rc7+ #319 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 > Call Trace: > > dump_stack_lvl+0x79/0xb0 > check_deadlock+0x130/0x220 > __lock_acquire+0x861/0x990 > lock_acquire.part.0+0x72/0x1d0 > ? lock_acquire+0xf8/0x130 > __mutex_lock+0x71/0xd50 > virtnet_set_coalesce+0x151/0x190 > __ethnl_set_coalesce.isra.0+0x3f8/0x4d0 > ethnl_set_coalesce+0x34/0x90 > ethnl_default_set_doit+0xdd/0x1e0 > genl_family_rcv_msg_doit+0xdc/0x130 > genl_family_rcv_msg+0x154/0x230 > ? __pfx_ethnl_default_set_doit+0x10/0x10 > genl_rcv_msg+0x4b/0xa0 > ? __pfx_genl_rcv_msg+0x10/0x10 > netlink_rcv_skb+0x5a/0x110 > genl_rcv+0x28/0x40 > netlink_unicast+0x1af/0x280 > netlink_sendmsg+0x20e/0x460 > __sys_sendto+0x1fe/0x210 > ? find_held_lock+0x2b/0x80 > ? do_user_addr_fault+0x3a2/0x8a0 > ? __lock_release+0x5e/0x160 > ? do_user_addr_fault+0x3a2/0x8a0 > ? lock_release+0x72/0x140 > ? 
do_user_addr_fault+0x3a7/0x8a0 > __x64_sys_sendto+0x29/0x30 > do_syscall_64+0x78/0x180 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > Fixes: 4d4ac2ececd3 ("virtio_net: Add a lock for per queue RX coalesce") > Signed-off-by: Heng Qi Acked-by: Michael S. Tsirkin > --- > drivers/net/virtio_net.c | 36 > 1 file changed, 16 insertions(+), 20 deletions(-) > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > index 4f828a9e5889..ecb5203d0372 100644 > --- a/drivers/net/virtio_net.c > +++ b/drivers/net/virtio_net.c > @@ -4257,7 +4257,6 @@ static int virtnet_send_rx_notf_coal_cmds(struct > virtnet_info *vi, > struct virtio_net_ctrl_coal_rx *coal_rx __free(kfree) = NULL; > bool rx_ctrl_dim_on = !!ec->use_adaptive_rx_coalesce; > struct scatterlist sgs_rx; > - int ret = 0; > int i; > > if (rx_ctrl_dim_on && !virtio_has_feature(vi->vdev, > VIRTIO_NET_F_VQ_NOTF_COAL)) > @@ -4267,27 +4266,27 @@ static int virtnet_send_rx_notf_coal_cmds(struct > virtnet_info *vi, > ec->rx_max_coalesced_frames != > vi->intr_coal_rx.max_packets)) > return -EINVAL; > > - /* Acquire all queues dim_locks */ > - for (i = 0; i < vi->max_queue_pairs; i++) > - mutex_lock(&vi->rq[i].dim_lock); > - > if (rx_ctrl_dim_on && !vi->rx_dim_enabled) { > vi->rx_dim_enabled = true; > - for (i = 0; i < vi->max_queue_pairs; i++) > + for (i = 0; i < vi->max_queue_pairs; i++) { > + mutex_lock(&vi->rq[i].dim_lock); > vi->rq[i].dim_enabled = true; > - goto unlock; > + mutex_unlock(&vi->rq[i].dim_lock); > + } > + return 0; > } > > coal_rx = kzalloc(sizeof(*coal_rx), GFP_KERNEL); > - if (!coal_rx) { > - ret = -ENOMEM; > - goto unlock; > - } > + if (!coal_rx) > + return -ENOMEM; > > if (!rx_ctrl_dim_on && vi->rx_dim_enabled) { > vi->rx_dim_enabled = false; > - for (i = 0; i < vi->max_queue_pairs; i++) > + for (i = 0; i < vi->max_queue_pairs; i++) { > + mutex_lock(&vi->rq[i].dim_lock); > vi->rq[i].dim_enabled = false; > + mutex_unlock(&vi->rq[i].dim_lock); > + } > } > > /* Since the per-queue coalescing params 
can be set, > @@ -4300,22 +4299,19 @@ static int virtnet_send_rx_notf_coal_cmds(struct > virtnet_info *vi, > > if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_NOTF_COAL, > VIRTIO_NET_CTRL_NOTF_COAL_RX_SET, > - &sgs_rx))
Re: [PATCH net v3 1/2] virtio_net: fix possible dim status unrecoverable
On Tue, May 28, 2024 at 09:41:15PM +0800, Heng Qi wrote: > When the dim worker is scheduled, if it no longer needs to issue > commands, dim may not be able to return to the working state later. > > For example, the following single queue scenario: > 1. The dim worker of rxq0 is scheduled, and the dim status is > changed to DIM_APPLY_NEW_PROFILE; > 2. dim is disabled or parameters have not been modified; > 3. virtnet_rx_dim_work exits directly; > > Then, even if net_dim is invoked again, it cannot work because the > state is not restored to DIM_START_MEASURE. > > Fixes: 6208799553a8 ("virtio-net: support rx netdim") > Signed-off-by: Heng Qi > Reviewed-by: Jiri Pirko Acked-by: Michael S. Tsirkin > --- > drivers/net/virtio_net.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > index 4a802c0ea2cb..4f828a9e5889 100644 > --- a/drivers/net/virtio_net.c > +++ b/drivers/net/virtio_net.c > @@ -4417,9 +4417,9 @@ static void virtnet_rx_dim_work(struct work_struct > *work) > if (err) > pr_debug("%s: Failed to send dim parameters on rxq%d\n", >dev->name, qnum); > - dim->state = DIM_START_MEASURE; > } > out: > + dim->state = DIM_START_MEASURE; > mutex_unlock(&rq->dim_lock); > } > > -- > 2.32.0.3.g01195cf9f
Re: [PATCH net v3 1/2] virtio_net: fix possible dim status unrecoverable
On Tue, 28 May 2024 21:41:15 +0800, Heng Qi wrote: > When the dim worker is scheduled, if it no longer needs to issue > commands, dim may not be able to return to the working state later. > > For example, the following single queue scenario: > 1. The dim worker of rxq0 is scheduled, and the dim status is > changed to DIM_APPLY_NEW_PROFILE; > 2. dim is disabled or parameters have not been modified; > 3. virtnet_rx_dim_work exits directly; > > Then, even if net_dim is invoked again, it cannot work because the > state is not restored to DIM_START_MEASURE. > > Fixes: 6208799553a8 ("virtio-net: support rx netdim") > Signed-off-by: Heng Qi > Reviewed-by: Jiri Pirko Reviewed-by: Xuan Zhuo > --- > drivers/net/virtio_net.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > index 4a802c0ea2cb..4f828a9e5889 100644 > --- a/drivers/net/virtio_net.c > +++ b/drivers/net/virtio_net.c > @@ -4417,9 +4417,9 @@ static void virtnet_rx_dim_work(struct work_struct > *work) > if (err) > pr_debug("%s: Failed to send dim parameters on rxq%d\n", >dev->name, qnum); > - dim->state = DIM_START_MEASURE; > } > out: > + dim->state = DIM_START_MEASURE; > mutex_unlock(&rq->dim_lock); > } > > -- > 2.32.0.3.g01195cf9f >
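A small user-space model may make the scenario in the commit message easier to follow. This is only a sketch under stated assumptions: every `sketch_*` name is a hypothetical stand-in rather than the driver's code, and sending the coalescing command is reduced to a counter. The first function restores the state only on the command path, as the code did before the patch; the second restores it at the `out` label, as the patch does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the dim states named in the patch. */
enum sketch_dim_state {
	SKETCH_DIM_START_MEASURE,
	SKETCH_DIM_APPLY_NEW_PROFILE,
};

struct sketch_dim {
	enum sketch_dim_state state;
	int cmds_sent;	/* stands in for issuing the coalescing command */
};

/* Before the fix: the state is restored only when a command is issued. */
static void sketch_dim_work_old(struct sketch_dim *dim, bool dim_enabled,
				bool params_changed)
{
	if (!dim_enabled)
		goto out;

	if (params_changed) {
		dim->cmds_sent++;
		dim->state = SKETCH_DIM_START_MEASURE;
	}
out:
	return;
}

/* After the fix: every exit path restores the state. */
static void sketch_dim_work_fixed(struct sketch_dim *dim, bool dim_enabled,
				  bool params_changed)
{
	if (!dim_enabled)
		goto out;

	if (params_changed)
		dim->cmds_sent++;
out:
	dim->state = SKETCH_DIM_START_MEASURE;
}
```

With dim disabled, or with unchanged parameters, the old variant leaves the model in the apply-new-profile state, which is the stuck condition steps 1-3 describe; the fixed variant always returns it to the measuring state.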
Re: [PATCH net v3 2/2] virtio_net: fix a spurious deadlock issue
On Tue, 28 May 2024 21:41:16 +0800, Heng Qi wrote: > When the following snippet is run, lockdep will report a deadlock[1]. > > /* Acquire all queues dim_locks */ > for (i = 0; i < vi->max_queue_pairs; i++) > mutex_lock(&vi->rq[i].dim_lock); > > There's no deadlock here because the vq locks are always taken > in the same order, but lockdep can not figure it out. So refactoring > the code to alleviate the problem. > > [1] > > WARNING: possible recursive locking detected > 6.9.0-rc7+ #319 Not tainted > > ethtool/962 is trying to acquire lock: > > but task is already holding lock: > > other info that might help us debug this: > Possible unsafe locking scenario: > > CPU0 > > lock(&vi->rq[i].dim_lock); > lock(&vi->rq[i].dim_lock); > > *** DEADLOCK *** > > May be due to missing lock nesting notation > > 3 locks held by ethtool/962: > #0: 82dbaab0 (cb_lock){}-{3:3}, at: genl_rcv+0x19/0x40 > #1: 82dad0a8 (rtnl_mutex){+.+.}-{3:3}, at: > ethnl_default_set_doit+0xbe/0x1e0 > > stack backtrace: > CPU: 6 PID: 962 Comm: ethtool Not tainted 6.9.0-rc7+ #319 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 > Call Trace: > > dump_stack_lvl+0x79/0xb0 > check_deadlock+0x130/0x220 > __lock_acquire+0x861/0x990 > lock_acquire.part.0+0x72/0x1d0 > ? lock_acquire+0xf8/0x130 > __mutex_lock+0x71/0xd50 > virtnet_set_coalesce+0x151/0x190 > __ethnl_set_coalesce.isra.0+0x3f8/0x4d0 > ethnl_set_coalesce+0x34/0x90 > ethnl_default_set_doit+0xdd/0x1e0 > genl_family_rcv_msg_doit+0xdc/0x130 > genl_family_rcv_msg+0x154/0x230 > ? __pfx_ethnl_default_set_doit+0x10/0x10 > genl_rcv_msg+0x4b/0xa0 > ? __pfx_genl_rcv_msg+0x10/0x10 > netlink_rcv_skb+0x5a/0x110 > genl_rcv+0x28/0x40 > netlink_unicast+0x1af/0x280 > netlink_sendmsg+0x20e/0x460 > __sys_sendto+0x1fe/0x210 > ? find_held_lock+0x2b/0x80 > ? do_user_addr_fault+0x3a2/0x8a0 > ? __lock_release+0x5e/0x160 > ? do_user_addr_fault+0x3a2/0x8a0 > ? lock_release+0x72/0x140 > ? 
do_user_addr_fault+0x3a7/0x8a0 > __x64_sys_sendto+0x29/0x30 > do_syscall_64+0x78/0x180 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > Fixes: 4d4ac2ececd3 ("virtio_net: Add a lock for per queue RX coalesce") > Signed-off-by: Heng Qi Reviewed-by: Xuan Zhuo > --- > drivers/net/virtio_net.c | 36 > 1 file changed, 16 insertions(+), 20 deletions(-) > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c > index 4f828a9e5889..ecb5203d0372 100644 > --- a/drivers/net/virtio_net.c > +++ b/drivers/net/virtio_net.c > @@ -4257,7 +4257,6 @@ static int virtnet_send_rx_notf_coal_cmds(struct > virtnet_info *vi, > struct virtio_net_ctrl_coal_rx *coal_rx __free(kfree) = NULL; > bool rx_ctrl_dim_on = !!ec->use_adaptive_rx_coalesce; > struct scatterlist sgs_rx; > - int ret = 0; > int i; > > if (rx_ctrl_dim_on && !virtio_has_feature(vi->vdev, > VIRTIO_NET_F_VQ_NOTF_COAL)) > @@ -4267,27 +4266,27 @@ static int virtnet_send_rx_notf_coal_cmds(struct > virtnet_info *vi, > ec->rx_max_coalesced_frames != > vi->intr_coal_rx.max_packets)) > return -EINVAL; > > - /* Acquire all queues dim_locks */ > - for (i = 0; i < vi->max_queue_pairs; i++) > - mutex_lock(&vi->rq[i].dim_lock); > - > if (rx_ctrl_dim_on && !vi->rx_dim_enabled) { > vi->rx_dim_enabled = true; > - for (i = 0; i < vi->max_queue_pairs; i++) > + for (i = 0; i < vi->max_queue_pairs; i++) { > + mutex_lock(&vi->rq[i].dim_lock); > vi->rq[i].dim_enabled = true; > - goto unlock; > + mutex_unlock(&vi->rq[i].dim_lock); > + } > + return 0; > } > > coal_rx = kzalloc(sizeof(*coal_rx), GFP_KERNEL); > - if (!coal_rx) { > - ret = -ENOMEM; > - goto unlock; > - } > + if (!coal_rx) > + return -ENOMEM; > > if (!rx_ctrl_dim_on && vi->rx_dim_enabled) { > vi->rx_dim_enabled = false; > - for (i = 0; i < vi->max_queue_pairs; i++) > + for (i = 0; i < vi->max_queue_pairs; i++) { > + mutex_lock(&vi->rq[i].dim_lock); > vi->rq[i].dim_enabled = false; > + mutex_unlock(&vi->rq[i].dim_lock); > + } > } > > /* Since the per-queue coalescing params can be 
set, > @@ -4300,22 +4299,19 @@ static int virtnet_send_rx_notf_coal_cmds(struct > virtnet_info *vi, > > if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_NOTF_COAL, > VIRTIO_NET_CTRL_NOTF_COAL_RX_SET, > - &sgs_rx)) { > - ret = -EINVAL; > -
Re: [PATCH net v3 2/2] virtio_net: fix a spurious deadlock issue
On Thu, May 30, 2024 at 5:17 PM Michael S. Tsirkin wrote: > > On Tue, May 28, 2024 at 09:41:16PM +0800, Heng Qi wrote: > > When the following snippet is run, lockdep will report a deadlock[1]. > > > > /* Acquire all queues dim_locks */ > > for (i = 0; i < vi->max_queue_pairs; i++) > > mutex_lock(&vi->rq[i].dim_lock); > > > > There's no deadlock here because the vq locks are always taken > > in the same order, but lockdep can not figure it out. So refactoring > > the code to alleviate the problem. > > > > [1] > > > > WARNING: possible recursive locking detected > > 6.9.0-rc7+ #319 Not tainted > > > > ethtool/962 is trying to acquire lock: > > > > but task is already holding lock: > > > > other info that might help us debug this: > > Possible unsafe locking scenario: > > > > CPU0 > > > > lock(&vi->rq[i].dim_lock); > > lock(&vi->rq[i].dim_lock); > > > > *** DEADLOCK *** > > > > May be due to missing lock nesting notation > > > > 3 locks held by ethtool/962: > > #0: 82dbaab0 (cb_lock){}-{3:3}, at: genl_rcv+0x19/0x40 > > #1: 82dad0a8 (rtnl_mutex){+.+.}-{3:3}, at: > > ethnl_default_set_doit+0xbe/0x1e0 > > > > stack backtrace: > > CPU: 6 PID: 962 Comm: ethtool Not tainted 6.9.0-rc7+ #319 > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > > rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 > > Call Trace: > > > > dump_stack_lvl+0x79/0xb0 > > check_deadlock+0x130/0x220 > > __lock_acquire+0x861/0x990 > > lock_acquire.part.0+0x72/0x1d0 > > ? lock_acquire+0xf8/0x130 > > __mutex_lock+0x71/0xd50 > > virtnet_set_coalesce+0x151/0x190 > > __ethnl_set_coalesce.isra.0+0x3f8/0x4d0 > > ethnl_set_coalesce+0x34/0x90 > > ethnl_default_set_doit+0xdd/0x1e0 > > genl_family_rcv_msg_doit+0xdc/0x130 > > genl_family_rcv_msg+0x154/0x230 > > ? __pfx_ethnl_default_set_doit+0x10/0x10 > > genl_rcv_msg+0x4b/0xa0 > > ? 
__pfx_genl_rcv_msg+0x10/0x10 > > netlink_rcv_skb+0x5a/0x110 > > genl_rcv+0x28/0x40 > > netlink_unicast+0x1af/0x280 > > netlink_sendmsg+0x20e/0x460 > > __sys_sendto+0x1fe/0x210 > > ? find_held_lock+0x2b/0x80 > > ? do_user_addr_fault+0x3a2/0x8a0 > > ? __lock_release+0x5e/0x160 > > ? do_user_addr_fault+0x3a2/0x8a0 > > ? lock_release+0x72/0x140 > > ? do_user_addr_fault+0x3a7/0x8a0 > > __x64_sys_sendto+0x29/0x30 > > do_syscall_64+0x78/0x180 > > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > > > Fixes: 4d4ac2ececd3 ("virtio_net: Add a lock for per queue RX coalesce") > > Signed-off-by: Heng Qi > > > Acked-by: Michael S. Tsirkin > Acked-by: Jason Wang Btw, adding notation seems to be another way. Thanks
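The refactor discussed in this thread can be illustrated with a user-space model. This is a sketch under stated assumptions, not the driver's code: the `sketch_*` names are hypothetical, and a counter replaces real mutexes so that the number of same-class locks held at once becomes visible. The first function mirrors the removed "acquire all queues' dim_locks" loop; the second mirrors the per-queue lock/unlock pattern the patch introduces.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NUM_QUEUES 4

/* Hypothetical model of one lock class: it counts how many locks of the
 * class are held at the same time.
 */
struct sketch_lock_class {
	int held;
	int max_held;
};

struct sketch_queue {
	bool locked;
	bool dim_enabled;
};

static void sketch_lock(struct sketch_lock_class *cls, struct sketch_queue *q)
{
	assert(!q->locked);
	q->locked = true;
	if (++cls->held > cls->max_held)
		cls->max_held = cls->held;
}

static void sketch_unlock(struct sketch_lock_class *cls, struct sketch_queue *q)
{
	assert(q->locked);
	q->locked = false;
	cls->held--;
}

/* Old pattern: take every queue's lock first, then update all queues. */
static void sketch_enable_all_locks(struct sketch_lock_class *cls,
				    struct sketch_queue *q, int n)
{
	int i;

	for (i = 0; i < n; i++)
		sketch_lock(cls, &q[i]);
	for (i = 0; i < n; i++)
		q[i].dim_enabled = true;
	for (i = n - 1; i >= 0; i--)
		sketch_unlock(cls, &q[i]);
}

/* New pattern: hold one queue's lock at a time around its own update. */
static void sketch_enable_per_queue(struct sketch_lock_class *cls,
				    struct sketch_queue *q, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		sketch_lock(cls, &q[i]);
		q[i].dim_enabled = true;
		sketch_unlock(cls, &q[i]);
	}
}
```

In the model the old pattern holds as many same-class locks as there are queues, while the new pattern never holds more than one, so a checker that reasons per lock class has nothing to report. The lock nesting notation mentioned in the reply above is a different approach and is not modelled here.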
[PATCH net-next v2 00/12] virtnet_net: prepare for af-xdp
This patch set prepares for supporting af-xdp zerocopy. There is no feature
change in this patch set. It is split out of the final af-xdp patch set only
to reduce the number of patches in that set.

Thanks.

v2:
    1. Add five commits. They provide some helpers for the sq to support
       premapped mode, and the last one refactors how xmit types are
       distinguished.

v1:
    1. Resend for the new net-next merge window.

Xuan Zhuo (12):
  virtio_net: independent directory
  virtio_net: move core structures to virtio_net.h
  virtio_net: add prefix virtnet to all struct inside virtio_net.h
  virtio_net: separate virtnet_rx_resize()
  virtio_net: separate virtnet_tx_resize()
  virtio_net: separate receive_mergeable
  virtio_net: separate receive_buf
  virtio_ring: introduce vring_need_unmap_buffer
  virtio_ring: introduce dma map api for page
  virtio_ring: introduce virtqueue_dma_map_sg_attrs
  virtio_ring: virtqueue_set_dma_premapped() support to disable
  virtio_net: refactor the xmit type

 MAINTAINERS                                   |   2 +-
 drivers/net/Kconfig                           |   9 +-
 drivers/net/Makefile                          |   2 +-
 drivers/net/virtio/Kconfig                    |  12 +
 drivers/net/virtio/Makefile                   |   8 +
 drivers/net/virtio/virtnet.h                  | 248
 .../{virtio_net.c => virtio/virtnet_main.c}   | 596 +++---
 drivers/virtio/virtio_ring.c                  | 118 +++-
 include/linux/virtio.h                        |  12 +-
 9 files changed, 606 insertions(+), 401 deletions(-)
 create mode 100644 drivers/net/virtio/Kconfig
 create mode 100644 drivers/net/virtio/Makefile
 create mode 100644 drivers/net/virtio/virtnet.h
 rename drivers/net/{virtio_net.c => virtio/virtnet_main.c} (93%)

--
2.32.0.3.g01195cf9f
[PATCH net-next v2 02/12] virtio_net: move core structures to virtio_net.h
Move the definitions of some core structures (send_queue, receive_queue, virtnet_info) and of their related structures into the new header file virtnet.h, so that other C source files can use them. Signed-off-by: Xuan Zhuo Acked-by: Jason Wang --- drivers/net/virtio/virtnet.h | 239 ++ drivers/net/virtio/virtnet_main.c | 235 + 2 files changed, 241 insertions(+), 233 deletions(-) create mode 100644 drivers/net/virtio/virtnet.h diff --git a/drivers/net/virtio/virtnet.h b/drivers/net/virtio/virtnet.h new file mode 100644 index ..e46db9491605 --- /dev/null +++ b/drivers/net/virtio/virtnet.h @@ -0,0 +1,239 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ + +#ifndef __VIRTIO_NET_H__ +#define __VIRTIO_NET_H__ + +#include +#include + +/* RX packet size EWMA. The average packet size is used to determine the packet + * buffer size when refilling RX rings. As the entire RX ring may be refilled + * at once, the weight is chosen so that the EWMA will be insensitive to short- + * term, transient changes in packet size. + */ +DECLARE_EWMA(pkt_len, 0, 64) + +struct virtnet_sq_stats { + struct u64_stats_sync syncp; + u64_stats_t packets; + u64_stats_t bytes; + u64_stats_t xdp_tx; + u64_stats_t xdp_tx_drops; + u64_stats_t kicks; + u64_stats_t tx_timeouts; + u64_stats_t stop; + u64_stats_t wake; +}; + +struct virtnet_rq_stats { + struct u64_stats_sync syncp; + u64_stats_t packets; + u64_stats_t bytes; + u64_stats_t drops; + u64_stats_t xdp_packets; + u64_stats_t xdp_tx; + u64_stats_t xdp_redirects; + u64_stats_t xdp_drops; + u64_stats_t kicks; +}; + +struct virtnet_interrupt_coalesce { + u32 max_packets; + u32 max_usecs; +}; + +/* The dma information of pages allocated at a time. 
*/ +struct virtnet_rq_dma { + dma_addr_t addr; + u32 ref; + u16 len; + u16 need_sync; +}; + +/* Internal representation of a send virtqueue */ +struct send_queue { + /* Virtqueue associated with this send _queue */ + struct virtqueue *vq; + + /* TX: fragments + linear part + virtio header */ + struct scatterlist sg[MAX_SKB_FRAGS + 2]; + + /* Name of the send queue: output.$index */ + char name[16]; + + struct virtnet_sq_stats stats; + + struct virtnet_interrupt_coalesce intr_coal; + + struct napi_struct napi; + + /* Record whether sq is in reset state. */ + bool reset; +}; + +/* Internal representation of a receive virtqueue */ +struct receive_queue { + /* Virtqueue associated with this receive_queue */ + struct virtqueue *vq; + + struct napi_struct napi; + + struct bpf_prog __rcu *xdp_prog; + + struct virtnet_rq_stats stats; + + /* The number of rx notifications */ + u16 calls; + + /* Is dynamic interrupt moderation enabled? */ + bool dim_enabled; + + /* Used to protect dim_enabled and inter_coal */ + struct mutex dim_lock; + + /* Dynamic Interrupt Moderation */ + struct dim dim; + + u32 packets_in_napi; + + struct virtnet_interrupt_coalesce intr_coal; + + /* Chain pages by the private ptr. */ + struct page *pages; + + /* Average packet length for mergeable receive buffers. */ + struct ewma_pkt_len mrg_avg_pkt_len; + + /* Page frag for packet buffer allocation. */ + struct page_frag alloc_frag; + + /* RX: fragments + linear part + virtio header */ + struct scatterlist sg[MAX_SKB_FRAGS + 2]; + + /* Min single buffer size for mergeable buffers case. */ + unsigned int min_buf_len; + + /* Name of this receive queue: input.$index */ + char name[16]; + + struct xdp_rxq_info xdp_rxq; + + /* Record the last dma info to free after new pages is allocated. 
*/ + struct virtnet_rq_dma *last_dma; +}; + +/* This structure can contain rss message with maximum settings for indirection table and keysize + * Note, that default structure that describes RSS configuration virtio_net_rss_config + * contains same info but can't handle table values. + * In any case, structure would be passed to virtio hw through sg_buf split by parts + * because table sizes may be differ according to the device configuration. + */ +#define VIRTIO_NET_RSS_MAX_KEY_SIZE 40 +#define VIRTIO_NET_RSS_MAX_TABLE_LEN128 +struct virtio_net_ctrl_rss { + u32 hash_types; + u16 indirection_table_mask; + u16 unclassified_queue; + u16 indirection_table[VIRTIO_NET_RSS_MAX_TABLE_LEN]; + u16 max_tx_vq; + u8 hash_key_length; + u8 key[VIRTIO_NET_RSS_MAX_KEY_SIZE]; +}; + +struct virtnet_info { + struct virtio_device *vdev; + struct virtqueue *cvq; + struct net_device *dev; + struct send_queue *sq; + struct receive_queue *rq; + unsigned int status; + + /* Ma
[PATCH net-next v2 03/12] virtio_net: add prefix virtnet to all struct inside virtio_net.h
We moved some structures to the header file, but these structures are not prefixed with virtnet. This patch adds the virtnet prefix to them. Signed-off-by: Xuan Zhuo Acked-by: Jason Wang --- drivers/net/virtio/virtnet.h | 12 ++-- drivers/net/virtio/virtnet_main.c | 110 +++--- 2 files changed, 61 insertions(+), 61 deletions(-) diff --git a/drivers/net/virtio/virtnet.h b/drivers/net/virtio/virtnet.h index e46db9491605..d4cc4ddb0786 100644 --- a/drivers/net/virtio/virtnet.h +++ b/drivers/net/virtio/virtnet.h @@ -51,8 +51,8 @@ struct virtnet_rq_dma { }; /* Internal representation of a send virtqueue */ -struct send_queue { - /* Virtqueue associated with this send _queue */ +struct virtnet_sq { + /* Virtqueue associated with this virtnet_sq */ struct virtqueue *vq; /* TX: fragments + linear part + virtio header */ @@ -72,8 +72,8 @@ struct send_queue { }; /* Internal representation of a receive virtqueue */ -struct receive_queue { - /* Virtqueue associated with this receive_queue */ +struct virtnet_rq { + /* Virtqueue associated with this virtnet_rq */ struct virtqueue *vq; struct napi_struct napi; @@ -144,8 +144,8 @@ struct virtnet_info { struct virtio_device *vdev; struct virtqueue *cvq; struct net_device *dev; - struct send_queue *sq; - struct receive_queue *rq; + struct virtnet_sq *sq; + struct virtnet_rq *rq; unsigned int status; /* Max # of queue pairs supported by the device */ diff --git a/drivers/net/virtio/virtnet_main.c b/drivers/net/virtio/virtnet_main.c index faff5d719440..fc28fadf944e 100644 --- a/drivers/net/virtio/virtnet_main.c +++ b/drivers/net/virtio/virtnet_main.c @@ -275,7 +275,7 @@ static struct xdp_frame *ptr_to_xdp(void *ptr) return (struct xdp_frame *)((unsigned long)ptr & ~VIRTIO_XDP_FLAG); } -static void __free_old_xmit(struct send_queue *sq, bool in_napi, +static void __free_old_xmit(struct virtnet_sq *sq, bool in_napi, struct virtnet_sq_free_stats *stats) { unsigned int len; @@ -344,7 +344,7 @@ skb_vnet_common_hdr(struct sk_buff *skb) * private is used 
to chain pages for big packets, put the whole * most recent used list in the beginning for reuse */ -static void give_pages(struct receive_queue *rq, struct page *page) +static void give_pages(struct virtnet_rq *rq, struct page *page) { struct page *end; @@ -354,7 +354,7 @@ static void give_pages(struct receive_queue *rq, struct page *page) rq->pages = page; } -static struct page *get_a_page(struct receive_queue *rq, gfp_t gfp_mask) +static struct page *get_a_page(struct virtnet_rq *rq, gfp_t gfp_mask) { struct page *p = rq->pages; @@ -368,7 +368,7 @@ static struct page *get_a_page(struct receive_queue *rq, gfp_t gfp_mask) } static void virtnet_rq_free_buf(struct virtnet_info *vi, - struct receive_queue *rq, void *buf) + struct virtnet_rq *rq, void *buf) { if (vi->mergeable_rx_bufs) put_page(virt_to_head_page(buf)); @@ -483,7 +483,7 @@ static struct sk_buff *virtnet_build_skb(void *buf, unsigned int buflen, /* Called from bottom half context */ static struct sk_buff *page_to_skb(struct virtnet_info *vi, - struct receive_queue *rq, + struct virtnet_rq *rq, struct page *page, unsigned int offset, unsigned int len, unsigned int truesize, unsigned int headroom) @@ -581,7 +581,7 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi, return skb; } -static void virtnet_rq_unmap(struct receive_queue *rq, void *buf, u32 len) +static void virtnet_rq_unmap(struct virtnet_rq *rq, void *buf, u32 len) { struct page *page = virt_to_head_page(buf); struct virtnet_rq_dma *dma; @@ -610,7 +610,7 @@ static void virtnet_rq_unmap(struct receive_queue *rq, void *buf, u32 len) put_page(page); } -static void *virtnet_rq_get_buf(struct receive_queue *rq, u32 *len, void **ctx) +static void *virtnet_rq_get_buf(struct virtnet_rq *rq, u32 *len, void **ctx) { void *buf; @@ -621,7 +621,7 @@ static void *virtnet_rq_get_buf(struct receive_queue *rq, u32 *len, void **ctx) return buf; } -static void virtnet_rq_init_one_sg(struct receive_queue *rq, void *buf, u32 len) +static void 
virtnet_rq_init_one_sg(struct virtnet_rq *rq, void *buf, u32 len) { struct virtnet_rq_dma *dma; dma_addr_t addr; @@ -641,7 +641,7 @@ static void virtnet_rq_init_one_sg(struct receive_queue *rq, void *buf, u32 len) rq->sg[0].length = len; } -static void *virtnet_rq_alloc(struct receive_queue *rq, u32 size, gfp_t gfp) +static void *virtnet_rq_alloc(struct virtnet_rq *rq, u32 size, gfp_t gfp)
[PATCH net-next v2 01/12] virtio_net: independent directory
Create a separate directory for virtio-net. AF_XDP support will be added
later in a separate xsk.c file, so virtio-net should have its own directory.

Signed-off-by: Xuan Zhuo
Acked-by: Jason Wang
---
 MAINTAINERS                                         |  2 +-
 drivers/net/Kconfig                                 |  9 +
 drivers/net/Makefile                                |  2 +-
 drivers/net/virtio/Kconfig                          | 12
 drivers/net/virtio/Makefile                         |  8
 drivers/net/{virtio_net.c => virtio/virtnet_main.c} |  0
 6 files changed, 23 insertions(+), 10 deletions(-)
 create mode 100644 drivers/net/virtio/Kconfig
 create mode 100644 drivers/net/virtio/Makefile
 rename drivers/net/{virtio_net.c => virtio/virtnet_main.c} (100%)

diff --git a/MAINTAINERS b/MAINTAINERS
index 27367ad339ea..e426fdbaacb8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -23776,7 +23776,7 @@ F:	Documentation/devicetree/bindings/virtio/
 F:	Documentation/driver-api/virtio/
 F:	drivers/block/virtio_blk.c
 F:	drivers/crypto/virtio/
-F:	drivers/net/virtio_net.c
+F:	drivers/net/virtio/
 F:	drivers/vdpa/
 F:	drivers/virtio/
 F:	include/linux/vdpa.h
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 9920b3a68ed1..b80793a0bd17 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -443,14 +443,7 @@ config VETH
 	  When one end receives the packet it appears on its pair and vice
 	  versa.
 
-config VIRTIO_NET
-	tristate "Virtio network driver"
-	depends on VIRTIO
-	select NET_FAILOVER
-	select DIMLIB
-	help
-	  This is the virtual network driver for virtio. It can be used with
-	  QEMU based VMMs (like KVM or Xen). Say Y or M.
+source "drivers/net/virtio/Kconfig"
 
 config NLMON
 	tristate "Virtual netlink monitoring device"
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 13743d0e83b5..505385d7f6b7 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -32,7 +32,7 @@ obj-$(CONFIG_NET_TEAM) += team/
 obj-$(CONFIG_TUN) += tun.o
 obj-$(CONFIG_TAP) += tap.o
 obj-$(CONFIG_VETH) += veth.o
-obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
+obj-$(CONFIG_VIRTIO_NET) += virtio/
 obj-$(CONFIG_VXLAN) += vxlan/
 obj-$(CONFIG_GENEVE) += geneve.o
 obj-$(CONFIG_BAREUDP) += bareudp.o
diff --git a/drivers/net/virtio/Kconfig b/drivers/net/virtio/Kconfig
new file mode 100644
index 000000000000..e162535ca213
--- /dev/null
+++ b/drivers/net/virtio/Kconfig
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# virtio-net device configuration
+#
+config VIRTIO_NET
+	tristate "Virtio network driver"
+	depends on VIRTIO
+	select NET_FAILOVER
+	select DIMLIB
+	help
+	  This is the virtual network driver for virtio. It can be used with
+	  QEMU based VMMs (like KVM or Xen). Say Y or M.
diff --git a/drivers/net/virtio/Makefile b/drivers/net/virtio/Makefile
new file mode 100644
index 000000000000..c4602337c78c
--- /dev/null
+++ b/drivers/net/virtio/Makefile
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for the virtio network device drivers.
+#
+
+obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
+
+virtio_net-y := virtnet_main.o
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio/virtnet_main.c
similarity index 100%
rename from drivers/net/virtio_net.c
rename to drivers/net/virtio/virtnet_main.c
--
2.32.0.3.g01195cf9f
[PATCH net-next v2 04/12] virtio_net: separate virtnet_rx_resize()
This patch splits two sub-functions out of virtnet_rx_resize():

* virtnet_rx_pause
* virtnet_rx_resume

The subsequent rx reset for xsk can then share these two functions.

Signed-off-by: Xuan Zhuo
Acked-by: Jason Wang
---
 drivers/net/virtio/virtnet.h      |  3 +++
 drivers/net/virtio/virtnet_main.c | 29 +
 2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/drivers/net/virtio/virtnet.h b/drivers/net/virtio/virtnet.h
index d4cc4ddb0786..b0f2ae4bd1c4 100644
--- a/drivers/net/virtio/virtnet.h
+++ b/drivers/net/virtio/virtnet.h
@@ -236,4 +236,7 @@ struct virtnet_info {
 	u64 device_stats_cap;
 };
 
+
+void virtnet_rx_pause(struct virtnet_info *vi, struct virtnet_rq *rq);
+void virtnet_rx_resume(struct virtnet_info *vi, struct virtnet_rq *rq);
 #endif
diff --git a/drivers/net/virtio/virtnet_main.c b/drivers/net/virtio/virtnet_main.c
index fc28fadf944e..a2bf576e644c 100644
--- a/drivers/net/virtio/virtnet_main.c
+++ b/drivers/net/virtio/virtnet_main.c
@@ -2378,28 +2378,41 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	return NETDEV_TX_OK;
 }
 
-static int virtnet_rx_resize(struct virtnet_info *vi,
-			     struct virtnet_rq *rq, u32 ring_num)
+void virtnet_rx_pause(struct virtnet_info *vi, struct virtnet_rq *rq)
 {
 	bool running = netif_running(vi->dev);
-	int err, qindex;
-
-	qindex = rq - vi->rq;
 
 	if (running) {
 		napi_disable(&rq->napi);
 		cancel_work_sync(&rq->dim.work);
 	}
+}
 
-	err = virtqueue_resize(rq->vq, ring_num, virtnet_rq_unmap_free_buf);
-	if (err)
-		netdev_err(vi->dev, "resize rx fail: rx queue index: %d err: %d\n", qindex, err);
+void virtnet_rx_resume(struct virtnet_info *vi, struct virtnet_rq *rq)
+{
+	bool running = netif_running(vi->dev);
 
 	if (!try_fill_recv(vi, rq, GFP_KERNEL))
 		schedule_delayed_work(&vi->refill, 0);
 
 	if (running)
 		virtnet_napi_enable(rq->vq, &rq->napi);
+}
+
+static int virtnet_rx_resize(struct virtnet_info *vi,
+			     struct virtnet_rq *rq, u32 ring_num)
+{
+	int err, qindex;
+
+	qindex = rq - vi->rq;
+
+	virtnet_rx_pause(vi, rq);
+
+	err = virtqueue_resize(rq->vq, ring_num, virtnet_rq_unmap_free_buf);
+	if (err)
+		netdev_err(vi->dev, "resize rx fail: rx queue index: %d err: %d\n", qindex, err);
+
+	virtnet_rx_resume(vi, rq);
 	return err;
 }
 
--
2.32.0.3.g01195cf9f
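The pause / operate / resume split made by this patch can be modelled in user space. This is a sketch under stated assumptions: the `sketch_*` names and the struct fields are hypothetical, NAPI and the refill path are reduced to flags and counters, and a zero ring size stands in for a failing virtqueue_resize(). It only shows the shape of the control flow, including that resume runs even when the resize fails.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified receive queue. */
struct sketch_rq {
	bool dev_running;	/* stands in for netif_running(vi->dev) */
	bool napi_enabled;
	unsigned int ring_num;
	unsigned int filled;	/* buffers posted to the ring */
};

static void sketch_rx_pause(struct sketch_rq *rq)
{
	if (rq->dev_running)
		rq->napi_enabled = false;
}

static void sketch_rx_resume(struct sketch_rq *rq)
{
	rq->filled = rq->ring_num;	/* stands in for refilling the ring */
	if (rq->dev_running)
		rq->napi_enabled = true;
}

/* Resize expressed with the two helpers; another queue operation could
 * reuse the same pause/resume pair around its own middle step.
 */
static int sketch_rx_resize(struct sketch_rq *rq, unsigned int ring_num)
{
	int err = 0;

	sketch_rx_pause(rq);

	if (ring_num == 0)
		err = -1;	/* hypothetical failure of the resize step */
	else
		rq->ring_num = ring_num;

	sketch_rx_resume(rq);
	return err;
}
```

In the model, a failed resize still leaves the queue refilled and re-enabled at its previous size, which matches the unconditional virtnet_rx_resume() call in the patch.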
[PATCH net-next v2 05/12] virtio_net: separate virtnet_tx_resize()
This patch splits two sub-functions out of virtnet_tx_resize():

* virtnet_tx_pause
* virtnet_tx_resume

The subsequent virtnet_tx_reset() can then share these two functions.

Signed-off-by: Xuan Zhuo
Acked-by: Jason Wang
---
 drivers/net/virtio/virtnet.h      |  2 ++
 drivers/net/virtio/virtnet_main.c | 35 +--
 2 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/drivers/net/virtio/virtnet.h b/drivers/net/virtio/virtnet.h
index b0f2ae4bd1c4..b56ebc7fcdcc 100644
--- a/drivers/net/virtio/virtnet.h
+++ b/drivers/net/virtio/virtnet.h
@@ -239,4 +239,6 @@ struct virtnet_info {
 
 void virtnet_rx_pause(struct virtnet_info *vi, struct virtnet_rq *rq);
 void virtnet_rx_resume(struct virtnet_info *vi, struct virtnet_rq *rq);
+void virtnet_tx_pause(struct virtnet_info *vi, struct virtnet_sq *sq);
+void virtnet_tx_resume(struct virtnet_info *vi, struct virtnet_sq *sq);
 #endif
diff --git a/drivers/net/virtio/virtnet_main.c b/drivers/net/virtio/virtnet_main.c
index a2bf576e644c..285443da040c 100644
--- a/drivers/net/virtio/virtnet_main.c
+++ b/drivers/net/virtio/virtnet_main.c
@@ -2416,12 +2416,11 @@ static int virtnet_rx_resize(struct virtnet_info *vi,
 	return err;
 }
 
-static int virtnet_tx_resize(struct virtnet_info *vi,
-			     struct virtnet_sq *sq, u32 ring_num)
+void virtnet_tx_pause(struct virtnet_info *vi, struct virtnet_sq *sq)
 {
 	bool running = netif_running(vi->dev);
 	struct netdev_queue *txq;
-	int err, qindex;
+	int qindex;
 
 	qindex = sq - vi->sq;
 
@@ -2442,10 +2441,17 @@ static int virtnet_tx_resize(struct virtnet_info *vi,
 	netif_stop_subqueue(vi->dev, qindex);
 
 	__netif_tx_unlock_bh(txq);
+}
 
-	err = virtqueue_resize(sq->vq, ring_num, virtnet_sq_free_unused_buf);
-	if (err)
-		netdev_err(vi->dev, "resize tx fail: tx queue index: %d err: %d\n", qindex, err);
+void virtnet_tx_resume(struct virtnet_info *vi, struct virtnet_sq *sq)
+{
+	bool running = netif_running(vi->dev);
+	struct netdev_queue *txq;
+	int qindex;
+
+	qindex = sq - vi->sq;
+
+	txq = netdev_get_tx_queue(vi->dev, qindex);
 
 	__netif_tx_lock_bh(txq);
 	sq->reset = false;
@@ -2454,6 +2460,23 @@ static int virtnet_tx_resize(struct virtnet_info *vi,
 
 	if (running)
 		virtnet_napi_tx_enable(vi, sq->vq, &sq->napi);
+}
+
+static int virtnet_tx_resize(struct virtnet_info *vi, struct virtnet_sq *sq,
+			     u32 ring_num)
+{
+	int qindex, err;
+
+	qindex = sq - vi->sq;
+
+	virtnet_tx_pause(vi, sq);
+
+	err = virtqueue_resize(sq->vq, ring_num, virtnet_sq_free_unused_buf);
+	if (err)
+		netdev_err(vi->dev, "resize tx fail: tx queue index: %d err: %d\n", qindex, err);
+
+	virtnet_tx_resume(vi, sq);
+
 	return err;
 }
 
--
2.32.0.3.g01195cf9f
[PATCH net-next v2 06/12] virtio_net: separate receive_mergeable
This commit separates the function receive_mergeable(), put the logic of
appending frag to the skb as an independent function. The subsequent commit
will reuse it.

Signed-off-by: Xuan Zhuo
Acked-by: Jason Wang
---
 drivers/net/virtio/virtnet.h      |  4 ++
 drivers/net/virtio/virtnet_main.c | 77 +++
 2 files changed, 51 insertions(+), 30 deletions(-)

diff --git a/drivers/net/virtio/virtnet.h b/drivers/net/virtio/virtnet.h
index b56ebc7fcdcc..c6ef54160ddc 100644
--- a/drivers/net/virtio/virtnet.h
+++ b/drivers/net/virtio/virtnet.h
@@ -241,4 +241,8 @@ void virtnet_rx_pause(struct virtnet_info *vi, struct virtnet_rq *rq);
 void virtnet_rx_resume(struct virtnet_info *vi, struct virtnet_rq *rq);
 void virtnet_tx_pause(struct virtnet_info *vi, struct virtnet_sq *sq);
 void virtnet_tx_resume(struct virtnet_info *vi, struct virtnet_sq *sq);
+struct sk_buff *virtnet_skb_append_frag(struct sk_buff *head_skb,
+					struct sk_buff *curr_skb,
+					struct page *page, void *buf,
+					int len, int truesize);
 #endif
diff --git a/drivers/net/virtio/virtnet_main.c b/drivers/net/virtio/virtnet_main.c
index 285443da040c..6cc99d9b768b 100644
--- a/drivers/net/virtio/virtnet_main.c
+++ b/drivers/net/virtio/virtnet_main.c
@@ -1557,6 +1557,49 @@ static struct sk_buff *receive_mergeable_xdp(struct net_device *dev,
 	return NULL;
 }

+struct sk_buff *virtnet_skb_append_frag(struct sk_buff *head_skb,
+					struct sk_buff *curr_skb,
+					struct page *page, void *buf,
+					int len, int truesize)
+{
+	int num_skb_frags;
+	int offset;
+
+	num_skb_frags = skb_shinfo(curr_skb)->nr_frags;
+	if (unlikely(num_skb_frags == MAX_SKB_FRAGS)) {
+		struct sk_buff *nskb = alloc_skb(0, GFP_ATOMIC);
+
+		if (unlikely(!nskb))
+			return NULL;
+
+		if (curr_skb == head_skb)
+			skb_shinfo(curr_skb)->frag_list = nskb;
+		else
+			curr_skb->next = nskb;
+		curr_skb = nskb;
+		head_skb->truesize += nskb->truesize;
+		num_skb_frags = 0;
+	}
+
+	if (curr_skb != head_skb) {
+		head_skb->data_len += len;
+		head_skb->len += len;
+		head_skb->truesize += truesize;
+	}
+
+	offset = buf - page_address(page);
+	if (skb_can_coalesce(curr_skb, num_skb_frags, page, offset)) {
+		put_page(page);
+		skb_coalesce_rx_frag(curr_skb, num_skb_frags - 1,
+				     len, truesize);
+	} else {
+		skb_add_rx_frag(curr_skb, num_skb_frags, page,
+				offset, len, truesize);
+	}
+
+	return curr_skb;
+}
+
 static struct sk_buff *receive_mergeable(struct net_device *dev,
 					 struct virtnet_info *vi,
 					 struct virtnet_rq *rq,
@@ -1606,8 +1649,6 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 	if (unlikely(!curr_skb))
 		goto err_skb;
 	while (--num_buf) {
-		int num_skb_frags;
-
 		buf = virtnet_rq_get_buf(rq, &len, &ctx);
 		if (unlikely(!buf)) {
 			pr_debug("%s: rx error: %d buffers out of %d missing\n",
@@ -1632,34 +1673,10 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 			goto err_skb;
 		}

-		num_skb_frags = skb_shinfo(curr_skb)->nr_frags;
-		if (unlikely(num_skb_frags == MAX_SKB_FRAGS)) {
-			struct sk_buff *nskb = alloc_skb(0, GFP_ATOMIC);
-
-			if (unlikely(!nskb))
-				goto err_skb;
-			if (curr_skb == head_skb)
-				skb_shinfo(curr_skb)->frag_list = nskb;
-			else
-				curr_skb->next = nskb;
-			curr_skb = nskb;
-			head_skb->truesize += nskb->truesize;
-			num_skb_frags = 0;
-		}
-		if (curr_skb != head_skb) {
-			head_skb->data_len += len;
-			head_skb->len += len;
-			head_skb->truesize += truesize;
-		}
-		offset = buf - page_address(page);
-		if (skb_can_coalesce(curr_skb, num_skb_frags, page, offset)) {
-			put_page(page);
-			skb_coalesce_rx_frag(curr_skb, num_skb_frags - 1,
-					     len, truesize);
-		} else {
-			skb_add_rx_frag(curr_skb, num_skb_frags, page,
-
[PATCH net-next v2 07/12] virtio_net: separate receive_buf
This commit separates the function receive_buf(), then we wrap the logic of
handling the skb to an independent function virtnet_receive_done(). The
subsequent commit will reuse it.

Signed-off-by: Xuan Zhuo
Acked-by: Jason Wang
---
 drivers/net/virtio/virtnet_main.c | 56 ++-
 1 file changed, 32 insertions(+), 24 deletions(-)

diff --git a/drivers/net/virtio/virtnet_main.c b/drivers/net/virtio/virtnet_main.c
index 6cc99d9b768b..68b90ee788bd 100644
--- a/drivers/net/virtio/virtnet_main.c
+++ b/drivers/net/virtio/virtnet_main.c
@@ -1721,32 +1721,11 @@ static void virtio_skb_set_hash(const struct virtio_net_hdr_v1_hash *hdr_hash,
 	skb_set_hash(skb, __le32_to_cpu(hdr_hash->hash_value), rss_hash_type);
 }

-static void receive_buf(struct virtnet_info *vi, struct virtnet_rq *rq,
-			void *buf, unsigned int len, void **ctx,
-			unsigned int *xdp_xmit,
-			struct virtnet_rq_stats *stats)
+static void virtnet_receive_done(struct virtnet_info *vi, struct virtnet_rq *rq,
+				 struct sk_buff *skb)
 {
-	struct net_device *dev = vi->dev;
-	struct sk_buff *skb;
 	struct virtio_net_common_hdr *hdr;
-
-	if (unlikely(len < vi->hdr_len + ETH_HLEN)) {
-		pr_debug("%s: short packet %i\n", dev->name, len);
-		DEV_STATS_INC(dev, rx_length_errors);
-		virtnet_rq_free_buf(vi, rq, buf);
-		return;
-	}
-
-	if (vi->mergeable_rx_bufs)
-		skb = receive_mergeable(dev, vi, rq, buf, ctx, len, xdp_xmit,
-					stats);
-	else if (vi->big_packets)
-		skb = receive_big(dev, vi, rq, buf, len, stats);
-	else
-		skb = receive_small(dev, vi, rq, buf, ctx, len, xdp_xmit, stats);
-
-	if (unlikely(!skb))
-		return;
+	struct net_device *dev = vi->dev;

 	hdr = skb_vnet_common_hdr(skb);
 	if (dev->features & NETIF_F_RXHASH && vi->has_rss_hash_report)
@@ -1776,6 +1755,35 @@ static void receive_buf(struct virtnet_info *vi, struct virtnet_rq *rq,
 	dev_kfree_skb(skb);
 }

+static void receive_buf(struct virtnet_info *vi, struct virtnet_rq *rq,
+			void *buf, unsigned int len, void **ctx,
+			unsigned int *xdp_xmit,
+			struct virtnet_rq_stats *stats)
+{
+	struct net_device *dev = vi->dev;
+	struct sk_buff *skb;
+
+	if (unlikely(len < vi->hdr_len + ETH_HLEN)) {
+		pr_debug("%s: short packet %i\n", dev->name, len);
+		DEV_STATS_INC(dev, rx_length_errors);
+		virtnet_rq_free_buf(vi, rq, buf);
+		return;
+	}
+
+	if (vi->mergeable_rx_bufs)
+		skb = receive_mergeable(dev, vi, rq, buf, ctx, len, xdp_xmit,
+					stats);
+	else if (vi->big_packets)
+		skb = receive_big(dev, vi, rq, buf, len, stats);
+	else
+		skb = receive_small(dev, vi, rq, buf, ctx, len, xdp_xmit, stats);
+
+	if (unlikely(!skb))
+		return;
+
+	virtnet_receive_done(vi, rq, skb);
+}
+
 /* Unlike mergeable buffers, all buffers are allocated to the
  * same size, except for the headroom. For this reason we do
  * not need to use mergeable_len_to_ctx here - it is enough
--
2.32.0.3.g01195cf9f
[PATCH net-next v2 08/12] virtio_ring: introduce vring_need_unmap_buffer
To make the code readable, introduce vring_need_unmap_buffer() to
replace do_unmap.

   use_dma_api premapped -> vring_need_unmap_buffer()
1. false       false        false
2. true        false        true
3. true        true         false

Signed-off-by: Xuan Zhuo
Acked-by: Jason Wang
---
 drivers/virtio/virtio_ring.c | 27 ---
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 2a972752ff1b..df8eb0521aa0 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -175,11 +175,6 @@ struct vring_virtqueue {
 	/* Do DMA mapping by driver */
 	bool premapped;

-	/* Do unmap or not for desc. Just when premapped is False and
-	 * use_dma_api is true, this is true.
-	 */
-	bool do_unmap;
-
 	/* Head of free buffer list. */
 	unsigned int free_head;
 	/* Number we've added since last sync. */
@@ -297,6 +292,11 @@ static bool vring_use_dma_api(const struct virtio_device *vdev)
 	return false;
 }

+static bool vring_need_unmap_buffer(const struct vring_virtqueue *vring)
+{
+	return vring->use_dma_api && !vring->premapped;
+}
+
 size_t virtio_max_dma_size(const struct virtio_device *vdev)
 {
 	size_t max_segment_size = SIZE_MAX;
@@ -445,7 +445,7 @@ static void vring_unmap_one_split_indirect(const struct vring_virtqueue *vq,
 {
 	u16 flags;

-	if (!vq->do_unmap)
+	if (!vring_need_unmap_buffer(vq))
 		return;

 	flags = virtio16_to_cpu(vq->vq.vdev, desc->flags);
@@ -475,7 +475,7 @@ static unsigned int vring_unmap_one_split(const struct vring_virtqueue *vq,
 				 (flags & VRING_DESC_F_WRITE) ?
 				 DMA_FROM_DEVICE : DMA_TO_DEVICE);
 	} else {
-		if (!vq->do_unmap)
+		if (!vring_need_unmap_buffer(vq))
 			goto out;

 		dma_unmap_page(vring_dma_dev(vq),
@@ -643,7 +643,7 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
 	}
 	/* Last one doesn't continue. */
 	desc[prev].flags &= cpu_to_virtio16(_vq->vdev, ~VRING_DESC_F_NEXT);
-	if (!indirect && vq->do_unmap)
+	if (!indirect && vring_need_unmap_buffer(vq))
 		vq->split.desc_extra[prev & (vq->split.vring.num - 1)].flags &=
 			~VRING_DESC_F_NEXT;

@@ -802,7 +802,7 @@ static void detach_buf_split(struct vring_virtqueue *vq, unsigned int head,
 				VRING_DESC_F_INDIRECT));
 		BUG_ON(len == 0 || len % sizeof(struct vring_desc));

-		if (vq->do_unmap) {
+		if (vring_need_unmap_buffer(vq)) {
 			for (j = 0; j < len / sizeof(struct vring_desc); j++)
 				vring_unmap_one_split_indirect(vq, &indir_desc[j]);
 		}
@@ -1236,7 +1236,7 @@ static void vring_unmap_extra_packed(const struct vring_virtqueue *vq,
 				 (flags & VRING_DESC_F_WRITE) ?
 				 DMA_FROM_DEVICE : DMA_TO_DEVICE);
 	} else {
-		if (!vq->do_unmap)
+		if (!vring_need_unmap_buffer(vq))
 			return;

 		dma_unmap_page(vring_dma_dev(vq),
@@ -1251,7 +1251,7 @@ static void vring_unmap_desc_packed(const struct vring_virtqueue *vq,
 {
 	u16 flags;

-	if (!vq->do_unmap)
+	if (!vring_need_unmap_buffer(vq))
 		return;

 	flags = le16_to_cpu(desc->flags);
@@ -1632,7 +1632,7 @@ static void detach_buf_packed(struct vring_virtqueue *vq,
 		if (!desc)
 			return;

-		if (vq->do_unmap) {
+		if (vring_need_unmap_buffer(vq)) {
 			len = vq->packed.desc_extra[id].len;
 			for (i = 0; i < len / sizeof(struct vring_packed_desc);
 					i++)
@@ -2091,7 +2091,6 @@ static struct virtqueue *vring_create_virtqueue_packed(
 	vq->dma_dev = dma_dev;
 	vq->use_dma_api = vring_use_dma_api(vdev);
 	vq->premapped = false;
-	vq->do_unmap = vq->use_dma_api;

 	vq->indirect = virtio_has_feature(vdev, VIRTIO_RING_F_INDIRECT_DESC) &&
 		!context;
@@ -2636,7 +2635,6 @@ static struct virtqueue *__vring_new_virtqueue(unsigned int index,
 	vq->dma_dev = dma_dev;
 	vq->use_dma_api = vring_use_dma_api(vdev);
 	vq->premapped = false;
-	vq->do_unmap = vq->use_dma_api;

 	vq->indirect = virtio_has_feature(vdev, VIRTIO_RING_F_INDIRECT_DESC) &&
 		!context;
@@ -2799,7 +2797,6 @@ int virtqueue_set_dma_premapped(struct virtqueue *_vq)
 	}

 	vq->premapped = true;
-	vq->do_unmap = false;

 	END_USE(vq);

--
2.32.0.3.g01195cf9f
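
The table in the commit message for patch 08 can be checked with a small user-space sketch. The struct below is a hypothetical stand-in that carries only the two flags the predicate reads; it is not the kernel's struct vring_virtqueue. Only the boolean expression is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two fields of struct vring_virtqueue
 * that the predicate reads. Not the kernel structure. */
struct vq_flags {
	bool use_dma_api;
	bool premapped;
};

/* Same expression as vring_need_unmap_buffer() in the patch. */
static bool need_unmap_buffer(const struct vq_flags *vq)
{
	return vq->use_dma_api && !vq->premapped;
}

/* Walks the three rows listed in the commit message. */
static void check_commit_table(void)
{
	struct vq_flags row1 = { .use_dma_api = false, .premapped = false };
	struct vq_flags row2 = { .use_dma_api = true,  .premapped = false };
	struct vq_flags row3 = { .use_dma_api = true,  .premapped = true  };

	assert(!need_unmap_buffer(&row1));
	assert(need_unmap_buffer(&row2));
	assert(!need_unmap_buffer(&row3));
}
```

Deriving the value on each call means there is no cached do_unmap field to keep in sync when premapped changes later. The table omits the fourth combination (use_dma_api false, premapped true), for which the expression also evaluates to false.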
[PATCH net-next v2 09/12] virtio_ring: introduce dma map api for page
The virtio-net sq will use these APIs to map the scatterlist.

dma_addr_t virtqueue_dma_map_page_attrs(struct virtqueue *_vq, struct page *page,
                                       size_t offset, size_t size,
                                       enum dma_data_direction dir,
                                       unsigned long attrs);
void virtqueue_dma_unmap_page_attrs(struct virtqueue *_vq, dma_addr_t addr,
                                    size_t size, enum dma_data_direction dir,
                                    unsigned long attrs);

Signed-off-by: Xuan Zhuo
---
 drivers/virtio/virtio_ring.c | 52
 include/linux/virtio.h       |  7 +
 2 files changed, 59 insertions(+)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index df8eb0521aa0..acb6dba4bb55 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -3149,6 +3149,58 @@ void virtqueue_dma_unmap_single_attrs(struct virtqueue *_vq, dma_addr_t addr,
 }
 EXPORT_SYMBOL_GPL(virtqueue_dma_unmap_single_attrs);

+/**
+ * virtqueue_dma_map_page_attrs - map DMA for _vq
+ * @_vq: the struct virtqueue we're talking about.
+ * @page: the page to do dma
+ * @offset: the offset inside the page
+ * @size: the size of the page to do dma
+ * @dir: DMA direction
+ * @attrs: DMA Attrs
+ *
+ * The caller calls this to do dma mapping in advance. The DMA address can be
+ * passed to this _vq when it is in pre-mapped mode.
+ *
+ * return DMA address. Caller should check that by virtqueue_dma_mapping_error().
+ */
+dma_addr_t virtqueue_dma_map_page_attrs(struct virtqueue *_vq, struct page *page,
+					size_t offset, size_t size,
+					enum dma_data_direction dir,
+					unsigned long attrs)
+{
+	struct vring_virtqueue *vq = to_vvq(_vq);
+
+	if (!vq->use_dma_api)
+		return page_to_phys(page) + offset;
+
+	return dma_map_page_attrs(vring_dma_dev(vq), page, offset, size, dir, attrs);
+}
+EXPORT_SYMBOL_GPL(virtqueue_dma_map_page_attrs);
+
+/**
+ * virtqueue_dma_unmap_page_attrs - unmap DMA for _vq
+ * @_vq: the struct virtqueue we're talking about.
+ * @addr: the dma address to unmap
+ * @size: the size of the buffer
+ * @dir: DMA direction
+ * @attrs: DMA Attrs
+ *
+ * Unmap the address that is mapped by the virtqueue_dma_map_* APIs.
+ *
+ */
+void virtqueue_dma_unmap_page_attrs(struct virtqueue *_vq, dma_addr_t addr,
+				    size_t size, enum dma_data_direction dir,
+				    unsigned long attrs)
+{
+	struct vring_virtqueue *vq = to_vvq(_vq);
+
+	if (!vq->use_dma_api)
+		return;
+
+	dma_unmap_page_attrs(vring_dma_dev(vq), addr, size, dir, attrs);
+}
+EXPORT_SYMBOL_GPL(virtqueue_dma_unmap_page_attrs);
+
 /**
  * virtqueue_dma_mapping_error - check dma address
  * @_vq: the struct virtqueue we're talking about.
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 96fea920873b..ca318a66a7e1 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -234,6 +234,13 @@ dma_addr_t virtqueue_dma_map_single_attrs(struct virtqueue *_vq, void *ptr, size
 void virtqueue_dma_unmap_single_attrs(struct virtqueue *_vq, dma_addr_t addr,
 				      size_t size, enum dma_data_direction dir,
 				      unsigned long attrs);
+dma_addr_t virtqueue_dma_map_page_attrs(struct virtqueue *_vq, struct page *page,
+					size_t offset, size_t size,
+					enum dma_data_direction dir,
+					unsigned long attrs);
+void virtqueue_dma_unmap_page_attrs(struct virtqueue *_vq, dma_addr_t addr,
+				    size_t size, enum dma_data_direction dir,
+				    unsigned long attrs);
 int virtqueue_dma_mapping_error(struct virtqueue *_vq, dma_addr_t addr);

 bool virtqueue_dma_need_sync(struct virtqueue *_vq, dma_addr_t addr);
--
2.32.0.3.g01195cf9f
[PATCH net-next v2 10/12] virtio_ring: introduce virtqueue_dma_map_sg_attrs
Introduce a helper to do dma map for scatterlist. That can be used by
other drivers.

Signed-off-by: Xuan Zhuo
---
 drivers/virtio/virtio_ring.c | 32
 include/linux/virtio.h       |  3 +++
 2 files changed, 35 insertions(+)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index acb6dba4bb55..cdcd8ae63c71 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -3219,6 +3219,38 @@ int virtqueue_dma_mapping_error(struct virtqueue *_vq, dma_addr_t addr)
 }
 EXPORT_SYMBOL_GPL(virtqueue_dma_mapping_error);

+/**
+ * virtqueue_dma_map_sg_attrs - map scatterlist addr DMA for _vq
+ * @_vq: the struct virtqueue we're talking about.
+ * @sg: the scatterlist to do dma
+ * @dir: DMA direction
+ * @attrs: DMA Attrs
+ *
+ * The caller calls this to do dma mapping in advance. The sg can be
+ * passed to this _vq when it is in pre-mapped mode.
+ *
+ * Returns zero or a negative error.
+ *   0: success
+ *   -ENOMEM: dma map error
+ */
+int virtqueue_dma_map_sg_attrs(struct virtqueue *_vq, struct scatterlist *sg,
+			       enum dma_data_direction dir, unsigned long attrs)
+{
+	dma_addr_t addr;
+	int err;
+
+	addr = virtqueue_dma_map_page_attrs(_vq, sg_page(sg), sg->offset,
+					    sg->length, dir, attrs);
+	err = virtqueue_dma_mapping_error(_vq, addr);
+	if (err)
+		return err;
+
+	sg->dma_address = addr;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(virtqueue_dma_map_sg_attrs);
+
 /**
  * virtqueue_dma_need_sync - check a dma address needs sync
  * @_vq: the struct virtqueue we're talking about.
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index ca318a66a7e1..6e57098c457e 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -243,6 +243,9 @@ void virtqueue_dma_unmap_page_attrs(struct virtqueue *_vq, dma_addr_t addr,
 				    unsigned long attrs);
 int virtqueue_dma_mapping_error(struct virtqueue *_vq, dma_addr_t addr);

+int virtqueue_dma_map_sg_attrs(struct virtqueue *_vq, struct scatterlist *sg,
+			       enum dma_data_direction dir, unsigned long attrs);
+
 bool virtqueue_dma_need_sync(struct virtqueue *_vq, dma_addr_t addr);
 void virtqueue_dma_sync_single_range_for_cpu(struct virtqueue *_vq, dma_addr_t addr,
 					     unsigned long offset, size_t size,
--
2.32.0.3.g01195cf9f
[PATCH net-next v2 11/12] virtio_ring: virtqueue_set_dma_premapped() support to disable
The virtio-net sq will only enable premapped mode when the sq is bound to
af-xdp.

So we need the helper (virtqueue_set_dma_premapped) to enable premapped
mode when af-xdp binds to the sq, and to disable it when af-xdp unbinds
from the sq.

Signed-off-by: Xuan Zhuo
---
 drivers/net/virtio/virtnet_main.c | 2 +-
 drivers/virtio/virtio_ring.c      | 7 ---
 include/linux/virtio.h            | 2 +-
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/net/virtio/virtnet_main.c b/drivers/net/virtio/virtnet_main.c
index 68b90ee788bd..60ef7bb2228d 100644
--- a/drivers/net/virtio/virtnet_main.c
+++ b/drivers/net/virtio/virtnet_main.c
@@ -707,7 +707,7 @@ static void virtnet_rq_set_premapped(struct virtnet_info *vi)

 	for (i = 0; i < vi->max_queue_pairs; i++)
 		/* error should never happen */
-		BUG_ON(virtqueue_set_dma_premapped(vi->rq[i].vq));
+		BUG_ON(virtqueue_set_dma_premapped(vi->rq[i].vq, true));
 }

 static void virtnet_rq_unmap_free_buf(struct virtqueue *vq, void *buf)
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index cdcd8ae63c71..37c9c5b55864 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2764,8 +2764,9 @@ EXPORT_SYMBOL_GPL(virtqueue_resize);
 /**
  * virtqueue_set_dma_premapped - set the vring premapped mode
  * @_vq: the struct virtqueue we're talking about.
+ * @premapped: bool enable/disable the premapped mode
  *
- * Enable the premapped mode of the vq.
+ * Enable/disable the premapped mode of the vq.
  *
  * The vring in premapped mode does not do dma internally, so the driver must
  * do dma mapping in advance. The driver must pass the dma_address through
@@ -2782,7 +2783,7 @@ EXPORT_SYMBOL_GPL(virtqueue_resize);
  * 0: success.
  * -EINVAL: too late to enable premapped mode, the vq already contains buffers.
  */
-int virtqueue_set_dma_premapped(struct virtqueue *_vq)
+int virtqueue_set_dma_premapped(struct virtqueue *_vq, bool premapped)
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);
 	u32 num;
@@ -2796,7 +2797,7 @@ int virtqueue_set_dma_premapped(struct virtqueue *_vq)
 		return -EINVAL;
 	}

-	vq->premapped = true;
+	vq->premapped = premapped;

 	END_USE(vq);

diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 6e57098c457e..38e18a764573 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -81,7 +81,7 @@ bool virtqueue_enable_cb(struct virtqueue *vq);

 unsigned virtqueue_enable_cb_prepare(struct virtqueue *vq);

-int virtqueue_set_dma_premapped(struct virtqueue *_vq);
+int virtqueue_set_dma_premapped(struct virtqueue *_vq, bool premapped);

 bool virtqueue_poll(struct virtqueue *vq, unsigned);

--
2.32.0.3.g01195cf9f
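
The contract documented in patch 11 (the mode can only be switched while the vq holds no buffers, otherwise -EINVAL) can be modelled in user space. This is only a sketch: struct model_vq and model_set_dma_premapped() are hypothetical names, and comparing the ring size with the free-descriptor count is an assumption about how "the vq already contains buffers" is detected, since that part of the function lies outside the hunks shown.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the state the helper consults. */
struct model_vq {
	unsigned int num;      /* ring size */
	unsigned int num_free; /* descriptors currently free */
	bool premapped;
};

/* Models virtqueue_set_dma_premapped(_vq, premapped) after the patch:
 * the flag is set to the requested value instead of always to true. */
static int model_set_dma_premapped(struct model_vq *vq, bool premapped)
{
	/* Sanity check for the model itself. */
	assert(vq->num_free <= vq->num);

	/* Assumed emptiness test: every descriptor must be free. */
	if (vq->num != vq->num_free)
		return -EINVAL;

	vq->premapped = premapped;
	return 0;
}
```

Under this model, a caller that wants to toggle the mode on a live queue first has to drain it. That may be why the series exports the tx pause/resume helpers, but the patches shown here do not yet contain such a caller.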
[PATCH net-next v2 12/12] virtio_net: refactor the xmit type
Because af-xdp and the sq premapped mode will introduce two new xmit
types, refactor the xmit type mechanism first.

Then the two new xmit types can be added simply by adding two new enum
values.

We can use the last two bits of the pointer to distinguish the xmit type,
so we can distinguish four xmit types. Now we have two xmit types: SKB
and XDP.

Signed-off-by: Xuan Zhuo
---
 drivers/net/virtio/virtnet_main.c | 58 +--
 1 file changed, 40 insertions(+), 18 deletions(-)

diff --git a/drivers/net/virtio/virtnet_main.c b/drivers/net/virtio/virtnet_main.c
index 60ef7bb2228d..3ec821106c1d 100644
--- a/drivers/net/virtio/virtnet_main.c
+++ b/drivers/net/virtio/virtnet_main.c
@@ -47,8 +47,6 @@ module_param(napi_tx, bool, 0644);
 #define VIRTIO_XDP_TX		BIT(0)
 #define VIRTIO_XDP_REDIR	BIT(1)

-#define VIRTIO_XDP_FLAG	BIT(0)
-
 #define VIRTNET_DRIVER_VERSION "1.0.0"

 static const unsigned long guest_offloads[] = {
@@ -260,42 +258,62 @@ struct virtio_net_common_hdr {

 static void virtnet_sq_free_unused_buf(struct virtqueue *vq, void *buf);

-static bool is_xdp_frame(void *ptr)
+enum virtnet_xmit_type {
+	VIRTNET_XMIT_TYPE_SKB,
+	VIRTNET_XMIT_TYPE_XDP,
+};
+
+#define VIRTNET_XMIT_TYPE_MASK (VIRTNET_XMIT_TYPE_SKB | VIRTNET_XMIT_TYPE_XDP)
+
+static enum virtnet_xmit_type virtnet_xmit_ptr_strip(void **ptr)
 {
-	return (unsigned long)ptr & VIRTIO_XDP_FLAG;
+	unsigned long p = (unsigned long)*ptr;
+
+	*ptr = (void *)(p & ~VIRTNET_XMIT_TYPE_MASK);
+
+	return p & VIRTNET_XMIT_TYPE_MASK;
 }

-static void *xdp_to_ptr(struct xdp_frame *ptr)
+static void *virtnet_xmit_ptr_mix(void *ptr, enum virtnet_xmit_type type)
 {
-	return (void *)((unsigned long)ptr | VIRTIO_XDP_FLAG);
+	return (void *)((unsigned long)ptr | type);
 }

-static struct xdp_frame *ptr_to_xdp(void *ptr)
+static int virtnet_add_outbuf(struct virtnet_sq *sq, int num, void *data,
+			      enum virtnet_xmit_type type)
 {
-	return (struct xdp_frame *)((unsigned long)ptr & ~VIRTIO_XDP_FLAG);
+	return virtqueue_add_outbuf(sq->vq, sq->sg, num,
+				    virtnet_xmit_ptr_mix(data, type),
+				    GFP_ATOMIC);
 }

 static void __free_old_xmit(struct virtnet_sq *sq, bool in_napi,
 			    struct virtnet_sq_free_stats *stats)
 {
+	struct xdp_frame *frame;
+	struct sk_buff *skb;
 	unsigned int len;
 	void *ptr;

 	while ((ptr = virtqueue_get_buf(sq->vq, &len)) != NULL) {
 		++stats->packets;

-		if (!is_xdp_frame(ptr)) {
-			struct sk_buff *skb = ptr;
+		switch (virtnet_xmit_ptr_strip(&ptr)) {
+		case VIRTNET_XMIT_TYPE_SKB:
+			skb = ptr;

 			pr_debug("Sent skb %p\n", skb);

 			stats->bytes += skb->len;
 			napi_consume_skb(skb, in_napi);
-		} else {
-			struct xdp_frame *frame = ptr_to_xdp(ptr);
+			break;
+
+		case VIRTNET_XMIT_TYPE_XDP:
+			frame = ptr;

 			stats->bytes += xdp_get_frame_len(frame);
 			xdp_return_frame(frame);
+			break;
 		}
 	}
 }
@@ -833,8 +851,7 @@ static int __virtnet_xdp_xmit_one(struct virtnet_info *vi,
 					    skb_frag_size(frag), skb_frag_off(frag));
 	}

-	err = virtqueue_add_outbuf(sq->vq, sq->sg, nr_frags + 1,
-				   xdp_to_ptr(xdpf), GFP_ATOMIC);
+	err = virtnet_add_outbuf(sq, nr_frags + 1, xdpf, VIRTNET_XMIT_TYPE_XDP);
 	if (unlikely(err))
 		return -ENOSPC; /* Caller handle free/refcnt */

@@ -2343,7 +2360,7 @@ static int xmit_skb(struct virtnet_sq *sq, struct sk_buff *skb)
 			return num_sg;
 		num_sg++;
 	}
-	return virtqueue_add_outbuf(sq->vq, sq->sg, num_sg, skb, GFP_ATOMIC);
+	return virtnet_add_outbuf(sq, num_sg, skb, VIRTNET_XMIT_TYPE_SKB);
 }

 static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
@@ -5051,10 +5068,15 @@ static void free_receive_page_frags(struct virtnet_info *vi)

 static void virtnet_sq_free_unused_buf(struct virtqueue *vq, void *buf)
 {
-	if (!is_xdp_frame(buf))
+	switch (virtnet_xmit_ptr_strip(&buf)) {
+	case VIRTNET_XMIT_TYPE_SKB:
 		dev_kfree_skb(buf);
-	else
-		xdp_return_frame(ptr_to_xdp(buf));
+		break;
+
+	case VIRTNET_XMIT_TYPE_XDP:
+		xdp_return_frame(buf);
+		break;
+	}
 }

 static void free_unused_bufs(struct virtnet_info *vi)
--
2.32.0.3.g01195cf9f
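
The pointer-tagging scheme described in patch 12 can be tried outside the kernel with the minimal sketch below. The names are shortened, hypothetical counterparts of the helpers in the diff. One deliberate difference: the patch builds its mask by OR-ing the two existing enum values (0 | 1, which is 1), whereas this sketch assumes a fixed two-bit mask of 0x3, which is what four distinct types would need. It also assumes the tagged objects are at least 4-byte aligned, so the low two bits of the address are always zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical counterpart of enum virtnet_xmit_type. */
enum xmit_type {
	XMIT_TYPE_SKB = 0,
	XMIT_TYPE_XDP = 1,
	/* values 2 and 3 would still fit in the same two bits */
};

/* Assumption: a full two-bit mask rather than the OR of enum values. */
#define XMIT_TYPE_MASK ((uintptr_t)0x3)

/* Store the type in the low bits of an aligned pointer. */
static void *xmit_ptr_mix(void *ptr, enum xmit_type type)
{
	/* The low bits must be free, i.e. the object is 4-byte aligned. */
	assert(((uintptr_t)ptr & XMIT_TYPE_MASK) == 0);

	return (void *)((uintptr_t)ptr | (uintptr_t)type);
}

/* Recover the type and restore the original pointer in place. */
static enum xmit_type xmit_ptr_strip(void **ptr)
{
	uintptr_t p = (uintptr_t)*ptr;

	*ptr = (void *)(p & ~XMIT_TYPE_MASK);

	return (enum xmit_type)(p & XMIT_TYPE_MASK);
}
```

Because the SKB type is 0, tagging an skb pointer leaves it unchanged, while an XDP frame pointer gets its lowest bit set. The removed is_xdp_frame()/xdp_to_ptr() pair encoded the same thing with the single VIRTIO_XDP_FLAG bit.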
Re: [PATCH net-next v2 00/12] virtnet_net: prepare for af-xdp
On Thu, May 30, 2024 at 07:23:54PM +0800, Xuan Zhuo wrote:
> This patch set prepares for supporting af-xdp zerocopy.
> There is no feature change in this patch set.
> I just want to reduce the patch num of the final patch set,
> so I split the patch set.
>
> Thanks.
>
> v2:
>     1. Add five commits. That provides some helper for sq to support premapped
>        mode. And the last one refactors distinguishing xmit types.
>
> v1:
>     1. resend for the new net-next merge window
>

It's great that you are working on this but
I'd like to see the actual use of this first.

>
> Xuan Zhuo (12):
>   virtio_net: independent directory
>   virtio_net: move core structures to virtio_net.h
>   virtio_net: add prefix virtnet to all struct inside virtio_net.h
>   virtio_net: separate virtnet_rx_resize()
>   virtio_net: separate virtnet_tx_resize()
>   virtio_net: separate receive_mergeable
>   virtio_net: separate receive_buf
>   virtio_ring: introduce vring_need_unmap_buffer
>   virtio_ring: introduce dma map api for page
>   virtio_ring: introduce virtqueue_dma_map_sg_attrs
>   virtio_ring: virtqueue_set_dma_premapped() support to disable
>   virtio_net: refactor the xmit type
>
>  MAINTAINERS                                   |   2 +-
>  drivers/net/Kconfig                           |   9 +-
>  drivers/net/Makefile                          |   2 +-
>  drivers/net/virtio/Kconfig                    |  12 +
>  drivers/net/virtio/Makefile                   |   8 +
>  drivers/net/virtio/virtnet.h                  | 248
>  .../{virtio_net.c => virtio/virtnet_main.c}   | 596 +++---
>  drivers/virtio/virtio_ring.c                  | 118 +++-
>  include/linux/virtio.h                        |  12 +-
>  9 files changed, 606 insertions(+), 401 deletions(-)
>  create mode 100644 drivers/net/virtio/Kconfig
>  create mode 100644 drivers/net/virtio/Makefile
>  create mode 100644 drivers/net/virtio/virtnet.h
>  rename drivers/net/{virtio_net.c => virtio/virtnet_main.c} (93%)
>
> --
> 2.32.0.3.g01195cf9f
Re: [PATCH net-next v2 00/12] virtnet_net: prepare for af-xdp
On Thu, 30 May 2024 07:53:17 -0400, "Michael S. Tsirkin" wrote:
> On Thu, May 30, 2024 at 07:23:54PM +0800, Xuan Zhuo wrote:
> > This patch set prepares for supporting af-xdp zerocopy.
> > There is no feature change in this patch set.
> > I just want to reduce the patch num of the final patch set,
> > so I split the patch set.
> >
> > Thanks.
> >
> > v2:
> >     1. Add five commits. That provides some helper for sq to support premapped
> >        mode. And the last one refactors distinguishing xmit types.
> >
> > v1:
> >     1. resend for the new net-next merge window
> >
>
>
> It's great that you are working on this but
> I'd like to see the actual use of this first.


For me, that is easy. But how should we do, if we use one patch set,
then the commit number maybe 26, that exceeds 15 (limit of the net next).

Thanks.

>
> >
> > Xuan Zhuo (12):
> >   virtio_net: independent directory
> >   virtio_net: move core structures to virtio_net.h
> >   virtio_net: add prefix virtnet to all struct inside virtio_net.h
> >   virtio_net: separate virtnet_rx_resize()
> >   virtio_net: separate virtnet_tx_resize()
> >   virtio_net: separate receive_mergeable
> >   virtio_net: separate receive_buf
> >   virtio_ring: introduce vring_need_unmap_buffer
> >   virtio_ring: introduce dma map api for page
> >   virtio_ring: introduce virtqueue_dma_map_sg_attrs
> >   virtio_ring: virtqueue_set_dma_premapped() support to disable
> >   virtio_net: refactor the xmit type
> >
> >  MAINTAINERS                                   |   2 +-
> >  drivers/net/Kconfig                           |   9 +-
> >  drivers/net/Makefile                          |   2 +-
> >  drivers/net/virtio/Kconfig                    |  12 +
> >  drivers/net/virtio/Makefile                   |   8 +
> >  drivers/net/virtio/virtnet.h                  | 248
> >  .../{virtio_net.c => virtio/virtnet_main.c}   | 596 +++---
> >  drivers/virtio/virtio_ring.c                  | 118 +++-
> >  include/linux/virtio.h                        |  12 +-
> >  9 files changed, 606 insertions(+), 401 deletions(-)
> >  create mode 100644 drivers/net/virtio/Kconfig
> >  create mode 100644 drivers/net/virtio/Makefile
> >  create mode 100644 drivers/net/virtio/virtnet.h
> >  rename drivers/net/{virtio_net.c => virtio/virtnet_main.c} (93%)
> >
> > --
> > 2.32.0.3.g01195cf9f
>
Re: [PATCH net-next v2 00/12] virtnet_net: prepare for af-xdp
On Thu, 30 May 2024 19:54:44 +0800, Xuan Zhuo wrote:
> On Thu, 30 May 2024 07:53:17 -0400, "Michael S. Tsirkin" wrote:
> > On Thu, May 30, 2024 at 07:23:54PM +0800, Xuan Zhuo wrote:
> > > This patch set prepares for supporting af-xdp zerocopy.
> > > There is no feature change in this patch set.
> > > I just want to reduce the patch num of the final patch set,
> > > so I split the patch set.
> > >
> > > Thanks.
> > >
> > > v2:
> > >     1. Add five commits. That provides some helper for sq to support premapped
> > >        mode. And the last one refactors distinguishing xmit types.
> > >
> > > v1:
> > >     1. resend for the new net-next merge window
> > >
> >
> >
> > It's great that you are working on this but
> > I'd like to see the actual use of this first.
>
>
> For me, that is easy. But how should we do, if we use one patch set,
> then the commit number maybe 26, that exceeds 15 (limit of the net next).

Hi, Jakub

There will be a huge patch set (about 25) to support AF-XDP for virtio-net.
Can I just post this huge patch set if the maintainers of virtio-net agree?

Thanks.

>
> Thanks.
>
> >
> > >
> > > Xuan Zhuo (12):
> > >   virtio_net: independent directory
> > >   virtio_net: move core structures to virtio_net.h
> > >   virtio_net: add prefix virtnet to all struct inside virtio_net.h
> > >   virtio_net: separate virtnet_rx_resize()
> > >   virtio_net: separate virtnet_tx_resize()
> > >   virtio_net: separate receive_mergeable
> > >   virtio_net: separate receive_buf
> > >   virtio_ring: introduce vring_need_unmap_buffer
> > >   virtio_ring: introduce dma map api for page
> > >   virtio_ring: introduce virtqueue_dma_map_sg_attrs
> > >   virtio_ring: virtqueue_set_dma_premapped() support to disable
> > >   virtio_net: refactor the xmit type
> > >
> > >  MAINTAINERS                                   |   2 +-
> > >  drivers/net/Kconfig                           |   9 +-
> > >  drivers/net/Makefile                          |   2 +-
> > >  drivers/net/virtio/Kconfig                    |  12 +
> > >  drivers/net/virtio/Makefile                   |   8 +
> > >  drivers/net/virtio/virtnet.h                  | 248
> > >  .../{virtio_net.c => virtio/virtnet_main.c}   | 596 +++---
> > >  drivers/virtio/virtio_ring.c                  | 118 +++-
> > >  include/linux/virtio.h                        |  12 +-
> > >  9 files changed, 606 insertions(+), 401 deletions(-)
> > >  create mode 100644 drivers/net/virtio/Kconfig
> > >  create mode 100644 drivers/net/virtio/Makefile
> > >  create mode 100644 drivers/net/virtio/virtnet.h
> > >  rename drivers/net/{virtio_net.c => virtio/virtnet_main.c} (93%)
> > >
> > > --
> > > 2.32.0.3.g01195cf9f
> >
>
Re: [PATCH net-next v2 00/12] virtnet_net: prepare for af-xdp
On Fri, 31 May 2024 09:40:14 +0800 Xuan Zhuo wrote:
> On Thu, 30 May 2024 19:54:44 +0800, Xuan Zhuo wrote:
> > On Thu, 30 May 2024 07:53:17 -0400, "Michael S. Tsirkin" wrote:
> > > It's great that you are working on this but
> > > I'd like to see the actual use of this first.
> >
> >
> > For me, that is easy. But how should we do, if we use one patch set,
> > then the commit number maybe 26, that exceeds 15 (limit of the net next).
>
> Hi, Jakub
>
> There will be a huge patch set (about 25) to support AF-XDP for virtio-net.
> Can I just post this huge patch set if the maintainers of virtio-net agree?

First of all, I see you posted v2 within 4 hours of v1, without really
waiting for Michael to reply. So I guess that 15 patch rule is not the
only one you intend to break?

On v1 Michael asked you to not do the rename, and start with AF_XDP
support. Why don't you do that instead of asking me if you can break
more rules?
--
pw-bot: cr
Re: [PATCH net-next v2 00/12] virtnet_net: prepare for af-xdp
On Thu, 30 May 2024 18:55:17 -0700, Jakub Kicinski wrote:
> On Fri, 31 May 2024 09:40:14 +0800 Xuan Zhuo wrote:
> > On Thu, 30 May 2024 19:54:44 +0800, Xuan Zhuo wrote:
> > > On Thu, 30 May 2024 07:53:17 -0400, "Michael S. Tsirkin" wrote:
> > > > It's great that you are working on this but
> > > > I'd like to see the actual use of this first.
> > >
> > >
> > > For me, that is easy. But how should we do, if we use one patch set,
> > > then the commit number maybe 26, that exceeds 15 (limit of the net next).
> >
> > Hi, Jakub
> >
> > There will be a huge patch set (about 25) to support AF-XDP for virtio-net.
> > Can I just post this huge patch set if the maintainers of virtio-net agree?
>
> First of all, I see you posted v2 within 4 hours of v1, without really
> waiting for Michael to reply.

Because I was checking the code, I found some commits need to be prepared also.

> So I guess that 15 patch rule is not the
> only one you intend to break?

Actually, that is the only one rule.

>
> On v1 Michael asked you to not do the rename, and start with AF_XDP
> support. Why don't you do that instead of asking me if you can break
> more rules?

Because if I don't rename the files, there will still be about 21 commits.
For me, I don't think this is the key.

If I release everything in one patch set, then I think Michael can
understand why I want to rename the files.

Thanks.

> --
> pw-bot: cr