On Mon, 18 Dec 2017 06:23:40 -0700 David Ahern <dsah...@gmail.com> wrote:
> On 12/18/17 3:55 AM, Jesper Dangaard Brouer wrote:
> >
> > Handling return-errors in the drivers complicated the driver code, as it
> > involves unraveling and deallocating other RX-rings etc (that were
> > already allocated) if the reg fails. (Also notice next patch will allow
> > dev == NULL, if right ptype is set).
> >
> > I'm not completely rejecting you idea, as this is a good optimization
> > trick, which is to move validation checks to setup-time, thus allowing
> > less validation checks at runtime. I sort-of actually already did
> > this, as I allow bpf to deref dev without NULL check. I would argue
> > this is good enough, as we will crash in a predictable way, as above
> > WARN will point to which driver violated the API.
> >
> > If people think it is valuable I can change this API to return an err?
>
> Saeed's suggested API in a comment on patch 12 also removes most of the
> WARN_ONs as it sets the device and index:
>
> xdp_rxq_info_reg(netdev, rxq_index)
> {
> 	rxqueue = netdev->_rx + rxq_index;
> 	xdp_rxq = rxqueue.xdp_rxq;
> 	xdp_rxq_info_init(xdp_rxq);
> 	xdp_rxq.dev = netdev;
> 	xdp_rxq.queue_index = rxq_index;
> }
>
> xdp_rxq_info_unreg(netdev, rxq_index)
> {
> 	...
> }

No, we still need the other WARN_ON's. I don't understand why you
think the above API is better. In case netdev==NULL, the system will
simply crash on the deref of netdev. That case happened for both
drivers, i40e and mlx5, when I was adding this. The WARN_ON helped me
quickly identify the issue, and in both drivers it was a non-critical
error, as these queues are not used by XDP. IMHO that is a better
experience for the driver developer.

IMHO WARN_ON's are a good thing. For example, this one:

	if (xdp_rxq->reg_state == REG_STATE_REGISTERED)
		WARN(1, "Missing unregister, handled but fix driver\n");

just helped me identify a bug in the i40e driver. It turns out the bug
is triggered by changing the RX-ring queue size via ethtool
<-G|--set-ring> (_not_ the number of RX-rings, but frames per RX-ring).
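
To illustrate how such a check can catch a stale struct copy, here is a
minimal userspace sketch. It is not the patch code: WARN_MOCK(),
warn_count, demo_ring_copy(), the stand-in struct layouts, and the
REG_STATE_NEW/REG_STATE_UNREGISTERED values are assumptions made up for
this example, and the copy scenario is only one hypothetical way a
driver could trip the warning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-ins. Only REG_STATE_REGISTERED and the field names
 * dev/queue_index/reg_state come from the thread; the rest is assumed. */
struct net_device { const char *name; };

enum reg_state {
	REG_STATE_NEW = 0,
	REG_STATE_REGISTERED,
	REG_STATE_UNREGISTERED,
};

struct xdp_rxq_info {
	struct net_device *dev;
	unsigned int queue_index;
	enum reg_state reg_state;
};

static int warn_count;

/* Hypothetical mock of the kernel's WARN(): report and keep running. */
#define WARN_MOCK(msg) \
	do { warn_count++; fprintf(stderr, "WARNING: %s\n", msg); } while (0)

static void xdp_rxq_info_unreg(struct xdp_rxq_info *xdp_rxq)
{
	xdp_rxq->reg_state = REG_STATE_UNREGISTERED;
}

static void xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq)
{
	/* Driver forgot to unregister: warn, then recover. */
	if (xdp_rxq->reg_state == REG_STATE_REGISTERED) {
		WARN_MOCK("Missing unregister, handled but fix driver");
		xdp_rxq_info_unreg(xdp_rxq);
	}
	/* Driver forgot to set the device: warn and refuse to register. */
	if (xdp_rxq->dev == NULL) {
		WARN_MOCK("Missing net_device from driver");
		return;
	}
	xdp_rxq->reg_state = REG_STATE_REGISTERED;
}

/* Returns the number of warnings raised by the scenario. */
static int demo_ring_copy(void)
{
	struct net_device dev = { "eth0" };
	struct xdp_rxq_info ring = { &dev, 0, REG_STATE_NEW };
	struct xdp_rxq_info tmp;

	warn_count = 0;
	xdp_rxq_info_reg(&ring); /* normal registration, no warning */
	tmp = ring;              /* struct copy carries REGISTERED along */
	xdp_rxq_info_reg(&tmp);  /* registering the copy raises the warning */
	return warn_count;
}
```

Calling demo_ring_copy() from a small main() prints the warning on
stderr and keeps running, which is the point of using WARN here rather
than BUG_ON or a plain NULL deref.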
On such a ring-size change, i40e_set_ringparam() allocates some
temporary RX-rings and copies struct contents around, causing this
strange issue. It will not crash with our currently simple content, but
later this would cause a hard-to-debug issue. I'm happy I could catch
this now, instead of later as a strange crash.

The WARN's are there to assist driver developers when using this API in
their drivers (better than a crash/BUG_ON, as they don't have to dig up
their serial console cable). For me it is also part of the
documentation, as it documents the API assumptions/assertions together
with a small text field.

--
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer