On Mon, 18 Dec 2017 06:23:40 -0700
David Ahern <dsah...@gmail.com> wrote:

> On 12/18/17 3:55 AM, Jesper Dangaard Brouer wrote:
> > 
> > Handling return-errors in the drivers complicated the driver code, as it
> > involves unraveling and deallocating other RX-rings etc (that were
> > already allocated) if the reg fails. (Also notice next patch will allow
> > dev == NULL, if right ptype is set).
> > 
> > I'm not completely rejecting you idea, as this is a good optimization
> > trick, which is to move validation checks to setup-time, thus allowing
> > less validation checks at runtime.  I sort-of actually already did
> > this, as I allow bpf to deref dev without NULL check.  I would argue
> > this is good enough, as we will crash in a predictable way, as above
> > WARN will point to which driver violated the API.
> > 
> > If people think it is valuable I can change this API to return an err?  
> 
> Saeed's suggested API in a comment on patch 12 also removes most of the
> WARN_ONs as it sets the device and index:
> 
> xdp_rxq_info_reg(netdev, rxq_index)
> {
>     rxqueue = netdev->_rx + rxq_index;
>     xdp_rxq = rxqueue.xdp_rxq;
>         xdp_rxq_info_init(xdp_rxq);
>     xdp_rxq.dev = netdev;
>     xdp_rxq.queue_index = rxq_index;
> }
> 
> xdp_rxq_info_unreg(netdev, rxq_index)
> {
> ...
> }

No, we still need the other WARN_ON's.

I don't understand why you think above API is better.  In case
netdev==NULL the system will simply crash on deref of netdev.  That
case happened for both drivers i40e and mlx5, when I was adding this.
The WARN_ON help me quickly identify the issue, and in both drivers it
was a non-critical error, as these queues are not used by XDP. IHMO a
better experience for the driver developer.

IHMO WARN_ON's are a good thing.  For example the:

 if (xdp_rxq->reg_state == REG_STATE_REGISTERED)
   WARN(1, "Missing unregister, handled but fix driver\n");

Just helped me identify a bug in i40e driver.  It turns out that
changing the RX-ring queue size via ethtool <-G|--set-ring> (_not_ the
number of RX-rings, but frames per RX-ring). Then i40e_set_ringparam()
allocates some temp RX-rings and copy-around struct contents, causing
this strange issue.  It will not crash with our currently simple content,
but later this would cause a hard-to-debug issue.  I'm happy I could
catch this now, instead of later as a strange crash.

The WARN's are there to assist driver developers when using this API
in their drivers (better than crash/BUG_ON as they don't have to dig-up
their serial cable console).  For me it is also part of the
documentation, as it document the API assumptions/assertions together
with a small text field.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

Reply via email to