On Mon, 2014-01-13 at 12:47 +0100, Jonas Jensen wrote:
> Please help,
> 
> I think I see memory corruption with a driver recently added to linux-next.
> 
> The following error occur downloading a large file with wget (or
> ncftp): "read error: Bad address"
> wget exits and leaves the file unfinished.
> 
> The error goes away when build_skb() is patched out, in this case by
> allocating pages directly in RX loop.

The problem is you're assuming build_skb() does rather more than it
does.

> I currently have two branches with code placed under ifdef USE_BUILD_SKB:
> https://bitbucket.org/Kasreyn/linux-next/src/faa7c408a655fdfd7c383f79259031da5a05d61e/drivers/net/ethernet/moxa/moxart_ether.c#cl-472

Quoting from there:

> #define USE_BUILD_SKB
> #ifdef USE_BUILD_SKB
> 
> 
>       skb = build_skb(page_address(priv->rx_pages[priv->rx_head]), 
> priv->rx_buf_size);
> 
> 
>       if (unlikely(!skb)) {
>               net_info_ratelimited("RX build_skb failed\n");

Don't log; increment the rx_dropped counter.

>               moxart_rx_drop_packet(ndev, desc0);
>               return true;
>       }
>
> 
>       if (likely(skb))

Redundant condition.

>               get_page(priv->rx_pages[priv->rx_head]);
> 
> 
>       skb_put(skb, len);
> 
> 
> #else
> 
> 
>       dma_unmap_page(&ndev->dev, priv->rx_buf_phys[priv->rx_head], 
> RX_BUF_SIZE, DMA_FROM_DEVICE);

dma_unmap_page() is always needed, so move this above the #ifdef
USE_BUILD_SKB.

> 
>       skb = netdev_alloc_skb_ip_align(ndev, 128);

Missing check for NULL.

>       skb_fill_page_desc(skb, 0, priv->rx_pages[priv->rx_head], 0, len);
>       skb->len += len;
>       skb->data_len += len;
> 
> 
>       if (len > 128) {
>               skb->truesize += PAGE_SIZE;
>               __pskb_pull_tail(skb, ETH_HLEN);
>       } else {
>               __pskb_pull_tail(skb, len);
>       }
> 
> 
>       p = priv->rx_pages[priv->rx_head];

Missing check for NULL.

>       moxart_alloc_rx_page(ndev, priv->rx_head);

New buffer is always required, so move this under the #endif... well,
except that if moxart_alloc_rx_page() fails then things go horribly
wrong as you have.

Best practice is to try allocating the next buffer before passing up the
current one; if that fails then you recycle the current buffer and
increment rx_dropped.

> 
> 
> #endif
[end quoted code]

> If build_skb() is the cause, is there something the driver can do about it?
> 
> A quick search on "build_skb memory corruption" reveals the following
> commit, "igb: Revert support for build_skb in igb"
> f9d40f6a9921cc7d9385f64362314054e22152bd:
[...]

The problem I identified in igb occurs only when splitting a page into
multiple buffers (or when recycling DMA-mapped pages).  build_skb() is
fine to use whenever each RX buffer is a single page (except that that's
memory-inefficient) that's unmapped on completion.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to