On 12/31/18 12:27 AM, Tariq Toukan wrote:
>
>
> On 1/27/2018 2:41 PM, jianchao.wang wrote:
>> Hi Tariq
>>
>> Thanks for your kind response.
>> That's really appreciated.
>>
>> On 01/25/2018 05:54 PM, Tariq Toukan wrote:
>>>
>>>
>>> On 25/01/2018 8:25 AM, jianchao.wang wrote:
Hi Eric
>>>
Hi Tariq
Thanks for your kind response.
That's really appreciated.
On 01/25/2018 05:54 PM, Tariq Toukan wrote:
>
>
> On 25/01/2018 8:25 AM, jianchao.wang wrote:
>> Hi Eric
>>
>> Thanks for your kind response and suggestion.
>> That's really appreciated.
>>
>> Jianchao
>>
On 01/25/2018 11:55 AM, Eric Dumazet wrote:
On 25/01/2018 8:25 AM, jianchao.wang wrote:
Hi Eric
Thanks for your kind response and suggestion.
That's really appreciated.
Jianchao
On 01/25/2018 11:55 AM, Eric Dumazet wrote:
On Thu, 2018-01-25 at 11:27 +0800, jianchao.wang wrote:
Hi Tariq
On 01/22/2018 10:12 AM, jianchao.wang wrote:
Hi Eric
Thanks for your kind response and suggestion.
That's really appreciated.
Jianchao
On 01/25/2018 11:55 AM, Eric Dumazet wrote:
> On Thu, 2018-01-25 at 11:27 +0800, jianchao.wang wrote:
>> Hi Tariq
>>
>> On 01/22/2018 10:12 AM, jianchao.wang wrote:
> On 19/01/2018 5:49 PM, Eric Dumazet wrote:
On Thu, 2018-01-25 at 11:27 +0800, jianchao.wang wrote:
> Hi Tariq
>
> On 01/22/2018 10:12 AM, jianchao.wang wrote:
> > > > On 19/01/2018 5:49 PM, Eric Dumazet wrote:
> > > > > On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote:
> > > > > > Hi Tariq
> > > > > >
> > > > > > Very sad that the crash was reproduced again after applying the patch.
Hi Tariq
On 01/22/2018 10:12 AM, jianchao.wang wrote:
>>> On 19/01/2018 5:49 PM, Eric Dumazet wrote:
On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote:
> Hi Tariq
>
> Very sad that the crash was reproduced again after applying the patch.
>> Memory barriers vary for different Archs, can you please share more
Hi Jason
Thanks for your kind response.
On 01/22/2018 11:47 PM, Jason Gunthorpe wrote:
>>> Yeah, mlx4 NICs in Google fleet receive trillions of packets per
>>> second, and we never noticed an issue.
>>>
>>> Although we are using a slightly different driver, using order-0 pages
>>> and fast page
On Mon, Jan 22, 2018 at 10:40:53AM +0800, jianchao.wang wrote:
> Hi Eric
>
> On 01/22/2018 12:43 AM, Eric Dumazet wrote:
> > On Sun, 2018-01-21 at 18:24 +0200, Tariq Toukan wrote:
> >>
> >> On 21/01/2018 11:31 AM, Tariq Toukan wrote:
> >>>
> >>>
> >>> On 19/01/2018 5:49 PM, Eric Dumazet wrote:
> >
Hi Eric
On 01/22/2018 12:43 AM, Eric Dumazet wrote:
> On Sun, 2018-01-21 at 18:24 +0200, Tariq Toukan wrote:
>>
>> On 21/01/2018 11:31 AM, Tariq Toukan wrote:
>>>
>>>
>>> On 19/01/2018 5:49 PM, Eric Dumazet wrote:
On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote:
> Hi Tariq
>
>>
Hi Tariq and all
Many thanks for your kind and detailed responses and comments.
On 01/22/2018 12:24 AM, Tariq Toukan wrote:
>
>
> On 21/01/2018 11:31 AM, Tariq Toukan wrote:
>>
>>
>> On 19/01/2018 5:49 PM, Eric Dumazet wrote:
>>> On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote:
Hi Tariq
> Hmm, this is actually consistent with the example below [1].
>
> As I understand from the example, it seems that the dma_wmb/dma_rmb barriers are good
> for synchronizing cpu/device accesses to the "Streaming DMA mapped" buffers
> (the descriptors, which went through the dma_map_page() API), but not for the
> doorbell.
On Sun, 2018-01-21 at 18:24 +0200, Tariq Toukan wrote:
>
> On 21/01/2018 11:31 AM, Tariq Toukan wrote:
> >
> >
> > On 19/01/2018 5:49 PM, Eric Dumazet wrote:
> > > On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote:
> > > > Hi Tariq
> > > >
> > > > Very sad that the crash was reproduced again after applying the patch.
On 21/01/2018 11:31 AM, Tariq Toukan wrote:
On 19/01/2018 5:49 PM, Eric Dumazet wrote:
On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote:
Hi Tariq
Very sad that the crash was reproduced again after applying the patch.
Memory barriers vary for different Archs, can you please share more
On 19/01/2018 5:49 PM, Eric Dumazet wrote:
On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote:
Hi Tariq
Very sad that the crash was reproduced again after applying the patch.
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -252,6 +252,7 @@ static inline bool mlx4_en_is_ring_empty(struct mlx4_en_rx_ring *ring)
On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote:
> Hi Tariq
>
> Very sad that the crash was reproduced again after applying the patch.
>
> --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
> @@ -252,6 +252,7 @@ static inline bool mlx4_en_is_ring_empty(struct mlx4_en_rx_ring *ring)
Hi Tariq
Very sad that the crash was reproduced again after applying the patch.
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -252,6 +252,7 @@ static inline bool mlx4_en_is_ring_empty(struct mlx4_en_rx_ring *ring)
static inline void mlx4_en_update_rx_prod_db(struct mlx4_en_rx_ring *ring)
Hi Tariq
Thanks for your kind response.
On 01/14/2018 05:47 PM, Tariq Toukan wrote:
> Thanks Jianchao for your patch.
>
> And Thank you guys for your reviews, much appreciated.
> I was off-work on Friday and Saturday.
>
> On 14/01/2018 4:40 AM, jianchao.wang wrote:
>> Dear all
>>
>> Thanks for the kind response and reviewing. That's really appreciated.
Thanks Jianchao for your patch.
And Thank you guys for your reviews, much appreciated.
I was off-work on Friday and Saturday.
On 14/01/2018 4:40 AM, jianchao.wang wrote:
Dear all
Thanks for the kind response and reviewing. That's really appreciated.
On 01/13/2018 12:46 AM, Eric Dumazet wrote:
Dear all
Thanks for the kind response and reviewing. That's really appreciated.
On 01/13/2018 12:46 AM, Eric Dumazet wrote:
>> Does this need to be dma_wmb(), and should it be in
>> mlx4_en_update_rx_prod_db ?
>>
> +1 on dma_wmb()
>
> On what architecture bug was observed ?
This issue was observed
On Fri, Jan 12, 2018 at 01:01:56PM -0800, Saeed Mahameed wrote:
> Simply putting a memory barrier at the top or the bottom of a function
> means nothing unless you are looking at the whole picture, of all the
> callers of that function, to understand why it is there.
When I review code I want to
On Fri, 2018-01-12 at 13:01 -0800, Saeed Mahameed wrote:
> which is better to grasp ?:
>
> update_doorbell() {
>     dma_wmb();
>     ring->db = prod;
> }
This one is IMO the most secure one (least surprise).
Considering the time it took to discover this bug, I would really play safe.
But obvio
On 01/12/2018 12:16 PM, Eric Dumazet wrote:
> On Fri, 2018-01-12 at 11:53 -0800, Saeed Mahameed wrote:
>>
>> On 01/12/2018 08:46 AM, Eric Dumazet wrote:
>>> On Fri, 2018-01-12 at 09:32 -0700, Jason Gunthorpe wrote:
On Fri, Jan 12, 2018 at 11:42:22AM +0800, Jianchao Wang wrote:
> Customer
On Fri, 2018-01-12 at 11:53 -0800, Saeed Mahameed wrote:
>
> On 01/12/2018 08:46 AM, Eric Dumazet wrote:
> > On Fri, 2018-01-12 at 09:32 -0700, Jason Gunthorpe wrote:
> > > On Fri, Jan 12, 2018 at 11:42:22AM +0800, Jianchao Wang wrote:
> > > > Customer reported a memory corruption issue on a previous
On 01/12/2018 08:46 AM, Eric Dumazet wrote:
On Fri, 2018-01-12 at 09:32 -0700, Jason Gunthorpe wrote:
On Fri, Jan 12, 2018 at 11:42:22AM +0800, Jianchao Wang wrote:
Customer reported a memory corruption issue on a previous mlx4_en driver
version where order-3 pages and multiple page reference counting were still used.
On Fri, 2018-01-12 at 09:32 -0700, Jason Gunthorpe wrote:
> On Fri, Jan 12, 2018 at 11:42:22AM +0800, Jianchao Wang wrote:
> > Customer reported a memory corruption issue on a previous mlx4_en driver
> > version where order-3 pages and multiple page reference counting
> > were still used.
> >
> >
On Fri, Jan 12, 2018 at 11:42:22AM +0800, Jianchao Wang wrote:
> Customer reported a memory corruption issue on a previous mlx4_en driver
> version where order-3 pages and multiple page reference counting
> were still used.
>
> Finally, we found out that one of the root causes is that the HW may see stale
Customer reported a memory corruption issue on a previous mlx4_en driver
version where order-3 pages and multiple page reference counting
were still used.
Finally, we found out that one of the root causes is that the HW may see stale
rx_descs because the prod db update reaches the HW before the rx_desc does. Especially
wh
27 matches