On 23.01.2018 at 16:28, David Miller wrote:
> Looking at how these DMA counters are handled, there appears to be a
> requirement that the memory buffer is 64-byte aligned.
> 
> [...]
> 
> Therefore the driver needs to allocate "size + (64 - 1)" bytes and do
> the 64-byte alignment of the CPU pointer and the DMA address by hand.

As a non-expert in hardware drivers, I wondered about this as well;
alignment should surely be enforced here.
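
For illustration, here is a minimal userspace sketch of the
"size + (64 - 1)" approach described above. It is not taken from any
driver: align_up() and alloc_aligned() are hypothetical helper names,
and malloc() merely stands in for the real DMA allocation. In a driver,
the same offset would presumably have to be applied to the DMA address
as well.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define COUNTER_ALIGN 64u  /* alignment discussed in the quoted mail */

/* Hypothetical helper: round addr up to the next multiple of align.
 * align must be a non-zero power of two. */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Hypothetical userspace stand-in for the allocation: over-allocate by
 * COUNTER_ALIGN - 1 bytes and return the aligned pointer. The raw
 * pointer is handed back via raw_out so the caller can free() it. */
static void *alloc_aligned(size_t size, void **raw_out)
{
    void *raw = malloc(size + (COUNTER_ALIGN - 1));

    if (!raw)
        return NULL;
    *raw_out = raw;
    return (void *)align_up((uintptr_t)raw, COUNTER_ALIGN);
}
```

The aligned pointer is at most 63 bytes past the raw one, so "size"
bytes always fit inside the over-allocated buffer.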

However, the memory corruption I observed occurred on an x86_64 system
(where, I believe, these buffers are always PAGE_SIZE-aligned).
So there should be another bug, unless I am mistaken about x86_64.

I checked Realtek's deprecated r8168 driver (I am not sure whether it is
affected by the issue as well) and found two major differences in DMA
handling:
1) It wraps the DMA operations (writing of addresses, waiting for cmd bits
   to be pulled down) in spin_lock_irqsave / spin_unlock_irqrestore.
2) It does not reset CounterAddrLow / CounterAddrHigh to 0 / 0 after
   finishing. That's not really good, but may have hidden this issue with
   r8168.

Again, I have not tried to use r8168 yet (especially since it only
supports old kernels), but maybe this helps to trigger some ideas.

Worst case, this could be a firmware timing bug, i.e. the card writes the
counters to system memory shortly before the cmd bits are pulled high or
shortly after they have been pulled down (in the latter case using the
partially zeroed-out memory address). I don't know. Let me know if I can
extract any more info from an affected machine, but I believe such
machines should be very abundant.

HTH and thanks,
Oliver
