On 06/07/2015 13:50, Peter Maydell wrote: > On 6 July 2015 at 11:31, Michael S. Tsirkin <m...@redhat.com> wrote: >> On Mon, Jul 06, 2015 at 11:04:24AM +0100, Peter Maydell wrote: >>> On 6 July 2015 at 11:03, Michael S. Tsirkin <m...@redhat.com> wrote: >>>> On Mon, Jul 06, 2015 at 10:11:18AM +0100, Peter Maydell wrote: >>>>> But address_space_rw() is just the "memcpy bytes to the >>>>> target's memory" operation -- if you have a pile of bytes >>>>> then there are no endianness concerns. If you don't have >>>>> a pile of bytes then you need to know the structure of >>>>> the data you're DMAing around, and you should probably >>>>> have a loop doing things with the specify-the-width functions. >>> >>>> Absolutely. But what if DMA happens to target another device >>>> and not memory? Device needs some endian-ness so it needs >>>> to be converted to that. >>> >>> Yes, and address_space_rw() already deals with conversion to >>> that device's specified endianness. > >> Yes, but incorrectly if target endian != host endian. >> For example, LE target and LE device on BE host. > > Having walked through the code, got confused, talked to > bonzini on IRC about it and got unconfused again,
Ah, *that discussion*. So it was yet another XY question, :) but for the better because it also helped me abstract Michael's question. Peter's analysis below summarizes the implementation very well. I believe > we do get this correct. > > * address_space_rw() takes a pointer to a pile of bytes > * if the destination is RAM, we just memcpy them (because > guest RAM is also a pile of bytes) > * if the destination is a device, then we read a value > out of the pile of bytes at whatever width the target > device can handle. The functions we use for this are > ldl_q/ldl_p/etc, which do "load target endianness" > (ie "interpret this set of 4 bytes as if it were an > integer in the target-endianness") because the API of > memory_region_dispatch_write() is that it takes a uint64_t > data whose contents are the value to write in target > endianness order. (This is regrettably undocumented.) ^^ And this is the part where "the endianness of the CPU->device bus/link" enters the picture. But it doesn't matter if the source is instead another device. What matters is that address_space_rw() manages conversion from a pile of bytes, and the device doing DMA provides that---a pile of bytes. In the patch at the beginning of this thread, problems arose because what you passed to address_space_write wasn't just a "pile of bytes" coming from a network packet or a disk sector. Instead, it was the outcome of a previous conversion from "pile of bytes" to "bytes representing an integer in little-endian format". This conversion could have possibly included a byteswap. Once you have established that the bytes represent an integer the right way to access them is to use ld*_p/st*_p and address_space_ld*/address_space_st*. This ensures that you do an even number of further byteswaps; for *_le_p and address_space_*_le, there will be 0 further byteswaps on little-endian hosts and 2 on big-endian hosts. Paolo > * memory_region_dispatch_write() then calls adjust_endianness(), > converting a target-endian value to the endianness the > device says it requires > * we then call the device's read/write functions, whose API > is that they get a value in the endianness they asked for. > >> IO callbacks always get a native endian format so they expect to get >> byte 0 of the buffer in MSB on this host. > > IO callbacks get the format they asked for (which might > be BE, LE or target endianness). They will get byte 0 of > the buffer in the MSB if they said they were BE devices > (or if they said they were target-endian on a BE target).