On Mon, 14 Apr 2014 15:46:34 +0300 "Michael S. Tsirkin" <m...@redhat.com> wrote:
> On Mon, Apr 14, 2014 at 02:40:04PM +0200, Alexander Graf wrote: > > > > On 14.04.14 14:37, Michael S. Tsirkin wrote: > > >On Mon, Apr 14, 2014 at 02:29:20PM +0200, Alexander Graf wrote: > > >>On 14.04.14 14:24, Michael S. Tsirkin wrote: > > >>>On Mon, Apr 14, 2014 at 02:16:03PM +0200, Alexander Graf wrote: > > >>>>On 14.04.14 13:58, Greg Kurz wrote: > > >>>>>From: Rusty Russell <ru...@rustcorp.com.au> > > >>>>> > > >>>>>virtio data structures are defined as "target endian", which assumes > > >>>>>that's a fixed value. In fact, that actually means it's > > >>>>>platform-specific. > > >>>>>The OASIS virtio 1.0 spec will fix this, by making it all little > > >>>>>endian. > > >>>>> > > >>>>>We introduce memory accessors to be used accross the virtio code where > > >>>>>needed. These accessors should support both legacy and 1.0 devices. > > >>>>>A good way to do it is to introduce a per-device property to store the > > >>>>>endianness. We choose to set this flag at device reset time because it > > >>>>>is reasonnable to assume the endianness won't change unless we reboot > > >>>>>or > > >>>>>kexec another kernel. And it is also reasonnable to assume the new > > >>>>>kernel > > >>>>>will reset the devices before using them (otherwise it will break). > > >>>>> > > >>>>>We reuse the virtio_is_big_endian() helper since it provides the right > > >>>>>value for legacy devices with most of the targets, that have fixed > > >>>>>endianness. It can then be overriden to support endian-ambivalent > > >>>>>targets. > > >>>>> > > >>>>>To support migration, we need to set the flag in virtio_load() as well. > > >>>>> > > >>>>>(a) One solution would be to add it to the stream, but it have some > > >>>>> drawbacks: > > >>>>>- since this only affects a few targets, the field should be put into a > > >>>>> subsection > > >>>>>- virtio migration code should be ported to vmstate to be able to > > >>>>>introduce > > >>>>> such a subsection > > >>>>> > > >>>>>(b) If we assume the following to be true: > > >>>>>- target endianness falls under some cpu state > > >>>>>- cpu state is always restored before virtio devices state because they > > >>>>> get initialized in this order in main(). > > >>>>>Then an alternative is to rely on virtio_is_big_endian() again at > > >>>>>load time. No need to mess around with the migration stream in this > > >>>>>case. > > >>>>> > > >>>>>This patch implements (b). > > >>>>> > > >>>>>Note that the tswap helpers are implemented in virtio.c so that > > >>>>>virtio-access.h stays platform independant. Most of the virtio code > > >>>>>will be buildable under common-obj instead of obj then, and spare > > >>>>>some cycles when building for multiple targets. > > >>>>> > > >>>>>Signed-off-by: Rusty Russell <ru...@rustcorp.com.au> > > >>>>>[ ldq_phys() API change, > > >>>>> relicensed virtio-access.h to GPLv2+ on Rusty's request, > > >>>>> introduce a per-device is_big_endian flag (supersedes > > >>>>> needs_byteswap global) > > >>>>> add VirtIODevice * arg to virtio helpers, > > >>>>> use the existing virtio_is_big_endian() helper, > > >>>>> virtio-pci: use the device is_big_endian flag, > > >>>>> introduce virtio tswap16 and tswap64 helpers, > > >>>>> move calls to tswap* out of virtio-access.h to make it platform > > >>>>> independant, > > >>>>> migration support, > > >>>>> Greg Kurz <gk...@linux.vnet.ibm.com> ] > > >>>>>Cc: Cédric Le Goater <c...@fr.ibm.com> > > >>>>>Signed-off-by: Greg Kurz <gk...@linux.vnet.ibm.com> > > >>>>>--- > > >>>>> > > >>>>>Changes since v6: > > >>>>>- merge the virtio_needs_byteswap() helper from v6 and existing > > >>>>> virtio_is_big_endian() > > >>>>>- virtio-pci: now supports endianness changes > > >>>>>- virtio-access.h fixes (target independant) > > >>>>> > > >>>>> exec.c | 2 - > > >>>>> hw/virtio/Makefile.objs | 2 - > > >>>>> hw/virtio/virtio-pci.c | 11 +-- > > >>>>> hw/virtio/virtio.c | 35 +++++++++ > > >>>>> include/hw/virtio/virtio-access.h | 138 > > >>>>> +++++++++++++++++++++++++++++++++++++ > > >>>>> include/hw/virtio/virtio.h | 2 + > > >>>>> vl.c | 4 + > > >>>>> 7 files changed, 185 insertions(+), 9 deletions(-) > > >>>>> create mode 100644 include/hw/virtio/virtio-access.h > > >>>>> > > >>>>>diff --git a/exec.c b/exec.c > > >>>>>index 91513c6..e6777d0 100644 > > >>>>>--- a/exec.c > > >>>>>+++ b/exec.c > > >>>>>@@ -42,6 +42,7 @@ > > >>>>> #else /* !CONFIG_USER_ONLY */ > > >>>>> #include "sysemu/xen-mapcache.h" > > >>>>> #include "trace.h" > > >>>>>+#include "hw/virtio/virtio.h" > > >>>>> #endif > > >>>>> #include "exec/cpu-all.h" > > >>>>>@@ -2745,7 +2746,6 @@ int cpu_memory_rw_debug(CPUState *cpu, > > >>>>>target_ulong addr, > > >>>>> * A helper function for the _utterly broken_ virtio device model to > > >>>>> find out if > > >>>>> * it's running on a big endian machine. Don't do this at home kids! > > >>>>> */ > > >>>>>-bool virtio_is_big_endian(void); > > >>>>> bool virtio_is_big_endian(void) > > >>>>> { > > >>>>> #if defined(TARGET_WORDS_BIGENDIAN) > > >>>>>diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs > > >>>>>index 1ba53d9..68c3064 100644 > > >>>>>--- a/hw/virtio/Makefile.objs > > >>>>>+++ b/hw/virtio/Makefile.objs > > >>>>>@@ -4,5 +4,5 @@ common-obj-y += virtio-bus.o > > >>>>> common-obj-y += virtio-mmio.o > > >>>>> common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += dataplane/ > > >>>>>-obj-y += virtio.o virtio-balloon.o > > >>>>>+obj-y += virtio.o virtio-balloon.o > > >>>>> obj-$(CONFIG_LINUX) += vhost.o > > >>>>>diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c > > >>>>>index ce97514..82a1689 100644 > > >>>>>--- a/hw/virtio/virtio-pci.c > > >>>>>+++ b/hw/virtio/virtio-pci.c > > >>>>>@@ -89,9 +89,6 @@ > > >>>>> /* Flags track per-device state like workarounds for quirks in older > > >>>>> guests. */ > > >>>>> #define VIRTIO_PCI_FLAG_BUS_MASTER_BUG (1 << 0) > > >>>>>-/* HACK for virtio to determine if it's running a big endian guest */ > > >>>>>-bool virtio_is_big_endian(void); > > >>>>>- > > >>>>> static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size, > > >>>>> VirtIOPCIProxy *dev); > > >>>>>@@ -409,13 +406,13 @@ static uint64_t virtio_pci_config_read(void > > >>>>>*opaque, hwaddr addr, > > >>>>> break; > > >>>>> case 2: > > >>>>> val = virtio_config_readw(vdev, addr); > > >>>>>- if (virtio_is_big_endian()) { > > >>>>>+ if (vdev->is_big_endian) { > > >>>>> val = bswap16(val); > > >>>>> } > > >>>>> break; > > >>>>> case 4: > > >>>>> val = virtio_config_readl(vdev, addr); > > >>>>>- if (virtio_is_big_endian()) { > > >>>>>+ if (vdev->is_big_endian) { > > >>>>> val = bswap32(val); > > >>>>> } > > >>>>> break; > > >>>>>@@ -443,13 +440,13 @@ static void virtio_pci_config_write(void > > >>>>>*opaque, hwaddr addr, > > >>>>> virtio_config_writeb(vdev, addr, val); > > >>>>> break; > > >>>>> case 2: > > >>>>>- if (virtio_is_big_endian()) { > > >>>>>+ if (vdev->is_big_endian) { > > >>>>> val = bswap16(val); > > >>>>> } > > >>>>> virtio_config_writew(vdev, addr, val); > > >>>>> break; > > >>>>> case 4: > > >>>>>- if (virtio_is_big_endian()) { > > >>>>>+ if (vdev->is_big_endian) { > > >>>>> val = bswap32(val); > > >>>>> } > > >>>>> virtio_config_writel(vdev, addr, val); > > >>>>>diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c > > >>>>>index aeabf3a..bb646f0 100644 > > >>>>>--- a/hw/virtio/virtio.c > > >>>>>+++ b/hw/virtio/virtio.c > > >>>>>@@ -19,6 +19,7 @@ > > >>>>> #include "hw/virtio/virtio.h" > > >>>>> #include "qemu/atomic.h" > > >>>>> #include "hw/virtio/virtio-bus.h" > > >>>>>+#include "hw/virtio/virtio-access.h" > > >>>>> /* > > >>>>> * The alignment to use between consumer and producer parts of vring. > > >>>>>@@ -546,6 +547,8 @@ void virtio_reset(void *opaque) > > >>>>> virtio_set_status(vdev, 0); > > >>>>>+ vdev->is_big_endian = virtio_is_big_endian(); > > >>>>>+ > > >>>>> if (k->reset) { > > >>>>> k->reset(vdev); > > >>>>> } > > >>>>>@@ -897,6 +900,11 @@ int virtio_load(VirtIODevice *vdev, QEMUFile *f) > > >>>>> BusState *qbus = qdev_get_parent_bus(DEVICE(vdev)); > > >>>>> VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus); > > >>>>>+ /* NOTE: we assume that endianness is a cpu state AND > > >>>>>+ * cpu state is restored before virtio devices. > > >>>>>+ */ > > >>>>>+ vdev->is_big_endian = virtio_is_big_endian(); > > >>>>>+ > > >>>>> if (k->load_config) { > > >>>>> ret = k->load_config(qbus->parent, f); > > >>>>> if (ret) > > >>>>>@@ -1153,6 +1161,33 @@ void > > >>>>>virtio_device_set_child_bus_name(VirtIODevice *vdev, char *bus_name) > > >>>>> } > > >>>>> } > > >>>>>+uint16_t virtio_tswap16(VirtIODevice *vdev, uint16_t s) > > >>>>>+{ > > >>>>>+ if (vdev->is_big_endian) { > > >>>>>+ return tswap16(s); > > >>>>>+ } else { > > >>>>>+ return bswap16(tswap16(s)); > > >>>>>+ } > > >>>>This looks pretty bogus. When virtio wants to do "tswap" what it > > >>>>means is "give me a host endianness value in virtio endianness". How > > >>>>about something like > > >>>> > > >>>>#ifdef HOST_WORDS_BIGENDIAN > > >>>> return vdev->is_big_endian ? s : bswap16(s); > > >>>>#else > > >>>> return vdev->is_big_endian ? bswap16(s) : s; > > >>>>#endif > > >>>> > > >>>Actually why doesn't this call virtio_is_big_endian? > > >>>As it is, we get extra branches even if target endian-ness > > >>>is fixed. > > >>Because virtio_is_big_endian() returns the default endianness, not > > >>the runtime endianness of a virtio device. > > > > In fact, we should probably rename it accordingly. > > > > >> > > >>>>That should work pretty well inline as well, so you don't need to > > >>>>compile virtio.c as target dependent. > > >>>> > > >>>> > > >>>>Alex > > >>>Yes but we'll still need to build two variants: fixed endian and > > >>>dynamic endian platforms. > > >>>Something along the lines of 32/64 bit split that we have? > > >>Why bother? Always make it dynamic and don't change the per-device > > >>variable, no? I'd be surprised if the performance difference is > > >>measurable. The bulk of the data we transfer gets copied raw anyway. > > >> > > >> > > >>Alex > > >This will have to be measured and proved by whoever's proposing the > > >patch, not by reviewers. Platforms such as AMD which don't do > > >prediction well would be especially interesting to test on. > > > > Sure, Greg, can you do that? I'm sure Michael has test cases > > available he can give you to measure performance on this. > > > > Speaking of which, how does all of this work with vhost? > > > > > > Alex > > I think that's missing. > As a first step, we need to disable vhost when > host/guest endian-ness do not match. > > We could also add cross-endian support to vhost. > I confirm: vhost has to be fixed to use guest endian virtio rings. > Or just wait a couple more months for virtio 1.0 which is fixed > endian-ness. > Oohhh no... don't say that ! Legacy virtio is not dead yet. -- Gregory Kurz kurzg...@fr.ibm.com gk...@linux.vnet.ibm.com Software Engineer @ IBM/Meiosys http://www.ibm.com Tel +33 (0)562 165 496 "Anarchy is about taking complete responsibility for yourself." Alan Moore.