[Qemu-devel] [PATCH] Add warmup phase for live migration of large memory apps

2011-05-11 Thread Shribman, Aidan
From: Aidan Shribman 

[PATCH] Add warmup phase for live migration of large memory apps

By invoking "migrate -w " we initiate a background live-migration
transferring of dirty pages continuously until invocation of "migrate_end"
which attempts to complete the live migration operation.

Qemu host: Ubuntu 10.10

Testing: live migration (with/without warmup phase) tested successfully.

Signed-off-by: Benoit Hudzia 
Signed-off-by: Petter Svard 
Signed-off-by: Aidan Shribman 
---
 hmp-commands.hx |   34 +-
 migration.c |   23 ++-
 migration.h |2 ++
 qmp-commands.hx |   41 ++---
 savevm.c|   12 
 sysemu.h|1 +
 6 files changed, 96 insertions(+), 17 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index e5585ba..234a952 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -717,24 +717,27 @@ ETEXI
 
 {
 .name   = "migrate",
-.args_type  = "detach:-d,blk:-b,inc:-i,uri:s",
-.params = "[-d] [-b] [-i] uri",
-.help   = "migrate to URI (using -d to not wait for completion)"
- "\n\t\t\t -b for migration without shared storage with"
- " full copy of disk\n\t\t\t -i for migration without "
- "shared storage with incremental copy of disk "
- "(base image shared between src and destination)",
+.args_type  = "detach:-d,blk:-b,inc:-i,warmup:-w,uri:s",
+.params = "[-d] [-b] [-i] [-w] uri",
+.help   = "migrate to URI"
+  "\n\t -d to not wait for completion"
+  "\n\t -b for migration without shared storage with"
+  " full copy of disk"
+  "\n\t -i for migration without"
+  " shared storage with incremental copy of disk"
+  " (base image shared between source and destination)"
+  "\n\t -w to enter warmup phase",
 .user_print = monitor_user_noop,   
.mhandler.cmd_new = do_migrate,
 },
 
-
 STEXI
-@item migrate [-d] [-b] [-i] @var{uri}
+@item migrate [-d] [-b] [-i] [-w] @var{uri}
 @findex migrate
 Migrate to @var{uri} (using -d to not wait for completion).
-b for migration with full copy of disk
-i for migration with incremental copy of disk (base image is shared)
+-w to enter warmup phase
 ETEXI
 
 {
@@ -753,6 +756,19 @@ Cancel the current VM migration.
 ETEXI
 
 {
+.name   = "migrate_end",
+.args_type  = "",
+.params = "",
+.help   = "Complete warmup and move to full live migration",
+.mhandler.cmd = do_migrate_end,
+},
+
+STEXI
+@item migrate_end
+Complete warmup and move to full live migration.
+ETEXI
+
+{
 .name   = "migrate_set_speed",
 .args_type  = "value:o",
 .params = "value",
diff --git a/migration.c b/migration.c
index 9ee8b17..7525b47 100644
--- a/migration.c
+++ b/migration.c
@@ -31,6 +31,8 @@
 do { } while (0)
 #endif
 
+static int is_migrate_warmup = 0;
+
 /* Migration speed throttling */
 static uint32_t max_throttle = (32 << 20);
 
@@ -81,6 +83,11 @@ int do_migrate(Monitor *mon, const QDict *qdict, QObject 
**ret_data)
 int blk = qdict_get_try_bool(qdict, "blk", 0);
 int inc = qdict_get_try_bool(qdict, "inc", 0);
 const char *uri = qdict_get_str(qdict, "uri");
+is_migrate_warmup = qdict_get_try_bool(qdict, "warmup", 0);
+
+if (is_migrate_warmup) {
+detach = 1; /* as we need migrate_end to complte */
+   }
 
 if (current_migration &&
 current_migration->get_status(current_migration) == MIG_STATE_ACTIVE) {
@@ -361,7 +368,9 @@ void migrate_fd_put_ready(void *opaque)
 }
 
 DPRINTF("iterate\n");
-if (qemu_savevm_state_iterate(s->mon, s->file) == 1) {
+if (is_migrate_warmup) {
+qemu_savevm_state_warmup(s->mon, s->file);
+   } else if (qemu_savevm_state_iterate(s->mon, s->file) == 1) {
 int state;
 int old_vm_running = vm_running;
 
@@ -448,3 +457,15 @@ int migrate_fd_close(void *opaque)
 qemu_set_fd_handler2(s->fd, NULL, NULL, NULL, NULL);
 return s->close(s);
 }
+
+void do_migrate_end(Monitor *mon, const QDict *qdict)
+{
+if (!vm_running) {
+return;
+   }
+if (!is_migrate_warmup) {
+return;
+   }
+is_migrate_warmup = 0;
+}
+
diff --git a/migration.h b/migration.h
index d13ed4f..6a96b29 100644
--- a/migration.h
+++ b/migration.h
@@ -134,4 +134,6 @@ static inline FdMigrationState 
*migrate_to_fms(MigrationState *mig_state)
 return container_of(mig_state, FdMigrationState, mig_state);
 }
 
+void do_migrate_end(Monitor *mon, const QDict *qdict);
+
 #endif
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 793cf1c..58fe696 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -431,13 +431,16 @@ EQMP
 
  

Re: [Qemu-devel] [virt-tools-list] Virt Tools Survey: What to do about virt-clone

2011-05-11 Thread Juerg Haefliger
> (a) Is cloning guests useful for you or not?  Often or infrequently?

I don't 'clone' in the strict sense of the word. I create a base raw
OS image and provide that image to other users as a common starting
point for them to setup their guests. I don't care about the XML
definition, just the raw image file. Hopefully I won't have to do that
too often once the base image is stable/mature. But it needs to be
done every time I'll have to support a new OS variant.


> (b) Do you currently use virt-clone to clone guests?

Nope. Don't know what it does, never looked into it.


> (c) Do you have a homebrew method to clone guests?  What does it do?

Plain 'cp' to 'clone' the image.


> (e) When you clone a guest, do you "sysprep" it or would you like to?
> (Using the term "sysprep" generically here, I mean any sort of
> reinitialization for Linux or Windows guests).

Before making the image available to potential users, I 'sanitize'
(what you call 'sysprep') the image. Currently, I only support SL6 and
all this step accomplishes is purging the persistent-net udev rule and
removing the MACADDR from ifcfg-eth0 so that eth0 comes up when the
image is used in a new KVM instance. This is done by loading a little
script into the guest, running it and then removing it again, using
libguestfs.


> (f) How do you feel about a multi-step process?
>
>  virt-clone -> virt-sysprep -> virt-resize (for example)

Sounds fine to me as long as there is sufficient control over what
each step is doing and as long as it can be automated without the need
for a fancy GUI.


...Juerg



Re: [Qemu-devel] [PATCH 1/2] usb-linux: fix device path aka physical port handling

2011-05-11 Thread Markus Armbruster
Good stuff, just a few questions.

Gerd Hoffmann  writes:

> The device path isn't just a number.  It specifies the physical port
> the device is connected to and in case the device is connected via
> usb hub you'll have two numbers there, like this: "5.1".  The first
> specifies the root port where the hub is plugged into, the second
> specifies the port number of the hub where the device is plugged in.
> With multiple hubs chained the string can become longer.

"5.1"?  Do you mean "5-1"?

I think you're talking about the file names in /sys/bus/usb/devices.
Kernel names them in drivers/usb/core/usb.c's usb_alloc_dev().

Roots:
dev->devpath[0] = '0';
dev_set_name(&dev->dev, "usb%d", bus->busnum);

Hubs:
if (parent->devpath[0] == '0') {
snprintf(dev->devpath, sizeof dev->devpath,
"%d", port1);
} else {
snprintf(dev->devpath, sizeof dev->devpath,
"%s.%d", parent->devpath, port1);
}
dev_set_name(&dev->dev, "%d-%s", bus->busnum, dev->devpath);

On my system:

$ ls /sys/bus/usb/devices
1-0:1.0  2-0:1.0  3-0:1.0  4-0:1.0  5-0:1.0  usb1  usb2  usb3  usb4  usb5

> This patch renames devpath to port and makes it a string.   It also
> adapts the sysfs parsing code accordingly.  The "info usbhost" monitor
> command now prints bus number, (os-assigned) device address and physical
> port for each device.
>
> Signed-off-by: Gerd Hoffmann 
> ---
>  usb-linux.c |   38 --
>  1 files changed, 20 insertions(+), 18 deletions(-)
>
> diff --git a/usb-linux.c b/usb-linux.c
> index 72fe6f5..9543a69 100644
> --- a/usb-linux.c
> +++ b/usb-linux.c
> @@ -62,7 +62,7 @@ struct usb_ctrlrequest {
>  uint16_t wLength;
>  };
>  
> -typedef int USBScanFunc(void *opaque, int bus_num, int addr, int devpath,
> +typedef int USBScanFunc(void *opaque, int bus_num, int addr, char *port,
>  int class_id, int vendor_id, int product_id,
>  const char *product_name, int speed);
>  
> @@ -79,6 +79,7 @@ typedef int USBScanFunc(void *opaque, int bus_num, int 
> addr, int devpath,
>  #define USBPROCBUS_PATH "/proc/bus/usb"
>  #define PRODUCT_NAME_SZ 32
>  #define MAX_ENDPOINTS 15
> +#define MAX_PORTLEN 8

Where does 8 come from?  For what it's worth, kernel's struct usb_device
has char devpath[16].

>  #define USBDEVBUS_PATH "/dev/bus/usb"
>  #define USBSYSBUS_PATH "/sys/bus/usb"
>  
> @@ -155,7 +156,7 @@ typedef struct USBHostDevice {
>  /* Host side address */
>  int bus_num;
>  int addr;
> -int devpath;
> +char port[MAX_PORTLEN];
>  struct USBAutoFilter match;
>  
>  QTAILQ_ENTRY(USBHostDevice) next;
> @@ -1092,7 +1093,7 @@ static int usb_linux_get_configuration(USBHostDevice *s)
>  char device_name[32], line[1024];
>  int configuration;
>  
> -sprintf(device_name, "%d-%d", s->bus_num, s->devpath);
> +sprintf(device_name, "%d-%s", s->bus_num, s->port);
>  
>  if (!usb_host_read_file(line, sizeof(line), "bConfigurationValue",
>  device_name)) {
> @@ -1138,7 +1139,7 @@ static uint8_t usb_linux_get_alt_setting(USBHostDevice 
> *s,
>  char device_name[64], line[1024];
>  int alt_setting;
>  
> -sprintf(device_name, "%d-%d:%d.%d", s->bus_num, s->devpath,
> +sprintf(device_name, "%d-%s:%d.%d", s->bus_num, s->port,
>  (int)configuration, (int)interface);
>  
>  if (!usb_host_read_file(line, sizeof(line), "bAlternateSetting",
> @@ -1257,7 +1258,7 @@ static int usb_linux_update_endp_table(USBHostDevice *s)
>  }
>  
>  static int usb_host_open(USBHostDevice *dev, int bus_num,
> - int addr, int devpath, const char *prod_name)
> + int addr, char *port, const char *prod_name)
>  {
>  int fd = -1, ret;
>  struct usbdevfs_connectinfo ci;
> @@ -1283,7 +1284,7 @@ static int usb_host_open(USBHostDevice *dev, int 
> bus_num,
>  
>  dev->bus_num = bus_num;
>  dev->addr = addr;
> -dev->devpath = devpath;
> +strcpy(dev->port, port);
>  dev->fd = fd;
>  
>  /* read the device description */
> @@ -1655,8 +1656,9 @@ static int usb_host_scan_sys(void *opaque, USBScanFunc 
> *func)
>  {
>  DIR *dir = NULL;
>  char line[1024];
> -int bus_num, addr, devpath, speed, class_id, product_id, vendor_id;
> +int bus_num, addr, speed, class_id, product_id, vendor_id;
>  int ret = 0;
> +char port[MAX_PORTLEN];
>  char product_name[512];
>  struct dirent *de;
>  
> @@ -1672,8 +1674,8 @@ static int usb_host_scan_sys(void *opaque, USBScanFunc 
> *func)
>  if (!strncmp(de->d_name, "usb", 3)) {
>  tmpstr += 3;
>  }

Sloppy parsing, but that's not your fault.

> -if (sscanf(tmpstr, "%d-%d", &

Re: [Qemu-devel] adding search to dhcp

2011-05-11 Thread Stefan Hajnoczi
On Tue, May 10, 2011 at 6:40 PM, Carl Karsten  wrote:
> I would expect the syntax to look like this:
>
> qemu -hda 1.qcow2 -net nick -net
> user,hostname=qemu,search=example.com,sales.example.com

Comma escaping is needed but it seems like a reasonable feature to me.

Stefan



Re: [Qemu-devel] [PATCH] Add warmup phase for live migration of large memory apps

2011-05-11 Thread Stefan Hajnoczi
On Wed, May 11, 2011 at 8:58 AM, Shribman, Aidan  wrote:
> From: Aidan Shribman 
>
> [PATCH] Add warmup phase for live migration of large memory apps
>
> By invoking "migrate -w " we initiate a background live-migration
> transferring of dirty pages continuously until invocation of "migrate_end"
> which attempts to complete the live migration operation.

What is the purpose of this patch?  How and when do I use it?

Some nitpicks:

> @@ -81,6 +83,11 @@ int do_migrate(Monitor *mon, const QDict *qdict, QObject 
> **ret_data)
>     int blk = qdict_get_try_bool(qdict, "blk", 0);
>     int inc = qdict_get_try_bool(qdict, "inc", 0);
>     const char *uri = qdict_get_str(qdict, "uri");
> +    is_migrate_warmup = qdict_get_try_bool(qdict, "warmup", 0);
> +
> +    if (is_migrate_warmup) {
> +        detach = 1; /* as we need migrate_end to complte */

s/complte/complete/

> +       }

Please follow the coding style and put the closing curly brace on the
same column as the 'if' statement.

> +int qemu_savevm_state_warmup(Monitor *mon, QEMUFile *f) {
> +    int ret = 1;

1 is overwritten immediately.

Stefan



Re: [Qemu-devel] -net PCI bus order reversed since 1.14.0

2011-05-11 Thread Rob Landley
On 05/10/2011 02:51 AM, Rob Landley wrote:
> Until recently, -net options used to add interfaces to linux in the same
> order they went on the command line, so the first one you listed on the
> qemu command line would become eth0, the second -net became eth1, and so
> on.  Now they're added in _reverse_ order, so the _last_ one on the
> command line is eth0.
> 
> I bisected the behavior change to commit
> 60c07d933c66c4b30a83b7ccbc8a0cb3df1b2d0e but don't understand why it
> changes this behavior.  (Probably making udev do something stupid, but
> it happens consistently both before and after...)
> 
> Rob

I poked at this a bit more, and got the problem to be more explicit:

$ ~/qemu/x86_64-softmmu/qemu-system-x86_64 -m 512 \
  -kernel ~/linux/arch/x86/boot/bzImage -no-reboot -hda squeeze.ext3 \
  -append "root=/dev/hda rw" \
  -net nic,model=e1000,macaddr=52:54:00:11:11:11 -net user \
  -redir tcp:9876::22 \
  -net nic,model=e1000,macaddr=52:54:00:22:22:22 \
  -net tap,ifname=kvm0,script=no,downscript=no

Before the above commit the 11:11:11 interface becomes -net user (and
thus dhclient works on it), and the 22:22:22 interface becomes -net tap
and needs a static IP.

After the above commit, 22:22:22 gets 10.0.2.16 (instead of 10.0.2.15),
and 11:11:11 doesn't respond to dhcp.

Would anyone like to explain what's going on, and what I _should_ be
doing here to get predictable behavior?  I deleted
/etc/udev/rules.d/70-persistent-net.rules when it first started
misbehaving, but that's not it.  The association between -net nic and
-net user/tap no longer seems to be in the order the command line
options go.

Rob



Re: [Qemu-devel] [PATCH] Add AACI audio playback support to the ARM Versatile/PB platform

2011-05-11 Thread Paul Brook
> The PL041 driver provides an interface to an ACLink bus.
> The LM4549 driver emulates a DAC connected on the ACLink bus.
> Only audio playback is implemented.

Shouldn't this be shared with the other AC97 devices?

Paul



[Qemu-devel] [PATCH 2/2] coroutine: add check-coroutine automated tests

2011-05-11 Thread Stefan Hajnoczi
To run automated tests for coroutines:

  make check-coroutine
  ./check-coroutine

On success the program terminates with exit status 0.  On failure an
error message is written to stderr and the program exits with exit
status 1.

Signed-off-by: Stefan Hajnoczi 
---
 Makefile  |3 +-
 check-coroutine.c |  188 +
 2 files changed, 190 insertions(+), 1 deletions(-)
 create mode 100644 check-coroutine.c

diff --git a/Makefile b/Makefile
index 67c0268..27ec6a1 100644
--- a/Makefile
+++ b/Makefile
@@ -130,7 +130,7 @@ qemu-io$(EXESUF): qemu-io.o cmd.o qemu-tool.o qemu-error.o 
$(oslib-obj-y) $(trac
 qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@,"  GEN  
 $@")
 
-check-qint.o check-qstring.o check-qdict.o check-qlist.o check-qfloat.o 
check-qjson.o: $(GENERATED_HEADERS)
+check-qint.o check-qstring.o check-qdict.o check-qlist.o check-qfloat.o 
check-qjson.o check-coroutine.o: $(GENERATED_HEADERS)
 
 CHECK_PROG_DEPS = qemu-malloc.o $(oslib-obj-y) $(trace-obj-y)
 
@@ -140,6 +140,7 @@ check-qdict: check-qdict.o qdict.o qfloat.o qint.o 
qstring.o qbool.o qlist.o $(C
 check-qlist: check-qlist.o qlist.o qint.o $(CHECK_PROG_DEPS)
 check-qfloat: check-qfloat.o qfloat.o $(CHECK_PROG_DEPS)
 check-qjson: check-qjson.o qfloat.o qint.o qdict.o qstring.o qlist.o qbool.o 
qjson.o json-streamer.o json-lexer.o json-parser.o $(CHECK_PROG_DEPS)
+check-coroutine: check-coroutine.o $(coroutine-obj-y) $(CHECK_PROG_DEPS)
 
 QEMULIBS=libhw32 libhw64 libuser libdis libdis-user
 
diff --git a/check-coroutine.c b/check-coroutine.c
new file mode 100644
index 000..f65ac2e
--- /dev/null
+++ b/check-coroutine.c
@@ -0,0 +1,188 @@
+/*
+ * Coroutine tests
+ *
+ * Copyright IBM, Corp. 2011
+ *
+ * Authors:
+ *  Stefan Hajnoczi
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ *
+ */
+
+#include 
+#include 
+#include "qemu-coroutine.h"
+
+static const char *cur_test_name;
+
+static void test_assert(bool condition, const char *msg)
+{
+if (!condition) {
+fprintf(stderr, "%s: %s\n", cur_test_name, msg);
+exit(EXIT_FAILURE);
+}
+}
+
+/*
+ * Check that qemu_in_coroutine() works
+ */
+
+static void coroutine_fn verify_in_coroutine(void *opaque)
+{
+test_assert(qemu_in_coroutine(), "expected coroutine context");
+}
+
+static void test_in_coroutine(void)
+{
+Coroutine *coroutine;
+
+test_assert(!qemu_in_coroutine(), "expected no coroutine context");
+
+coroutine = qemu_coroutine_create(verify_in_coroutine);
+qemu_coroutine_enter(coroutine, NULL);
+}
+
+/*
+ * Check that qemu_coroutine_self() works
+ */
+
+static void coroutine_fn verify_self(void *opaque)
+{
+test_assert(qemu_coroutine_self() == opaque,
+"qemu_coroutine_self() did not return this coroutine");
+}
+
+static void test_self(void)
+{
+Coroutine *coroutine;
+
+coroutine = qemu_coroutine_create(verify_self);
+qemu_coroutine_enter(coroutine, coroutine);
+}
+
+/*
+ * Check that coroutines may nest multiple levels
+ */
+
+typedef struct {
+unsigned int n_enter;   /* num coroutines entered */
+unsigned int n_return;  /* num coroutines returned */
+unsigned int max;   /* maximum level of nesting */
+} NestData;
+
+static void coroutine_fn nest(void *opaque)
+{
+NestData *nd = opaque;
+
+nd->n_enter++;
+
+if (nd->n_enter < nd->max) {
+Coroutine *child;
+
+child = qemu_coroutine_create(nest);
+qemu_coroutine_enter(child, nd);
+}
+
+nd->n_return++;
+}
+
+static void test_nesting(void)
+{
+Coroutine *root;
+NestData nd = {
+.n_enter  = 0,
+.n_return = 0,
+.max  = 1,
+};
+
+root = qemu_coroutine_create(nest);
+qemu_coroutine_enter(root, &nd);
+
+test_assert(nd.n_enter == nd.max,
+"failed entering to max nesting level");
+test_assert(nd.n_return == nd.max,
+"failed returning from max nesting level");
+}
+
+/*
+ * Check that yield/enter transfer control correctly
+ */
+
+static void coroutine_fn yield_5_times(void *opaque)
+{
+bool *done = opaque;
+int i;
+
+for (i = 0; i < 5; i++) {
+qemu_coroutine_yield();
+}
+*done = true;
+}
+
+static void test_yield(void)
+{
+Coroutine *coroutine;
+bool done = false;
+int i = -1; /* one extra time to return from coroutine */
+
+coroutine = qemu_coroutine_create(yield_5_times);
+while (!done) {
+qemu_coroutine_enter(coroutine, &done);
+i++;
+}
+test_assert(i == 5, "coroutine did not yield 5 times");
+}
+
+/*
+ * Check that creation, enter, and return work
+ */
+
+static void coroutine_fn set_and_exit(void *opaque)
+{
+bool *done = opaque;
+
+*done = true;
+}
+
+static void test_lifecycle(void)
+{
+Coroutine *coroutine;
+bool done = fal

[Qemu-devel] [PATCH 0/2] Coroutines for better asynchronous programming

2011-05-11 Thread Stefan Hajnoczi
QEMU is event-driven and suffers when blocking operations are performed because
VM execution may be stopped until the operation completes.  Therefore many
operations that could block are performed asynchronously and a callback is
invoked when the operation has completed.  This allows QEMU to continue
executing while the operation is pending.

The downside to callbacks is that they split up code into many smaller
functions, each of which is a single step in a state machine that quickly
becomes complex and hard to understand.  Callback functions also result in lots
of noise as variables are packed and unpacked into temporary structs that pass
state to the callback function.

This patch series introduces coroutines as a solution for writing asynchronous
code while still having a nice sequential control flow.  The semantics are
explained in the first patch.  The second patch adds automated tests.

A nice feature of coroutines is that it is relatively easy to take synchronous
code and lift it into a coroutine to make it asynchronous.  Work has been done
to move qcow2 request processing into coroutines and thereby make it
asynchronous (today qcow2 will perform synchronous metadata accesses).  This
qcow2 work is still ongoing and not quite ready for mainline yet.

Coroutines are also being used for virtfs (virtio-9p) so I have submitted this
patch now because virtfs patches that depend on coroutines will follow shortly.

Other areas of QEMU that could take advantage of coroutines include the VNC
server, migration, and qemu-tools.




[Qemu-devel] [PATCH 1/2] coroutine: introduce coroutines

2011-05-11 Thread Stefan Hajnoczi
From: Kevin Wolf 

Asynchronous code is becoming very complex.  At the same time
synchronous code is growing because it is convenient to write.
Sometimes duplicate code paths are even added, one synchronous and the
other asynchronous.  This patch introduces coroutines which allow code
that looks synchronous but is asynchronous under the covers.

A coroutine has its own stack and is therefore able to preserve state
across blocking operations, which traditionally require callback
functions and manual marshalling of parameters.

Creating and starting a coroutine is easy:

  coroutine = qemu_coroutine_create(my_coroutine);
  qemu_coroutine_enter(coroutine, my_data);

The coroutine then executes until it returns or yields:

  void coroutine_fn my_coroutine(void *opaque) {
  MyData *my_data = opaque;

  /* do some work */

  qemu_coroutine_yield();

  /* do some more work */
  }

Yielding switches control back to the caller of qemu_coroutine_enter().
This is typically used to switch back to the main thread's event loop
after issuing an asynchronous I/O request.  The request callback will
then invoke qemu_coroutine_enter() once more to switch back to the
coroutine.

Note that coroutines never execute concurrently and should only be used
from threads which hold the global mutex.  This restriction makes
programming with coroutines easier than with threads.  Race conditions
cannot occur since only one coroutine may be active at any time.  Other
coroutines can only run across yield.

This coroutines implementation is based on the gtk-vnc implementation
written by Anthony Liguori  but it has been
significantly rewritten by Kevin Wolf  to use
setjmp()/longjmp() instead of the more expensive swapcontext().

Signed-off-by: Kevin Wolf 
Signed-off-by: Stefan Hajnoczi 
---
 Makefile.objs|7 +++
 coroutine-ucontext.c |   73 +++
 coroutine-win32.c|   57 +
 qemu-coroutine-int.h |   57 +
 qemu-coroutine.c |  132 ++
 qemu-coroutine.h |   82 +++
 trace-events |5 ++
 7 files changed, 413 insertions(+), 0 deletions(-)
 create mode 100644 coroutine-ucontext.c
 create mode 100644 coroutine-win32.c
 create mode 100644 qemu-coroutine-int.h
 create mode 100644 qemu-coroutine.c
 create mode 100644 qemu-coroutine.h

diff --git a/Makefile.objs b/Makefile.objs
index 9d8851e..cba6c2b 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -11,6 +11,12 @@ oslib-obj-$(CONFIG_WIN32) += oslib-win32.o
 oslib-obj-$(CONFIG_POSIX) += oslib-posix.o
 
 ###
+# coroutines
+coroutine-obj-y = qemu-coroutine.o
+coroutine-obj-$(CONFIG_POSIX) += coroutine-ucontext.o
+coroutine-obj-$(CONFIG_WIN32) += coroutine-win32.o
+
+###
 # block-obj-y is code used by both qemu system emulation and qemu-img
 
 block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o module.o 
async.o
@@ -67,6 +73,7 @@ common-obj-y += readline.o console.o cursor.o qemu-error.o
 common-obj-y += $(oslib-obj-y)
 common-obj-$(CONFIG_WIN32) += os-win32.o
 common-obj-$(CONFIG_POSIX) += os-posix.o
+common-obj-y += $(coroutine-obj-y)
 
 common-obj-y += tcg-runtime.o host-utils.o
 common-obj-y += irq.o ioport.o input.o
diff --git a/coroutine-ucontext.c b/coroutine-ucontext.c
new file mode 100644
index 000..97f2b35
--- /dev/null
+++ b/coroutine-ucontext.c
@@ -0,0 +1,73 @@
+/*
+ * ucontext coroutine initialization code
+ *
+ * Copyright (C) 2006  Anthony Liguori 
+ * Copyright (C) 2011  Kevin Wolf 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.0 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301 
USA
+ */
+
+/* XXX Is there a nicer way to disable glibc's stack check for longjmp? */
+#ifdef _FORTIFY_SOURCE
+#undef _FORTIFY_SOURCE
+#endif
+#include 
+#include 
+#include 
+#include "qemu-coroutine-int.h"
+
+static Coroutine *new_coroutine;
+
+static void continuation_trampoline(void)
+{
+Coroutine *co = new_coroutine;
+
+/* Initialize longjmp environment and switch back to
+ * qemu_coroutine_init_env() in the old ucontext. */
+if (!setjmp(co->env)) {
+return;
+}
+
+while (true) {
+co->entry(co->d

Re: [Qemu-devel] adding search to dhcp

2011-05-11 Thread Markus Armbruster
Stefan Hajnoczi  writes:

> On Tue, May 10, 2011 at 6:40 PM, Carl Karsten  wrote:
>> I would expect the syntax to look like this:
>>
>> qemu -hda 1.qcow2 -net nick -net
>> user,hostname=qemu,search=example.com,sales.example.com
>
> Comma escaping is needed but it seems like a reasonable feature to me.

Comma escaping is ugly:
-net user,hostname=qemu,search=example.com,,sales.example.com

Could we have multiple search options instead?  Like this:
-net user,hostname=qemu,search=example.com,search=sales.example.com



Re: [Qemu-devel] [PATCH 1/2] coroutine: introduce coroutines

2011-05-11 Thread Kevin Wolf
Am 11.05.2011 12:15, schrieb Stefan Hajnoczi:
> From: Kevin Wolf 
> 
> Asynchronous code is becoming very complex.  At the same time
> synchronous code is growing because it is convenient to write.
> Sometimes duplicate code paths are even added, one synchronous and the
> other asynchronous.  This patch introduces coroutines which allow code
> that looks synchronous but is asynchronous under the covers.
> 
> A coroutine has its own stack and is therefore able to preserve state
> across blocking operations, which traditionally require callback
> functions and manual marshalling of parameters.
> 
> Creating and starting a coroutine is easy:
> 
>   coroutine = qemu_coroutine_create(my_coroutine);
>   qemu_coroutine_enter(coroutine, my_data);
> 
> The coroutine then executes until it returns or yields:
> 
>   void coroutine_fn my_coroutine(void *opaque) {
>   MyData *my_data = opaque;
> 
>   /* do some work */
> 
>   qemu_coroutine_yield();
> 
>   /* do some more work */
>   }
> 
> Yielding switches control back to the caller of qemu_coroutine_enter().
> This is typically used to switch back to the main thread's event loop
> after issuing an asynchronous I/O request.  The request callback will
> then invoke qemu_coroutine_enter() once more to switch back to the
> coroutine.
> 
> Note that coroutines never execute concurrently and should only be used
> from threads which hold the global mutex.  This restriction makes
> programming with coroutines easier than with threads.  Race conditions
> cannot occur since only one coroutine may be active at any time.  Other
> coroutines can only run across yield.
> 
> This coroutines implementation is based on the gtk-vnc implementation
> written by Anthony Liguori  but it has been
> significantly rewritten by Kevin Wolf  to use
> setjmp()/longjmp() instead of the more expensive swapcontext().
> 
> Signed-off-by: Kevin Wolf 
> Signed-off-by: Stefan Hajnoczi 

For the diff between my latest version and this patch:

Reviewed-by: Kevin Wolf 



Re: [Qemu-devel] Bug #757654: UHCI fails to signal stall response patch

2011-05-11 Thread Jan Vesely
glad I could help

the original bug report is here: https://bugs.launchpad.net/qemu/+bug/757654
should I update it? or will it be updated when the patch reaches master?

j.

On Mon, May 9, 2011 at 1:43 PM, Gerd Hoffmann  wrote:
> On 05/09/11 12:16, Jan Vesely wrote:
>>
>> UHCI host controller status register indicates error and
>> an interrupt is triggered on BABBLE and STALL errors.
>
> Queued up.
>
> thanks,
>  Gerd
>
>



Re: [Qemu-devel] [PATCH uq/master V2] kvm: Add CPUID support for VIA CPU

2011-05-11 Thread Avi Kivity

On 05/10/2011 11:02 AM, BrillyWu wrote:

From: BrillyWu

When KVM is running on VIA CPU with host cpu's model, the
feautures of VIA CPU will be passed into kvm guest by calling
the CPUID instruction for Centaur.



Applied, thanks.

--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [PATCH 1/2] coroutine: introduce coroutines

2011-05-11 Thread Paolo Bonzini

On 05/11/2011 12:15 PM, Stefan Hajnoczi wrote:

+#ifdef __i386__
+asm volatile(
+"mov %%esp, %%ebx;"
+"mov %0, %%esp;"
+"pushl %1;"
+"call _trampoline;"
+"mov %%ebx, %%esp;"
+: : "r" (co->stack + co->stack_size), "r" (co) : "ebx"
+);


This is incomplete, it should set FS:[4] and FS:[8] to top and bottom of 
stack respectively, otherwise exception handling (including SIGSEGV) is 
broken.  But I think for Windows it's anyway better to use fibers. 
Commit this and either I or Stefan Weil will fix it. :)


Acked-by: Paolo Bonzini 

Paolo



Re: [Qemu-devel] [PATCH 1/2] coroutine: introduce coroutines

2011-05-11 Thread Kevin Wolf
Am 11.05.2011 14:04, schrieb Paolo Bonzini:
> On 05/11/2011 12:15 PM, Stefan Hajnoczi wrote:
>> +#ifdef __i386__
>> +asm volatile(
>> +"mov %%esp, %%ebx;"
>> +"mov %0, %%esp;"
>> +"pushl %1;"
>> +"call _trampoline;"
>> +"mov %%ebx, %%esp;"
>> +: : "r" (co->stack + co->stack_size), "r" (co) : "ebx"
>> +);
> 
> This is incomplete, it should set FS:[4] and FS:[8] to top and bottom of 
> stack respectively, otherwise exception handling (including SIGSEGV) is 
> broken.  But I think for Windows it's anyway better to use fibers. 
> Commit this and either I or Stefan Weil will fix it. :)
> 
> Acked-by: Paolo Bonzini 

Yeah, I didn't feel like searching for the right APIs, but wanted to
have something that builds for win32 at least. And this one seemed to
work for some simple tests in Wine.

So my plan with it was to CC Stefan Weil and have him provide the real
implementation that works on 64 bit, too. ;-)

Kevin



Re: [Qemu-devel] [PATCH 1/2] coroutine: introduce coroutines

2011-05-11 Thread Anthony Liguori

On 05/11/2011 05:15 AM, Stefan Hajnoczi wrote:

From: Kevin Wolf

Asynchronous code is becoming very complex.  At the same time
synchronous code is growing because it is convenient to write.
Sometimes duplicate code paths are even added, one synchronous and the
other asynchronous.  This patch introduces coroutines which allow code
that looks synchronous but is asynchronous under the covers.

A coroutine has its own stack and is therefore able to preserve state
across blocking operations, which traditionally require callback
functions and manual marshalling of parameters.

Creating and starting a coroutine is easy:

   coroutine = qemu_coroutine_create(my_coroutine);
   qemu_coroutine_enter(coroutine, my_data);


Why do away with yieldto?

Do we have performance data around setjmp vs. setcontext?



The coroutine then executes until it returns or yields:

   void coroutine_fn my_coroutine(void *opaque) {
   MyData *my_data = opaque;

   /* do some work */

   qemu_coroutine_yield();

   /* do some more work */
   }

Yielding switches control back to the caller of qemu_coroutine_enter().
This is typically used to switch back to the main thread's event loop
after issuing an asynchronous I/O request.  The request callback will
then invoke qemu_coroutine_enter() once more to switch back to the
coroutine.

Note that coroutines never execute concurrently and should only be used
from threads which hold the global mutex.  This restriction makes
programming with coroutines easier than with threads.  Race conditions
cannot occur since only one coroutine may be active at any time.  Other
coroutines can only run across yield.

This coroutines implementation is based on the gtk-vnc implementation
written by Anthony Liguori  but it has been
significantly rewritten by Kevin Wolf  to use
setjmp()/longjmp() instead of the more expensive swapcontext().

Signed-off-by: Kevin Wolf
Signed-off-by: Stefan Hajnoczi
---
  Makefile.objs|7 +++
  coroutine-ucontext.c |   73 +++
  coroutine-win32.c|   57 +
  qemu-coroutine-int.h |   57 +
  qemu-coroutine.c |  132 ++
  qemu-coroutine.h |   82 +++
  trace-events |5 ++
  7 files changed, 413 insertions(+), 0 deletions(-)
  create mode 100644 coroutine-ucontext.c
  create mode 100644 coroutine-win32.c
  create mode 100644 qemu-coroutine-int.h
  create mode 100644 qemu-coroutine.c
  create mode 100644 qemu-coroutine.h

diff --git a/Makefile.objs b/Makefile.objs
index 9d8851e..cba6c2b 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -11,6 +11,12 @@ oslib-obj-$(CONFIG_WIN32) += oslib-win32.o
  oslib-obj-$(CONFIG_POSIX) += oslib-posix.o

  ###
+# coroutines
+coroutine-obj-y = qemu-coroutine.o
+coroutine-obj-$(CONFIG_POSIX) += coroutine-ucontext.o
+coroutine-obj-$(CONFIG_WIN32) += coroutine-win32.o
+
+###
  # block-obj-y is code used by both qemu system emulation and qemu-img

  block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o module.o 
async.o
@@ -67,6 +73,7 @@ common-obj-y += readline.o console.o cursor.o qemu-error.o
  common-obj-y += $(oslib-obj-y)
  common-obj-$(CONFIG_WIN32) += os-win32.o
  common-obj-$(CONFIG_POSIX) += os-posix.o
+common-obj-y += $(coroutine-obj-y)

  common-obj-y += tcg-runtime.o host-utils.o
  common-obj-y += irq.o ioport.o input.o
diff --git a/coroutine-ucontext.c b/coroutine-ucontext.c
new file mode 100644
index 000..97f2b35
--- /dev/null
+++ b/coroutine-ucontext.c
@@ -0,0 +1,73 @@
+/*
+ * ucontext coroutine initialization code
+ *
+ * Copyright (C) 2006  Anthony Liguori
+ * Copyright (C) 2011  Kevin Wolf
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.0 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301 
USA
+ */
+
+/* XXX Is there a nicer way to disable glibc's stack check for longjmp? */
+#ifdef _FORTIFY_SOURCE
+#undef _FORTIFY_SOURCE
+#endif
+#include
+#include
+#include
+#include "qemu-coroutine-int.h"
+
+static Coroutine *new_coroutine;
+
+static void continuation_trampoline(void)
+{
+Coroutine *co = new_coroutine;
+
+/* Initialize longjmp environment and switch back to

Re: [Qemu-devel] [PATCH 1/2] coroutine: introduce coroutines

2011-05-11 Thread Paolo Bonzini

On 05/11/2011 02:36 PM, Anthony Liguori wrote:


So the only Linux host we support is x86??

We can't reasonably do this IMHO.  If we're going to go this route, we
should at least fall back to setcontext for the sake of portability.


That was:

diff --git a/coroutine-win32.c b/coroutine-win32.c
new file mode 100644
index 000..f4521c3
--- /dev/null
+++ b/coroutine-win32.c

See my reply too.

Paolo




Re: [Qemu-devel] [PATCH 1/2] coroutine: introduce coroutines

2011-05-11 Thread Anthony Liguori

On 05/11/2011 07:04 AM, Paolo Bonzini wrote:

On 05/11/2011 12:15 PM, Stefan Hajnoczi wrote:

+#ifdef __i386__
+ asm volatile(
+ "mov %%esp, %%ebx;"
+ "mov %0, %%esp;"
+ "pushl %1;"
+ "call _trampoline;"
+ "mov %%ebx, %%esp;"
+ : : "r" (co->stack + co->stack_size), "r" (co) : "ebx"
+ );


This is incomplete, it should set FS:[4] and FS:[8] to top and bottom of
stack respectively, otherwise exception handling (including SIGSEGV) is
broken. But I think for Windows it's anyway better to use fibers. Commit
this and either I or Stefan Weil will fix it. :)

Acked-by: Paolo Bonzini 


How about a generic thread fallback?  That's what we do in gtk-vnc and 
it solves the portability issue in a very robust way.


Regards,

Anthony Liguori



Paolo






Re: [Qemu-devel] [PATCH 1/2] coroutine: introduce coroutines

2011-05-11 Thread Paolo Bonzini

On 05/11/2011 02:51 PM, Anthony Liguori wrote:

How about a generic thread fallback?  That's what we do in gtk-vnc and
it solves the portability issue in a very robust way.


A very slow way, too (on Windows at least if you use qemu_cond...).

Paolo



Re: [Qemu-devel] [PATCH 1/2] coroutine: introduce coroutines

2011-05-11 Thread Stefan Hajnoczi
On Wed, May 11, 2011 at 1:46 PM, Paolo Bonzini  wrote:
> On 05/11/2011 02:36 PM, Anthony Liguori wrote:
> diff --git a/coroutine-win32.c b/coroutine-win32.c

Kevin: This reminds me that I did not run ./check-coroutine on win32.
If you are able to run it in your environment that would be good to
make sure the Windows code works.

Stefan



Re: [Qemu-devel] [PATCH 1/2] coroutine: introduce coroutines

2011-05-11 Thread Kevin Wolf
Am 11.05.2011 14:36, schrieb Anthony Liguori:
> On 05/11/2011 05:15 AM, Stefan Hajnoczi wrote:
>> From: Kevin Wolf
>>
>> Asynchronous code is becoming very complex.  At the same time
>> synchronous code is growing because it is convenient to write.
>> Sometimes duplicate code paths are even added, one synchronous and the
>> other asynchronous.  This patch introduces coroutines which allow code
>> that looks synchronous but is asynchronous under the covers.
>>
>> A coroutine has its own stack and is therefore able to preserve state
>> across blocking operations, which traditionally require callback
>> functions and manual marshalling of parameters.
>>
>> Creating and starting a coroutine is easy:
>>
>>coroutine = qemu_coroutine_create(my_coroutine);
>>qemu_coroutine_enter(coroutine, my_data);
> 
> Why do away with yieldto?
> 
> Do we have performance data around setjmp vs. setcontext?

I did some quick ad-hoc tests when I introduced setjmp, but don't have
them any more. IIRC, it was something like a factor of 3.

>> diff --git a/coroutine-win32.c b/coroutine-win32.c
>> new file mode 100644
>> index 000..f4521c3
>> --- /dev/null
>> +++ b/coroutine-win32.c
>> @@ -0,0 +1,57 @@
>> +/*
>> + * Win32 coroutine initialization code
>> + *
>> + * Copyright (c) 2011 Kevin Wolf
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a 
>> copy
>> + * of this software and associated documentation files (the "Software"), to 
>> deal
>> + * in the Software without restriction, including without limitation the 
>> rights
>> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
>> + * copies of the Software, and to permit persons to whom the Software is
>> + * furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice shall be included 
>> in
>> + * all copies or substantial portions of the Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
>> OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
>> OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
>> FROM,
>> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
>> + * THE SOFTWARE.
>> + */
>> +
>> +#include "qemu-coroutine-int.h"
>> +
>> +static void __attribute__((used)) trampoline(Coroutine *co)
>> +{
>> +if (!setjmp(co->env)) {
>> +return;
>> +}
>> +
>> +while (true) {
>> +co->entry(co->data);
>> +if (!setjmp(co->env)) {
>> +longjmp(co->caller->env, COROUTINE_TERMINATE);
>> +}
>> +}
>> +}
>> +
>> +int qemu_coroutine_init_env(Coroutine *co)
>> +{
>> +#ifdef __i386__
>> +asm volatile(
>> +"mov %%esp, %%ebx;"
>> +"mov %0, %%esp;"
>> +"pushl %1;"
>> +"call _trampoline;"
>> +"mov %%ebx, %%esp;"
>> +: : "r" (co->stack + co->stack_size), "r" (co) : "ebx"
>> +);
> 
> So the only Linux host we support is x86??
> 
> We can't reasonably do this IMHO.  If we're going to go this route, we 
> should at least fall back to setcontext for the sake of portability.

This is win32 code, Linux uses ucontext for initialization.

Kevin



Re: [Qemu-devel] [PATCH 1/2] coroutine: introduce coroutines

2011-05-11 Thread Anthony Liguori

On 05/11/2011 07:52 AM, Paolo Bonzini wrote:

On 05/11/2011 02:51 PM, Anthony Liguori wrote:

How about a generic thread fallback? That's what we do in gtk-vnc and
it solves the portability issue in a very robust way.


A very slow way, too (on Windows at least if you use qemu_cond...).


That doesn't mean you can't do a fiber implementation for Windows... but 
having a highly portable fallback is a good thing.


Regards,

Anthony Liguori


Paolo





[Qemu-devel] [PATCH 2/5] spice: enable thread support

2011-05-11 Thread Gerd Hoffmann
The spice locking fixes cherry-picked from master need qemu mutexes.
Enable CONFIG_THREAD when enabling spice so they get linked in.

Signed-off-by: Gerd Hoffmann 
---
 configure |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/configure b/configure
index f2551b3..b2eade1 100755
--- a/configure
+++ b/configure
@@ -2742,6 +2742,7 @@ fi
 
 if test "$spice" = "yes" ; then
   echo "CONFIG_SPICE=y" >> $config_host_mak
+  echo "CONFIG_THREAD=y" >> $config_host_mak
 fi
 
 # XXX: suppress that
-- 
1.7.1




[Qemu-devel] [PATCH 4/5] spice: don't call displaystate callbacks from spice server context.

2011-05-11 Thread Gerd Hoffmann
This patch moves the displaystate callback calls for setting the cursor
and the mouse pointer from spice server to qemu (iothread) context.
This allows us to simplify locking.

Signed-off-by: Gerd Hoffmann 
(cherry picked from commit 075360945860ad9bdd491921954b383bf762b0e5)
---
 hw/qxl-render.c|   25 -
 hw/qxl.c   |2 ++
 ui/spice-display.c |   12 
 ui/spice-display.h |3 +++
 4 files changed, 29 insertions(+), 13 deletions(-)

diff --git a/hw/qxl-render.c b/hw/qxl-render.c
index 58965e0..1316066 100644
--- a/hw/qxl-render.c
+++ b/hw/qxl-render.c
@@ -185,7 +185,6 @@ void qxl_render_cursor(PCIQXLDevice *qxl, QXLCommandExt 
*ext)
 QXLCursorCmd *cmd = qxl_phys2virt(qxl, ext->cmd.data, ext->group_id);
 QXLCursor *cursor;
 QEMUCursor *c;
-int x = -1, y = -1;
 
 if (!qxl->ssd.ds->mouse_set || !qxl->ssd.ds->cursor_define) {
 return;
@@ -198,8 +197,6 @@ void qxl_render_cursor(PCIQXLDevice *qxl, QXLCommandExt 
*ext)
 }
 switch (cmd->type) {
 case QXL_CURSOR_SET:
-x = cmd->u.set.position.x;
-y = cmd->u.set.position.y;
 cursor = qxl_phys2virt(qxl, cmd->u.set.shape, ext->group_id);
 if (cursor->chunk.data_size != cursor->data_size) {
 fprintf(stderr, "%s: multiple chunks\n", __FUNCTION__);
@@ -209,18 +206,20 @@ void qxl_render_cursor(PCIQXLDevice *qxl, QXLCommandExt 
*ext)
 if (c == NULL) {
 c = cursor_builtin_left_ptr();
 }
-qemu_mutex_lock_iothread();
-qxl->ssd.ds->cursor_define(c);
-qxl->ssd.ds->mouse_set(x, y, 1);
-qemu_mutex_unlock_iothread();
-cursor_put(c);
+qemu_mutex_lock(&qxl->ssd.lock);
+if (qxl->ssd.cursor) {
+cursor_put(qxl->ssd.cursor);
+}
+qxl->ssd.cursor = c;
+qxl->ssd.mouse_x = cmd->u.set.position.x;
+qxl->ssd.mouse_y = cmd->u.set.position.y;
+qemu_mutex_unlock(&qxl->ssd.lock);
 break;
 case QXL_CURSOR_MOVE:
-x = cmd->u.position.x;
-y = cmd->u.position.y;
-qemu_mutex_lock_iothread();
-qxl->ssd.ds->mouse_set(x, y, 1);
-qemu_mutex_unlock_iothread();
+qemu_mutex_lock(&qxl->ssd.lock);
+qxl->ssd.mouse_x = cmd->u.position.x;
+qxl->ssd.mouse_y = cmd->u.position.y;
+qemu_mutex_unlock(&qxl->ssd.lock);
 break;
 }
 }
diff --git a/hw/qxl.c b/hw/qxl.c
index bd250db..4dfddf0 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -1309,6 +1309,8 @@ static int qxl_init_primary(PCIDevice *dev)
qxl_hw_screen_dump, qxl_hw_text_update, 
qxl);
 qxl->ssd.ds = vga->ds;
 qemu_mutex_init(&qxl->ssd.lock);
+qxl->ssd.mouse_x = -1;
+qxl->ssd.mouse_y = -1;
 qxl->ssd.bufsize = (16 * 1024 * 1024);
 qxl->ssd.buf = qemu_malloc(qxl->ssd.bufsize);
 
diff --git a/ui/spice-display.c b/ui/spice-display.c
index d56dcfc..8579bfd 100644
--- a/ui/spice-display.c
+++ b/ui/spice-display.c
@@ -254,6 +254,16 @@ void qemu_spice_display_refresh(SimpleSpiceDisplay *ssd)
 ssd->update = qemu_spice_create_update(ssd);
 ssd->notify++;
 }
+if (ssd->cursor) {
+ssd->ds->cursor_define(ssd->cursor);
+cursor_put(ssd->cursor);
+ssd->cursor = NULL;
+}
+if (ssd->mouse_x != -1 && ssd->mouse_y != -1) {
+ssd->ds->mouse_set(ssd->mouse_x, ssd->mouse_y, 1);
+ssd->mouse_x = -1;
+ssd->mouse_y = -1;
+}
 qemu_mutex_unlock(&ssd->lock);
 
 if (ssd->notify) {
@@ -409,6 +419,8 @@ void qemu_spice_display_init(DisplayState *ds)
 assert(sdpy.ds == NULL);
 sdpy.ds = ds;
 qemu_mutex_init(&sdpy.lock);
+sdpy.mouse_x = -1;
+sdpy.mouse_y = -1;
 sdpy.bufsize = (16 * 1024 * 1024);
 sdpy.buf = qemu_malloc(sdpy.bufsize);
 register_displaychangelistener(ds, &display_listener);
diff --git a/ui/spice-display.h b/ui/spice-display.h
index e0cc46e..2f95f68 100644
--- a/ui/spice-display.h
+++ b/ui/spice-display.h
@@ -20,6 +20,7 @@
 #include 
 
 #include "qemu-thread.h"
+#include "console.h"
 #include "pflib.h"
 
 #define NUM_MEMSLOTS 8
@@ -55,6 +56,8 @@ struct SimpleSpiceDisplay {
  */
 QemuMutex lock;
 SimpleSpiceUpdate *update;
+QEMUCursor *cursor;
+int mouse_x, mouse_y;
 };
 
 struct SimpleSpiceUpdate {
-- 
1.7.1




[Qemu-devel] [PATCH 5/5] spice: drop obsolete iothread locking

2011-05-11 Thread Gerd Hoffmann
We don't use qemu internals from spice server context any more.
Thus we don't also need to grab the iothread mutex from spice
server context.  And we don't have to temporarely release the
lock to avoid deadlocks.  Drop all the calls.

Signed-off-by: Gerd Hoffmann 
(cherry picked from commit 196a778428989217b82de042725dc8eb29c8f8d8)
---
 hw/qxl.c   |8 
 ui/spice-display.c |6 --
 2 files changed, 0 insertions(+), 14 deletions(-)

diff --git a/hw/qxl.c b/hw/qxl.c
index 4dfddf0..2bb36c6 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -666,10 +666,8 @@ static void qxl_hard_reset(PCIQXLDevice *d, int loadvm)
 dprint(d, 1, "%s: start%s\n", __FUNCTION__,
loadvm ? " (loadvm)" : "");
 
-qemu_mutex_unlock_iothread();
 d->ssd.worker->reset_cursor(d->ssd.worker);
 d->ssd.worker->reset_image_cache(d->ssd.worker);
-qemu_mutex_lock_iothread();
 qxl_reset_surfaces(d);
 qxl_reset_memslots(d);
 
@@ -799,9 +797,7 @@ static void qxl_reset_surfaces(PCIQXLDevice *d)
 {
 dprint(d, 1, "%s:\n", __FUNCTION__);
 d->mode = QXL_MODE_UNDEFINED;
-qemu_mutex_unlock_iothread();
 d->ssd.worker->destroy_surfaces(d->ssd.worker);
-qemu_mutex_lock_iothread();
 memset(&d->guest_surfaces.cmds, 0, sizeof(d->guest_surfaces.cmds));
 }
 
@@ -870,9 +866,7 @@ static void qxl_destroy_primary(PCIQXLDevice *d)
 dprint(d, 1, "%s\n", __FUNCTION__);
 
 d->mode = QXL_MODE_UNDEFINED;
-qemu_mutex_unlock_iothread();
 d->ssd.worker->destroy_primary_surface(d->ssd.worker, 0);
-qemu_mutex_lock_iothread();
 }
 
 static void qxl_set_mode(PCIQXLDevice *d, int modenr, int loadvm)
@@ -942,10 +936,8 @@ static void ioport_write(void *opaque, uint32_t addr, 
uint32_t val)
 case QXL_IO_UPDATE_AREA:
 {
 QXLRect update = d->ram->update_area;
-qemu_mutex_unlock_iothread();
 d->ssd.worker->update_area(d->ssd.worker, d->ram->update_surface,
&update, NULL, 0, 0);
-qemu_mutex_lock_iothread();
 break;
 }
 case QXL_IO_NOTIFY_CMD:
diff --git a/ui/spice-display.c b/ui/spice-display.c
index 8579bfd..15f0704 100644
--- a/ui/spice-display.c
+++ b/ui/spice-display.c
@@ -176,18 +176,14 @@ void qemu_spice_create_host_primary(SimpleSpiceDisplay 
*ssd)
 surface.mem= (intptr_t)ssd->buf;
 surface.group_id   = MEMSLOT_GROUP_HOST;
 
-qemu_mutex_unlock_iothread();
 ssd->worker->create_primary_surface(ssd->worker, 0, &surface);
-qemu_mutex_lock_iothread();
 }
 
 void qemu_spice_destroy_host_primary(SimpleSpiceDisplay *ssd)
 {
 dprint(1, "%s:\n", __FUNCTION__);
 
-qemu_mutex_unlock_iothread();
 ssd->worker->destroy_primary_surface(ssd->worker, 0);
-qemu_mutex_lock_iothread();
 }
 
 void qemu_spice_vm_change_state_handler(void *opaque, int running, int reason)
@@ -197,9 +193,7 @@ void qemu_spice_vm_change_state_handler(void *opaque, int 
running, int reason)
 if (running) {
 ssd->worker->start(ssd->worker);
 } else {
-qemu_mutex_unlock_iothread();
 ssd->worker->stop(ssd->worker);
-qemu_mutex_lock_iothread();
 }
 ssd->running = running;
 }
-- 
1.7.1




[Qemu-devel] [STABLE PULL] spice: locking fixes

2011-05-11 Thread Gerd Hoffmann
  Hi,

Here are the spice locking fixes backported to the 0.14 stable branch.

please pull,
  Gerd

The following changes since commit 56a60dd6d619877e9957ba06b92d2f276e3c229d:

  Version 0.14.1 (2011-05-04 13:50:56 -0500)

are available in the git repository at:
  git://anongit.freedesktop.org/spice/qemu spice.stable.14

Gerd Hoffmann (4):
  spice: enable thread support
  spice: don't create updates in spice server context.
  spice: don't call displaystate callbacks from spice server context.
  spice: drop obsolete iothread locking

Jes Sorensen (1):
  Make spice dummy functions inline to fix calls not checking return values

 configure  |1 +
 hw/qxl-render.c|   25 ++---
 hw/qxl.c   |   27 +++
 ui/qemu-spice.h|   12 -
 ui/spice-display.c |   61 +--
 ui/spice-display.h |   24 
 6 files changed, 94 insertions(+), 56 deletions(-)



[Qemu-devel] [PATCH 3/5] spice: don't create updates in spice server context.

2011-05-11 Thread Gerd Hoffmann
This patch moves the creation of spice screen updates from the spice
server context to qemu iothread context (display refresh timer to be
exact).  This way we avoid accessing qemu internals (display surface)
from spice thread context which in turn allows us to simplify locking.

Signed-off-by: Gerd Hoffmann 
(cherry picked from commit e0c64d08d11736dcea7c5a6373e3e7f62db51d9e)
---
 hw/qxl.c   |   17 +++--
 ui/spice-display.c |   43 +++
 ui/spice-display.h |   21 -
 3 files changed, 54 insertions(+), 27 deletions(-)

diff --git a/hw/qxl.c b/hw/qxl.c
index fe4212b..bd250db 100644
--- a/hw/qxl.c
+++ b/hw/qxl.c
@@ -343,18 +343,22 @@ static int interface_get_command(QXLInstance *sin, struct 
QXLCommandExt *ext)
 SimpleSpiceUpdate *update;
 QXLCommandRing *ring;
 QXLCommand *cmd;
-int notify;
+int notify, ret;
 
 switch (qxl->mode) {
 case QXL_MODE_VGA:
 dprint(qxl, 2, "%s: vga\n", __FUNCTION__);
-update = qemu_spice_create_update(&qxl->ssd);
-if (update == NULL) {
-return false;
+ret = false;
+qemu_mutex_lock(&qxl->ssd.lock);
+if (qxl->ssd.update != NULL) {
+update = qxl->ssd.update;
+qxl->ssd.update = NULL;
+*ext = update->ext;
+ret = true;
 }
-*ext = update->ext;
+qemu_mutex_unlock(&qxl->ssd.lock);
 qxl_log_command(qxl, "vga", ext);
-return true;
+return ret;
 case QXL_MODE_COMPAT:
 case QXL_MODE_NATIVE:
 case QXL_MODE_UNDEFINED:
@@ -1304,6 +1308,7 @@ static int qxl_init_primary(PCIDevice *dev)
 vga->ds = graphic_console_init(qxl_hw_update, qxl_hw_invalidate,
qxl_hw_screen_dump, qxl_hw_text_update, 
qxl);
 qxl->ssd.ds = vga->ds;
+qemu_mutex_init(&qxl->ssd.lock);
 qxl->ssd.bufsize = (16 * 1024 * 1024);
 qxl->ssd.buf = qemu_malloc(qxl->ssd.bufsize);
 
diff --git a/ui/spice-display.c b/ui/spice-display.c
index 020b423..d56dcfc 100644
--- a/ui/spice-display.c
+++ b/ui/spice-display.c
@@ -62,14 +62,7 @@ void qemu_spice_rect_union(QXLRect *dest, const QXLRect *r)
 dest->right = MAX(dest->right, r->right);
 }
 
-/*
- * Called from spice server thread context (via interface_get_command).
- *
- * We must aquire the global qemu mutex here to make sure the
- * DisplayState (+DisplaySurface) we are accessing doesn't change
- * underneath us.
- */
-SimpleSpiceUpdate *qemu_spice_create_update(SimpleSpiceDisplay *ssd)
+static SimpleSpiceUpdate *qemu_spice_create_update(SimpleSpiceDisplay *ssd)
 {
 SimpleSpiceUpdate *update;
 QXLDrawable *drawable;
@@ -78,9 +71,7 @@ SimpleSpiceUpdate 
*qemu_spice_create_update(SimpleSpiceDisplay *ssd)
 uint8_t *src, *dst;
 int by, bw, bh;
 
-qemu_mutex_lock_iothread();
 if (qemu_spice_rect_is_empty(&ssd->dirty)) {
-qemu_mutex_unlock_iothread();
 return NULL;
 };
 
@@ -141,7 +132,6 @@ SimpleSpiceUpdate 
*qemu_spice_create_update(SimpleSpiceDisplay *ssd)
 cmd->data = (intptr_t)drawable;
 
 memset(&ssd->dirty, 0, sizeof(ssd->dirty));
-qemu_mutex_unlock_iothread();
 return update;
 }
 
@@ -241,6 +231,12 @@ void qemu_spice_display_resize(SimpleSpiceDisplay *ssd)
 qemu_pf_conv_put(ssd->conv);
 ssd->conv = NULL;
 
+qemu_mutex_lock(&ssd->lock);
+if (ssd->update != NULL) {
+qemu_spice_destroy_update(ssd, ssd->update);
+ssd->update = NULL;
+}
+qemu_mutex_unlock(&ssd->lock);
 qemu_spice_destroy_host_primary(ssd);
 qemu_spice_create_host_primary(ssd);
 
@@ -252,6 +248,14 @@ void qemu_spice_display_refresh(SimpleSpiceDisplay *ssd)
 {
 dprint(3, "%s:\n", __FUNCTION__);
 vga_hw_update();
+
+qemu_mutex_lock(&ssd->lock);
+if (ssd->update == NULL) {
+ssd->update = qemu_spice_create_update(ssd);
+ssd->notify++;
+}
+qemu_mutex_unlock(&ssd->lock);
+
 if (ssd->notify) {
 ssd->notify = 0;
 ssd->worker->wakeup(ssd->worker);
@@ -298,14 +302,20 @@ static int interface_get_command(QXLInstance *sin, struct 
QXLCommandExt *ext)
 {
 SimpleSpiceDisplay *ssd = container_of(sin, SimpleSpiceDisplay, qxl);
 SimpleSpiceUpdate *update;
+int ret = false;
 
 dprint(3, "%s:\n", __FUNCTION__);
-update = qemu_spice_create_update(ssd);
-if (update == NULL) {
-return false;
+
+qemu_mutex_lock(&ssd->lock);
+if (ssd->update != NULL) {
+update = ssd->update;
+ssd->update = NULL;
+*ext = update->ext;
+ret = true;
 }
-*ext = update->ext;
-return true;
+qemu_mutex_unlock(&ssd->lock);
+
+return ret;
 }
 
 static int interface_req_cmd_notification(QXLInstance *sin)
@@ -398,6 +408,7 @@ void qemu_spice_display_init(DisplayState *ds)
 {
 assert(sdpy.ds == NULL);
 sdpy.ds = ds;
+qemu_mutex_init(&sdpy.lock);
 sdpy.bufsize = (16 * 1024 * 1024);

[Qemu-devel] [PATCH 1/5] Make spice dummy functions inline to fix calls not checking return values

2011-05-11 Thread Gerd Hoffmann
From: Jes Sorensen 

qemu_spice_set_passwd() and qemu_spice_set_pw_expire() dummy functions
needs to be inline, in order to handle the case where they are called
without checking the return value.

Signed-off-by: Jes Sorensen 
Signed-off-by: Gerd Hoffmann 
(cherry picked from commit 14da8345b2f7c21bab20fd12b755a61d6277f171)
---
 ui/qemu-spice.h |   12 ++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/ui/qemu-spice.h b/ui/qemu-spice.h
index 916e5dc..3c6f1fe 100644
--- a/ui/qemu-spice.h
+++ b/ui/qemu-spice.h
@@ -47,8 +47,16 @@ CharDriverState *qemu_chr_open_spice(QemuOpts *opts);
 #else  /* CONFIG_SPICE */
 
 #define using_spice 0
-#define qemu_spice_set_passwd(_p, _f1, _f2) (-1)
-#define qemu_spice_set_pw_expire(_e) (-1)
+static inline int qemu_spice_set_passwd(const char *passwd,
+bool fail_if_connected,
+bool disconnect_if_connected)
+{
+return -1;
+}
+static inline int qemu_spice_set_pw_expire(time_t expires)
+{
+return -1;
+}
 static inline int qemu_spice_migrate_info(const char *h, int p, int t, const 
char *s)
 { return -1; }
 
-- 
1.7.1




Re: [Qemu-devel] [PATCH] Add warmup phase for live migration of large memory apps

2011-05-11 Thread Anthony Liguori

On 05/11/2011 02:58 AM, Shribman, Aidan wrote:

From: Aidan Shribman

[PATCH] Add warmup phase for live migration of large memory apps

By invoking "migrate -w" we initiate a background live-migration
transferring of dirty pages continuously until invocation of "migrate_end"
which attempts to complete the live migration operation.

Qemu host: Ubuntu 10.10

Testing: live migration (with/without warmup phase) tested successfully.

Signed-off-by: Benoit Hudzia
Signed-off-by: Petter Svard
Signed-off-by: Aidan Shribman


If you do:

migrate_set_downtime 0
migrate -d 

You'll achieve the same result.  The migration will never converge 
because there's no way QEMU can meet the downtime expectation.


You can then do:

migrate_set_downtime 30ms

To let migration go.  You can even use info migrate to figure out when 
migration has been sufficiently warmed up.


You can also always do migrate_cancel to abort the operation.

Regards,

Anthony Liguori


---
  hmp-commands.hx |   34 +-
  migration.c |   23 ++-
  migration.h |2 ++
  qmp-commands.hx |   41 ++---
  savevm.c|   12 
  sysemu.h|1 +
  6 files changed, 96 insertions(+), 17 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index e5585ba..234a952 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -717,24 +717,27 @@ ETEXI

  {
  .name   = "migrate",
-.args_type  = "detach:-d,blk:-b,inc:-i,uri:s",
-.params = "[-d] [-b] [-i] uri",
-.help   = "migrate to URI (using -d to not wait for completion)"
- "\n\t\t\t -b for migration without shared storage with"
- " full copy of disk\n\t\t\t -i for migration without "
- "shared storage with incremental copy of disk "
- "(base image shared between src and destination)",
+.args_type  = "detach:-d,blk:-b,inc:-i,warmup:-w,uri:s",
+.params = "[-d] [-b] [-i] [-w] uri",
+.help   = "migrate to URI"
+  "\n\t -d to not wait for completion"
+  "\n\t -b for migration without shared storage with"
+  " full copy of disk"
+  "\n\t -i for migration without"
+  " shared storage with incremental copy of disk"
+  " (base image shared between source and destination)"
+  "\n\t -w to enter warmup phase",
  .user_print = monitor_user_noop,  
.mhandler.cmd_new = do_migrate,
  },

-
  STEXI
-@item migrate [-d] [-b] [-i] @var{uri}
+@item migrate [-d] [-b] [-i] [-w] @var{uri}
  @findex migrate
  Migrate to @var{uri} (using -d to not wait for completion).
-b for migration with full copy of disk
-i for migration with incremental copy of disk (base image is shared)
+-w to enter warmup phase
  ETEXI

  {
@@ -753,6 +756,19 @@ Cancel the current VM migration.
  ETEXI

  {
+.name   = "migrate_end",
+.args_type  = "",
+.params = "",
+.help   = "Complete warmup and move to full live migration",
+.mhandler.cmd = do_migrate_end,
+},
+
+STEXI
+@item migrate_end
+Complete warmup and move to full live migration.
+ETEXI
+
+{
  .name   = "migrate_set_speed",
  .args_type  = "value:o",
  .params = "value",
diff --git a/migration.c b/migration.c
index 9ee8b17..7525b47 100644
--- a/migration.c
+++ b/migration.c
@@ -31,6 +31,8 @@
  do { } while (0)
  #endif

+static int is_migrate_warmup = 0;
+
  /* Migration speed throttling */
  static uint32_t max_throttle = (32<<  20);

@@ -81,6 +83,11 @@ int do_migrate(Monitor *mon, const QDict *qdict, QObject 
**ret_data)
  int blk = qdict_get_try_bool(qdict, "blk", 0);
  int inc = qdict_get_try_bool(qdict, "inc", 0);
  const char *uri = qdict_get_str(qdict, "uri");
+is_migrate_warmup = qdict_get_try_bool(qdict, "warmup", 0);
+
+if (is_migrate_warmup) {
+detach = 1; /* as we need migrate_end to complte */
+   }

  if (current_migration&&
  current_migration->get_status(current_migration) == MIG_STATE_ACTIVE) 
{
@@ -361,7 +368,9 @@ void migrate_fd_put_ready(void *opaque)
  }

  DPRINTF("iterate\n");
-if (qemu_savevm_state_iterate(s->mon, s->file) == 1) {
+if (is_migrate_warmup) {
+qemu_savevm_state_warmup(s->mon, s->file);
+   } else if (qemu_savevm_state_iterate(s->mon, s->file) == 1) {
  int state;
  int old_vm_running = vm_running;

@@ -448,3 +457,15 @@ int migrate_fd_close(void *opaque)
  qemu_set_fd_handler2(s->fd, NULL, NULL, NULL, NULL);
  return s->close(s);
  }
+
+void do_migrate_end(Monitor *mon, const QDict *qdict)
+{
+if (!vm_running) {
+return;
+   }
+if (!is_migrate_warmup) {
+return;
+   }
+is_migrate

Re: [Qemu-devel] [PATCH 1/2] coroutine: introduce coroutines

2011-05-11 Thread Paolo Bonzini

On 05/11/2011 03:05 PM, Anthony Liguori wrote:


A very slow way, too (on Windows at least if you use qemu_cond...).


That doesn't mean you can't do a fiber implementation for Windows... but
having a highly portable fallback is a good thing.


I agree but where would you place it, since QEMU is only portable to 
POSIX and Windows?


osdep-$(CONFIG_POSIX) += coroutine-posix.c
osdep-$(CONFIG_WIN32) += coroutine-win32.c
osdep-??? += coroutine-fallback.c

:)

Paolo



Re: [Qemu-devel] [PATCH 1/2] coroutine: introduce coroutines

2011-05-11 Thread Daniel P. Berrange
On Wed, May 11, 2011 at 03:45:39PM +0200, Paolo Bonzini wrote:
> On 05/11/2011 03:05 PM, Anthony Liguori wrote:
> >>
> >>A very slow way, too (on Windows at least if you use qemu_cond...).
> >
> >That doesn't mean you can't do a fiber implementation for Windows... but
> >having a highly portable fallback is a good thing.
> 
> I agree but where would you place it, since QEMU is only portable to
> POSIX and Windows?
> 
> osdep-$(CONFIG_POSIX) += coroutine-posix.c
> osdep-$(CONFIG_WIN32) += coroutine-win32.c
> osdep-??? += coroutine-fallback.c

NetBSD forbids the use of 'makecontext' in any application
which also links to libpthread.so[1]. We used makecontext in
GTK-VNC's coroutines and got random crashes in threaded
apps running on NetBSD. So for NetBSD we tell people to use
the thread based coroutines instead.

So at least one POSIX platform will need a different impl.

Regards,
Daniel

[1] http://www.eila.univ-paris-diderot.fr/cgi-bin/man.cgi?swapcontext+3
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] Add option to disable Cocoa on Mac OS X

2011-05-11 Thread Peter Maydell
On 7 May 2011 12:40, Alexander Graf  wrote:
> So I suppose the only thing missing is a --disable-cocoa option, yup.

I've just noticed that some of the code in block/raw-posix.c
uses the CONFIG_COCOA #define to gate whether to do MacOSX
specific handling of CDROMs and so on. I'm not a MacOS expert
but maybe that needs to be changed to some other ifdef -- I'm
guessing we don't want to have cdrom handling randomly change
behaviour just because the user tried to disable Cocoa graphics
handling...

-- PMM



Re: [Qemu-devel] [PATCH] Add warmup phase for live migration of large memory apps

2011-05-11 Thread Shribman, Aidan
> -Original Message-
> From: Anthony Liguori [mailto:anth...@codemonkey.ws] 
> Sent: Wednesday, May 11, 2011 4:35 PM
> To: Shribman, Aidan
> Cc: qemu-devel@nongnu.org
> Subject: Re: [Qemu-devel] [PATCH] Add warmup phase for live 
> migration of large memory apps
> 
> On 05/11/2011 02:58 AM, Shribman, Aidan wrote:
> > From: Aidan Shribman
> >
> > [PATCH] Add warmup phase for live migration of large memory apps
> >
> > By invoking "migrate -w" we initiate a background 
> live-migration
> > transferring of dirty pages continuously until invocation 
> of "migrate_end"
> > which attempts to complete the live migration operation.
> >
> > Qemu host: Ubuntu 10.10
> >
> > Testing: live migration (with/without warmup phase) tested 
> successfully.
> >
> > Signed-off-by: Benoit Hudzia
> > Signed-off-by: Petter Svard
> > Signed-off-by: Aidan Shribman
> 
> If you do:
> 
> migrate_set_downtime 0
> migrate -d 
> 
> You'll achieve the same result.  The migration will never converge 
> because there's no way QEMU can meet the downtime expectation.

If there are no additional dirty pages to be sent i.e. ram_save_remaining() == 
0 then the migration *will* converge (even though we don't want it to) see:

int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque)
{   
...
expected_time = ram_save_remaining() * TARGET_PAGE_SIZE / bwidth;
return (stage == 2) && (expected_time <= migrate_max_downtime());
}

So using migrate_set_downtime appears to be only a partial surrogate for 
"migrate -x " and "migrate_end"

Thanks,
Aidan Shribman

> 
> You can then do:
> 
> migrate_set_downtime 30ms
> 
> To let migration go.  You can even use info migrate to figure 
> out when 
> migration has been sufficiently warmed up.
> 
> You can also always do migrate_cancel to abort the operation.
> 
> Regards,
> 
> Anthony Liguori
> 
> > ---
> >   hmp-commands.hx |   34 +-
> >   migration.c |   23 ++-
> >   migration.h |2 ++
> >   qmp-commands.hx |   41 ++---
> >   savevm.c|   12 
> >   sysemu.h|1 +
> >   6 files changed, 96 insertions(+), 17 deletions(-)
> >
> > diff --git a/hmp-commands.hx b/hmp-commands.hx
> > index e5585ba..234a952 100644
> > --- a/hmp-commands.hx
> > +++ b/hmp-commands.hx
> > @@ -717,24 +717,27 @@ ETEXI
> >
> >   {
> >   .name   = "migrate",
> > -.args_type  = "detach:-d,blk:-b,inc:-i,uri:s",
> > -.params = "[-d] [-b] [-i] uri",
> > -.help   = "migrate to URI (using -d to not 
> wait for completion)"
> > - "\n\t\t\t -b for migration without shared 
> storage with"
> > - " full copy of disk\n\t\t\t -i for 
> migration without "
> > - "shared storage with incremental copy of disk "
> > - "(base image shared between src and destination)",
> > +.args_type  = "detach:-d,blk:-b,inc:-i,warmup:-w,uri:s",
> > +.params = "[-d] [-b] [-i] [-w] uri",
> > +.help   = "migrate to URI"
> > +  "\n\t -d to not wait for completion"
> > +  "\n\t -b for migration without 
> shared storage with"
> > +  " full copy of disk"
> > +  "\n\t -i for migration without"
> > +  " shared storage with incremental 
> copy of disk"
> > +  " (base image shared between source 
> and destination)"
> > +  "\n\t -w to enter warmup phase",
> >   .user_print = monitor_user_noop,  
> > .mhandler.cmd_new = do_migrate,
> >   },
> >
> > -
> >   STEXI
> > -@item migrate [-d] [-b] [-i] @var{uri}
> > +@item migrate [-d] [-b] [-i] [-w] @var{uri}
> >   @findex migrate
> >   Migrate to @var{uri} (using -d to not wait for completion).
> > -b for migration with full copy of disk
> > -i for migration with incremental copy of disk (base 
> image is shared)
> > +-w to enter warmup phase
> >   ETEXI
> >
> >   {
> > @@ -753,6 +756,19 @@ Cancel the current VM migration.
> >   ETEXI
> >
> >   {
> > +.name   = "migrate_end",
> > +.args_type  = "",
> > +.params = "",
> > +.help   = "Complete warmup and move to full 
> live migration",
> > +.mhandler.cmd = do_migrate_end,
> > +},
> > +
> > +STEXI
> > +@item migrate_end
> > +Complete warmup and move to full live migration.
> > +ETEXI
> > +
> > +{
> >   .name   = "migrate_set_speed",
> >   .args_type  = "value:o",
> >   .params = "value",
> > diff --git a/migration.c b/migration.c
> > index 9ee8b17..7525b47 100644
> > --- a/migration.c
> > +++ b/migration.c
> > @@ -31,6 +31,8 @@
> >   do { } while (0)
> >   #endif
> >
> > +static int is_migrate_warmup = 0;
> > +
> >   /* Migration speed throttling */
> >   static uint32_t max_throttle = (32<<  20);
> >
> > @@ -81,6 +83,11 

Re: [Qemu-devel] adding search to dhcp

2011-05-11 Thread Carl Karsten
On Wed, May 11, 2011 at 6:01 AM, Markus Armbruster  wrote:
> Stefan Hajnoczi  writes:
>
>> On Tue, May 10, 2011 at 6:40 PM, Carl Karsten  wrote:
>>> I would expect the syntax to look like this:
>>>
>>> qemu -hda 1.qcow2 -net nick -net
>>> user,hostname=qemu,search=example.com,sales.example.com
>>
>> Comma escaping is needed but it seems like a reasonable feature to me.
>
> Comma escaping is ugly:
> -net user,hostname=qemu,search=example.com,,sales.example.com
>
> Could we have multiple search options instead?  Like this:
> -net user,hostname=qemu,search=example.com,search=sales.example.com
>

How about:

-net user,hostname=qemu,search="example.com,sales.example.com"

Given I personally only need one search domain, any objection to me
only writing the code to support 1?  I would think the simple case
search=example.com  should be the same regardless of how someone
implemented multiples.


-- 
Carl K



Re: [Qemu-devel] adding search to dhcp

2011-05-11 Thread Stefan Hajnoczi
On Wed, May 11, 2011 at 4:22 PM, Carl Karsten  wrote:
> On Wed, May 11, 2011 at 6:01 AM, Markus Armbruster  wrote:
>> Stefan Hajnoczi  writes:
>>
>>> On Tue, May 10, 2011 at 6:40 PM, Carl Karsten  
>>> wrote:
 I would expect the syntax to look like this:

 qemu -hda 1.qcow2 -net nick -net
 user,hostname=qemu,search=example.com,sales.example.com
>>>
>>> Comma escaping is needed but it seems like a reasonable feature to me.
>>
>> Comma escaping is ugly:
>> -net user,hostname=qemu,search=example.com,,sales.example.com
>>
>> Could we have multiple search options instead?  Like this:
>> -net user,hostname=qemu,search=example.com,search=sales.example.com
>>
>
> How about:
>
> -net user,hostname=qemu,search="example.com,sales.example.com"

That does not work the way you'd expect:
$ echo asdf=asdf,ok="this,is,a,test"
asdf=asdf,ok=this,is,a,test

Also, let's not get into the business of matching quotes and passing
them escaped on the shell.  That's just as ugly as escaping commas and
more work.

I think the two options are using QEMU's typical comma escaping ',,'
or specifying the option multiple times.  I'd go with comma escaping
for consistency.  I'm not aware of any other option in QEMU that is
specified multiple times.

Stefan



[Qemu-devel] [PATCH] Add an isa device for SGA

2011-05-11 Thread Glauber Costa
This patch adds a dummy legacy ISA device whose responsibility is to
deploy sgabios, an option rom for a serial graphics adapter.
The proposal is that this device is always-on when -nographics,
but can otherwise be enable in any setup when -device sga is used.

Signed-off-by: Glauber Costa 
---
 Makefile.target |2 +-
 hw/pc.c |2 ++
 hw/pc.h |3 +++
 hw/sga.c|   49 +
 4 files changed, 55 insertions(+), 1 deletions(-)
 create mode 100644 hw/sga.c

diff --git a/Makefile.target b/Makefile.target
index fdbdc6c..004ea7e 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -224,7 +224,7 @@ obj-$(CONFIG_KVM) += ivshmem.o
 # Hardware support
 obj-i386-y += vga.o
 obj-i386-y += mc146818rtc.o i8259.o pc.o
-obj-i386-y += cirrus_vga.o apic.o ioapic.o piix_pci.o
+obj-i386-y += cirrus_vga.o sga.o apic.o ioapic.o piix_pci.o
 obj-i386-y += vmport.o
 obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += extboot.o
diff --git a/hw/pc.c b/hw/pc.c
index 8d351ba..56c3887 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -1096,6 +1096,8 @@ void pc_vga_init(PCIBus *pci_bus)
 isa_vga_init();
 }
 }
+
+isa_sga_init();
 }
 
 static void cpu_request_exit(void *opaque, int irq, int level)
diff --git a/hw/pc.h b/hw/pc.h
index 1291e2d..a00e054 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -219,6 +219,9 @@ int isa_vga_mm_init(target_phys_addr_t vram_base,
 void pci_cirrus_vga_init(PCIBus *bus);
 void isa_cirrus_vga_init(void);
 
+/* serial graphics */
+void isa_sga_init(void);
+
 /* ne2000.c */
 static inline bool isa_ne2000_init(int base, int irq, NICInfo *nd)
 {
diff --git a/hw/sga.c b/hw/sga.c
new file mode 100644
index 000..411191b
--- /dev/null
+++ b/hw/sga.c
@@ -0,0 +1,49 @@
+#include "pci.h"
+#include "pc.h"
+#include "loader.h"
+#include "sysemu.h"
+
+#define SGABIOS_FILENAME "sgabios.bin"
+
+typedef struct ISAGAState {
+ISADevice dev;
+} ISASGAState;
+
+/* We can have both -device, and the initfn called, so better
+ * avoid to have the rom loaded twice */
+static void deploy_rom(void)
+{
+static int rom_deployed = 0;
+
+if (!rom_deployed++) {
+rom_add_vga(SGABIOS_FILENAME);
+}
+}
+
+static int isa_cirrus_vga_initfn(ISADevice *dev)
+{
+deploy_rom();
+
+return 0;
+}
+
+void isa_sga_init(void)
+{
+if (display_type == DT_NOGRAPHIC) {
+deploy_rom();
+}
+}
+
+static ISADeviceInfo sga_info = {
+.qdev.name= "sga",
+.qdev.desc= "Serial Graphics Adapter",
+.qdev.size= sizeof(ISASGAState),
+.init = isa_cirrus_vga_initfn,
+};
+
+static void sga_register(void)
+{
+  isa_qdev_register(&sga_info);
+}
+
+device_init(sga_register);
-- 
1.7.4.2




Re: [Qemu-devel] TCG: AREG0 removal planning

2011-05-11 Thread Blue Swirl
On Wed, May 11, 2011 at 12:28 AM, Paul Brook  wrote:
>> TCG uses a fixed global register (AREG0) to which points to currently
>> used CPUState, also known as 'env'. Using a fixed register has the
>> downsides that the register must be reserved by TCG for generated code
>> and by the compiler for compiling a few critical files (op_helper.c
>> etc.). The latter also means that any calls to C library may be unsafe
>> from those files.
>>
>> Here are my sketches about transition to AREG0-less world.
>> ...
>> Translators/op helpers
>> ...
>> Comments? There are a few blank spots too.
>
> I think a useful, and incremental goal is elimination of global cpu_env state
> in C code (i.e eliminate HELPER_CFLAGS and dyngen-exec.h).
> We already have much of the infrastructure for this - op_helper v.s. helper.c
> and code_gen_prologue for transition in/out of "generated code" state.
>
> In practice generated code probably accesses CPUState often enough that a
> dedicated register isn't a bad idea.  My guess is that eliminating it from C
> code gets us almost all of the useful benefit.  Removing it from the code
> generator (i.e. TCG_AREG0) may be more pain that it's worth.

I don't think moving the helpers from op_helper.c to helper.c will be
a performance win if AREG0 is not eliminated. The code gets to use one
register more, but AREG0 needs to be moved to a function argument
register in most cases and AREG0 has to be restored. I think the
benefit should come from generated code getting one more available
register.

TCG side doesn't look so difficult, just qemu_ld/st ops and using a
stack frame. Converting translators and helpers is a much bigger job.

> For changes to
> the TCG side we want to consider how we can provide useful aliasing
> information, rather than a naive replacement of TCG_AREG0 with a variable.

What aliasing information?



Re: [Qemu-devel] TCG: AREG0 removal planning

2011-05-11 Thread Blue Swirl
On Wed, May 11, 2011 at 12:58 AM, Richard Henderson  wrote:
> On 05/10/2011 01:54 PM, Blue Swirl wrote:
>> TCG the generator backend
>> -AREG0 is used for qemu_ld/st ops for TLB access. It should be
>> possible for the translators to pass instead a pointer to either
>> CPUState or directly to the TLB.
>
> I believe that AREG0 should continue to be present in the generated
> code.  There are simply too many references to it throughout the
> translated code for allocating this dynamically to be a win.
>
> What should change, however, is the removal of AREG0 outside the
> generated code.  The cpu-state pointer should be passed as a regular
> parameter wherever it is required.  This includes tcg_qemu_tb_exec,
> which means that the generated prologue would change, setting up
> AREG0 in the process.

I have exactly opposite feeling, AREG0 needs to be eliminated from
generated code but converting helpers won't be useful. Maybe some
experiments are needed to see what would be the real gains.

>> New qemu_ld/st ops are needed for all TCG targets.
>
> Yes, qemu_ld/st would have to change to accommodate the new parameter
> being passed.
>
> While we're at it, let us change things a bit further to allow guest
> byte-swap load/store insns to be implemented more efficiently.  For
> instance, currently a sparc load_asr (little-endian), as emulated on
> an x86 host, does the byte swap twice.
>
> There is, currently, a const int parameter to qemu_ld/st that encodes
> the size of the load.  Almost all TCG backends behind the scenes
> extend this parameter with a bit to indicate byte swap needed.  Let us
> formalize this, and allow this to be set in the original TCG op, with
> appropriate new inlines in tcg-op.h to access it from the translators.
>
> We can also make things easier for the backends by allowing them
> to declare that they do or do not have byte swap load/store insns.
> If the such are not available, a separate bswap opcode is emitted
> right from tcg_gen_qemu_st32 et al.
>
> This would allow a nice cleanup for i386, which currently has a small
> register allocation problem in the store path, what with needing to
> not clobber the input register while byte swapping.  (This problem is
> solved by restricting the set of input registers for qemu_ld/st.)
>
> All this does require the slow path to be changed to accommodate this.
> In particular, if byte-swap memory ops are available, we need slow
> path functions that also byte swap.  Indeed, I'd expect them to use
> the byte-swap memory ops themselves.  Further, if byte-swap memory
> ops are not available, the slow path should always return memory in
> the host byte order, because a separate bswap operation will be done
> on behalf of the fast path.

I agree.

>> -TCG temps are stored in CPUState field temp_buf[], accessed via
>> AREG0. Maybe a regular stack frame should be allocated instead?
>
> Probably.  Most of the backends manage a stack frame anyway, to
> handle registers saved in the prologue.  All that would be needed
> is a define from TCG to tell the backends how much memory is required,
> and some value passed from the backends to tell TCG what the offset
> of that area is from the stack pointer.

Currently the size of temp_buf is fixed, but the exact size for the
stack frame could be calculated during translation. Not that there
would be measurable performance improvements though.



Re: [Qemu-devel] TCG: AREG0 removal planning

2011-05-11 Thread Blue Swirl
On Wed, May 11, 2011 at 1:57 AM, Paul Brook  wrote:
>> While we're at it, let us change things a bit further to allow guest
>> byte-swap load/store insns to be implemented more efficiently.  For
>> instance, currently a sparc load_asr (little-endian), as emulated on
>> an x86 host, does the byte swap twice.
>
> FWIW this also ends up interacting with the device and bus models. This is
> partially implemented by the endian parameter of cpu_register_io_memory et.
> al.  This may also be a runtime property, either part of the CPU state (e.g.
> ARM where instruction and data accesses may have different endianness), or
> even a per-page TLB attribute (PPC?).

SuperSparc (Sparc32) MMU had a bit for reversing the endianness of a
page, on Sparc64 there are many levels where this can be done (global
CPU mode flag, MMU page flag and some memory accesses can use byte
swapping ASIs). I don't think they are used though.



Re: [Qemu-devel] [PATCH] Add AACI audio playback support to the ARM Versatile/PB platform

2011-05-11 Thread Mathieu Sonet

Paul Brook wrote:

The PL041 driver provides an interface to an ACLink bus.
The LM4549 driver emulates a DAC connected on the ACLink bus.
Only audio playback is implemented.


Shouldn't this be shared with the other AC97 devices?

Paul


I organized the code in 3 different drivers (PL041 <=> ACLink <=> 
LM4549) to decorrelate the codec interface from its implementation. This 
could allow the use of alternative AC97 models with the same PL041 
implementation.


On the other hand the current ac97.c implementation is a closely coupled 
combination of a PCI/ACLink bridge (Intel 82801AA) with a generic AC97 
codec. This has prevent me to easily reuse this code.


The milkymist-ac97 implementation is another case. It looks like a basic 
implementation with the AC97 registers directly mapped on the system bus.


Using the ACLink bus I defined, it could be interesting to implement 
separately the PCI/ACLink bridge from ac97.c.


Is it what you mean by saying this should be shared with the other AC97 
devices ?


Mathieu



Re: [Qemu-devel] TCG: AREG0 removal planning

2011-05-11 Thread Lluís
Blue Swirl writes:

> On Wed, May 11, 2011 at 12:28 AM, Paul Brook  wrote:
>> In practice generated code probably accesses CPUState often enough that a
>> dedicated register isn't a bad idea.  My guess is that eliminating it from C
>> code gets us almost all of the useful benefit.  Removing it from the code
>> generator (i.e. TCG_AREG0) may be more pain that it's worth.

> I don't think moving the helpers from op_helper.c to helper.c will be
> a performance win if AREG0 is not eliminated. The code gets to use one
> register more, but AREG0 needs to be moved to a function argument
> register in most cases and AREG0 has to be restored. I think the
> benefit should come from generated code getting one more available
> register.

So, it all boils down to the amount of register spilling that can be
avoided in TCG-generated code when eliminating the reserved register for
AREG0. Am I right?


Lluis

-- 
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth



Re: [Qemu-devel] [PATCH 1/2] coroutine: introduce coroutines

2011-05-11 Thread Stefan Weil

Am 11.05.2011 12:15, schrieb Stefan Hajnoczi:

From: Kevin Wolf 

Asynchronous code is becoming very complex. At the same time
synchronous code is growing because it is convenient to write.
Sometimes duplicate code paths are even added, one synchronous and the
other asynchronous. This patch introduces coroutines which allow code
that looks synchronous but is asynchronous under the covers.

A coroutine has its own stack and is therefore able to preserve state
across blocking operations, which traditionally require callback
functions and manual marshalling of parameters.

Creating and starting a coroutine is easy:

coroutine = qemu_coroutine_create(my_coroutine);
qemu_coroutine_enter(coroutine, my_data);

The coroutine then executes until it returns or yields:

void coroutine_fn my_coroutine(void *opaque) {
MyData *my_data = opaque;

/* do some work */

qemu_coroutine_yield();

/* do some more work */
}

Yielding switches control back to the caller of qemu_coroutine_enter().
This is typically used to switch back to the main thread's event loop
after issuing an asynchronous I/O request. The request callback will
then invoke qemu_coroutine_enter() once more to switch back to the
coroutine.

Note that coroutines never execute concurrently and should only be used
from threads which hold the global mutex. This restriction makes
programming with coroutines easier than with threads. Race conditions
cannot occur since only one coroutine may be active at any time. Other
coroutines can only run across yield.

This coroutines implementation is based on the gtk-vnc implementation
written by Anthony Liguori  but it has been
significantly rewritten by Kevin Wolf  to use
setjmp()/longjmp() instead of the more expensive swapcontext().

Signed-off-by: Kevin Wolf 
Signed-off-by: Stefan Hajnoczi 
---



Hi Stefan,

you might want to add the following or a similar patch:

diff --git a/qemu-coroutine.c b/qemu-coroutine.c
index 0927f58..9cd0dd7 100644
--- a/qemu-coroutine.c
+++ b/qemu-coroutine.c
@@ -91,7 +91,10 @@ static void *coroutine_swap(Coroutine *from, 
Coroutine *to, void *opaque)

 case COROUTINE_TERMINATE:
 current = to->caller;
 qemu_coroutine_terminate(to);
-return to->data;
+opaque = to->data;
+qemu_free(to->stack);
+qemu_free(to);
+return opaque;
 default:
 /* Switch to called coroutine */
 current = to;

I tested your test code with Valgrind. Beside of the memory leaks which 
are fixed

with the small modification shown above, Valgrind has a lot of complains.
Maybe you can try it yourself, otherwise please wait until I have finished
analyzing the Valgrind results. At a first glance, I'm afraid that
debugging with gdb or Valgrind might become more difficult when coroutines
are used. This is different with threads: they are fully supported by gdb.

The w32 build needs additional libraries (ws2_32, maybe more), then
check-coroutine works.

Cheers,
Stefan W.









Re: [Qemu-devel] TCG: AREG0 removal planning

2011-05-11 Thread Blue Swirl
On Wed, May 11, 2011 at 9:39 PM, Lluís  wrote:
> Blue Swirl writes:
>
>> On Wed, May 11, 2011 at 12:28 AM, Paul Brook  wrote:
>>> In practice generated code probably accesses CPUState often enough that a
>>> dedicated register isn't a bad idea.  My guess is that eliminating it from C
>>> code gets us almost all of the useful benefit.  Removing it from the code
>>> generator (i.e. TCG_AREG0) may be more pain that it's worth.
>
>> I don't think moving the helpers from op_helper.c to helper.c will be
>> a performance win if AREG0 is not eliminated. The code gets to use one
>> register more, but AREG0 needs to be moved to a function argument
>> register in most cases and AREG0 has to be restored. I think the
>> benefit should come from generated code getting one more available
>> register.
>
> So, it all boils down to the amount of register spilling that can be
> avoided in TCG-generated code when eliminating the reserved register for
> AREG0. Am I right?

For the generated code, yes.



Re: [Qemu-devel] [virt-tools-list] Virt Tools Survey: What to do about virt-clone

2011-05-11 Thread Alexander Boström
On tis, 2011-05-10 at 12:56 +0100, Richard W.M. Jones wrote:

> Fedora
> --
> 
> In theory you can just write a file /.unconfigured in the root, and

Perhaps this could also be triggered by a change in the system UUID (see
dmidecode). Store the UUID in a file during kickstart/firstboot and
check it at every boot. If it changed then the disk was either moved or
cloned to new hardware (physical or virtual doesn't really matter), so
perhaps the system could ask for a root password and after confirmation
start the re-setup process.

> Some admins will *not* want all of these things to be reset, and will
> want either a lesser degree of unconfiguration, or will want to
> control each thing manually.

Yeah, it's rather analogous to kickstart.

I think every guest OS needs to have support for this internally because
it gets too complicated to do it from the outside but it should be
possible to trigger the process and control it using the virt tools.

The OS needs to back out some changes from the system and then redo part
of the kickstart/firstboot. (The division between Anaconda and Firstboot
seems a bit arbitrary, so I'm treating them as one thing here.)

Stuff that can happen at boot after cloning:

1. (If virt-clone triggered the boot, skip this step.)
Ask the admin if this is a clone or a move. If it's a move, exit and
continue with the normal boot. Otherwise, ask for the root password.

2. Eradicate as thoroughly as possible any host-private data. This can
be Smolt ID, Spacewalk, RHN data, SSH key pairs, Kerberos keys, Puppet
registration, reset the hostname to localhost.localdomain and so on.

3. Forget about any hardware that doesn't seem to exist anymore (so eth0
is removed from udev and the configuration is deleted from NM).

4. At this point it should be possible to shut down, thereby creating a
clean slate template.

5. Re-run some parts of kickstart/firstboot. This should be scriptable
and virt-clone should be able to provide this info.
* Network config.
* (Hash of) the root password, bootloader password.
* Smolt, Kerberos hostkey, Spacewalk, RHN, Puppet...
* %post script.

/abo





Re: [Qemu-devel] [PATCH] Add an isa device for SGA

2011-05-11 Thread Jan Kiszka
On 2011-05-11 19:11, Glauber Costa wrote:
> This patch adds a dummy legacy ISA device whose responsibility is to
> deploy sgabios, an option rom for a serial graphics adapter.
> The proposal is that this device is always-on when -nographics,
> but can otherwise be enable in any setup when -device sga is used.

That means bios output is written to the virtual uart? Sounds
interesting. Where do we find such an option rom?

Alternatively, what about adding serial console support to SeaBIOS? I've
seen real BIOSes that allow driving both VGA and serial in parallel.
That could be controlled via -boot and some fwcfg, just like we request
the boot menu today.

Jan



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [PATCH] vfio: Fixup uiommu sharing

2011-05-11 Thread Alex Williamson
When setting up a vfio device, we can either create a new iommu
domain for each device, or use an iommu domain shared between
multiple devices.  In the first case, we open /dev/uiommu for
each device and pass the new file descriptor for each device's
VFIO_DOMAIN_SET ioctl.  For the latter, we use a single file
descriptor multiple times.

This doesn't currently work because the dma region list is
tied to the instance of the vfio device file opening.  This
completely breaks any kind of page accounting or dma region
tracking, and if one device is closed, the other is in bad
shape.

Instead, manage the uiommu connection around VFIO_DOMAIN_{UN}SET.
The first vfio device to set a domain will create a vfio_uiommu
object and add it to a list, subsequent users trying to use the
same domain will share the object, last one out does the cleanup.
For virtual machine usage, this means we can now have a single
iommu domain context persist across addition and removal of
devices, so long as there's at least one device always attached.

Signed-off-by: Alex Williamson 
---

Tom,

This is the thing that was confusing me the other day.  I think
the model below makes more sense, but it's entirely possible
that I'm overlooking your usage model.  Let me know.  Thanks,

Alex

Note: patch against "vfio: Allow sub-ranges to be unmapped"

 drivers/vfio/vfio_dma.c |  212 +--
 drivers/vfio/vfio_main.c|   91 ++
 drivers/vfio/vfio_netlink.c |2 
 drivers/vfio/vfio_sysfs.c   |3 -
 include/linux/vfio.h|   36 ---
 5 files changed, 177 insertions(+), 167 deletions(-)

diff --git a/drivers/vfio/vfio_dma.c b/drivers/vfio/vfio_dma.c
index bf260e4..154ccb7 100644
--- a/drivers/vfio/vfio_dma.c
+++ b/drivers/vfio/vfio_dma.c
@@ -42,6 +42,9 @@
 #include 
 #include 
 
+static LIST_HEAD(vfio_uiommu_list);
+static DEFINE_MUTEX(vfio_uiommu_lock);
+
 struct vwork {
struct mm_struct *mm;
unsigned long npage;
@@ -97,77 +100,75 @@ static void vfio_lock_acct(unsigned long npage)
 
 /* Unmap DMA region */
 /* dgate must be held */
-static void vfio_dma_unmap(struct vfio_listener *listener, unsigned long iova,
+static void vfio_dma_unmap(struct vfio_uiommu *uiommu, unsigned long iova,
   unsigned long npage, struct page **pages, int rdwr)
 {
-   struct vfio_dev *vdev = listener->vdev;
unsigned long i;
 
for (i = 0; i < npage; i++, iova += PAGE_SIZE) {
-   uiommu_unmap(vdev->udomain, iova, 0);
+   uiommu_unmap(uiommu->udomain, iova, 0);
if (rdwr)
SetPageDirty(pages[i]);
put_page(pages[i]);
}
-   vdev->locked_pages -= npage;
+   uiommu->locked_pages -= npage;
vfio_lock_acct(-npage);
 }
 
 /* Unmap ALL DMA regions */
-void vfio_dma_unmapall(struct vfio_listener *listener)
+static void vfio_dma_unmapall(struct vfio_uiommu *uiommu)
 {
struct list_head *pos, *pos2;
struct dma_map_page *mlp;
 
-   mutex_lock(&listener->vdev->dgate);
-   list_for_each_safe(pos, pos2, &listener->dm_list) {
+   mutex_lock(&uiommu->dgate);
+   list_for_each_safe(pos, pos2, &uiommu->dm_list) {
mlp = list_entry(pos, struct dma_map_page, list);
-   vfio_dma_unmap(listener, mlp->daddr, mlp->npage,
+   vfio_dma_unmap(uiommu, mlp->daddr, mlp->npage,
   mlp->pages, mlp->rdwr);
list_del(&mlp->list);
vfree(mlp->pages);
kfree(mlp);
}
-   mutex_unlock(&listener->vdev->dgate);
+   mutex_unlock(&uiommu->dgate);
 }
 
 /* Map DMA region */
 /* dgate must be held */
-static int vfio_dma_map(struct vfio_listener *listener, unsigned long iova,
+static int vfio_dma_map(struct vfio_uiommu *uiommu, unsigned long iova,
unsigned long npage, struct page **pages, int rdwr)
 {
-   struct vfio_dev *vdev = listener->vdev;
unsigned long i;
int ret;
 
/* Verify pages are not already mapped */
for (i = 0; i < npage; i++)
-   if (uiommu_iova_to_phys(vdev->udomain, iova + i * PAGE_SIZE))
+   if (uiommu_iova_to_phys(uiommu->udomain, iova + i * PAGE_SIZE))
return -EBUSY;
 
for (i = 0; i < npage; i++, iova += PAGE_SIZE) {
-   ret = uiommu_map(vdev->udomain, iova,
+   ret = uiommu_map(uiommu->udomain, iova,
 page_to_phys(pages[i]), 0, rdwr);
if (!ret)
continue;
 
/* Back out mappings on error */
for (i--, iova -= PAGE_SIZE; i >= 0; i--, iova -= PAGE_SIZE)
-   uiommu_unmap(vdev->udomain, iova, 0);
+   uiommu_unmap(uiommu->udomain, iova, 0);
return ret;
}
-   vdev->locked_pages += npage;
+   uiommu->locked_pages 

Re: [Qemu-devel] adding search to dhcp

2011-05-11 Thread Jan Kiszka
On 2011-05-11 18:08, Stefan Hajnoczi wrote:
> On Wed, May 11, 2011 at 4:22 PM, Carl Karsten  wrote:
>> On Wed, May 11, 2011 at 6:01 AM, Markus Armbruster  wrote:
>>> Stefan Hajnoczi  writes:
>>>
 On Tue, May 10, 2011 at 6:40 PM, Carl Karsten  
 wrote:
> I would expect the syntax to look like this:
>
> qemu -hda 1.qcow2 -net nick -net
> user,hostname=qemu,search=example.com,sales.example.com

 Comma escaping is needed but it seems like a reasonable feature to me.
>>>
>>> Comma escaping is ugly:
>>> -net user,hostname=qemu,search=example.com,,sales.example.com
>>>
>>> Could we have multiple search options instead?  Like this:
>>> -net user,hostname=qemu,search=example.com,search=sales.example.com
>>>
>>
>> How about:
>>
>> -net user,hostname=qemu,search="example.com,sales.example.com"
> 
> That does not work the way you'd expect:
> $ echo asdf=asdf,ok="this,is,a,test"
> asdf=asdf,ok=this,is,a,test
> 
> Also, let's not get into the business of matching quotes and passing
> them escaped on the shell.  That's just as ugly as escaping commas and
> more work.
> 
> I think the two options are using QEMU's typical comma escaping ',,'
> or specifying the option multiple times.  I'd go with comma escaping
> for consistency.  I'm not aware of any other option in QEMU that is
> specified multiple times.

-net user,hostfwd=...,hostfwd=...

Let's got for multiple specification, ',,' is just ugly IMHO.

Jan



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH] Add an isa device for SGA

2011-05-11 Thread Glauber de Oliveira Costa


- Original Message -
> From: "Jan Kiszka" 
> To: "Glauber Costa" 
> Cc: k...@vger.kernel.org, aligu...@us.ibm.com, mtosa...@redhat.com, 
> qemu-devel@nongnu.org, a...@redhat.com
> Sent: Wednesday, May 11, 2011 5:00:10 PM
> Subject: Re: [PATCH] Add an isa device for SGA
> On 2011-05-11 19:11, Glauber Costa wrote:
> > This patch adds a dummy legacy ISA device whose responsibility is to
> > deploy sgabios, an option rom for a serial graphics adapter.
> > The proposal is that this device is always-on when -nographics,
> > but can otherwise be enable in any setup when -device sga is used.
> 
> That means bios output is written to the virtual uart? Sounds
> interesting. Where do we find such an option rom?

code.google.com/p/sgabios

Plan is to include it in the set of roms distributed by QEMU.

> Alternatively, what about adding serial console support to SeaBIOS?
> I've
> seen real BIOSes that allow driving both VGA and serial in parallel.
> That could be controlled via -boot and some fwcfg, just like we
> request
> the boot menu today.
Possible, cleaner, but much more work.

I think using sgabios give us what we need for very little additional cost.
If someone would be willing to do the work in SeaBIOS, that would be another 
story.

> 
> Jan



Re: [Qemu-devel] adding search to dhcp

2011-05-11 Thread Michael Tokarev
12.05.2011 00:49, Jan Kiszka пишет:
> On 2011-05-11 18:08, Stefan Hajnoczi wrote:
>> On Wed, May 11, 2011 at 4:22 PM, Carl Karsten  wrote:
>>> On Wed, May 11, 2011 at 6:01 AM, Markus Armbruster  
>>> wrote:
 Stefan Hajnoczi  writes:

> On Tue, May 10, 2011 at 6:40 PM, Carl Karsten  
> wrote:
>> I would expect the syntax to look like this:
>>
>> qemu -hda 1.qcow2 -net nick -net
>> user,hostname=qemu,search=example.com,sales.example.com
>
> Comma escaping is needed but it seems like a reasonable feature to me.

 Comma escaping is ugly:
 -net user,hostname=qemu,search=example.com,,sales.example.com

 Could we have multiple search options instead?  Like this:
 -net user,hostname=qemu,search=example.com,search=sales.example.com

>>>
>>> How about:
>>>
>>> -net user,hostname=qemu,search="example.com,sales.example.com"
>>
>> That does not work the way you'd expect:
>> $ echo asdf=asdf,ok="this,is,a,test"
>> asdf=asdf,ok=this,is,a,test
>>
>> Also, let's not get into the business of matching quotes and passing
>> them escaped on the shell.  That's just as ugly as escaping commas and
>> more work.
>>
>> I think the two options are using QEMU's typical comma escaping ',,'
>> or specifying the option multiple times.  I'd go with comma escaping
>> for consistency.  I'm not aware of any other option in QEMU that is
>> specified multiple times.
> 
> -net user,hostfwd=...,hostfwd=...
> 
> Let's got for multiple specification, ',,' is just ugly IMHO.

I second this, just repeat the specification, please no double ,,.
Or alternatively, search1=foo,search2=bar, but this is also sort
of ugly.

But I'm not sure why there's no way to use some other character,
like colon (:) for example - it's used for protocol:details
already, and for domain names it works well too...

/mjt



[Qemu-devel] [PULL] PPC patch queue

2011-05-11 Thread Alexander Graf
Hi,

This is my current PPC patch queue containing all the collected PPC work for
Qemu so far. Please pull.

Alex

The following changes since commit 0225e254ae81c5638463cda8f5730f31619113b6:
  Stefan Weil (1):
usb-linux: Add missing break statement

are available in the git repository at:

  git://repo.or.cz/qemu/agraf.git ppc-next

Alexander Graf (9):
  kvm: ppc: detect old headers
  kvm: ppc: warn user on PAGE_SIZE mismatch
  PPC: Make MPC8544DS obey -cpu switch
  PPC: Make MPC8544DS emulation work w/o KVM
  PPC: Add GS MSR definition
  PPC: Add another 64 bits to instruction feature mask
  PPC: Implement e500 (FSL) MMU
  PPC MPC7544DS: Use new TLB helper function
  PPC: Qdev'ify e500 pci

Andreas Färber (2):
  ppc64: Don't try to build sPAPR RTAS on Darwin
  ppc64: Fix out-of-tree builds

Anton Blanchard (1):
  pseries: Increase maximum CPUs to 256

David Gibson (3):
  Make pSeries 'model' property more closely resemble real hardware
  Place pseries vty devices at addresses more similar to existing machines
  Fix off-by-one error in sizing pSeries hcall table

Scott Wood (2):
  kvm: ppc: fixes for KVM_SET_SREGS on init
  monitor: add PPC BookE SPRs

 configure   |   22 +++-
 hw/ppc.c|   12 ++
 hw/ppce500.h|   22 ---
 hw/ppce500_mpc8544ds.c  |  113 
 hw/ppce500_pci.c|  136 +++-
 hw/spapr.c  |9 +-
 hw/spapr_hcall.c|4 +-
 hw/spapr_rtas.c |3 +-
 hw/spapr_vio.h  |2 +
 kvm-all.c   |5 +
 monitor.c   |   71 ++-
 target-ppc/cpu.h|  308 ++-
 target-ppc/helper.c |  269 ++
 target-ppc/helper.h |6 +
 target-ppc/kvm.c|  180 -
 target-ppc/op_helper.c  |  296 +
 target-ppc/translate.c  |  200 ++--
 target-ppc/translate_init.c |  282 ---
 18 files changed, 1664 insertions(+), 276 deletions(-)
 delete mode 100644 hw/ppce500.h




Re: [Qemu-devel] s390x: fix memory detection for guests > 64GB

2011-05-11 Thread Alexander Graf

On 10.05.2011, at 14:49, Christian Borntraeger wrote:

> Alex,
> 
> the s390 memory detection has a 16bit field that specifies the amount of
> increments. This patch adopts the memory size to always fit into that
> scheme. This also fixes virtio detection for these guests, since the 
> descriptor page is located after the main memory.
> 
> Signed-off-by: Christian Borntraeger 
> 
> --- a/target-s390x/op_helper.c
> +++ b/target-s390x/op_helper.c
> @@ -2361,6 +2361,7 @@ static void ext_interrupt(CPUState *env, int type, 
> uint32_t param,
> int sclp_service_call(CPUState *env, uint32_t sccb, uint64_t code)
> {
> int r = 0;
> +int shift = 0;
> 
> #ifdef DEBUG_HELPER
> printf("sclp(0x%x, 0x%" PRIx64 ")\n", sccb, code);
> @@ -2375,8 +2376,10 @@ int sclp_service_call(CPUState *env, uint32_t sccb, 
> uint64_t code)
> switch(code) {
> case SCLP_CMDW_READ_SCP_INFO:
> case SCLP_CMDW_READ_SCP_INFO_FORCED:
> -stw_phys(sccb + SCP_MEM_CODE, ram_size >> 20);
> -stb_phys(sccb + SCP_INCREMENT, 1);
> +while ((ram_size >> (20 + shift)) > 65535)
> +   shift++;

Please run scripts/checkpatch.pl on your patch.


Alex




Re: [Qemu-devel] s390x: change mapping base to allow guests > 2GB

2011-05-11 Thread Alexander Graf

On 10.05.2011, at 14:49, Christian Borntraeger wrote:

> Alex,
> 
> the current s390x qemu memory layout is
> 
> 0x100: guest start
> 0x8000: qemu binary
> 
> which limits the amount of available memory to <2GB.
> This patch moves the guest pages to 32GB to not collide with the binary
> and to leave some space for the program break of qemu. 
> 
> Signed-off-by: Christian Borntraeger 

Thanks, applied to the s390-next tree. Also, please dont't address me directly 
in the patch description, as that one's supposed to go to the public :).


Alex




Re: [Qemu-devel] [PATCH v4 5/5] hpet 'driftfix': add code in hpet_timer() to compensate delayed callbacks and coalesced interrupts

2011-05-11 Thread Zachary Amsden

On 05/09/2011 12:03 AM, Ulrich Obergfell wrote:

Loss of periodic timer interrupts caused by delayed callbacks and by
interrupt coalescing is compensated by gradually injecting additional
interrupts during subsequent timer intervals, starting at a rate of
one additional interrupt per interval. The injection of additional
interrupts is based on a backlog of unaccounted HPET clock periods
(new HPETTimer field 'ticks_not_accounted'). The backlog increases
due to delayed callbacks and coalesced interrupts, and it decreases
if an interrupt was injected successfully. If the backlog increases
while compensation is still in progress, the rate at which additional
interrupts are injected is increased too. A limit is imposed on the
backlog and on the rate.

Injecting additional timer interrupts to compensate lost interrupts
can alleviate long term time drift. However, on a short time scale,
this method can have the side effect of making virtual machine time
intermittently pass slower and faster than real time (depending on
the guest's time keeping algorithm). Compensation is disabled by
default and can be enabled for guests where this behaviour may be
acceptable.

Signed-off-by: Ulrich Obergfell
---
  hw/hpet.c |   70 +++-
  1 files changed, 68 insertions(+), 2 deletions(-)

diff --git a/hw/hpet.c b/hw/hpet.c
index e57c654..519fc6b 100644
--- a/hw/hpet.c
+++ b/hw/hpet.c
@@ -31,6 +31,7 @@
  #include "hpet_emul.h"
  #include "sysbus.h"
  #include "mc146818rtc.h"
+#include

  //#define HPET_DEBUG
  #ifdef HPET_DEBUG
@@ -41,6 +42,9 @@

  #define HPET_MSI_SUPPORT0

+#define MAX_TICKS_NOT_ACCOUNTED (uint64_t)5 /* 5 sec */
+#define MAX_IRQ_RATE(uint32_t)10
+
  struct HPETState;
  typedef struct HPETTimer {  /* timers */
  uint8_t tn; /*timer number*/
@@ -324,14 +328,35 @@ static const VMStateDescription vmstate_hpet = {
  }
  };

+static void hpet_timer_driftfix_reset(HPETTimer *t)
+{
+if (t->state->driftfix&&  timer_is_periodic(t)) {
+t->ticks_not_accounted = t->prev_period = t->period;
   


This is rather confusing.  Clearly, ticks_not_accounted isn't actually 
ticks not accounted, it's a different variable entirely which is based 
on the current period.


If it were actually ticks_not_accounted, it should be reset to zero.


+t->irq_rate = 1;
+t->divisor = 1;
+}
+}
+
+static bool hpet_timer_has_tick_backlog(HPETTimer *t)
+{
+uint64_t backlog = 0;
+
+if (t->ticks_not_accounted>= t->period + t->prev_period) {
+backlog = t->ticks_not_accounted - (t->period + t->prev_period);
+}
+return (backlog>= t->period);
+}
+
  /*
   * timer expiration callback
   */
  static void hpet_timer(void *opaque)
  {
  HPETTimer *t = opaque;
+HPETState *s = t->state;
  uint64_t diff;
-
+int irq_delivered = 0;
+uint32_t period_count = 0;
  uint64_t period = t->period;
  uint64_t cur_tick = hpet_get_ticks(t->state);

@@ -339,13 +364,37 @@ static void hpet_timer(void *opaque)
  if (t->config&  HPET_TN_32BIT) {
  while (hpet_time_after(cur_tick, t->cmp)) {
  t->cmp = (uint32_t)(t->cmp + t->period);
+t->ticks_not_accounted += t->period;
+period_count++;
  }
  } else {
  while (hpet_time_after64(cur_tick, t->cmp)) {
  t->cmp += period;
+t->ticks_not_accounted += period;
+period_count++;
  }
  }
  diff = hpet_calculate_diff(t, cur_tick);
+if (s->driftfix) {
+if (t->ticks_not_accounted>  MAX_TICKS_NOT_ACCOUNTED) {
+t->ticks_not_accounted = t->period + t->prev_period;
+}
+if (hpet_timer_has_tick_backlog(t)) {
+if (t->irq_rate == 1 || period_count>  1) {
+t->irq_rate++;
+t->irq_rate = MIN(t->irq_rate, MAX_IRQ_RATE);
+}
+if (t->divisor == 0) {
+assert(period_count);
+}
+if (period_count) {
+t->divisor = t->irq_rate;
+}
+diff /= t->divisor--;
   


Why subtracting from the divisor?  Shouldn't the divisor and irq_rate 
change in lockstep?



+} else {
+t->irq_rate = 1;
+}
+}
  qemu_mod_timer(t->qemu_timer,
 qemu_get_clock_ns(vm_clock) + 
(int64_t)ticks_to_ns(diff));
  } else if (t->config&  HPET_TN_32BIT&&  !timer_is_periodic(t)) {
@@ -356,7 +405,22 @@ static void hpet_timer(void *opaque)
  t->wrap_flag = 0;
  }
  }
-update_irq(t, 1);
+if (s->driftfix&&  timer_is_periodic(t)&&  period != 0) {
+if (t->ticks_not_accounted>= t->period + t->prev_period) {
+irq_delivered = update_irq(t, 1);
+if (irq_del

[Qemu-devel] -net interface association behavior change in current -git.

2011-05-11 Thread Rob Landley
In 1.14.0, if I did this:

  qemu -net nic,blah -net user -net nic,blah -net tun,blah

Then the first nic would be -net user, and the second nic would be -net
tun.In current -git, -net user attaches to the second interface and
-net tun attaches to the first, I.E. the order is reversed.

Either way the first -nic becomes eth0 in Linux and the second becomes
eth1 (I can manually assign mac addresses in order to confirm which is
which), but eth0 used to be the -net user interface and now eth1 is the
-net user interface.

I bisected this to commit 60c07d933c66c4b30a83b but I don't know why it
changed the behavior, and I can't find _documentation_ on having
multiple interfaces transports hooked up to the same qemu instance
anyway.  (It used to work, but possibly that was an accident?)

Any ideas?

Rob