[Qemu-devel] [Bug 1198350] Re: USB pass-through fails with USBDEVFS_DISCONNECT: Invalid argument

2013-08-24 Thread Mike
I also have this issue. Does anyone have a work around? (it works with
Virtual Box)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1198350

Title:
  USB pass-through fails with USBDEVFS_DISCONNECT: Invalid argument

Status in QEMU:
  New

Bug description:
  Host Gentoo linux 32bit
  Guest Windows XP SP3
  qemu 1.4.2 and
  qemu fresh get clone and build 2013-07-04 (version1.5.50)
  qemu command line

  qemu-system-i386 -enable-kvm localtime -m 2047 -boot d
  /archive3/qemu/WindowsXP.img -net nic,model=rtl8139 -net user -usb
  -device usb-ehci,id=ehci -usbdevice host:1493:19

  The device I am trying to use with the guest is an interface for the
  Suunto Ambit 2 GPS watch which has no linux support.

  When the USB device is plugged in qemu reports to the command line:

  USBDEVFS_DISCONNECT: Invalid argument
  Invalid argument

  dmesg shows

  [237755.495968] usb 2-1.5: new full-speed USB device number 34 using ehci-pci
  [237755.582778] usb 2-1.5: config 1 has an invalid interface number: 1 but 
max is 0
  [237755.582781] usb 2-1.5: config 1 has no interface number 0
  [237755.583628] usb 2-1.5: New USB device found, idVendor=1493, idProduct=0019
  [237755.583631] usb 2-1.5: New USB device strings: Mfr=1, Product=2, 
SerialNumber=3
  [237755.583633] usb 2-1.5: Product: Ambit
  [237755.583634] usb 2-1.5: Manufacturer: Suunto
  [237755.583636] usb 2-1.5: SerialNumber: CE8309511700
  [237756.584937] usb 2-1.5: reset full-speed USB device number 34 using 
ehci-pci
  [237756.832658] usb 2-1.5: reset full-speed USB device number 34 using 
ehci-pci
  [237757.143585] usb 2-1.5: usbfs: process 12684 (qemu-system-i38) did not 
claim interface 1 before use

  In the windows guest Device Manager a HID device is listed but nothing
  else happens, no found new hardware dialog or the Suunto software
  (which is sitting there waiting) is not triggered as it should be.

  I have tried successfully with several other devices (flash drive,
  mouse, printer and video capture device). Because this device pretends
  to be an HID device my kernel's hid-generic driver was picking it up
  first until I modified hid-core.c to ignore this vendorid/productid.
  But still no joy.

  I'm guessing it has something to do with the the dmesg lines:

  [237755.582778] usb 2-1.5: config 1 has an invalid interface number: 1 but 
max is 0
  [237755.582781] usb 2-1.5: config 1 has no interface number 0

  But read that these warnings are not important though I don't get them
  for other devices. Nor do I get:

  [237757.143585] usb 2-1.5: usbfs: process 12684 (qemu-system-i38) did
  not claim interface 1 before use

  I've done alot of searching and I've run out of ideas. Any help would
  be great.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1198350/+subscriptions



Re: [Qemu-devel] [PATCH v2] net/net: Change the default mac address of nic

2013-10-14 Thread mike

On 10/14/2013 03:30 PM, Mike Qiu wrote:

Sorry for the first version make a mistake of the sender.

There is no difference with v1.

Thanks
Mike

The default mac address is 52:54:00:12:34:56 + index, this will
cause problem that when we boot up more than one guest with all
mac addresses unset by default, assume that each guest has one
nic. In this situation, all the guest's nic has the same mac address.

This patch is to solve this bug.

Signed-off-by: Mike Qiu 
---
  net/net.c | 9 ++---
  1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/net.c b/net/net.c
index c330c9a..2796d04 100644
--- a/net/net.c
+++ b/net/net.c
@@ -21,6 +21,8 @@
   * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
   * THE SOFTWARE.
   */
+#include 
+
  #include "config-host.h"

  #include "net/net.h"
@@ -147,12 +149,13 @@ void qemu_macaddr_default_if_unset(MACAddr *macaddr)

  if (memcmp(macaddr, &zero, sizeof(zero)) != 0)
  return;
+srand((unsigned)time(NULL));
  macaddr->a[0] = 0x52;
  macaddr->a[1] = 0x54;
  macaddr->a[2] = 0x00;
-macaddr->a[3] = 0x12;
-macaddr->a[4] = 0x34;
-macaddr->a[5] = 0x56 + index++;
+macaddr->a[3] = rand() % 256;
+macaddr->a[4] = rand() % 256;
+macaddr->a[5] = rand() % 256 + index++;
  }

  /**





Re: [Qemu-devel] [PATCH] hmp: Add '\n' in monitor_printf()

2013-10-14 Thread mike

On 10/14/2013 10:36 PM, Markus Armbruster wrote:

Mike Qiu  writes:


Without this, output of 'info block'

scsi0-hd0: /images/f18-ppc64.qcow2 (qcow2)
  [not inserted]
scsi0-cd2: [not inserted]
 Removable device: not locked, tray closed

floppy0: [not inserted]
 Removable device: not locked, tray closed

sd0: [not inserted]
 Removable device: not locked, tray closed

There will be no additional lines between scsi0-hd0 scsi0-cd2,
and break the info style.

Just saw a similar one:

 (qemu) info block
 disk0: test.img (raw)
  [not inserted]
 cd: [not inserted]
 Removable device: not locked, tray closed

 foo: tmp.img (raw)
 Removable device: not locked, tray closed
  [not inserted](qemu)


This patch is to solve this.

Signed-off-by: Mike Qiu 
---
  hmp.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hmp.c b/hmp.c
index 5891507..2d2e5f8 100644
--- a/hmp.c
+++ b/hmp.c
@@ -367,7 +367,7 @@ void hmp_info_block(Monitor *mon, const QDict *qdict)
  info->value->inserted->iops_wr_max,
  info->value->inserted->iops_size);
  } else {
-monitor_printf(mon, " [not inserted]");
+monitor_printf(mon, " [not inserted]\n");
  }
  
  if (verbose) {

monitor_printf(mon, "\nImages:\n");

What about removing the newline before "Images"?

A good idea I think, it no need to add addition lines in one info.

But see commit id: fbe2e26c15af35e4d157874dc80f6a19eebaa83b

-if (verbose) {
-monitor_printf(mon, " images:\n");
^^^
-image_info = info->value->inserted->image;
-while (1) {
- bdrv_image_info_dump((fprintf_function)monitor_printf,
- mon, image_info);
-if (image_info->has_backing_image) {
-image_info = image_info->backing_image;
-} else {
-break;
-}
+if (verbose) {
+monitor_printf(mon, "\nImages:\n");
^
+image_info = info->value->inserted->image;
+while (1) {
+ bdrv_image_info_dump((fprintf_function)monitor_printf,
+ mon, image_info);
+if (image_info->has_backing_image) {
+image_info = image_info->backing_image;
+} else {
+break;
 }
 }
-} else {
-monitor_printf(mon, " [not inserted]");
 }

It was changed to add this, so there maybe some reasons I think,

Thanks
Mike

I think we should also drop this newline:

 if (info->value->removable) {
 monitor_printf(mon, "Removable device: %slocked, tray %s\n",
info->value->locked ? "" : "not ",
info->value->tray_open ? "open" : "closed");
 }

Maybe more.  The function probably needs a systematic newline review.







Re: [Qemu-devel] [PATCH v2] net/net: Change the default mac address of nic

2013-10-14 Thread mike

On 10/15/2013 01:07 PM, Stefan Weil wrote:

Am 15.10.2013 06:17, schrieb Mike Qiu:

Changelog to v1:
Find remainder of macaddr->a[5] by modulo 256,
otherwise it may be overflow by add index++.

The default mac address is 52:54:00:12:34:56 + index, this will
cause problem that when we boot up more than one guest with all
mac addresses unset by default, assume that each guest has one
nic. In this situation, all the guest's nic has the same mac address.

This patch is to solve this bug.

Signed-off-by: Mike Qiu 
---
  net/net.c | 9 ++---
  1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/net.c b/net/net.c
index c330c9a..9e72764 100644
--- a/net/net.c
+++ b/net/net.c
@@ -21,6 +21,8 @@
   * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
   * THE SOFTWARE.
   */
+#include 
+
  #include "config-host.h"
  
  #include "net/net.h"

@@ -147,12 +149,13 @@ void qemu_macaddr_default_if_unset(MACAddr *macaddr)
  
  if (memcmp(macaddr, &zero, sizeof(zero)) != 0)

  return;
+srand((unsigned)time(NULL));
  macaddr->a[0] = 0x52;
  macaddr->a[1] = 0x54;
  macaddr->a[2] = 0x00;
-macaddr->a[3] = 0x12;
-macaddr->a[4] = 0x34;
-macaddr->a[5] = 0x56 + index++;
+macaddr->a[3] = rand() % 256;
+macaddr->a[4] = rand() % 256;
+macaddr->a[5] = (rand() % 256 + index++) % 256;
  }
  
  /**

There is no overflow which must be handled because a[5] is an uint8_t
value, so the assignment automatically limits the range to 0...255.

OK, you are right, but I think we'd better to ensure this,
even though a[5] is an uint8_t.

Is it reasonable to get a random mac address in your guest? I don't
think so. It would no longer be possible to connect to a guest using
ssh, restart that guest and connect again with ssh.

Why not? I have do the experiment, after reboot, the mac is not changed.
and the ip address always the same.

And can be login to the guest after reboot.


If you start more than one guest, you simply have to decide which mac
address you want and tell it on the command line.

Yes, mostly we should do this, but users sometimes not do this,
so this interface should cover this situation.

Thank
Mike


Stefan









Re: [Qemu-devel] [PATCH v2] net/net: Change the default mac address of nic

2013-10-15 Thread mike

On 10/15/2013 02:05 PM, Stefan Weil wrote:

Am 15.10.2013 07:57, schrieb mike:

On 10/15/2013 01:07 PM, Stefan Weil wrote:

Am 15.10.2013 06:17, schrieb Mike Qiu:

Changelog to v1:
 Find remainder of macaddr->a[5] by modulo 256,
 otherwise it may be overflow by add index++.

The default mac address is 52:54:00:12:34:56 + index, this will
cause problem that when we boot up more than one guest with all
mac addresses unset by default, assume that each guest has one
nic. In this situation, all the guest's nic has the same mac address.

This patch is to solve this bug.

Signed-off-by: Mike Qiu 
---
   net/net.c | 9 ++---
   1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/net.c b/net/net.c
index c330c9a..9e72764 100644
--- a/net/net.c
+++ b/net/net.c
@@ -21,6 +21,8 @@
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN
* THE SOFTWARE.
*/
+#include 
+
   #include "config-host.h"
 #include "net/net.h"
@@ -147,12 +149,13 @@ void qemu_macaddr_default_if_unset(MACAddr
*macaddr)
 if (memcmp(macaddr, &zero, sizeof(zero)) != 0)
   return;
+srand((unsigned)time(NULL));
   macaddr->a[0] = 0x52;
   macaddr->a[1] = 0x54;
   macaddr->a[2] = 0x00;
-macaddr->a[3] = 0x12;
-macaddr->a[4] = 0x34;
-macaddr->a[5] = 0x56 + index++;
+macaddr->a[3] = rand() % 256;
+macaddr->a[4] = rand() % 256;
+macaddr->a[5] = (rand() % 256 + index++) % 256;
   }
 /**

There is no overflow which must be handled because a[5] is an uint8_t
value, so the assignment automatically limits the range to 0...255.

OK, you are right, but I think we'd better to ensure this,
even though a[5] is an uint8_t.

Is it reasonable to get a random mac address in your guest? I don't
think so. It would no longer be possible to connect to a guest using
ssh, restart that guest and connect again with ssh.

Why not? I have do the experiment, after reboot, the mac is not changed.
and the ip address always the same.

And can be login to the guest after reboot.

"restart" means terminate QEMU and start it again.

OK,  qemu support the mac address unset right ?
So it may be a joke if just can start only one VM with mac address unset :).

In your case, also can use monitor to get the mac address, simply 'info 
network' :)

Then you can do as if you set the mac address.

Thanks
Mike









Re: [Qemu-devel] [PATCH] hmp: Add '\n' in monitor_printf()

2013-10-15 Thread mike

On 10/15/2013 04:58 PM, Kevin Wolf wrote:

Am 15.10.2013 um 05:38 hat mike geschrieben:

On 10/14/2013 10:36 PM, Markus Armbruster wrote:

Mike Qiu  writes:


Without this, output of 'info block'

scsi0-hd0: /images/f18-ppc64.qcow2 (qcow2)
  [not inserted]
scsi0-cd2: [not inserted]
 Removable device: not locked, tray closed

floppy0: [not inserted]
 Removable device: not locked, tray closed

sd0: [not inserted]
 Removable device: not locked, tray closed

There will be no additional lines between scsi0-hd0 scsi0-cd2,
and break the info style.

Just saw a similar one:

 (qemu) info block
 disk0: test.img (raw)
  [not inserted]
 cd: [not inserted]
 Removable device: not locked, tray closed

 foo: tmp.img (raw)
 Removable device: not locked, tray closed
  [not inserted](qemu)


This patch is to solve this.

Signed-off-by: Mike Qiu 
---
  hmp.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hmp.c b/hmp.c
index 5891507..2d2e5f8 100644
--- a/hmp.c
+++ b/hmp.c
@@ -367,7 +367,7 @@ void hmp_info_block(Monitor *mon, const QDict *qdict)
  info->value->inserted->iops_wr_max,
  info->value->inserted->iops_size);
  } else {
-monitor_printf(mon, " [not inserted]");
+monitor_printf(mon, " [not inserted]\n");
  }
  if (verbose) {

monitor_printf(mon, "\nImages:\n");

What about removing the newline before "Images"?

A good idea I think, it no need to add addition lines in one info.

But see commit id: fbe2e26c15af35e4d157874dc80f6a19eebaa83b
[...]
It was changed to add this, so there maybe some reasons I think,

Like everything else in that commit, I did that change because I found it
more readable.

The problem seems to be commit 3e9fab69 ('block: Add support for
throttling burst max in QMP and the command line'), which added a bogus
"[not inserted]" message. We simply need to drop it altogether instead of
adding a newline.


Yes, I agree with you. but maybe need the author of the commit 3e9fab69
('block: Add support for throttling burst max in QMP and the command line')
to have some comments on this line, I think.

I think we should also drop this newline:

  if (info->value->removable) {
  monitor_printf(mon, "Removable device: %slocked, tray %s\n",
 info->value->locked ? "" : "not ",
 info->value->tray_open ? "open" : "closed");
  }

Why? Look:

(qemu) info block
scsi0-cd0: /tmp/cdrom.qcow2 (qcow2)
 Removable device: not locked, tray closed
 Backing file: /home/kwolf/images/iso/Fedora-18-x86_64-Live-Desktop.iso 
(chain depth: 1)
 I/O throttling:   bps=1048576 bps_rd=0 bps_wr=0 bps_max=104857 
bps_rd_max=0 bps_wr_max=0 iops=0 iops_rd=0 iops_wr=0 iops_max=0 iops_rd_max=0 
iops_wr_max=0 iops_size=0

Do you really want to remove the newline?

I'm not, but Markus suggest to do so.

Thanks
Mike

Kevin








Re: [Qemu-devel] [PATCH] hmp: Add '\n' in monitor_printf()

2013-10-15 Thread mike

On 10/15/2013 05:31 PM, Markus Armbruster wrote:

Kevin Wolf  writes:


Am 15.10.2013 um 05:38 hat mike geschrieben:

On 10/14/2013 10:36 PM, Markus Armbruster wrote:

Mike Qiu  writes:


Without this, output of 'info block'

scsi0-hd0: /images/f18-ppc64.qcow2 (qcow2)
  [not inserted]
scsi0-cd2: [not inserted]
 Removable device: not locked, tray closed

floppy0: [not inserted]
 Removable device: not locked, tray closed

sd0: [not inserted]
 Removable device: not locked, tray closed

There will be no additional lines between scsi0-hd0 scsi0-cd2,
and break the info style.

Just saw a similar one:

 (qemu) info block
 disk0: test.img (raw)
  [not inserted]
 cd: [not inserted]
 Removable device: not locked, tray closed

 foo: tmp.img (raw)
 Removable device: not locked, tray closed
  [not inserted](qemu)


This patch is to solve this.

Signed-off-by: Mike Qiu 
---
  hmp.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hmp.c b/hmp.c
index 5891507..2d2e5f8 100644
--- a/hmp.c
+++ b/hmp.c
@@ -367,7 +367,7 @@ void hmp_info_block(Monitor *mon, const QDict *qdict)
  info->value->inserted->iops_wr_max,
  info->value->inserted->iops_size);
  } else {
-monitor_printf(mon, " [not inserted]");
+monitor_printf(mon, " [not inserted]\n");
  }
  if (verbose) {

monitor_printf(mon, "\nImages:\n");

What about removing the newline before "Images"?

A good idea I think, it no need to add addition lines in one info.

But see commit id: fbe2e26c15af35e4d157874dc80f6a19eebaa83b
[...]
It was changed to add this, so there maybe some reasons I think,

Like everything else in that commit, I did that change because I found it
more readable.

The problem seems to be commit 3e9fab69 ('block: Add support for
throttling burst max in QMP and the command line'), which added a bogus
"[not inserted]" message. We simply need to drop it altogether instead of
adding a newline.


I think we should also drop this newline:

  if (info->value->removable) {
  monitor_printf(mon, "Removable device: %slocked, tray %s\n",
 info->value->locked ? "" : "not ",
 info->value->tray_open ? "open" : "closed");
  }

Why? Look:

(qemu) info block
scsi0-cd0: /tmp/cdrom.qcow2 (qcow2)
 Removable device: not locked, tray closed
 Backing file: /home/kwolf/images/iso/Fedora-18-x86_64-Live-Desktop.iso 
(chain depth: 1)
 I/O throttling:   bps=1048576 bps_rd=0 bps_wr=0 bps_max=104857 
bps_rd_max=0 bps_wr_max=0 iops=0 iops_rd=0 iops_wr=0 iops_max=0 iops_rd_max=0 
iops_wr_max=0 iops_size=0

Do you really want to remove the newline?

This one made me think I do:

 foo: tmp.img (raw)
 Removable device: not locked, tray closed
  [not inserted](qemu)

If the '[not inserted]' is wrong and needs to go, then I actually don't.
Here '[not inserted]' is very strange, if the commit author has some 
reasonable reasons,

we can keep it, otherwise, I think we should remove it.

Thanks
Mike








Re: [Qemu-devel] [PATCH v2] net/net: Change the default mac address of nic

2013-10-15 Thread mike

On 10/15/2013 08:36 PM, Eric Blake wrote:

On 10/14/2013 11:07 PM, Stefan Weil wrote:

Is it reasonable to get a random mac address in your guest? I don't
think so. It would no longer be possible to connect to a guest using
ssh, restart that guest and connect again with ssh.

Agreed - libvirt ALWAYS passes a MAC to qemu, even if the user did not
specify a MAC to libvirt, precisely because the MAC must be reproducible
rather than random to avoid changing the guest ABI.  I don't think this
patch is needed - it's up to management to use qemu correctly.

Yes, you are right in this condition. But qemu support Mac address unset.
Also we can get the ip address through a lot of different ways, like use
monitor to get the mac and then get the ip. So we can login use ssh.

But as you mentioned, this patch is not needed, I don't agree with you.

First, this patch just fix the Potential issue of this feature. Now libvirt
maybe can't triggered this issue, who can promise in future will not.

The second is,  qemu not only be used by libvirt,  lots of developers like
to use the command line to boot up the guest. And in the future, we
are not sure about other program will use qemu.

The third is, when one feature has a issue in qemu,
no matter when it is been triggered, should we not fix it?

Mike
Thanks



Re: [Qemu-devel] [PATCH] hmp: Add '\n' in monitor_printf()

2013-10-16 Thread mike

On 10/16/2013 03:56 PM, Wenchao Xia wrote:

于 2013/10/16 15:47, Markus Armbruster 写道:

Wenchao Xia  writes:


于 2013/10/15 18:07, mike 写道:

On 10/15/2013 04:58 PM, Kevin Wolf wrote:

Am 15.10.2013 um 05:38 hat mike geschrieben:

On 10/14/2013 10:36 PM, Markus Armbruster wrote:

Mike Qiu  writes:


Without this, output of 'info block'

scsi0-hd0: /images/f18-ppc64.qcow2 (qcow2)
[not inserted]
scsi0-cd2: [not inserted]
Removable device: not locked, tray closed

floppy0: [not inserted]
Removable device: not locked, tray closed

sd0: [not inserted]
Removable device: not locked, tray closed

There will be no additional lines between scsi0-hd0 scsi0-cd2,
and break the info style.

Just saw a similar one:

(qemu) info block
disk0: test.img (raw)
[not inserted]
cd: [not inserted]
Removable device: not locked, tray closed

foo: tmp.img (raw)
Removable device: not locked, tray closed
[not inserted](qemu)


This patch is to solve this.

Signed-off-by: Mike Qiu
---
hmp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hmp.c b/hmp.c
index 5891507..2d2e5f8 100644
--- a/hmp.c
+++ b/hmp.c
@@ -367,7 +367,7 @@ void hmp_info_block(Monitor *mon, const
QDict *qdict)
info->value->inserted->iops_wr_max,
info->value->inserted->iops_size);
} else {
- monitor_printf(mon, " [not inserted]");
+ monitor_printf(mon, " [not inserted]\n");
}
if (verbose) {

monitor_printf(mon, "\nImages:\n");

What about removing the newline before "Images"?

A good idea I think, it no need to add addition lines in one info.

But see commit id: fbe2e26c15af35e4d157874dc80f6a19eebaa83b
[...]
It was changed to add this, so there maybe some reasons I think,

Like everything else in that commit, I did that change because I
found it
more readable.

The problem seems to be commit 3e9fab69 ('block: Add support for
throttling burst max in QMP and the command line'), which added a 
bogus

"[not inserted]" message. We simply need to drop it altogether
instead of
adding a newline.

Yes, I agree with you. but maybe need the author of the commit 
3e9fab69

('block: Add support for throttling burst max in QMP and the command
line')
to have some comments on this line, I think.

I think we should also drop this newline:

if (info->value->removable) {
monitor_printf(mon, " Removable device: %slocked, tray %s\n",
info->value->locked ? "" : "not ",
info->value->tray_open ? "open" : "closed");
}

Why? Look:

(qemu) info block
scsi0-cd0: /tmp/cdrom.qcow2 (qcow2)
Removable device: not locked, tray closed
Backing file:
/home/kwolf/images/iso/Fedora-18-x86_64-Live-Desktop.iso (chain
depth: 1)
I/O throttling: bps=1048576 bps_rd=0 bps_wr=0 bps_max=104857
bps_rd_max=0 bps_wr_max=0 iops=0 iops_rd=0 iops_wr=0 iops_max=0
iops_rd_max=0 iops_wr_max=0 iops_size=0

Do you really want to remove the newline?

I'm not, but Markus suggest to do so.


Why the new line matters? I think there is a QMP interface available,
so the hmp output format
would not bother much, just need to tip clear the info.

I want this fixed:

 $ qemu -nodefaults -nographic -monitor stdio -S -drive 
if=none,file=tmp.img

 QEMU 1.6.50 monitor - type 'help' for more information
 (qemu) info block
 none0: tmp.img (raw)
 Removable device: not locked, tray closed
  [not inserted](qemu)

Output doesn't end with newline, messing up the prompt.


  I see, no end with newline is bad, should fix.


OK, I will make a patch to drop this ' [not inserted]' line instead of 
add a new line.


Thanks
Mike

Elswhere in the thread, we concluded that the offending '[not inserted]'
is bogus, and should be dropped.











Re: [Qemu-devel] [PATCH v2] hmp: solve '\n' in monitor_printf()

2013-10-17 Thread mike

On 10/17/2013 04:11 PM, Kevin Wolf wrote:

Am 17.10.2013 um 05:16 hat Mike Qiu geschrieben:

Change to v1:
remove '[not inserted]' line instead of adding '\n'

Please put such notes for reviewers below the --- line, so that git am
automatically removes them and they don't end up in the git commit
message.

OK, thanks for your kindly remind,

Thanks
Mike

Output of 'info block'

scsi0-hd0: /images/f18-ppc64.qcow2 (qcow2)
  [not inserted]
scsi0-cd2: [not inserted]
 Removable device: not locked, tray closed

floppy0: [not inserted]
 Removable device: not locked, tray closed

sd0: [not inserted]
 Removable device: not locked, tray closed

There will be no additional lines between scsi0-hd0 and
scsi0-cd2.

At the same time, scsi0-hd0 already inserted, but still has
'[not inserted]' flag. This line should be removed.

This patch is to solve this.

Signed-off-by: Mike Qiu 

Reviewed-by: Kevin Wolf 








Re: [Qemu-devel] [PATCH v2] hmp: solve '\n' in monitor_printf()

2013-10-17 Thread mike

On 10/17/2013 04:14 PM, Stefan Hajnoczi wrote:

On Wed, Oct 16, 2013 at 11:16:01PM -0400, Mike Qiu wrote:

Change to v1:
remove '[not inserted]' line instead of adding '\n'

Output of 'info block'

scsi0-hd0: /images/f18-ppc64.qcow2 (qcow2)
  [not inserted]
scsi0-cd2: [not inserted]
 Removable device: not locked, tray closed

floppy0: [not inserted]
 Removable device: not locked, tray closed

sd0: [not inserted]
 Removable device: not locked, tray closed

There will be no additional lines between scsi0-hd0 and
scsi0-cd2.

At the same time, scsi0-hd0 already inserted, but still has
'[not inserted]' flag. This line should be removed.

This patch is to solve this.

Signed-off-by: Mike Qiu 
---
  hmp.c | 2 --
  1 file changed, 2 deletions(-)

The commit message is out-of-date, this patch now drops the bogus "[not
inserted]" output.  I fixed up the commit message.

In the future, please put the patch changelog after the "---" line so
that git am does not include the changelog in the commit description.

OK, Got it, thanks for your kindly remind :)

Thanks
Mike

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan







Re: [Qemu-devel] [Bug 1174654] Re: qemu-system-x86_64 takes 100% CPU after host machine resumed from suspend to ram

2013-10-17 Thread mike

On 10/18/2013 04:29 AM, tobias wrote:

hi,

tried your option but it does not help. (cpu usage is still high)
below my command line syntax:
qemu-system-x86_64 -global mc146818rtc.lost_tick_policy=slew -machine 
accel=kvm:tcg -name win7 -S -machine pc-i440fx-1.4,accel=kvm,usb=off -m 2048 
-realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 
813f5806-64ec-3319-452a-5e1834e753c9 -no-user-config -nodefaults -chardev 
socket,id=charmonitor,path=/var/lib/libvirt/qemu/win7.monitor,server,nowait 
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime 
-no-shutdown -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device 
ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 
-device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 
-device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 
-device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x8 -drive 
file=/data/vmware/win7.img,if=none,id=drive-virtio-disk0,format=qcow2 -device 
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
 -device usb-tablet,id=input0 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 
-device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device 
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -vga std
Hi, have you enable the kernel CPU idle driver?  especially the guest 
kernel.


Thanks
Mike








Re: [Qemu-devel] [PATCH v2] net/net: Change the default mac address of nic

2013-10-17 Thread mike

On 10/17/2013 08:30 PM, Stefan Hajnoczi wrote:

On Tue, Oct 15, 2013 at 09:33:06PM +0800, mike wrote:

On 10/15/2013 08:36 PM, Eric Blake wrote:

On 10/14/2013 11:07 PM, Stefan Weil wrote:

Is it reasonable to get a random mac address in your guest? I don't
think so. It would no longer be possible to connect to a guest using
ssh, restart that guest and connect again with ssh.

Agreed - libvirt ALWAYS passes a MAC to qemu, even if the user did not
specify a MAC to libvirt, precisely because the MAC must be reproducible
rather than random to avoid changing the guest ABI.  I don't think this
patch is needed - it's up to management to use qemu correctly.

Yes, you are right in this condition. But qemu support Mac address unset.
Also we can get the ip address through a lot of different ways, like use
monitor to get the mac and then get the ip. So we can login use ssh.

But as you mentioned, this patch is not needed, I don't agree with you.

First, this patch just fix the Potential issue of this feature. Now libvirt
maybe can't triggered this issue, who can promise in future will not.

The second is,  qemu not only be used by libvirt,  lots of developers like
to use the command line to boot up the guest. And in the future, we
are not sure about other program will use qemu.

The third is, when one feature has a issue in qemu,
no matter when it is been triggered, should we not fix it?

NACK

I'm not going to merge this patch:

If you terminate QEMU and launch it again the NIC gets a different MAC
address.  Some guest operating systems are sensitive to this - under

For these users must use -device ,mac=XX:XX:XX:XX:XX:XX.
I think no body will boot up the guest, which sensitive to this, without 
mac address.


Actually, people use the command line without mac address, mean they mainly
don't care about mac address, so give them random mac address is reasonable
I think.

In my opinion, if we fix this, for qemu side no any issue, we both support
mac address set or unset correctly.

What am I confuse is, *qemu supports mac address unset, why we force
users must set the address when more than one guests*?
This is unreasonable.


many Linux distros the network interfaces names change due to the MAC
address change.  As a result firewall configuration will break and other
services may fail to start because they cannot find the interface.

Agree, so this mac address should set in qemu command line as
libvirt does :)

If you have multiple guests or want control over the MAC address, set it
explicitly using -device ,mac=XX:XX:XX:XX:XX:XX.

Currently, especially for developers, people mainly use qemu
command line directly, and as qemu supports mac address
unset, they may try the simplest command line to boot up
lots of guests, they will confuse about why all this guest use
the same mac address.

Thanks
Mike

Stefan








Re: [Qemu-devel] [Bug 1174654] Re: qemu-system-x86_64 takes 100% CPU after host machine resumed from suspend to ram

2013-10-18 Thread mike

On 10/18/2013 03:12 PM, tobias wrote:

Hello Mike,

Thanks a lot for getting back on this.
Is the "cpu idle driver" a command line option I need to specify for
qemu (the -cpu option ?)
I could not find a reference to "idle" in the man page.

You need to check the guest kernel config file.

Thanks
Mike

regards,

Tobias.

On 18-10-13 04:33, mike wrote:

On 10/18/2013 04:29 AM, tobias wrote:

hi,

tried your option but it does not help. (cpu usage is still high)
below my command line syntax:
qemu-system-x86_64 -global mc146818rtc.lost_tick_policy=slew -machine
accel=kvm:tcg -name win7 -S -machine pc-i440fx-1.4,accel=kvm,usb=off
-m 2048 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid
813f5806-64ec-3319-452a-5e1834e753c9 -no-user-config -nodefaults
-chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/win7.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime
-no-shutdown -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7
-device
ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5
-device
ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1
-device
ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2
-device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x8 -drive
file=/data/vmware/win7.img,if=none,id=drive-virtio-disk0,format=qcow2
-device
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
-device usb-tablet,id=input0 -device
intel-hda,id=sound0,bus=pci.0,addr=0x4 -device
hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -vga std

Hi, have you enable the kernel CPU idle driver?  especially the guest
kernel.

Thanks
Mike






Re: [Qemu-devel] [PATCH v2] net/net: Change the default mac address of nic

2013-10-18 Thread mike

On 10/18/2013 05:00 PM, Stefan Hajnoczi wrote:

On Fri, Oct 18, 2013 at 10:54:13AM +0800, mike wrote:

NACK

I'm not going to merge this patch:

If you terminate QEMU and launch it again the NIC gets a different MAC
address.  Some guest operating systems are sensitive to this - under

For these users must use -device ,mac=XX:XX:XX:XX:XX:XX.
I think no body will boot up the guest, which sensitive to this,
without mac address.

Actually, people use the command line without mac address, mean they mainly
don't care about mac address, so give them random mac address is reasonable
I think.

In my opinion, if we fix this, for qemu side no any issue, we both support
mac address set or unset correctly.

What am I confuse is, *qemu supports mac address unset, why we force
users must set the address when more than one guests*?
This is unreasonable.


many Linux distros the network interfaces names change due to the MAC
address change.  As a result firewall configuration will break and other
services may fail to start because they cannot find the interface.

Agree, so this mac address should set in qemu command line as
libvirt does :)

If you have multiple guests or want control over the MAC address, set it
explicitly using -device ,mac=XX:XX:XX:XX:XX:XX.

Currently, especially for developers, people mainly use qemu
command line directly, and as qemu supports mac address
unset, they may try the simplest command line to boot up
lots of guests, they will confuse about why all this guest use
the same mac address.

Your argument is weak: *you* want to avoid specifying the MAC address so
in exchange you want to *break* existing configurations and force other
people to start specifying a MAC address.

OK, I do not want to break the existing configurations.

I'm fine if you do not want to merge this patch.

But I think this should be an issue of qemu, and need to
do something on it, so I make this patch.

Can we try other solutions to solve this issue?
(if you agree this should be an issue)

Also this potential issue can happens if the user set the same mac address
with more than one guest on one host.

Can we avoid this ?

Thanks
Mike

This doesn't improve anything, it will just annoy users and cause bug
reports.

Sorry that there isn't a solution that satisfies everyone, you'll have
to add a MAC address to your command-line.

Stefan







Re: [Qemu-devel] [PATCH] spapr: increase temporary fdt buffer size

2013-09-24 Thread mike

On 09/24/2013 01:59 PM, Alexey Kardashevskiy wrote:

At the moment the size of the buffer is set to 64K which is
enough for approximately 150 VCPUs which is not the limit.

This increases the buffer up to 256K which allows having
a tree for approximately 600 VCPUs which is way beyond the real
number we need.

As only the real size of the tree is copied to the guest, there
will be no impact on existing configurations.

Signed-off-by: Alexey Kardashevskiy 
---
  hw/ppc/spapr.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 1bb76f1..063fe63 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -62,7 +62,7 @@
   *
   * We load our kernel at 4M, leaving space for SLOF initial image
   */
-#define FDT_MAX_SIZE0x1
+#define FDT_MAX_SIZE0x4
  #define RTAS_MAX_SIZE   0x1
  #define FW_MAX_SIZE 0x40
  #define FW_FILE_NAME"slof.bin"

Signed-off-by: Mike Qiu 




[Qemu-devel] QEMU built from TCC

2007-09-27 Thread Mike

Can QEMU be built from TCC?

Mike




[Qemu-devel] [Bug 818673] Re: virtio: trying to map MMIO memory

2012-02-02 Thread Mike Cao
Does this Bug similiar with
https://bugzilla.redhat.com/show_bug.cgi?id=771390 ?

** Bug watch added: Red Hat Bugzilla #771390
   https://bugzilla.redhat.com/show_bug.cgi?id=771390

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/818673

Title:
  virtio: trying to map MMIO memory

Status in QEMU:
  New

Bug description:
  Qemu host is Core i7, running Linux.  Guest is Windows XP sp3.
  Often, qemu will crash shortly after starting (1-5 minutes) with a statement 
"qemu-system-x86_64: virtio: trying to map MMIO memory"
  This has occured with qemu-kvm 0.14, qemu-kvm 0.14.1, qemu-0.15.0-rc0 and 
qemu 0.15.0-rc1.
  Qemu is started as such:
  qemu-system-x86_64 -cpu host -enable-kvm -pidfile /home/rick/qemu/hds/wxp.pid 
-drive file=/home/rick/qemu/hds/wxp.raw,if=virtio -m 768 -name WinXP -net 
nic,model=virtio -net user -localtime -usb -vga qxl -device virtio-serial 
-chardev spicevmc,name=vdagent,id=vdagent -device 
virtserialport,chardev=vdagent,name=com.redhat.spice.0 -spice 
port=1234,disable-ticketing -daemonize -monitor 
telnet:localhost:12341,server,nowait
  The WXP guest has virtio 1.1.16 drivers for net and scsi, and the most 
current spice binaries from spice-space.org.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/818673/+subscriptions



[Qemu-devel] [Bug 601946] Re: [Feature request] qemu-img multi-threaded compressed image conversion

2012-02-07 Thread Mike Ashton
The compression in this case is certainly chunked already, otherwise you
couldn't implement a pseudo block device without reading the entire
stream to read the last block!  As the data in the new disk is
necessarily chunk compressed, parallelisation is perfect feasible, it's
just a question of the algorithm you use to arbitrate the work between
the threads, which may need some thought as you'd likely be navigating a
tree structure.

There's no question that Jes' suggestion would create a 12x speed up for
me, and there's pretty standard off the shelf server hardware with 48
cores.  As Jan-Simon Möller points out, being single-threaded and
single-process isn't much of an option any more.  If one is trying to
compress, say, a 4TB virtual disk image then using a little over 2% of
the available CPU time meaning you have to wait a week is going to be...
frustrating :)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/601946

Title:
  [Feature request] qemu-img multi-threaded compressed image conversion

Status in QEMU:
  New

Bug description:
  Feature request:
  qemu-img multi-threaded compressed image conversion

  Suppose I want to convert raw image to compressed qcow2. Multi-
  threaded conversion will be much faster, because bottleneck is
  compressing data.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/601946/+subscriptions



Re: [Qemu-devel] [PATCH] net: Allow specifying ifname for qemu-bridge-helper

2012-10-12 Thread Mike Lovell

On 10/12/2012 12:49 AM, Mike Lovell wrote:

This makes a few changes to allow ifname to be specified when using
qemu-bridge-helper with both the bridge and tap network interfaces. It adds
the --ifname option to qemu-bridge-helper, removes the restriction that ifname
cannot be specified with helper for the tap interface, and adds logic to
specify the --ifname option when exec'ing the helper.

Signed-off-by: Mike Lovell 
---

This feature was originally requested by Mario De Chenno on the qemu-devel
mailing list. Seems pretty simple and figured it was something I could throw
together pretty quickly. I have tested the following combinations of invoking
qemu (where qbr is qemu-bridge-helper)

qemu-system-x86_64 -net nic -net tap,helper="qbr --br=test1"
qemu-system-x86_64 -net nic -net tap,helper="qbr --br=test1",ifname=vm1
qemu-system-x86_64 -net nic -net tap,helper=qbr
qemu-system-x86_64 -net nic -net tap,helper=qbr,ifname=vm1
qemu-system-x86_64 -net nic -net bridge,helper=qbr
qemu-system-x86_64 -net nic -net bridge,helper=qbr,ifname=vm1
qemu-system-x86_64 -net nic -net bridge,helper=qbr,ifname=vm1,br=test1
qemu-system-x86_64 -net nic -net bridge,helper=qbr,br=test1


I realized there were a couple more cases to check. I also tested the 
following.


qemu-system-x86_64 -net nic -net tap,helper="qbr --ifname=vm1"
qemu-system-x86_64 -net nic -net tap,helper="qbr --br=test1 --ifname=vm1"
qemu-system-x86_64 -net nic -net tap,helper="qbr --ifname=vm1",ifname=foo
qemu-system-x86_64 -net nic -net tap,helper="qbr --br=test1 
--ifname=vm1",ifname=foo


In the last two cases, the --ifname specified in the helper option (in 
this case, vm1) is used over the tap ifname option (in this case, foo).


mike



[Qemu-devel] [PATCH] net: Allow specifying ifname for qemu-bridge-helper

2012-10-12 Thread Mike Lovell
This makes a few changes to allow ifname to be specified when using
qemu-bridge-helper with both the bridge and tap network interfaces. It adds
the --ifname option to qemu-bridge-helper, removes the restriction that ifname
cannot be specified with helper for the tap interface, and adds logic to
specify the --ifname option when exec'ing the helper.

Signed-off-by: Mike Lovell 
---

This feature was originally requested by Mario De Chenno on the qemu-devel
mailing list. Seems pretty simple and figured it was something I could throw
together pretty quickly. I have tested the following combinations of invoking
qemu (where qbr is qemu-bridge-helper)

qemu-system-x86_64 -net nic -net tap,helper="qbr --br=test1"
qemu-system-x86_64 -net nic -net tap,helper="qbr --br=test1",ifname=vm1
qemu-system-x86_64 -net nic -net tap,helper=qbr
qemu-system-x86_64 -net nic -net tap,helper=qbr,ifname=vm1
qemu-system-x86_64 -net nic -net bridge,helper=qbr 
qemu-system-x86_64 -net nic -net bridge,helper=qbr,ifname=vm1
qemu-system-x86_64 -net nic -net bridge,helper=qbr,ifname=vm1,br=test1
qemu-system-x86_64 -net nic -net bridge,helper=qbr,br=test1

 net/tap.c|   39 ---
 qapi-schema.json |3 ++-
 qemu-bridge-helper.c |   10 +++---
 3 files changed, 37 insertions(+), 15 deletions(-)

diff --git a/net/tap.c b/net/tap.c
index a88ae8f..cfb5bff 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -417,11 +417,13 @@ static int recv_fd(int c)
 return len;
 }
 
-static int net_bridge_run_helper(const char *helper, const char *bridge)
+static int net_bridge_run_helper(const char *helper,
+ const char *bridge,
+ const char *ifname)
 {
 sigset_t oldmask, mask;
 int pid, status;
-char *args[5];
+char *args[6];
 char **parg;
 int sv[2];
 
@@ -439,7 +441,9 @@ static int net_bridge_run_helper(const char *helper, const 
char *bridge)
 int open_max = sysconf(_SC_OPEN_MAX), i;
 char fd_buf[6+10];
 char br_buf[6+IFNAMSIZ] = {0};
-char helper_cmd[PATH_MAX + sizeof(fd_buf) + sizeof(br_buf) + 15];
+char ifname_buf[10+IFNAMSIZ] = {0};
+char helper_cmd[PATH_MAX + sizeof(fd_buf) + sizeof(br_buf) +
+sizeof(ifname_buf) + 15];
 
 for (i = 0; i < open_max; i++) {
 if (i != STDIN_FILENO &&
@@ -459,8 +463,13 @@ static int net_bridge_run_helper(const char *helper, const 
char *bridge)
 snprintf(br_buf, sizeof(br_buf), "%s%s", "--br=", bridge);
 }
 
-snprintf(helper_cmd, sizeof(helper_cmd), "%s %s %s %s",
- helper, "--use-vnet", fd_buf, br_buf);
+if ((strstr(helper, "--ifname=") == NULL) && (ifname != NULL)) {
+snprintf(ifname_buf, sizeof(ifname_buf), "%s%s" ,
+ "--ifname=", ifname);
+}
+
+snprintf(helper_cmd, sizeof(helper_cmd), "%s %s %s %s %s",
+ helper, "--use-vnet", fd_buf, br_buf, ifname_buf);
 
 parg = args;
 *parg++ = (char *)"sh";
@@ -473,12 +482,17 @@ static int net_bridge_run_helper(const char *helper, 
const char *bridge)
 /* assume helper is just the executable path name */
 
 snprintf(br_buf, sizeof(br_buf), "%s%s", "--br=", bridge);
+if (ifname != NULL) {
+snprintf(ifname_buf, sizeof(ifname_buf), "%s%s" ,
+ "--ifname=", ifname);
+}
 
 parg = args;
 *parg++ = (char *)helper;
 *parg++ = (char *)"--use-vnet";
 *parg++ = fd_buf;
 *parg++ = br_buf;
+*parg++ = ifname_buf;
 *parg++ = NULL;
 
 execv(helper, args);
@@ -517,7 +531,7 @@ int net_init_bridge(const NetClientOptions *opts, const 
char *name,
 NetClientState *peer)
 {
 const NetdevBridgeOptions *bridge;
-const char *helper, *br;
+const char *helper, *br, *ifname;
 
 TAPState *s;
 int fd, vnet_hdr;
@@ -527,8 +541,9 @@ int net_init_bridge(const NetClientOptions *opts, const 
char *name,
 
 helper = bridge->has_helper ? bridge->helper : DEFAULT_BRIDGE_HELPER;
 br = bridge->has_br ? bridge->br : DEFAULT_BRIDGE_INTERFACE;
+ifname = bridge->has_ifname ? bridge->ifname : NULL;
 
-fd = net_bridge_run_helper(helper, br);
+fd = net_bridge_run_helper(helper, br, ifname);
 if (fd == -1) {
 return -1;
 }
@@ -622,14 +637,16 @@ int net_init_tap(const NetClientOptions *opts, const char 
*name,
 model = "tap";
 
 } else if (tap->has_helper) {
-if (tap->has_ifname || tap->has_script || tap->h

Re: [Qemu-devel] [PATCH] net: Allow specifying ifname for qemu-bridge-helper

2012-10-12 Thread Mike Lovell

On 10/12/2012 02:32 AM, Michael Tokarev wrote:

On 12.10.2012 10:49, Mike Lovell wrote:

  /* request a tap device, disable PI, and add vnet header support if
- * requested and it's available. */
-prep_ifreq(&ifr, "tap%d");
+ * requested and it's available. use ifname if provided for tap name. */
+prep_ifreq(&ifr, ifname != NULL ? ifname : "tap%d");

Should we check for special symbols here? prep_ifreq() does this:

 snprintf(ifr->ifr_name, IFNAMSIZ, "%s", ifname);

so at least it ensures we have length constraint.


I tried the code as is with specifying ifnames with various random 
combinations of special characters. Some of them we just allowed 
through, some caused an error when initializing the tap device, and some 
cause problems in the shell invoking qemu. I think the linux kernel does 
the necessary checking during the TUNSETIFF ioctl and the 
qemu-bridge-helper exits with an error if there was a problem.



Actually I'm not so sure anymore this is a good idea.
For example, system may have firewall (iptables) rules
in place for, say, future ppp interfaces for ppp clients,
and this way we may request the interface to be named
pppX and be allowed to send packets where we don't usually
have access to.


While I admit this does have that possibility, I'm not sure its a qemu 
problem. I don't know what the original motivation for the request was 
but I could see this being the reason for the request. Some 
administrator sets up firewall rules for a variety of guests ahead of 
actually running them and needs to specify the interface at runtime. 
Also, without using the helper programs, the qemu already allows 
specifying arbitrary names such as ppp0.



Maybe - at least - require some common prefix for the
interfaces created this way, so we'll live in our own,
easily distinguishable namespace -- like, qvif* (from
Qemu Virtual InterFace)?


I do like the idea of using a common prefix for the default name of tap 
devices. Something like "qvif%d" instead of "tap%d" in tap 
initialization code. But something tells me this could break 
compatibility with external management software where something might be 
expecting the interface name to start with tap.


mike



[Qemu-devel] centos 6.3 and nested kvm

2012-10-30 Thread Mike Lovell
i have been using kvm in an nested environment for a couple months now. 
things have appeared to be working fine. i have used both intel 
(westmere-ep) hosts and, when starting the first level guest, i have 
specified '-cpu qemu64,+vmx' to pass the kvm support through. when using 
a centos 6.2 guest, i am able to load the kvm and kvm_intel modules and 
use it. the problem is today i tried updating my centos 6.2 guest to 6.3 
and now the kvm_intel module fails to load in this guest. there aren't 
any kernel messages that i can see. does anyone on the list have an idea 
of what might have changed. the 6.2 system had 2.6.32-220.el6.x86_64 as 
the kernel version and 6.3 has 2.6.32-279.11.1.el6. the exact error i 
get when trying to load the module is


FATAL: Error inserting kvm_intel 
(/lib/modules/2.6.32-279.11.1.el6.x86_64/kernel/arch/x86/kvm/kvm-intel.ko): 
Input/output error


anyone know of anything that might have changed in the kvm_intel module 
or elsewhere that would cause this to break?


mike



Re: [Qemu-devel] centos 6.3 and nested kvm

2012-10-30 Thread Mike Lovell

On 10/30/2012 02:02 PM, Mike Lovell wrote:
i have been using kvm in an nested environment for a couple months 
now. things have appeared to be working fine. i have used both intel 
(westmere-ep) hosts and, when starting the first level guest, i have 
specified '-cpu qemu64,+vmx' to pass the kvm support through. when 
using a centos 6.2 guest, i am able to load the kvm and kvm_intel 
modules and use it. the problem is today i tried updating my centos 
6.2 guest to 6.3 and now the kvm_intel module fails to load in this 
guest. there aren't any kernel messages that i can see. does anyone on 
the list have an idea of what might have changed. the 6.2 system had 
2.6.32-220.el6.x86_64 as the kernel version and 6.3 has 
2.6.32-279.11.1.el6. the exact error i get when trying to load the 
module is


FATAL: Error inserting kvm_intel 
(/lib/modules/2.6.32-279.11.1.el6.x86_64/kernel/arch/x86/kvm/kvm-intel.ko): 
Input/output error


anyone know of anything that might have changed in the kvm_intel 
module or elsewhere that would cause this to break?


i probably should have specified this before but i'm using qemu 1.1.2 on 
the host.


mike



[Qemu-devel] net: RFC New Socket-Based, Switched Network Backend (QDES)

2012-06-25 Thread Mike Lovell

Hi all,
Here is something I've been tinkering with the past few weeks and now have it
in a state where the basic idea makes sense, it works, and could use some
feedback from the community.

This is what I've been calling QDES or QEMU Distributed Ethernet Switch. I
first had the idea when I was playing with the udp and mcast socket network
backends while exploring how to build a VM infrastructure. I liked the idea of
using the sockets backends cause it doesn't require escalated permissions to
configure and run as well as the ability to talk over IP networks.

But the built in socket backends either allowed for only 2 guests talking
directly or for multiple guests where all traffic is sent to all. So one can
either have two guests talking or have bandwidth wasted with multiple guests.
There wasn't something that could talk to multiple guests but also utilize
unicast traffic.

So I made a backend that can do this. It takes the basics of how the udp and
mcast socket backends work and combines them with some switching based on the
ethernet packets. The result is multiple guests can talk to each other but
not waste bandwidth by delivering unicast traffic to all guests. The backend
also adds some header data to each packet. This header includes a network
identifier so multiple logical networks can be created using the same
multicast configuration but still have separation in the guests.

Its kind of like VXLAN or NVGRE but replace the GRE tunnels with UDP packets.

There are a couple advantages that I see to this. It allows for multiple guests
in multiple locations to talk to each other while keeping unicast traffic to
just between two hosts. It doesn't require root permissions to run. It can
operate over non-ethernet networks (like IPoIB). It doesn't require changing
network configuration on the host. It allows for a ton of logical networks to
be created (currently 65536 per multicast address and port combination).

There are a few disadvantages as well. It does add some more processing to the
QEMU process but not much (I saw it go as fast as the socket backends). It is
encapsulating an Ethernet frame inside a UDP packet so there is the overhead of
the IP and UDP headers as well as the transport medium headers (most likely
Ethernet again). Because there is additional header data and MTU of the guest
could be limited depending on the ability to send larger multicast packet from
the host. (I haven't really looked closely at this last one). There isn't the
ability for something besides QEMU processes to communicate using this, though
I hope to build a utility to work with a tap device.

Overall, I think this is something that's pretty cool. I don't know how much
people give any thought to the socket backends for real world use and so I
don't know if this would be of much use to anyone. I am looking for some
feedback into what the community thinks and for comments about the code. Its
only my second time doing more than 20 lines of C so I'm sure I did some stupid
things. I have only tested on 64 bit x86 Linux systems so far.

Hopefully you all have good things to say. :)

mike



[Qemu-devel] [PATCH] Initial commit for QDES - QEMU Distributed Ethernet Switch

2012-06-25 Thread Mike Lovell
This commit adds a new network backend to QEMU. It combines the basic behavior
of the unicast udp and multicast socket backends with some intelligence about
the source and destination of the packets. It also adds a header to the
packets to allow for creating multiple logical networks using the same
underlying network infrastructure. (Kind of like GRE Keys). This provides a
network backend that acts as a type of Ethernet switch.

During initialization, QDES will create two sockets. One that is used for
receiving unicast udp traffic and for sending both unicast and multicast udp
traffic. The second socket is for receiving multicast udp traffic. A
GHashTable is also created that will serve as an address table for where
packets should be delivered. A timer is also configured to regularly clean the
contents of the address table.

When a packet is received by either of the two sockets, the header is parsed
and checked to make sure the packet is a member of the correct logical
network. Then the ethernet frame that is received is inspected for the source
MAC address. The address table is then updated with the source MAC address,
the address of where the packets was received from, and the current time.

When QEMU delivers a packet to be sent, the destination MAC address is looked
for in the address table. If it is found, the packet is sent to the remote
address stored in the table along with the approriate header. If the address
is not found, the packet and header is sent to the configured multicast
address so all members of the network will receive the packet.

Unfortunately, only IPv4 is currently supported. IPv6 is on the short list of
improvements to be made.

Signed-off-by: Mike Lovell 
---
 hmp-commands.hx   |2 +-
 net.c |   31 +++-
 net.h |1 +
 net/Makefile.objs |1 +
 net/qdes.c|  453 +
 net/qdes.h|   38 +
 net/socket.c  |3 +-
 net/socket.h  |3 +
 qemu-options.hx   |5 +
 9 files changed, 533 insertions(+), 4 deletions(-)
 create mode 100644 net/qdes.c
 create mode 100644 net/qdes.h

diff --git a/hmp-commands.hx b/hmp-commands.hx
index f5d9d91..042bb85 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1007,7 +1007,7 @@ ETEXI
 {
 .name   = "host_net_add",
 .args_type  = "device:s,opts:s?",
-.params = "tap|user|socket|vde|dump [options]",
+.params = "tap|user|socket|qdes|vde|dump [options]",
 .help   = "add host VLAN client",
 .mhandler.cmd = net_host_device_add,
 },
diff --git a/net.c b/net.c
index 4aa416c..dbb4a48 100644
--- a/net.c
+++ b/net.c
@@ -30,6 +30,7 @@
 #include "net/dump.h"
 #include "net/slirp.h"
 #include "net/vde.h"
+#include "net/qdes.h"
 #include "net/util.h"
 #include "monitor.h"
 #include "qemu-common.h"
@@ -1016,6 +1017,30 @@ static const struct {
 { /* end of list */ }
 },
 },
+[NET_CLIENT_TYPE_QDES] = {
+.type = "qdes",
+.init = net_init_qdes,
+.desc = {
+NET_COMMON_PARAMS_DESC,
+{
+.name = "timer",
+.type = QEMU_OPT_NUMBER,
+.help = "Seconds between cleaning mac address table"
+}, {
+.name = "mcast",
+.type = QEMU_OPT_STRING,
+.help = "UDP multicast address and port number",
+}, {
+.name = "localaddr",
+.type = QEMU_OPT_STRING,
+.help = "source address for multicast and udp packets",
+}, {
+.name = "network",
+.type = QEMU_OPT_NUMBER,
+.help = "qdes network number",
+},
+},
+},
 #ifdef CONFIG_VDE
 [NET_CLIENT_TYPE_VDE] = {
 .type = "vde",
@@ -1104,7 +1129,8 @@ int net_client_init(QemuOpts *opts, int is_netdev, Error 
**errp)
 #ifdef CONFIG_VDE
 strcmp(type, "vde") != 0 &&
 #endif
-strcmp(type, "socket") != 0) {
+strcmp(type, "socket") != 0 &&
+strcmp(type, "qdes") != 0) {
 error_set(errp, QERR_INVALID_PARAMETER_VALUE, "type",
   "a netdev backend type");
 return -1;
@@ -1170,7 +1196,7 @@ int net_client_init(QemuOpts *opts, int is_netdev, Error 
**errp)
 static int net_host_check_device(const char *device)
 {
 int i;
-const char *valid_param_list[] = { "tap", "socket", "dump"
+const char *valid_param_list[] = { "tap", "socket", "dump", "qdes"
 #ifdef CONFIG_NET_BRIDGE
   

Re: [Qemu-devel] [PATCH] Initial commit for QDES - QEMU Distributed Ethernet Switch

2012-06-25 Thread Mike Lovell
I'm not sure why but it looks like my intro email for this got eaten by 
something. Here it is again and sorry if it shows up twice. This is my 
first time posting to the list and submitting a patch and I guess 
something doesn't like the way I did it.


-
Hi all,
Here is something I've been tinkering with the past few weeks and now 
have it

in a state where the basic idea makes sense, it works, and could use some
feedback from the community.

This is what I've been calling QDES or QEMU Distributed Ethernet Switch. I
first had the idea when I was playing with the udp and mcast socket network
backends while exploring how to build a VM infrastructure. I liked the 
idea of

using the sockets backends cause it doesn't require escalated permissions to
configure and run as well as the ability to talk over IP networks.

But the built in socket backends either allowed for only 2 guests talking
directly or for multiple guests where all traffic is sent to all. So one can
either have two guests talking or have bandwidth wasted with multiple 
guests.

There wasn't something that could talk to multiple guests but also utilize
unicast traffic.

So I made a backend that can do this. It takes the basics of how the udp and
mcast socket backends work and combines them with some switching based 
on the

ethernet packets. The result is multiple guests can talk to each other but
not waste bandwidth by delivering unicast traffic to all guests. The backend
also adds some header data to each packet. This header includes a network
identifier so multiple logical networks can be created using the same
multicast configuration but still have separation in the guests.

There are a couple advantages that I see to this. It allows for multiple 
guests

in multiple locations to talk to each other while keeping unicast traffic to
just between two hosts. It doesn't require root permissions to run. It can
operate over non-ethernet networks (like IPoIB). It doesn't require changing
network configuration on the host. It allows for a ton of logical 
networks to

be created (currently 65536 per multicast address and port combination).

There are a few disadvantages as well. It does add some more processing 
to the
QEMU process but not much (I saw it go as fast as the socket backends). 
It is
encapsulating an Ethernet frame inside a UDP packet so there is the 
overhead of

the IP and UDP headers as well as the transport medium headers (most likely
Ethernet again). Because there is additional header data and MTU of the 
guest
could be limited depending on the ability to send larger multicast 
packet from
the host. (I haven't really looked closely at this last one). There 
isn't the
ability for something besides QEMU processes to communicate using this, 
though

I hope to build a utility to work with a tap device.

Overall, I think this is something that's pretty cool. I don't know how much
people give any thought to the socket backends for real world use and so I
don't know if this would be of much use to anyone. I am looking for some
feedback into what the community thinks and for comments about the code. Its
only my second time doing more than 20 lines of C so I'm sure I did some 
stupid

things. I have only tested on 64 bit x86 Linux systems so far.

Hopefully you all have good things to say. :)

mike



Re: [Qemu-devel] net: RFC New Socket-Based, Switched Network Backend (QDES)

2012-06-25 Thread Mike Lovell

On 06/25/2012 04:40 AM, Stefan Hajnoczi wrote:

On Mon, Jun 25, 2012 at 6:42 AM, Mike Lovell  wrote:

This is what I've been calling QDES or QEMU Distributed Ethernet Switch. I
first had the idea when I was playing with the udp and mcast socket network
backends while exploring how to build a VM infrastructure. I liked the idea of
using the sockets backends cause it doesn't require escalated permissions to
configure and run as well as the ability to talk over IP networks.

But the built in socket backends either allowed for only 2 guests talking
directly or for multiple guests where all traffic is sent to all. So one can
either have two guests talking or have bandwidth wasted with multiple guests.
There wasn't something that could talk to multiple guests but also utilize
unicast traffic.

Have you looked at QEMU's net/vde.c backend?

Does VDE (http://vde.sourceforge.net/) already do everything that QDES does?

Stefan


I have looked at VDE and used it for a few things. I think QDES has an 
advantage over VDE in that it doesn't require additional steps to get up 
and running. With VDE, one has to at least start a vde_switch process on 
the host for the guest to connect to. Then, if multiple hosts are being 
used, the multiple vde_switch processes have to be connected. Maybe 
through some vde_plug processes connected through netcat or ssh. Its a 
few extra steps that can be a pain in a dynamic environment. With QDES, 
the only configuration should be the options to QEMU as long as the 
multiple hosts can send and receive data through the same multicast 
address.


QDES doesn't have anywhere close to the feature set that VDE does. This 
doesn't do VLANs at the switching level. It doesn't support STP so 
connecting it to other bridges might cause weird behavior. Etc. But it 
does provide a really easy way for multiple guests on multiple hosts to 
be connected.


mike



Re: [Qemu-devel] net: RFC New Socket-Based, Switched Network Backend (QDES)

2012-06-25 Thread Mike Lovell

On 06/25/2012 09:33 AM, Mike Lovell wrote:

On 06/25/2012 04:40 AM, Stefan Hajnoczi wrote:

Have you looked at QEMU's net/vde.c backend?

Does VDE (http://vde.sourceforge.net/) already do everything that 
QDES does?


Stefan


I have looked at VDE and used it for a few things. I think QDES has an 
advantage over VDE in that it doesn't require additional steps to get 
up and running. With VDE, one has to at least start a vde_switch 
process on the host for the guest to connect to. Then, if multiple 
hosts are being used, the multiple vde_switch processes have to be 
connected. Maybe through some vde_plug processes connected through 
netcat or ssh. Its a few extra steps that can be a pain in a dynamic 
environment. With QDES, the only configuration should be the options 
to QEMU as long as the multiple hosts can send and receive data 
through the same multicast address.


QDES doesn't have anywhere close to the feature set that VDE does. 
This doesn't do VLANs at the switching level. It doesn't support STP 
so connecting it to other bridges might cause weird behavior. Etc. But 
it does provide a really easy way for multiple guests on multiple 
hosts to be connected.


Oh. I forgot another reason why I decided to do this over using VDE. 
I'll do this one with an example. Say you have 3 virtual machines on 3 
different hosts. Each host has a vde_switch process running, Switch A, 
B, and C. Each vde_switch has connections to the other 2 through some 
vde_plug's and netcat. In this case, VDE will disable one of the links 
between switches to prevent loops, say the link between Switch A and C. 
Traffic from the VM connected to Switch C that is destined for the VM on 
Switch A will have to traverse through Switch B. This is a suboptimal 
traffic flow. Especially when you consider that the traffic has to flow 
through 3 to 4 additional processes on each host for each direction.


QDES is able to send traffic directly to the correct destination because 
it isn't using something like STP. Connecting it to a regular 802.1D 
switching network would probably result in bad things. There aren't 
loops in QDES because individual packets only flow one way through it. 
Packets received from the sockets aren't sent back out the sockets.


I hope that makes sense.

mike



Re: [Qemu-devel] net: RFC New Socket-Based, Switched Network Backend (QDES)

2012-06-26 Thread Mike Lovell

On 06/26/2012 02:29 AM, Stefan Hajnoczi wrote:

On Mon, Jun 25, 2012 at 5:32 PM, Mike Lovell  wrote:


Oh. I forgot another reason why I decided to do this over using VDE. I'll do
this one with an example. Say you have 3 virtual machines on 3 different
hosts. Each host has a vde_switch process running, Switch A, B, and C. Each
vde_switch has connections to the other 2 through some vde_plug's and
netcat. In this case, VDE will disable one of the links between switches to
prevent loops, say the link between Switch A and C. Traffic from the VM
connected to Switch C that is destined for the VM on Switch A will have to
traverse through Switch B. This is a suboptimal traffic flow. Especially
when you consider that the traffic has to flow through 3 to 4 additional
processes on each host for each direction.

I haven't tried VDE myself but this sounds odd.  Why can't you run a
single vde_switch instance and connect multiple guests to it (with
netcat)?


you can connect multiple guests to a single vde_switch. as i understand 
it, the communication happens over a local unix domain socket. this 
would limit the guests on the same switch to the same host. so if you 
want guests on multiple hosts to talk using vde, i think you need a 
vde_switch on each host and then connect the vde_switch processes.


mike



Re: [Qemu-devel] net: RFC New Socket-Based, Switched Network Backend (QDES)

2012-06-27 Thread Mike Lovell

On 06/27/2012 02:26 AM, Stefan Hajnoczi wrote:

On Tue, Jun 26, 2012 at 5:48 PM, Mike Lovell  wrote:

you can connect multiple guests to a single vde_switch. as i understand it,
the communication happens over a local unix domain socket. this would limit
the guests on the same switch to the same host. so if you want guests on
multiple hosts to talk using vde, i think you need a vde_switch on each host
and then connect the vde_switch processes.

It can be done with socat or netcat (unix_domain_socket_a <-> TCP <->
unix_domain_socket_b):
http://www.dest-unreach.org/socat/doc/socat.html

Another idea is to take the QDES code and turn it into a freestanding
program that speaks the net/socket.c protocol.  That way it works with
existing QEMUs: launch the qdes daemon, then launch qemu -netdev
socket,connect=qdes-host:qdes-port.


okay. so yes that would be possible. its complicated by the fact that 
there are multiple unix domain sockets used between a vde client 
application (like QEMU) and the vde_switch process and some of these are 
dynamically created. clients connect to a control socket for the switch, 
negotiate the creation of a new unix socket, and then pass traffic over 
the new socket. whatever is starting the vde_switches and the qemu 
guests will have to be aware of this and connect them.


this goes back to one of my original points. setting this up on multiple 
physical hosts is more complex than QDES is. with QDES, you just need to 
specify a ip and port on the physical host that is unique to each 
process as well as specify a multicast ip and port thats the same for 
all. no figuring out where the other virtual machines are running, 
spawning additional processes, and connecting all of those processes 
together. (i hope to eventually remove the need to specify the local 
address and port)


also, connecting multiple guests on multiple hosts to a single 
vde_switch instance results in a sub-optimal traffic flow. traffic from 
a guest on one host will have to go to the host running the vde_switch 
and then to the host with the destination guest. with QDES, the traffic 
goes directly to the host running the destination guest.


i actually did my initial prototyping of QDES using an external python 
app that talks to QEMU using the socket network backend. i used the udp 
one instead of connect but it still worked. some of the future work i 
intend to do is to have a few external applications that can talk to 
other QDES as well. i want to do one that can forward packets between a 
tap device and other QEMU processes so that regular systems can talk to 
the QEMU virtual machines. doing another that talks to the QEMU sockets 
backend would be possible as well. i do think there is some value in 
having it built into QEMU in the ease of use.


mike



[Qemu-devel] [PATCH] flatload: fix bss clearing

2012-07-09 Thread Mike Frysinger
The current bss clear logic assumes the target mmap address and host
address are the same.  Use g2h to translate from the target address
space to the host so we can call memset on it.

Signed-off-by: Mike Frysinger 
---
 linux-user/flatload.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/linux-user/flatload.c b/linux-user/flatload.c
index be79496..58f679e 100644
--- a/linux-user/flatload.c
+++ b/linux-user/flatload.c
@@ -660,7 +660,7 @@ static int load_flat_file(struct linux_binprm * bprm,
 }
 
 /* zero the BSS.  */
-memset((void *)((unsigned long)datapos + data_len), 0, bss_len);
+memset(g2h(datapos + data_len), 0, bss_len);
 
 return 0;
 }
-- 
1.7.7.3




Re: [Qemu-devel] [PATCH] flatload: fix bss clearing

2012-07-09 Thread Mike Frysinger
On Monday 09 July 2012 09:21:52 Andreas Färber wrote:
> Am 09.07.2012 15:04, schrieb Mike Frysinger:
> > The current bss clear logic assumes the target mmap address and host
> > address are the same.  Use g2h to translate from the target address
> > space to the host so we can call memset on it.
> 
> Patch looks sensible. Are you working on rebasing your Blackfin target
> to QOM and AREG0?

i've rebased them to the latest release (1.1.0).  FDPIC seems to work fine, as 
does basic ELF, but FLAT gets into an infinite loop and i haven't figured out 
why just yet.
-mike


signature.asc
Description: This is a digitally signed message part.


[Qemu-devel] [PATCH] fix gcc warnings when RESERVED_VA is 0

2012-09-15 Thread Mike Frysinger
The current code, while correct, triggers a bunch of gcc warnings when
RESERVED_VA is 0 like so:
linux-user/syscall.c: In function 'do_shmat':
linux-user/syscall.c:3058: warning: comparison of unsigned expression < 0 is 
always false
linux-user/syscall.c: In function 'open_self_maps':
linux-user/syscall.c:4960: warning: comparison of unsigned expression < 0 is 
always false
linux-user/syscall.c:4960: warning: comparison of unsigned expression < 0 is 
always false

Signed-off-by: Mike Frysinger 
---
 cpu-all.h |8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/cpu-all.h b/cpu-all.h
index 5e07d28..0e5dcf0 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -202,10 +202,16 @@ extern unsigned long reserved_va;
 #if HOST_LONG_BITS <= TARGET_VIRT_ADDR_SPACE_BITS
 #define h2g_valid(x) 1
 #else
+/* Gcc likes to warn about comparing unsigned longs to < 0, so cpp it away.  */
+# if RESERVED_VA
+#  define _h2g_reserved_va(x) ((x) < RESERVED_VA)
+# else
+#  define _h2g_reserved_va(x) 1
+# endif
 #define h2g_valid(x) ({ \
 unsigned long __guest = (unsigned long)(x) - GUEST_BASE; \
 (__guest < (1ul << TARGET_VIRT_ADDR_SPACE_BITS)) && \
-(!RESERVED_VA || (__guest < RESERVED_VA)); \
+_h2g_reserved_va(__guest); \
 })
 #endif
 
-- 
1.7.9.7




[Qemu-devel] [PATCH] fix warnings from printf target addresses

2012-09-15 Thread Mike Frysinger
Current code triggers:
memory.c: In function 'invalid_read':
memory.c:1001: warning: format '%#x' expects type 'unsigned int',
but argument 4 has type 'target_phys_addr_t'
memory.c: In function 'invalid_write':
memory.c:1013: warning: format '%#x' expects type 'unsigned int',
    but argument 4 has type 'target_phys_addr_t'

Signed-off-by: Mike Frysinger 
---
 memory.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/memory.c b/memory.c
index 58a242d..7d5f4a3 100644
--- a/memory.c
+++ b/memory.c
@@ -998,7 +998,8 @@ static uint64_t invalid_read(void *opaque, 
target_phys_addr_t addr,
 MemoryRegion *mr = opaque;
 
 if (!mr->warning_printed) {
-fprintf(stderr, "Invalid read from memory region %s at offset %#x\n", 
mr->name, addr);
+fprintf(stderr, "Invalid read from memory region %s at offset %#llx\n",
+mr->name, (unsigned long long)addr);
 mr->warning_printed = true;
 }
 return -1U;
@@ -1010,7 +1011,8 @@ static void invalid_write(void *opaque, 
target_phys_addr_t addr, uint64_t data,
 MemoryRegion *mr = opaque;
 
 if (!mr->warning_printed) {
-fprintf(stderr, "Invalid write to memory region %s at offset %#x\n", 
mr->name, addr);
+fprintf(stderr, "Invalid write to memory region %s at offset %#llx\n",
+mr->name, (unsigned long long)addr);
 mr->warning_printed = true;
 }
 }
-- 
1.7.9.7




[Qemu-devel] [PATCH] allow make {dist,}clean work w/out configure

2012-09-15 Thread Mike Frysinger
There's no reason to require configure to run before running a clean
target, so check MAKECMDGOALS before.

Signed-off-by: Mike Frysinger 
---
 Makefile |4 
 1 file changed, 4 insertions(+)

diff --git a/Makefile b/Makefile
index 1cd5bc8..e75740c 100644
--- a/Makefile
+++ b/Makefile
@@ -14,9 +14,11 @@ config-host.mak: $(SRC_PATH)/configure
@sed -n "/.*Configured with/s/[^:]*: //p" $@ | sh
 else
 config-host.mak:
+ifeq ($(findstring clean,$(MAKECMDGOALS)),)
@echo "Please call configure before running make!"
@exit 1
 endif
+endif
 
 GENERATED_HEADERS = config-host.h trace.h qemu-options.def
 ifeq ($(TRACE_BACKEND),dtrace)
@@ -398,7 +400,9 @@ qemu-doc.dvi qemu-doc.html qemu-doc.info qemu-doc.pdf: \
 
 # Add a dependency on the generated files, so that they are always
 # rebuilt before other object files
+ifeq ($(findstring clean,$(MAKECMDGOALS)),)
 Makefile: $(GENERATED_HEADERS)
+endif
 
 # Include automatically generated dependency files
 # Dependencies in Makefile.objs files come from our recursive subdir rules
-- 
1.7.9.7




Re: [Qemu-devel] [PATCH] fix warnings from printf target addresses

2012-09-16 Thread Mike Frysinger
On Sunday 16 September 2012 03:24:22 Blue Swirl wrote:
> On Sun, Sep 16, 2012 at 12:05 AM, Mike Frysinger wrote:
> > Current code triggers:
> > memory.c: In function 'invalid_read':
> > memory.c:1001: warning: format '%#x' expects type 'unsigned int',
> > but argument 4 has type 'target_phys_addr_t'
> > memory.c: In function 'invalid_write':
> > memory.c:1013: warning: format '%#x' expects type 'unsigned int',
> > but argument 4 has type 'target_phys_addr_t'
> > 
> > -fprintf(stderr, "Invalid read from memory region %s at offset
> > %#x\n", mr->name, addr); +fprintf(stderr, "Invalid read from
> > memory region %s at offset %#llx\n", +mr->name,
> > (unsigned long long)addr);
> 
> The right way is not adding potentially dangerous casts

they're not dangerous here at all

> but using TARGET_PRIxPHYS or TARGET_FMT_plx.

np.  i just copied what was already in memory.c, but i guess that's broken 
too.
-mike


signature.asc
Description: This is a digitally signed message part.


[Qemu-devel] [PATCH] include address in invalid memory region accesses

2012-09-16 Thread Mike Frysinger
The current code to display invalid memory accesses isn't terribly useful
as it doesn't tell you what address is actually being accessed.  Include
it in the error message.

Signed-off-by: Mike Frysinger 
---
 memory.c |8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/memory.c b/memory.c
index d528d1f..0ea0320 100644
--- a/memory.c
+++ b/memory.c
@@ -998,7 +998,9 @@ static uint64_t invalid_read(void *opaque, 
target_phys_addr_t addr,
 MemoryRegion *mr = opaque;
 
 if (!mr->warning_printed) {
-fprintf(stderr, "Invalid read from memory region %s\n", mr->name);
+fprintf(stderr,
+"Invalid read from memory region %s at %#" TARGET_PRIxPHYS 
"\n",
+mr->name, addr);
 mr->warning_printed = true;
 }
 return -1U;
@@ -1010,7 +1012,9 @@ static void invalid_write(void *opaque, 
target_phys_addr_t addr, uint64_t data,
 MemoryRegion *mr = opaque;
 
 if (!mr->warning_printed) {
-fprintf(stderr, "Invalid write to memory region %s\n", mr->name);
+fprintf(stderr,
+"Invalid write to memory region %s at %#" TARGET_PRIxPHYS "\n",
+mr->name, addr);
 mr->warning_printed = true;
 }
 }
-- 
1.7.9.7




Re: [Qemu-devel] [PATCH] configure: do not quote $PKG_CONFIG

2012-09-16 Thread Mike Frysinger
On Sunday 15 July 2012 15:54:51 Stefan Weil wrote:
> Am 15.07.2012 22:26, schrieb Mike Frysinger:
> > We should not quote the PKG_CONFIG setting as this deviates from the
> > canonical upstream behavior that gets integrated with all other build
> > systems, and deviates from how we treat all other toolchain variables
> > that we get from the environment.
> > 
> > Ultimately, the point is that it breaks passing custom flags directly
> > to pkg-config via the env var where this normally works elsewhere,
> > and it used to work in the past.

ping ...
-mike


signature.asc
Description: This is a digitally signed message part.


Re: [Qemu-devel] [PATCH] allow make {dist, }clean work w/out configure

2012-09-16 Thread Mike Frysinger
On Sunday 16 September 2012 05:52:47 Paolo Bonzini wrote:
> Il 16/09/2012 02:30, Mike Frysinger ha scritto:
> > There's no reason to require configure to run before running a clean
> > target, so check MAKECMDGOALS before.
> > 
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -14,9 +14,11 @@ config-host.mak: $(SRC_PATH)/configure
> > @sed -n "/.*Configured with/s/[^:]*: //p" $@ | sh
> >  else
> >  config-host.mak:
> > +ifeq ($(findstring clean,$(MAKECMDGOALS)),)
> 
> Please use instead:
> 
> ifneq ($(filter-out %clean,$(MAKECMDGOALS)),)
> 
> so that "make clean all" still gives error.

that breaks `make` though.  or rather, `make` no longer errors out.  need to 
make it a little more complicated to handle this.
-mike


signature.asc
Description: This is a digitally signed message part.


[Qemu-devel] [PATCH v2] allow make {dist, }clean work w/out configure

2012-09-16 Thread Mike Frysinger
There's no reason to require configure to run before running a clean
target, so check MAKECMDGOALS before.

Signed-off-by: Mike Frysinger 
---
v2
- handle edge cases

 Makefile |4 
 1 file changed, 4 insertions(+)

diff --git a/Makefile b/Makefile
index 1cd5bc8..0a37369 100644
--- a/Makefile
+++ b/Makefile
@@ -14,9 +14,11 @@ config-host.mak: $(SRC_PATH)/configure
@sed -n "/.*Configured with/s/[^:]*: //p" $@ | sh
 else
 config-host.mak:
+ifneq ($(filter-out %clean,$(MAKECMDGOALS)),$(if $(MAKECMDGOALS),,fail))
@echo "Please call configure before running make!"
@exit 1
 endif
+endif
 
 GENERATED_HEADERS = config-host.h trace.h qemu-options.def
 ifeq ($(TRACE_BACKEND),dtrace)
@@ -398,7 +400,9 @@ qemu-doc.dvi qemu-doc.html qemu-doc.info qemu-doc.pdf: \
 
 # Add a dependency on the generated files, so that they are always
 # rebuilt before other object files
+ifneq ($(filter-out %clean,$(MAKECMDGOALS)),$(if $(MAKECMDGOALS),,fail))
 Makefile: $(GENERATED_HEADERS)
+endif
 
 # Include automatically generated dependency files
 # Dependencies in Makefile.objs files come from our recursive subdir rules
-- 
1.7.9.7




[Qemu-devel] [PATCH] ignore more generated files

2012-09-16 Thread Mike Frysinger
Signed-off-by: Mike Frysinger 
---
 .gitignore |8 
 1 file changed, 8 insertions(+)

diff --git a/.gitignore b/.gitignore
index 824c0d2..25c134e 100644
--- a/.gitignore
+++ b/.gitignore
@@ -11,6 +11,7 @@ trace-dtrace.dtrace
 *-darwin-user
 *-linux-user
 *-bsd-user
+libcacard/vscclient
 libdis*
 libhw32
 libhw64
@@ -51,7 +52,11 @@ test-string-output-visitor
 test-visitor-serialization
 fsdev/virtfs-proxy-helper.1
 fsdev/virtfs-proxy-helper.pod
+a.out*
 .gdbinit
+.gdb_history
+core
+gmon.out
 *.a
 *.aux
 *.cp
@@ -79,6 +84,9 @@ fsdev/virtfs-proxy-helper.pod
 *.orig
 .pc
 patches
+*.diff
+*.patch
+*.rej
 pc-bios/bios-pq/status
 pc-bios/vgabios-pq/status
 pc-bios/optionrom/linuxboot.bin
-- 
1.7.9.7




Re: [Qemu-devel] [PATCH] ignore more generated files

2012-09-16 Thread Mike Frysinger
On Sunday 16 September 2012 16:33:15 Stefan Weil wrote:
> Am 16.09.2012 22:11, schrieb Mike Frysinger:
> > +libcacard/vscclient
> > +a.out*
> > +.gdb_history
> > +core
> > +gmon.out
> > +*.diff
> > +*.patch
> > +*.rej
> 
> IMHO .gitignore should only contain files which are generated
> by a normal QEMU build.
> 
> a.out*, .gdb_history, core, *.diff, *patch, *.rej are not QEMU specific
> and should be ignored in the user's GIT configuration.

these files show up a lot when doing development on qemu, hence i think they do 
make sense to be listed here.  it also makes things "just work" for all users 
rather than requiring every one to set up their local system in the same way.  
plus, that would conflict with repos that do want to merge these types of files 
(uncommon, but not unheard of as test inputs).

it's not like entries in this file "cost" anything at all.
-mike


signature.asc
Description: This is a digitally signed message part.


Re: [Qemu-devel] [PATCH] ignore more generated files

2012-09-17 Thread Mike Frysinger
On Monday 17 September 2012 03:19:54 Jan Kiszka wrote:
> On 2012-09-16 22:55, Mike Frysinger wrote:
> > On Sunday 16 September 2012 16:33:15 Stefan Weil wrote:
> >> Am 16.09.2012 22:11, schrieb Mike Frysinger:
> >>> +libcacard/vscclient +a.out* +.gdb_history +core +gmon.out
> >>> +*.diff +*.patch +*.rej
> >> 
> >> IMHO .gitignore should only contain files which are generated by
> >> a normal QEMU build.
> >> 
> >> a.out*, .gdb_history, core, *.diff, *patch, *.rej are not QEMU
> >> specific and should be ignored in the user's GIT configuration.
> > 
> > these files show up a lot when doing development on qemu, hence i
> > think they do make sense to be listed here.  it also makes things
> > "just work" for all users rather than requiring every one to set
> > up their local system in the same way. plus, that would conflict
> > with repos that do want to merge these types of files (uncommon,
> > but not unheard of as test inputs).
> > 
> > it's not like entries in this file "cost" anything at all.
> 
> NAK. I'd like to see what is polluting my repositories, not ignoring
> it because that's the policy of someone else. Excluding *.rej is
> something I would _never_ do. If they lie around somewhere, something
> was not merged here - that's my policy.

if something wasn't merged, your git command would have told you and spit an 
error.  same with patch.  if .rej files are showing up and you're not noticing 
at the time of patch application, it's because you're using the tools wrong.
-mike


signature.asc
Description: This is a digitally signed message part.


[Qemu-devel] [RFC PATCH 0/3] v2.2 RCU Implementation for QEMU

2013-08-16 Thread Mike Day
This patch set merges Paolo's conversion of address spaces to enable
RCU. There is one more patchset coming for TLB access that I'm
debugging right now. After I submit the last one I'll start working on
enabling RCU more widely - and the series will need to be refactored.

The series is availale online at: 

https://github.com/ncultra/qemu/tree/rcu-for-1.7

Mike Day (2):
  fixup changes from commit f63ca950
  fixup changes from commit f62a6b2f from Paulo Bonzini

Paolo Bonzini (1):
  exec: change iotlb APIs to take AddressSpaceDispatch

 aio-posix.c|  1 +
 aio-win32.c|  1 +
 cpus.c |  3 +++
 cputlb.c   |  7 ---
 exec.c | 13 -
 include/exec/cpu-all.h |  3 +++
 include/exec/cputlb.h  |  9 ++---
 7 files changed, 26 insertions(+), 11 deletions(-)

-- 
1.8.3.1




[Qemu-devel] [PATCH 3/3] exec: change iotlb APIs to take AddressSpaceDispatch

2013-08-16 Thread Mike Day
From: Paolo Bonzini 

This makes it possible to start following RCU rules, which require
not dereferencing as->dispatch more than once.  It is not covering
the whole of TCG, since the TLB data structures are not RCU-friendly,
but it is enough for exec.c.

Signed-off-by: Paolo Bonzini 
Reviewed-by: Mike Day 
---
 cputlb.c  | 7 ---
 exec.c| 9 +
 include/exec/cputlb.h | 9 ++---
 3 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index 977c0ca..1f74d08 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -255,6 +255,7 @@ void tlb_set_page(CPUArchState *env, target_ulong vaddr,
   hwaddr paddr, int prot,
   int mmu_idx, target_ulong size)
 {
+AddressSpaceDispatch *d;
 MemoryRegionSection *section;
 unsigned int index;
 target_ulong address;
@@ -269,8 +270,8 @@ void tlb_set_page(CPUArchState *env, target_ulong vaddr,
 }
 
 sz = size;
-section = address_space_translate_for_iotlb(&address_space_memory, paddr,
-&xlat, &sz);
+d = address_space_memory.dispatch;
+section = address_space_translate_for_iotlb(d, paddr, &xlat, &sz);
 assert(sz >= TARGET_PAGE_SIZE);
 
 #if defined(DEBUG_TLB)
@@ -290,7 +291,7 @@ void tlb_set_page(CPUArchState *env, target_ulong vaddr,
 }
 
 code_address = address;
-iotlb = memory_region_section_get_iotlb(env, section, vaddr, paddr, xlat,
+iotlb = memory_region_section_get_iotlb(d, env, section, vaddr, paddr, 
xlat,
 prot, &address);
 
 index = (vaddr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
diff --git a/exec.c b/exec.c
index 52e7fd5..c194bad 100644
--- a/exec.c
+++ b/exec.c
@@ -301,11 +301,11 @@ MemoryRegion *address_space_translate(AddressSpace *as, 
hwaddr addr,
 }
 
 MemoryRegionSection *
-address_space_translate_for_iotlb(AddressSpace *as, hwaddr addr, hwaddr *xlat,
+address_space_translate_for_iotlb(AddressSpaceDispatch *d, hwaddr addr, hwaddr 
*xlat,
   hwaddr *plen)
 {
 MemoryRegionSection *section;
-section = address_space_translate_internal(as->dispatch, addr, xlat, plen, 
false);
+section = address_space_translate_internal(d, addr, xlat, plen, false);
 
 assert(!section->mr->iommu_ops);
 return section;
@@ -719,7 +719,8 @@ static int cpu_physical_memory_set_dirty_tracking(int 
enable)
 return ret;
 }
 
-hwaddr memory_region_section_get_iotlb(CPUArchState *env,
+hwaddr memory_region_section_get_iotlb(AddressSpaceDispatch *d,
+   CPUArchState *env,
MemoryRegionSection *section,
target_ulong vaddr,
hwaddr paddr, hwaddr xlat,
@@ -739,7 +740,7 @@ hwaddr memory_region_section_get_iotlb(CPUArchState *env,
 iotlb |= PHYS_SECTION_ROM;
 }
 } else {
-iotlb = section - address_space_memory.dispatch->sections;
+iotlb = section - d->sections;
 iotlb += xlat;
 }
 
diff --git a/include/exec/cputlb.h b/include/exec/cputlb.h
index e21cb60..bd7ff2f 100644
--- a/include/exec/cputlb.h
+++ b/include/exec/cputlb.h
@@ -19,6 +19,8 @@
 #ifndef CPUTLB_H
 #define CPUTLB_H
 
+#include 
+
 #if !defined(CONFIG_USER_ONLY)
 /* cputlb.c */
 void tlb_protect_code(ram_addr_t ram_addr);
@@ -34,9 +36,10 @@ extern int tlb_flush_count;
 void tb_flush_jmp_cache(CPUArchState *env, target_ulong addr);
 
 MemoryRegionSection *
-address_space_translate_for_iotlb(AddressSpace *as, hwaddr addr, hwaddr *xlat,
-  hwaddr *plen);
-hwaddr memory_region_section_get_iotlb(CPUArchState *env,
+address_space_translate_for_iotlb(AddressSpaceDispatch *d, hwaddr addr,
+  hwaddr *xlat, hwaddr *plen);
+hwaddr memory_region_section_get_iotlb(AddressSpaceDispatch *d,
+   CPUArchState *env,
MemoryRegionSection *section,
target_ulong vaddr,
hwaddr paddr, hwaddr xlat,
-- 
1.8.3.1




[Qemu-devel] [PATCH 1/3] fixup changes from commit f63ca950

2013-08-16 Thread Mike Day
From: Paolo Bonzini 
Reviewed-by: Mike Day 
---
 aio-posix.c | 1 +
 aio-win32.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/aio-posix.c b/aio-posix.c
index 562add6..7f27ea5 100644
--- a/aio-posix.c
+++ b/aio-posix.c
@@ -176,6 +176,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
 int ret;
 bool busy, progress;
 
+rcu_quiescent_state();
 progress = false;
 
 /*
diff --git a/aio-win32.c b/aio-win32.c
index 8a6abb0..9744fa5 100644
--- a/aio-win32.c
+++ b/aio-win32.c
@@ -99,6 +99,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
 bool busy, progress;
 int count;
 
+rcu_quiescent_state();
 progress = false;
 
 /*
-- 
1.8.3.1




[Qemu-devel] [PATCH 2/3] fixup changes from commit f62a6b2f from Paulo Bonzini

2013-08-16 Thread Mike Day
From: Paolo Bonzini 
Reviewed-by: Mike Day 
---
 cpus.c | 3 +++
 exec.c | 4 +++-
 include/exec/cpu-all.h | 3 +++
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/cpus.c b/cpus.c
index 624658e..d6f2775 100644
--- a/cpus.c
+++ b/cpus.c
@@ -764,6 +764,7 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
 CPUState *cpu = arg;
 int r;
 
+tls_alloc_cpu_single_env_var();
 qemu_mutex_lock(&qemu_global_mutex);
 qemu_thread_get_self(cpu->thread);
 cpu->thread_id = qemu_get_thread_id();
@@ -804,6 +805,7 @@ static void *qemu_dummy_cpu_thread_fn(void *arg)
 sigset_t waitset;
 int r;
 
+tls_alloc_cpu_single_env_var();
 qemu_mutex_lock_iothread();
 qemu_thread_get_self(cpu->thread);
 cpu->thread_id = qemu_get_thread_id();
@@ -850,6 +852,7 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
 {
 CPUState *cpu = arg;
 
+tls_alloc_cpu_single_env_var();
 qemu_tcg_init_cpu_signals();
 qemu_thread_get_self(cpu->thread);
 
diff --git a/exec.c b/exec.c
index 3ca9381..52e7fd5 100644
--- a/exec.c
+++ b/exec.c
@@ -45,7 +45,7 @@
 #include "trace.h"
 #endif
 #include "exec/cpu-all.h"
-
+#include "qemu/tls.h"
 #include "exec/cputlb.h"
 #include "translate-all.h"
 
@@ -72,6 +72,7 @@ static MemoryRegion io_mem_unassigned;
 CPUState *first_cpu;
 /* current CPU in the current thread. It is only valid inside
cpu_exec() */
+DEFINE_TLS(CPUArchState *, cpu_single_env_var);
 DEFINE_TLS(CPUState *, current_cpu);
 /* 0 = Do not count executed instructions.
1 = Precise instruction counting.
@@ -313,6 +314,7 @@ address_space_translate_for_iotlb(AddressSpace *as, hwaddr 
addr, hwaddr *xlat,
 
 void cpu_exec_init_all(void)
 {
+tls_alloc_cpu_single_env_var();
 #if !defined(CONFIG_USER_ONLY)
 qemu_mutex_init(&ram_list.mutex);
 memory_map_init();
diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index a407b50..b961e58 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -22,6 +22,7 @@
 #include "qemu-common.h"
 #include "exec/cpu-common.h"
 #include "qemu/thread.h"
+#include "qemu/tls.h"
 #include "qom/cpu.h"
 
 /* some important defines:
@@ -356,6 +357,8 @@ int page_get_flags(target_ulong address);
 void page_set_flags(target_ulong start, target_ulong end, int flags);
 int page_check_range(target_ulong start, target_ulong len, int flags);
 #endif
+DECLARE_TLS(CPUArchState *, cpu_single_env_var);
+#define cpu_single_env (*tls_get_cpu_single_env_var())
 
 CPUArchState *cpu_copy(CPUArchState *env);
 
-- 
1.8.3.1




Re: [Qemu-devel] [RFC PATCH 0/3] v2.2 RCU Implementation for QEMU

2013-08-16 Thread Mike Day

Andreas Färber  writes:


>> https://github.com/ncultra/qemu/tree/rcu-for-1.7
>> 
>> Mike Day (2):
>>   fixup changes from commit f63ca950
>>   fixup changes from commit f62a6b2f from Paulo Bonzini
>
> These two patches are certainly not acceptable for upstream, lacking any
> textual explanation and with those subjects.

Thanks. The entire series is not ready yet - it needs to be
refactored and tested, hence the RFC tag. I'm trying to keep an rebased
version of the Paolo's May RCU tree available for those who are
interested. When its ready to be considered for an upstream merge it
will be as expected. 

The the next step for me is to start working on leveraging the RCU code. 

Mike

-- 

Mike Day | + 1 919 371-8786 | ncm...@ncultra.org
"Endurance is a Virtue"



[Qemu-devel] [PATCH] Fixup some dynamic casts in the Qemu device tree to correspond to the QOM type-checking system. These patches change from using Linux kernel style upcasts to typesafe object orien

2013-08-19 Thread Mike Day
These patches apply to Paolo Bonzini's rcu tree:

https://github.com/bonzini/qemu/tree/rcu
commit 781e47bf1693a80b84eec298a6a1c7b29ab2c135

Signed-off-by: Mike Day 
---
 hw/misc/ivshmem.c  | 2 +-
 hw/pci-bridge/pci_bridge_dev.c | 6 +++---
 hw/pci/pci_bridge.c| 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index ebcb52a..46d8c27 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -789,7 +789,7 @@ static void pci_ivshmem_uninit(PCIDevice *dev)
 
 static void pci_ivshmem_instance_finalize(Object *obj)
 {
-IVShmemState *s = IVSHMEM(dev);
+IVShmemState *s = IVSHMEM(obj);
 
 if (s->migration_blocker) {
 migrate_del_blocker(s->migration_blocker);
diff --git a/hw/pci-bridge/pci_bridge_dev.c b/hw/pci-bridge/pci_bridge_dev.c
index c995d5d..22caf14 100644
--- a/hw/pci-bridge/pci_bridge_dev.c
+++ b/hw/pci-bridge/pci_bridge_dev.c
@@ -87,7 +87,6 @@ shpc_error:
 bridge_error:
 return err;
 }
-
 static void pci_bridge_dev_exitfn(PCIDevice *dev)
 {
 PCIBridgeDev *bridge_dev = PCI_BRIDGE_DEV(dev);
@@ -102,8 +101,9 @@ static void pci_bridge_dev_exitfn(PCIDevice *dev)
 static void pci_bridge_dev_instance_finalize(Object *obj)
 {
 PCIDevice *dev = PCI_DEVICE(obj);
-PCIBridge *br = DO_UPCAST(PCIBridge, dev, dev);
-PCIBridgeDev *bridge_dev = DO_UPCAST(PCIBridgeDev, bridge, br);
+PCIBridge *br = PCI_BRIDGE(dev);
+PCIBridgeDev *bridge_dev = PCI_BRIDGE_DEV(br);
+
 shpc_free(dev);
 memory_region_destroy(&bridge_dev->bar);
 pci_bridge_free(dev);
diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
index 63f9912..307e076 100644
--- a/hw/pci/pci_bridge.c
+++ b/hw/pci/pci_bridge.c
@@ -391,7 +391,7 @@ void pci_bridge_exitfn(PCIDevice *pci_dev)
 
 void pci_bridge_free(PCIDevice *pci_dev)
 {
-PCIBridge *s = DO_UPCAST(PCIBridge, dev, pci_dev);
+PCIBridge *s = PCI_BRIDGE(pci_dev);
 pci_bridge_region_cleanup(s, s->windows);
 memory_region_destroy(&s->address_space_mem);
 memory_region_destroy(&s->address_space_io);
-- 
1.8.3.1




[Qemu-devel] [PATCH RESEND] RCU implementation for Qemu. Fixup some dynamic casts in the Qemu device tree to correspond to the QOM type-checking system.

2013-08-19 Thread Mike Day

These patches change from using Linux kernel style upcasts to typesafe
object oriented casts with runtime checking semantics.

These patches apply to Paolo Bonzini's rcu tree:

https://github.com/bonzini/qemu/tree/rcu
commit 781e47bf1693a80b84eec298a6a1c7b29ab2c135

Signed-off-by: Mike Day 
---
 hw/misc/ivshmem.c  | 2 +-
 hw/pci-bridge/pci_bridge_dev.c | 6 +++---
 hw/pci/pci_bridge.c| 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index ebcb52a..46d8c27 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -789,7 +789,7 @@ static void pci_ivshmem_uninit(PCIDevice *dev)
 
 static void pci_ivshmem_instance_finalize(Object *obj)
 {
-IVShmemState *s = IVSHMEM(dev);
+IVShmemState *s = IVSHMEM(obj);
 
 if (s->migration_blocker) {
 migrate_del_blocker(s->migration_blocker);
diff --git a/hw/pci-bridge/pci_bridge_dev.c b/hw/pci-bridge/pci_bridge_dev.c
index c995d5d..22caf14 100644
--- a/hw/pci-bridge/pci_bridge_dev.c
+++ b/hw/pci-bridge/pci_bridge_dev.c
@@ -87,7 +87,6 @@ shpc_error:
 bridge_error:
 return err;
 }
-
 static void pci_bridge_dev_exitfn(PCIDevice *dev)
 {
 PCIBridgeDev *bridge_dev = PCI_BRIDGE_DEV(dev);
@@ -102,8 +101,9 @@ static void pci_bridge_dev_exitfn(PCIDevice *dev)
 static void pci_bridge_dev_instance_finalize(Object *obj)
 {
 PCIDevice *dev = PCI_DEVICE(obj);
-PCIBridge *br = DO_UPCAST(PCIBridge, dev, dev);
-PCIBridgeDev *bridge_dev = DO_UPCAST(PCIBridgeDev, bridge, br);
+PCIBridge *br = PCI_BRIDGE(dev);
+PCIBridgeDev *bridge_dev = PCI_BRIDGE_DEV(br);
+
 shpc_free(dev);
 memory_region_destroy(&bridge_dev->bar);
 pci_bridge_free(dev);
diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
index 63f9912..307e076 100644
--- a/hw/pci/pci_bridge.c
+++ b/hw/pci/pci_bridge.c
@@ -391,7 +391,7 @@ void pci_bridge_exitfn(PCIDevice *pci_dev)
 
 void pci_bridge_free(PCIDevice *pci_dev)
 {
-PCIBridge *s = DO_UPCAST(PCIBridge, dev, pci_dev);
+PCIBridge *s = PCI_BRIDGE(pci_dev);
 pci_bridge_region_cleanup(s, s->windows);
 memory_region_destroy(&s->address_space_mem);
 memory_region_destroy(&s->address_space_io);
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH] Fixup some dynamic casts in the Qemu device tree to correspond to the QOM type-checking system. These patches change from using Linux kernel style upcasts to typesafe object o

2013-08-19 Thread Mike Day

Peter Maydell  writes:

> On 19 August 2013 19:33, Mike Day  wrote:
>> These patches apply to Paolo Bonzini's rcu tree:
>>
>> https://github.com/bonzini/qemu/tree/rcu
>> commit 781e47bf1693a80b84eec298a6a1c7b29ab2c135
>>
>> Signed-off-by: Mike Day 
>> ---
>>  hw/misc/ivshmem.c  | 2 +-
>>  hw/pci-bridge/pci_bridge_dev.c | 6 +++---
>>  hw/pci/pci_bridge.c| 2 +-
>>  3 files changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
>> index ebcb52a..46d8c27 100644
>> --- a/hw/misc/ivshmem.c
>> +++ b/hw/misc/ivshmem.c
>> @@ -789,7 +789,7 @@ static void pci_ivshmem_uninit(PCIDevice *dev)
>>
>>  static void pci_ivshmem_instance_finalize(Object *obj)
>>  {
>> -IVShmemState *s = IVSHMEM(dev);
>> +IVShmemState *s = IVSHMEM(obj);
>
> This should have been a flat-out compiler error, right?

Yes, correct, but Paolo hasn't previously submitted this specific change
code afaik.

>>  if (s->migration_blocker) {
>>  migrate_del_blocker(s->migration_blocker);
>> diff --git a/hw/pci-bridge/pci_bridge_dev.c b/hw/pci-bridge/pci_bridge_dev.c
>> index c995d5d..22caf14 100644
>> --- a/hw/pci-bridge/pci_bridge_dev.c
>> +++ b/hw/pci-bridge/pci_bridge_dev.c
>> @@ -87,7 +87,6 @@ shpc_error:
>>  bridge_error:
>>  return err;
>>  }
>> -
>
> Stray blank line change.

Thanks - and git just found it for me too. Apologies.

Mike
-- 

Mike Day | + 1 919 371-8786 | ncm...@ncultra.org
"Endurance is a Virtue"



[Qemu-devel] [RFC PATCH] Introduce RCU-enabled DQs.

2013-08-23 Thread Mike Day
Add RCU-enabled variants on the existing bsd DQ facility. Each Q
operation has the same interface as the existing (non-RCU)
version. Also, each operation is implemented as macro for now.

Using the RCU-enabled DQ, existing DQ users will be able to convert to
RCU without using a different list interface.

To accompany the RCU-enabled DQ, there is also a test file that uses
concurrent readers to contend with a single updater.

This patchset builds on top of Paolo Bonzini's rcu tree:
https://github.com/bonzini/qemu/tree/rcu

Signed-off-by: Mike Day 
---
 docs/rcu.txt |   2 +-
 include/qemu/queue.h |  11 --
 include/qemu/rcu_queue.h | 137 +++
 tests/Makefile   |  14 ++-
 tests/rcuq_test.c| 279 +++
 5 files changed, 430 insertions(+), 13 deletions(-)

diff --git a/docs/rcu.txt b/docs/rcu.txt
index b3c593c..de59896 100644
--- a/docs/rcu.txt
+++ b/docs/rcu.txt
@@ -106,7 +106,7 @@ The core RCU API is small:
 so that the reclaimer function can fetch the struct foo address
 and free it:
 
-call_rcu1(foo_reclaim, &foo.rcu);
+call_rcu1(&foo.rcu, foo_reclaim);
 
 void foo_reclaim(struct rcu_head *rp)
 {
diff --git a/include/qemu/queue.h b/include/qemu/queue.h
index 847ddd1..f6f0636 100644
--- a/include/qemu/queue.h
+++ b/include/qemu/queue.h
@@ -139,17 +139,6 @@ struct {   
 \
 (elm)->field.le_prev = &(head)->lh_first;   \
 } while (/*CONSTCOND*/0)
 
-#define QLIST_INSERT_HEAD_RCU(head, elm, field) do {\
-(elm)->field.le_prev = &(head)->lh_first;   \
-(elm)->field.le_next = (head)->lh_first;\
-smp_wmb(); /* fill elm before linking it */ \
-if ((head)->lh_first != NULL)  {\
-(head)->lh_first->field.le_prev = &(elm)->field.le_next;\
-}   \
-(head)->lh_first = (elm);   \
-smp_wmb();  \
-} while (/* CONSTCOND*/0)
-
 #define QLIST_REMOVE(elm, field) do {   \
 if ((elm)->field.le_next != NULL)   \
 (elm)->field.le_next->field.le_prev =   \
diff --git a/include/qemu/rcu_queue.h b/include/qemu/rcu_queue.h
new file mode 100644
index 000..e82196d
--- /dev/null
+++ b/include/qemu/rcu_queue.h
@@ -0,0 +1,137 @@
+#ifndef QEMU_RCU_SYS_QUEUE_H
+#define QEMU_RCU_SYS_QUEUE_H
+
+/*
+ * rc_queue.h
+ *
+ * Userspace RCU QSBR header.
+ *
+ * LGPL-compatible code should include this header with :
+ *
+ * #define _LGPL_SOURCE
+ * #include 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ *
+ * Copyright (c) 2013 Mike D. Day, IBM Corporation.
+ *
+ * IBM's contributions to this file may be relicensed under LGPLv2 or later.
+ */
+
+#include "qemu/rcu.h"  /* rcu.h includes qemu/queue.h and qemu/atomic.h */
+
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+ * List functions.
+ */
+
+
+/* 
+ *  The difference between atomic_read/set and atomic_rcu_read/set
+ *  is in the including of a read/write memory barrier to the volatile
+ *  access. atomic_rcu_* macros include the memory barrier, the 
+ *  plain atomic macros do not. Therefore, it should be correct to 
+ *  issue a series of reads or writes to the same element using only 
+ *  the atomic_* macro, until the last read or write, which should be 
+ *  atomic_rcu_* to introduce a read or write memory barrier as 
+ *  appropriate.  
+ */
+
+/* Upon publication of the listelm->next value, list readers
+ * will see the new node when following next pointers from 
+ * antecedent nodes, but may not see the new node when following 
+ * prev pointers from subsequent nodes until after the rcu grace 
+ * period expires.   
+ * see linux/include/rculist.h __list_add_rcu(new, prev, next)
+ */
+#define QLIST_INSERT_AFTER_RCU(listelm,

[Qemu-devel] [RFC PATCH] Introduce RCU-enabled DQs (v2)

2013-08-24 Thread Mike Day
Add RCU-enabled variants on the existing bsd DQ facility. Each Q
operation has the same interface as the existing (non-RCU)
version. Also, each operation is implemented as macro for now.

Using the RCU-enabled DQ, existing DQ users will be able to convert to
RCU without using a different list interface.

This version (2) adds a macro to walk a Q in reverse:

QLIST_FOREACH_REVERSE_RCU(el, head, field)

Accordingly the reader threads in the test program walk the Q in
reverse in addition to walking forward.

To accompany the RCU-enabled DQ, there is also a test file that uses
concurrent readers to contend with a single updater.

This patchset builds on top of Paolo Bonzini's rcu tree:
https://github.com/bonzini/qemu/tree/rcu

Signed-off-by: Mike Day 
---
 docs/rcu.txt |   2 +-
 include/qemu/queue.h |  11 --
 include/qemu/rcu_queue.h | 145 
 tests/Makefile   |   6 +-
 tests/rcuq_test.c| 290 +++
 5 files changed, 440 insertions(+), 14 deletions(-)
 create mode 100644 include/qemu/rcu_queue.h
 create mode 100644 tests/rcuq_test.c

diff --git a/docs/rcu.txt b/docs/rcu.txt
index b3c593c..de59896 100644
--- a/docs/rcu.txt
+++ b/docs/rcu.txt
@@ -106,7 +106,7 @@ The core RCU API is small:
 so that the reclaimer function can fetch the struct foo address
 and free it:
 
-call_rcu1(foo_reclaim, &foo.rcu);
+call_rcu1(&foo.rcu, foo_reclaim);
 
 void foo_reclaim(struct rcu_head *rp)
 {
diff --git a/include/qemu/queue.h b/include/qemu/queue.h
index 847ddd1..f6f0636 100644
--- a/include/qemu/queue.h
+++ b/include/qemu/queue.h
@@ -139,17 +139,6 @@ struct {   
 \
 (elm)->field.le_prev = &(head)->lh_first;   \
 } while (/*CONSTCOND*/0)
 
-#define QLIST_INSERT_HEAD_RCU(head, elm, field) do {\
-(elm)->field.le_prev = &(head)->lh_first;   \
-(elm)->field.le_next = (head)->lh_first;\
-smp_wmb(); /* fill elm before linking it */ \
-if ((head)->lh_first != NULL)  {\
-(head)->lh_first->field.le_prev = &(elm)->field.le_next;\
-}   \
-(head)->lh_first = (elm);   \
-smp_wmb();  \
-} while (/* CONSTCOND*/0)
-
 #define QLIST_REMOVE(elm, field) do {   \
 if ((elm)->field.le_next != NULL)   \
 (elm)->field.le_next->field.le_prev =   \
diff --git a/include/qemu/rcu_queue.h b/include/qemu/rcu_queue.h
new file mode 100644
index 000..198a87d
--- /dev/null
+++ b/include/qemu/rcu_queue.h
@@ -0,0 +1,145 @@
+#ifndef QEMU_RCU_SYS_QUEUE_H
+#define QEMU_RCU_SYS_QUEUE_H
+
+/*
+ * rc_queue.h
+ *
+ * Userspace RCU QSBR header.
+ *
+ * LGPL-compatible code should include this header with :
+ *
+ * #define _LGPL_SOURCE
+ * #include 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ *
+ * Copyright (c) 2013 Mike D. Day, IBM Corporation.
+ *
+ * IBM's contributions to this file may be relicensed under LGPLv2 or later.
+ */
+
+#include "qemu/rcu.h"  /* rcu.h includes qemu/queue.h and qemu/atomic.h */
+
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+ * List functions.
+ */
+
+
+/*
+ *  The difference between atomic_read/set and atomic_rcu_read/set
+ *  is in the including of a read/write memory barrier to the volatile
+ *  access. atomic_rcu_* macros include the memory barrier, the
+ *  plain atomic macros do not. Therefore, it should be correct to
+ *  issue a series of reads or writes to the same element using only
+ *  the atomic_* macro, until the last read or write, which should be
+ *  atomic_rcu_* to introduce a read or write memory barrier as
+ *  appropriate.
+ */
+
+/* Upon publication of the listelm->next value, list readers
+ * will see the new node when following next pointer

Re: [Qemu-devel] [RFC PATCH] Introduce RCU-enabled DQs (v2)

2013-08-25 Thread Mike Day

Paolo Bonzini  writes:

> Just a couple of questions, one of them on the new macro...
>
>> +/* prior to publication of the elm->prev->next value, some list
>> + * readers  may still see the removed element when following
>> + * the antecedent's next pointer.
>> + */
>> +
>> +#define QLIST_REMOVE_RCU(elm, field) do {   \
>> +if ((elm)->field.le_next != NULL) { \
>> +   (elm)->field.le_next->field.le_prev =\
>> +(elm)->field.le_prev;   \
>> +}   \
>> +atomic_rcu_set((elm)->field.le_prev, (elm)->field.le_next); \
>> +} while (/*CONSTCOND*/0)
>
> Why is the barrier needed here, but not in Linux's list_del_rcu?
>
> I think it is not needed because all involved elements have already been
> published and just have their pointers shuffled.

I read this as more than shuffling pointers. The intent here is 
that the previous element's next pointer is being updated to omit the
current element from the list.

atomic_set always deferences the pointer passed to it, and
(field)->le_pre is a double pointer. So looking at the macro:

#define atomic_set(ptr, i) ((*(__typeof__(*ptr) *volatile) (ptr)) = (i))

It translates to: 

( ( * (__typeof(*elm->field.le_prev) *volatile) (elm)->field.le_prev)  = 
elm->field.le_next; ) 

Which is: 

 *((struct *elm) *volatile)(elm)->field.le_prev = elm->field.le_next; 

Which is:

*(elm)->field.le_prev = elm->field.le_next;

Because field.le_prev is a double pointer that has previously been set
to &prev (the address of the previous list element) this is assiging the
*previous* element's next pointer, the way I read it.

The Linux list_del_rcu is dealing with a singly linked list and
therefore does not set a value in the previous node's element. 

But I'm still unclear on whether or not the memory barrier is needed
because the deleted element won't be reclaimed right away.

Mike

-- 

Mike Day | + 1 919 371-8786 | ncm...@ncultra.org
"Endurance is a Virtue"



Re: [Qemu-devel] [RFC PATCH] Introduce RCU-enabled DQs (v2)

2013-08-26 Thread Mike Day
On Sun, Aug 25, 2013 at 3:18 PM, Mathieu Desnoyers <
mathieu.desnoy...@efficios.com> wrote:
>
> I'm not very comfortable with your DQ implementation not providing any
> kind of guarantee to a forward traversal followed by backward traversal,
> nor for backward followed by forward traversal. If a list add is
> executed concurrently with traversals, and we can ensure there are no
> list del of the node, if a traversal sees the added node when doing
> forward iteration, I would clearly expect to still see it if a backward
> iteration follows.
>
> I took the liberty of implementing a couple of ideas I had to provide
> a RCU DQ with those guarantees. I just pushed the code here (beware, I
> just did some basic single-threaded unit tests so far, so consider this
> code as largely untested concurrency-wise):
>
>   git clone git://git.urcu.io/urcu.git
>   branch: rcudq
>   file: urcu/rcudq.h
>
> Direct link to the file via gitweb:
>
http://git.lttng.org/?p=userspace-rcu.git;a=blob;f=urcu/rcudq.h;h=4a8d7b0d5143a958514cf130b1c124d99f3194ca;hb=refs/heads/rcudq

Mathieu - Thanks for the review! And thanks for the code, I'm working with
it right now. I like the idea of using a flag to provide a form of
atomicity for the doubly-linked list elements. I'm also planning on running
some timing tests to see of the additional memory barriers and atomic
accesses make *any* difference whatsoever.

Mike

Mike Day | ncm...@ncultra.org | +1 919 371-8786


[Qemu-devel] [RFC PATCH V3 ] Introduce RCU-enabled DQs

2013-08-27 Thread Mike Day
Add RCU-enabled variants on the existing bsd DQ facility. Each Q
operation has the same interface as the existing (non-RCU)
version. Also, each operation is implemented as macro for now.

Using the RCU-enabled DQ, existing DQ users will be able to convert to
RCU without using a different list interface.

Changes since V2:

* reverted QLIST_FOREACH_REVERSE_RCU, it's not needed in the interface
  file. (But it is used in the test program.)  

* added g_test support to the test program

To accompany the RCU-enabled DQ, there is also a test file that uses
concurrent readers to contend with a single updater.

This patchset builds on top of Paolo Bonzini's rcu tree:
https://github.com/bonzini/qemu/tree/rcu


Signed-off-by: Mike Day 
---
 docs/rcu.txt |   2 +-
 include/qemu/queue.h |  11 --
 include/qemu/rcu_queue.h | 137 +++
 tests/Makefile   |   5 +-
 tests/rcuq_test.c| 337 +++
 5 files changed, 479 insertions(+), 13 deletions(-)
 create mode 100644 include/qemu/rcu_queue.h
 create mode 100644 tests/rcuq_test.c

diff --git a/docs/rcu.txt b/docs/rcu.txt
index b3c593c..de59896 100644
--- a/docs/rcu.txt
+++ b/docs/rcu.txt
@@ -106,7 +106,7 @@ The core RCU API is small:
 so that the reclaimer function can fetch the struct foo address
 and free it:
 
-call_rcu1(foo_reclaim, &foo.rcu);
+call_rcu1(&foo.rcu, foo_reclaim);
 
 void foo_reclaim(struct rcu_head *rp)
 {
diff --git a/include/qemu/queue.h b/include/qemu/queue.h
index 847ddd1..f6f0636 100644
--- a/include/qemu/queue.h
+++ b/include/qemu/queue.h
@@ -139,17 +139,6 @@ struct {   
 \
 (elm)->field.le_prev = &(head)->lh_first;   \
 } while (/*CONSTCOND*/0)
 
-#define QLIST_INSERT_HEAD_RCU(head, elm, field) do {\
-(elm)->field.le_prev = &(head)->lh_first;   \
-(elm)->field.le_next = (head)->lh_first;\
-smp_wmb(); /* fill elm before linking it */ \
-if ((head)->lh_first != NULL)  {\
-(head)->lh_first->field.le_prev = &(elm)->field.le_next;\
-}   \
-(head)->lh_first = (elm);   \
-smp_wmb();  \
-} while (/* CONSTCOND*/0)
-
 #define QLIST_REMOVE(elm, field) do {   \
 if ((elm)->field.le_next != NULL)   \
 (elm)->field.le_next->field.le_prev =   \
diff --git a/include/qemu/rcu_queue.h b/include/qemu/rcu_queue.h
new file mode 100644
index 000..e2b8ba5
--- /dev/null
+++ b/include/qemu/rcu_queue.h
@@ -0,0 +1,137 @@
+#ifndef QEMU_RCU_SYS_QUEUE_H
+#define QEMU_RCU_SYS_QUEUE_H
+
+/*
+ * rcu_queue.h
+ *
+ * Userspace RCU QSBR header.
+ *
+ * LGPL-compatible code should include this header with :
+ *
+ * #define _LGPL_SOURCE
+ * #include 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ *
+ * Copyright (c) 2013 Mike D. Day, IBM Corporation.
+ *
+ * IBM's contributions to this file may be relicensed under LGPLv2 or later.
+ */
+
+#include "qemu/rcu.h"  /* rcu.h includes qemu/queue.h and qemu/atomic.h */
+
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+ * List functions.
+ */
+
+
+/*
+ *  The difference between atomic_read/set and atomic_rcu_read/set
+ *  is in the including of a read/write memory barrier to the volatile
+ *  access. atomic_rcu_* macros include the memory barrier, the
+ *  plain atomic macros do not. Therefore, it should be correct to
+ *  issue a series of reads or writes to the same element using only
+ *  the atomic_* macro, until the last read or write, which should be
+ *  atomic_rcu_* to introduce a read or write memory barrier as
+ *  appropriate.
+ */
+
+/* Upon publication of the listelm->next value, list readers
+ * will see the new node when following next pointers from
+ * anteced

[Qemu-devel] [RFC PATCH] convert ram_list to RCU DQ

2013-08-28 Thread Mike Day
Allow "unlocked" reads of the ram_list by using an RCU-enabled
DQ. Most readers of the list no longer require holding the list mutex.

The ram_list now uses a QLIST instead of a QTAILQ. The difference is
minimal.

This patch has been built and make-checked for the x86_64, ppc64,
s390x, and arm targets. It has not been tested further than that at
this point.

To apply this patch, you must base upon Paolo Bonzini's rcu tree and
also apply the RCU DQ patch (below).

https://github.com/bonzini/qemu/tree/rcu
http://article.gmane.org/gmane.comp.emulators.qemu/230159/

Signed-off-by: Mike Day 
---
 arch_init.c   |  51 -
 exec.c| 111 +-
 hw/9pfs/virtio-9p-synth.c |   2 +-
 include/exec/cpu-all.h|   4 +-
 include/qemu/rcu_queue.h  |   8 
 5 files changed, 111 insertions(+), 65 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 68a7ab7..5c7b284 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -49,6 +49,7 @@
 #include "trace.h"
 #include "exec/cpu-all.h"
 #include "hw/acpi/acpi.h"
+#include "qemu/rcu_queue.h"
 
 #ifdef DEBUG_ARCH_INIT
 #define DPRINTF(fmt, ...) \
@@ -397,8 +398,8 @@ static void migration_bitmap_sync(void)
 
 trace_migration_bitmap_sync_start();
 address_space_sync_dirty_bitmap(&address_space_memory);
-
-QTAILQ_FOREACH(block, &ram_list.blocks, next) {
+rcu_read_lock();
+QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
 for (addr = 0; addr < block->length; addr += TARGET_PAGE_SIZE) {
 if (memory_region_test_and_clear_dirty(block->mr,
addr, TARGET_PAGE_SIZE,
@@ -407,6 +408,7 @@ static void migration_bitmap_sync(void)
 }
 }
 }
+rcu_read_unlock();
 trace_migration_bitmap_sync_end(migration_dirty_pages
 - num_dirty_pages_init);
 num_dirty_pages_period += migration_dirty_pages - num_dirty_pages_init;
@@ -457,8 +459,9 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
 MemoryRegion *mr;
 ram_addr_t current_addr;
 
+rcu_read_lock();
 if (!block)
-block = QTAILQ_FIRST(&ram_list.blocks);
+block = QLIST_FIRST_RCU(&ram_list.blocks);
 
 while (true) {
 mr = block->mr;
@@ -469,9 +472,9 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
 }
 if (offset >= block->length) {
 offset = 0;
-block = QTAILQ_NEXT(block, next);
+block = QLIST_NEXT_RCU(block, next);
 if (!block) {
-block = QTAILQ_FIRST(&ram_list.blocks);
+block = QLIST_FIRST_RCU(&ram_list.blocks);
 complete_round = true;
 ram_bulk_stage = false;
 }
@@ -526,6 +529,7 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
 }
 }
 }
+rcu_read_unlock();
 last_seen_block = block;
 last_offset = offset;
 
@@ -565,10 +569,10 @@ uint64_t ram_bytes_total(void)
 {
 RAMBlock *block;
 uint64_t total = 0;
-
-QTAILQ_FOREACH(block, &ram_list.blocks, next)
+rcu_read_lock();
+QLIST_FOREACH_RCU(block, &ram_list.blocks, next)
 total += block->length;
-
+rcu_read_unlock();
 return total;
 }
 
@@ -631,7 +635,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
 }
 
 qemu_mutex_lock_iothread();
-qemu_mutex_lock_ramlist();
+rcu_read_lock();
 bytes_transferred = 0;
 reset_ram_globals();
 
@@ -641,13 +645,13 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
 
 qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
 
-QTAILQ_FOREACH(block, &ram_list.blocks, next) {
+QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
 qemu_put_byte(f, strlen(block->idstr));
 qemu_put_buffer(f, (uint8_t *)block->idstr, strlen(block->idstr));
 qemu_put_be64(f, block->length);
 }
 
-qemu_mutex_unlock_ramlist();
+rcu_read_unlock();
 
 ram_control_before_iterate(f, RAM_CONTROL_SETUP);
 ram_control_after_iterate(f, RAM_CONTROL_SETUP);
@@ -664,8 +668,6 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
 int64_t t0;
 int total_sent = 0;
 
-qemu_mutex_lock_ramlist();
-
 if (ram_list.version != last_version) {
 reset_ram_globals();
 }
@@ -701,8 +703,6 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
 i++;
 }
 
-qemu_mutex_unlock_ramlist();
-
 /*
  * Must occur before EOS (or any QEMUFile operation)
  * because of RDMA protocol.
@@ -814,6 +814,7 @@ static inline void *host_from_stream_offset(QEMUFile *f,
 static RAMBlock *block = NULL;
 char id[256];
 uint8_t len;
+void *ptr = NULL;
 
 if (flags & RAM_SAVE_FLAG_CONTINUE) {
 if (!block) {
@@ -

Re: [Qemu-devel] [RFC PATCH] convert ram_list to RCU DQ

2013-08-29 Thread Mike Day
On Wed, Aug 28, 2013 at 12:35 PM, Paolo Bonzini  wrote:
>
> > @@ -828,13 +829,18 @@ static inline void
*host_from_stream_offset(QEMUFile *f,
> >  qemu_get_buffer(f, (uint8_t *)id, len);
> >  id[len] = 0;
> >
> > -QTAILQ_FOREACH(block, &ram_list.blocks, next) {
> > -if (!strncmp(id, block->idstr, sizeof(id)))
> > -return memory_region_get_ram_ptr(block->mr) + offset;
> > +rcu_read_lock();
> > +QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> > +if (!strncmp(id, block->idstr, sizeof(id))) {
> > +ptr = memory_region_get_ram_ptr(block->mr) + offset;
> > +goto unlock_out;
> > +}
> >  }
> >
> >  fprintf(stderr, "Can't find block %s!\n", id);
> > -return NULL;
> > +unlock_out:
> > +rcu_read_unlock();
> > +return ptr;
> >  }
>
>
> Similarly, here the critical section includes the caller, and block is
> living across calls to host.  Again, for now just put all of ram_load
> under a huge RCU critical section.  Later we can use ram_list.version to
> refresh the list and make the critical sections smaller.

And: rcu_read_lock() and rcu_read_unlock() can be called recursively, so I
can still leave the lock/unlock pair in host_from_stream_offset.


Re: [Qemu-devel] [RFC PATCH] convert ram_list to RCU DQ

2013-08-30 Thread Mike Day
On Fri, Aug 30, 2013 at 4:19 AM, Paolo Bonzini  wrote:
>
>
> I'm not sure about that; returning an RCU-protected variable after
> rcu_read_unlock() seems wrong to me because the pointer may not be valid
> at that point.  I suggest using a comment that asks to call
> host_from_stream_offset within rcu_read_lock()/rcu_read_unlock().
> However, if existing practice in the kernel is different, I'll bow to
that.

In this case the only caller is ram_load so I'm removing the critical
section from within host_from_stream_offset, and adding comments to note
that the ram_list needs to be protected by the caller. The current
docs/rcu.txt information indicates that rcu critical sections can be
"nested or overlapping." But your suggestion results in cleaner code - we
will have to go back to this later as you noted earlier.

Mike


[Qemu-devel] [RFC PATCH] Convert ram_list to RCU DQ V2

2013-08-30 Thread Mike Day
Changes from V1:

* Omitted locks or rcu critical sections within Some functions that
  read or write the ram_list but are called in a protected context
  (the caller holds the iothread lock, the ram_list mutex, or an rcu
  critical section).

Allow "unlocked" reads of the ram_list by using an RCU-enabled
DQ. Most readers of the list no longer require holding the list mutex.

The ram_list now uses a QLIST instead of a QTAILQ. The difference is
minimal.

This patch has been built and make-checked for the x86_64, ppc64,
s390x, and arm targets. It has not been tested further than that at
this point.

To apply this patch, you must base upon Paolo Bonzini's rcu tree and
also apply the RCU DQ patch (below).

https://github.com/bonzini/qemu/tree/rcu
http://article.gmane.org/gmane.comp.emulators.qemu/230159/

Signed-off-by: Mike Day 
---
 arch_init.c   |  80 +++---
 exec.c| 142 ++
 hw/9pfs/virtio-9p-synth.c |   2 +-
 include/exec/cpu-all.h|   4 +-
 include/qemu/rcu_queue.h  |   8 +++
 5 files changed, 151 insertions(+), 85 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 68a7ab7..3f4d676 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -49,6 +49,7 @@
 #include "trace.h"
 #include "exec/cpu-all.h"
 #include "hw/acpi/acpi.h"
+#include "qemu/rcu_queue.h"
 
 #ifdef DEBUG_ARCH_INIT
 #define DPRINTF(fmt, ...) \
@@ -398,7 +399,11 @@ static void migration_bitmap_sync(void)
 trace_migration_bitmap_sync_start();
 address_space_sync_dirty_bitmap(&address_space_memory);
 
-QTAILQ_FOREACH(block, &ram_list.blocks, next) {
+ /* This assumes the iothread lock or the ram_list mutex is taken.
+ * if that changes, accesses to ram_list need to be protected
+ * by a mutex (writes) or an rcu read lock (reads)
+ */
+QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
 for (addr = 0; addr < block->length; addr += TARGET_PAGE_SIZE) {
 if (memory_region_test_and_clear_dirty(block->mr,
addr, TARGET_PAGE_SIZE,
@@ -457,8 +462,13 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
 MemoryRegion *mr;
 ram_addr_t current_addr;
 
+/* Sometimes called with the ram_list mutex held (ram_save_complete)
+ * also called WITHOUT the ram_list mutex held. (ram_save_iterate).
+ * Protect ram_list with an rcu critical section.
+ */
+rcu_read_lock();
 if (!block)
-block = QTAILQ_FIRST(&ram_list.blocks);
+block = QLIST_FIRST_RCU(&ram_list.blocks);
 
 while (true) {
 mr = block->mr;
@@ -469,9 +479,9 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
 }
 if (offset >= block->length) {
 offset = 0;
-block = QTAILQ_NEXT(block, next);
+block = QLIST_NEXT_RCU(block, next);
 if (!block) {
-block = QTAILQ_FIRST(&ram_list.blocks);
+block = QLIST_FIRST_RCU(&ram_list.blocks);
 complete_round = true;
 ram_bulk_stage = false;
 }
@@ -526,6 +536,7 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
 }
 }
 }
+rcu_read_unlock();
 last_seen_block = block;
 last_offset = offset;
 
@@ -565,10 +576,10 @@ uint64_t ram_bytes_total(void)
 {
 RAMBlock *block;
 uint64_t total = 0;
-
-QTAILQ_FOREACH(block, &ram_list.blocks, next)
+rcu_read_lock();
+QLIST_FOREACH_RCU(block, &ram_list.blocks, next)
 total += block->length;
-
+rcu_read_unlock();
 return total;
 }
 
@@ -631,7 +642,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
 }
 
 qemu_mutex_lock_iothread();
-qemu_mutex_lock_ramlist();
+rcu_read_lock();
 bytes_transferred = 0;
 reset_ram_globals();
 
@@ -641,13 +652,13 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
 
 qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
 
-QTAILQ_FOREACH(block, &ram_list.blocks, next) {
+QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
 qemu_put_byte(f, strlen(block->idstr));
 qemu_put_buffer(f, (uint8_t *)block->idstr, strlen(block->idstr));
 qemu_put_be64(f, block->length);
 }
 
-qemu_mutex_unlock_ramlist();
+rcu_read_unlock();
 
 ram_control_before_iterate(f, RAM_CONTROL_SETUP);
 ram_control_after_iterate(f, RAM_CONTROL_SETUP);
@@ -664,8 +675,6 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
 int64_t t0;
 int total_sent = 0;
 
-qemu_mutex_lock_ramlist();
-
 if (ram_list.version != last_version) {
 reset_ram_globals();
 }
@@ -701,8 +710,6 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
 i++;
 }
 
-qemu_mutex_unlock_ramlist();
-
 /*

Re: [Qemu-devel] [RFC PATCH] Convert ram_list to RCU DQ V2

2013-09-03 Thread Mike Day
On Fri, Aug 30, 2013 at 12:38 PM, Paolo Bonzini  wrote:
>
> > @@ -867,7 +879,12 @@ static int ram_load(QEMUFile *f, void *opaque, int
version_id)
> >  if (version_id < 4 || version_id > 4) {
> >  return -EINVAL;
> >  }
> > -
> > +/* this implements a long-running RCU critical section.
> > + * When rcu reclaims in the code start to become numerous
> > + * it will be necessary to reduce the granularity of this critical
> > + * section.
> > + */
>
> Please add the same comment (and a rcu_read_lock/unlock pair replacing
> the ramlist mutex) in ram_save_iterate, too.

Just double checking on this particular change. In practice ram_save
manipulates the ram_list indirectly through ram_save_block. But I'm
assuming you want this change because of the ram state info that persists
between calls to ram_save (ram_list version in particular). Also, there is
potential for the callback functions ram_control_*_iterate to manipulate
the ram_list.

I'm adding the rcu_read_lock/unlock pair in ram_load. It will be recursive
with the same calls in ram_save_block, but as you pointed out this is low
overhead.

With this change in my working code, ram_control_*_iterate are called from
within an rcu critical section.

Mike


Re: [Qemu-devel] [RFC PATCH] Convert ram_list to RCU DQ V2

2013-09-03 Thread Mike Day
On Tue, Sep 3, 2013 at 10:09 AM, Paolo Bonzini  wrote:
>
> Il 03/09/2013 15:56, Mike Day ha scritto:
> >> > +/* this implements a long-running RCU critical section.
> >> > + * When rcu reclaims in the code start to become numerous
> >> > + * it will be necessary to reduce the granularity of this
critical
> >> > + * section.
> >> > + */
> >>
> >> Please add the same comment (and a rcu_read_lock/unlock pair replacing
> >> the ramlist mutex) in ram_save_iterate, too.
> >
> > Just double checking on this particular change. In practice ram_save
> > manipulates the ram_list indirectly through ram_save_block. But I'm
> > assuming you want this change because of the ram state info that
> > persists between calls to ram_save (ram_list version in particular).
>
> ram_list.version is not really a problem, but last_seen_block has to
> persist across ram_save_block calls.

Got it. that's a subtle point.

> > Also, there is potential for the callback functions
> > ram_control_*_iterate to manipulate the ram_list.
>
> I think that's right now not possible (and they could use
> rcu_read_lock/unlock as well).

Yeah. So how about we say for now that the rcu critical section status upon
entry to the ram_control_*_iterate functions is undefined. I'll make some
updates.

Mike


[Qemu-devel] [PATCH] Convert ram_list to RCU DQ V3

2013-09-04 Thread Mike Day
Changes from V2:

* added rcu reclaim function to free ram blocks
* reduced the number of comments
* made rcu locking policy consistent for different caller scenarios
* rcu updates to ram_block now are protected only by the iothread mutex

Changes from V1:

* Omitted locks or rcu critical sections within Some functions that
  read or write the ram_list but are called in a protected context
  (the caller holds the iothread lock, the ram_list mutex, or an rcu
  critical section).

Allow "unlocked" reads of the ram_list by using an RCU-enabled
DQ. Most readers of the list no longer require holding the list mutex.

The ram_list now uses a QLIST instead of a QTAILQ. The difference is
minimal.

This patch has been built and make-checked for the x86_64, ppc64,
s390x, and arm targets. It has not been tested further than that at
this point.

To apply this patch, you must base upon Paolo Bonzini's rcu tree and
also apply the RCU DQ patch (below).

https://github.com/bonzini/qemu/tree/rcu
http://article.gmane.org/gmane.comp.emulators.qemu/230159/

Signed-off-by: Mike Day 
---
 arch_init.c  | 103 ++
 exec.c   | 160 +--
 include/exec/cpu-all.h   |   6 +-
 include/qemu/rcu_queue.h |   8 +++
 4 files changed, 173 insertions(+), 104 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 68a7ab7..4cb1543 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -49,6 +49,7 @@
 #include "trace.h"
 #include "exec/cpu-all.h"
 #include "hw/acpi/acpi.h"
+#include "qemu/rcu_queue.h"
 
 #ifdef DEBUG_ARCH_INIT
 #define DPRINTF(fmt, ...) \
@@ -398,7 +399,8 @@ static void migration_bitmap_sync(void)
 trace_migration_bitmap_sync_start();
 address_space_sync_dirty_bitmap(&address_space_memory);
 
-QTAILQ_FOREACH(block, &ram_list.blocks, next) {
+rcu_read_lock();
+QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
 for (addr = 0; addr < block->length; addr += TARGET_PAGE_SIZE) {
 if (memory_region_test_and_clear_dirty(block->mr,
addr, TARGET_PAGE_SIZE,
@@ -407,6 +409,8 @@ static void migration_bitmap_sync(void)
 }
 }
 }
+rcu_read_unlock();
+
 trace_migration_bitmap_sync_end(migration_dirty_pages
 - num_dirty_pages_init);
 num_dirty_pages_period += migration_dirty_pages - num_dirty_pages_init;
@@ -446,6 +450,8 @@ static void migration_bitmap_sync(void)
  *
  * Returns:  The number of bytes written.
  *   0 means no dirty pages
+ *
+ * assumes that the caller has rcu-locked the ram_list
  */
 
 static int ram_save_block(QEMUFile *f, bool last_stage)
@@ -457,8 +463,9 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
 MemoryRegion *mr;
 ram_addr_t current_addr;
 
+
 if (!block)
-block = QTAILQ_FIRST(&ram_list.blocks);
+block = QLIST_FIRST_RCU(&ram_list.blocks);
 
 while (true) {
 mr = block->mr;
@@ -469,9 +476,9 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
 }
 if (offset >= block->length) {
 offset = 0;
-block = QTAILQ_NEXT(block, next);
+block = QLIST_NEXT_RCU(block, next);
 if (!block) {
-block = QTAILQ_FIRST(&ram_list.blocks);
+block = QLIST_FIRST_RCU(&ram_list.blocks);
 complete_round = true;
 ram_bulk_stage = false;
 }
@@ -526,9 +533,9 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
 }
 }
 }
+
 last_seen_block = block;
 last_offset = offset;
-
 return bytes_sent;
 }
 
@@ -565,10 +572,10 @@ uint64_t ram_bytes_total(void)
 {
 RAMBlock *block;
 uint64_t total = 0;
-
-QTAILQ_FOREACH(block, &ram_list.blocks, next)
+rcu_read_lock();
+QLIST_FOREACH_RCU(block, &ram_list.blocks, next)
 total += block->length;
-
+rcu_read_unlock();
 return total;
 }
 
@@ -602,10 +609,20 @@ static void reset_ram_globals(void)
 last_offset = 0;
 last_version = ram_list.version;
 ram_bulk_stage = true;
+smp_wmb();
 }
 
 #define MAX_WAIT 50 /* ms, half buffered_file limit */
 
+
+/* ram_save_* functions each  implement a long-running RCU critical
+ * section. When rcu-reclaims in the code start to become numerous
+ * it will be necessary to reduce the granularity of these critical
+ * sections.
+ *
+ * (ram_save_setup, ram_save_iterate, and ram_save_complete.)
+ */
+
 static int ram_save_setup(QEMUFile *f, void *opaque)
 {
 RAMBlock *block;
@@ -631,7 +648,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
 }
 
 qemu_mutex_lock_iothread();
-qemu_mutex_lock_ramlist();
+rcu_read_lock();
 bytes_transferred = 0;
 reset_ram_globals();
 
@@ -641,13 +658,13 @@ static 

[Qemu-devel] [RFC PATCH] Convert ram_list to RCU DQ V4

2013-09-04 Thread Mike Day
* now passes virt-test -t qemu
* uses call_rcu instead of call_rcu1
* completely removed the ram_list mutex and its locking functions
* cleaned up some comments

Changes from V3:

* added rcu reclaim function to free ram blocks
* reduced the number of comments
* made rcu locking policy consistent for different caller scenarios
* rcu updates to ram_block now are protected only by the iothread mutex

Changes from V1:

* Omitted locks or rcu critical sections within Some functions that
  read or write the ram_list but are called in a protected context
  (the caller holds the iothread lock, the ram_list mutex, or an rcu
  critical section).

Allow "unlocked" reads of the ram_list by using an RCU-enabled
DQ. Most readers of the list no longer require holding the list mutex.

The ram_list now uses a QLIST instead of a QTAILQ. The difference is
minimal.

This patch has been built and make-checked for the x86_64, ppc64,
s390x, and arm targets. It has been virt-tested using KVM -t qemu and
passes 15/15 migration tests.

To apply this patch, you must base upon Paolo Bonzini's rcu tree and
also apply the RCU DQ patch (below).

https://github.com/bonzini/qemu/tree/rcu
http://article.gmane.org/gmane.comp.emulators.qemu/230159/

Signed-off-by: Mike Day 
---
 arch_init.c  |  96 +---
 exec.c   | 162 +--
 include/exec/cpu-all.h   |  13 ++--
 include/qemu/rcu_queue.h |   8 +++
 4 files changed, 160 insertions(+), 119 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 68a7ab7..76ef5c9 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -49,6 +49,7 @@
 #include "trace.h"
 #include "exec/cpu-all.h"
 #include "hw/acpi/acpi.h"
+#include "qemu/rcu_queue.h"
 
 #ifdef DEBUG_ARCH_INIT
 #define DPRINTF(fmt, ...) \
@@ -398,7 +399,8 @@ static void migration_bitmap_sync(void)
 trace_migration_bitmap_sync_start();
 address_space_sync_dirty_bitmap(&address_space_memory);
 
-QTAILQ_FOREACH(block, &ram_list.blocks, next) {
+rcu_read_lock();
+QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
 for (addr = 0; addr < block->length; addr += TARGET_PAGE_SIZE) {
 if (memory_region_test_and_clear_dirty(block->mr,
addr, TARGET_PAGE_SIZE,
@@ -407,6 +409,8 @@ static void migration_bitmap_sync(void)
 }
 }
 }
+rcu_read_unlock();
+
 trace_migration_bitmap_sync_end(migration_dirty_pages
 - num_dirty_pages_init);
 num_dirty_pages_period += migration_dirty_pages - num_dirty_pages_init;
@@ -446,6 +450,8 @@ static void migration_bitmap_sync(void)
  *
  * Returns:  The number of bytes written.
  *   0 means no dirty pages
+ *
+ * assumes that the caller has rcu-locked the ram_list
  */
 
 static int ram_save_block(QEMUFile *f, bool last_stage)
@@ -458,7 +464,7 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
 ram_addr_t current_addr;
 
 if (!block)
-block = QTAILQ_FIRST(&ram_list.blocks);
+block = QLIST_FIRST_RCU(&ram_list.blocks);
 
 while (true) {
 mr = block->mr;
@@ -469,9 +475,9 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
 }
 if (offset >= block->length) {
 offset = 0;
-block = QTAILQ_NEXT(block, next);
+block = QLIST_NEXT_RCU(block, next);
 if (!block) {
-block = QTAILQ_FIRST(&ram_list.blocks);
+block = QLIST_FIRST_RCU(&ram_list.blocks);
 complete_round = true;
 ram_bulk_stage = false;
 }
@@ -565,10 +571,10 @@ uint64_t ram_bytes_total(void)
 {
 RAMBlock *block;
 uint64_t total = 0;
-
-QTAILQ_FOREACH(block, &ram_list.blocks, next)
+rcu_read_lock();
+QLIST_FOREACH_RCU(block, &ram_list.blocks, next)
 total += block->length;
-
+rcu_read_unlock();
 return total;
 }
 
@@ -602,10 +608,20 @@ static void reset_ram_globals(void)
 last_offset = 0;
 last_version = ram_list.version;
 ram_bulk_stage = true;
+smp_wmb();
 }
 
 #define MAX_WAIT 50 /* ms, half buffered_file limit */
 
+
+/* ram_save_* functions each  implement a long-running RCU critical
+ * section. When rcu-reclaims in the code start to become numerous
+ * it will be necessary to reduce the granularity of these critical
+ * sections.
+ *
+ * (ram_save_setup, ram_save_iterate, and ram_save_complete.)
+ */
+
 static int ram_save_setup(QEMUFile *f, void *opaque)
 {
 RAMBlock *block;
@@ -631,7 +647,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
 }
 
 qemu_mutex_lock_iothread();
-qemu_mutex_lock_ramlist();
+rcu_read_lock();
 bytes_transferred = 0;
 reset_ram_globals();
 
@@ -641,13 +657,13 @@ static int ram_save_setup(QEMUFile *f

Re: [Qemu-devel] [RFC PATCH] Convert ram_list to RCU DQ V4

2013-09-04 Thread Mike Day
On Wed, Sep 4, 2013 at 3:58 PM, Paolo Bonzini  wrote:
>
> > @@ -1323,23 +1325,21 @@ static RAMBlock *qemu_get_ram_block(ram_addr_t
addr)
> >  {
> >  RAMBlock *block;
> >
> > -/* The list is protected by the iothread lock here.  */
> > +/* This assumes the iothread lock is taken here too. */
> > +
> >  block = ram_list.mru_block;
> >  if (block && addr - block->offset < block->length) {
> > -goto found;
> > +return block;
> >  }
> > -QTAILQ_FOREACH(block, &ram_list.blocks, next) {
> > +QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> >  if (addr - block->offset < block->length) {
> > -goto found;
> > +atomic_rcu_set(&ram_list.mru_block, block);
>
> I think this is not needed, block was already published into the block
> list.  What is important is to order mru_block == NULL so that it
> happens before the node is removed.  Which we don't do, but we can fix
> later.

I am thinking of writing some macros and possibly reorganizing the ram
globals into a struct so that we can update it by exchanging pointers
atomically. It seems to disjoint right and error-prone now that we are
making it rcu-enabled.

thanks!

Mike


Re: [Qemu-devel] [PATCH 08/17] mm: madvise MADV_USERFAULT

2014-10-03 Thread Mike Hommey
On Fri, Oct 03, 2014 at 07:07:58PM +0200, Andrea Arcangeli wrote:
> MADV_USERFAULT is a new madvise flag that will set VM_USERFAULT in the
> vma flags. Whenever VM_USERFAULT is set in an anonymous vma, if
> userland touches a still unmapped virtual address, a sigbus signal is
> sent instead of allocating a new page. The sigbus signal handler will
> then resolve the page fault in userland by calling the
> remap_anon_pages syscall.

What does "unmapped virtual address" mean in this context?

Mike



Re: [Qemu-devel] [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Bug 1207686]

2013-08-05 Thread Mike Dawson

Josh,

Logs are uploaded to cephdrop with the file name 
mikedawson-rbd-qemu-deadlock.


- At about 2013-08-05 19:46 or 47, we hit the issue, traffic went to 0
- At about 2013-08-05 19:53:51, ran a 'virsh screenshot'


Environment is:

- Ceph 0.61.7 (client is co-mingled with three OSDs)
- rbd cache = true and cache=writeback
- qemu 1.4.0 1.4.0+dfsg-1expubuntu4
- Ubuntu Raring with 3.8.0-25-generic

This issue is reproducible in my environment, and I'm willing to run any 
wip branch you need. What else can I provide to help?


Thanks,
Mike Dawson


On 8/5/2013 3:48 AM, Stefan Hajnoczi wrote:

On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote:

Am 02.08.2013 um 23:47 schrieb Mike Dawson :

We can "un-wedge" the guest by opening a NoVNC session or running a 'virsh 
screenshot' command. After that, the guest resumes and runs as expected. At that point we 
can examine the guest. Each time we'll see:


If virsh screenshot works then this confirms that QEMU itself is still
responding.  Its main loop cannot be blocked since it was able to
process the screendump command.

This supports Josh's theory that a callback is not being invoked.  The
virtio-blk I/O request would be left in a pending state.

Now here is where the behavior varies between configurations:

On a Windows guest with 1 vCPU, you may see the symptom that the guest no
longer responds to ping.

On a Linux guest with multiple vCPUs, you may see the hung task message
from the guest kernel because other vCPUs are still making progress.
Just the vCPU that issued the I/O request and whose task is in
UNINTERRUPTIBLE state would really be stuck.

Basically, the symptoms depend not just on how QEMU is behaving but also
on the guest kernel and how many vCPUs you have configured.

I think this can explain how both problems you are observing, Oliver and
Mike, are a result of the same bug.  At least I hope they are :).

Stefan





[Qemu-devel] [PATCH] block: Bugfix 'format' and 'snapshot' used in drive option

2013-08-08 Thread Mike Qiu
When use -drive file='xxx',format=qcow2,snapshot=on the error
message "Can't use snapshot=on with driver-specific options"
can be show, and fail to start the qemu.

This should not be happened, and there is no file.driver option
in qemu command line.

It is because the commit 74fe54f2a1b5c4f4498a8fe521e1dfc936656cd4,
it puts 'driver' option if the command line use 'format' option.

This patch is to solve this bug.

Signed-off-by: Mike Qiu 
---
 blockdev.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 41b0a49..e174b7d 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -340,6 +340,7 @@ static DriveInfo *blockdev_init(QemuOpts *all_opts,
 QDict *bs_opts;
 const char *id;
 bool has_driver_specific_opts;
+BlockDriver *drv = NULL;
 
 translation = BIOS_ATA_TRANSLATION_AUTO;
 media = MEDIA_DISK;
@@ -485,7 +486,11 @@ static DriveInfo *blockdev_init(QemuOpts *all_opts,
 return NULL;
 }
 
-qdict_put(bs_opts, "driver", qstring_from_str(buf));
+drv = bdrv_find_whitelisted_format(buf, ro);
+if (!drv) {
+error_report("'%s' invalid format", buf);
+return NULL;
+}
 }
 
 /* disk I/O throttling */
@@ -700,12 +705,13 @@ static DriveInfo *blockdev_init(QemuOpts *all_opts,
 }
 
 QINCREF(bs_opts);
-ret = bdrv_open(dinfo->bdrv, file, bs_opts, bdrv_flags, NULL);
+ret = bdrv_open(dinfo->bdrv, file, bs_opts, bdrv_flags, drv);
 
 if (ret < 0) {
 if (ret == -EMEDIUMTYPE) {
 error_report("could not open disk image %s: not in %s format",
- file ?: dinfo->id, qdict_get_str(bs_opts, "driver"));
+ file ?: dinfo->id, drv ? drv->format_name :
+ qdict_get_str(bs_opts, "driver"));
 } else {
 error_report("could not open disk image %s: %s",
  file ?: dinfo->id, strerror(-ret));
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH] block: Bugfix 'format' and 'snapshot' used in drive option

2013-08-08 Thread Mike Qiu
Hi all

This 'bug' also happens use -snapshot,

But as newer to qemu,
I'm not sure about what am I understanding.

But I think snapshot should works at least in qcow2.

My understanding about the driver option is mainly for
'nbd' and 'http/https/ftp', and so in commit:
c2ad1b0c465a9ea8375eaff14bbd85705c673f73
So when 'driver' option used, the snapshot can't
be using.

As this understanding, my first patch is try to
get the the value of key==driver, and if the
value in 'nbd' and 'http/https/ftp', then
give out that error messages.

So give me some comments and sugguestions, pls.

Thanks
Mike

2013/8/8 22:45, Mike Qiu wrote:
> When use -drive file='xxx',format=qcow2,snapshot=on the error
> message "Can't use snapshot=on with driver-specific options"
> can be show, and fail to start the qemu.
>
> This should not be happened, and there is no file.driver option
> in qemu command line.
>
> It is because the commit 74fe54f2a1b5c4f4498a8fe521e1dfc936656cd4,
> it puts 'driver' option if the command line use 'format' option.
>
> This patch is to solve this bug.
>
> Signed-off-by: Mike Qiu 
> ---
>  blockdev.c | 12 +---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/blockdev.c b/blockdev.c
> index 41b0a49..e174b7d 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -340,6 +340,7 @@ static DriveInfo *blockdev_init(QemuOpts *all_opts,
>  QDict *bs_opts;
>  const char *id;
>  bool has_driver_specific_opts;
> +BlockDriver *drv = NULL;
>
>  translation = BIOS_ATA_TRANSLATION_AUTO;
>  media = MEDIA_DISK;
> @@ -485,7 +486,11 @@ static DriveInfo *blockdev_init(QemuOpts *all_opts,
>  return NULL;
>  }
>
> -qdict_put(bs_opts, "driver", qstring_from_str(buf));
> +drv = bdrv_find_whitelisted_format(buf, ro);
> +if (!drv) {
> +error_report("'%s' invalid format", buf);
> +return NULL;
> +}
>  }
>
>  /* disk I/O throttling */
> @@ -700,12 +705,13 @@ static DriveInfo *blockdev_init(QemuOpts *all_opts,
>  }
>
>  QINCREF(bs_opts);
> -ret = bdrv_open(dinfo->bdrv, file, bs_opts, bdrv_flags, NULL);
> +ret = bdrv_open(dinfo->bdrv, file, bs_opts, bdrv_flags, drv);
>
>  if (ret < 0) {
>  if (ret == -EMEDIUMTYPE) {
>  error_report("could not open disk image %s: not in %s format",
> - file ?: dinfo->id, qdict_get_str(bs_opts, 
> "driver"));
> + file ?: dinfo->id, drv ? drv->format_name :
> + qdict_get_str(bs_opts, "driver"));
>  } else {
>  error_report("could not open disk image %s: %s",
>   file ?: dinfo->id, strerror(-ret));




[Qemu-devel] [RFC PATCH 02/14] rcu: add rcu library

2013-08-14 Thread Mike Day
From: Paolo Bonzini 

This includes a (mangled) copy of the urcu-qsbr code from liburcu.
The main changes are: 1) removing dependencies on many other header files
in liburcu; 2) removing for simplicity the tentative busy waiting in
synchronize_rcu, which has limited performance effects; 3) replacing
futexes in synchronize_rcu with QemuEvents for Win32 portability.
The API is the same as liburcu, so it should be possible in the future
to require liburcu on POSIX systems for example and use our copy only
on Windows.

Among the various versions available I chose urcu-qsbr, which has the
fastest rcu_read_{lock,unlock} but requires the program to manually
annotate quiescent points or intervals.  QEMU threads usually have easily
identified quiescent periods, so this should not be a problem.

Signed-off-by: Paolo Bonzini 
Reviewed-by: Mike Day 
---
 docs/rcu.txt   | 301 +
 hw/9pfs/virtio-9p-synth.c  |   1 +
 include/qemu/queue.h   |  13 ++
 include/qemu/rcu-pointer.h | 110 +
 include/qemu/rcu.h | 168 +
 include/qemu/thread.h  |   3 -
 libcacard/Makefile |   3 +-
 util/Makefile.objs |   1 +
 util/rcu.c | 203 ++
 9 files changed, 799 insertions(+), 4 deletions(-)
 create mode 100644 docs/rcu.txt
 create mode 100644 include/qemu/rcu-pointer.h
 create mode 100644 include/qemu/rcu.h
 create mode 100644 util/rcu.c

diff --git a/docs/rcu.txt b/docs/rcu.txt
new file mode 100644
index 000..19e4840
--- /dev/null
+++ b/docs/rcu.txt
@@ -0,0 +1,301 @@
+Using RCU (Read-Copy-Update) for synchronization
+
+
+Read-copy update (RCU) is a synchronization mechanism that is used to
+protect read-mostly data structures.  RCU is very efficient and scalable
+on the read side (it is wait-free), and thus can make the read paths
+extremely fast.
+
+RCU supports concurrency between a single writer and multiple readers,
+thus it is not used alone.  Typically, the write-side will use a lock to
+serialize multiple updates, but other approaches are possible (e.g.,
+restricting updates to a single task).  In QEMU, when a lock is used,
+this will often be the "iothread mutex", also known as the "big QEMU
+lock" (BQL).  Also, restricting updates to a single task is done in
+QEMU using the "bottom half" API.
+
+RCU is fundamentally a "wait-to-finish" mechanism.  The read side marks
+sections of code with "critical sections", and the update side will wait
+for the execution of all *currently running* critical sections before
+proceeding, or before asynchronously executing a callback.
+
+The key point here is that only the currently running critical sections
+are waited for; critical sections that are started _after_ the beginning
+of the wait do not extend the wait, despite running concurrently with
+the updater.  This is the reason why RCU is more scalable than,
+for example, reader-writer locks.  It is so much more scalable that
+the system will have a single instance of the RCU mechanism; a single
+mechanism can be used for an arbitrary number of "things", without
+having to worry about things such as contention or deadlocks.
+
+How is this possible?  The basic idea is to split updates in two phases,
+"removal" and "reclamation".  During removal, we ensure that subsequent
+readers will not be able to get a reference to the old data.  After
+removal has completed, a critical section will not be able to access
+the old data.  Therefore, critical sections that begin after removal
+do not matter; as soon as all previous critical sections have finished,
+there cannot be any readers who hold references to the data structure,
+which may not be safely reclaimed (e.g., freed or unref'ed).
+
+Here is a picutre:
+
+thread 1  thread 2  thread 3
+------
+enter RCU crit.sec.
+   |finish removal phase
+   |begin wait
+   |  |enter RCU crit.sec.
+exit RCU crit.sec |   |
+complete wait |
+begin reclamation phase   |
+   exit RCU crit.sec.
+
+
+Note how thread 3 is still executing its critical section when thread 2
+starts reclaiming data.  This is possible, because the old version of the
+data structure was not accessible at the time thread 3 began executing
+that critical section.
+
+
+RCU API
+===
+
+The core RCU API is small:
+
+ void rcu_read_lock(void);
+
+Used by a reader to inform the reclaimer that the reader is
+entering an RCU read-side critical s

[Qemu-devel] [RFC PATCH 05/14] rcu: add call_rcu

2013-08-14 Thread Mike Day
From: Paolo Bonzini 

Signed-off-by: Paolo Bonzini 
Reviewed-by: Mike Day 
---
 docs/rcu.txt   | 108 +--
 include/qemu/rcu.h |  22 ++
 util/rcu.c | 120 +
 3 files changed, 246 insertions(+), 4 deletions(-)

diff --git a/docs/rcu.txt b/docs/rcu.txt
index 5736676..d7c4f0b 100644
--- a/docs/rcu.txt
+++ b/docs/rcu.txt
@@ -82,7 +82,50 @@ The core RCU API is small:
 Note that it would be valid for another update to come while
 synchronize_rcu is running.  Because of this, it is better that
 the updater releases any locks it may hold before calling
-synchronize_rcu.
+synchronize_rcu.  If this is not possible (for example, because
+the updater is protected by the BQL), you can use call_rcu.
+
+ void call_rcu1(struct rcu_head * head,
+void (*func)(struct rcu_head *head));
+
+This function invokes func(head) after all pre-existing RCU
+read-side critical sections on all threads have completed.  This
+marks the end of the removal phase, with func taking care
+asynchronously of the reclamation phase.
+
+The foo struct needs to have an rcu_head structure added,
+perhaps as follows:
+
+struct foo {
+struct rcu_head rcu;
+int a;
+char b;
+long c;
+};
+
+so that the reclaimer function can fetch the struct foo address
+and free it:
+
+call_rcu1(foo_reclaim, &foo.rcu);
+
+void foo_reclaim(struct rcu_head *rp)
+{
+struct foo *fp = container_of(rp, struct foo, rcu);
+g_free(fp);
+}
+
+For the common case where the rcu_head member is the first of the
+struct, you can use the following macro.
+
+ void call_rcu(T *p,
+   void (*func)(T *p),
+   field-name);
+
+call_rcu1 is typically used through this macro, in the common case
+where the "struct rcu_head" is the first field in the struct.  In
+the above case, one could have written simply:
+
+call_rcu(foo_reclaim, g_free, rcu);
 
  typeof(*p) rcu_dereference(p);
  typeof(p) rcu_assign_pointer(p, typeof(p) v);
@@ -173,6 +216,11 @@ DIFFERENCES WITH LINUX
 - rcu_dereference takes a _pointer_ to the variable being accessed.
   Wrong usage will be detected by the compiler.
 
+- call_rcu is a macro that has an extra argument (the name of the first
+  field in the struct, which must be a struct rcu_head), and expects the
+  type of the callback's argument to be the type of the first argument.
+  call_rcu1 is the same as Linux's call_rcu.
+
 - Quiescent points must be marked explicitly unless the thread uses
   condvars/semaphores/events for synchronization.
 
@@ -229,7 +277,47 @@ The write side looks simply like this (with appropriate 
locking):
 synchronize_rcu();
 free(old);
 
-Note that the same idiom would be possible with reader/writer
+If the processing cannot be done purely within the critical section, it
+is possible to combine this idiom with a "real" reference count:
+
+rcu_read_lock();
+p = rcu_dereference(&foo);
+foo_ref(p);
+rcu_read_unlock();
+/* do something with p. */
+foo_unref(p);
+
+The write side can be like this:
+
+qemu_mutex_lock(&foo_mutex);
+old = foo;
+rcu_assign_pointer(foo, new);
+qemu_mutex_unlock(&foo_mutex);
+synchronize_rcu();
+foo_unref(old);
+
+or with call_rcu:
+
+qemu_mutex_lock(&foo_mutex);
+old = foo;
+rcu_assign_pointer(foo, new);
+qemu_mutex_unlock(&foo_mutex);
+call_rcu(foo_unref, old, rcu);
+
+In both cases, the write side only performs removal.  Reclamation
+happens when the last reference to a "foo" object is dropped.
+Using synchronize_rcu() is undesirably expensive, because the
+last reference may be dropped on the read side.  Hence you can
+use call_rcu() instead:
+
+ foo_unref(struct foo *p) {
+if (atomic_dec(&p->refcount) == 0) {
+call_rcu(foo_destroy, p, rcu);
+}
+}
+
+
+Note that the same idioms would be possible with reader/writer
 locks:
 
 read_lock(&foo_rwlock); write_mutex_lock(&foo_rwlock);
@@ -239,13 +327,25 @@ locks:
 write_mutex_unlock(&foo_rwlock);
 free(p);
 
+--
+
+read_lock(&foo_rwlock); write_mutex_lock(&foo_rwlock);
+p = foo;old = foo;
+foo_ref(p); foo = new;
+read_unlock(&foo_rwlock);   write_mutex_unlock(&foo_rwlock);
+/* do something with p. */  foo_unref(old);
+foo_unref(

[Qemu-devel] [RFC PATCH 00/14] RCU Implementation for Qemu

2013-08-14 Thread Mike Day
This is a rebase of Paolo's May patchset on v1.6.0-rc3 

The tree is availavle on github:

https://github.com/ncultra/qemu/tree/rcu-for-1.7

Mike Day (3):
  fix #include directive for rcu header
  include osdep.h for definition of glue(a,b)
  fix pointer reference to rcu_assign_pointer

Paolo Bonzini (11):
  qemu-thread: add QemuEvent
  rcu: add rcu library
  qemu-thread: register threads with RCU
  rcu: add call_rcu
  rcu: add rcutorture
  rcu: allow nested calls to rcu_thread_offline/rcu_thread_online
  qemu-thread: report RCU quiescent states
  event loop: report RCU quiescent states
  cpus: report RCU quiescent states
  block: report RCU quiescent states
  migration: report RCU quiescent states

 aio-posix.c |   9 +-
 aio-win32.c |   7 +
 block/raw-posix.c   |   3 +
 block/raw-win32.c   |   3 +
 cpus.c  |   3 +
 docs/rcu.txt| 434 +++
 hw/9pfs/virtio-9p-synth.c   |   1 +
 include/qemu/queue.h|  13 ++
 include/qemu/rcu-pointer.h  | 110 +++
 include/qemu/rcu.h  | 208 +
 include/qemu/thread-posix.h |   8 +
 include/qemu/thread-win32.h |   4 +
 include/qemu/thread.h   |  10 +-
 kvm-all.c   |   3 +
 libcacard/Makefile  |   3 +-
 main-loop.c |   7 +-
 migration.c |   2 +
 tests/Makefile  |   4 +-
 tests/rcutorture.c  | 439 
 util/Makefile.objs  |   1 +
 util/qemu-thread-posix.c| 173 -
 util/qemu-thread-win32.c|  44 -
 util/rcu.c  | 320 
 23 files changed, 1796 insertions(+), 13 deletions(-)
 create mode 100644 docs/rcu.txt
 create mode 100644 include/qemu/rcu-pointer.h
 create mode 100644 include/qemu/rcu.h
 create mode 100644 tests/rcutorture.c
 create mode 100644 util/rcu.c

-- 
1.8.3.1




[Qemu-devel] [RFC PATCH 04/14] qemu-thread: register threads with RCU

2013-08-14 Thread Mike Day
From: Paolo Bonzini 

Signed-off-by: Paolo Bonzini 
Reviewed-by: Mike Day 
---
 docs/rcu.txt | 13 +++--
 util/qemu-thread-posix.c | 28 +++-
 util/qemu-thread-win32.c |  2 ++
 3 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/docs/rcu.txt b/docs/rcu.txt
index 19e4840..5736676 100644
--- a/docs/rcu.txt
+++ b/docs/rcu.txt
@@ -122,8 +122,8 @@ on many POSIX systems other than Linux and Solaris).
 
 For this reason, QEMU's RCU implementation resorts to manual annotation
 of "quiescent states", i.e. points where no RCU read-side critical
-section can be active.  All threads that participate in the RCU mechanism
-need to annotate such points.
+section can be active.  All threads created with qemu_thread_create
+participate in the RCU mechanism and need to annotate such points.
 
 Marking quiescent states is done with the following three APIs:
 
@@ -144,8 +144,8 @@ Marking quiescent states is done with the following three 
APIs:
 thread.
 
 
-Furthermore, threads that participate in the RCU mechanism must communicate
-this fact using the following APIs:
+The following APIs can be used to use RCU in a thread that is not
+created with qemu_thread_create():
 
  void rcu_register_thread(void);
 
@@ -160,8 +160,9 @@ this fact using the following APIs:
 either manually or by using the QemuCond/QemuSemaphore/QemuEvent
 APIs.
 
-Note that these APIs are relatively heavyweight, and should _not_ be
-nested.
+Note that these APIs are relatively heavyweight, should _not_ be
+nested, and should not be called in threads that are created with
+qemu_thread_create().
 
 
 DIFFERENCES WITH LINUX
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 37dd298..2371176 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -26,6 +26,7 @@
 #endif
 #include "qemu/thread.h"
 #include "qemu/atomic.h"
+#include "qemu/rcu.h"
 
 static void error_exit(int err, const char *msg)
 {
@@ -388,6 +389,26 @@ void qemu_event_wait(QemuEvent *ev)
 }
 
 
+typedef struct QemuThreadData {
+/* Passed to win32_start_routine.  */
+void *(*start_routine)(void *);
+void *arg;
+} QemuThreadData;
+
+static void *thread_start_routine(void *arg)
+{
+QemuThreadData *data = (QemuThreadData *) arg;
+void *(*start_routine)(void *) = data->start_routine;
+void *thread_arg = data->arg;
+void *ret;
+
+rcu_register_thread();
+g_free(data);
+ret = start_routine(thread_arg);
+rcu_unregister_thread();
+return ret;
+}
+
 void qemu_thread_create(QemuThread *thread,
void *(*start_routine)(void*),
void *arg, int mode)
@@ -395,6 +416,11 @@ void qemu_thread_create(QemuThread *thread,
 sigset_t set, oldset;
 int err;
 pthread_attr_t attr;
+QemuThreadData *data;
+
+data = g_malloc(sizeof(*data));
+data->start_routine = start_routine;
+data->arg = arg;
 
 err = pthread_attr_init(&attr);
 if (err) {
@@ -410,7 +436,7 @@ void qemu_thread_create(QemuThread *thread,
 /* Leave signal handling to the iothread.  */
 sigfillset(&set);
 pthread_sigmask(SIG_SETMASK, &set, &oldset);
-err = pthread_create(&thread->thread, &attr, start_routine, arg);
+err = pthread_create(&thread->thread, &attr, thread_start_routine, data);
 if (err)
 error_exit(err, __func__);
 
diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
index 27a5217..0c4850d 100644
--- a/util/qemu-thread-win32.c
+++ b/util/qemu-thread-win32.c
@@ -278,6 +278,7 @@ static unsigned __stdcall win32_start_routine(void *arg)
 data = NULL;
 }
 qemu_thread_data = data;
+rcu_register_thread();
 qemu_thread_exit(start_routine(thread_arg));
 abort();
 }
@@ -293,6 +294,7 @@ void qemu_thread_exit(void *arg)
 data->exited = true;
 LeaveCriticalSection(&data->cs);
 }
+rcu_unregister_thread();
 _endthreadex(0);
 }
 
-- 
1.8.3.1




[Qemu-devel] [RFC PATCH 08/14] qemu-thread: report RCU quiescent states

2013-08-14 Thread Mike Day
From: Paolo Bonzini 

Most threads will use mutexes and other sleeping synchronization primitives
(condition variables, semaphores, events) periodically.  For these threads,
the synchronization primitives are natural places to report a quiescent
state (possibly an extended one).

Signed-off-by: Paolo Bonzini 
Reviewed-by: Mike Day 
---
 docs/rcu.txt | 27 +++
 util/qemu-thread-posix.c | 29 +
 util/qemu-thread-win32.c | 16 +++-
 util/rcu.c   |  3 ---
 4 files changed, 67 insertions(+), 8 deletions(-)

diff --git a/docs/rcu.txt b/docs/rcu.txt
index 4e7cde3..38fd8f4 100644
--- a/docs/rcu.txt
+++ b/docs/rcu.txt
@@ -168,6 +168,33 @@ of "quiescent states", i.e. points where no RCU read-side 
critical
 section can be active.  All threads created with qemu_thread_create
 participate in the RCU mechanism and need to annotate such points.
 
+Luckily, in most cases no manual annotation is needed, because waiting
+on condition variables (qemu_cond_wait), semaphores (qemu_sem_wait,
+qemu_sem_timedwait) or events (qemu_event_wait) implicitly marks the thread
+as quiescent for the whole duration of the wait.  (There is an exception
+for semaphore waits with a zero timeout).
+
+Manual annotation is still needed in the following cases:
+
+- threads that spend their sleeping time in the kernel, for example
+  in a call to select(), poll() or WaitForMultipleObjects().  The QEMU
+  I/O thread is an example of this case.
+
+- threads that perform a lot of I/O.  In QEMU, the workers used for
+  aio=thread are an example of this case (see aio_worker in block/raw-*).
+
+- threads that run continuously until they exit.  The migration thread
+  is an example of this case.
+
+Regarding the second case, note that the workers run in the QEMU thread
+pool.  The thread pool uses semaphores for synchronization, hence it does
+report quiescent states periodically.  However, in some cases (e.g. NFS
+mounted with the "hard" option) the workers can take an arbitrarily long
+amount of time.  When this happens, synchronize_rcu() will not exit and
+call_rcu() callbacks will be delayed arbitrarily.  It is therefore a
+good idea to mark I/O system calls as quiescence points in the worker
+functions.
+
 Marking quiescent states is done with the following three APIs:
 
  void rcu_quiescent_state(void);
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 2371176..dbefff8 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -119,7 +119,9 @@ void qemu_cond_wait(QemuCond *cond, QemuMutex *mutex)
 {
 int err;
 
+rcu_thread_offline();
 err = pthread_cond_wait(&cond->cond, &mutex->lock);
+rcu_thread_online();
 if (err)
 error_exit(err, __func__);
 }
@@ -211,6 +213,10 @@ int qemu_sem_timedwait(QemuSemaphore *sem, int ms)
 int rc;
 struct timespec ts;
 
+if (ms) {
+rcu_thread_offline();
+}
+
 #if defined(__APPLE__) || defined(__NetBSD__)
 rc = 0;
 compute_abs_deadline(&ts, ms);
@@ -228,7 +234,10 @@ int qemu_sem_timedwait(QemuSemaphore *sem, int ms)
 --sem->count;
 }
 pthread_mutex_unlock(&sem->lock);
-return (rc == ETIMEDOUT ? -1 : 0);
+if (rc == ETIMEDOUT) {
+rc == -1;
+}
+
 #else
 if (ms <= 0) {
 /* This is cheaper than sem_timedwait.  */
@@ -236,7 +245,7 @@ int qemu_sem_timedwait(QemuSemaphore *sem, int ms)
 rc = sem_trywait(&sem->sem);
 } while (rc == -1 && errno == EINTR);
 if (rc == -1 && errno == EAGAIN) {
-return -1;
+goto out;
 }
 } else {
 compute_abs_deadline(&ts, ms);
@@ -244,19 +253,26 @@ int qemu_sem_timedwait(QemuSemaphore *sem, int ms)
 rc = sem_timedwait(&sem->sem, &ts);
 } while (rc == -1 && errno == EINTR);
 if (rc == -1 && errno == ETIMEDOUT) {
-return -1;
+goto out;
 }
 }
 if (rc < 0) {
 error_exit(errno, __func__);
 }
-return 0;
+
 #endif
+
+out:
+if (ms) {
+rcu_thread_online();
+}
+return rc;
 }
 
 void qemu_sem_wait(QemuSemaphore *sem)
 {
 int rc;
+rcu_thread_offline();
 
 #if defined(__APPLE__) || defined(__NetBSD__)
 pthread_mutex_lock(&sem->lock);
@@ -276,6 +292,7 @@ void qemu_sem_wait(QemuSemaphore *sem)
 error_exit(errno, __func__);
 }
 #endif
+rcu_thread_online();
 }
 
 #ifdef __linux__
@@ -384,7 +401,11 @@ void qemu_event_wait(QemuEvent *ev)
 return;
 }
 }
+rcu_thread_offline();
 futex_wait(ev, EV_BUSY);
+rcu_thread_online();
+} else {
+rcu_quiescent_state();
 }
 }
 
diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
index 0c4850d..7bc07b3 100644
--- a/util/qemu-thread-win32.c
+++ b/

[Qemu-devel] [RFC PATCH 07/14] rcu: allow nested calls to rcu_thread_offline/rcu_thread_online

2013-08-14 Thread Mike Day
From: Paolo Bonzini 

Signed-off-by: Paolo Bonzini 
Reviewed-by: Mike Day 
---
 docs/rcu.txt   |  5 +
 include/qemu/rcu.h | 21 +++--
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/docs/rcu.txt b/docs/rcu.txt
index d7c4f0b..4e7cde3 100644
--- a/docs/rcu.txt
+++ b/docs/rcu.txt
@@ -187,6 +187,11 @@ Marking quiescent states is done with the following three 
APIs:
 thread.
 
 
+rcu_thread_offline() and rcu_thread_online() can be nested.  The end of
+the extended quiescent state will coincide with the outermost call to
+rcu_thread_online().
+
+
 The following APIs can be used to use RCU in a thread that is not
 created with qemu_thread_create():
 
diff --git a/include/qemu/rcu.h b/include/qemu/rcu.h
index e43b912..3a55045 100644
--- a/include/qemu/rcu.h
+++ b/include/qemu/rcu.h
@@ -77,6 +77,9 @@ struct rcu_reader_data {
 unsigned long ctr;
 bool waiting;
 
+/* Data used by reader only */
+unsigned offline;
+
 /* Data used for registry, protected by rcu_gp_lock */
 QLIST_ENTRY(rcu_reader_data) node;
 };
@@ -127,9 +130,14 @@ static inline void rcu_read_unlock(void)
 static inline void rcu_quiescent_state(void)
 {
 struct rcu_reader_data *p_rcu_reader = get_rcu_reader();
+unsigned ctr;
+
+if (p_rcu_reader->offline > 0) {
+return;
+}
 
 /* Reuses smp_rmb() in the last rcu_read_unlock().  */
-unsigned ctr = atomic_read(&rcu_gp_ctr);
+ctr = atomic_read(&rcu_gp_ctr);
 atomic_xchg(&p_rcu_reader->ctr, ctr);
 if (atomic_read(&p_rcu_reader->waiting)) {
 atomic_set(&p_rcu_reader->waiting, false);
@@ -141,6 +149,10 @@ static inline void rcu_thread_offline(void)
 {
 struct rcu_reader_data *p_rcu_reader = get_rcu_reader();
 
+if (p_rcu_reader->offline++ > 0) {
+return;
+}
+
 atomic_xchg(&p_rcu_reader->ctr, 0);
 if (atomic_read(&p_rcu_reader->waiting)) {
 atomic_set(&p_rcu_reader->waiting, false);
@@ -150,7 +162,12 @@ static inline void rcu_thread_offline(void)
 
 static inline void rcu_thread_online(void)
 {
-rcu_quiescent_state();
+struct rcu_reader_data *p_rcu_reader = get_rcu_reader();
+
+assert(p_rcu_reader->offline != 0);
+if (--p_rcu_reader->offline == 0) {
+rcu_quiescent_state();
+}
 }
 
 extern void synchronize_rcu(void);
-- 
1.8.3.1




[Qemu-devel] [RFC PATCH 03/14] fix #include directive for rcu header

2013-08-14 Thread Mike Day
Signed-off-by: Mike Day 
---
 hw/9pfs/virtio-9p-synth.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/9pfs/virtio-9p-synth.c b/hw/9pfs/virtio-9p-synth.c
index d5f5842..fdfea21 100644
--- a/hw/9pfs/virtio-9p-synth.c
+++ b/hw/9pfs/virtio-9p-synth.c
@@ -17,7 +17,7 @@
 #include "virtio-9p-xattr.h"
 #include "fsdev/qemu-fsdev.h"
 #include "virtio-9p-synth.h"
-#include "util/rcu.h"
+#include "qemu/rcu.h"
 
 #include 
 
-- 
1.8.3.1




[Qemu-devel] [RFC PATCH 09/14] event loop: report RCU quiescent states

2013-08-14 Thread Mike Day
From: Paolo Bonzini 

Threads that run event loops also have places that can sleep for an extended
time.  Place an extended quiescent state there.

Signed-off-by: Paolo Bonzini 
Reviewed-by: Mike Day 
---
 aio-posix.c | 9 -
 aio-win32.c | 7 +++
 main-loop.c | 7 ++-
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/aio-posix.c b/aio-posix.c
index b68eccd..562add6 100644
--- a/aio-posix.c
+++ b/aio-posix.c
@@ -17,6 +17,7 @@
 #include "block/block.h"
 #include "qemu/queue.h"
 #include "qemu/sockets.h"
+#include "qemu/rcu.h"
 
 struct AioHandler
 {
@@ -232,10 +233,16 @@ bool aio_poll(AioContext *ctx, bool blocking)
 }
 
 /* wait until next event */
+if (blocking) {
+rcu_thread_offline();
+}
 ret = g_poll((GPollFD *)ctx->pollfds->data,
  ctx->pollfds->len,
  blocking ? -1 : 0);
-
+if (blocking) {
+rcu_thread_online();
+}
+
 /* if we have any readable fds, dispatch event */
 if (ret > 0) {
 QLIST_FOREACH(node, &ctx->aio_handlers, node) {
diff --git a/aio-win32.c b/aio-win32.c
index 38723bf..8a6abb0 100644
--- a/aio-win32.c
+++ b/aio-win32.c
@@ -175,7 +175,14 @@ bool aio_poll(AioContext *ctx, bool blocking)
 /* wait until next event */
 while (count > 0) {
 int timeout = blocking ? INFINITE : 0;
+
+if (timeout) {
+rcu_thread_offline();
+}
 int ret = WaitForMultipleObjects(count, events, FALSE, timeout);
+if (timeout) {
+rcu_thread_online();
+}
 
 /* if we have any signaled events, dispatch event */
 if ((DWORD) (ret - WAIT_OBJECT_0) >= count) {
diff --git a/main-loop.c b/main-loop.c
index a44fff6..79f43da 100644
--- a/main-loop.c
+++ b/main-loop.c
@@ -27,6 +27,7 @@
 #include "qemu/sockets.h"  // struct in_addr needed for libslirp.h
 #include "slirp/libslirp.h"
 #include "qemu/main-loop.h"
+#include "qemu/rcu.h"
 #include "block/aio.h"
 
 #ifndef _WIN32
@@ -220,6 +221,7 @@ static int os_host_main_loop_wait(uint32_t timeout)
 if (timeout > 0) {
 spin_counter = 0;
 qemu_mutex_unlock_iothread();
+   rcu_thread_offline();
 } else {
 spin_counter++;
 }
@@ -227,7 +229,8 @@ static int os_host_main_loop_wait(uint32_t timeout)
 ret = g_poll((GPollFD *)gpollfds->data, gpollfds->len, timeout);
 
 if (timeout > 0) {
-qemu_mutex_lock_iothread();
+rcu_thread_online();
+   qemu_mutex_lock_iothread();
 }
 
 glib_pollfds_poll();
@@ -424,7 +427,9 @@ static int os_host_main_loop_wait(uint32_t timeout)
 }
 
 qemu_mutex_unlock_iothread();
+rcu_thread_offline();
 g_poll_ret = g_poll(poll_fds, n_poll_fds + w->num, poll_timeout);
+rcu_thread_online();
 qemu_mutex_lock_iothread();
 if (g_poll_ret > 0) {
 for (i = 0; i < w->num; i++) {
-- 
1.8.3.1




[Qemu-devel] [PATCH 13/14] include osdep.h for definition of glue(a, b)

2013-08-14 Thread Mike Day
diff --git a/include/qemu/rcu.h b/include/qemu/rcu.h
index 3a55045..f398e6c 100644
--- a/include/qemu/rcu.h
+++ b/include/qemu/rcu.h
@@ -39,6 +39,7 @@
 #include "qemu/compiler.h"
 #include "qemu/rcu-pointer.h"
 #include "qemu/thread.h"
+#include "qemu/osdep.h"
 #include "qemu/queue.h"
 #include "qemu/atomic.h"
 
-- 
1.8.3.1




[Qemu-devel] [RFC PATCH 10/14] cpus: report RCU quiescent states

2013-08-14 Thread Mike Day
From: Paolo Bonzini 

CPU threads have extended quiescent states while relinquishing control
to the accelerator (except TCG).

Signed-off-by: Paolo Bonzini 
Reviewed-by: Mike Day 
---
 cpus.c| 3 +++
 kvm-all.c | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/cpus.c b/cpus.c
index 0f65e76..624658e 100644
--- a/cpus.c
+++ b/cpus.c
@@ -37,6 +37,7 @@
 #include "sysemu/qtest.h"
 #include "qemu/main-loop.h"
 #include "qemu/bitmap.h"
+#include "qemu/rcu.h"
 
 #ifndef _WIN32
 #include "qemu/compatfd.h"
@@ -818,6 +819,7 @@ static void *qemu_dummy_cpu_thread_fn(void *arg)
 while (1) {
 current_cpu = NULL;
 qemu_mutex_unlock_iothread();
+   rcu_thread_offline();
 do {
 int sig;
 r = sigwait(&waitset, &sig);
@@ -826,6 +828,7 @@ static void *qemu_dummy_cpu_thread_fn(void *arg)
 perror("sigwait");
 exit(1);
 }
+   rcu_thread_online();
 qemu_mutex_lock_iothread();
 current_cpu = cpu;
 qemu_wait_io_event_common(cpu);
diff --git a/kvm-all.c b/kvm-all.c
index 716860f..a47573d 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -33,6 +33,7 @@
 #include "exec/memory.h"
 #include "exec/address-spaces.h"
 #include "qemu/event_notifier.h"
+#include "qemu/rcu.h"
 #include "trace.h"
 
 /* This check must be after config-host.h is included */
@@ -1641,7 +1642,9 @@ int kvm_cpu_exec(CPUState *cpu)
 }
 qemu_mutex_unlock_iothread();
 
+   rcu_thread_offline();
 run_ret = kvm_vcpu_ioctl(cpu, KVM_RUN, 0);
+   rcu_thread_online();
 
 qemu_mutex_lock_iothread();
 kvm_arch_post_run(cpu, run);
-- 
1.8.3.1




[Qemu-devel] [RFC PATCH 11/14] block: report RCU quiescent states

2013-08-14 Thread Mike Day
From: Paolo Bonzini 

The aio workers may spend a long time executing I/O operations;
mark that time as an extended quiescent state.

Signed-off-by: Paolo Bonzini 
Reviewed-by: Mike Day 
---
 block/raw-posix.c | 3 +++
 block/raw-win32.c | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index ba721d3..962833e 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -26,6 +26,7 @@
 #include "qemu/log.h"
 #include "block/block_int.h"
 #include "qemu/module.h"
+#include "qemu/rcu.h"
 #include "trace.h"
 #include "block/thread-pool.h"
 #include "qemu/iov.h"
@@ -735,6 +736,7 @@ static int aio_worker(void *arg)
 RawPosixAIOData *aiocb = arg;
 ssize_t ret = 0;
 
+rcu_thread_offline();
 switch (aiocb->aio_type & QEMU_AIO_TYPE_MASK) {
 case QEMU_AIO_READ:
 ret = handle_aiocb_rw(aiocb);
@@ -774,6 +776,7 @@ static int aio_worker(void *arg)
 }
 
 g_slice_free(RawPosixAIOData, aiocb);
+rcu_thread_online();
 return ret;
 }
 
diff --git a/block/raw-win32.c b/block/raw-win32.c
index 9b5b2af..5f88452 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -25,6 +25,7 @@
 #include "qemu/timer.h"
 #include "block/block_int.h"
 #include "qemu/module.h"
+#include "qemu/rcu.h"
 #include "raw-aio.h"
 #include "trace.h"
 #include "block/thread-pool.h"
@@ -99,6 +100,7 @@ static int aio_worker(void *arg)
 ssize_t ret = 0;
 size_t count;
 
+rcu_thread_offline();
 switch (aiocb->aio_type & QEMU_AIO_TYPE_MASK) {
 case QEMU_AIO_READ:
 count = handle_aiocb_rw(aiocb);
@@ -136,6 +138,7 @@ static int aio_worker(void *arg)
 }
 
 g_slice_free(RawWin32AIOData, aiocb);
+rcu_thread_online();
 return ret;
 }
 
-- 
1.8.3.1




[Qemu-devel] [PATCH 14/14] fix pointer reference to rcu_assign_pointer

2013-08-14 Thread Mike Day
Signed-off-by: Mike Day 
---
 tests/rcutorture.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/rcutorture.c b/tests/rcutorture.c
index 236c628..b219ff5 100644
--- a/tests/rcutorture.c
+++ b/tests/rcutorture.c
@@ -293,7 +293,7 @@ static void *rcu_update_stress_test(void *arg)
 smp_mb();
 p->pipe_count = 0;
 p->mbtest = 1;
-rcu_assign_pointer(&rcu_stress_current, &p);
+rcu_assign_pointer(rcu_stress_current, p);
 rcu_stress_idx = i;
 for (i = 0; i < RCU_STRESS_PIPE_LEN; i++)
 if (i != rcu_stress_idx) {
-- 
1.8.3.1




[Qemu-devel] [RFC PATCH 12/14] migration: report RCU quiescent states

2013-08-14 Thread Mike Day
From: Paolo Bonzini 

The migration thread polls s->state periodically, it does not
use a mutex or condition variable, so it has to report quiescent
states manually.

Signed-off-by: Paolo Bonzini 
Reviewed-by: Mike Day 
---
 migration.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/migration.c b/migration.c
index 1402fa7..9d0950e 100644
--- a/migration.c
+++ b/migration.c
@@ -22,6 +22,7 @@
 #include "qemu/sockets.h"
 #include "migration/block.h"
 #include "qemu/thread.h"
+#include "qemu/rcu.h"
 #include "qmp-commands.h"
 #include "trace.h"
 
@@ -563,6 +564,7 @@ static void *migration_thread(void *opaque)
 int64_t current_time;
 uint64_t pending_size;
 
+rcu_quiescent_state();
 if (!qemu_file_rate_limit(s->file)) {
 DPRINTF("iterate\n");
 pending_size = qemu_savevm_state_pending(s->file, max_size);
-- 
1.8.3.1




[Qemu-devel] [RFC PATCH 01/14] qemu-thread: add QemuEvent

2013-08-14 Thread Mike Day
From: Paolo Bonzini 

This emulates Win32 manual-reset events using futexes or conditional
variables.  Typical ways to use them are with multi-producer,
single-consumer data structures, to test for a complex condition whose
elements come from different threads:

for (;;) {
qemu_event_reset(ev);
... test complex condition ...
if (condition is true) {
break;
}
qemu_event_wait(ev);
}

Alternatively:

... compute condition ...
if (condition) {
do {
qemu_event_wait(ev);
qemu_event_reset(ev);
... compute condition ...
} while(condition);
qemu_event_set(ev);
}

QemuEvent provides a very fast userspace path in the common case when
no other thread is waiting, or the event is not changing state.  It
is used to report RCU quiescent states to the thread calling
synchronize_rcu (the latter being the single consumer), and to report
call_rcu invocations to the thread that receives them.

Signed-off-by: Paolo Bonzini 
Reviewed-by: Mike Day 

---
 include/qemu/thread-posix.h |   8 +++
 include/qemu/thread-win32.h |   4 ++
 include/qemu/thread.h   |   7 +++
 util/qemu-thread-posix.c| 116 
 util/qemu-thread-win32.c|  26 ++
 5 files changed, 161 insertions(+)

diff --git a/include/qemu/thread-posix.h b/include/qemu/thread-posix.h
index 361566a..eb5c7a1 100644
--- a/include/qemu/thread-posix.h
+++ b/include/qemu/thread-posix.h
@@ -21,6 +21,14 @@ struct QemuSemaphore {
 #endif
 };
 
+struct QemuEvent {
+#ifndef __linux__
+pthread_mutex_t lock;
+pthread_cond_t cond;
+#endif
+unsigned value;
+};
+
 struct QemuThread {
 pthread_t thread;
 };
diff --git a/include/qemu/thread-win32.h b/include/qemu/thread-win32.h
index 13adb95..3d58081 100644
--- a/include/qemu/thread-win32.h
+++ b/include/qemu/thread-win32.h
@@ -17,6 +17,10 @@ struct QemuSemaphore {
 HANDLE sema;
 };
 
+struct QemuEvent {
+HANDLE event;
+};
+
 typedef struct QemuThreadData QemuThreadData;
 struct QemuThread {
 QemuThreadData *data;
diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index c02404b..3e32c65 100644
--- a/include/qemu/thread.h
+++ b/include/qemu/thread.h
@@ -7,6 +7,7 @@
 typedef struct QemuMutex QemuMutex;
 typedef struct QemuCond QemuCond;
 typedef struct QemuSemaphore QemuSemaphore;
+typedef struct QemuEvent QemuEvent;
 typedef struct QemuThread QemuThread;
 
 #ifdef _WIN32
@@ -45,6 +46,12 @@ void qemu_sem_wait(QemuSemaphore *sem);
 int qemu_sem_timedwait(QemuSemaphore *sem, int ms);
 void qemu_sem_destroy(QemuSemaphore *sem);
 
+void qemu_event_init(QemuEvent *ev, bool init);
+void qemu_event_set(QemuEvent *ev);
+void qemu_event_reset(QemuEvent *ev);
+void qemu_event_wait(QemuEvent *ev);
+void qemu_event_destroy(QemuEvent *ev);
+
 void qemu_thread_create(QemuThread *thread,
 void *(*start_routine)(void *),
 void *arg, int mode);
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 4de133e..37dd298 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -20,7 +20,12 @@
 #include 
 #include 
 #include 
+#ifdef __linux__
+#include 
+#include 
+#endif
 #include "qemu/thread.h"
+#include "qemu/atomic.h"
 
 static void error_exit(int err, const char *msg)
 {
@@ -272,6 +277,117 @@ void qemu_sem_wait(QemuSemaphore *sem)
 #endif
 }
 
+#ifdef __linux__
+#define futex(...)  syscall(__NR_futex, __VA_ARGS__)
+
+static inline void futex_wake(QemuEvent *ev, int n)
+{
+futex(ev, FUTEX_WAKE, n, NULL, NULL, 0);
+}
+
+static inline void futex_wait(QemuEvent *ev, unsigned val)
+{
+futex(ev, FUTEX_WAIT, (int) val, NULL, NULL, 0);
+}
+#else
+static inline void futex_wake(QemuEvent *ev, int n)
+{
+if (n == 1) {
+pthread_cond_signal(&ev->cond);
+} else {
+pthread_cond_broadcast(&ev->cond);
+}
+}
+
+static inline void futex_wait(QemuEvent *ev, unsigned val)
+{
+pthread_mutex_lock(&ev->lock);
+if (ev->value == val) {
+pthread_cond_wait(&ev->cond, &ev->lock);
+}
+pthread_mutex_unlock(&ev->lock);
+}
+#endif
+
+/* Valid transitions:
+ * - free->set, when setting the event
+ * - busy->set, when setting the event, followed by futex_wake
+ * - set->free, when resetting the event
+ * - free->busy, when waiting
+ *
+ * set->busy does not happen (it can be observed from the outside but
+ * it really is set->free->busy).
+ *
+ * busy->free provably cannot happen; to enforce it, the set->free transition
+ * is done with an OR, which becomes a no-op if the event has concurrently
+ * transitioned to free or busy.
+ */
+
+#define EV_SET 0
+#define EV_FREE1
+#define EV_BUSY   -1
+
+void qemu_event_init(QemuEvent *ev, bool init)
+{
+#ifndef __linux__
+pthread_mutex_init(&ev->lock, 

[Qemu-devel] [RFC PATCH 06/14] rcu: add rcutorture

2013-08-14 Thread Mike Day
From: Paolo Bonzini 

A stress test program (works, too :)).

Signed-off-by: Paolo Bonzini 
Reviewed-by: Mike Day 
---
 tests/Makefile |   4 +-
 tests/rcutorture.c | 439 +
 2 files changed, 442 insertions(+), 1 deletion(-)
 create mode 100644 tests/rcutorture.c

diff --git a/tests/Makefile b/tests/Makefile
index d044908..b4a52b4 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -99,7 +99,8 @@ test-obj-y = tests/check-qint.o tests/check-qstring.o 
tests/check-qdict.o \
tests/test-string-input-visitor.o tests/test-qmp-output-visitor.o \
tests/test-qmp-input-visitor.o tests/test-qmp-input-strict.o \
tests/test-qmp-commands.o tests/test-visitor-serialization.o \
-   tests/test-x86-cpuid.o tests/test-mul64.o tests/test-int128.o
+   tests/test-x86-cpuid.o tests/test-mul64.o tests/rcutortore.o \
+tests/test-int128.o
 
 test-qapi-obj-y = tests/test-qapi-visit.o tests/test-qapi-types.o
 
@@ -122,6 +123,7 @@ tests/test-hbitmap$(EXESUF): tests/test-hbitmap.o 
libqemuutil.a libqemustub.a
 tests/test-x86-cpuid$(EXESUF): tests/test-x86-cpuid.o
 tests/test-xbzrle$(EXESUF): tests/test-xbzrle.o xbzrle.o page_cache.o 
libqemuutil.a
 tests/test-cutils$(EXESUF): tests/test-cutils.o util/cutils.o
++tests/rcutorture$(EXESUF): tests/rcutorture.o libqemuutil.a
 tests/test-int128$(EXESUF): tests/test-int128.o
 
 tests/test-qapi-types.c tests/test-qapi-types.h :\
diff --git a/tests/rcutorture.c b/tests/rcutorture.c
new file mode 100644
index 000..236c628
--- /dev/null
+++ b/tests/rcutorture.c
@@ -0,0 +1,439 @@
+/*
+ * rcutorture.c: simple user-level performance/stress test of RCU.
+ *
+ * Usage:
+ * ./rcu  rperf [  ]
+ * Run a read-side performance test with the specified
+ * number of readers for  seconds.
+ * ./rcu  uperf [  ]
+ * Run an update-side performance test with the specified
+ * number of updaters and specified duration.
+ * ./rcu  perf [  ]
+ * Run a combined read/update performance test with the specified
+ * number of readers and one updater and specified duration.
+ *
+ * The above tests produce output as follows:
+ *
+ * n_reads: 46008000  n_updates: 146026  nreaders: 2  nupdaters: 1 duration: 1
+ * ns/read: 43.4707  ns/update: 6848.1
+ *
+ * The first line lists the total number of RCU reads and updates executed
+ * during the test, the number of reader threads, the number of updater
+ * threads, and the duration of the test in seconds.  The second line
+ * lists the average duration of each type of operation in nanoseconds,
+ * or "nan" if the corresponding type of operation was not performed.
+ *
+ * ./rcu  stress [  ]
+ * Run a stress test with the specified number of readers and
+ * one updater.
+ *
+ * This test produces output as follows:
+ *
+ * n_reads: 114633217  n_updates: 3903415  n_mberror: 0
+ * rcu_stress_count: 114618391 14826 0 0 0 0 0 0 0 0 0
+ *
+ * The first line lists the number of RCU read and update operations
+ * executed, followed by the number of memory-ordering violations
+ * (which will be zero in a correct RCU implementation).  The second
+ * line lists the number of readers observing progressively more stale
+ * data.  A correct RCU implementation will have all but the first two
+ * numbers non-zero.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (c) 2008 Paul E. McKenney, IBM Corporation.
+ */
+
+/*
+ * Test variables.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include "qemu/atomic.h"
+#include "qemu/rcu.h"
+#include "qemu/compiler.h"
+#include "qemu/thread.h"
+
+long long n_reads = 0LL;
+long n_updates = 0L;
+int nthreadsrunning;
+
+char argsbuf[64];
+
+#define GOFLAG_INIT 0
+#define GOFLAG_RUN  1
+#define GOFLAG_STOP 2
+
+static volatile int goflag = GOFLAG_INIT;
+
+#define RCU_READ_RUN 1000
+
+#define NR_THREADS 100
+static QemuThread threads[NR_THREADS];
+static struct rcu_reader_data *data[NR_THREADS];
+static int n_threads;
+
+static void create_thread(void *(*func)(void *))
+{
+if (n_threads >= NR_THREADS) {
+fprintf(stderr, "Thread limit of %d exceeded!\n", NR_THREADS);
+exit(-1);
+}
+  

[Qemu-devel] [PATCH 1/2] fixed tests/Makefile to correctly link rcutorture

2013-08-15 Thread Mike Day
From: Paolo Bonzini 
Reviewed-by: Mike Day 
---
 tests/Makefile | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/tests/Makefile b/tests/Makefile
index b4a52b4..4d68d28 100644
--- a/tests/Makefile
+++ b/tests/Makefile
@@ -44,9 +44,14 @@ check-unit-y += tests/test-cutils$(EXESUF)
 gcov-files-test-cutils-y += util/cutils.c
 check-unit-y += tests/test-mul64$(EXESUF)
 gcov-files-test-mul64-y = util/host-utils.c
+check-unit-y += tests/test-tls$(EXESUF)
+# all code tested by test-tls is inside tls.h
+gcov-files-test-tls-y =
 check-unit-y += tests/test-int128$(EXESUF)
 # all code tested by test-int128 is inside int128.h
 gcov-files-test-int128-y =
+check-unit-y += tests/rcutorture$(EXESUF)
+gcov-files-rcutorture-y = util/rcu.c
 check-unit-y += tests/test-bitops$(EXESUF)
 
 check-block-$(CONFIG_POSIX) += tests/qemu-iotests-quick.sh
@@ -99,8 +104,8 @@ test-obj-y = tests/check-qint.o tests/check-qstring.o 
tests/check-qdict.o \
tests/test-string-input-visitor.o tests/test-qmp-output-visitor.o \
tests/test-qmp-input-visitor.o tests/test-qmp-input-strict.o \
tests/test-qmp-commands.o tests/test-visitor-serialization.o \
-   tests/test-x86-cpuid.o tests/test-mul64.o tests/rcutortore.o \
-tests/test-int128.o
+   tests/test-x86-cpuid.o tests/test-mul64.o tests/test-int128.o \
+   tests/test-tls.o tests/rcutorture.o
 
 test-qapi-obj-y = tests/test-qapi-visit.o tests/test-qapi-types.o
 
@@ -123,8 +128,9 @@ tests/test-hbitmap$(EXESUF): tests/test-hbitmap.o 
libqemuutil.a libqemustub.a
 tests/test-x86-cpuid$(EXESUF): tests/test-x86-cpuid.o
 tests/test-xbzrle$(EXESUF): tests/test-xbzrle.o xbzrle.o page_cache.o 
libqemuutil.a
 tests/test-cutils$(EXESUF): tests/test-cutils.o util/cutils.o
+tests/rcutorture$(EXESUF): tests/rcutorture.o libqemuutil.a
 tests/test-int128$(EXESUF): tests/test-int128.o
+tests/test-tls$(EXESUF): tests/test-tls.o libqemuutil.a
+tests/rcutorture$(EXESUF): tests/rcutorture.o libqemuutil.a
 
 tests/test-qapi-types.c tests/test-qapi-types.h :\
 $(SRC_PATH)/tests/qapi-schema/qapi-schema-test.json 
$(SRC_PATH)/scripts/qapi-types.py
-- 
1.8.3.1




[Qemu-devel] [RFC PATCH 0/2] v2.1 RCU Implementation for QEMU

2013-08-15 Thread Mike Day
This series applies on top today's git.qemu.org/master and is online at:
https://github.com/ncultra/qemu/tree/rcu-for-1.7

Paolo Bonzini (2):
  fixed tests/Makefile to correctly link rcutorture
  enable TLS in build and activate test-tls in make check

 configure  |  63 ++
 include/qemu/tls.h | 127 +
 include/qom/cpu.h  |   2 +-
 tests/Makefile |  14 --
 tests/test-tls.c   |  87 
 5 files changed, 261 insertions(+), 32 deletions(-)
 create mode 100644 tests/test-tls.c

-- 
1.8.3.1




[Qemu-devel] [PATCH 2/2] enable TLS in build and activate test-tls in make check

2013-08-15 Thread Mike Day
From: Paolo Bonzini 
Reviewed-by: Mike Day 
---
 configure  |  63 ++
 include/qemu/tls.h | 127 +
 include/qom/cpu.h  |   2 +-
 tests/Makefile |   2 +-
 tests/test-tls.c   |  87 
 5 files changed, 252 insertions(+), 29 deletions(-)
 create mode 100644 tests/test-tls.c

diff --git a/configure b/configure
index 18fa608..baf61c8 100755
--- a/configure
+++ b/configure
@@ -285,6 +285,7 @@ fi
 ar="${AR-${cross_prefix}ar}"
 as="${AS-${cross_prefix}as}"
 cpp="${CPP-$cc -E}"
+nm="${NM-${cross_prefix}nm}"
 objcopy="${OBJCOPY-${cross_prefix}objcopy}"
 ld="${LD-${cross_prefix}ld}"
 libtool="${LIBTOOL-${cross_prefix}libtool}"
@@ -1241,6 +1242,29 @@ if compile_prog "-Werror -fno-gcse" "" ; then
   TRANSLATE_OPT_CFLAGS=-fno-gcse
 fi
 
+##
+# Using __thread is either faster than pthread_get/setspecific,
+# or (if using GCC's "emutls" feature) exactly the same.  So
+# we always use it if available.
+
+cat > $TMPC << EOF
+__thread int x;
+
+int main(void)
+{
+  x = 42;
+  return x;
+}
+EOF
+if compile_prog "-Werror" "" ; then
+  tls=yes
+else
+  tls=no
+fi
+
+##
+# Position Independent executables
+
 if test "$static" = "yes" ; then
   if test "$pie" = "yes" ; then
 error_exit "static and pie are mutually incompatible"
@@ -1260,19 +1284,18 @@ if test "$pie" = ""; then
 fi
 
 if test "$pie" != "no" ; then
+  if test "$CONFIG_TLS" = yes; then
+THREAD=__thread
+  else
+THREAD=
+  fi
   cat > $TMPC << EOF
-
-#ifdef __linux__
-#  define THREAD __thread
-#else
-#  define THREAD
-#endif
-
-static THREAD int tls_var;
+static $THREAD int tls_var;
 
 int main(void) { return tls_var; }
 
 EOF
+  unset THREAD
   if compile_prog "-fPIE -DPIE" "-pie"; then
 QEMU_CFLAGS="-fPIE -DPIE $QEMU_CFLAGS"
 LDFLAGS="-pie $LDFLAGS"
@@ -3184,6 +3207,22 @@ if test "$trace_backend" = "dtrace"; then
 fi
 
 ##
+# check for TLS runtime
+
+# Some versions of mingw include the "magic" definitions that make
+# TLS work, some don't.  Check for it.
+
+if test "$mingw32" = yes; then
+  cat > $TMPC << EOF
+int main(void) { return 0; }
+EOF
+  compile_prog "" ""
+  if $nm $TMPE | grep _tls_used > /dev/null 2>&1; then
+mingw32_tls_runtime=yes
+  fi
+fi
+
+##
 # check and set a backend for coroutine
 
 # We prefer ucontext, but it's not always possible. The fallback
@@ -3677,6 +3716,9 @@ if test "$mingw32" = "yes" ; then
   version_micro=0
   echo 
"CONFIG_FILEVERSION=$version_major,$version_minor,$version_subminor,$version_micro"
 >> $config_host_mak
   echo 
"CONFIG_PRODUCTVERSION=$version_major,$version_minor,$version_subminor,$version_micro"
 >> $config_host_mak
+  if test "$mingw32_tls_runtime" = yes; then
+echo "CONFIG_MINGW32_TLS_RUNTIME=y" >> $config_host_mak
+  fi
 else
   echo "CONFIG_POSIX=y" >> $config_host_mak
 fi
@@ -3979,6 +4021,10 @@ if test "$cpuid_h" = "yes" ; then
   echo "CONFIG_CPUID_H=y" >> $config_host_mak
 fi
 
+if test "$tls" = "yes" ; then
+  echo "CONFIG_TLS=y" >> $config_host_mak
+fi
+
 if test "$int128" = "yes" ; then
   echo "CONFIG_INT128=y" >> $config_host_mak
 fi
@@ -4107,6 +4153,7 @@ echo "OBJCC=$objcc" >> $config_host_mak
 echo "AR=$ar" >> $config_host_mak
 echo "AS=$as" >> $config_host_mak
 echo "CPP=$cpp" >> $config_host_mak
+echo "NM=$nm" >> $config_host_mak
 echo "OBJCOPY=$objcopy" >> $config_host_mak
 echo "LD=$ld" >> $config_host_mak
 echo "WINDRES=$windres" >> $config_host_mak
diff --git a/include/qemu/tls.h b/include/qemu/tls.h
index b92ea9d..c878aaa 100644
--- a/include/qemu/tls.h
+++ b/include/qemu/tls.h
@@ -1,7 +1,7 @@
 /*
  * Abstraction layer for defining and using TLS variables
  *
- * Copyright (c) 2011 Red Hat, Inc
+ * Copyright (c) 2011, 2013 Red Hat, Inc
  * Copyright (c) 2011 Linaro Limited
  *
  * Authors:
@@ -25,28 +25,117 @@
 #ifndef QEMU_TLS_H
 #define QEMU_TLS_H
 
-/* Per-thread variables. Note that we only have implementations
- * which are really thread-local on Linux; the dummy implementations
- * define plain global variables.
+#ifdef CONFIG_WIN32
+
+/* Do not use GCC's "emutls" path on 

Re: [Qemu-devel] Configure virtio-scsi options via libvirt

2014-05-08 Thread Mike Perez
On 10:08 Wed 07 May , Stefan Hajnoczi wrote:
> On Tue, May 06, 2014 at 04:13:50PM -0700, Mike Perez wrote:
> > I would like be able to configure virtio-scsi options num_queues, 
> > max_sectors,
> > and cmd_per_lun via libvirt. Are there any plans to have this support?
> 
> Hi Mike,
> I'm not sure about the status of libvirt support for virtio-scsi options
> but in the meantime you can use  passthrough:
> 
> http://blog.vmsplice.net/2011/04/how-to-pass-qemu-command-line-options.html
> 
> Stefan

Thanks Stefan,

My plan is to write the support in OpenStack, which unfortunately does not
accept patches doing qemu passthrough:

http://lists.openstack.org/pipermail/openstack-dev/2013-November/018551.html

-- 
Mike Perez



Re: [Qemu-devel] [PATCH] Update QEMU checkpatch.pl to current linux version

2014-05-08 Thread Mike Day
On Wed, May 7, 2014 at 5:16 PM, Hani Benhabiles  wrote:
> FWIW, this new version doesn't trigger a false positive I was having with
> patches 05-07 in [1].
>
> However, from a quick test for this patch on a couple of patches, I am getting
> warnings like this: "WARNING: braces {} are not necessary for single statement
> blocks" so there are still some Qemu related changes missing.

Thanks for the feedback! As Peter pointed out (and you confirm), I
missed forward-porting QEMU's brace rules.



Re: [Qemu-devel] [PATCH] Update QEMU checkpatch.pl to current linux version

2014-05-08 Thread Mike Day
On Thu, May 8, 2014 at 10:02 AM, Peter Maydell  wrote:
>>> total: 0 errors, 0 warnings, 79 lines checked
>>>
>>> /tmp/a has no obvious style problems and is ready for submission.
>>> Check 500, Bad 52
>>
>> How does your new checkpatch compare to our current one?
>
> Yes; we do sometimes let checkpatch-failing commits through,
> so it would be more interesting to know:
>  * how many commits passed old-checkpatch but fail the new one
>  * how many commits failed with old-checkpatch but pass now
>  * in both cases, how many of the failures are false positives?
>
> (the last of those is tricky to answer without eyeballing
> the errors, unfortunately)

That seems like a good test regime and not a lot of effort using the
bash script Don shared.

There appears to be some interest in this patch. I'll work on a 2nd
revision, following Peter's suggestion to separate the QEMU rules into
a separate, smaller patch (and with the correct handling of brackets).

Mike



Re: [Qemu-devel] [PATCH 1/6] xics: add flags for interrupts

2014-05-09 Thread Mike Day
On Thu, May 8, 2014 at 11:12 PM, Alexey Kardashevskiy  wrote:
>> There are
>> a couple ways to mitigate this type of situation by using alternative
>> data structures to inform the loop traversal. I don't know if it is
>> worth the effort, though.
>
> Here I lost you :)

If I read the code correctly, the problem I'm wondering  about is that
the loop will waste time traversing the array  when there are only
unallocated interrupts from the current index to the end. For example,
if the interrupt array is 256 entries and the highest index that is
allocated is 16, the outside loop will traverse all 256 entries while
it should have exited after the 16th.

To mitigate this you could keep a shadow index of the current highest
allocated index and check for that in the outside loop. Or you could
maintain a shadow linked list that only includes allocated array
entries and just traverse that list. Each element on the list would be
an allocated entry in the interrupt array.

> btw I just realized that in patch#2 it should be xics_find_source instead
> of xics_find_server. There are many interrupt servers already and just one
> interrupt source (we could have many like one per PHB or something like
> that but we are not there yet), this is what I meant.



[Qemu-devel] [PATCH] qemu-img fails to delete last snapshot

2014-05-09 Thread Mike Day
When deleting the last snapshot, copying the resulting snapshot table
currently fails, causing the delete operation to also fail. Fix the
failure by skipping the copy and just writing the snapshot header and
freeing the extra clusters.

Signed-off-by: Mike Day 
---

There are two specific problems in the curent code. First is a lack of
parenthesis in the calculation of a memmove parameter:

s->nb_snapshots - snapshot_index - 1

When s->nb_snapshots is 0, snapshot_index is 1. 

0 - 1 - 1 = 0xfffe

it should be:

0 - (1 - 1) = 0x00

The second problem is shifting the snapshot table to the left. After
removing the last snapshot there are no existing snapshots to be
shifted. All that needs to be done is to write the header and
unallocate the blocks.

 block/qcow2-snapshot.c | 25 +++--
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 0aa9def..c8b842c 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -165,9 +165,11 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
 
 assert(offset <= INT_MAX);
 snapshots_size = offset;
-
 /* Allocate space for the new snapshot list */
-snapshots_offset = qcow2_alloc_clusters(bs, snapshots_size);
+snapshots_offset = 0;
+if (snapshots_size) {
+snapshots_offset = qcow2_alloc_clusters(bs, snapshots_size);
+}
 offset = snapshots_offset;
 if (offset < 0) {
 ret = offset;
@@ -180,12 +182,13 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
 
 /* The snapshot list position has not yet been updated, so these clusters
  * must indeed be completely free */
-ret = qcow2_pre_write_overlap_check(bs, 0, offset, snapshots_size);
-if (ret < 0) {
-goto fail;
+if (snapshots_size) {
+ret = qcow2_pre_write_overlap_check(bs, 0, offset, snapshots_size);
+if (ret < 0) {
+goto fail;
+}
 }
 
-
 /* Write all snapshots to the new list */
 for(i = 0; i < s->nb_snapshots; i++) {
 sn = s->snapshots + i;
@@ -590,12 +593,14 @@ int qcow2_snapshot_delete(BlockDriverState *bs,
 return -ENOENT;
 }
 sn = s->snapshots[snapshot_index];
-
 /* Remove it from the snapshot list */
-memmove(s->snapshots + snapshot_index,
-s->snapshots + snapshot_index + 1,
-(s->nb_snapshots - snapshot_index - 1) * sizeof(sn));
 s->nb_snapshots--;
+if (s->nb_snapshots) {
+memmove(s->snapshots + snapshot_index,
+s->snapshots + snapshot_index + 1,
+(s->nb_snapshots - (snapshot_index - 1)) * sizeof(sn));
+}
+
 ret = qcow2_write_snapshots(bs);
 if (ret < 0) {
 error_setg_errno(errp, -ret,
-- 
1.9.0




[Qemu-devel] [PATCH] Add backing file option to qemu-img create help.

2014-05-12 Thread Mike Day
For the create subcommand the backing file (-b) option is documented
on-line but not in the binary. Add it.

Signed-off-by: Mike Day 
---
 qemu-img-cmds.hx | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index d029609..7724709 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -16,9 +16,9 @@ STEXI
 ETEXI
 
 DEF("create", img_create,
-"create [-q] [-f fmt] [-o options] filename [size]")
+"create [-q] [-f fmt] [-b backing-file] [-o options] filename [size]")
 STEXI
-@item create [-q] [-f @var{fmt}] [-o @var{options}] @var{filename} [@var{size}]
+@item create [-q] [-f @var{fmt}] [-b @var{backing_file}] [-o @var{options}] 
@var{filename} [@var{size}]
 ETEXI
 
 DEF("commit", img_commit,
-- 
1.9.0




Re: [Qemu-devel] [PATCH] Add backing file option to qemu-img create help.

2014-05-12 Thread Mike Day
On Mon, May 12, 2014 at 11:20 AM, Eric Blake  wrote:
> Online where?  In 'qemu-img --help', or on some web page (at what URL)?

http://qemu.weilnetz.de/qemu-doc.html#vm_005fsnapshots

-o backing file is not documented in the help command.

Mike



Re: [Qemu-devel] [PATCH] Add backing file option to qemu-img create help.

2014-05-12 Thread Mike Day
On Mon, May 12, 2014 at 11:53 AM, Eric Blake  wrote:
> Ah, but it is:
>
> $ qemu-img create -f qcow2 -o help
> Supported options:
> size Virtual disk size
> compat   Compatibility level (0.10 or 1.1)
> backing_file File name of a base image
> backing_fmt  Image format of the base image
> encryption   Encrypt the image
> cluster_size qcow2 cluster size
> preallocationPreallocation mode (allowed values: off, metadata)
> lazy_refcounts   Postpone refcount updates
>
> and more importantly, it only appears in the help output of -f modes
> that actually support a backing file.  Contrast:
>
> $ qemu-img create -f raw -o help
> Supported options:
> size Virtual disk size

This is much more prominent in the top-level help:

rebase [-q] [-f fmt] [-t cache] [-p] [-u] -b backing_file [-F
backing_fmt] filename



Re: [Qemu-devel] [PATCH] Add backing file option to qemu-img create help.

2014-05-12 Thread Mike Day
On Mon, May 12, 2014 at 12:36 PM, Kevin Wolf  wrote:
> What would qemu-img rebase do with -o? It is just for (safely) changing
> the backing file, not for updating options. There is qemu-img amend for
> that, and it does have an -o option.

A little background ... I'm writing a 4-day  KVM training course for
the Linux foundation. From time to time I come across undocumented,
buggy, or confusing options and if I can I try to fix them. In the
case of qemu-img I looked in a lot of places for documentation and
ended up at the unofficial web page that documented the -b backing
file option which works just fine for create and rebase. If an
official page had the information I needed in one place instead of
bits and pieces scattered everywhere I would have used it, naturally.

Here's why the help is confusing. On the top level help, all the
options except for -b are documented. A couple subcommands are even
listed in the top level help with their options. I expected that -b
would be documented on the top level with the others. The fact that -b
 is documented for the rebase subcommand reenforced my
expectation. I was assuming there is some consistency in using options
with backing files. (Especially if the option is called
"backingfile.")

I'm not advocating anything here, and I think its fine to leave things
as they are. I just don't want to be the guy who complains and never
fixes anything.

Mike



Re: [Qemu-devel] [PATCH] qemu-img fails to delete last snapshot

2014-05-12 Thread Mike Day
On Mon, May 12, 2014 at 12:39 PM, Eric Blake  wrote:
> This information is actually quite useful in understanding the patch,
> and I would have included it prior to the --- for inclusion in git,
> rather than in the reviewer-only commentary that gets stripped during
> 'git am'.
>
>>
>>  block/qcow2-snapshot.c | 25 +++--
>>  1 file changed, 15 insertions(+), 10 deletions(-)
>
> I'd suggest posting a v2 with a better commit message; but the code
> itself seems fine, so you can add:
>
> Reviewed-by: Eric Blake 

Got it, thanks!

Mike



[Qemu-devel] [PATCH v2] qemu-img fails to delete last snapshot

2014-05-12 Thread Mike Day
When deleting the last snapshot, copying the resulting snapshot table
currently fails, causing the delete operation to also fail. Fix the
failure by skipping the copy and just writing the snapshot header and
freeing the extra clusters.

There are two specific problems in the current code. First is a lack of
parenthesis in the calculation of the memmove size parameter:

s->nb_snapshots - snapshot_index - 1

When s->nb_snapshots is 0, snapshot_index is 1.

0 - 1 - 1 = 0xfffe

it should be:

0 - (1 - 1) = 0x00

The second problem is shifting the snapshot table to the left. After
removing the last snapshot there are no existing snapshots to be
shifted. All that needs to be done is to write the header and
unallocate the blocks.

Signed-off-by: Mike Day 
Reviewed-by: Eric Blake 
---
v2: improved the git log entry
added Eric Blake as a reviewer

 block/qcow2-snapshot.c | 25 +++--
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 0aa9def..c8b842c 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -165,9 +165,11 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
 
 assert(offset <= INT_MAX);
 snapshots_size = offset;
-
 /* Allocate space for the new snapshot list */
-snapshots_offset = qcow2_alloc_clusters(bs, snapshots_size);
+snapshots_offset = 0;
+if (snapshots_size) {
+snapshots_offset = qcow2_alloc_clusters(bs, snapshots_size);
+}
 offset = snapshots_offset;
 if (offset < 0) {
 ret = offset;
@@ -180,12 +182,13 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
 
 /* The snapshot list position has not yet been updated, so these clusters
  * must indeed be completely free */
-ret = qcow2_pre_write_overlap_check(bs, 0, offset, snapshots_size);
-if (ret < 0) {
-goto fail;
+if (snapshots_size) {
+ret = qcow2_pre_write_overlap_check(bs, 0, offset, snapshots_size);
+if (ret < 0) {
+goto fail;
+}
 }
 
-
 /* Write all snapshots to the new list */
 for(i = 0; i < s->nb_snapshots; i++) {
 sn = s->snapshots + i;
@@ -590,12 +593,14 @@ int qcow2_snapshot_delete(BlockDriverState *bs,
 return -ENOENT;
 }
 sn = s->snapshots[snapshot_index];
-
 /* Remove it from the snapshot list */
-memmove(s->snapshots + snapshot_index,
-s->snapshots + snapshot_index + 1,
-(s->nb_snapshots - snapshot_index - 1) * sizeof(sn));
 s->nb_snapshots--;
+if (s->nb_snapshots) {
+memmove(s->snapshots + snapshot_index,
+s->snapshots + snapshot_index + 1,
+(s->nb_snapshots - (snapshot_index - 1)) * sizeof(sn));
+}
+
 ret = qcow2_write_snapshots(bs);
 if (ret < 0) {
 error_setg_errno(errp, -ret,
-- 
1.9.0




[Qemu-devel] [PATCH] qemu-img: sort block formats in help message

2014-05-13 Thread Mike Day
The help message for qemu-img lists the supported block formats, of
which there are 27 as of version 2.0.50. The formats are printed in
the order of their driver's position in a linked list, which appears
random. This patch prints the formats in sorted order, making it
easier to read and to find a specific format in the list.

[Added suggestions from Fam Zheng  to declare variables
at the top of the scope in help() and to omit explicit cast for void*
opaque.
--Stefan]

[Removed call to g_sequence_lookup because it breaks the build on
 machines with glib < 2.28.
--Mike]

Signed-off-by: Mike Day 
Signed-off-by: Stefan Hajnoczi 
---
 qemu-img.c | 25 ++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 96f4463..93e51d1 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -32,6 +32,7 @@
 #include "block/block_int.h"
 #include "block/qapi.h"
 #include 
+#include 
 
 #define QEMU_IMG_VERSION "qemu-img version " QEMU_VERSION \
   ", Copyright (c) 2004-2008 Fabrice Bellard\n"
@@ -55,9 +56,22 @@ typedef enum OutputFormat {
 #define BDRV_O_FLAGS BDRV_O_CACHE_WB
 #define BDRV_DEFAULT_CACHE "writeback"
 
-static void format_print(void *opaque, const char *name)
+static gint compare_data(gconstpointer a, gconstpointer b, gpointer user)
 {
-printf(" %s", name);
+return g_strcmp0(a, b);
+}
+
+static void print_format(gpointer data, gpointer user)
+{
+printf(" %s", (char *)data);
+}
+
+static void add_format_to_seq(void *opaque, const char *fmt_name)
+{
+GSequence *seq = opaque;
+
+g_sequence_insert_sorted(seq, (gpointer)fmt_name,
+ compare_data, NULL);
 }
 
 static void QEMU_NORETURN GCC_FMT_ATTR(1, 2) error_exit(const char *fmt, ...)
@@ -142,10 +156,15 @@ static void QEMU_NORETURN help(void)
"  '-f' first image format\n"
"  '-F' second image format\n"
"  '-s' run in Strict mode - fail on different image size or sector 
allocation\n";
+GSequence *seq;
 
 printf("%s\nSupported formats:", help_msg);
-bdrv_iterate_format(format_print, NULL);
+seq = g_sequence_new(NULL);
+bdrv_iterate_format(add_format_to_seq, seq);
+g_sequence_foreach(seq, print_format, NULL);
 printf("\n");
+g_sequence_free(seq);
+
 exit(EXIT_SUCCESS);
 }
 
-- 
1.9.0




  1   2   3   4   5   6   7   8   9   10   >