date:20110118

[Qemu-devel] [V3 PATCH 7/8] virtio-9p: Move file post creation changes to none security model

2011-01-18 Thread M. Mohan Kumar

After creating a file object, its permission and ownership details are updated
as per client's request for both passthrough and none security model. But with
chrooted environment its not required for passthrough security model. Move all
post file creation changes to none security model

Signed-off-by: M. Mohan Kumar 
---
 hw/9pfs/virtio-9p-local.c |   19 ++-
 1 files changed, 6 insertions(+), 13 deletions(-)

diff --git a/hw/9pfs/virtio-9p-local.c b/hw/9pfs/virtio-9p-local.c
index 08fd67f..d2e32e2 100644
--- a/hw/9pfs/virtio-9p-local.c
+++ b/hw/9pfs/virtio-9p-local.c
@@ -208,21 +208,14 @@ static int local_set_xattr(const char *path, FsCred 
*credp)
 return 0;
 }
 
-static int local_post_create_passthrough(FsContext *fs_ctx, const char *path,
+static int local_post_create_none(FsContext *fs_ctx, const char *path,
 FsCred *credp)
 {
+int retval;
 if (chmod(rpath(fs_ctx, path), credp->fc_mode & 0) < 0) {
 return -1;
 }
-if (lchown(rpath(fs_ctx, path), credp->fc_uid, credp->fc_gid) < 0) {
-/*
- * If we fail to change ownership and if we are
- * using security model none. Ignore the error
- */
-if (fs_ctx->fs_sm != SM_NONE) {
-return -1;
-}
-}
+retval = lchown(rpath(fs_ctx, path), credp->fc_uid, credp->fc_gid);
 return 0;
 }
 
@@ -363,7 +356,7 @@ static int local_mknod(FsContext *fs_ctx, const char *path, 
FsCred *credp)
 if (err == -1) {
 return err;
 }
-err = local_post_create_passthrough(fs_ctx, path, credp);
+err = local_post_create_none(fs_ctx, path, credp);
 if (err == -1) {
 serrno = errno;
 goto err_end;
@@ -405,7 +398,7 @@ static int local_mkdir(FsContext *fs_ctx, const char *path, 
FsCred *credp)
 if (err == -1) {
 return err;
 }
-err = local_post_create_passthrough(fs_ctx, path, credp);
+err = local_post_create_none(fs_ctx, path, credp);
 if (err == -1) {
 serrno = errno;
 goto err_end;
@@ -480,7 +473,7 @@ static int local_open2(FsContext *fs_ctx, const char *path, 
int flags,
 if (fd == -1) {
 return fd;
 }
-err = local_post_create_passthrough(fs_ctx, path, credp);
+err = local_post_create_none(fs_ctx, path, credp);
 if (err == -1) {
 serrno = errno;
 goto err_end;
-- 
1.7.3.4

Re: [Qemu-devel] [PATCH] linux-user: Fix possible realloc memory leak

2011-01-18 Thread Markus Armbruster

Stefan Weil  writes:

> Extract from "man realloc":
> "If realloc() fails the original block is left untouched;
> it is not freed or moved."
>
> Fix a possible memory leak (reported by cppcheck).
>
> Cc: Riku Voipio 
> Signed-off-by: Stefan Weil 

Sidestep the problem via qemu_realloc() instead?

[Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations

2011-01-18 Thread Jan Kiszka

On 2011-01-18 04:03, Stefan Berger wrote:
> On 01/16/2011 09:43 AM, Avi Kivity wrote:
>> On 01/14/2011 09:27 PM, Stefan Berger wrote:
>>>

 Can you sprinkle some printfs() arount kvm_run (in qemu-kvm.c) to
 verify this?

>>> Here's what I did:
>>>
>>>
>>> interrupt exit requested
>>
>> It appears from this you're using qemu.git.  Please try qemu-kvm.git,
>> where the code appears to be correct.
>>
> Cc'ing qemu-devel now. For reference, here the initial problem description:
> 
> http://www.spinics.net/lists/kvm/msg48274.html
> 
> I didn't know there was another tree...
> 
> I have seen now a couple of suspends-while-reading with patches applied
> to the qemu-kvm.git tree and indeed, when run with the same host kernel
> and VM I do not see the debugging dumps due to double-reads that I would
> have anticipated seeing by now. Now what? Can this be easily fixed in
> the other Qemu tree as well?

Please give this a try:

git://git.kiszka.org/qemu-kvm.git queues/kvm-upstream

I bet (& hope) "kvm: Unconditionally reenter kernel after IO exits"
fixes the issue for you. If other problems pop up with that tree, also
try resetting to that particular commit.

I'm currently trying to shake all those hidden or forgotten bug fixes
out of qemu-kvm and port them upstream. Most of those subtle differences
should hopefully soon be history.

> 
> One thing I'd like to mention is that I have seen what I think are
> interrupt stalls when running my tests inside the qemu-kvm.git tree
> version and not suspending at all. A some point the interrupt counter in
> the guest kernel does not increase anymore even though I see the device
> model raising the IRQ and lowering it. The same tests run literally
> forever in the qemu.git tree version of Qemu.

What about qemu-kmv and -no-kvm-irqchip?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] Re: [PATCH 8/8] ahci: fix !msi interrupts

2011-01-18 Thread Gerd Hoffmann


  Hi,


Worse might also be that unknown issue that force you to inject an IRQ
here. We don't know. That's probably worst.


Well, IIRC the issue was that usually a level high interrupt line would
simply retrigger an interrupt after enabling the interrupt line in the
APIC again.


edge triggered interrupts wouldn't though.


Unless my memory completely fails on me, that didn't happen,
so I added the manual retrigger on a partial command ACK in ahci code.


That re-trigger smells like you are not clearing the interrupt line 
where you should.  For starters try calling ahci_check_irq() after guest 
writes to PORT_IRQ_STAT.


cheers,
  Gerd

Re: [Qemu-devel] [PATCH 2/2] strtosz(): Use suffix macros in switch() statement

2011-01-18 Thread Jes Sorensen

On 01/18/11 10:20, Markus Armbruster wrote:
>> diff --git a/cutils.c b/cutils.c
>> index 328738c..f2c8bbd 100644
>> --- a/cutils.c
>> +++ b/cutils.c
>> @@ -324,26 +324,26 @@ ssize_t strtosz_suffix(const char *nptr, char **end, 
>> const char default_suffix)
>>  }
>>  }
>>  switch (toupper(d)) {
>> -case 'B':
>> +case STRTOSZ_DEFSUFFIX_B:
>>  mul = 1;
>>  if (mul_required) {
>>  goto fail;
>>  }
>>  break;
>> -case 'K':
>> +case STRTOSZ_DEFSUFFIX_KB:
>>  mul = 1 << 10;
>>  break;
>>  case 0:
>>  if (mul_required) {
>>  goto fail;
>>  }
>> -case 'M':
>> +case STRTOSZ_DEFSUFFIX_MB:
>>  mul = 1ULL << 20;
>>  break;
>> -case 'G':
>> +case STRTOSZ_DEFSUFFIX_GB:
>>  mul = 1ULL << 30;
>>  break;
>> -case 'T':
>> +case STRTOSZ_DEFSUFFIX_TB:
>>  mul = 1ULL << 40;
>>  break;
>>  default:
> 
> And this improves what?  Certainly not clarity.
> 
> In my opinion, the STRTOSZ_DEFSUFFIX_TB are useless chaff.  Chacun à son
> goût.

It cuts out lines of code, which is good, and using the macros means the
user is less likely to make a type and use a wrong character.

It's a taste issue though, I agree!

Cheers,
Jes

Re: [Qemu-devel] [PATCH 2/2] strtosz(): Use suffix macros in switch() statement

2011-01-18 Thread Markus Armbruster

jes.soren...@redhat.com writes:

> From: Jes Sorensen 
>
> Signed-off-by: Jes Sorensen 
> ---
>  cutils.c |   10 +-
>  1 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/cutils.c b/cutils.c
> index 328738c..f2c8bbd 100644
> --- a/cutils.c
> +++ b/cutils.c
> @@ -324,26 +324,26 @@ ssize_t strtosz_suffix(const char *nptr, char **end, 
> const char default_suffix)
>  }
>  }
>  switch (toupper(d)) {
> -case 'B':
> +case STRTOSZ_DEFSUFFIX_B:
>  mul = 1;
>  if (mul_required) {
>  goto fail;
>  }
>  break;
> -case 'K':
> +case STRTOSZ_DEFSUFFIX_KB:
>  mul = 1 << 10;
>  break;
>  case 0:
>  if (mul_required) {
>  goto fail;
>  }
> -case 'M':
> +case STRTOSZ_DEFSUFFIX_MB:
>  mul = 1ULL << 20;
>  break;
> -case 'G':
> +case STRTOSZ_DEFSUFFIX_GB:
>  mul = 1ULL << 30;
>  break;
> -case 'T':
> +case STRTOSZ_DEFSUFFIX_TB:
>  mul = 1ULL << 40;
>  break;
>  default:

And this improves what?  Certainly not clarity.

In my opinion, the STRTOSZ_DEFSUFFIX_TB are useless chaff.  Chacun à son
goût.

[Qemu-devel] Changing the content of target cpu registers

2011-01-18 Thread Stefano Bonifazi


Hi all!
 I am working on qemu-user (qemu-ppc).
I'd like to edit the values of target registers during the execution. 
Can I do that by simply changing the content of env->gpr[] or do these 
only contain a copy of the values of the registers?
In this last case, where are the real values of the target registers 
stored so that by modifying them I can alter the behavior of the target 
code execution?

Thank you in advance!
Stefano B.

[Qemu-devel] Re: MIPS, io-thread, icount and wfi

2011-01-18 Thread Jan Kiszka

On 2011-01-18 01:19, Edgar E. Iglesias wrote:
> On Mon, Jan 17, 2011 at 11:03:08AM +0100, Edgar E. Iglesias wrote:
>> Hi,
>>
>> I'm running an io-thread enabled qemu-system-mipsel with icount.
>> When the guest (linux) goes to sleep through the wait insn (waiting
>> to be woken up by future timer interrupts), the thing deadlocks.
>>
>> IIUC, this is because vm timers are driven by icount, but the CPU is
>> halted so icount makes no progress and time stands still.
>>
>> I've locally disabled vcpu halting when icount is enabled, that
>> works around my problem but of course makes qemu consume 100% host cpu.
>>
>> I don't know why I only see this problem with io-thread builds?
>> Could be related timing and luck.
>>
>> Would be interesting to know if someone has any info on how this was
>> intended to work (if it was)? And if there are ideas for better
>> workarounds or fixes that don't disable vcpu halting entirely.
> 
> Hi,
> 
> I've found the problem. For some reason io-thread builds use a
> static timeout for wait loops. The entire chunk of code that
> makes sure qemu_icount makes forward progress when the CPU's
> are idle has been ifdef'ed away...
> 
> This fixes the problem for me, hopefully without affecting
> io-thread runs without icount.
> 
> commit 0f4f3a919952500b487b438c5520f07a1c6be35b
> Author: Edgar E. Iglesias 
> Date:   Tue Jan 18 01:01:57 2011 +0100
> 
> qemu-timer: Fix timeout calc for io-thread with icount
> 
> Make sure we always make forward progress with qemu_icount to
> avoid deadlocks. For io-thread, use the static 1000 timeout
> only if icount is disabled.
> 
> Signed-off-by: Edgar E. Iglesias 
> 
> diff --git a/qemu-timer.c b/qemu-timer.c
> index 95814af..db1ec49 100644
> --- a/qemu-timer.c
> +++ b/qemu-timer.c
> @@ -110,7 +110,6 @@ static int64_t cpu_get_clock(void)
>  }
>  }
>  
> -#ifndef CONFIG_IOTHREAD
>  static int64_t qemu_icount_delta(void)
>  {
>  if (!use_icount) {
> @@ -124,7 +123,6 @@ static int64_t qemu_icount_delta(void)
>  return cpu_get_icount() - cpu_get_clock();
>  }
>  }
> -#endif
>  
>  /* enable cpu_get_ticks() */
>  void cpu_enable_ticks(void)
> @@ -1077,9 +1075,17 @@ void quit_timers(void)
>  
>  int qemu_calculate_timeout(void)
>  {
> -#ifndef CONFIG_IOTHREAD
>  int timeout;
>  
> +#ifdef CONFIG_IOTHREAD
> +/* When using icount, making forward progress with qemu_icount when the
> +   guest CPU is idle is critical. We only use the static io-thread 
> timeout
> +   for non icount runs.  */
> +if (!use_icount) {
> +return 1000;
> +}
> +#endif
> +
>  if (!vm_running)
>  timeout = 5000;
>  else {
> @@ -1110,8 +1116,5 @@ int qemu_calculate_timeout(void)
>  }
>  
>  return timeout;
> -#else /* CONFIG_IOTHREAD */
> -return 1000;
> -#endif
>  }
>  
> 
> 

This logic and timeout values were imported on iothread merge. And I bet
at least the timeout value of 1s (vs. 5s) can still be found in
qemu-kvm. Maybe someone over there can remember the rationales behind
choosing this value.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] [PATCH v2 1/3] scsi-disk: Allow overriding SCSI INQUIRY removable bit

2011-01-18 Thread Stefan Hajnoczi

Provide the "removable" qdev property bit to override the SCSI INQUIRY
removable (RMB) bit for non-CDROM devices.  This will be used by USB
Mass Storage Devices, which sometimes have this guest-visible bit set
and sometimes do not.  They therefore requires a means for user
configuration.

Signed-off-by: Stefan Hajnoczi 
---
 hw/scsi-disk.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index 6cb317c..db423ca 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -72,6 +72,7 @@ struct SCSIDiskState
 /* The qemu block layer uses a fixed 512 byte sector size.
This is the number of 512 byte blocks in a single scsi sector.  */
 int cluster_size;
+uint32_t removable;
 uint64_t max_lba;
 QEMUBH *bh;
 char *version;
@@ -552,6 +553,7 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, 
uint8_t *outbuf)
 memcpy(&outbuf[16], "QEMU CD-ROM ", 16);
 } else {
 outbuf[0] = 0;
+outbuf[1] = s->removable ? 0x80 : 0;
 memcpy(&outbuf[16], "QEMU HARDDISK   ", 16);
 }
 memcpy(&outbuf[8], "QEMU", 8);
@@ -1295,6 +1297,7 @@ static SCSIDeviceInfo scsi_disk_info = {
 DEFINE_BLOCK_PROPERTIES(SCSIDiskState, qdev.conf),
 DEFINE_PROP_STRING("ver",  SCSIDiskState, version),
 DEFINE_PROP_STRING("serial",  SCSIDiskState, serial),
+DEFINE_PROP_BIT("removable", SCSIDiskState, removable, 1, false),
 DEFINE_PROP_END_OF_LIST(),
 },
 };
-- 
1.7.2.3

[Qemu-devel] [PATCH v2 2/3] scsi: Allow SCSI devices to override the removable bit

2011-01-18 Thread Stefan Hajnoczi

Some SCSI devices may wish to override the removable bit.  Add support
for a qdev property on the SCSI device.

Signed-off-by: Stefan Hajnoczi 
---
 hw/pci-hotplug.c |2 +-
 hw/scsi-bus.c|8 ++--
 hw/scsi.h|3 ++-
 hw/usb-msd.c |2 +-
 4 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/hw/pci-hotplug.c b/hw/pci-hotplug.c
index 716133c..270a982 100644
--- a/hw/pci-hotplug.c
+++ b/hw/pci-hotplug.c
@@ -90,7 +90,7 @@ static int scsi_hot_add(Monitor *mon, DeviceState *adapter,
  * specified).
  */
 dinfo->unit = qemu_opt_get_number(dinfo->opts, "unit", -1);
-scsidev = scsi_bus_legacy_add_drive(scsibus, dinfo->bdrv, dinfo->unit);
+scsidev = scsi_bus_legacy_add_drive(scsibus, dinfo->bdrv, dinfo->unit, 
false);
 if (!scsidev) {
 return -1;
 }
diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
index 7febb86..ceeb4ec 100644
--- a/hw/scsi-bus.c
+++ b/hw/scsi-bus.c
@@ -87,7 +87,8 @@ void scsi_qdev_register(SCSIDeviceInfo *info)
 }
 
 /* handle legacy '-drive if=scsi,...' cmd line args */
-SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, BlockDriverState *bdrv, 
int unit)
+SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, BlockDriverState *bdrv,
+  int unit, bool removable)
 {
 const char *driver;
 DeviceState *dev;
@@ -95,6 +96,9 @@ SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, 
BlockDriverState *bdrv, int
 driver = bdrv_is_sg(bdrv) ? "scsi-generic" : "scsi-disk";
 dev = qdev_create(&bus->qbus, driver);
 qdev_prop_set_uint32(dev, "scsi-id", unit);
+if (qdev_prop_exists(dev, "removable")) {
+qdev_prop_set_bit(dev, "removable", removable);
+}
 if (qdev_prop_set_drive(dev, "drive", bdrv) < 0) {
 qdev_free(dev);
 return NULL;
@@ -117,7 +121,7 @@ int scsi_bus_legacy_handle_cmdline(SCSIBus *bus)
 continue;
 }
 qemu_opts_loc_restore(dinfo->opts);
-if (!scsi_bus_legacy_add_drive(bus, dinfo->bdrv, unit)) {
+if (!scsi_bus_legacy_add_drive(bus, dinfo->bdrv, unit, false)) {
 res = -1;
 break;
 }
diff --git a/hw/scsi.h b/hw/scsi.h
index bf02adf..846fbba 100644
--- a/hw/scsi.h
+++ b/hw/scsi.h
@@ -94,7 +94,8 @@ static inline SCSIBus *scsi_bus_from_device(SCSIDevice *d)
 return DO_UPCAST(SCSIBus, qbus, d->qdev.parent_bus);
 }
 
-SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, BlockDriverState *bdrv, 
int unit);
+SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, BlockDriverState *bdrv,
+  int unit, bool removable);
 int scsi_bus_legacy_handle_cmdline(SCSIBus *bus);
 
 SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag, uint32_t 
lun);
diff --git a/hw/usb-msd.c b/hw/usb-msd.c
index 0a95d8d..ee897b6 100644
--- a/hw/usb-msd.c
+++ b/hw/usb-msd.c
@@ -544,7 +544,7 @@ static int usb_msd_initfn(USBDevice *dev)
 
 s->dev.speed = USB_SPEED_FULL;
 scsi_bus_new(&s->bus, &s->dev.qdev, 0, 1, usb_msd_command_complete);
-s->scsi_dev = scsi_bus_legacy_add_drive(&s->bus, bs, 0);
+s->scsi_dev = scsi_bus_legacy_add_drive(&s->bus, bs, 0, false);
 if (!s->scsi_dev) {
 return -1;
 }
-- 
1.7.2.3

[Qemu-devel] [PATCH v2 0/3] usb-msd: Add usb-storage, removable=on|off property

2011-01-18 Thread Stefan Hajnoczi

Allow overriding the SCSI INQUIRY removable (RMB) bit for scsi-disk and usb-msd
devices.  In particular this addresses the problem that some usb-msd devices
have the bit set while other do not have it set.  Now the user can choose and
get desired guest behavior.

qemu -usb
 -drive if=none,file=test.img,cache=none,id=disk0
 -device usb-storage,drive=disk0,removable=on

The default is off.

v2:
 * Rewritten to override the bit at the scsi-disk level

 hw/pci-hotplug.c |2 +-
 hw/scsi-bus.c|8 ++--
 hw/scsi-disk.c   |3 +++
 hw/scsi.h|3 ++-
 hw/usb-msd.c |4 +++-
 5 files changed, 15 insertions(+), 5 deletions(-)

[Qemu-devel] Re: MIPS, io-thread, icount and wfi

2011-01-18 Thread Jan Kiszka

On 2011-01-18 11:00, Jan Kiszka wrote:
> On 2011-01-18 01:19, Edgar E. Iglesias wrote:
>> On Mon, Jan 17, 2011 at 11:03:08AM +0100, Edgar E. Iglesias wrote:
>>> Hi,
>>>
>>> I'm running an io-thread enabled qemu-system-mipsel with icount.
>>> When the guest (linux) goes to sleep through the wait insn (waiting
>>> to be woken up by future timer interrupts), the thing deadlocks.
>>>
>>> IIUC, this is because vm timers are driven by icount, but the CPU is
>>> halted so icount makes no progress and time stands still.
>>>
>>> I've locally disabled vcpu halting when icount is enabled, that
>>> works around my problem but of course makes qemu consume 100% host cpu.
>>>
>>> I don't know why I only see this problem with io-thread builds?
>>> Could be related timing and luck.
>>>
>>> Would be interesting to know if someone has any info on how this was
>>> intended to work (if it was)? And if there are ideas for better
>>> workarounds or fixes that don't disable vcpu halting entirely.
>>
>> Hi,
>>
>> I've found the problem. For some reason io-thread builds use a
>> static timeout for wait loops. The entire chunk of code that
>> makes sure qemu_icount makes forward progress when the CPU's
>> are idle has been ifdef'ed away...
>>
>> This fixes the problem for me, hopefully without affecting
>> io-thread runs without icount.
>>
>> commit 0f4f3a919952500b487b438c5520f07a1c6be35b
>> Author: Edgar E. Iglesias 
>> Date:   Tue Jan 18 01:01:57 2011 +0100
>>
>> qemu-timer: Fix timeout calc for io-thread with icount
>> 
>> Make sure we always make forward progress with qemu_icount to
>> avoid deadlocks. For io-thread, use the static 1000 timeout
>> only if icount is disabled.
>> 
>> Signed-off-by: Edgar E. Iglesias 
>>
>> diff --git a/qemu-timer.c b/qemu-timer.c
>> index 95814af..db1ec49 100644
>> --- a/qemu-timer.c
>> +++ b/qemu-timer.c
>> @@ -110,7 +110,6 @@ static int64_t cpu_get_clock(void)
>>  }
>>  }
>>  
>> -#ifndef CONFIG_IOTHREAD
>>  static int64_t qemu_icount_delta(void)
>>  {
>>  if (!use_icount) {
>> @@ -124,7 +123,6 @@ static int64_t qemu_icount_delta(void)
>>  return cpu_get_icount() - cpu_get_clock();
>>  }
>>  }
>> -#endif
>>  
>>  /* enable cpu_get_ticks() */
>>  void cpu_enable_ticks(void)
>> @@ -1077,9 +1075,17 @@ void quit_timers(void)
>>  
>>  int qemu_calculate_timeout(void)
>>  {
>> -#ifndef CONFIG_IOTHREAD
>>  int timeout;
>>  
>> +#ifdef CONFIG_IOTHREAD
>> +/* When using icount, making forward progress with qemu_icount when the
>> +   guest CPU is idle is critical. We only use the static io-thread 
>> timeout
>> +   for non icount runs.  */
>> +if (!use_icount) {
>> +return 1000;
>> +}
>> +#endif
>> +
>>  if (!vm_running)
>>  timeout = 5000;
>>  else {
>> @@ -1110,8 +1116,5 @@ int qemu_calculate_timeout(void)
>>  }
>>  
>>  return timeout;
>> -#else /* CONFIG_IOTHREAD */
>> -return 1000;
>> -#endif
>>  }
>>  
>>
>>
> 
> This logic and timeout values were imported on iothread merge. And I bet
> at least the timeout value of 1s (vs. 5s) can still be found in
> qemu-kvm. Maybe someone over there can remember the rationales behind
> choosing this value.

Correction: qemu-kvm does _not_ use a fixed timeout value for the
iothread nor the removed code path (as CONFIG_IOTHREAD is off in
qemu-kvm, that tree does not even build when you enable it).

Still, the reason for once introducing this difference to qemu should be
reflected.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] Re: [PATCH v2 3/3] checkpatch: adjust to QEMUisms

2011-01-18 Thread Paolo Bonzini


On 01/17/2011 08:37 PM, Blue Swirl wrote:

On Mon, Jan 17, 2011 at 7:40 AM, Paolo Bonzini  wrote:

On 01/15/2011 06:45 PM, Blue Swirl wrote:


+   if ($level == 0&&  !$block =~ /^\s*\{/&&
!$allowed) {


I'm not a Perl expert at all, but I think you need parentheses for the
argument of "!":


! has higher precedence than =~:
http://perldoc.perl.org/perlop.html#Operator-Precedence-and-Associativity


I think that's what I meant. :)


  if ($level == 0&&  !($block =~ /^\s*\{/)&&  !$allowed) {


Maybe instead:
if ($level == 0&&  $block !~ /^\s*\{/&&  !$allowed) {


Yes, this too.

Paolo

Re: [Qemu-devel] [PATCH 1/3] mips: Break TBs after mfc0_count

2011-01-18 Thread Aurelien Jarno

On Tue, Jan 18, 2011 at 12:29:40AM +0100, edgar.igles...@gmail.com wrote:
> From: Edgar E. Iglesias 
> 
> Break the TB after reading the count register. This makes it
> possible to take timer interrupts immediately after a read of
> a possibly expired timer.
> 
> Signed-off-by: Edgar E. Iglesias 
> ---
>  target-mips/translate.c |4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
> 
> diff --git a/target-mips/translate.c b/target-mips/translate.c
> index cce77be..313cc29 100644
> --- a/target-mips/translate.c
> +++ b/target-mips/translate.c
> @@ -3410,8 +3410,10 @@ static void gen_mfc0 (CPUState *env, DisasContext 
> *ctx, TCGv arg, int reg, int s
>  gen_helper_mfc0_count(arg);
>  if (use_icount) {
>  gen_io_end();
> -ctx->bstate = BS_STOP;
>  }
> +/* Break the TB to be able to take timer interrupts immediately
> +   after reading count.  */
> +ctx->bstate = BS_STOP;
>  rn = "Count";
>  break;
>  /* 6,7 are implementation dependent */

This looks fine, however it should probably be done the same way for
dmfc0 on 64-bit MIPS.

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net

[Qemu-devel] Re: [PATCH 2/3] mips: Break out cpu_mips_timer_expire

2011-01-18 Thread Aurelien Jarno

On Tue, Jan 18, 2011 at 12:29:41AM +0100, edgar.igles...@gmail.com wrote:
> From: Edgar E. Iglesias 
> 
> Reorganize for future patches, no functional change.
> 
> Signed-off-by: Edgar E. Iglesias 
> ---
>  hw/mips_timer.c |   36 ++--
>  1 files changed, 22 insertions(+), 14 deletions(-)
> 
> diff --git a/hw/mips_timer.c b/hw/mips_timer.c
> index e3beee8..8c32087 100644
> --- a/hw/mips_timer.c
> +++ b/hw/mips_timer.c
> @@ -42,16 +42,6 @@ uint32_t cpu_mips_get_random (CPUState *env)
>  }
>  
>  /* MIPS R4K timer */
> -uint32_t cpu_mips_get_count (CPUState *env)
> -{
> -if (env->CP0_Cause & (1 << CP0Ca_DC))
> -return env->CP0_Count;
> -else
> -return env->CP0_Count +
> -(uint32_t)muldiv64(qemu_get_clock(vm_clock),
> -   TIMER_FREQ, get_ticks_per_sec());
> -}
> -
>  static void cpu_mips_timer_update(CPUState *env)
>  {
>  uint64_t now, next;
> @@ -64,6 +54,27 @@ static void cpu_mips_timer_update(CPUState *env)
>  qemu_mod_timer(env->timer, next);
>  }
>  
> +/* Expire the timer.  */
> +static void cpu_mips_timer_expire(CPUState *env)
> +{
> +cpu_mips_timer_update(env);
> +if (env->insn_flags & ISA_MIPS32R2) {
> +env->CP0_Cause |= 1 << CP0Ca_TI;
> +}
> +qemu_irq_raise(env->irq[(env->CP0_IntCtl >> CP0IntCtl_IPTI) & 0x7]);
> +}
> +
> +uint32_t cpu_mips_get_count (CPUState *env)
> +{
> +if (env->CP0_Cause & (1 << CP0Ca_DC)) {
> +return env->CP0_Count;
> +} else {
> +return env->CP0_Count +
> +(uint32_t)muldiv64(qemu_get_clock(vm_clock),
> +   TIMER_FREQ, get_ticks_per_sec());
> +}
> +}
> +
>  void cpu_mips_store_count (CPUState *env, uint32_t count)
>  {
>  if (env->CP0_Cause & (1 << CP0Ca_DC))
> @@ -116,11 +127,8 @@ static void mips_timer_cb (void *opaque)
> the comparator value.  Offset the count by one to avoid immediately
> retriggering the callback before any virtual time has passed.  */
>  env->CP0_Count++;
> -cpu_mips_timer_update(env);
> +cpu_mips_timer_expire(env);
>  env->CP0_Count--;
> -if (env->insn_flags & ISA_MIPS32R2)
> -env->CP0_Cause |= 1 << CP0Ca_TI;
> -qemu_irq_raise(env->irq[(env->CP0_IntCtl >> CP0IntCtl_IPTI) & 0x7]);
>  }
>  
>  void cpu_mips_clock_init (CPUState *env)
> -- 
> 1.7.2.2
> 
> 

Acked-by: Aurelien Jarno 

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net

Re: [Qemu-devel] Re: [PATCH 3/3] mips: Expire late timers when reading cp0_count

2011-01-18 Thread Aurelien Jarno

On Tue, Jan 18, 2011 at 01:33:00AM +0100, Edgar E. Iglesias wrote:
> On Tue, Jan 18, 2011 at 12:29:42AM +0100, edgar.igles...@gmail.com wrote:
> > From: Edgar E. Iglesias 
> > 
> > When reading cp0_count from a timer with a late trigger that should
> > already have expired, expire it and raise the timer irq.
> > 
> > This makes it possible for guest code (e.g, Linux) that first read
> > cp0_count, then compare it with cp0_compare and check for raised
> > timer interrupt lines to run reliably.
> > 
> > Signed-off-by: Edgar E. Iglesias 
> 
> Sorry sent the wrong version of this one. It's supposed to be the
> following:
> 
> commit 139330de404209528712fd703952c0b5ad4459a1
> Author: Edgar E. Iglesias 
> Date:   Tue Jan 18 00:12:22 2011 +0100
> 
> mips: Expire late timers when reading cp0_count
> 
> When reading cp0_count from a timer with a late trigger that should
> already have expired, expire it and raise the timer irq.
> 
> This makes it possible for guest code (e.g, Linux) that first read
> cp0_count, then compare it with cp0_compare and check for raised
> timer interrupt lines to run reliably.
> 
> Signed-off-by: Edgar E. Iglesias 
> 
> diff --git a/hw/mips_timer.c b/hw/mips_timer.c
> index 8c32087..9c95f28 100644
> --- a/hw/mips_timer.c
> +++ b/hw/mips_timer.c
> @@ -69,9 +69,17 @@ uint32_t cpu_mips_get_count (CPUState *env)
>  if (env->CP0_Cause & (1 << CP0Ca_DC)) {
>  return env->CP0_Count;
>  } else {
> +uint64_t now;
> +
> +now = qemu_get_clock(vm_clock);
> +if (qemu_timer_pending(env->timer)
> +&& qemu_timer_expired(env->timer, now)) {
> +/* The timer has already expired.  */
> +cpu_mips_timer_expire(env);
> +}
> +
>  return env->CP0_Count +
> -(uint32_t)muldiv64(qemu_get_clock(vm_clock),
> -   TIMER_FREQ, get_ticks_per_sec());
> +(uint32_t)muldiv64(now, TIMER_FREQ, get_ticks_per_sec());
>  }
>  }
>  

Given the TB is now ended after this instruction (due to patch 1), isn't
the interrupt handled before starting the next TB, where the interrupt
line (I guess CP0_Cause) read?

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net

Re: [Qemu-devel] Re: [PATCH 3/3] mips: Expire late timers when reading cp0_count

2011-01-18 Thread Edgar E. Iglesias

On Tue, Jan 18, 2011 at 11:36:25AM +0100, Aurelien Jarno wrote:
> On Tue, Jan 18, 2011 at 01:33:00AM +0100, Edgar E. Iglesias wrote:
> > On Tue, Jan 18, 2011 at 12:29:42AM +0100, edgar.igles...@gmail.com wrote:
> > > From: Edgar E. Iglesias 
> > > 
> > > When reading cp0_count from a timer with a late trigger that should
> > > already have expired, expire it and raise the timer irq.
> > > 
> > > This makes it possible for guest code (e.g, Linux) that first read
> > > cp0_count, then compare it with cp0_compare and check for raised
> > > timer interrupt lines to run reliably.
> > > 
> > > Signed-off-by: Edgar E. Iglesias 
> > 
> > Sorry sent the wrong version of this one. It's supposed to be the
> > following:
> > 
> > commit 139330de404209528712fd703952c0b5ad4459a1
> > Author: Edgar E. Iglesias 
> > Date:   Tue Jan 18 00:12:22 2011 +0100
> > 
> > mips: Expire late timers when reading cp0_count
> > 
> > When reading cp0_count from a timer with a late trigger that should
> > already have expired, expire it and raise the timer irq.
> > 
> > This makes it possible for guest code (e.g, Linux) that first read
> > cp0_count, then compare it with cp0_compare and check for raised
> > timer interrupt lines to run reliably.
> > 
> > Signed-off-by: Edgar E. Iglesias 
> > 
> > diff --git a/hw/mips_timer.c b/hw/mips_timer.c
> > index 8c32087..9c95f28 100644
> > --- a/hw/mips_timer.c
> > +++ b/hw/mips_timer.c
> > @@ -69,9 +69,17 @@ uint32_t cpu_mips_get_count (CPUState *env)
> >  if (env->CP0_Cause & (1 << CP0Ca_DC)) {
> >  return env->CP0_Count;
> >  } else {
> > +uint64_t now;
> > +
> > +now = qemu_get_clock(vm_clock);
> > +if (qemu_timer_pending(env->timer)
> > +&& qemu_timer_expired(env->timer, now)) {
> > +/* The timer has already expired.  */
> > +cpu_mips_timer_expire(env);
> > +}
> > +
> >  return env->CP0_Count +
> > -(uint32_t)muldiv64(qemu_get_clock(vm_clock),
> > -   TIMER_FREQ, get_ticks_per_sec());
> > +(uint32_t)muldiv64(now, TIMER_FREQ, get_ticks_per_sec());
> >  }
> >  }
> >  
> 
> Given the TB is now ended after this instruction (due to patch 1), isn't
> the interrupt handled before starting the next TB, where the interrupt
> line (I guess CP0_Cause) read?

Hi,

The problem here is different. Due to host timing granularity, the
timer might expire later than it's precise scheduled time. If that
happens, get_count will return a count value that goes beyond the
trigger time but the interrupt may come later (when the host timer
expires).

This patch catches that case and expires the timer in-band, raising
the timer interrupt if needed.

Cheers

Re: [Qemu-devel] [PATCH 1/3] mips: Break TBs after mfc0_count

2011-01-18 Thread Edgar E. Iglesias

On Tue, Jan 18, 2011 at 11:34:28AM +0100, Aurelien Jarno wrote:
> On Tue, Jan 18, 2011 at 12:29:40AM +0100, edgar.igles...@gmail.com wrote:
> > From: Edgar E. Iglesias 
> > 
> > Break the TB after reading the count register. This makes it
> > possible to take timer interrupts immediately after a read of
> > a possibly expired timer.
> > 
> > Signed-off-by: Edgar E. Iglesias 
> > ---
> >  target-mips/translate.c |4 +++-
> >  1 files changed, 3 insertions(+), 1 deletions(-)
> > 
> > diff --git a/target-mips/translate.c b/target-mips/translate.c
> > index cce77be..313cc29 100644
> > --- a/target-mips/translate.c
> > +++ b/target-mips/translate.c
> > @@ -3410,8 +3410,10 @@ static void gen_mfc0 (CPUState *env, DisasContext 
> > *ctx, TCGv arg, int reg, int s
> >  gen_helper_mfc0_count(arg);
> >  if (use_icount) {
> >  gen_io_end();
> > -ctx->bstate = BS_STOP;
> >  }
> > +/* Break the TB to be able to take timer interrupts immediately
> > +   after reading count.  */
> > +ctx->bstate = BS_STOP;
> >  rn = "Count";
> >  break;
> >  /* 6,7 are implementation dependent */
> 
> This looks fine, however it should probably be done the same way for
> dmfc0 on 64-bit MIPS.

You're right, I'll fix that. Thanks.

Re: [Qemu-devel] Re: [PATCH 3/3] mips: Expire late timers when reading cp0_count

2011-01-18 Thread Aurelien Jarno

On Tue, Jan 18, 2011 at 11:41:54AM +0100, Edgar E. Iglesias wrote:
> On Tue, Jan 18, 2011 at 11:36:25AM +0100, Aurelien Jarno wrote:
> > On Tue, Jan 18, 2011 at 01:33:00AM +0100, Edgar E. Iglesias wrote:
> > > On Tue, Jan 18, 2011 at 12:29:42AM +0100, edgar.igles...@gmail.com wrote:
> > > > From: Edgar E. Iglesias 
> > > > 
> > > > When reading cp0_count from a timer with a late trigger that should
> > > > already have expired, expire it and raise the timer irq.
> > > > 
> > > > This makes it possible for guest code (e.g, Linux) that first read
> > > > cp0_count, then compare it with cp0_compare and check for raised
> > > > timer interrupt lines to run reliably.
> > > > 
> > > > Signed-off-by: Edgar E. Iglesias 
> > > 
> > > Sorry sent the wrong version of this one. It's supposed to be the
> > > following:
> > > 
> > > commit 139330de404209528712fd703952c0b5ad4459a1
> > > Author: Edgar E. Iglesias 
> > > Date:   Tue Jan 18 00:12:22 2011 +0100
> > > 
> > > mips: Expire late timers when reading cp0_count
> > > 
> > > When reading cp0_count from a timer with a late trigger that should
> > > already have expired, expire it and raise the timer irq.
> > > 
> > > This makes it possible for guest code (e.g, Linux) that first read
> > > cp0_count, then compare it with cp0_compare and check for raised
> > > timer interrupt lines to run reliably.
> > > 
> > > Signed-off-by: Edgar E. Iglesias 
> > > 
> > > diff --git a/hw/mips_timer.c b/hw/mips_timer.c
> > > index 8c32087..9c95f28 100644
> > > --- a/hw/mips_timer.c
> > > +++ b/hw/mips_timer.c
> > > @@ -69,9 +69,17 @@ uint32_t cpu_mips_get_count (CPUState *env)
> > >  if (env->CP0_Cause & (1 << CP0Ca_DC)) {
> > >  return env->CP0_Count;
> > >  } else {
> > > +uint64_t now;
> > > +
> > > +now = qemu_get_clock(vm_clock);
> > > +if (qemu_timer_pending(env->timer)
> > > +&& qemu_timer_expired(env->timer, now)) {
> > > +/* The timer has already expired.  */
> > > +cpu_mips_timer_expire(env);
> > > +}
> > > +
> > >  return env->CP0_Count +
> > > -(uint32_t)muldiv64(qemu_get_clock(vm_clock),
> > > -   TIMER_FREQ, get_ticks_per_sec());
> > > +(uint32_t)muldiv64(now, TIMER_FREQ, get_ticks_per_sec());
> > >  }
> > >  }
> > >  
> > 
> > Given the TB is now ended after this instruction (due to patch 1), isn't
> > the interrupt handled before starting the next TB, where the interrupt
> > line (I guess CP0_Cause) read?
> 
> Hi,
> 
> The problem here is different. Due to host timing granularity, the
> timer might expire later than it's precise scheduled time. If that
> happens, get_count will return a count value that goes beyond the
> trigger time but the interrupt may come later (when the host timer
> expires).
> 
> This patch catches that case and expires the timer in-band, raising
> the timer interrupt if needed.
> 

Ok, thanks for the explanation.

Acked-by: Aurelien Jarno 


-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net

[Qemu-devel] [PATCH 3/3] qdev-properties: add PROP_TYPE_ENUM

2011-01-18 Thread Alon Levy

Example usage:

EnumTable foo_enum_table[] = {
{"bar", 1},
{"buz", 2},
{NULL, 0},
};

DEFINE_PROP_ENUM("foo", State, foo, 1, foo_enum_table)

When using qemu -device foodev,? it will appear as:
 foodev.foo=bar/buz
---
 hw/qdev-properties.c |   60 ++
 hw/qdev.h|   15 
 2 files changed, 75 insertions(+), 0 deletions(-)

diff --git a/hw/qdev-properties.c b/hw/qdev-properties.c
index a493087..3157721 100644
--- a/hw/qdev-properties.c
+++ b/hw/qdev-properties.c
@@ -63,6 +63,66 @@ PropertyInfo qdev_prop_bit = {
 .print = print_bit,
 };
 
+/* --- Enumeration --- */
+/* Example usage:
+EnumTable foo_enum_table[] = {
+{"bar", 1},
+{"buz", 2},
+{NULL, 0},
+};
+DEFINE_PROP_ENUM("foo", State, foo, 1, foo_enum_table),
+ */
+static int parse_enum(DeviceState *dev, Property *prop, const char *str)
+{
+uint8_t *ptr = qdev_get_prop_ptr(dev, prop);
+EnumTable *option = (EnumTable*)prop->data;
+
+while (option->name != NULL) {
+if (!strncmp(str, option->name, strlen(option->name))) {
+*ptr = option->value;
+return 0;
+}
+option++;
+}
+return -EINVAL;
+}
+
+static int print_enum(DeviceState *dev, Property *prop, char *dest, size_t len)
+{
+uint32_t *p = qdev_get_prop_ptr(dev, prop);
+EnumTable *option = (EnumTable*)prop->data;
+while (option->name != NULL) {
+if (*p == option->value) {
+return snprintf(dest, len, "%s", option->name);
+}
+option++;
+}
+return 0;
+}
+
+static int print_enum_options(DeviceInfo *info, Property *prop, char *dest, 
size_t len)
+{
+int ret = 0;
+EnumTable *option = (EnumTable*)prop->data;
+while (option->name != NULL) {
+ret += snprintf(dest + ret, len - ret, "%s", option->name);
+if (option[1].name != NULL) {
+ret += snprintf(dest + ret, len - ret, "/");
+}
+option++;
+}
+return ret;
+}
+
+PropertyInfo qdev_prop_enum = {
+.name  = "enum",
+.type  = PROP_TYPE_ENUM,
+.size  = sizeof(uint32_t),
+.parse = parse_enum,
+.print = print_enum,
+.print_options = print_enum_options,
+};
+
 /* --- 8bit integer --- */
 
 static int parse_uint8(DeviceState *dev, Property *prop, const char *str)
diff --git a/hw/qdev.h b/hw/qdev.h
index f6d5279..26b3d9e 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -102,6 +102,7 @@ enum PropertyType {
 PROP_TYPE_VLAN,
 PROP_TYPE_PTR,
 PROP_TYPE_BIT,
+PROP_TYPE_ENUM,
 };
 
 struct PropertyInfo {
@@ -121,6 +122,11 @@ typedef struct GlobalProperty {
 QTAILQ_ENTRY(GlobalProperty) next;
 } GlobalProperty;
 
+typedef struct EnumTable {
+const char *name;
+uint32_tvalue;
+} EnumTable;
+
 /*** Board API.  This should go away once we have a machine config file.  ***/
 
 DeviceState *qdev_create(BusState *bus, const char *name);
@@ -237,6 +243,7 @@ extern PropertyInfo qdev_prop_drive;
 extern PropertyInfo qdev_prop_netdev;
 extern PropertyInfo qdev_prop_vlan;
 extern PropertyInfo qdev_prop_pci_devfn;
+extern PropertyInfo qdev_prop_enum;
 
 #define DEFINE_PROP(_name, _state, _field, _prop, _type) { \
 .name  = (_name),\
@@ -259,6 +266,14 @@ extern PropertyInfo qdev_prop_pci_devfn;
 + type_check(uint32_t,typeof_field(_state, _field)), \
 .defval= (bool[]) { (_defval) }, \
 }
+#define DEFINE_PROP_ENUM(_name, _state, _field, _defval, _options) {\
+.name  = (_name),   \
+.info  = &(qdev_prop_enum), \
+.offset= offsetof(_state, _field)   \
++ type_check(uint32_t,typeof_field(_state, _field)),\
+.defval= (uint32_t[]) { (_defval) },\
+.data  = (void*)(_options), \
+}
 
 #define DEFINE_PROP_UINT8(_n, _s, _f, _d)   \
 DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_uint8, uint8_t)
-- 
1.7.3.4

[Qemu-devel] [PATCH 0/3] add PROP_TYPE_ENUM and print_options callback

2011-01-18 Thread Alon Levy

This patchset allows a new property type called PROP_TYPE_ENUM,
I want to use it for the backend property in the ccid patches (will
send the patchset that uses it after this), libvirt is the ultimate
user.

The first patch adds a print_options callback that works with this
property type to print the optional values.

The second patch allows storing the name/value mapping in the property,
using a void ptr for later different uses.

The third patch adds the property itself.

Alon Levy (3):
  qdev: add print_options callback
  qdev: add data pointer to Property
  qdev-properties: add PROP_TYPE_ENUM

 hw/qdev-properties.c |   60 ++
 hw/qdev.c|   10 +++-
 hw/qdev.h|   17 ++
 3 files changed, 86 insertions(+), 1 deletions(-)

-- 
1.7.3.4

[Qemu-devel] [PATCH 2/3] qdev: add data pointer to Property

2011-01-18 Thread Alon Levy

For later use by PROP_TYPE_ENUM, will store enumeration name/value
table there.
---
 hw/qdev.h |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/hw/qdev.h b/hw/qdev.h
index 530fc5d..f6d5279 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -83,6 +83,7 @@ struct Property {
 int  offset;
 int  bitnr;
 void *defval;
+void *data;
 };
 
 enum PropertyType {
-- 
1.7.3.4

[Qemu-devel] [PATCH 1/3] qdev: add print_options callback

2011-01-18 Thread Alon Levy

another callback added to PropertyInfo, for later use by PROP_TYPE_ENUM.
Allows printing of runtime computed options when doing:
 qemu -device foo,?
---
 hw/qdev.c |   10 +-
 hw/qdev.h |1 +
 2 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/hw/qdev.c b/hw/qdev.c
index 5b8d374..d1550b9 100644
--- a/hw/qdev.c
+++ b/hw/qdev.c
@@ -188,7 +188,15 @@ int qdev_device_help(QemuOpts *opts)
 if (!prop->info->parse) {
 continue;   /* no way to set it, don't show */
 }
-error_printf("%s.%s=%s\n", info->name, prop->name, prop->info->name);
+if (prop->info->print_options) {
+char buf[256];
+int ret;
+ret = prop->info->print_options(info, prop, buf, sizeof(buf) - 3);
+error_printf("%s.%s=%s%s\n", info->name, prop->name, buf,
+ret == sizeof(buf) - 3 ? "..." : "" );
+} else {
+error_printf("%s.%s=%s\n", info->name, prop->name, 
prop->info->name);
+}
 }
 return 1;
 }
diff --git a/hw/qdev.h b/hw/qdev.h
index e520aaa..530fc5d 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -109,6 +109,7 @@ struct PropertyInfo {
 enum PropertyType type;
 int (*parse)(DeviceState *dev, Property *prop, const char *str);
 int (*print)(DeviceState *dev, Property *prop, char *dest, size_t len);
+int (*print_options)(DeviceInfo *info, Property *prop, char *dest, size_t 
len);
 void (*free)(DeviceState *dev, Property *prop);
 };
 
-- 
1.7.3.4

Re: [Qemu-devel] [PATCH v2 1/3] scsi-disk: Allow overriding SCSI INQUIRY removable bit

2011-01-18 Thread Kevin Wolf

Am 18.01.2011 11:10, schrieb Stefan Hajnoczi:
> Provide the "removable" qdev property bit to override the SCSI INQUIRY
> removable (RMB) bit for non-CDROM devices.  This will be used by USB
> Mass Storage Devices, which sometimes have this guest-visible bit set
> and sometimes do not.  They therefore requires a means for user
> configuration.
> 
> Signed-off-by: Stefan Hajnoczi 

Should we print an error message when the user tries to make a CD-ROM
non-removable instead of silently ignoring the option?

Kevin

[Qemu-devel] Re: Fwd: Re: port-sparc/44389: awk failure during m4 installation with pkgsrc

2011-01-18 Thread Mateusz Loskot

Blue Swirl  gmail.com> writes:
> Mateusz Loskot  loskot.net> wrote:
> > Hi,
> >
> > I emulate SPARC with NetBSD 5.0 installed and I've experienced a problem
> > with pkgsrc installing one of packages.
> > I submitted bug report to NetBSD team and received short response
> > suggesting it is likely QEMU problem (see original message below).
> >
> > Shortly, the problem is that AWK throws "floating point exception"
> > Here is the bug report with details:
> >
> > http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44389
> >
> > Given the nature of the problem, it "feels" the issue is unrelated to
> > QEMU, but I'm not on position to judge the suggestion posted by NetBSD
> > team is pointless.
> >
> > I have no idea how to isolate the problem on NetBSD, so I replied to
> > NetBSD team asking:
> > ===
> > Could you suggest how to isolate the problem, find exact awk command
> > causing the error, so I can debug it or forward to QEMU developers with
> > details necessary to reproduce it?
> > ===
> >
> > In the meantime, I thought I will attack qemu-devel hoping it may ring
> > some bells here as well.
> >
> > Any ideas?
> 
> It's entirely possible that the floating point support for Sparc can
> be buggy, there have been a few fixes to softfloat recently for other
> architectures.
> 
> I'd check if NaN handling or floating point exception support works
> correctly, regular programs don't use those. But a reproducible test
> case is essential.


I think I have managed to isolate reproducible test for this problem
and it seems the issue is in NetBSD userland.
See my last update to the problem reprot:

http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44389

Long story short, the problem boils down to the following command:

# echo NaN | awk '{print "test"}'
awk: floating point exception 8
  source line number 1

which interprets "NaN" as a number when it isn't asked to.

Best regards,
-- 
Mateusz Loskot, http://mateusz.loskot.net
Charter Member of OSGeo, http://osgeo.org
Member of ACCU, http://accu.org

Re: [Qemu-devel] [PATCH 1/3] mips: Break TBs after mfc0_count

2011-01-18 Thread Edgar E. Iglesias

On Tue, Jan 18, 2011 at 11:34:28AM +0100, Aurelien Jarno wrote:
> On Tue, Jan 18, 2011 at 12:29:40AM +0100, edgar.igles...@gmail.com wrote:
> > From: Edgar E. Iglesias 
> > 
> > Break the TB after reading the count register. This makes it
> > possible to take timer interrupts immediately after a read of
> > a possibly expired timer.
> > 
> > Signed-off-by: Edgar E. Iglesias 
> > ---
> >  target-mips/translate.c |4 +++-
> >  1 files changed, 3 insertions(+), 1 deletions(-)
> > 
> > diff --git a/target-mips/translate.c b/target-mips/translate.c
> > index cce77be..313cc29 100644
> > --- a/target-mips/translate.c
> > +++ b/target-mips/translate.c
> > @@ -3410,8 +3410,10 @@ static void gen_mfc0 (CPUState *env, DisasContext 
> > *ctx, TCGv arg, int reg, int s
> >  gen_helper_mfc0_count(arg);
> >  if (use_icount) {
> >  gen_io_end();
> > -ctx->bstate = BS_STOP;
> >  }
> > +/* Break the TB to be able to take timer interrupts immediately
> > +   after reading count.  */
> > +ctx->bstate = BS_STOP;
> >  rn = "Count";
> >  break;
> >  /* 6,7 are implementation dependent */
> 
> This looks fine, however it should probably be done the same way for
> dmfc0 on 64-bit MIPS.

Thanks for the quick review, I've pushed the series with this
suggested change.

Cheers

[Qemu-devel] Re: [PATCH v2 0/5] iohandlers: Add support for enabling/disabling individual handlers

2011-01-18 Thread Amit Shah

On (Mon) Jan 17 2011 [08:57:16], Anthony Liguori wrote:
> >Also -- this patchset was prompted by a bug in qemu chardevs that
> >freezes guests if they write faster than the chardevs can consume.
> >What should the strategy on fixing that bug be?
> 
> Fix the loop API such that we're not just fixing one bug but that we
> can address a bunch of other bugs that are out there.

But what's the right fix?

* Pull in glib or some other library
* Mimic API from some other library

Amit

[Qemu-devel] [PATCH v3 0/3] qcow2 metadata cache

2011-01-18 Thread Kevin Wolf

block-queue turned out to be too big effort to be useful for quickly fixing the
performance problems that qcow2 got since we introduced the metadata flushes.
While I still think the idea is right, it needs more time and qcow2 doesn't
have more time. Let's come back to block-queue later when the most urgent qcow2
problems are fixed.

So this is the idea of block-queue applied to the very specific case of qcow2.
Whereas block-queue tried to be a generic solution for all kind of things and
tried to make all writes asynchronous at the same time, this is only about
batching writes to refcount blocks and L2 tables in qcow2 and getting the
dependencies right. (Yes, the L1 table and refcount table is left alone. They
are almost never written to anyway.)

This should be much easier to understand and review, and I myself feel a bit
more confident about it than with block-queue, too.

v1:
- Don't read newly allocated tables from the disk before memsetting them to
  zero

v2:
- Addressed Stefan's review comments
- Added patch 3 to avoid an unnecessary bdrv_flush after COW

v3:
- Some error path fixes (esp. missing or double qcow2_cache_put)

Kevin Wolf (3):
  qcow2: Add QcowCache
  qcow2: Use QcowCache
  qcow2: Batch flushes for COW

 Makefile.objs  |2 +-
 block/qcow2-cache.c|  304 
 block/qcow2-cluster.c  |  206 -
 block/qcow2-refcount.c |  249 
 block/qcow2.c  |   48 +++-
 block/qcow2.h  |   32 +-
 6 files changed, 546 insertions(+), 295 deletions(-)
 create mode 100644 block/qcow2-cache.c

-- 
1.7.2.3

[Qemu-devel] [PATCH v3 1/3] qcow2: Add QcowCache

2011-01-18 Thread Kevin Wolf

This adds some new cache functions to qcow2 which can be used for caching
refcount blocks and L2 tables. When used with cache=writethrough they work
like the old caching code which is spread all over qcow2, so for this case we
have merely a cleanup.

The interesting case is with writeback caching (this includes cache=none) where
data isn't written to disk immediately but only kept in cache initially. This
leads to some form of metadata write batching which avoids the current "write
to refcount block, flush, write to L2 table" pattern for each single request
when a lot of cluster allocations happen. Instead, cache entries are only
written out if its required to maintain the right order. In the pure cluster
allocation case this means that all metadata updates for requests are done in
memory initially and on sync, first the refcount blocks are written to disk,
then fsync, then L2 tables.

This improves performance of scenarios with lots of cluster allocations
noticably (e.g. installation or after taking a snapshot).

Signed-off-by: Kevin Wolf 
---
 Makefile.objs   |2 +-
 block/qcow2-cache.c |  290 +++
 block/qcow2.h   |   19 
 3 files changed, 310 insertions(+), 1 deletions(-)
 create mode 100644 block/qcow2-cache.c

diff --git a/Makefile.objs b/Makefile.objs
index c3e52c5..65889c9 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -19,7 +19,7 @@ block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
 block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
 
 block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o 
vvfat.o
-block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o
+block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o 
qcow2-cache.o
 block-nested-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
 block-nested-y += qed-check.o
 block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o
diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
new file mode 100644
index 000..f7c4e2a
--- /dev/null
+++ b/block/qcow2-cache.c
@@ -0,0 +1,290 @@
+/*
+ * L2/refcount table cache for the QCOW2 format
+ *
+ * Copyright (c) 2010 Kevin Wolf 
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "block_int.h"
+#include "qemu-common.h"
+#include "qcow2.h"
+
+typedef struct Qcow2CachedTable {
+void*   table;
+int64_t offset;
+booldirty;
+int cache_hits;
+int ref;
+} Qcow2CachedTable;
+
+struct Qcow2Cache {
+int size;
+Qcow2CachedTable*   entries;
+struct Qcow2Cache*  depends;
+boolwritethrough;
+};
+
+Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables,
+bool writethrough)
+{
+BDRVQcowState *s = bs->opaque;
+Qcow2Cache *c;
+int i;
+
+c = qemu_mallocz(sizeof(*c));
+c->size = num_tables;
+c->entries = qemu_mallocz(sizeof(*c->entries) * num_tables);
+c->writethrough = writethrough;
+
+for (i = 0; i < c->size; i++) {
+c->entries[i].table = qemu_blockalign(bs, s->cluster_size);
+}
+
+return c;
+}
+
+int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c)
+{
+int i;
+
+for (i = 0; i < c->size; i++) {
+assert(c->entries[i].ref == 0);
+qemu_vfree(c->entries[i].table);
+}
+
+qemu_free(c->entries);
+qemu_free(c);
+
+return 0;
+}
+
+static int qcow2_cache_flush_dependency(BlockDriverState *bs, Qcow2Cache *c)
+{
+int ret;
+
+ret = qcow2_cache_flush(bs, c->depends);
+if (ret < 0) {
+return ret;
+}
+
+c->depends = NULL;
+return 0;
+}
+
+static int qcow2_cache_entry_flush(BlockDriverState *bs, Qcow2Cache *c, int i)
+{
+BDRVQcowState *s = bs->opaque;
+int ret;
+
+if (!c->entries[i].dirty || !c->entries[i].offset) {
+return 0;
+}
+
+if (c->depends) {
+ret = qcow2_cache_flush_dependency(

[Qemu-devel] [PATCH v3 3/3] qcow2: Batch flushes for COW

2011-01-18 Thread Kevin Wolf

qcow2 calls bdrv_flush() after performing COW in order to ensure that the
L2 table change is never written before the copy is safe on disk. Now that the
L2 table is cached, we can wait with flushing until we write out the next L2
table.

Signed-off-by: Kevin Wolf 
---
 block/qcow2-cache.c   |   20 +---
 block/qcow2-cluster.c |2 +-
 block/qcow2.h |1 +
 3 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index f7c4e2a..57626c1 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -38,6 +38,7 @@ struct Qcow2Cache {
 int size;
 Qcow2CachedTable*   entries;
 struct Qcow2Cache*  depends;
+booldepends_on_flush;
 boolwritethrough;
 };
 
@@ -85,13 +86,15 @@ static int qcow2_cache_flush_dependency(BlockDriverState 
*bs, Qcow2Cache *c)
 }
 
 c->depends = NULL;
+c->depends_on_flush = false;
+
 return 0;
 }
 
 static int qcow2_cache_entry_flush(BlockDriverState *bs, Qcow2Cache *c, int i)
 {
 BDRVQcowState *s = bs->opaque;
-int ret;
+int ret = 0;
 
 if (!c->entries[i].dirty || !c->entries[i].offset) {
 return 0;
@@ -99,11 +102,17 @@ static int qcow2_cache_entry_flush(BlockDriverState *bs, 
Qcow2Cache *c, int i)
 
 if (c->depends) {
 ret = qcow2_cache_flush_dependency(bs, c);
-if (ret < 0) {
-return ret;
+} else if (c->depends_on_flush) {
+ret = bdrv_flush(bs->file);
+if (ret >= 0) {
+c->depends_on_flush = false;
 }
 }
 
+if (ret < 0) {
+return ret;
+}
+
 ret = bdrv_pwrite(bs->file, c->entries[i].offset, c->entries[i].table,
 s->cluster_size);
 if (ret < 0) {
@@ -161,6 +170,11 @@ int qcow2_cache_set_dependency(BlockDriverState *bs, 
Qcow2Cache *c,
 return 0;
 }
 
+void qcow2_cache_depends_on_flush(Qcow2Cache *c)
+{
+c->depends_on_flush = true;
+}
+
 static int qcow2_cache_find_entry_to_replace(Qcow2Cache *c)
 {
 int i;
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 5eaf8e2..10c40ba 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -638,7 +638,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, 
QCowL2Meta *m)
  * handled.
  */
 if (cow) {
-bdrv_flush(bs->file);
+qcow2_cache_depends_on_flush(s->l2_table_cache);
 }
 
 qcow2_cache_set_dependency(bs, s->l2_table_cache, s->refcount_block_cache);
diff --git a/block/qcow2.h b/block/qcow2.h
index 11cbce3..6d80120 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -229,6 +229,7 @@ void qcow2_cache_entry_mark_dirty(Qcow2Cache *c, void 
*table);
 int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c);
 int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c,
 Qcow2Cache *dependency);
+void qcow2_cache_depends_on_flush(Qcow2Cache *c);
 
 int qcow2_cache_get(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,
 void **table);
-- 
1.7.2.3

[Qemu-devel] [PATCH v3 2/3] qcow2: Use QcowCache

2011-01-18 Thread Kevin Wolf

Use the new functions of qcow2-cache.c for everything that works on refcount
block and L2 tables.

Signed-off-by: Kevin Wolf 
---
 block/qcow2-cluster.c  |  206 ++--
 block/qcow2-refcount.c |  249 +++-
 block/qcow2.c  |   48 +-
 block/qcow2.h  |   12 ++-
 4 files changed, 221 insertions(+), 294 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index c3ef550..5eaf8e2 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -67,7 +67,11 @@ int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, 
bool exact_size)
 qemu_free(new_l1_table);
 return new_l1_table_offset;
 }
-bdrv_flush(bs->file);
+
+ret = qcow2_cache_flush(bs, s->refcount_block_cache);
+if (ret < 0) {
+return ret;
+}
 
 BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_WRITE_TABLE);
 for(i = 0; i < s->l1_size; i++)
@@ -98,63 +102,6 @@ int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, 
bool exact_size)
 return ret;
 }
 
-void qcow2_l2_cache_reset(BlockDriverState *bs)
-{
-BDRVQcowState *s = bs->opaque;
-
-memset(s->l2_cache, 0, s->l2_size * L2_CACHE_SIZE * sizeof(uint64_t));
-memset(s->l2_cache_offsets, 0, L2_CACHE_SIZE * sizeof(uint64_t));
-memset(s->l2_cache_counts, 0, L2_CACHE_SIZE * sizeof(uint32_t));
-}
-
-static inline int l2_cache_new_entry(BlockDriverState *bs)
-{
-BDRVQcowState *s = bs->opaque;
-uint32_t min_count;
-int min_index, i;
-
-/* find a new entry in the least used one */
-min_index = 0;
-min_count = 0x;
-for(i = 0; i < L2_CACHE_SIZE; i++) {
-if (s->l2_cache_counts[i] < min_count) {
-min_count = s->l2_cache_counts[i];
-min_index = i;
-}
-}
-return min_index;
-}
-
-/*
- * seek_l2_table
- *
- * seek l2_offset in the l2_cache table
- * if not found, return NULL,
- * if found,
- *   increments the l2 cache hit count of the entry,
- *   if counter overflow, divide by two all counters
- *   return the pointer to the l2 cache entry
- *
- */
-
-static uint64_t *seek_l2_table(BDRVQcowState *s, uint64_t l2_offset)
-{
-int i, j;
-
-for(i = 0; i < L2_CACHE_SIZE; i++) {
-if (l2_offset == s->l2_cache_offsets[i]) {
-/* increment the hit count */
-if (++s->l2_cache_counts[i] == 0x) {
-for(j = 0; j < L2_CACHE_SIZE; j++) {
-s->l2_cache_counts[j] >>= 1;
-}
-}
-return s->l2_cache + (i << s->l2_bits);
-}
-}
-return NULL;
-}
-
 /*
  * l2_load
  *
@@ -169,33 +116,12 @@ static int l2_load(BlockDriverState *bs, uint64_t 
l2_offset,
 uint64_t **l2_table)
 {
 BDRVQcowState *s = bs->opaque;
-int min_index;
 int ret;
 
-/* seek if the table for the given offset is in the cache */
-
-*l2_table = seek_l2_table(s, l2_offset);
-if (*l2_table != NULL) {
-return 0;
-}
-
-/* not found: load a new entry in the least used one */
-
-min_index = l2_cache_new_entry(bs);
-*l2_table = s->l2_cache + (min_index << s->l2_bits);
-
 BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD);
-ret = bdrv_pread(bs->file, l2_offset, *l2_table,
-s->l2_size * sizeof(uint64_t));
-if (ret < 0) {
-qcow2_l2_cache_reset(bs);
-return ret;
-}
+ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset, (void**) l2_table);
 
-s->l2_cache_offsets[min_index] = l2_offset;
-s->l2_cache_counts[min_index] = 1;
-
-return 0;
+return ret;
 }
 
 /*
@@ -238,7 +164,6 @@ static int write_l1_entry(BlockDriverState *bs, int 
l1_index)
 static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table)
 {
 BDRVQcowState *s = bs->opaque;
-int min_index;
 uint64_t old_l2_offset;
 uint64_t *l2_table;
 int64_t l2_offset;
@@ -252,29 +177,48 @@ static int l2_allocate(BlockDriverState *bs, int 
l1_index, uint64_t **table)
 if (l2_offset < 0) {
 return l2_offset;
 }
-bdrv_flush(bs->file);
+
+ret = qcow2_cache_flush(bs, s->refcount_block_cache);
+if (ret < 0) {
+goto fail;
+}
 
 /* allocate a new entry in the l2 cache */
 
-min_index = l2_cache_new_entry(bs);
-l2_table = s->l2_cache + (min_index << s->l2_bits);
+ret = qcow2_cache_get_empty(bs, s->l2_table_cache, l2_offset, (void**) 
table);
+if (ret < 0) {
+return ret;
+}
+
+l2_table = *table;
 
 if (old_l2_offset == 0) {
 /* if there was no old l2 table, clear the new table */
 memset(l2_table, 0, s->l2_size * sizeof(uint64_t));
 } else {
+uint64_t* old_table;
+
 /* if there was an old l2 table, read it from the disk */
 BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_COW_READ);
-ret = bdrv_pread(bs->file, old_l2_offset, l2_table,
-s->l2_size * sizeof(uint64_t));
+ret = qcow2_cache

[Qemu-devel] Re: [PATCH 8/8] ahci: fix !msi interrupts

2011-01-18 Thread Alexander Graf

On 18.01.2011, at 10:08, Gerd Hoffmann wrote:

>  Hi,
> 
>>> Worse might also be that unknown issue that force you to inject an IRQ
>>> here. We don't know. That's probably worst.
>> 
>> Well, IIRC the issue was that usually a level high interrupt line would
>> simply retrigger an interrupt after enabling the interrupt line in the
>> APIC again.
> 
> edge triggered interrupts wouldn't though.

The code change doesn't change anything for edge triggered interrupts - they 
work fine. Only !msi (== level) ones are broken.

> 
>> Unless my memory completely fails on me, that didn't happen,
>> so I added the manual retrigger on a partial command ACK in ahci code.
> 
> That re-trigger smells like you are not clearing the interrupt line where you 
> should.  For starters try calling ahci_check_irq() after guest writes to 
> PORT_IRQ_STAT.

The problem happened when I had the following:

ahci irq bits = 0

ahci irq bits = 1 | 2
irq line trigger
guest clears 1
ahci bits = 2

On normal hardware, the high state of the irq line would simply trigger another 
interrupt in the guest when it gets ACKed on the LAPIC. Somehow it doesn't get 
triggered here. I don't see why clearing the interrupt line would help? It's 
not what real hardware would do, no?

Alex

Re: [Qemu-devel] [PATCH v2 1/3] scsi-disk: Allow overriding SCSI INQUIRY removable bit

2011-01-18 Thread Stefan Hajnoczi

On Tue, Jan 18, 2011 at 11:39 AM, Kevin Wolf  wrote:
> Am 18.01.2011 11:10, schrieb Stefan Hajnoczi:
>> Provide the "removable" qdev property bit to override the SCSI INQUIRY
>> removable (RMB) bit for non-CDROM devices.  This will be used by USB
>> Mass Storage Devices, which sometimes have this guest-visible bit set
>> and sometimes do not.  They therefore requires a means for user
>> configuration.
>>
>> Signed-off-by: Stefan Hajnoczi 
>
> Should we print an error message when the user tries to make a CD-ROM
> non-removable instead of silently ignoring the option?

Good point.  I will add a check in scsi_disk_initfn() for v3.

Stefan

[Qemu-devel] Re: [PATCH 2/8] ahci: split ICH and AHCI even more

2011-01-18 Thread Kevin Wolf

Am 20.12.2010 22:13, schrieb Alexander Graf:
> Sebastian's patch already did a pretty good job at splitting up ICH-9
> AHCI code and the AHCI core. We need some more though. Copyright was missing,
> the lspci dump belongs to ICH-9, we don't need the AHCI core to have its
> own qdev device duplicate.
> 
> So let's split them a bit more in this patch, making things easier to
> read an understand.
> 
> Signed-off-by: Alexander Graf 

The license header is still missing in ahci.h.

Kevin

[Qemu-devel] Re: [PATCH 3/8] ahci: send init d2h fis on fis enable

2011-01-18 Thread Kevin Wolf

Am 20.12.2010 22:13, schrieb Alexander Graf:
> The drive sends a d2h init fis on initialization. Usually, the guest doesn't
> receive fises yet at that point though, so the delivery is deferred.
> 
> Let's reflect that by sending the init fis on fis receive enablement.
> 
> Signed-off-by: Alexander Graf 

Hm... If I read the spec right, the real solution wouldn't be an
init_d2h_sent flag, but implementing a queue for FISes that are received
when the guest hasn't set FRE yet.

I'm not against taking a hack like this, but maybe leave a comment
somewhere at least.

Kevin

[Qemu-devel] Re: [PATCH V2 1/3] Introduce log_start/log_stop in CPUPhysMemoryClient

2011-01-18 Thread anthony . perard

From: Anthony PERARD 

In order to use log_start/log_stop with Xen as well in the vga code,
this two operations have been put in CPUPhysMemoryClient.

The two new functions cpu_physical_log_start,cpu_physical_log_stop are
used in hw/vga.c and replace the kvm_log_start/stop. With this, vga does
no longer depends on kvm header.

Signed-off-by: Anthony PERARD 
---
 cpu-all.h|6 ++
 cpu-common.h |4 
 exec.c   |   28 
 hw/vga.c |   31 ---
 hw/vhost.c   |2 ++
 kvm-all.c|8 ++--
 kvm-stub.c   |   10 --
 kvm.h|3 ---
 8 files changed, 62 insertions(+), 30 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index 30ae17d..aaf5442 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -957,6 +957,12 @@ int cpu_physical_memory_get_dirty_tracking(void);
 int cpu_physical_sync_dirty_bitmap(target_phys_addr_t start_addr,
target_phys_addr_t end_addr);
 
+int cpu_physical_log_start(target_phys_addr_t start_addr,
+   ram_addr_t size);
+
+int cpu_physical_log_stop(target_phys_addr_t start_addr,
+   ram_addr_t size);
+
 void dump_exec_info(FILE *f, fprintf_function cpu_fprintf);
 #endif /* !CONFIG_USER_ONLY */
 
diff --git a/cpu-common.h b/cpu-common.h
index 8ec01f4..2344842 100644
--- a/cpu-common.h
+++ b/cpu-common.h
@@ -88,6 +88,10 @@ struct CPUPhysMemoryClient {
  target_phys_addr_t end_addr);
 int (*migration_log)(struct CPUPhysMemoryClient *client,
  int enable);
+int (*log_start)(struct CPUPhysMemoryClient *client,
+ target_phys_addr_t phys_addr, ram_addr_t size);
+int (*log_stop)(struct CPUPhysMemoryClient *client,
+target_phys_addr_t phys_addr, ram_addr_t size);
 QLIST_ENTRY(CPUPhysMemoryClient) list;
 };
 
diff --git a/exec.c b/exec.c
index c6ed96d..59a6426 100644
--- a/exec.c
+++ b/exec.c
@@ -2073,6 +2073,34 @@ int cpu_physical_sync_dirty_bitmap(target_phys_addr_t 
start_addr,
 return ret;
 }
 
+int cpu_physical_log_start(target_phys_addr_t start_addr,
+   ram_addr_t size)
+{
+CPUPhysMemoryClient *client;
+QLIST_FOREACH(client, &memory_client_list, list) {
+if (client->log_start) {
+int r = client->log_start(client, start_addr, size);
+if (r < 0)
+return r;
+}
+}
+return 0;
+}
+
+int cpu_physical_log_stop(target_phys_addr_t start_addr,
+  ram_addr_t size)
+{
+CPUPhysMemoryClient *client;
+QLIST_FOREACH(client, &memory_client_list, list) {
+if (client->log_stop) {
+int r = client->log_stop(client, start_addr, size);
+if (r < 0)
+return r;
+}
+}
+return 0;
+}
+
 static inline void tlb_update_dirty(CPUTLBEntry *tlb_entry)
 {
 ram_addr_t ram_addr;
diff --git a/hw/vga.c b/hw/vga.c
index c057f4f..a9bf172 100644
--- a/hw/vga.c
+++ b/hw/vga.c
@@ -28,7 +28,6 @@
 #include "vga_int.h"
 #include "pixel_ops.h"
 #include "qemu-timer.h"
-#include "kvm.h"
 
 //#define DEBUG_VGA
 //#define DEBUG_VGA_MEM
@@ -1597,34 +1596,36 @@ static void vga_sync_dirty_bitmap(VGACommonState *s)
 
 void vga_dirty_log_start(VGACommonState *s)
 {
-if (kvm_enabled() && s->map_addr)
-kvm_log_start(s->map_addr, s->map_end - s->map_addr);
+if (s->map_addr) {
+cpu_physical_log_start(s->map_addr, s->map_end - s->map_addr);
+}
 
-if (kvm_enabled() && s->lfb_vram_mapped) {
-kvm_log_start(isa_mem_base + 0xa, 0x8000);
-kvm_log_start(isa_mem_base + 0xa8000, 0x8000);
+if (s->lfb_vram_mapped) {
+cpu_physical_log_start(isa_mem_base + 0xa, 0x8000);
+cpu_physical_log_start(isa_mem_base + 0xa8000, 0x8000);
 }
 
 #ifdef CONFIG_BOCHS_VBE
-if (kvm_enabled() && s->vbe_mapped) {
-kvm_log_start(VBE_DISPI_LFB_PHYSICAL_ADDRESS, s->vram_size);
+if (s->vbe_mapped) {
+cpu_physical_log_start(VBE_DISPI_LFB_PHYSICAL_ADDRESS, s->vram_size);
 }
 #endif
 }
 
 void vga_dirty_log_stop(VGACommonState *s)
 {
-if (kvm_enabled() && s->map_addr)
-   kvm_log_stop(s->map_addr, s->map_end - s->map_addr);
+if (s->map_addr) {
+   cpu_physical_log_stop(s->map_addr, s->map_end - s->map_addr);
+}
 
-if (kvm_enabled() && s->lfb_vram_mapped) {
-   kvm_log_stop(isa_mem_base + 0xa, 0x8000);
-   kvm_log_stop(isa_mem_base + 0xa8000, 0x8000);
+if (s->lfb_vram_mapped) {
+   cpu_physical_log_stop(isa_mem_base + 0xa, 0x8000);
+   cpu_physical_log_stop(isa_mem_base + 0xa8000, 0x8000);
 }
 
 #ifdef CONFIG_BOCHS_VBE
-if (kvm_enabled() && s->vbe_mapped) {
-   kvm_log_stop(VBE_DISPI_LFB_PHYSICAL_ADDRESS, s->vram_size);
+if (s->vbe_mapped) {
+   cpu_physical_log_stop(VBE_DISPI_LFB_PHYSICAL_ADDRESS, s->vram_size);
 }
 #endif
 }
diff --git a/hw/vhost.

Re: [Qemu-devel] [PATCH 1/3] block: add resize monitor command

2011-01-18 Thread Christoph Hellwig

On Mon, Jan 17, 2011 at 06:36:15PM +0100, Kevin Wolf wrote:
> size should be 'o' instead of 'l'. The latter may be too small on 32 bit
> hosts and doesn't support convenient suffixes:

Fixed.

> Hm, is there a real reason except that CD-ROMs are read-only? The code
> below seems to take read-only devices into account.

I've removed the CDROM check for the next version.

[Qemu-devel] Re: [PATCH 4/8] ahci: use qiov instead of dma helpers

2011-01-18 Thread Kevin Wolf

Am 20.12.2010 22:13, schrieb Alexander Graf:
> The DMA helpers incur additional overhead on data transfers. I'm not
> sure we need the additional complexity provided by them. So let's just
> use qiovs directly when running in the fast path (ncq).
> 
> Signed-off-by: Alexander Graf 
> ---
>  hw/ide/ahci.c |  100 
>  hw/ide/ahci.h |3 ++
>  2 files changed, 95 insertions(+), 8 deletions(-)

I don't feel comfortable with this one, and I think a while ago we
discussed on IRC why the DMA helpers even exist. If AHCI doesn't need
them, probably nobody needed them.

However, I'm inclined to think that AHCI actually _does_ need them in
corner cases, even though it might not break in all the common cases
that you have tested. Can you explain why only AHCI doesn't need it or
is it just "didn't break for me in practice"?

Where does the overhead in the DMA helpers come from? Can we optimize
this code instead of making the device emulation less correct?

Kevin

Re: [Qemu-devel] [PATCH 0/3] allow online resizing of block devices

2011-01-18 Thread Luiz Capitulino

On Fri, 14 Jan 2011 17:20:44 +0100
Christoph Hellwig  wrote:

> This patchset adds support for online resizing of block devices.
> 
> The first patch adds a new resize monitor command which call into
> the existing image resize code.  This is the meat of the series
> and probably needs quite a bit of review and help as I'm not sure
> about how to implement the error handling for monitor commands
> correctly.  Am I really supposed to add a new QERR_ definition
> for each possible error?  And if yes how am I supposed to define
> them?  The macros for them aren't exactly self-explaining.

Well, what happens is this: we screwed up with that interface and we
should replace it soon.

I see you're not adding the new command in QMP (only in the human monitor),
is that intentional? (qmp commands are added to the qmp-commands.hx file).

If it's intentional, then using only error_report() should be ok. If you
plan to have a qmp version, then we'll have to choose between reporting
a generic error version to the client, which is what's going to happen if
you use error_report(), or add the QERR_ macros.

Markus, what do you think? I feel it's pretty urgent for us to replace
that interface.

[Qemu-devel] Re: [PATCH 6/8] ahci: make number of ports runtime determined

2011-01-18 Thread Kevin Wolf

Am 20.12.2010 22:13, schrieb Alexander Graf:
> Different AHCI controllers have a different number of ports, so the core
> shouldn't care about the amount of ports available.
> 
> This patch makes the number of ports available to the AHCI core runtime
> configurable, allowing us to have multiple different AHCI implementations
> with different amounts of ports.
> 
> Signed-off-by: Alexander Graf 
> ---
>  hw/ide/ahci.c |   29 +++--
>  hw/ide/ahci.h |   10 +-
>  hw/ide/ich.c  |3 ++-
>  3 files changed, 26 insertions(+), 16 deletions(-)
> 
> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
> index bd4f8a4..c0bc5ff 100644
> --- a/hw/ide/ahci.c
> +++ b/hw/ide/ahci.c
> @@ -146,7 +146,7 @@ static void ahci_check_irq(AHCIState *s)
>  
>  DPRINTF(-1, "check irq %#x\n", s->control_regs.irqstatus);
>  
> -for (i = 0; i < SATA_PORTS; i++) {
> +for (i = 0; i < s->ports; i++) {
>  AHCIPortRegs *pr = &s->dev[i].port_regs;
>  if (pr->irq_stat & pr->irq_mask) {
>  s->control_regs.irqstatus |= (1 << i);
> @@ -300,7 +300,8 @@ static uint32_t ahci_mem_readl(void *ptr, 
> target_phys_addr_t addr)
>  
>  DPRINTF(-1, "(addr 0x%08X), val 0x%08X\n", (unsigned) addr, val);
>  } else if ((addr >= AHCI_PORT_REGS_START_ADDR) &&
> -   (addr < AHCI_PORT_REGS_END_ADDR)) {
> +   (addr < (AHCI_PORT_REGS_START_ADDR +
> +(s->ports * AHCI_PORT_ADDR_OFFSET_LEN {
>  val = ahci_port_read(s, (addr - AHCI_PORT_REGS_START_ADDR) >> 7,
>   addr & AHCI_PORT_ADDR_OFFSET_MASK);
>  }
> @@ -352,7 +353,8 @@ static void ahci_mem_writel(void *ptr, target_phys_addr_t 
> addr, uint32_t val)
>  DPRINTF(-1, "write to unknown register 0x%x\n", 
> (unsigned)addr);
>  }
>  } else if ((addr >= AHCI_PORT_REGS_START_ADDR) &&
> -   (addr < AHCI_PORT_REGS_END_ADDR)) {
> +   (addr < (AHCI_PORT_REGS_START_ADDR +
> +(s->ports * AHCI_PORT_ADDR_OFFSET_LEN {
>  ahci_port_write(s, (addr - AHCI_PORT_REGS_START_ADDR) >> 7,
>  addr & AHCI_PORT_ADDR_OFFSET_MASK, val);
>  }
> @@ -375,16 +377,16 @@ static void ahci_reg_init(AHCIState *s)
>  {
>  int i;
>  
> -s->control_regs.cap = (SATA_PORTS - 1) |
> +s->control_regs.cap = (s->ports - 1) |
>(AHCI_NUM_COMMAND_SLOTS << 8) |
>(AHCI_SUPPORTED_SPEED_GEN1 << 
> AHCI_SUPPORTED_SPEED) |
>HOST_CAP_NCQ | HOST_CAP_AHCI;
>  
> -s->control_regs.impl = (1 << SATA_PORTS) - 1;
> +s->control_regs.impl = (1 << s->ports) - 1;
>  
>  s->control_regs.version = AHCI_VERSION_1_0;
>  
> -for (i = 0; i < SATA_PORTS; i++) {
> +for (i = 0; i < s->ports; i++) {
>  s->dev[i].port_state = STATE_RUN;
>  }
>  }
> @@ -1177,17 +1179,19 @@ static const IDEDMAOps ahci_dma_ops = {
>  .reset = ahci_dma_reset,
>  };
>  
> -void ahci_init(AHCIState *s, DeviceState *qdev)
> +void ahci_init(AHCIState *s, DeviceState *qdev, int ports)
>  {
>  qemu_irq *irqs;
>  int i;
>  
> +s->ports = ports;
> +s->dev = qemu_mallocz(sizeof(AHCIDevice) * ports);
>  ahci_reg_init(s);
>  s->mem = cpu_register_io_memory(ahci_readfn, ahci_writefn, s,
>  DEVICE_LITTLE_ENDIAN);
> -irqs = qemu_allocate_irqs(ahci_irq_set, s, SATA_PORTS);
> +irqs = qemu_allocate_irqs(ahci_irq_set, s, s->ports);
>  
> -for (i = 0; i < SATA_PORTS; i++) {
> +for (i = 0; i < s->ports; i++) {
>  AHCIDevice *ad = &s->dev[i];
>  
>  ide_bus_new(&ad->port, qdev, i);
> @@ -1201,6 +1205,11 @@ void ahci_init(AHCIState *s, DeviceState *qdev)
>  }
>  }
>  
> +void ahci_uninit(AHCIState *s)
> +{
> +qemu_free(s->dev);
> +}
> +
>  void ahci_pci_map(PCIDevice *pci_dev, int region_num,
>  pcibus_t addr, pcibus_t size, int type)
>  {
> @@ -1218,7 +1227,7 @@ void ahci_reset(void *opaque)
>  d->ahci.control_regs.irqstatus = 0;
>  d->ahci.control_regs.ghc = 0;
>  
> -for (i = 0; i < SATA_PORTS; i++) {
> +for (i = 0; i < d->ahci.ports; i++) {
>  ahci_reset_port(&d->ahci, i);
>  }
>  }
> diff --git a/hw/ide/ahci.h b/hw/ide/ahci.h
> index 0824064..eb87dcd 100644
> --- a/hw/ide/ahci.h
> +++ b/hw/ide/ahci.h
> @@ -165,11 +165,9 @@
>  #define AHCI_GENERIC_HOST_CONTROL_REGS_MAX_ADDR 0x20
>  /* Shouldn't this be 0x2c? */
>  
> -#define SATA_PORTS 4
> -
>  #define AHCI_PORT_REGS_START_ADDR  0x100
> -#define AHCI_PORT_REGS_END_ADDR (AHCI_PORT_REGS_START_ADDR + SATA_PORTS * 
> 0x80)
>  #define AHCI_PORT_ADDR_OFFSET_MASK 0x7f
> +#define AHCI_PORT_ADDR_OFFSET_LEN  0x80
>  
>  #define AHCI_NUM_COMMAND_SLOTS 31
>  #define AHCI_SUPPORTED_SPEED   20
> @@ -269,9 +267,10 @@ struct AHCIDevice {
>  };
>  
>  typede

[Qemu-devel] Re: [PATCH 7/8] ahci: free dynamically allocated iovs

2011-01-18 Thread Kevin Wolf

Am 20.12.2010 22:13, schrieb Alexander Graf:
> We allocate iovs on the fly now, but also need to free them on uninit.
> This patch does that.
> 
> Signed-off-by: Alexander Graf 

Isn't this a fix for patch 4? If so, please merge it there.

Kevin

[Qemu-devel] Re: [PATCH 3/8] ahci: send init d2h fis on fis enable

2011-01-18 Thread Alexander Graf


On 18.01.2011, at 13:25, Kevin Wolf wrote:

> Am 20.12.2010 22:13, schrieb Alexander Graf:
>> The drive sends a d2h init fis on initialization. Usually, the guest doesn't
>> receive fises yet at that point though, so the delivery is deferred.
>> 
>> Let's reflect that by sending the init fis on fis receive enablement.
>> 
>> Signed-off-by: Alexander Graf 
> 
> Hm... If I read the spec right, the real solution wouldn't be an
> init_d2h_sent flag, but implementing a queue for FISes that are received
> when the guest hasn't set FRE yet.
> 
> I'm not against taking a hack like this, but maybe leave a comment
> somewhere at least.

Yes, they'd get queued. In practice it doesn't really matter that much, which 
is why the hack works :). But you're right - a comment would be nice.


Alex

[Qemu-devel] Re: [PATCH 4/8] ahci: use qiov instead of dma helpers

2011-01-18 Thread Alexander Graf

On 18.01.2011, at 13:35, Kevin Wolf wrote:

> Am 20.12.2010 22:13, schrieb Alexander Graf:
>> The DMA helpers incur additional overhead on data transfers. I'm not
>> sure we need the additional complexity provided by them. So let's just
>> use qiovs directly when running in the fast path (ncq).
>> 
>> Signed-off-by: Alexander Graf 
>> ---
>> hw/ide/ahci.c |  100 
>> hw/ide/ahci.h |3 ++
>> 2 files changed, 95 insertions(+), 8 deletions(-)
> 
> I don't feel comfortable with this one, and I think a while ago we
> discussed on IRC why the DMA helpers even exist. If AHCI doesn't need
> them, probably nobody needed them.
> 
> However, I'm inclined to think that AHCI actually _does_ need them in
> corner cases, even though it might not break in all the common cases
> that you have tested. Can you explain why only AHCI doesn't need it or
> is it just "didn't break for me in practice"?
> 

It's the latter.

> Where does the overhead in the DMA helpers come from? Can we optimize
> this code instead of making the device emulation less correct?

Well, dma helpers involve another malloc which is probably the biggest hog. I 
frankly don't see the point in making it correct for the fast path though. I'd 
rather like to have a fast block emulation that works with all OSs than an 
accurate one that emulates something nobody cares about.

Virtio for example doesn't use dma helpers either - they just claim it's not 
defined in the spec. So if virtio-blk gets away with it, it means that all OSs 
should never make use of the additional complexity.

Alex

[Qemu-devel] Re: [PATCH 6/8] ahci: make number of ports runtime determined

2011-01-18 Thread Alexander Graf


On 18.01.2011, at 13:40, Kevin Wolf wrote:

> Am 20.12.2010 22:13, schrieb Alexander Graf:
>> Different AHCI controllers have a different number of ports, so the core
>> shouldn't care about the amount of ports available.
>> 
>> This patch makes the number of ports available to the AHCI core runtime
>> configurable, allowing us to have multiple different AHCI implementations
>> with different amounts of ports.
>> 
>> Signed-off-by: Alexander Graf 
>> ---
>> hw/ide/ahci.c |   29 +++--
>> hw/ide/ahci.h |   10 +-
>> hw/ide/ich.c  |3 ++-
>> 3 files changed, 26 insertions(+), 16 deletions(-)
>> 
>> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
>> index bd4f8a4..c0bc5ff 100644
>> --- a/hw/ide/ahci.c
>> +++ b/hw/ide/ahci.c
>> @@ -146,7 +146,7 @@ static void ahci_check_irq(AHCIState *s)
>> 
>> DPRINTF(-1, "check irq %#x\n", s->control_regs.irqstatus);
>> 
>> -for (i = 0; i < SATA_PORTS; i++) {
>> +for (i = 0; i < s->ports; i++) {
>> AHCIPortRegs *pr = &s->dev[i].port_regs;
>> if (pr->irq_stat & pr->irq_mask) {
>> s->control_regs.irqstatus |= (1 << i);
>> @@ -300,7 +300,8 @@ static uint32_t ahci_mem_readl(void *ptr, 
>> target_phys_addr_t addr)
>> 
>> DPRINTF(-1, "(addr 0x%08X), val 0x%08X\n", (unsigned) addr, val);
>> } else if ((addr >= AHCI_PORT_REGS_START_ADDR) &&
>> -   (addr < AHCI_PORT_REGS_END_ADDR)) {
>> +   (addr < (AHCI_PORT_REGS_START_ADDR +
>> +(s->ports * AHCI_PORT_ADDR_OFFSET_LEN {
>> val = ahci_port_read(s, (addr - AHCI_PORT_REGS_START_ADDR) >> 7,
>>  addr & AHCI_PORT_ADDR_OFFSET_MASK);
>> }
>> @@ -352,7 +353,8 @@ static void ahci_mem_writel(void *ptr, 
>> target_phys_addr_t addr, uint32_t val)
>> DPRINTF(-1, "write to unknown register 0x%x\n", 
>> (unsigned)addr);
>> }
>> } else if ((addr >= AHCI_PORT_REGS_START_ADDR) &&
>> -   (addr < AHCI_PORT_REGS_END_ADDR)) {
>> +   (addr < (AHCI_PORT_REGS_START_ADDR +
>> +(s->ports * AHCI_PORT_ADDR_OFFSET_LEN {
>> ahci_port_write(s, (addr - AHCI_PORT_REGS_START_ADDR) >> 7,
>> addr & AHCI_PORT_ADDR_OFFSET_MASK, val);
>> }
>> @@ -375,16 +377,16 @@ static void ahci_reg_init(AHCIState *s)
>> {
>> int i;
>> 
>> -s->control_regs.cap = (SATA_PORTS - 1) |
>> +s->control_regs.cap = (s->ports - 1) |
>>   (AHCI_NUM_COMMAND_SLOTS << 8) |
>>   (AHCI_SUPPORTED_SPEED_GEN1 << 
>> AHCI_SUPPORTED_SPEED) |
>>   HOST_CAP_NCQ | HOST_CAP_AHCI;
>> 
>> -s->control_regs.impl = (1 << SATA_PORTS) - 1;
>> +s->control_regs.impl = (1 << s->ports) - 1;
>> 
>> s->control_regs.version = AHCI_VERSION_1_0;
>> 
>> -for (i = 0; i < SATA_PORTS; i++) {
>> +for (i = 0; i < s->ports; i++) {
>> s->dev[i].port_state = STATE_RUN;
>> }
>> }
>> @@ -1177,17 +1179,19 @@ static const IDEDMAOps ahci_dma_ops = {
>> .reset = ahci_dma_reset,
>> };
>> 
>> -void ahci_init(AHCIState *s, DeviceState *qdev)
>> +void ahci_init(AHCIState *s, DeviceState *qdev, int ports)
>> {
>> qemu_irq *irqs;
>> int i;
>> 
>> +s->ports = ports;
>> +s->dev = qemu_mallocz(sizeof(AHCIDevice) * ports);
>> ahci_reg_init(s);
>> s->mem = cpu_register_io_memory(ahci_readfn, ahci_writefn, s,
>> DEVICE_LITTLE_ENDIAN);
>> -irqs = qemu_allocate_irqs(ahci_irq_set, s, SATA_PORTS);
>> +irqs = qemu_allocate_irqs(ahci_irq_set, s, s->ports);
>> 
>> -for (i = 0; i < SATA_PORTS; i++) {
>> +for (i = 0; i < s->ports; i++) {
>> AHCIDevice *ad = &s->dev[i];
>> 
>> ide_bus_new(&ad->port, qdev, i);
>> @@ -1201,6 +1205,11 @@ void ahci_init(AHCIState *s, DeviceState *qdev)
>> }
>> }
>> 
>> +void ahci_uninit(AHCIState *s)
>> +{
>> +qemu_free(s->dev);
>> +}
>> +
>> void ahci_pci_map(PCIDevice *pci_dev, int region_num,
>> pcibus_t addr, pcibus_t size, int type)
>> {
>> @@ -1218,7 +1227,7 @@ void ahci_reset(void *opaque)
>> d->ahci.control_regs.irqstatus = 0;
>> d->ahci.control_regs.ghc = 0;
>> 
>> -for (i = 0; i < SATA_PORTS; i++) {
>> +for (i = 0; i < d->ahci.ports; i++) {
>> ahci_reset_port(&d->ahci, i);
>> }
>> }
>> diff --git a/hw/ide/ahci.h b/hw/ide/ahci.h
>> index 0824064..eb87dcd 100644
>> --- a/hw/ide/ahci.h
>> +++ b/hw/ide/ahci.h
>> @@ -165,11 +165,9 @@
>> #define AHCI_GENERIC_HOST_CONTROL_REGS_MAX_ADDR 0x20
>> /* Shouldn't this be 0x2c? */
>> 
>> -#define SATA_PORTS 4
>> -
>> #define AHCI_PORT_REGS_START_ADDR  0x100
>> -#define AHCI_PORT_REGS_END_ADDR (AHCI_PORT_REGS_START_ADDR + SATA_PORTS * 
>> 0x80)
>> #define AHCI_PORT_ADDR_OFFSET_MASK 0x7f
>> +#define AHCI_PORT_ADDR_OFFSET_LEN  0x80
>> 
>> #define AHCI_NUM_COMMAND_SLOTS 31

Re: [Qemu-devel] [PATCH 0/3] allow online resizing of block devices

2011-01-18 Thread Christoph Hellwig

On Tue, Jan 18, 2011 at 10:35:39AM -0200, Luiz Capitulino wrote:
> Well, what happens is this: we screwed up with that interface and we
> should replace it soon.
> 
> I see you're not adding the new command in QMP (only in the human monitor),
> is that intentional? (qmp commands are added to the qmp-commands.hx file).
> 
> If it's intentional, then using only error_report() should be ok. If you
> plan to have a qmp version, then we'll have to choose between reporting
> a generic error version to the client, which is what's going to happen if
> you use error_report(), or add the QERR_ macros.

No, I hoped this would also add the QMP interface.  Why do we need to
declare the commands twice?  Especially as hmp-commands.hx is full of
magic that makes little sense only for human readable commands.

Any help on how to define the QERR_ macros?

[Qemu-devel] Re: [PATCH 8/8] ahci: fix !msi interrupts

2011-01-18 Thread Jan Kiszka

On 2011-01-18 13:05, Alexander Graf wrote:
> 
> On 18.01.2011, at 10:08, Gerd Hoffmann wrote:
> 
>>  Hi,
>>
 Worse might also be that unknown issue that force you to inject an IRQ
 here. We don't know. That's probably worst.
>>>
>>> Well, IIRC the issue was that usually a level high interrupt line would
>>> simply retrigger an interrupt after enabling the interrupt line in the
>>> APIC again.
>>
>> edge triggered interrupts wouldn't though.
> 
> The code change doesn't change anything for edge triggered interrupts - they 
> work fine. Only !msi (== level) ones are broken.
> 
>>
>>> Unless my memory completely fails on me, that didn't happen,
>>> so I added the manual retrigger on a partial command ACK in ahci code.
>>
>> That re-trigger smells like you are not clearing the interrupt line where 
>> you should.  For starters try calling ahci_check_irq() after guest writes to 
>> PORT_IRQ_STAT.
> 
> The problem happened when I had the following:
> 
> ahci irq bits = 0
> 
> ahci irq bits = 1 | 2
> irq line trigger
> guest clears 1
> ahci bits = 2
> 
> 
> On normal hardware, the high state of the irq line would simply trigger 
> another interrupt in the guest when it gets ACKed on the LAPIC. Somehow it 
> doesn't get triggered here. I don't see why clearing the interrupt line would 
> help? It's not what real hardware would do, no?
> 

No, real devices would simply leave the line asserted as long as there
is a reason to.

So again my question: With which irqchips do you see this effect, kvm's
in-kernel model and/or qemu's user space model? If there is an issue
with retriggering a CPU interrupt while the source is still asserted,
that probably needs to be fixed.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] Re: [PATCH 8/8] ahci: fix !msi interrupts

2011-01-18 Thread Gerd Hoffmann


  Hi,


edge triggered interrupts wouldn't though.


The code change doesn't change anything for edge triggered
interrupts

- they work fine. Only !msi (== level) ones are broken.

apic irqs can be both edge and level triggered too ...


That re-trigger smells like you are not clearing the interrupt line
where you should.  For starters try calling ahci_check_irq() after
guest writes to PORT_IRQ_STAT.




The problem happened when I had the following:

ahci irq bits = 0

ahci irq bits = 1 | 2
irq line trigger
guest clears 1
ahci bits = 2






On normal hardware, the high state of the irq line would simply

trigger another interrupt in the guest when it gets ACKed on the LAPIC.
Somehow it doesn't get triggered here. I don't see why clearing the
interrupt line would help? It's not what real hardware would do, no?

I think the guest and the ahci emulation simply disagree on whenever a 
irq is pending or not.  Guest thinks it has cleared the IRQ, but the 
code path it has taken didn't trigger ahci_irq_lower().


It is probably related to how the per-port and global irq status play 
together.  It isn't covered very well in the specs :-(


If an interrupt condition happens a bit in pr->irq_stat will be set. 
When the contition is enabled the port bit in irqstatus will be set, 
which in turn will trigger an irq.  That is the easy part.


Now the guest checks irqstatus to find the port, then checks 
pr->irq_stat to find the condition, acks it by clearing the bits it has 
seen in pr->irq_stat.  What does happen with the port bit in irqstatus 
now?  Will it be cleared automatically?  Must the guest clear it 
explicitly?  What happens in case another irq condition happened which 
the guest didn't ack (yet), will the guest be able to clear the bit?


cheers,
  Gerd

[Qemu-devel] [PATCH] target-arm: Log instruction start in TCG code

2011-01-18 Thread Peter Maydell

Add support for logging the start of instructions in TCG
code debug dumps for ARM targets.

Signed-off-by: Peter Maydell 
---
 target-arm/translate.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 907d73a..c60cd18 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -9199,6 +9199,10 @@ static inline void 
gen_intermediate_code_internal(CPUState *env,
 if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
 gen_io_start();
 
+if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP))) {
+tcg_gen_debug_insn_start(dc->pc);
+}
+
 if (dc->thumb) {
 disas_thumb_insn(env, dc);
 if (dc->condexec_mask) {
-- 
1.7.1

Re: [Qemu-devel] Re: [PATCH 6/8] ahci: make number of ports runtime determined

2011-01-18 Thread Gerd Hoffmann


  Hi,


diff --git a/hw/ide/ich.c b/hw/ide/ich.c
index 70cb766..f242d7a 100644
--- a/hw/ide/ich.c
+++ b/hw/ide/ich.c
@@ -100,7 +100,7 @@ static int pci_ich9_ahci_init(PCIDevice *dev)

  msi_init(dev, 0x50, 1, true, false);

-ahci_init(&d->ahci,&dev->qdev);
+ahci_init(&d->ahci,&dev->qdev, 6);
  d->ahci.irq = d->card.irq[0];


What about a qdev property instead of just hardcoding the value
somewhere else?


That particular piece of emulated hardware (ich9-ahci) actually has 6 
ports.  ich7 has 4 ports.  A thin hardware-specific wrapper which 
hardcodes this stuff (and pci ids and maybe some other minor 
differences) looks fine to me.  Maybe alex should add a ich7 variant 
just to prove the point ;)


cheers,
  Gerd

[Qemu-devel] Re: [PATCH 4/8] ahci: use qiov instead of dma helpers

2011-01-18 Thread Stefan Hajnoczi

On Tue, Jan 18, 2011 at 01:45:40PM +0100, Alexander Graf wrote:
> 
> On 18.01.2011, at 13:35, Kevin Wolf wrote:
> 
> > Am 20.12.2010 22:13, schrieb Alexander Graf:
> >> The DMA helpers incur additional overhead on data transfers. I'm not
> >> sure we need the additional complexity provided by them. So let's just
> >> use qiovs directly when running in the fast path (ncq).
> >> 
> >> Signed-off-by: Alexander Graf 
> >> ---
> >> hw/ide/ahci.c |  100 
> >> 
> >> hw/ide/ahci.h |3 ++
> >> 2 files changed, 95 insertions(+), 8 deletions(-)
> > 
> > I don't feel comfortable with this one, and I think a while ago we
> > discussed on IRC why the DMA helpers even exist. If AHCI doesn't need
> > them, probably nobody needed them.
> > 
> > However, I'm inclined to think that AHCI actually _does_ need them in
> > corner cases, even though it might not break in all the common cases
> > that you have tested. Can you explain why only AHCI doesn't need it or
> > is it just "didn't break for me in practice"?
> > 
> 
> It's the latter.
> 
> > Where does the overhead in the DMA helpers come from? Can we optimize
> > this code instead of making the device emulation less correct?
> 
> Well, dma helpers involve another malloc which is probably the biggest hog. I 
> frankly don't see the point in making it correct for the fast path though. 
> I'd rather like to have a fast block emulation that works with all OSs than 
> an accurate one that emulates something nobody cares about.
> 
> Virtio for example doesn't use dma helpers either - they just claim it's not 
> defined in the spec. So if virtio-blk gets away with it, it means that all 
> OSs should never make use of the additional complexity.

>From what I can tell DMA helpers is common AIO request code plus:

1. It handles map failure using cpu_register_map_client().
2. It handles short maps that are unable to map a full sglist element.

These two requirements are due to QEMU's guest memory mapping API.  IIUC
the limitations on that API are due to a limited amount of bounce buffer
space being used for some targets allowing mapping of non-RAM memory.

Perhaps this means that AHCI does not work on those targets if you
decide to send non-RAM pages to disk?

I'd be interested in understanding how this all works and how the QEMU
RAM API that Anthony and Alex Williamson have been playing with comes
into play.

Stefan

Re: [Qemu-devel] [PATCH 0/3] allow online resizing of block devices

2011-01-18 Thread Luiz Capitulino

On Tue, 18 Jan 2011 13:48:06 +0100
Christoph Hellwig  wrote:

> On Tue, Jan 18, 2011 at 10:35:39AM -0200, Luiz Capitulino wrote:
> > Well, what happens is this: we screwed up with that interface and we
> > should replace it soon.
> > 
> > I see you're not adding the new command in QMP (only in the human monitor),
> > is that intentional? (qmp commands are added to the qmp-commands.hx file).
> > 
> > If it's intentional, then using only error_report() should be ok. If you
> > plan to have a qmp version, then we'll have to choose between reporting
> > a generic error version to the client, which is what's going to happen if
> > you use error_report(), or add the QERR_ macros.
> 
> No, I hoped this would also add the QMP interface.  Why do we need to
> declare the commands twice?

Because they're different interfaces and should not be tied to each other,
they have different name spaces for example and commands in QMP might have
no relation to commands in HMP and vice-versa.

> Especially as hmp-commands.hx is full of
> magic that makes little sense only for human readable commands.

Yes, the reason for that is that we're in the middle of a refactoring,
but I have around 100 patches that makes a clean separation between the
two and creates a cleaner interface (specially for HMP).

> Any help on how to define the QERR_ macros?

I think we should emit a generic error in QMP for now, I don't think it's
a good idea to expend the bad interface we have today.

[Qemu-devel] [PATCH v2 3/3] usb-msd: Propagate removable bit to SCSI device

2011-01-18 Thread Stefan Hajnoczi

USB Mass Storage Devices sometimes have the RMB (removable) bit set in
the SCSI INQUIRY response.  Thumbdrives tend to have the bit set whereas
hard disks do not.

Operating systems differentiate between removable devices and fixed
devices.  Under Linux, the anaconda installer looks for removable
devices.  Under Windows, only fixed devices may have more than one
partition and AutoRun is also affected by the removable bit.

For these reasons, allow USB Mass Storage Devices to override the
removable bit:

qemu -usb
 -drive if=none,file=test.img,cache=none,id=disk0
 -device usb-storage,drive=disk0,removable=on

The default is off.

Signed-off-by: Stefan Hajnoczi 
---
 hw/usb-msd.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/hw/usb-msd.c b/hw/usb-msd.c
index ee897b6..f593c46 100644
--- a/hw/usb-msd.c
+++ b/hw/usb-msd.c
@@ -50,6 +50,7 @@ typedef struct {
 SCSIBus bus;
 BlockConf conf;
 SCSIDevice *scsi_dev;
+uint32_t removable;
 int result;
 /* For async completion.  */
 USBPacket *packet;
@@ -544,7 +545,7 @@ static int usb_msd_initfn(USBDevice *dev)
 
 s->dev.speed = USB_SPEED_FULL;
 scsi_bus_new(&s->bus, &s->dev.qdev, 0, 1, usb_msd_command_complete);
-s->scsi_dev = scsi_bus_legacy_add_drive(&s->bus, bs, 0, false);
+s->scsi_dev = scsi_bus_legacy_add_drive(&s->bus, bs, 0, !!s->removable);
 if (!s->scsi_dev) {
 return -1;
 }
@@ -634,6 +635,7 @@ static struct USBDeviceInfo msd_info = {
 .usbdevice_init = usb_msd_init,
 .qdev.props = (Property[]) {
 DEFINE_BLOCK_PROPERTIES(MSDState, conf),
+DEFINE_PROP_BIT("removable", MSDState, removable, 1, false),
 DEFINE_PROP_END_OF_LIST(),
 },
 };
-- 
1.7.2.3

[Qemu-devel] Re: [PATCH 4/8] ahci: use qiov instead of dma helpers

2011-01-18 Thread Kevin Wolf

Am 18.01.2011 14:14, schrieb Stefan Hajnoczi:
> On Tue, Jan 18, 2011 at 01:45:40PM +0100, Alexander Graf wrote:
>>
>> On 18.01.2011, at 13:35, Kevin Wolf wrote:
>>
>>> Am 20.12.2010 22:13, schrieb Alexander Graf:
 The DMA helpers incur additional overhead on data transfers. I'm not
 sure we need the additional complexity provided by them. So let's just
 use qiovs directly when running in the fast path (ncq).

 Signed-off-by: Alexander Graf 
 ---
 hw/ide/ahci.c |  100 
 
 hw/ide/ahci.h |3 ++
 2 files changed, 95 insertions(+), 8 deletions(-)
>>>
>>> I don't feel comfortable with this one, and I think a while ago we
>>> discussed on IRC why the DMA helpers even exist. If AHCI doesn't need
>>> them, probably nobody needed them.
>>>
>>> However, I'm inclined to think that AHCI actually _does_ need them in
>>> corner cases, even though it might not break in all the common cases
>>> that you have tested. Can you explain why only AHCI doesn't need it or
>>> is it just "didn't break for me in practice"?
>>>
>>
>> It's the latter.
>>
>>> Where does the overhead in the DMA helpers come from? Can we optimize
>>> this code instead of making the device emulation less correct?
>>
>> Well, dma helpers involve another malloc which is probably the biggest hog.

If you can get around this malloc in AHCI (IIUC, you do it by reusing
the old sglist buffer?), we can probably do something similar in the
generic DMA helper code and have IDE etc. benefit from it as well.

> I frankly don't see the point in making it correct for the fast path though. 
> I'd rather like to have a fast block emulation that works with all OSs than 
> an accurate one that emulates something nobody cares about.

Sorry, but "Windows and Linux" != "all OSes".

I'm sure there is a way that gives us correctness _and_ performance.

>> Virtio for example doesn't use dma helpers either - they just claim it's not 
>> defined in the spec. So if virtio-blk gets away with it, it means that all 
>> OSs should never make use of the additional complexity.
> 
> From what I can tell DMA helpers is common AIO request code plus:
> 
> 1. It handles map failure using cpu_register_map_client().
> 2. It handles short maps that are unable to map a full sglist element.
> 
> These two requirements are due to QEMU's guest memory mapping API.  IIUC
> the limitations on that API are due to a limited amount of bounce buffer
> space being used for some targets allowing mapping of non-RAM memory.
> 
> Perhaps this means that AHCI does not work on those targets if you
> decide to send non-RAM pages to disk?
> 
> I'd be interested in understanding how this all works and how the QEMU
> RAM API that Anthony and Alex Williamson have been playing with comes
> into play.

Yeah, I don't know the details either. I hope that Anthony will comment
on it.

Kevin

[Qemu-devel] Re: [PATCH 6/8] ahci: make number of ports runtime determined

2011-01-18 Thread Kevin Wolf

Am 18.01.2011 13:46, schrieb Alexander Graf:
> 
> On 18.01.2011, at 13:40, Kevin Wolf wrote:
> 
>> Am 20.12.2010 22:13, schrieb Alexander Graf:
>>> Different AHCI controllers have a different number of ports, so the core
>>> shouldn't care about the amount of ports available.
>>>
>>> This patch makes the number of ports available to the AHCI core runtime
>>> configurable, allowing us to have multiple different AHCI implementations
>>> with different amounts of ports.
>>>
>>> Signed-off-by: Alexander Graf 
>>> ---
>>> hw/ide/ahci.c |   29 +++--
>>> hw/ide/ahci.h |   10 +-
>>> hw/ide/ich.c  |3 ++-
>>> 3 files changed, 26 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
>>> index bd4f8a4..c0bc5ff 100644
>>> --- a/hw/ide/ahci.c
>>> +++ b/hw/ide/ahci.c
>>> @@ -146,7 +146,7 @@ static void ahci_check_irq(AHCIState *s)
>>>
>>> DPRINTF(-1, "check irq %#x\n", s->control_regs.irqstatus);
>>>
>>> -for (i = 0; i < SATA_PORTS; i++) {
>>> +for (i = 0; i < s->ports; i++) {
>>> AHCIPortRegs *pr = &s->dev[i].port_regs;
>>> if (pr->irq_stat & pr->irq_mask) {
>>> s->control_regs.irqstatus |= (1 << i);
>>> @@ -300,7 +300,8 @@ static uint32_t ahci_mem_readl(void *ptr, 
>>> target_phys_addr_t addr)
>>>
>>> DPRINTF(-1, "(addr 0x%08X), val 0x%08X\n", (unsigned) addr, val);
>>> } else if ((addr >= AHCI_PORT_REGS_START_ADDR) &&
>>> -   (addr < AHCI_PORT_REGS_END_ADDR)) {
>>> +   (addr < (AHCI_PORT_REGS_START_ADDR +
>>> +(s->ports * AHCI_PORT_ADDR_OFFSET_LEN {
>>> val = ahci_port_read(s, (addr - AHCI_PORT_REGS_START_ADDR) >> 7,
>>>  addr & AHCI_PORT_ADDR_OFFSET_MASK);
>>> }
>>> @@ -352,7 +353,8 @@ static void ahci_mem_writel(void *ptr, 
>>> target_phys_addr_t addr, uint32_t val)
>>> DPRINTF(-1, "write to unknown register 0x%x\n", 
>>> (unsigned)addr);
>>> }
>>> } else if ((addr >= AHCI_PORT_REGS_START_ADDR) &&
>>> -   (addr < AHCI_PORT_REGS_END_ADDR)) {
>>> +   (addr < (AHCI_PORT_REGS_START_ADDR +
>>> +(s->ports * AHCI_PORT_ADDR_OFFSET_LEN {
>>> ahci_port_write(s, (addr - AHCI_PORT_REGS_START_ADDR) >> 7,
>>> addr & AHCI_PORT_ADDR_OFFSET_MASK, val);
>>> }
>>> @@ -375,16 +377,16 @@ static void ahci_reg_init(AHCIState *s)
>>> {
>>> int i;
>>>
>>> -s->control_regs.cap = (SATA_PORTS - 1) |
>>> +s->control_regs.cap = (s->ports - 1) |
>>>   (AHCI_NUM_COMMAND_SLOTS << 8) |
>>>   (AHCI_SUPPORTED_SPEED_GEN1 << 
>>> AHCI_SUPPORTED_SPEED) |
>>>   HOST_CAP_NCQ | HOST_CAP_AHCI;
>>>
>>> -s->control_regs.impl = (1 << SATA_PORTS) - 1;
>>> +s->control_regs.impl = (1 << s->ports) - 1;
>>>
>>> s->control_regs.version = AHCI_VERSION_1_0;
>>>
>>> -for (i = 0; i < SATA_PORTS; i++) {
>>> +for (i = 0; i < s->ports; i++) {
>>> s->dev[i].port_state = STATE_RUN;
>>> }
>>> }
>>> @@ -1177,17 +1179,19 @@ static const IDEDMAOps ahci_dma_ops = {
>>> .reset = ahci_dma_reset,
>>> };
>>>
>>> -void ahci_init(AHCIState *s, DeviceState *qdev)
>>> +void ahci_init(AHCIState *s, DeviceState *qdev, int ports)
>>> {
>>> qemu_irq *irqs;
>>> int i;
>>>
>>> +s->ports = ports;
>>> +s->dev = qemu_mallocz(sizeof(AHCIDevice) * ports);
>>> ahci_reg_init(s);
>>> s->mem = cpu_register_io_memory(ahci_readfn, ahci_writefn, s,
>>> DEVICE_LITTLE_ENDIAN);
>>> -irqs = qemu_allocate_irqs(ahci_irq_set, s, SATA_PORTS);
>>> +irqs = qemu_allocate_irqs(ahci_irq_set, s, s->ports);
>>>
>>> -for (i = 0; i < SATA_PORTS; i++) {
>>> +for (i = 0; i < s->ports; i++) {
>>> AHCIDevice *ad = &s->dev[i];
>>>
>>> ide_bus_new(&ad->port, qdev, i);
>>> @@ -1201,6 +1205,11 @@ void ahci_init(AHCIState *s, DeviceState *qdev)
>>> }
>>> }
>>>
>>> +void ahci_uninit(AHCIState *s)
>>> +{
>>> +qemu_free(s->dev);
>>> +}
>>> +
>>> void ahci_pci_map(PCIDevice *pci_dev, int region_num,
>>> pcibus_t addr, pcibus_t size, int type)
>>> {
>>> @@ -1218,7 +1227,7 @@ void ahci_reset(void *opaque)
>>> d->ahci.control_regs.irqstatus = 0;
>>> d->ahci.control_regs.ghc = 0;
>>>
>>> -for (i = 0; i < SATA_PORTS; i++) {
>>> +for (i = 0; i < d->ahci.ports; i++) {
>>> ahci_reset_port(&d->ahci, i);
>>> }
>>> }
>>> diff --git a/hw/ide/ahci.h b/hw/ide/ahci.h
>>> index 0824064..eb87dcd 100644
>>> --- a/hw/ide/ahci.h
>>> +++ b/hw/ide/ahci.h
>>> @@ -165,11 +165,9 @@
>>> #define AHCI_GENERIC_HOST_CONTROL_REGS_MAX_ADDR 0x20
>>> /* Shouldn't this be 0x2c? */
>>>
>>> -#define SATA_PORTS 4
>>> -
>>> #define AHCI_PORT_REGS_START_ADDR  0x100
>>> -#define AHCI_PORT_REGS_END_ADDR (AHCI_PORT_REGS_START_ADDR + SATA_PORTS * 
>>>

Re: [Qemu-devel] paravirtual tablet v3

2011-01-18 Thread Gerd Hoffmann


  Hi,


Sure, someone needs to map multitouch to non-multitouch. I'd leave
that job to the guest driver tough. Why do you think doing it in the
host is better?


My assumptions are 1) the host is capable of doing the mapping just as
easily as the guest


Probably.


2) the host can do something useful with the
information that the guest is not multitouch capable.


What do you think of here?

cheers,
  Gerd

Re: [Qemu-devel] [RFC][PATCH v6 00/23] virtagent: host/guest RPC communication agent

2011-01-18 Thread Gerd Hoffmann


On 01/17/11 15:53, Michael Roth wrote:

On 01/17/2011 07:53 AM, Gerd Hoffmann wrote:

What is your plan to handle system-level queries+actions (such as
reboot) vs. per-user stuff (such as cut+paste)?


This is an area that hasn't been well-defined yet and is definitely open
for suggestions.


One option would be to have two virtio-serial channels, one for the 
system and one for the user stuff.  gdm could grant the desktop user 
access to the user channel like it does with sound devices and simliar 
stuff, so the user agent has access to it.


Another option is to have some socket where the user agent can talk to 
the system agent and have it relay the requests.


Maybe it is also possible to use dbus for communication between the 
system agent and user agent (and maybe other components).  Maybe it even 
makes sense to run the dbus protocol over the virtio-serial line? 
Disclaimer: I know next to nothing about dbus details.



For host->guest RPCs the current plan is to always have the RPC executed
at the system level, but for situations involving a specific user we
fork and drop privileges with the RPC, and report back the status of the
fork/exec. The fork'd process would then do what it needs to do, then if
needed, communicate status back to the system-level daemon via some IPC
mechanism (most likely a socket we listen to in addition to the serial
channel) that can be used to send an event. The system-level daemon then
communicates these events back to the host with a guest->host RPC.


Hmm.  A bit heavy to fork+exec on every rpc.  might be ok for rare 
events though.



Processes created independently of the system-level daemon could report
events in the same manner, via this socket. I think this might suit the
vdagent client model for Spice as well,


Yes, vdagent works this way, except that *all* communication goes 
through that socket, i.e. events/requests coming from the host for the 
user-level agent are routed through that socket too.


It is the only sane way to handle clipboard support IMHO as there is 
quite some message ping-pong involved to get a clipboard transaction done.


How does xmlrpm transmit binary blobs btw?

cheers,
  Gerd

Re: [Qemu-devel] [RFC][PATCH v6 00/23] virtagent: host/guest RPC communication agent

2011-01-18 Thread Anthony Liguori


On 01/18/2011 08:02 AM, Gerd Hoffmann wrote:

On 01/17/11 15:53, Michael Roth wrote:

On 01/17/2011 07:53 AM, Gerd Hoffmann wrote:

What is your plan to handle system-level queries+actions (such as
reboot) vs. per-user stuff (such as cut+paste)?


This is an area that hasn't been well-defined yet and is definitely open
for suggestions.


One option would be to have two virtio-serial channels, one for the 
system and one for the user stuff.  gdm could grant the desktop user 
access to the user channel like it does with sound devices and simliar 
stuff, so the user agent has access to it.


Another option is to have some socket where the user agent can talk to 
the system agent and have it relay the requests.


I think this is the best approach.  One requirement we've been working 
with is that all actions from guest agents are logged.  This is to give 
an administrator confidence that the hypervisor isn't doing anything 
stupid.  If you route all of the user traffic through a privileged 
daemon, you can log everything to syslog or an appropriate log file.


Maybe it is also possible to use dbus for communication between the 
system agent and user agent (and maybe other components).  Maybe it 
even makes sense to run the dbus protocol over the virtio-serial line? 
Disclaimer: I know next to nothing about dbus details.


The way I'd prefer to think about it is that the transport and protocol 
used are separate layers that may have multiple implementations over time.


For instance, we currently support virtio-serial and isa-serial.  
Supporting another protocol wouldn't be a big deal.  The part that's 
needs to remain consistent is the API supported by the 
transport/protocol combinations.



For host->guest RPCs the current plan is to always have the RPC executed
at the system level, but for situations involving a specific user we
fork and drop privileges with the RPC, and report back the status of the
fork/exec. The fork'd process would then do what it needs to do, then if
needed, communicate status back to the system-level daemon via some IPC
mechanism (most likely a socket we listen to in addition to the serial
channel) that can be used to send an event. The system-level daemon then
communicates these events back to the host with a guest->host RPC.


Hmm.  A bit heavy to fork+exec on every rpc.  might be ok for rare 
events though.



Processes created independently of the system-level daemon could report
events in the same manner, via this socket. I think this might suit the
vdagent client model for Spice as well,


Yes, vdagent works this way, except that *all* communication goes 
through that socket, i.e. events/requests coming from the host for the 
user-level agent are routed through that socket too.


It is the only sane way to handle clipboard support IMHO as there is 
quite some message ping-pong involved to get a clipboard transaction 
done.


How does xmlrpm transmit binary blobs btw?


XML-RPC has a base64 encoding that's part of the standard for encoding 
binary data.  It also supports UTF-8 encoded strings.


Regards,

Anthony Liguori


cheers,
  Gerd

[Qemu-devel] Re: [PATCH v2 0/5] iohandlers: Add support for enabling/disabling individual handlers

2011-01-18 Thread Anthony Liguori


On 01/18/2011 05:56 AM, Amit Shah wrote:

On (Mon) Jan 17 2011 [08:57:16], Anthony Liguori wrote:
   

Also -- this patchset was prompted by a bug in qemu chardevs that
freezes guests if they write faster than the chardevs can consume.
What should the strategy on fixing that bug be?
   

Fix the loop API such that we're not just fixing one bug but that we
can address a bunch of other bugs that are out there.
 

But what's the right fix?

* Pull in glib or some other library
* Mimic API from some other library
   


1) get rid of poll callbacks

These kill us right now because it makes it impossible to use poll/epoll 
or to use some other API.


2) switch to a single callback and an event mask

We can't do this without (1) because the can_read callback takes a 
different signature.  But read/write callbacks can be a single callback 
(but it needs to be changed to accept an event mask).


3) allow the event mask to be changed via a new API

This gives the functionality you're looking for while making a huge 
improvement to the internal API.


A quick python/perl script can probably do 90% of the work here.

Regards,

Anthony Liguori


Amit

Re: [Qemu-devel] [PATCH] target-arm: Log instruction start in TCG code

2011-01-18 Thread Edgar E. Iglesias

On Tue, Jan 18, 2011 at 01:08:40PM +, Peter Maydell wrote:
> Add support for logging the start of instructions in TCG
> code debug dumps for ARM targets.
> 
> Signed-off-by: Peter Maydell 

Applied, thanks.


> ---
>  target-arm/translate.c |4 
>  1 files changed, 4 insertions(+), 0 deletions(-)
> 
> diff --git a/target-arm/translate.c b/target-arm/translate.c
> index 907d73a..c60cd18 100644
> --- a/target-arm/translate.c
> +++ b/target-arm/translate.c
> @@ -9199,6 +9199,10 @@ static inline void 
> gen_intermediate_code_internal(CPUState *env,
>  if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
>  gen_io_start();
>  
> +if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP))) {
> +tcg_gen_debug_insn_start(dc->pc);
> +}
> +
>  if (dc->thumb) {
>  disas_thumb_insn(env, dc);
>  if (dc->condexec_mask) {
> -- 
> 1.7.1
> 
>

Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state

2011-01-18 Thread Jan Kiszka

On 2011-01-12 11:31, Jan Kiszka wrote:
> Am 12.01.2011 11:22, Avi Kivity wrote:
>> On 01/11/2011 03:54 PM, Anthony Liguori wrote:
>>>
>>> Right, we should introduce a KVMBus that KVM devices are created on. 
>>> The devices can get at KVMState through the BusState.
>>
>> There is no kvm bus in a PC (I looked).  We're bending the device model
>> here because a device is implemented in the kernel and not in
>> userspace.  An implementation detail is magnified beyond all proportions.
>>
>> An ioapic that is implemented by kvm lives in exactly the same place
>> that the qemu ioapic lives in.  An assigned pci device lives on the PCI
>> bus, not a KVMBus.  If we need a pointer to KVMState, then we must find
>> it elsewhere, not through creating imaginary buses that don't exist.
>>
> 
> Exactly.
> 
> So we can either "infect" the whole device tree with kvm (or maybe a
> more generic accelerator structure that also deals with Xen) or we need
> to pull the reference inside the device's init function from some global
> service (kvm_get_state).

Note that this topic is still waiting for good suggestions, specifically
from those who believe in kvm_state references :). This is not only
blocking kvmstate merge but will affect KVM irqchips as well.

It boils down to how we reasonably pass a kvm_state reference from
machine init code to a sysbus device. I'm probably biased, but I don't
see any way that does not work against the idea of confining access to
kvm_state or breaks device instantiation from the command line or a
config file.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] Re: KVM call agenda for Jan 18

2011-01-18 Thread Chris Wright

* Chris Wright (chr...@redhat.com) wrote:
> Please send in any agenda items you are interested in covering.

No agenda, this week's call is cancelled.

thanks,
-chris

[Qemu-devel] [PATCH] target-arm: Fix garbage collection of temporaries in Neon emulation.

2011-01-18 Thread Christophe Lyon

Fix garbage collection of temporaries in Neon emulation.


Signed-off-by: Christophe Lyon 
---
 target-arm/translate.c |   22 +-
 1 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 57664bc..363351e 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -4176,6 +4176,18 @@ static inline void gen_neon_mull(TCGv_i64 dest,
TCGv a, TCGv b, int size, int u)
 break;
 default: abort();
 }
+
+/* gen_helper_neon_mull_[su]{8|16} do not free their parameters.
+   Don't forget to clean them now.  */
+switch ((size << 1) | u) {
+case 0:
+case 1:
+case 2:
+case 3:
+  dead_tmp(a);
+  dead_tmp(b);
+  break;
+}
 }

 /* Translate a NEON data processing instruction.  Return nonzero if the
@@ -4840,7 +4852,7 @@ static int disas_neon_data_insn(CPUState * env,
DisasContext *s, uint32_t insn)
 if (size == 3) {
 tcg_temp_free_i64(tmp64);
 } else {
-dead_tmp(tmp2);
+tcg_temp_free_i32(tmp2);
 }
 } else if (op == 10) {
 /* VSHLL */
@@ -5076,8 +5088,6 @@ static int disas_neon_data_insn(CPUState * env,
DisasContext *s, uint32_t insn)
 case 8: case 9: case 10: case 11: case 12: case 13:
 /* VMLAL, VQDMLAL, VMLSL, VQDMLSL, VMULL,
VQDMULL */
 gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
-dead_tmp(tmp2);
-dead_tmp(tmp);
 break;
 case 14: /* Polynomial VMULL */
 cpu_abort(env, "Polynomial VMULL not implemented");
@@ -5235,9 +5245,12 @@ static int disas_neon_data_insn(CPUState * env,
DisasContext *s, uint32_t insn)
 tmp = neon_load_reg(rn, 0);
 } else {
 tmp = tmp3;
+   /* tmp2 has been discarded in
+  gen_neon_mull during pass 0, we need to
+  recreate it.  */
+   tmp2 = neon_get_scalar(size, rm);
 }
 gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
-dead_tmp(tmp);
 if (op == 6 || op == 7) {
 gen_neon_negl(cpu_V0, size);
 }
@@ -5264,7 +5277,6 @@ static int disas_neon_data_insn(CPUState * env,
DisasContext *s, uint32_t insn)
 neon_store_reg64(cpu_V0, rd + pass);
 }

-dead_tmp(tmp2);

 break;
 default: /* 14 and 15 are RESERVED */
-- 
1.7.2.3

Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state

2011-01-18 Thread Anthony Liguori


On 01/18/2011 08:28 AM, Jan Kiszka wrote:

On 2011-01-12 11:31, Jan Kiszka wrote:
   

Am 12.01.2011 11:22, Avi Kivity wrote:
 

On 01/11/2011 03:54 PM, Anthony Liguori wrote:
   

Right, we should introduce a KVMBus that KVM devices are created on.
The devices can get at KVMState through the BusState.
 

There is no kvm bus in a PC (I looked).  We're bending the device model
here because a device is implemented in the kernel and not in
userspace.  An implementation detail is magnified beyond all proportions.

An ioapic that is implemented by kvm lives in exactly the same place
that the qemu ioapic lives in.  An assigned pci device lives on the PCI
bus, not a KVMBus.  If we need a pointer to KVMState, then we must find
it elsewhere, not through creating imaginary buses that don't exist.

   

Exactly.

So we can either "infect" the whole device tree with kvm (or maybe a
more generic accelerator structure that also deals with Xen) or we need
to pull the reference inside the device's init function from some global
service (kvm_get_state).
 

Note that this topic is still waiting for good suggestions, specifically
from those who believe in kvm_state references :). This is not only
blocking kvmstate merge but will affect KVM irqchips as well.

It boils down to how we reasonably pass a kvm_state reference from
machine init code to a sysbus device. I'm probably biased, but I don't
see any way that does not work against the idea of confining access to
kvm_state or breaks device instantiation from the command line or a
config file.
   


A KVM device should sit on a KVM specific bus that hangs off of sysbus.  
It can get to kvm_state through that bus.


That bus doesn't get instantiated through qdev so requiring a pointer 
argument should not be an issue.


Regards,

Anthony Liguori


Jan

Re: [Qemu-devel] paravirtual tablet v3

2011-01-18 Thread Anthony Liguori


On 01/18/2011 07:46 AM, Gerd Hoffmann wrote:

  Hi,


Sure, someone needs to map multitouch to non-multitouch. I'd leave
that job to the guest driver tough. Why do you think doing it in the
host is better?


My assumptions are 1) the host is capable of doing the mapping just as
easily as the guest


Probably.


2) the host can do something useful with the
information that the guest is not multitouch capable.


What do you think of here?


Imagine a nice looking GUI, you would likely have icons representing the 
mouse status.   You've got at least PS/2 mouse, PS/2 mouse with capture, 
PV mouse in absolute mode, PV mouse in multitouch mode, etc.


VirtualBox has an interface like this FWIW (although not multitouch aware).

Regards,

Anthony Liguori



cheers,
  Gerd

[Qemu-devel] [sparc] Floating point exception issue

2011-01-18 Thread Mateusz Loskot


Hi,

Recently, I have reported mysterious issues on NetBSD 5.1
emulated on SPARC. The whole first thread is here:

http://lists.gnu.org/archive/html/qemu-devel/2011-01/msg01509.html

I decided to investigate the problem deeper and with great help
from NetBSD folks, I managed to find reproducible test case.
Initially, it was AWK command:

# echo NaN | awk '{print "test"}'
awk: floating point exception 8
  source line number 1

and next it boiled down to simple C program (see below).
Details of the investigation are archived in the NetBSD Problem
Report #44389 here:

http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44389


Here is final version of the test program which reproduces the problem:

#include 
#include 
#include 
#include 

int is_number(const char *s)
{
double r;
char *ep;
errno = 0;
r = strtod(s, &ep);
if (r == HUGE_VAL)
printf("X:%g\n", r);

if (ep == s || r == HUGE_VAL || errno == ERANGE)
return 0;
while (*ep == ' ' || *ep == '\t' || *ep == '\n')
ep++;
if (*ep == '\0')
return 1;
else
return 0;
}

int main(int argc, char **argv)
{
double v;

if (is_number("NaN")) {
printf("is a number\n");
v = atof("NaN");
} else {
printf("not a number\n");
v = 0.0;
}
printf("%.4f\n", v);

return 0;
}


On NetBSD/SPARC, the program receives SIGFPE:

$ gcc ./nan_test_2.c
 $ ./a.out
 [1]   Floating point exception (core dumped) ./a.out

Specifically, it's caused by r == HUGE_VAL condition in
  if (ep == s || r == HUGE_VAL || errno == ERANGE)
where r is NaN.

All the signs indicate there is a bug in QEMU.

Best regards,
--
Mateusz Loskot, http://mateusz.loskot.net
Charter Member of OSGeo, http://osgeo.org
Member of ACCU, http://accu.org

Re: [Qemu-devel] [PATCH 2/2] strtosz(): Use suffix macros in switch() statement

2011-01-18 Thread Alex Williamson

On Tue, 2011-01-18 at 10:20 +0100, Markus Armbruster wrote:
> jes.soren...@redhat.com writes:
> 
> > From: Jes Sorensen 
> >
> > Signed-off-by: Jes Sorensen 
> > ---
> >  cutils.c |   10 +-
> >  1 files changed, 5 insertions(+), 5 deletions(-)
> >
> > diff --git a/cutils.c b/cutils.c
> > index 328738c..f2c8bbd 100644
> > --- a/cutils.c
> > +++ b/cutils.c
> > @@ -324,26 +324,26 @@ ssize_t strtosz_suffix(const char *nptr, char **end, 
> > const char default_suffix)
> >  }
> >  }
> >  switch (toupper(d)) {
> > -case 'B':
> > +case STRTOSZ_DEFSUFFIX_B:
> >  mul = 1;
> >  if (mul_required) {
> >  goto fail;
> >  }
> >  break;
> > -case 'K':
> > +case STRTOSZ_DEFSUFFIX_KB:
> >  mul = 1 << 10;
> >  break;
> >  case 0:
> >  if (mul_required) {
> >  goto fail;
> >  }
> > -case 'M':
> > +case STRTOSZ_DEFSUFFIX_MB:
> >  mul = 1ULL << 20;
> >  break;
> > -case 'G':
> > +case STRTOSZ_DEFSUFFIX_GB:
> >  mul = 1ULL << 30;
> >  break;
> > -case 'T':
> > +case STRTOSZ_DEFSUFFIX_TB:
> >  mul = 1ULL << 40;
> >  break;
> >  default:
> 
> And this improves what?  Certainly not clarity.
> 
> In my opinion, the STRTOSZ_DEFSUFFIX_TB are useless chaff.  Chacun à son
> goût.

 I prefer it.  Better than having 'B'/'K'/'M'/'G'/'T' sprinkled
throughout the code.  Easier to search for, easier to update, more
consistent.

Alex

Re: [Qemu-devel] [PATCH] target-arm: Fix garbage collection of temporaries in Neon emulation.

2011-01-18 Thread Peter Maydell

On 18 January 2011 14:34, Christophe Lyon  wrote:
> +
> +    /* gen_helper_neon_mull_[su]{8|16} do not free their parameters.
> +       Don't forget to clean them now.  */
> +    switch ((size << 1) | u) {
> +    case 0:
> +    case 1:
> +    case 2:
> +    case 3:
> +      dead_tmp(a);
> +      dead_tmp(b);
> +      break;
> +    }
>  }

This seems a rather convoluted way to write "if (size < 2) { ... }"

> @@ -5235,9 +5245,12 @@ static int disas_neon_data_insn(CPUState * env,
> DisasContext *s, uint32_t insn)
>                             tmp = neon_load_reg(rn, 0);
>                         } else {
>                             tmp = tmp3;
> +                           /* tmp2 has been discarded in
> +                              gen_neon_mull during pass 0, we need to
> +                              recreate it.  */
> +                           tmp2 = neon_get_scalar(size, rm);
>                         }

I think this will give the wrong results for instructions where the
scalar operand is in the same Neon register as the destination
for the first pass, because calling neon_get_scalar() again will
do a reload from the Neon register and it might have changed.
(Also loading it once at the start rather than in every pass is
more efficient as well as being correct :-))

Also your patch has hard-coded tabs in it (please see
CODING_STYLE on the subject of whitespace) and your
mail client or server has line-wrapped long lines in the patch
so it doesn't apply cleanly...

-- PMM

Re: [Qemu-devel] [PATCH] target-arm: Fix garbage collection of temporaries in Neon emulation.

2011-01-18 Thread Peter Maydell

Incidentally there are some correctness fixes for the multiply-by-scalar
neon insns from the qemu-meego tree which are on my list to push
upstream. So you probably aren't getting the right results even if
you've managed to shut up qemu's warnings :-)

-- PMM

Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state

2011-01-18 Thread Jan Kiszka

On 2011-01-18 16:04, Anthony Liguori wrote:
> On 01/18/2011 08:28 AM, Jan Kiszka wrote:
>> On 2011-01-12 11:31, Jan Kiszka wrote:
>>
>>> Am 12.01.2011 11:22, Avi Kivity wrote:
>>>  
 On 01/11/2011 03:54 PM, Anthony Liguori wrote:

> Right, we should introduce a KVMBus that KVM devices are created on.
> The devices can get at KVMState through the BusState.
>  
 There is no kvm bus in a PC (I looked).  We're bending the device model
 here because a device is implemented in the kernel and not in
 userspace.  An implementation detail is magnified beyond all proportions.

 An ioapic that is implemented by kvm lives in exactly the same place
 that the qemu ioapic lives in.  An assigned pci device lives on the PCI
 bus, not a KVMBus.  If we need a pointer to KVMState, then we must find
 it elsewhere, not through creating imaginary buses that don't exist.


>>> Exactly.
>>>
>>> So we can either "infect" the whole device tree with kvm (or maybe a
>>> more generic accelerator structure that also deals with Xen) or we need
>>> to pull the reference inside the device's init function from some global
>>> service (kvm_get_state).
>>>  
>> Note that this topic is still waiting for good suggestions, specifically
>> from those who believe in kvm_state references :). This is not only
>> blocking kvmstate merge but will affect KVM irqchips as well.
>>
>> It boils down to how we reasonably pass a kvm_state reference from
>> machine init code to a sysbus device. I'm probably biased, but I don't
>> see any way that does not work against the idea of confining access to
>> kvm_state or breaks device instantiation from the command line or a
>> config file.
>>
> 
> A KVM device should sit on a KVM specific bus that hangs off of sysbus.  
> It can get to kvm_state through that bus.
> 
> That bus doesn't get instantiated through qdev so requiring a pointer 
> argument should not be an issue.
> 

This design is in conflict with the requirement to attach KVM-assisted
devices also to their home bus, e.g. an assigned PCI device to the PCI
bus. We don't support multi-homed qdev devices.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state

2011-01-18 Thread Anthony Liguori


On 01/18/2011 09:43 AM, Jan Kiszka wrote:

On 2011-01-18 16:04, Anthony Liguori wrote:
   

On 01/18/2011 08:28 AM, Jan Kiszka wrote:
 

On 2011-01-12 11:31, Jan Kiszka wrote:

   

Am 12.01.2011 11:22, Avi Kivity wrote:

 

On 01/11/2011 03:54 PM, Anthony Liguori wrote:

   

Right, we should introduce a KVMBus that KVM devices are created on.
The devices can get at KVMState through the BusState.

 

There is no kvm bus in a PC (I looked).  We're bending the device model
here because a device is implemented in the kernel and not in
userspace.  An implementation detail is magnified beyond all proportions.

An ioapic that is implemented by kvm lives in exactly the same place
that the qemu ioapic lives in.  An assigned pci device lives on the PCI
bus, not a KVMBus.  If we need a pointer to KVMState, then we must find
it elsewhere, not through creating imaginary buses that don't exist.


   

Exactly.

So we can either "infect" the whole device tree with kvm (or maybe a
more generic accelerator structure that also deals with Xen) or we need
to pull the reference inside the device's init function from some global
service (kvm_get_state).

 

Note that this topic is still waiting for good suggestions, specifically
from those who believe in kvm_state references :). This is not only
blocking kvmstate merge but will affect KVM irqchips as well.

It boils down to how we reasonably pass a kvm_state reference from
machine init code to a sysbus device. I'm probably biased, but I don't
see any way that does not work against the idea of confining access to
kvm_state or breaks device instantiation from the command line or a
config file.

   

A KVM device should sit on a KVM specific bus that hangs off of sysbus.
It can get to kvm_state through that bus.

That bus doesn't get instantiated through qdev so requiring a pointer
argument should not be an issue.

 

This design is in conflict with the requirement to attach KVM-assisted
devices also to their home bus, e.g. an assigned PCI device to the PCI
bus. We don't support multi-homed qdev devices.
   


With vfio, would an assigned PCI device even need kvm_state?

Regards,

Anthony Liguori


Jan

Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state

2011-01-18 Thread Anthony Liguori


On 01/18/2011 09:43 AM, Jan Kiszka wrote:

On 2011-01-18 16:04, Anthony Liguori wrote:
   

On 01/18/2011 08:28 AM, Jan Kiszka wrote:
 

On 2011-01-12 11:31, Jan Kiszka wrote:

   

Am 12.01.2011 11:22, Avi Kivity wrote:

 

On 01/11/2011 03:54 PM, Anthony Liguori wrote:

   

Right, we should introduce a KVMBus that KVM devices are created on.
The devices can get at KVMState through the BusState.

 

There is no kvm bus in a PC (I looked).  We're bending the device model
here because a device is implemented in the kernel and not in
userspace.  An implementation detail is magnified beyond all proportions.

An ioapic that is implemented by kvm lives in exactly the same place
that the qemu ioapic lives in.  An assigned pci device lives on the PCI
bus, not a KVMBus.  If we need a pointer to KVMState, then we must find
it elsewhere, not through creating imaginary buses that don't exist.


   

Exactly.

So we can either "infect" the whole device tree with kvm (or maybe a
more generic accelerator structure that also deals with Xen) or we need
to pull the reference inside the device's init function from some global
service (kvm_get_state).

 

Note that this topic is still waiting for good suggestions, specifically
from those who believe in kvm_state references :). This is not only
blocking kvmstate merge but will affect KVM irqchips as well.

It boils down to how we reasonably pass a kvm_state reference from
machine init code to a sysbus device. I'm probably biased, but I don't
see any way that does not work against the idea of confining access to
kvm_state or breaks device instantiation from the command line or a
config file.

   

A KVM device should sit on a KVM specific bus that hangs off of sysbus.
It can get to kvm_state through that bus.

That bus doesn't get instantiated through qdev so requiring a pointer
argument should not be an issue.

 

This design is in conflict with the requirement to attach KVM-assisted
devices also to their home bus, e.g. an assigned PCI device to the PCI
bus. We don't support multi-homed qdev devices.
   


The bus topology reflects how I/O flows in and out of a device.  We do 
not model a perfect PC bus architecture and I don't think we ever intend 
to.  Instead, we model a functional architecture.


I/O from an assigned device does not flow through the emulated PCI bus.  
Therefore, it does not belong on the emulated PCI bus.


Assigned devices need to interact with the emulated PCI bus, but they 
shouldn't be children of it.


Regards,

Anthony Liguori


Jan

Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state

2011-01-18 Thread Jan Kiszka

On 2011-01-18 16:48, Anthony Liguori wrote:
> On 01/18/2011 09:43 AM, Jan Kiszka wrote:
>> On 2011-01-18 16:04, Anthony Liguori wrote:
>>
>>> On 01/18/2011 08:28 AM, Jan Kiszka wrote:
>>>  
 On 2011-01-12 11:31, Jan Kiszka wrote:


> Am 12.01.2011 11:22, Avi Kivity wrote:
>
>  
>> On 01/11/2011 03:54 PM, Anthony Liguori wrote:
>>
>>
>>> Right, we should introduce a KVMBus that KVM devices are created on.
>>> The devices can get at KVMState through the BusState.
>>>
>>>  
>> There is no kvm bus in a PC (I looked).  We're bending the device model
>> here because a device is implemented in the kernel and not in
>> userspace.  An implementation detail is magnified beyond all proportions.
>>
>> An ioapic that is implemented by kvm lives in exactly the same place
>> that the qemu ioapic lives in.  An assigned pci device lives on the PCI
>> bus, not a KVMBus.  If we need a pointer to KVMState, then we must find
>> it elsewhere, not through creating imaginary buses that don't exist.
>>
>>
>>
> Exactly.
>
> So we can either "infect" the whole device tree with kvm (or maybe a
> more generic accelerator structure that also deals with Xen) or we need
> to pull the reference inside the device's init function from some global
> service (kvm_get_state).
>
>  
 Note that this topic is still waiting for good suggestions, specifically
 from those who believe in kvm_state references :). This is not only
 blocking kvmstate merge but will affect KVM irqchips as well.

 It boils down to how we reasonably pass a kvm_state reference from
 machine init code to a sysbus device. I'm probably biased, but I don't
 see any way that does not work against the idea of confining access to
 kvm_state or breaks device instantiation from the command line or a
 config file.


>>> A KVM device should sit on a KVM specific bus that hangs off of sysbus.
>>> It can get to kvm_state through that bus.
>>>
>>> That bus doesn't get instantiated through qdev so requiring a pointer
>>> argument should not be an issue.
>>>
>>>  
>> This design is in conflict with the requirement to attach KVM-assisted
>> devices also to their home bus, e.g. an assigned PCI device to the PCI
>> bus. We don't support multi-homed qdev devices.
>>
> 
> With vfio, would an assigned PCI device even need kvm_state?

IIUC: Yes, for establishing the irqfd link.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state

2011-01-18 Thread Jan Kiszka

On 2011-01-18 16:50, Anthony Liguori wrote:
> On 01/18/2011 09:43 AM, Jan Kiszka wrote:
>> On 2011-01-18 16:04, Anthony Liguori wrote:
>>
>>> On 01/18/2011 08:28 AM, Jan Kiszka wrote:
>>>  
 On 2011-01-12 11:31, Jan Kiszka wrote:


> Am 12.01.2011 11:22, Avi Kivity wrote:
>
>  
>> On 01/11/2011 03:54 PM, Anthony Liguori wrote:
>>
>>
>>> Right, we should introduce a KVMBus that KVM devices are created on.
>>> The devices can get at KVMState through the BusState.
>>>
>>>  
>> There is no kvm bus in a PC (I looked).  We're bending the device model
>> here because a device is implemented in the kernel and not in
>> userspace.  An implementation detail is magnified beyond all proportions.
>>
>> An ioapic that is implemented by kvm lives in exactly the same place
>> that the qemu ioapic lives in.  An assigned pci device lives on the PCI
>> bus, not a KVMBus.  If we need a pointer to KVMState, then we must find
>> it elsewhere, not through creating imaginary buses that don't exist.
>>
>>
>>
> Exactly.
>
> So we can either "infect" the whole device tree with kvm (or maybe a
> more generic accelerator structure that also deals with Xen) or we need
> to pull the reference inside the device's init function from some global
> service (kvm_get_state).
>
>  
 Note that this topic is still waiting for good suggestions, specifically
 from those who believe in kvm_state references :). This is not only
 blocking kvmstate merge but will affect KVM irqchips as well.

 It boils down to how we reasonably pass a kvm_state reference from
 machine init code to a sysbus device. I'm probably biased, but I don't
 see any way that does not work against the idea of confining access to
 kvm_state or breaks device instantiation from the command line or a
 config file.


>>> A KVM device should sit on a KVM specific bus that hangs off of sysbus.
>>> It can get to kvm_state through that bus.
>>>
>>> That bus doesn't get instantiated through qdev so requiring a pointer
>>> argument should not be an issue.
>>>
>>>  
>> This design is in conflict with the requirement to attach KVM-assisted
>> devices also to their home bus, e.g. an assigned PCI device to the PCI
>> bus. We don't support multi-homed qdev devices.
>>
> 
> The bus topology reflects how I/O flows in and out of a device.  We do 
> not model a perfect PC bus architecture and I don't think we ever intend 
> to.  Instead, we model a functional architecture.
> 
> I/O from an assigned device does not flow through the emulated PCI bus.  
> Therefore, it does not belong on the emulated PCI bus.
> 
> Assigned devices need to interact with the emulated PCI bus, but they 
> shouldn't be children of it.

You should be able to find assigned devices on some PCI bus, so you
either have to hack up the existing bus to host devices that are, on the
other side, not part of it or branch off a pci-kvm sub-bus, just like
you would have to create a sysbus-kvm. I guess, if at all, we want the
latter.

Is that acceptable for everyone?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state

2011-01-18 Thread Anthony Liguori


On 01/18/2011 10:01 AM, Jan Kiszka wrote:

On 2011-01-18 16:50, Anthony Liguori wrote:
   

On 01/18/2011 09:43 AM, Jan Kiszka wrote:
 

On 2011-01-18 16:04, Anthony Liguori wrote:

   

On 01/18/2011 08:28 AM, Jan Kiszka wrote:

 

On 2011-01-12 11:31, Jan Kiszka wrote:


   

Am 12.01.2011 11:22, Avi Kivity wrote:


 

On 01/11/2011 03:54 PM, Anthony Liguori wrote:


   

Right, we should introduce a KVMBus that KVM devices are created on.
The devices can get at KVMState through the BusState.


 

There is no kvm bus in a PC (I looked).  We're bending the device model
here because a device is implemented in the kernel and not in
userspace.  An implementation detail is magnified beyond all proportions.

An ioapic that is implemented by kvm lives in exactly the same place
that the qemu ioapic lives in.  An assigned pci device lives on the PCI
bus, not a KVMBus.  If we need a pointer to KVMState, then we must find
it elsewhere, not through creating imaginary buses that don't exist.



   

Exactly.

So we can either "infect" the whole device tree with kvm (or maybe a
more generic accelerator structure that also deals with Xen) or we need
to pull the reference inside the device's init function from some global
service (kvm_get_state).


 

Note that this topic is still waiting for good suggestions, specifically
from those who believe in kvm_state references :). This is not only
blocking kvmstate merge but will affect KVM irqchips as well.

It boils down to how we reasonably pass a kvm_state reference from
machine init code to a sysbus device. I'm probably biased, but I don't
see any way that does not work against the idea of confining access to
kvm_state or breaks device instantiation from the command line or a
config file.


   

A KVM device should sit on a KVM specific bus that hangs off of sysbus.
It can get to kvm_state through that bus.

That bus doesn't get instantiated through qdev so requiring a pointer
argument should not be an issue.


 

This design is in conflict with the requirement to attach KVM-assisted
devices also to their home bus, e.g. an assigned PCI device to the PCI
bus. We don't support multi-homed qdev devices.

   

The bus topology reflects how I/O flows in and out of a device.  We do
not model a perfect PC bus architecture and I don't think we ever intend
to.  Instead, we model a functional architecture.

I/O from an assigned device does not flow through the emulated PCI bus.
Therefore, it does not belong on the emulated PCI bus.

Assigned devices need to interact with the emulated PCI bus, but they
shouldn't be children of it.
 

You should be able to find assigned devices on some PCI bus, so you
either have to hack up the existing bus to host devices that are, on the
other side, not part of it or branch off a pci-kvm sub-bus, just like
you would have to create a sysbus-kvm.


Management tools should never transverse the device tree to find 
devices.  This is a recipe for disaster in the long term because the 
device tree will not remain stable.


So yes, a management tool should be able to enumerate assigned devices 
as they would enumerate any other PCI device but that has almost nothing 
to do with what the tree layout is.


Regards,

Anthony Liguori


  I guess, if at all, we want the
latter.

Is that acceptable for everyone?

Jan

Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state

2011-01-18 Thread Jan Kiszka

On 2011-01-18 17:04, Anthony Liguori wrote:
> A KVM device should sit on a KVM specific bus that hangs off of
> sysbus.
> It can get to kvm_state through that bus.
>
> That bus doesn't get instantiated through qdev so requiring a pointer
> argument should not be an issue.
>
>
>  
 This design is in conflict with the requirement to attach KVM-assisted
 devices also to their home bus, e.g. an assigned PCI device to the PCI
 bus. We don't support multi-homed qdev devices.


>>> The bus topology reflects how I/O flows in and out of a device.  We do
>>> not model a perfect PC bus architecture and I don't think we ever intend
>>> to.  Instead, we model a functional architecture.
>>>
>>> I/O from an assigned device does not flow through the emulated PCI bus.
>>> Therefore, it does not belong on the emulated PCI bus.
>>>
>>> Assigned devices need to interact with the emulated PCI bus, but they
>>> shouldn't be children of it.
>>>  
>> You should be able to find assigned devices on some PCI bus, so you
>> either have to hack up the existing bus to host devices that are, on the
>> other side, not part of it or branch off a pci-kvm sub-bus, just like
>> you would have to create a sysbus-kvm.
> 
> Management tools should never transverse the device tree to find
> devices.  This is a recipe for disaster in the long term because the
> device tree will not remain stable.
> 
> So yes, a management tool should be able to enumerate assigned devices
> as they would enumerate any other PCI device but that has almost nothing
> to do with what the tree layout is.

I'm probably misunderstanding you, but if the bus topology as the guest
sees it is not properly reflected in an object tree on the qemu side, we
are creating hacks again.

Management and analysis tools must be able to traverse the system buses
and find guest devices this way. If they create a device on bus X, it
must never end up on bus Y just because it happens to be KVM-assisted or
has some other property. On the other hand, trying to hide this
dependency will likely cause severe damage to the qdev design.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] [PATCH] pxa2xx_lcd: restore updating of display

2011-01-18 Thread Dmitry Eremin-Solenikov

Recently PXA2xx lcd have stopped to be updated incrementally (picture
frozen). This patch fixes that by passing non min/max x/y, but rather
(correctly) x/y and w/h.

Signed-off-by: Dmitry Eremin-Solenikov 
---
 hw/pxa2xx_lcd.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/pxa2xx_lcd.c b/hw/pxa2xx_lcd.c
index 1f2211a..5b2b07e 100644
--- a/hw/pxa2xx_lcd.c
+++ b/hw/pxa2xx_lcd.c
@@ -796,9 +796,9 @@ static void pxa2xx_update_display(void *opaque)
 
 if (miny >= 0) {
 if (s->orientation)
-dpy_update(s->ds, miny, 0, maxy, s->xres);
+dpy_update(s->ds, miny, 0, maxy - miny, s->xres);
 else
-dpy_update(s->ds, 0, miny, s->xres, maxy);
+dpy_update(s->ds, 0, miny, s->xres, maxy - miny);
 }
 pxa2xx_lcdc_int_update(s);
 
-- 
1.7.2.3

Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state

2011-01-18 Thread Anthony Liguori


On 01/18/2011 10:17 AM, Jan Kiszka wrote:

On 2011-01-18 17:04, Anthony Liguori wrote:
   

A KVM device should sit on a KVM specific bus that hangs off of
sysbus.
It can get to kvm_state through that bus.

That bus doesn't get instantiated through qdev so requiring a pointer
argument should not be an issue.



 

This design is in conflict with the requirement to attach KVM-assisted
devices also to their home bus, e.g. an assigned PCI device to the PCI
bus. We don't support multi-homed qdev devices.


   

The bus topology reflects how I/O flows in and out of a device.  We do
not model a perfect PC bus architecture and I don't think we ever intend
to.  Instead, we model a functional architecture.

I/O from an assigned device does not flow through the emulated PCI bus.
Therefore, it does not belong on the emulated PCI bus.

Assigned devices need to interact with the emulated PCI bus, but they
shouldn't be children of it.

 

You should be able to find assigned devices on some PCI bus, so you
either have to hack up the existing bus to host devices that are, on the
other side, not part of it or branch off a pci-kvm sub-bus, just like
you would have to create a sysbus-kvm.
   

Management tools should never transverse the device tree to find
devices.  This is a recipe for disaster in the long term because the
device tree will not remain stable.

So yes, a management tool should be able to enumerate assigned devices
as they would enumerate any other PCI device but that has almost nothing
to do with what the tree layout is.
 

I'm probably misunderstanding you, but if the bus topology as the guest
sees it is not properly reflected in an object tree on the qemu side, we
are creating hacks again.
   


There is no such thing as the "bus topology as the guest sees it".

The guest just sees a bunch of devices.  The guest can only infer things 
like ISA busses.  The guest sees a bunch of devices: an i8254, i8259, 
RTC, etc.  Whether those devices are on an ISA bus, and LPC bus, or all 
in a SuperI/O chip that's part of the southbridge is all invisible to 
the guest.


The device model topology is 100% a hidden architectural detail.


Management and analysis tools must be able to traverse the system buses
and find guest devices this way.


We need to provide a compatible interface to the guest.  If you agree 
with my above statements, then you'll also agree that we can do this 
without keeping the device model topology stable.


But we also need to provide a compatible interface to management tools.  
Exposing the device model topology as a compatible interface 
artificially limits us.  It's far better to provide higher level 
supported interfaces to give us the flexibility to change the device 
model as we need to.



  If they create a device on bus X, it
must never end up on bus Y just because it happens to be KVM-assisted or
has some other property.


Nope.  This is exactly what should happen.

90% of the devices in the device model are not created by management 
tools.  They're part of a chipset.  The chipset has well defined 
extension points and we provide management interfaces to create devices 
on those extension points.  That is, interfaces to create PCI devices.


Regards,

Anthony Liguori


  On the other hand, trying to hide this
dependency will likely cause severe damage to the qdev design.

Jan

Re: [Qemu-devel] [PATCH 2/2] strtosz(): Use suffix macros in switch() statement

2011-01-18 Thread Anthony Liguori


On 01/18/2011 03:20 AM, Markus Armbruster wrote:

jes.soren...@redhat.com writes:

   

From: Jes Sorensen

Signed-off-by: Jes Sorensen
---
  cutils.c |   10 +-
  1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/cutils.c b/cutils.c
index 328738c..f2c8bbd 100644
--- a/cutils.c
+++ b/cutils.c
@@ -324,26 +324,26 @@ ssize_t strtosz_suffix(const char *nptr, char **end, 
const char default_suffix)
  }
  }
  switch (toupper(d)) {
-case 'B':
+case STRTOSZ_DEFSUFFIX_B:
  mul = 1;
  if (mul_required) {
  goto fail;
  }
  break;
-case 'K':
+case STRTOSZ_DEFSUFFIX_KB:
  mul = 1<<  10;
  break;
  case 0:
  if (mul_required) {
  goto fail;
  }
-case 'M':
+case STRTOSZ_DEFSUFFIX_MB:
  mul = 1ULL<<  20;
  break;
-case 'G':
+case STRTOSZ_DEFSUFFIX_GB:
  mul = 1ULL<<  30;
  break;
-case 'T':
+case STRTOSZ_DEFSUFFIX_TB:
  mul = 1ULL<<  40;
  break;
  default:
 

And this improves what?  Certainly not clarity.

In my opinion, the STRTOSZ_DEFSUFFIX_TB are useless chaff.  Chacun à son
goût.
   


Yeah, I have to agree.  I'm not of the literals are evil camp.

BTW, a useful change would be to accept both upper and lower case letters.

Regards,

Anthony Liguori

Re: [Qemu-devel] [PATCH 2/2] strtosz(): Use suffix macros in switch() statement

2011-01-18 Thread Jes Sorensen

On 01/18/11 17:50, Anthony Liguori wrote:
> On 01/18/2011 03:20 AM, Markus Armbruster wrote:
>> jes.soren...@redhat.com writes:
>>
>>   
>>> From: Jes Sorensen
>>>
>>> Signed-off-by: Jes Sorensen
>>> ---
>>>   cutils.c |   10 +-
>>>   1 files changed, 5 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/cutils.c b/cutils.c
>>> index 328738c..f2c8bbd 100644
>>> --- a/cutils.c
>>> +++ b/cutils.c
>>> @@ -324,26 +324,26 @@ ssize_t strtosz_suffix(const char *nptr, char
>>> **end, const char default_suffix)
>>>   }
>>>   }
>>>   switch (toupper(d)) {

^^^

>>>  
>> And this improves what?  Certainly not clarity.
>>
>> In my opinion, the STRTOSZ_DEFSUFFIX_TB are useless chaff.  Chacun à son
>> goût.
>>
> 
> Yeah, I have to agree.  I'm not of the literals are evil camp.
> 
> BTW, a useful change would be to accept both upper and lower case letters.

It already supports both, it's handle in the toupper() call.

Cheers,
Jes

Re: [Qemu-devel] [PATCH 2/2] strtosz(): Use suffix macros in switch() statement

2011-01-18 Thread Eric Blake

On 01/18/2011 09:50 AM, Anthony Liguori wrote:
>>> @@ -324,26 +324,26 @@ ssize_t strtosz_suffix(const char *nptr, char
>>> **end, const char default_suffix)
>>>   }
>>>   }
>>>   switch (toupper(d)) {
> BTW, a useful change would be to accept both upper and lower case letters.

And it does, via the toupper() added earlier in the series (and which
has separately been pointed out that using qemu_toupper() might be nicer).

-- 
Eric Blake   ebl...@redhat.com+1-801-349-2682
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state

2011-01-18 Thread Jan Kiszka

On 2011-01-18 17:37, Anthony Liguori wrote:
> On 01/18/2011 10:17 AM, Jan Kiszka wrote:
>> On 2011-01-18 17:04, Anthony Liguori wrote:
>>
>>> A KVM device should sit on a KVM specific bus that hangs off of
>>> sysbus.
>>> It can get to kvm_state through that bus.
>>>
>>> That bus doesn't get instantiated through qdev so requiring a pointer
>>> argument should not be an issue.
>>>
>>>
>>>
>>>  
>> This design is in conflict with the requirement to attach KVM-assisted
>> devices also to their home bus, e.g. an assigned PCI device to the PCI
>> bus. We don't support multi-homed qdev devices.
>>
>>
>>
> The bus topology reflects how I/O flows in and out of a device.  We do
> not model a perfect PC bus architecture and I don't think we ever intend
> to.  Instead, we model a functional architecture.
>
> I/O from an assigned device does not flow through the emulated PCI bus.
> Therefore, it does not belong on the emulated PCI bus.
>
> Assigned devices need to interact with the emulated PCI bus, but they
> shouldn't be children of it.
>
>  
 You should be able to find assigned devices on some PCI bus, so you
 either have to hack up the existing bus to host devices that are, on the
 other side, not part of it or branch off a pci-kvm sub-bus, just like
 you would have to create a sysbus-kvm.

>>> Management tools should never transverse the device tree to find
>>> devices.  This is a recipe for disaster in the long term because the
>>> device tree will not remain stable.
>>>
>>> So yes, a management tool should be able to enumerate assigned devices
>>> as they would enumerate any other PCI device but that has almost nothing
>>> to do with what the tree layout is.
>>>  
>> I'm probably misunderstanding you, but if the bus topology as the guest
>> sees it is not properly reflected in an object tree on the qemu side, we
>> are creating hacks again.
>>
> 
> There is no such thing as the "bus topology as the guest sees it".
> 
> The guest just sees a bunch of devices.  The guest can only infer things 
> like ISA busses.  The guest sees a bunch of devices: an i8254, i8259, 
> RTC, etc.  Whether those devices are on an ISA bus, and LPC bus, or all 
> in a SuperI/O chip that's part of the southbridge is all invisible to 
> the guest.
> 
> The device model topology is 100% a hidden architectural detail.

This is true for the sysbus, it is obviously not the case for PCI and
similarly discoverable buses. There we have a guest-explorable topology
that is currently equivalent to the the qdev layout.

> 
>> Management and analysis tools must be able to traverse the system buses
>> and find guest devices this way.
> 
> We need to provide a compatible interface to the guest.  If you agree 
> with my above statements, then you'll also agree that we can do this 
> without keeping the device model topology stable.
> 
> But we also need to provide a compatible interface to management tools.  
> Exposing the device model topology as a compatible interface 
> artificially limits us.  It's far better to provide higher level 
> supported interfaces to give us the flexibility to change the device 
> model as we need to.

How do you want to change qdev to keep the guest and management tool
view stable while branching off kvm sub-buses? Please propose such
extensions so that they can be discussed. IIUC, that would be second
relation between qdev and qbus instances besides the physical topology.
What further use cases (besides passing kvm_state around) do you have in
mind?

> 
>>   If they create a device on bus X, it
>> must never end up on bus Y just because it happens to be KVM-assisted or
>> has some other property.
> 
> Nope.  This is exactly what should happen.
> 
> 90% of the devices in the device model are not created by management 
> tools.  They're part of a chipset.  The chipset has well defined 
> extension points and we provide management interfaces to create devices 
> on those extension points.  That is, interfaces to create PCI devices.
> 

Creating kvm irqchips via the management tool would be one simple way
(not the only one, though) to enable/disable kvm assistance for those
devices.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state

2011-01-18 Thread Alex Williamson

On Tue, 2011-01-18 at 16:54 +0100, Jan Kiszka wrote:
> On 2011-01-18 16:48, Anthony Liguori wrote:
> > On 01/18/2011 09:43 AM, Jan Kiszka wrote:
> >> On 2011-01-18 16:04, Anthony Liguori wrote:
> >>
> >>> On 01/18/2011 08:28 AM, Jan Kiszka wrote:
> >>>  
>  On 2011-01-12 11:31, Jan Kiszka wrote:
> 
> 
> > Am 12.01.2011 11:22, Avi Kivity wrote:
> >
> >  
> >> On 01/11/2011 03:54 PM, Anthony Liguori wrote:
> >>
> >>
> >>> Right, we should introduce a KVMBus that KVM devices are created on.
> >>> The devices can get at KVMState through the BusState.
> >>>
> >>>  
> >> There is no kvm bus in a PC (I looked).  We're bending the device model
> >> here because a device is implemented in the kernel and not in
> >> userspace.  An implementation detail is magnified beyond all 
> >> proportions.
> >>
> >> An ioapic that is implemented by kvm lives in exactly the same place
> >> that the qemu ioapic lives in.  An assigned pci device lives on the PCI
> >> bus, not a KVMBus.  If we need a pointer to KVMState, then we must find
> >> it elsewhere, not through creating imaginary buses that don't exist.
> >>
> >>
> >>
> > Exactly.
> >
> > So we can either "infect" the whole device tree with kvm (or maybe a
> > more generic accelerator structure that also deals with Xen) or we need
> > to pull the reference inside the device's init function from some global
> > service (kvm_get_state).
> >
> >  
>  Note that this topic is still waiting for good suggestions, specifically
>  from those who believe in kvm_state references :). This is not only
>  blocking kvmstate merge but will affect KVM irqchips as well.
> 
>  It boils down to how we reasonably pass a kvm_state reference from
>  machine init code to a sysbus device. I'm probably biased, but I don't
>  see any way that does not work against the idea of confining access to
>  kvm_state or breaks device instantiation from the command line or a
>  config file.
> 
> 
> >>> A KVM device should sit on a KVM specific bus that hangs off of sysbus.
> >>> It can get to kvm_state through that bus.
> >>>
> >>> That bus doesn't get instantiated through qdev so requiring a pointer
> >>> argument should not be an issue.
> >>>
> >>>  
> >> This design is in conflict with the requirement to attach KVM-assisted
> >> devices also to their home bus, e.g. an assigned PCI device to the PCI
> >> bus. We don't support multi-homed qdev devices.
> >>
> > 
> > With vfio, would an assigned PCI device even need kvm_state?
> 
> IIUC: Yes, for establishing the irqfd link.

We abstract this through the msi/msix layer though, so the vfio driver
doesn't directly know anything about kvm_state.

Alex

Re: [Qemu-devel] [PATCH] target-arm: Fix garbage collection of temporaries in Neon emulation.

2011-01-18 Thread Christophe Lyon

On 18.01.2011 16:26, Peter Maydell wrote:
> On 18 January 2011 14:34, Christophe Lyon  wrote:
>> +
>> +/* gen_helper_neon_mull_[su]{8|16} do not free their parameters.
>> +   Don't forget to clean them now.  */
>> +switch ((size << 1) | u) {
>> +case 0:
>> +case 1:
>> +case 2:
>> +case 3:
>> +  dead_tmp(a);
>> +  dead_tmp(b);
>> +  break;
>> +}
>>  }
> 
> This seems a rather convoluted way to write "if (size < 2) { ... }"
> 
It was for consistency/readability with the preceding paragraph.

>> @@ -5235,9 +5245,12 @@ static int disas_neon_data_insn(CPUState * env,
>> DisasContext *s, uint32_t insn)
>> tmp = neon_load_reg(rn, 0);
>> } else {
>> tmp = tmp3;
>> +   /* tmp2 has been discarded in
>> +  gen_neon_mull during pass 0, we need to
>> +  recreate it.  */
>> +   tmp2 = neon_get_scalar(size, rm);
>> }
> 
> I think this will give the wrong results for instructions where the
> scalar operand is in the same Neon register as the destination
> for the first pass, because calling neon_get_scalar() again will
> do a reload from the Neon register and it might have changed.
> (Also loading it once at the start rather than in every pass is
> more efficient as well as being correct :-))

I agree it's more efficient, but as the temporary is freed by gen_neon_mull, 
how can I make an efficient copy?

If we decide not to free the temporary in gen_mul[us]_i64_i32, we'll have to 
make sure clean up is performed correctly in many places.

 
> Also your patch has hard-coded tabs in it (please see
> CODING_STYLE on the subject of whitespace) and your
> mail client or server has line-wrapped long lines in the patch
> so it doesn't apply cleanly...

Sorry, I know we have some trouble with the mail client or server. Is it 
possible to send patches as attachments on this list?

Re: [Qemu-devel] [V3 PATCH 2/8] virtio-9p: Provide chroot environment server side interfaces

2011-01-18 Thread Blue Swirl

On Tue, Jan 18, 2011 at 6:25 AM, M. Mohan Kumar  wrote:
> Implement chroot server side interfaces like sending the file
> descriptor to qemu process, reading the object request from socket etc.
> Also add chroot main function and other helper routines.
>
> Signed-off-by: M. Mohan Kumar 
> ---
>  Makefile.objs              |    1 +
>  hw/9pfs/virtio-9p-chroot.c |  183 
> 
>  hw/9pfs/virtio-9p-chroot.h |   39 +
>  hw/9pfs/virtio-9p.c        |   23 ++
>  hw/file-op-9p.h            |    2 +
>  5 files changed, 248 insertions(+), 0 deletions(-)
>  create mode 100644 hw/9pfs/virtio-9p-chroot.c
>  create mode 100644 hw/9pfs/virtio-9p-chroot.h
>
> diff --git a/Makefile.objs b/Makefile.objs
> index bc0344c..3007b6d 100644
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@ -273,6 +273,7 @@ hw-obj-$(CONFIG_SOUND) += $(sound-obj-y)
>  9pfs-nested-$(CONFIG_VIRTFS) = virtio-9p-debug.o
>  9pfs-nested-$(CONFIG_VIRTFS) +=  virtio-9p-local.o virtio-9p-xattr.o
>  9pfs-nested-$(CONFIG_VIRTFS) +=   virtio-9p-xattr-user.o 
> virtio-9p-posix-acl.o
> +9pfs-nested-$(CONFIG_VIRTFS) +=   virtio-9p-chroot.o
>
>  hw-obj-$(CONFIG_REALLY_VIRTFS) += $(addprefix 9pfs/, $(9pfs-nested-y))
>  $(addprefix 9pfs/, $(9pfs-nested-y)): CFLAGS +=  -I$(SRC_PATH)/hw/
> diff --git a/hw/9pfs/virtio-9p-chroot.c b/hw/9pfs/virtio-9p-chroot.c
> new file mode 100644
> index 000..dcde2cc
> --- /dev/null
> +++ b/hw/9pfs/virtio-9p-chroot.c
> @@ -0,0 +1,183 @@
> +/*
> + * Virtio 9p chroot environment for contained access to exported path
> + *
> + * Copyright IBM, Corp. 2011
> + *
> + * Authors:
> + * M. Mohan Kumar 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See
> + * the copying file in the top-level directory
> + *
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include "virtio.h"
> +#include "qemu_socket.h"
> +#include "qemu-thread.h"
> +#include "qerror.h"
> +#include "virtio-9p.h"
> +#include "virtio-9p-chroot.h"
> +
> +/*
> + * Structure used by chroot functions to transmit file descriptor and
> + * error info
> + */
> +typedef struct {
> +    int fi_fd;
> +    int fi_error;
> +} FdInfo;
> +
> +union MsgControl {
> +    struct cmsghdr cmsg;
> +    char control[CMSG_SPACE(sizeof(int))];
> +};
> +
> +/* Send file descriptor and error status to qemu process */
> +static int chroot_sendfd(int sockfd, FdInfo *fd_info)

const FdInfo *?

> +{
> +    struct msghdr msg = { };
> +    struct iovec iov;
> +    struct cmsghdr *cmsg;
> +    int retval;
> +    union MsgControl msg_control;
> +
> +    iov.iov_base = fd_info;
> +    iov.iov_len = sizeof(*fd_info);
> +
> +    memset(&msg, 0, sizeof(msg));
> +    msg.msg_iov = &iov;
> +    msg.msg_iovlen = 1;
> +    /* No ancillary data on error */
> +    if (!fd_info->fi_error) {
> +        msg.msg_control = &msg_control;
> +        msg.msg_controllen = sizeof(msg_control);
> +
> +        cmsg = &msg_control.cmsg;
> +        cmsg->cmsg_len = CMSG_LEN(sizeof(fd_info->fi_fd));
> +        cmsg->cmsg_level = SOL_SOCKET;
> +        cmsg->cmsg_type = SCM_RIGHTS;
> +        memcpy(CMSG_DATA(cmsg), &fd_info->fi_fd, sizeof(fd_info->fi_fd));
> +    }
> +    retval = sendmsg(sockfd, &msg, 0);
> +    close(fd_info->fi_fd);
> +    return retval;
> +}
> +
> +/* Read V9fsFileObjectRequest written by QEMU process */
> +static void chroot_read_request(int sockfd, V9fsFileObjectRequest *request)
> +{
> +    int retval;
> +    retval = qemu_read_full(sockfd, request, sizeof(request->data));
> +    if (retval <= 0 || request->data.path_len <= 0) {
> +        _exit(-1);
> +    }
> +    request->path.path = qemu_mallocz(request->data.path_len + 1);
> +    retval = qemu_read_full(sockfd, (void *)request->path.path,
> +                        request->data.path_len);
> +    if (retval <= 0) {
> +        _exit(-1);
> +    }
> +    if (request->data.oldpath_len > 0) {
> +        request->path.old_path =
> +                qemu_mallocz(request->data.oldpath_len + 1);
> +        retval = qemu_read_full(sockfd, (void *)request->path.old_path,
> +                            request->data.oldpath_len);
> +        if (retval <= 0) {
> +            _exit(-1);
> +        }
> +    }
> +}
> +
> +static int chroot_daemonize(int chroot_sock)
> +{
> +    sigset_t sigset;
> +    struct rlimit nr_fd;
> +    int fd;
> +
> +    /* Block all signals for this process */
> +    sigprocmask(SIG_SETMASK, &sigset, NULL);
> +
> +    /* Daemonize */
> +    if (setsid() < 0) {
> +        error_report("setsid %s", strerror(errno));
> +        return -1;
> +    }
> +
> +    /* Close other file descriptors */
> +    getrlimit(RLIMIT_NOFILE, &nr_fd);
> +    for (fd = 0; fd < nr_fd.rlim_cur; fd++) {
> +        if (fd != chroot_sock) {
> +            close(fd);
> +        }
> +    }
> +
> +    /* Create files with mode as requested by client */
> +    umask(0);
> +    return 0;
> +}
> +
> +static void chroot_dummy(void)
> +{
> +    (void)chroot_sendfd;
> +}
> +
> +/*
> + * Fork a pr

Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state

2011-01-18 Thread Jan Kiszka

On 2011-01-18 18:02, Alex Williamson wrote:
> On Tue, 2011-01-18 at 16:54 +0100, Jan Kiszka wrote:
>> On 2011-01-18 16:48, Anthony Liguori wrote:
>>> On 01/18/2011 09:43 AM, Jan Kiszka wrote:
 On 2011-01-18 16:04, Anthony Liguori wrote:

> On 01/18/2011 08:28 AM, Jan Kiszka wrote:
>  
>> On 2011-01-12 11:31, Jan Kiszka wrote:
>>
>>
>>> Am 12.01.2011 11:22, Avi Kivity wrote:
>>>
>>>  
 On 01/11/2011 03:54 PM, Anthony Liguori wrote:


> Right, we should introduce a KVMBus that KVM devices are created on.
> The devices can get at KVMState through the BusState.
>
>  
 There is no kvm bus in a PC (I looked).  We're bending the device model
 here because a device is implemented in the kernel and not in
 userspace.  An implementation detail is magnified beyond all 
 proportions.

 An ioapic that is implemented by kvm lives in exactly the same place
 that the qemu ioapic lives in.  An assigned pci device lives on the PCI
 bus, not a KVMBus.  If we need a pointer to KVMState, then we must find
 it elsewhere, not through creating imaginary buses that don't exist.



>>> Exactly.
>>>
>>> So we can either "infect" the whole device tree with kvm (or maybe a
>>> more generic accelerator structure that also deals with Xen) or we need
>>> to pull the reference inside the device's init function from some global
>>> service (kvm_get_state).
>>>
>>>  
>> Note that this topic is still waiting for good suggestions, specifically
>> from those who believe in kvm_state references :). This is not only
>> blocking kvmstate merge but will affect KVM irqchips as well.
>>
>> It boils down to how we reasonably pass a kvm_state reference from
>> machine init code to a sysbus device. I'm probably biased, but I don't
>> see any way that does not work against the idea of confining access to
>> kvm_state or breaks device instantiation from the command line or a
>> config file.
>>
>>
> A KVM device should sit on a KVM specific bus that hangs off of sysbus.
> It can get to kvm_state through that bus.
>
> That bus doesn't get instantiated through qdev so requiring a pointer
> argument should not be an issue.
>
>  
 This design is in conflict with the requirement to attach KVM-assisted
 devices also to their home bus, e.g. an assigned PCI device to the PCI
 bus. We don't support multi-homed qdev devices.

>>>
>>> With vfio, would an assigned PCI device even need kvm_state?
>>
>> IIUC: Yes, for establishing the irqfd link.
> 
> We abstract this through the msi/msix layer though, so the vfio driver
> doesn't directly know anything about kvm_state.

Which version/tree are you referring to? It wasn't the case in the last
version I found on the list.

Does the msi layer use irqfd for every source in kvm mode then? Of
course, the key question will be how that layer will once obtain kvm_state.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state

2011-01-18 Thread Anthony Liguori


On 01/18/2011 10:56 AM, Jan Kiszka wrote:



The device model topology is 100% a hidden architectural detail.
 

This is true for the sysbus, it is obviously not the case for PCI and
similarly discoverable buses. There we have a guest-explorable topology
that is currently equivalent to the the qdev layout.
   


But we also don't do PCI passthrough so we really haven't even explored 
how that maps in qdev.  I don't know if qemu-kvm has attempted to 
qdev-ify it.



Management and analysis tools must be able to traverse the system buses
and find guest devices this way.
   

We need to provide a compatible interface to the guest.  If you agree
with my above statements, then you'll also agree that we can do this
without keeping the device model topology stable.

But we also need to provide a compatible interface to management tools.
Exposing the device model topology as a compatible interface
artificially limits us.  It's far better to provide higher level
supported interfaces to give us the flexibility to change the device
model as we need to.
 

How do you want to change qdev to keep the guest and management tool
view stable while branching off kvm sub-buses?


The qdev device model is not a stable interface.  I think that's been 
clear from the very beginning.



  Please propose such
extensions so that they can be discussed. IIUC, that would be second
relation between qdev and qbus instances besides the physical topology.
What further use cases (besides passing kvm_state around) do you have in
mind?
   


The -device interface is a stable interface.  Right now, you don't 
specify any type of identifier of the pci bus when you create a PCI 
device.  It's implied in the interface.


   
 

   If they create a device on bus X, it
must never end up on bus Y just because it happens to be KVM-assisted or
has some other property.
   

Nope.  This is exactly what should happen.

90% of the devices in the device model are not created by management
tools.  They're part of a chipset.  The chipset has well defined
extension points and we provide management interfaces to create devices
on those extension points.  That is, interfaces to create PCI devices.

 

Creating kvm irqchips via the management tool would be one simple way
(not the only one, though) to enable/disable kvm assistance for those
devices.
   


It's automatically created as part of the CPUs or as part of the 
chipset.  How to enable/disable kvm assistance is a property of the CPU 
and/or chipset.


Regards,

Anthony Liguori



Jan

Re: [Qemu-devel] [V3 PATCH 5/8] virtio-9p: Create support in chroot environment

2011-01-18 Thread Blue Swirl

On Tue, Jan 18, 2011 at 6:25 AM, M. Mohan Kumar  wrote:
> Add both server & client side interfaces to create regular files in
> chroot environment
>
> Signed-off-by: M. Mohan Kumar 
> ---
>  hw/9pfs/virtio-9p-chroot.c |   42 ++
>  hw/9pfs/virtio-9p-local.c  |   22 --
>  2 files changed, 62 insertions(+), 2 deletions(-)
>
> diff --git a/hw/9pfs/virtio-9p-chroot.c b/hw/9pfs/virtio-9p-chroot.c
> index b599e23..e7f85e2 100644
> --- a/hw/9pfs/virtio-9p-chroot.c
> +++ b/hw/9pfs/virtio-9p-chroot.c
> @@ -193,6 +193,42 @@ static void chroot_do_open(V9fsFileObjectRequest 
> *request, FdInfo *fd_info)
>     }
>  }
>
> +/*
> + * Helper routine to create a file and return the file descriptor and
> + * error status in FdInfo structure.
> + */
> +static void chroot_do_create(V9fsFileObjectRequest *request, FdInfo *fd_info)
> +{
> +    int cur_uid, cur_gid;

uid_t cur_uid;
gid_t cur_gid;

> +
> +    cur_uid = geteuid();
> +    cur_gid = getegid();
> +
> +    fd_info->fi_fd = -1;
> +
> +    if (setfsuid(request->data.uid) < 0) {
> +        fd_info->fi_error = errno;
> +        return;
> +    }
> +    if (setfsgid(request->data.gid) < 0) {
> +        fd_info->fi_error = errno;
> +        goto unset_uid;
> +    }
> +
> +    fd_info->fi_fd = open(request->path.path, request->data.flags,
> +                        request->data.mode);
> +
> +    if (fd_info->fi_fd < 0) {
> +        fd_info->fi_error = errno;
> +    } else {
> +        fd_info->fi_error = 0;
> +    }
> +
> +    setfsgid(cur_gid);
> +unset_uid:
> +    setfsuid(cur_uid);
> +}
> +
>  static int chroot_daemonize(int chroot_sock)
>  {
>     sigset_t sigset;
> @@ -276,6 +312,12 @@ int v9fs_chroot(FsContext *fs_ctx)
>                 error = -2;
>             }
>             break;
> +        case T_CREATE:
> +            chroot_do_create(&request, &fd_info);
> +            if (chroot_sendfd(chroot_sock, &fd_info) <= 0) {
> +                error = -2;
> +            }
> +            break;
>         default:
>             break;
>         }
> diff --git a/hw/9pfs/virtio-9p-local.c b/hw/9pfs/virtio-9p-local.c
> index 2376ec2..7f39b40 100644
> --- a/hw/9pfs/virtio-9p-local.c
> +++ b/hw/9pfs/virtio-9p-local.c
> @@ -52,6 +52,23 @@ static int __open(FsContext *fs_ctx, const char *path, int 
> flags)
>     return fd;
>  }
>
> +static int __create(FsContext *fs_ctx, const char *path, int flags,

Please don't use identifiers starting with underscores.

Re: [Qemu-devel] [V3 PATCH 6/8] virtio-9p: Support for creating special files

2011-01-18 Thread Blue Swirl

On Tue, Jan 18, 2011 at 6:26 AM, M. Mohan Kumar  wrote:
> Add both server and client side interfaces to create special files
> (directory, device nodes, links and symbolic links)
>
> Signed-off-by: M. Mohan Kumar 
> ---
>  hw/9pfs/virtio-9p-chroot.c |   84 ++-
>  hw/9pfs/virtio-9p-chroot.h |    2 +
>  hw/9pfs/virtio-9p-local.c  |  141 
> ++--
>  3 files changed, 195 insertions(+), 32 deletions(-)
>
> diff --git a/hw/9pfs/virtio-9p-chroot.c b/hw/9pfs/virtio-9p-chroot.c
> index e7f85e2..92a4917 100644
> --- a/hw/9pfs/virtio-9p-chroot.c
> +++ b/hw/9pfs/virtio-9p-chroot.c
> @@ -193,6 +193,29 @@ static void chroot_do_open(V9fsFileObjectRequest 
> *request, FdInfo *fd_info)
>     }
>  }
>
> +int v9fs_create_special(FsContext *fs_ctx,
> +                V9fsFileObjectRequest *request, int *error)
> +{
> +    int retval;
> +
> +    pthread_mutex_lock(&fs_ctx->chroot_mutex);
> +
> +    *error = 0;
> +    v9fs_write_request(fs_ctx->chroot_socket, request);
> +    retval = qemu_read_full(fs_ctx->chroot_socket, error, sizeof(*error));
> +    if (retval != sizeof(*error)) {
> +        error_report("reading from socket failed: %s", strerror(errno));
> +        exit(1);
> +    }
> +
> +    pthread_mutex_unlock(&fs_ctx->chroot_mutex);
> +    if (*error) {
> +        return -1;
> +    } else {
> +        return 0;
> +    }
> +}
> +
>  /*
>  * Helper routine to create a file and return the file descriptor and
>  * error status in FdInfo structure.
> @@ -229,6 +252,56 @@ unset_uid:
>     setfsuid(cur_uid);
>  }
>
> +/*
> + * Create directory, symbolic link, link, device node and regular files
> + * Similar to create, but it does not return the fd of created object
> + * Returns 0 on success, returns errno on failure
> + */
> +static int chroot_do_create_special(V9fsFileObjectRequest *request)
> +{
> +    int cur_uid, cur_gid;

uid_t cur_uid;
gid_t cur_gid;

> +    int retval, error;
> +
> +    cur_uid = geteuid();
> +    cur_gid = getegid();
> +
> +    if (setfsuid(request->data.uid) < 0) {
> +        return errno;
> +    }
> +    if (setfsgid(request->data.gid) < 0) {
> +        error = errno;
> +        goto unset_uid;
> +    }
> +
> +    switch (request->data.type) {
> +    case T_MKDIR:
> +        retval = mkdir(request->path.path, request->data.mode);
> +        break;
> +    case T_SYMLINK:
> +        retval = symlink(request->path.old_path, request->path.path);
> +        break;
> +    case T_LINK:
> +        retval = link(request->path.old_path, request->path.path);
> +        break;
> +    default:
> +        retval = mknod(request->path.path, request->data.mode,
> +                        request->data.dev);
> +        break;
> +    }
> +
> +    if (retval < 0) {
> +        error = errno;
> +    } else {
> +        error = 0;
> +    }
> +
> +    setfsgid(cur_gid);
> +unset_uid:
> +    setfsuid(cur_uid);
> +
> +    return error;
> +}
> +
>  static int chroot_daemonize(int chroot_sock)
>  {
>     sigset_t sigset;
> @@ -263,7 +336,7 @@ static int chroot_daemonize(int chroot_sock)
>  */
>  int v9fs_chroot(FsContext *fs_ctx)
>  {
> -    int fd_pair[2], pid, chroot_sock, error;
> +    int fd_pair[2], pid, chroot_sock, error, retval;
>     V9fsFileObjectRequest request;
>     FdInfo fd_info;
>
> @@ -318,6 +391,15 @@ int v9fs_chroot(FsContext *fs_ctx)
>                 error = -2;
>             }
>             break;
> +        case T_MKDIR:
> +        case T_SYMLINK:
> +        case T_LINK:
> +        case T_MKNOD:
> +            retval = chroot_do_create_special(&request);
> +            if (qemu_write_full(chroot_sock, &retval, sizeof(retval)) < 0) {
> +                error = -2;
> +            }
> +            break;
>         default:
>             break;
>         }
> diff --git a/hw/9pfs/virtio-9p-chroot.h b/hw/9pfs/virtio-9p-chroot.h
> index f5a2ca0..9a0ba88 100644
> --- a/hw/9pfs/virtio-9p-chroot.h
> +++ b/hw/9pfs/virtio-9p-chroot.h
> @@ -36,5 +36,7 @@ typedef struct V9fsFileObjectRequest
>
>  int v9fs_chroot(FsContext *fs_ctx);
>  int v9fs_request(FsContext *fs_ctx, V9fsFileObjectRequest *or, int *error);
> +int v9fs_create_special(FsContext *fs_ctx,
> +                V9fsFileObjectRequest *request, int *error);
>
>  #endif /* _QEMU_VIRTIO_9P_CHROOT_H */
> diff --git a/hw/9pfs/virtio-9p-local.c b/hw/9pfs/virtio-9p-local.c
> index 7f39b40..08fd67f 100644
> --- a/hw/9pfs/virtio-9p-local.c
> +++ b/hw/9pfs/virtio-9p-local.c
> @@ -69,6 +69,78 @@ static int __create(FsContext *fs_ctx, const char *path, 
> int flags,
>     return fd;
>  }
>
> +static int __mknod(FsContext *fs_ctx, const char *path, FsCred *credp)

underscores

Re: [Qemu-devel] [PATCH] linux-user: Fix possible realloc memory leak

2011-01-18 Thread Stefan Weil


Am 18.01.2011 09:26, schrieb Markus Armbruster:

Stefan Weil  writes:


Extract from "man realloc":
"If realloc() fails the original block is left untouched;
it is not freed or moved."

Fix a possible memory leak (reported by cppcheck).

Cc: Riku Voipio 
Signed-off-by: Stefan Weil 


Sidestep the problem via qemu_realloc() instead?


The same change was applied to bsd-user/elfload.c.

As symbol loading is not essential in most applications,
returning after out-of-memory should be better than
aborting (that's what qemu_realloc does).

Re: [Qemu-devel] Changing the content of target cpu registers

2011-01-18 Thread Blue Swirl

On Tue, Jan 18, 2011 at 9:29 AM, Stefano Bonifazi
 wrote:
> Hi all!
>  I am working on qemu-user (qemu-ppc).
> I'd like to edit the values of target registers during the execution. Can I
> do that by simply changing the content of env->gpr[] or do these only
> contain a copy of the values of the registers?
> In this last case, where are the real values of the target registers stored
> so that by modifying them I can alter the behavior of the target code
> execution?

env->gpr is the canonical location, but the translator assigns TCG
variables to them (cpu_gpr[] in translate.c), so GPR contents may be
cached to these. But when helpers are called or the TB finishes,
env->gpr should be valid again.

Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state

2011-01-18 Thread Jan Kiszka

On 2011-01-18 18:09, Anthony Liguori wrote:
> On 01/18/2011 10:56 AM, Jan Kiszka wrote:
>>
>>> The device model topology is 100% a hidden architectural detail.
>>>  
>> This is true for the sysbus, it is obviously not the case for PCI and
>> similarly discoverable buses. There we have a guest-explorable topology
>> that is currently equivalent to the the qdev layout.
>>
> 
> But we also don't do PCI passthrough so we really haven't even explored 
> how that maps in qdev.  I don't know if qemu-kvm has attempted to 
> qdev-ify it.

It is. And even if it weren't or the current version in qemu-kvm was not
perfect, we need to consider those uses cases now as we are about to
define a generic model for kvm device integration. That's the point of
this discussion.

> 
 Management and analysis tools must be able to traverse the system buses
 and find guest devices this way.

>>> We need to provide a compatible interface to the guest.  If you agree
>>> with my above statements, then you'll also agree that we can do this
>>> without keeping the device model topology stable.
>>>
>>> But we also need to provide a compatible interface to management tools.
>>> Exposing the device model topology as a compatible interface
>>> artificially limits us.  It's far better to provide higher level
>>> supported interfaces to give us the flexibility to change the device
>>> model as we need to.
>>>  
>> How do you want to change qdev to keep the guest and management tool
>> view stable while branching off kvm sub-buses?
> 
> The qdev device model is not a stable interface.  I think that's been 
> clear from the very beginning.

Internals aren't stable, but they should only be changed for a good
reason, specifically when the change may impact the whole set of device
models.

> 
>>   Please propose such
>> extensions so that they can be discussed. IIUC, that would be second
>> relation between qdev and qbus instances besides the physical topology.
>> What further use cases (besides passing kvm_state around) do you have in
>> mind?
>>
> 
> The -device interface is a stable interface.  Right now, you don't 
> specify any type of identifier of the pci bus when you create a PCI 
> device.  It's implied in the interface.

Which only works as along as we expose a single bus. You don't need to
be an oracle to predict that this is not a stable interface.

> 
>>
>>>  
If they create a device on bus X, it
 must never end up on bus Y just because it happens to be KVM-assisted or
 has some other property.

>>> Nope.  This is exactly what should happen.
>>>
>>> 90% of the devices in the device model are not created by management
>>> tools.  They're part of a chipset.  The chipset has well defined
>>> extension points and we provide management interfaces to create devices
>>> on those extension points.  That is, interfaces to create PCI devices.
>>>
>>>  
>> Creating kvm irqchips via the management tool would be one simple way
>> (not the only one, though) to enable/disable kvm assistance for those
>> devices.
>>
> 
> It's automatically created as part of the CPUs or as part of the 
> chipset.  How to enable/disable kvm assistance is a property of the CPU 
> and/or chipset.

If we exclude creation via command line / config files, we could also
pass the kvm_state directly from the machine or chipset setup code and
save us at least the kvm system buses.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH] target-arm: Fix garbage collection of temporaries in Neon emulation.

2011-01-18 Thread Peter Maydell

On 18 January 2011 17:00, Christophe Lyon  wrote:
> On 18.01.2011 16:36, Peter Maydell wrote:
>> Incidentally there are some correctness fixes for the multiply-by-scalar
>> neon insns from the qemu-meego tree which are on my list to push
>> upstream. So you probably aren't getting the right results even if
>> you've managed to shut up qemu's warnings :-)

> And yes I can confirm there are many other wrong results in the
> Neon support, which I am currently fixing.

Please coordinate this with me! I have a big pile of fixes which
I am working through, testing and submitting upstream, so you
are in significant danger of duplicating work, which would be
unfortunate.

-- PMM

Re: [Qemu-devel] [PATCH] target-arm: Fix garbage collection of temporaries in Neon emulation.

2011-01-18 Thread Christophe Lyon

On 18.01.2011 16:36, Peter Maydell wrote:
> Incidentally there are some correctness fixes for the multiply-by-scalar
> neon insns from the qemu-meego tree which are on my list to push
> upstream. So you probably aren't getting the right results even if
> you've managed to shut up qemu's warnings :-)
> 

Actually it did not only shut up qemu's warnings. It was asserting. After 
fixing the asserts, it did warn a lot about resource leakage indeed, which I 
tried to fix with this patch.

And yes I can confirm there are many other wrong results in the Neon support, 
which I am currently fixing.

Christophe.

Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state

2011-01-18 Thread Anthony Liguori


On 01/18/2011 11:20 AM, Jan Kiszka wrote:


Which only works as along as we expose a single bus. You don't need to
be an oracle to predict that this is not a stable interface.
   


Today we only have a very low level factory interface--device creation 
and deletion.


I think we should move to higher level bus factory interfaces.  An 
interface to create a PCI device and to delete PCI devices.  This is the 
only sane way to do hot plug.


This also makes supporting multiple busses a lot more reasonable since 
this factory interface could be a method of the controller.



It's automatically created as part of the CPUs or as part of the
chipset.  How to enable/disable kvm assistance is a property of the CPU
and/or chipset.
 

If we exclude creation via command line / config files, we could also
pass the kvm_state directly from the machine or chipset setup code and
save us at least the kvm system buses.
   


Which is fine in the short term.  This is exactly why we don't want the 
device model to be an ABI.   It gives us the ability to make changes as 
they make sense instead of trying to be perfect from the start (which we 
never will be).


Regards,

Anthony Liguori


Jan

Re: [Qemu-devel] [sparc] Floating point exception issue

2011-01-18 Thread Blue Swirl

On Tue, Jan 18, 2011 at 3:27 PM, Mateusz Loskot  wrote:
> Hi,
>
> Recently, I have reported mysterious issues on NetBSD 5.1
> emulated on SPARC. The whole first thread is here:
>
> http://lists.gnu.org/archive/html/qemu-devel/2011-01/msg01509.html
>
> I decided to investigate the problem deeper and with great help
> from NetBSD folks, I managed to find reproducible test case.
> Initially, it was AWK command:
>
> # echo NaN | awk '{print "test"}'
> awk: floating point exception 8
>  source line number 1
>
> and next it boiled down to simple C program (see below).
> Details of the investigation are archived in the NetBSD Problem
> Report #44389 here:
>
> http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44389
>
>
> Here is final version of the test program which reproduces the problem:
>
> #include 
> #include 
> #include 
> #include 
>
> int is_number(const char *s)
> {
>        double r;
>        char *ep;
>        errno = 0;
>        r = strtod(s, &ep);
>        if (r == HUGE_VAL)
>                printf("X:%g\n", r);
>
>        if (ep == s || r == HUGE_VAL || errno == ERANGE)
>                return 0;
>        while (*ep == ' ' || *ep == '\t' || *ep == '\n')
>                ep++;
>        if (*ep == '\0')
>                return 1;
>        else
>                return 0;
> }
>
> int main(int argc, char **argv)
> {
>        double v;
>
>        if (is_number("NaN")) {
>                printf("is a number\n");
>                v = atof("NaN");
>        } else {
>                printf("not a number\n");
>                v = 0.0;
>        }
>        printf("%.4f\n", v);
>
>        return 0;
> }
>
>
> On NetBSD/SPARC, the program receives SIGFPE:
>
> $ gcc ./nan_test_2.c
>  $ ./a.out
>  [1]   Floating point exception (core dumped) ./a.out
>
> Specifically, it's caused by r == HUGE_VAL condition in
>  if (ep == s || r == HUGE_VAL || errno == ERANGE)
> where r is NaN.
>
> All the signs indicate there is a bug in QEMU.

I'll install 5.1, but on 4.0 which I had installed, the program works fine:
$ ./sigfpe
is a number
nan

Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state

2011-01-18 Thread Alex Williamson

On Tue, 2011-01-18 at 18:08 +0100, Jan Kiszka wrote:
> On 2011-01-18 18:02, Alex Williamson wrote:
> > On Tue, 2011-01-18 at 16:54 +0100, Jan Kiszka wrote:
> >> On 2011-01-18 16:48, Anthony Liguori wrote:
> >>> On 01/18/2011 09:43 AM, Jan Kiszka wrote:
>  On 2011-01-18 16:04, Anthony Liguori wrote:
> 
> > On 01/18/2011 08:28 AM, Jan Kiszka wrote:
> >  
> >> On 2011-01-12 11:31, Jan Kiszka wrote:
> >>
> >>
> >>> Am 12.01.2011 11:22, Avi Kivity wrote:
> >>>
> >>>  
>  On 01/11/2011 03:54 PM, Anthony Liguori wrote:
> 
> 
> > Right, we should introduce a KVMBus that KVM devices are created on.
> > The devices can get at KVMState through the BusState.
> >
> >  
>  There is no kvm bus in a PC (I looked).  We're bending the device 
>  model
>  here because a device is implemented in the kernel and not in
>  userspace.  An implementation detail is magnified beyond all 
>  proportions.
> 
>  An ioapic that is implemented by kvm lives in exactly the same place
>  that the qemu ioapic lives in.  An assigned pci device lives on the 
>  PCI
>  bus, not a KVMBus.  If we need a pointer to KVMState, then we must 
>  find
>  it elsewhere, not through creating imaginary buses that don't exist.
> 
> 
> 
> >>> Exactly.
> >>>
> >>> So we can either "infect" the whole device tree with kvm (or maybe a
> >>> more generic accelerator structure that also deals with Xen) or we 
> >>> need
> >>> to pull the reference inside the device's init function from some 
> >>> global
> >>> service (kvm_get_state).
> >>>
> >>>  
> >> Note that this topic is still waiting for good suggestions, 
> >> specifically
> >> from those who believe in kvm_state references :). This is not only
> >> blocking kvmstate merge but will affect KVM irqchips as well.
> >>
> >> It boils down to how we reasonably pass a kvm_state reference from
> >> machine init code to a sysbus device. I'm probably biased, but I don't
> >> see any way that does not work against the idea of confining access to
> >> kvm_state or breaks device instantiation from the command line or a
> >> config file.
> >>
> >>
> > A KVM device should sit on a KVM specific bus that hangs off of sysbus.
> > It can get to kvm_state through that bus.
> >
> > That bus doesn't get instantiated through qdev so requiring a pointer
> > argument should not be an issue.
> >
> >  
>  This design is in conflict with the requirement to attach KVM-assisted
>  devices also to their home bus, e.g. an assigned PCI device to the PCI
>  bus. We don't support multi-homed qdev devices.
> 
> >>>
> >>> With vfio, would an assigned PCI device even need kvm_state?
> >>
> >> IIUC: Yes, for establishing the irqfd link.
> > 
> > We abstract this through the msi/msix layer though, so the vfio driver
> > doesn't directly know anything about kvm_state.
> 
> Which version/tree are you referring to? It wasn't the case in the last
> version I found on the list.
> 
> Does the msi layer use irqfd for every source in kvm mode then? Of
> course, the key question will be how that layer will once obtain kvm_state.

Looking at "[RFC PATCH v2] VFIO based device assignment" sent on Nov
5th, I guess we do call kvm_set_irqfd.  Maybe I'm just wishing that the
msi layer abstracted it better.  I'd like to be able to pass in a
userspace interrupt handler function pointer and an event notifier fd
and let the interrupt layers worry about how it's hooked up.

Alex

Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state

2011-01-18 Thread Jan Kiszka

On 2011-01-18 18:31, Anthony Liguori wrote:
>>> It's automatically created as part of the CPUs or as part of the
>>> chipset.  How to enable/disable kvm assistance is a property of the CPU
>>> and/or chipset.
>>>  
>> If we exclude creation via command line / config files, we could also
>> pass the kvm_state directly from the machine or chipset setup code and
>> save us at least the kvm system buses.
>>
> 
> Which is fine in the short term.

Unless we want to abuse the pointer property for this, and there was
some resistance, we would have to change the sysbus init function
signature. I don't want to propose this for a short-term workaround, we
need a clearer vision and roadmap to avoid multiple invasive changes to
the device model.

>  This is exactly why we don't want the 
> device model to be an ABI.   It gives us the ability to make changes as 
> they make sense instead of trying to be perfect from the start (which we 
> never will be).

The device model will always consist of a stable part, the guest and
management visible topology. That beast needs to be modeled as well,
likely via some new bus objects. If that's the way to go, starting now
is probably the right time as we have an urgent use case, right?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH] linux-user: Fix possible realloc memory leak

2011-01-18 Thread Markus Armbruster

Stefan Weil  writes:

> Am 18.01.2011 09:26, schrieb Markus Armbruster:
>> Stefan Weil  writes:
>>
>>> Extract from "man realloc":
>>> "If realloc() fails the original block is left untouched;
>>> it is not freed or moved."
>>>
>>> Fix a possible memory leak (reported by cppcheck).
>>>
>>> Cc: Riku Voipio 
>>> Signed-off-by: Stefan Weil 
>>
>> Sidestep the problem via qemu_realloc() instead?
>
> The same change was applied to bsd-user/elfload.c.
>
> As symbol loading is not essential in most applications,
> returning after out-of-memory should be better than
> aborting (that's what qemu_realloc does).

Unless the requested size is *really* large, I'd expect this to stave
off the out-of-memory failure for a few microseconds at best.

Re: [Qemu-devel] Re: Fwd: Re: port-sparc/44389: awk failure during m4 installation with pkgsrc

2011-01-18 Thread Blue Swirl

On Tue, Jan 18, 2011 at 11:40 AM, Mateusz Loskot  wrote:
> Blue Swirl  gmail.com> writes:
>> Mateusz Loskot  loskot.net> wrote:
>> > Hi,
>> >
>> > I emulate SPARC with NetBSD 5.0 installed and I've experienced a problem
>> > with pkgsrc installing one of packages.
>> > I submitted bug report to NetBSD team and received short response
>> > suggesting it is likely QEMU problem (see original message below).
>> >
>> > Shortly, the problem is that AWK throws "floating point exception"
>> > Here is the bug report with details:
>> >
>> > http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44389
>> >
>> > Given the nature of the problem, it "feels" the issue is unrelated to
>> > QEMU, but I'm not on position to judge the suggestion posted by NetBSD
>> > team is pointless.
>> >
>> > I have no idea how to isolate the problem on NetBSD, so I replied to
>> > NetBSD team asking:
>> > ===
>> > Could you suggest how to isolate the problem, find exact awk command
>> > causing the error, so I can debug it or forward to QEMU developers with
>> > details necessary to reproduce it?
>> > ===
>> >
>> > In the meantime, I thought I will attack qemu-devel hoping it may ring
>> > some bells here as well.
>> >
>> > Any ideas?
>>
>> It's entirely possible that the floating point support for Sparc can
>> be buggy, there have been a few fixes to softfloat recently for other
>> architectures.
>>
>> I'd check if NaN handling or floating point exception support works
>> correctly, regular programs don't use those. But a reproducible test
>> case is essential.
>
>
> I think I have managed to isolate reproducible test for this problem
> and it seems the issue is in NetBSD userland.
> See my last update to the problem reprot:
>
> http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44389
>
> Long story short, the problem boils down to the following command:
>
> # echo NaN | awk '{print "test"}'
> awk: floating point exception 8
>  source line number 1
>
> which interprets "NaN" as a number when it isn't asked to.

FYI: also this succeeds on 4.0:
$ echo NaN | awk '{print "test"}'
test

Re: [Qemu-devel] [sparc] Floating point exception issue

2011-01-18 Thread Mateusz Loskot


On 18/01/11 17:36, Blue Swirl wrote:

On Tue, Jan 18, 2011 at 3:27 PM, Mateusz Loskot  wrote:

Hi,

Recently, I have reported mysterious issues on NetBSD 5.1
emulated on SPARC. The whole first thread is here:

http://lists.gnu.org/archive/html/qemu-devel/2011-01/msg01509.html

I decided to investigate the problem deeper and with great help
from NetBSD folks, I managed to find reproducible test case.
Initially, it was AWK command:

# echo NaN | awk '{print "test"}'
awk: floating point exception 8
  source line number 1

and next it boiled down to simple C program (see below).
Details of the investigation are archived in the NetBSD Problem
Report #44389 here:

http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44389


Here is final version of the test program which reproduces the problem:

#include
#include
#include
#include

int is_number(const char *s)
{
double r;
char *ep;
errno = 0;
r = strtod(s,&ep);
if (r == HUGE_VAL)
printf("X:%g\n", r);

if (ep == s || r == HUGE_VAL || errno == ERANGE)
return 0;
while (*ep == ' ' || *ep == '\t' || *ep == '\n')
ep++;
if (*ep == '\0')
return 1;
else
return 0;
}

int main(int argc, char **argv)
{
double v;

if (is_number("NaN")) {
printf("is a number\n");
v = atof("NaN");
} else {
printf("not a number\n");
v = 0.0;
}
printf("%.4f\n", v);

return 0;
}


On NetBSD/SPARC, the program receives SIGFPE:

$ gcc ./nan_test_2.c
  $ ./a.out
  [1]   Floating point exception (core dumped) ./a.out

Specifically, it's caused by r == HUGE_VAL condition in
  if (ep == s || r == HUGE_VAL || errno == ERANGE)
where r is NaN.

All the signs indicate there is a bug in QEMU.


I'll install 5.1, but on 4.0 which I had installed, the program works fine:
$ ./sigfpe
is a number
nan


I just tested on NetBSD 5.0/SPARC under QEMU 0.13 (same version I use 
with NetBSD 5.1/SPARC) and it works well indeed:


mloskot@qemu-netbsd-50-sparc:~/tmp# ./a.out
is a number
nan
mloskot@qemu-netbsd-50-sparc:~/tmp#

Hmm, this is becoming interesting.

I run QEMU 0.13 on Windows Vista (64-bit).
Perhaps host system and QEMU binaries are relevant here.
I will try on Linux host system later tonight.

BTW, here are my images:

http://mateusz.loskot.net/tmp/qemu/

Best regards,
--
Mateusz Loskot, http://mateusz.loskot.net
Charter Member of OSGeo, http://osgeo.org
Member of ACCU, http://accu.org

1 2 >

1 - 100 of 140 matches

Mail list logo