[Qemu-devel] [V3 PATCH 7/8] virtio-9p: Move file post creation changes to none security model
After creating a file object, its permission and ownership details are updated as per client's request for both passthrough and none security model. But with chrooted environment its not required for passthrough security model. Move all post file creation changes to none security model Signed-off-by: M. Mohan Kumar --- hw/9pfs/virtio-9p-local.c | 19 ++- 1 files changed, 6 insertions(+), 13 deletions(-) diff --git a/hw/9pfs/virtio-9p-local.c b/hw/9pfs/virtio-9p-local.c index 08fd67f..d2e32e2 100644 --- a/hw/9pfs/virtio-9p-local.c +++ b/hw/9pfs/virtio-9p-local.c @@ -208,21 +208,14 @@ static int local_set_xattr(const char *path, FsCred *credp) return 0; } -static int local_post_create_passthrough(FsContext *fs_ctx, const char *path, +static int local_post_create_none(FsContext *fs_ctx, const char *path, FsCred *credp) { +int retval; if (chmod(rpath(fs_ctx, path), credp->fc_mode & 0) < 0) { return -1; } -if (lchown(rpath(fs_ctx, path), credp->fc_uid, credp->fc_gid) < 0) { -/* - * If we fail to change ownership and if we are - * using security model none. Ignore the error - */ -if (fs_ctx->fs_sm != SM_NONE) { -return -1; -} -} +retval = lchown(rpath(fs_ctx, path), credp->fc_uid, credp->fc_gid); return 0; } @@ -363,7 +356,7 @@ static int local_mknod(FsContext *fs_ctx, const char *path, FsCred *credp) if (err == -1) { return err; } -err = local_post_create_passthrough(fs_ctx, path, credp); +err = local_post_create_none(fs_ctx, path, credp); if (err == -1) { serrno = errno; goto err_end; @@ -405,7 +398,7 @@ static int local_mkdir(FsContext *fs_ctx, const char *path, FsCred *credp) if (err == -1) { return err; } -err = local_post_create_passthrough(fs_ctx, path, credp); +err = local_post_create_none(fs_ctx, path, credp); if (err == -1) { serrno = errno; goto err_end; @@ -480,7 +473,7 @@ static int local_open2(FsContext *fs_ctx, const char *path, int flags, if (fd == -1) { return fd; } -err = local_post_create_passthrough(fs_ctx, path, credp); +err = local_post_create_none(fs_ctx, path, credp); if (err == -1) { serrno = errno; goto err_end; -- 1.7.3.4
Re: [Qemu-devel] [PATCH] linux-user: Fix possible realloc memory leak
Stefan Weil writes: > Extract from "man realloc": > "If realloc() fails the original block is left untouched; > it is not freed or moved." > > Fix a possible memory leak (reported by cppcheck). > > Cc: Riku Voipio > Signed-off-by: Stefan Weil Sidestep the problem via qemu_realloc() instead?
[Qemu-devel] Re: Errors on MMIO read access on VM suspend / resume operations
On 2011-01-18 04:03, Stefan Berger wrote: > On 01/16/2011 09:43 AM, Avi Kivity wrote: >> On 01/14/2011 09:27 PM, Stefan Berger wrote: >>> Can you sprinkle some printfs() arount kvm_run (in qemu-kvm.c) to verify this? >>> Here's what I did: >>> >>> >>> interrupt exit requested >> >> It appears from this you're using qemu.git. Please try qemu-kvm.git, >> where the code appears to be correct. >> > Cc'ing qemu-devel now. For reference, here the initial problem description: > > http://www.spinics.net/lists/kvm/msg48274.html > > I didn't know there was another tree... > > I have seen now a couple of suspends-while-reading with patches applied > to the qemu-kvm.git tree and indeed, when run with the same host kernel > and VM I do not see the debugging dumps due to double-reads that I would > have anticipated seeing by now. Now what? Can this be easily fixed in > the other Qemu tree as well? Please give this a try: git://git.kiszka.org/qemu-kvm.git queues/kvm-upstream I bet (& hope) "kvm: Unconditionally reenter kernel after IO exits" fixes the issue for you. If other problems pop up with that tree, also try resetting to that particular commit. I'm currently trying to shake all those hidden or forgotten bug fixes out of qemu-kvm and port them upstream. Most of those subtle differences should hopefully soon be history. > > One thing I'd like to mention is that I have seen what I think are > interrupt stalls when running my tests inside the qemu-kvm.git tree > version and not suspending at all. A some point the interrupt counter in > the guest kernel does not increase anymore even though I see the device > model raising the IRQ and lowering it. The same tests run literally > forever in the qemu.git tree version of Qemu. What about qemu-kmv and -no-kvm-irqchip? Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
[Qemu-devel] Re: [PATCH 8/8] ahci: fix !msi interrupts
Hi, Worse might also be that unknown issue that force you to inject an IRQ here. We don't know. That's probably worst. Well, IIRC the issue was that usually a level high interrupt line would simply retrigger an interrupt after enabling the interrupt line in the APIC again. edge triggered interrupts wouldn't though. Unless my memory completely fails on me, that didn't happen, so I added the manual retrigger on a partial command ACK in ahci code. That re-trigger smells like you are not clearing the interrupt line where you should. For starters try calling ahci_check_irq() after guest writes to PORT_IRQ_STAT. cheers, Gerd
Re: [Qemu-devel] [PATCH 2/2] strtosz(): Use suffix macros in switch() statement
On 01/18/11 10:20, Markus Armbruster wrote: >> diff --git a/cutils.c b/cutils.c >> index 328738c..f2c8bbd 100644 >> --- a/cutils.c >> +++ b/cutils.c >> @@ -324,26 +324,26 @@ ssize_t strtosz_suffix(const char *nptr, char **end, >> const char default_suffix) >> } >> } >> switch (toupper(d)) { >> -case 'B': >> +case STRTOSZ_DEFSUFFIX_B: >> mul = 1; >> if (mul_required) { >> goto fail; >> } >> break; >> -case 'K': >> +case STRTOSZ_DEFSUFFIX_KB: >> mul = 1 << 10; >> break; >> case 0: >> if (mul_required) { >> goto fail; >> } >> -case 'M': >> +case STRTOSZ_DEFSUFFIX_MB: >> mul = 1ULL << 20; >> break; >> -case 'G': >> +case STRTOSZ_DEFSUFFIX_GB: >> mul = 1ULL << 30; >> break; >> -case 'T': >> +case STRTOSZ_DEFSUFFIX_TB: >> mul = 1ULL << 40; >> break; >> default: > > And this improves what? Certainly not clarity. > > In my opinion, the STRTOSZ_DEFSUFFIX_TB are useless chaff. Chacun à son > goût. It cuts out lines of code, which is good, and using the macros means the user is less likely to make a type and use a wrong character. It's a taste issue though, I agree! Cheers, Jes
Re: [Qemu-devel] [PATCH 2/2] strtosz(): Use suffix macros in switch() statement
jes.soren...@redhat.com writes: > From: Jes Sorensen > > Signed-off-by: Jes Sorensen > --- > cutils.c | 10 +- > 1 files changed, 5 insertions(+), 5 deletions(-) > > diff --git a/cutils.c b/cutils.c > index 328738c..f2c8bbd 100644 > --- a/cutils.c > +++ b/cutils.c > @@ -324,26 +324,26 @@ ssize_t strtosz_suffix(const char *nptr, char **end, > const char default_suffix) > } > } > switch (toupper(d)) { > -case 'B': > +case STRTOSZ_DEFSUFFIX_B: > mul = 1; > if (mul_required) { > goto fail; > } > break; > -case 'K': > +case STRTOSZ_DEFSUFFIX_KB: > mul = 1 << 10; > break; > case 0: > if (mul_required) { > goto fail; > } > -case 'M': > +case STRTOSZ_DEFSUFFIX_MB: > mul = 1ULL << 20; > break; > -case 'G': > +case STRTOSZ_DEFSUFFIX_GB: > mul = 1ULL << 30; > break; > -case 'T': > +case STRTOSZ_DEFSUFFIX_TB: > mul = 1ULL << 40; > break; > default: And this improves what? Certainly not clarity. In my opinion, the STRTOSZ_DEFSUFFIX_TB are useless chaff. Chacun à son goût.
[Qemu-devel] Changing the content of target cpu registers
Hi all! I am working on qemu-user (qemu-ppc). I'd like to edit the values of target registers during the execution. Can I do that by simply changing the content of env->gpr[] or do these only contain a copy of the values of the registers? In this last case, where are the real values of the target registers stored so that by modifying them I can alter the behavior of the target code execution? Thank you in advance! Stefano B.
[Qemu-devel] Re: MIPS, io-thread, icount and wfi
On 2011-01-18 01:19, Edgar E. Iglesias wrote: > On Mon, Jan 17, 2011 at 11:03:08AM +0100, Edgar E. Iglesias wrote: >> Hi, >> >> I'm running an io-thread enabled qemu-system-mipsel with icount. >> When the guest (linux) goes to sleep through the wait insn (waiting >> to be woken up by future timer interrupts), the thing deadlocks. >> >> IIUC, this is because vm timers are driven by icount, but the CPU is >> halted so icount makes no progress and time stands still. >> >> I've locally disabled vcpu halting when icount is enabled, that >> works around my problem but of course makes qemu consume 100% host cpu. >> >> I don't know why I only see this problem with io-thread builds? >> Could be related timing and luck. >> >> Would be interesting to know if someone has any info on how this was >> intended to work (if it was)? And if there are ideas for better >> workarounds or fixes that don't disable vcpu halting entirely. > > Hi, > > I've found the problem. For some reason io-thread builds use a > static timeout for wait loops. The entire chunk of code that > makes sure qemu_icount makes forward progress when the CPU's > are idle has been ifdef'ed away... > > This fixes the problem for me, hopefully without affecting > io-thread runs without icount. > > commit 0f4f3a919952500b487b438c5520f07a1c6be35b > Author: Edgar E. Iglesias > Date: Tue Jan 18 01:01:57 2011 +0100 > > qemu-timer: Fix timeout calc for io-thread with icount > > Make sure we always make forward progress with qemu_icount to > avoid deadlocks. For io-thread, use the static 1000 timeout > only if icount is disabled. > > Signed-off-by: Edgar E. Iglesias > > diff --git a/qemu-timer.c b/qemu-timer.c > index 95814af..db1ec49 100644 > --- a/qemu-timer.c > +++ b/qemu-timer.c > @@ -110,7 +110,6 @@ static int64_t cpu_get_clock(void) > } > } > > -#ifndef CONFIG_IOTHREAD > static int64_t qemu_icount_delta(void) > { > if (!use_icount) { > @@ -124,7 +123,6 @@ static int64_t qemu_icount_delta(void) > return cpu_get_icount() - cpu_get_clock(); > } > } > -#endif > > /* enable cpu_get_ticks() */ > void cpu_enable_ticks(void) > @@ -1077,9 +1075,17 @@ void quit_timers(void) > > int qemu_calculate_timeout(void) > { > -#ifndef CONFIG_IOTHREAD > int timeout; > > +#ifdef CONFIG_IOTHREAD > +/* When using icount, making forward progress with qemu_icount when the > + guest CPU is idle is critical. We only use the static io-thread > timeout > + for non icount runs. */ > +if (!use_icount) { > +return 1000; > +} > +#endif > + > if (!vm_running) > timeout = 5000; > else { > @@ -1110,8 +1116,5 @@ int qemu_calculate_timeout(void) > } > > return timeout; > -#else /* CONFIG_IOTHREAD */ > -return 1000; > -#endif > } > > > This logic and timeout values were imported on iothread merge. And I bet at least the timeout value of 1s (vs. 5s) can still be found in qemu-kvm. Maybe someone over there can remember the rationales behind choosing this value. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
[Qemu-devel] [PATCH v2 1/3] scsi-disk: Allow overriding SCSI INQUIRY removable bit
Provide the "removable" qdev property bit to override the SCSI INQUIRY removable (RMB) bit for non-CDROM devices. This will be used by USB Mass Storage Devices, which sometimes have this guest-visible bit set and sometimes do not. They therefore requires a means for user configuration. Signed-off-by: Stefan Hajnoczi --- hw/scsi-disk.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c index 6cb317c..db423ca 100644 --- a/hw/scsi-disk.c +++ b/hw/scsi-disk.c @@ -72,6 +72,7 @@ struct SCSIDiskState /* The qemu block layer uses a fixed 512 byte sector size. This is the number of 512 byte blocks in a single scsi sector. */ int cluster_size; +uint32_t removable; uint64_t max_lba; QEMUBH *bh; char *version; @@ -552,6 +553,7 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, uint8_t *outbuf) memcpy(&outbuf[16], "QEMU CD-ROM ", 16); } else { outbuf[0] = 0; +outbuf[1] = s->removable ? 0x80 : 0; memcpy(&outbuf[16], "QEMU HARDDISK ", 16); } memcpy(&outbuf[8], "QEMU", 8); @@ -1295,6 +1297,7 @@ static SCSIDeviceInfo scsi_disk_info = { DEFINE_BLOCK_PROPERTIES(SCSIDiskState, qdev.conf), DEFINE_PROP_STRING("ver", SCSIDiskState, version), DEFINE_PROP_STRING("serial", SCSIDiskState, serial), +DEFINE_PROP_BIT("removable", SCSIDiskState, removable, 1, false), DEFINE_PROP_END_OF_LIST(), }, }; -- 1.7.2.3
[Qemu-devel] [PATCH v2 2/3] scsi: Allow SCSI devices to override the removable bit
Some SCSI devices may wish to override the removable bit. Add support for a qdev property on the SCSI device. Signed-off-by: Stefan Hajnoczi --- hw/pci-hotplug.c |2 +- hw/scsi-bus.c|8 ++-- hw/scsi.h|3 ++- hw/usb-msd.c |2 +- 4 files changed, 10 insertions(+), 5 deletions(-) diff --git a/hw/pci-hotplug.c b/hw/pci-hotplug.c index 716133c..270a982 100644 --- a/hw/pci-hotplug.c +++ b/hw/pci-hotplug.c @@ -90,7 +90,7 @@ static int scsi_hot_add(Monitor *mon, DeviceState *adapter, * specified). */ dinfo->unit = qemu_opt_get_number(dinfo->opts, "unit", -1); -scsidev = scsi_bus_legacy_add_drive(scsibus, dinfo->bdrv, dinfo->unit); +scsidev = scsi_bus_legacy_add_drive(scsibus, dinfo->bdrv, dinfo->unit, false); if (!scsidev) { return -1; } diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c index 7febb86..ceeb4ec 100644 --- a/hw/scsi-bus.c +++ b/hw/scsi-bus.c @@ -87,7 +87,8 @@ void scsi_qdev_register(SCSIDeviceInfo *info) } /* handle legacy '-drive if=scsi,...' cmd line args */ -SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, BlockDriverState *bdrv, int unit) +SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, BlockDriverState *bdrv, + int unit, bool removable) { const char *driver; DeviceState *dev; @@ -95,6 +96,9 @@ SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, BlockDriverState *bdrv, int driver = bdrv_is_sg(bdrv) ? "scsi-generic" : "scsi-disk"; dev = qdev_create(&bus->qbus, driver); qdev_prop_set_uint32(dev, "scsi-id", unit); +if (qdev_prop_exists(dev, "removable")) { +qdev_prop_set_bit(dev, "removable", removable); +} if (qdev_prop_set_drive(dev, "drive", bdrv) < 0) { qdev_free(dev); return NULL; @@ -117,7 +121,7 @@ int scsi_bus_legacy_handle_cmdline(SCSIBus *bus) continue; } qemu_opts_loc_restore(dinfo->opts); -if (!scsi_bus_legacy_add_drive(bus, dinfo->bdrv, unit)) { +if (!scsi_bus_legacy_add_drive(bus, dinfo->bdrv, unit, false)) { res = -1; break; } diff --git a/hw/scsi.h b/hw/scsi.h index bf02adf..846fbba 100644 --- a/hw/scsi.h +++ b/hw/scsi.h @@ -94,7 +94,8 @@ static inline SCSIBus *scsi_bus_from_device(SCSIDevice *d) return DO_UPCAST(SCSIBus, qbus, d->qdev.parent_bus); } -SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, BlockDriverState *bdrv, int unit); +SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, BlockDriverState *bdrv, + int unit, bool removable); int scsi_bus_legacy_handle_cmdline(SCSIBus *bus); SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag, uint32_t lun); diff --git a/hw/usb-msd.c b/hw/usb-msd.c index 0a95d8d..ee897b6 100644 --- a/hw/usb-msd.c +++ b/hw/usb-msd.c @@ -544,7 +544,7 @@ static int usb_msd_initfn(USBDevice *dev) s->dev.speed = USB_SPEED_FULL; scsi_bus_new(&s->bus, &s->dev.qdev, 0, 1, usb_msd_command_complete); -s->scsi_dev = scsi_bus_legacy_add_drive(&s->bus, bs, 0); +s->scsi_dev = scsi_bus_legacy_add_drive(&s->bus, bs, 0, false); if (!s->scsi_dev) { return -1; } -- 1.7.2.3
[Qemu-devel] [PATCH v2 0/3] usb-msd: Add usb-storage, removable=on|off property
Allow overriding the SCSI INQUIRY removable (RMB) bit for scsi-disk and usb-msd devices. In particular this addresses the problem that some usb-msd devices have the bit set while other do not have it set. Now the user can choose and get desired guest behavior. qemu -usb -drive if=none,file=test.img,cache=none,id=disk0 -device usb-storage,drive=disk0,removable=on The default is off. v2: * Rewritten to override the bit at the scsi-disk level hw/pci-hotplug.c |2 +- hw/scsi-bus.c|8 ++-- hw/scsi-disk.c |3 +++ hw/scsi.h|3 ++- hw/usb-msd.c |4 +++- 5 files changed, 15 insertions(+), 5 deletions(-)
[Qemu-devel] Re: MIPS, io-thread, icount and wfi
On 2011-01-18 11:00, Jan Kiszka wrote: > On 2011-01-18 01:19, Edgar E. Iglesias wrote: >> On Mon, Jan 17, 2011 at 11:03:08AM +0100, Edgar E. Iglesias wrote: >>> Hi, >>> >>> I'm running an io-thread enabled qemu-system-mipsel with icount. >>> When the guest (linux) goes to sleep through the wait insn (waiting >>> to be woken up by future timer interrupts), the thing deadlocks. >>> >>> IIUC, this is because vm timers are driven by icount, but the CPU is >>> halted so icount makes no progress and time stands still. >>> >>> I've locally disabled vcpu halting when icount is enabled, that >>> works around my problem but of course makes qemu consume 100% host cpu. >>> >>> I don't know why I only see this problem with io-thread builds? >>> Could be related timing and luck. >>> >>> Would be interesting to know if someone has any info on how this was >>> intended to work (if it was)? And if there are ideas for better >>> workarounds or fixes that don't disable vcpu halting entirely. >> >> Hi, >> >> I've found the problem. For some reason io-thread builds use a >> static timeout for wait loops. The entire chunk of code that >> makes sure qemu_icount makes forward progress when the CPU's >> are idle has been ifdef'ed away... >> >> This fixes the problem for me, hopefully without affecting >> io-thread runs without icount. >> >> commit 0f4f3a919952500b487b438c5520f07a1c6be35b >> Author: Edgar E. Iglesias >> Date: Tue Jan 18 01:01:57 2011 +0100 >> >> qemu-timer: Fix timeout calc for io-thread with icount >> >> Make sure we always make forward progress with qemu_icount to >> avoid deadlocks. For io-thread, use the static 1000 timeout >> only if icount is disabled. >> >> Signed-off-by: Edgar E. Iglesias >> >> diff --git a/qemu-timer.c b/qemu-timer.c >> index 95814af..db1ec49 100644 >> --- a/qemu-timer.c >> +++ b/qemu-timer.c >> @@ -110,7 +110,6 @@ static int64_t cpu_get_clock(void) >> } >> } >> >> -#ifndef CONFIG_IOTHREAD >> static int64_t qemu_icount_delta(void) >> { >> if (!use_icount) { >> @@ -124,7 +123,6 @@ static int64_t qemu_icount_delta(void) >> return cpu_get_icount() - cpu_get_clock(); >> } >> } >> -#endif >> >> /* enable cpu_get_ticks() */ >> void cpu_enable_ticks(void) >> @@ -1077,9 +1075,17 @@ void quit_timers(void) >> >> int qemu_calculate_timeout(void) >> { >> -#ifndef CONFIG_IOTHREAD >> int timeout; >> >> +#ifdef CONFIG_IOTHREAD >> +/* When using icount, making forward progress with qemu_icount when the >> + guest CPU is idle is critical. We only use the static io-thread >> timeout >> + for non icount runs. */ >> +if (!use_icount) { >> +return 1000; >> +} >> +#endif >> + >> if (!vm_running) >> timeout = 5000; >> else { >> @@ -1110,8 +1116,5 @@ int qemu_calculate_timeout(void) >> } >> >> return timeout; >> -#else /* CONFIG_IOTHREAD */ >> -return 1000; >> -#endif >> } >> >> >> > > This logic and timeout values were imported on iothread merge. And I bet > at least the timeout value of 1s (vs. 5s) can still be found in > qemu-kvm. Maybe someone over there can remember the rationales behind > choosing this value. Correction: qemu-kvm does _not_ use a fixed timeout value for the iothread nor the removed code path (as CONFIG_IOTHREAD is off in qemu-kvm, that tree does not even build when you enable it). Still, the reason for once introducing this difference to qemu should be reflected. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
[Qemu-devel] Re: [PATCH v2 3/3] checkpatch: adjust to QEMUisms
On 01/17/2011 08:37 PM, Blue Swirl wrote: On Mon, Jan 17, 2011 at 7:40 AM, Paolo Bonzini wrote: On 01/15/2011 06:45 PM, Blue Swirl wrote: + if ($level == 0&& !$block =~ /^\s*\{/&& !$allowed) { I'm not a Perl expert at all, but I think you need parentheses for the argument of "!": ! has higher precedence than =~: http://perldoc.perl.org/perlop.html#Operator-Precedence-and-Associativity I think that's what I meant. :) if ($level == 0&& !($block =~ /^\s*\{/)&& !$allowed) { Maybe instead: if ($level == 0&& $block !~ /^\s*\{/&& !$allowed) { Yes, this too. Paolo
Re: [Qemu-devel] [PATCH 1/3] mips: Break TBs after mfc0_count
On Tue, Jan 18, 2011 at 12:29:40AM +0100, edgar.igles...@gmail.com wrote: > From: Edgar E. Iglesias > > Break the TB after reading the count register. This makes it > possible to take timer interrupts immediately after a read of > a possibly expired timer. > > Signed-off-by: Edgar E. Iglesias > --- > target-mips/translate.c |4 +++- > 1 files changed, 3 insertions(+), 1 deletions(-) > > diff --git a/target-mips/translate.c b/target-mips/translate.c > index cce77be..313cc29 100644 > --- a/target-mips/translate.c > +++ b/target-mips/translate.c > @@ -3410,8 +3410,10 @@ static void gen_mfc0 (CPUState *env, DisasContext > *ctx, TCGv arg, int reg, int s > gen_helper_mfc0_count(arg); > if (use_icount) { > gen_io_end(); > -ctx->bstate = BS_STOP; > } > +/* Break the TB to be able to take timer interrupts immediately > + after reading count. */ > +ctx->bstate = BS_STOP; > rn = "Count"; > break; > /* 6,7 are implementation dependent */ This looks fine, however it should probably be done the same way for dmfc0 on 64-bit MIPS. -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net
[Qemu-devel] Re: [PATCH 2/3] mips: Break out cpu_mips_timer_expire
On Tue, Jan 18, 2011 at 12:29:41AM +0100, edgar.igles...@gmail.com wrote: > From: Edgar E. Iglesias > > Reorganize for future patches, no functional change. > > Signed-off-by: Edgar E. Iglesias > --- > hw/mips_timer.c | 36 ++-- > 1 files changed, 22 insertions(+), 14 deletions(-) > > diff --git a/hw/mips_timer.c b/hw/mips_timer.c > index e3beee8..8c32087 100644 > --- a/hw/mips_timer.c > +++ b/hw/mips_timer.c > @@ -42,16 +42,6 @@ uint32_t cpu_mips_get_random (CPUState *env) > } > > /* MIPS R4K timer */ > -uint32_t cpu_mips_get_count (CPUState *env) > -{ > -if (env->CP0_Cause & (1 << CP0Ca_DC)) > -return env->CP0_Count; > -else > -return env->CP0_Count + > -(uint32_t)muldiv64(qemu_get_clock(vm_clock), > - TIMER_FREQ, get_ticks_per_sec()); > -} > - > static void cpu_mips_timer_update(CPUState *env) > { > uint64_t now, next; > @@ -64,6 +54,27 @@ static void cpu_mips_timer_update(CPUState *env) > qemu_mod_timer(env->timer, next); > } > > +/* Expire the timer. */ > +static void cpu_mips_timer_expire(CPUState *env) > +{ > +cpu_mips_timer_update(env); > +if (env->insn_flags & ISA_MIPS32R2) { > +env->CP0_Cause |= 1 << CP0Ca_TI; > +} > +qemu_irq_raise(env->irq[(env->CP0_IntCtl >> CP0IntCtl_IPTI) & 0x7]); > +} > + > +uint32_t cpu_mips_get_count (CPUState *env) > +{ > +if (env->CP0_Cause & (1 << CP0Ca_DC)) { > +return env->CP0_Count; > +} else { > +return env->CP0_Count + > +(uint32_t)muldiv64(qemu_get_clock(vm_clock), > + TIMER_FREQ, get_ticks_per_sec()); > +} > +} > + > void cpu_mips_store_count (CPUState *env, uint32_t count) > { > if (env->CP0_Cause & (1 << CP0Ca_DC)) > @@ -116,11 +127,8 @@ static void mips_timer_cb (void *opaque) > the comparator value. Offset the count by one to avoid immediately > retriggering the callback before any virtual time has passed. */ > env->CP0_Count++; > -cpu_mips_timer_update(env); > +cpu_mips_timer_expire(env); > env->CP0_Count--; > -if (env->insn_flags & ISA_MIPS32R2) > -env->CP0_Cause |= 1 << CP0Ca_TI; > -qemu_irq_raise(env->irq[(env->CP0_IntCtl >> CP0IntCtl_IPTI) & 0x7]); > } > > void cpu_mips_clock_init (CPUState *env) > -- > 1.7.2.2 > > Acked-by: Aurelien Jarno -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net
Re: [Qemu-devel] Re: [PATCH 3/3] mips: Expire late timers when reading cp0_count
On Tue, Jan 18, 2011 at 01:33:00AM +0100, Edgar E. Iglesias wrote: > On Tue, Jan 18, 2011 at 12:29:42AM +0100, edgar.igles...@gmail.com wrote: > > From: Edgar E. Iglesias > > > > When reading cp0_count from a timer with a late trigger that should > > already have expired, expire it and raise the timer irq. > > > > This makes it possible for guest code (e.g, Linux) that first read > > cp0_count, then compare it with cp0_compare and check for raised > > timer interrupt lines to run reliably. > > > > Signed-off-by: Edgar E. Iglesias > > Sorry sent the wrong version of this one. It's supposed to be the > following: > > commit 139330de404209528712fd703952c0b5ad4459a1 > Author: Edgar E. Iglesias > Date: Tue Jan 18 00:12:22 2011 +0100 > > mips: Expire late timers when reading cp0_count > > When reading cp0_count from a timer with a late trigger that should > already have expired, expire it and raise the timer irq. > > This makes it possible for guest code (e.g, Linux) that first read > cp0_count, then compare it with cp0_compare and check for raised > timer interrupt lines to run reliably. > > Signed-off-by: Edgar E. Iglesias > > diff --git a/hw/mips_timer.c b/hw/mips_timer.c > index 8c32087..9c95f28 100644 > --- a/hw/mips_timer.c > +++ b/hw/mips_timer.c > @@ -69,9 +69,17 @@ uint32_t cpu_mips_get_count (CPUState *env) > if (env->CP0_Cause & (1 << CP0Ca_DC)) { > return env->CP0_Count; > } else { > +uint64_t now; > + > +now = qemu_get_clock(vm_clock); > +if (qemu_timer_pending(env->timer) > +&& qemu_timer_expired(env->timer, now)) { > +/* The timer has already expired. */ > +cpu_mips_timer_expire(env); > +} > + > return env->CP0_Count + > -(uint32_t)muldiv64(qemu_get_clock(vm_clock), > - TIMER_FREQ, get_ticks_per_sec()); > +(uint32_t)muldiv64(now, TIMER_FREQ, get_ticks_per_sec()); > } > } > Given the TB is now ended after this instruction (due to patch 1), isn't the interrupt handled before starting the next TB, where the interrupt line (I guess CP0_Cause) read? -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net
Re: [Qemu-devel] Re: [PATCH 3/3] mips: Expire late timers when reading cp0_count
On Tue, Jan 18, 2011 at 11:36:25AM +0100, Aurelien Jarno wrote: > On Tue, Jan 18, 2011 at 01:33:00AM +0100, Edgar E. Iglesias wrote: > > On Tue, Jan 18, 2011 at 12:29:42AM +0100, edgar.igles...@gmail.com wrote: > > > From: Edgar E. Iglesias > > > > > > When reading cp0_count from a timer with a late trigger that should > > > already have expired, expire it and raise the timer irq. > > > > > > This makes it possible for guest code (e.g, Linux) that first read > > > cp0_count, then compare it with cp0_compare and check for raised > > > timer interrupt lines to run reliably. > > > > > > Signed-off-by: Edgar E. Iglesias > > > > Sorry sent the wrong version of this one. It's supposed to be the > > following: > > > > commit 139330de404209528712fd703952c0b5ad4459a1 > > Author: Edgar E. Iglesias > > Date: Tue Jan 18 00:12:22 2011 +0100 > > > > mips: Expire late timers when reading cp0_count > > > > When reading cp0_count from a timer with a late trigger that should > > already have expired, expire it and raise the timer irq. > > > > This makes it possible for guest code (e.g, Linux) that first read > > cp0_count, then compare it with cp0_compare and check for raised > > timer interrupt lines to run reliably. > > > > Signed-off-by: Edgar E. Iglesias > > > > diff --git a/hw/mips_timer.c b/hw/mips_timer.c > > index 8c32087..9c95f28 100644 > > --- a/hw/mips_timer.c > > +++ b/hw/mips_timer.c > > @@ -69,9 +69,17 @@ uint32_t cpu_mips_get_count (CPUState *env) > > if (env->CP0_Cause & (1 << CP0Ca_DC)) { > > return env->CP0_Count; > > } else { > > +uint64_t now; > > + > > +now = qemu_get_clock(vm_clock); > > +if (qemu_timer_pending(env->timer) > > +&& qemu_timer_expired(env->timer, now)) { > > +/* The timer has already expired. */ > > +cpu_mips_timer_expire(env); > > +} > > + > > return env->CP0_Count + > > -(uint32_t)muldiv64(qemu_get_clock(vm_clock), > > - TIMER_FREQ, get_ticks_per_sec()); > > +(uint32_t)muldiv64(now, TIMER_FREQ, get_ticks_per_sec()); > > } > > } > > > > Given the TB is now ended after this instruction (due to patch 1), isn't > the interrupt handled before starting the next TB, where the interrupt > line (I guess CP0_Cause) read? Hi, The problem here is different. Due to host timing granularity, the timer might expire later than it's precise scheduled time. If that happens, get_count will return a count value that goes beyond the trigger time but the interrupt may come later (when the host timer expires). This patch catches that case and expires the timer in-band, raising the timer interrupt if needed. Cheers
Re: [Qemu-devel] [PATCH 1/3] mips: Break TBs after mfc0_count
On Tue, Jan 18, 2011 at 11:34:28AM +0100, Aurelien Jarno wrote: > On Tue, Jan 18, 2011 at 12:29:40AM +0100, edgar.igles...@gmail.com wrote: > > From: Edgar E. Iglesias > > > > Break the TB after reading the count register. This makes it > > possible to take timer interrupts immediately after a read of > > a possibly expired timer. > > > > Signed-off-by: Edgar E. Iglesias > > --- > > target-mips/translate.c |4 +++- > > 1 files changed, 3 insertions(+), 1 deletions(-) > > > > diff --git a/target-mips/translate.c b/target-mips/translate.c > > index cce77be..313cc29 100644 > > --- a/target-mips/translate.c > > +++ b/target-mips/translate.c > > @@ -3410,8 +3410,10 @@ static void gen_mfc0 (CPUState *env, DisasContext > > *ctx, TCGv arg, int reg, int s > > gen_helper_mfc0_count(arg); > > if (use_icount) { > > gen_io_end(); > > -ctx->bstate = BS_STOP; > > } > > +/* Break the TB to be able to take timer interrupts immediately > > + after reading count. */ > > +ctx->bstate = BS_STOP; > > rn = "Count"; > > break; > > /* 6,7 are implementation dependent */ > > This looks fine, however it should probably be done the same way for > dmfc0 on 64-bit MIPS. You're right, I'll fix that. Thanks.
Re: [Qemu-devel] Re: [PATCH 3/3] mips: Expire late timers when reading cp0_count
On Tue, Jan 18, 2011 at 11:41:54AM +0100, Edgar E. Iglesias wrote: > On Tue, Jan 18, 2011 at 11:36:25AM +0100, Aurelien Jarno wrote: > > On Tue, Jan 18, 2011 at 01:33:00AM +0100, Edgar E. Iglesias wrote: > > > On Tue, Jan 18, 2011 at 12:29:42AM +0100, edgar.igles...@gmail.com wrote: > > > > From: Edgar E. Iglesias > > > > > > > > When reading cp0_count from a timer with a late trigger that should > > > > already have expired, expire it and raise the timer irq. > > > > > > > > This makes it possible for guest code (e.g, Linux) that first read > > > > cp0_count, then compare it with cp0_compare and check for raised > > > > timer interrupt lines to run reliably. > > > > > > > > Signed-off-by: Edgar E. Iglesias > > > > > > Sorry sent the wrong version of this one. It's supposed to be the > > > following: > > > > > > commit 139330de404209528712fd703952c0b5ad4459a1 > > > Author: Edgar E. Iglesias > > > Date: Tue Jan 18 00:12:22 2011 +0100 > > > > > > mips: Expire late timers when reading cp0_count > > > > > > When reading cp0_count from a timer with a late trigger that should > > > already have expired, expire it and raise the timer irq. > > > > > > This makes it possible for guest code (e.g, Linux) that first read > > > cp0_count, then compare it with cp0_compare and check for raised > > > timer interrupt lines to run reliably. > > > > > > Signed-off-by: Edgar E. Iglesias > > > > > > diff --git a/hw/mips_timer.c b/hw/mips_timer.c > > > index 8c32087..9c95f28 100644 > > > --- a/hw/mips_timer.c > > > +++ b/hw/mips_timer.c > > > @@ -69,9 +69,17 @@ uint32_t cpu_mips_get_count (CPUState *env) > > > if (env->CP0_Cause & (1 << CP0Ca_DC)) { > > > return env->CP0_Count; > > > } else { > > > +uint64_t now; > > > + > > > +now = qemu_get_clock(vm_clock); > > > +if (qemu_timer_pending(env->timer) > > > +&& qemu_timer_expired(env->timer, now)) { > > > +/* The timer has already expired. */ > > > +cpu_mips_timer_expire(env); > > > +} > > > + > > > return env->CP0_Count + > > > -(uint32_t)muldiv64(qemu_get_clock(vm_clock), > > > - TIMER_FREQ, get_ticks_per_sec()); > > > +(uint32_t)muldiv64(now, TIMER_FREQ, get_ticks_per_sec()); > > > } > > > } > > > > > > > Given the TB is now ended after this instruction (due to patch 1), isn't > > the interrupt handled before starting the next TB, where the interrupt > > line (I guess CP0_Cause) read? > > Hi, > > The problem here is different. Due to host timing granularity, the > timer might expire later than it's precise scheduled time. If that > happens, get_count will return a count value that goes beyond the > trigger time but the interrupt may come later (when the host timer > expires). > > This patch catches that case and expires the timer in-band, raising > the timer interrupt if needed. > Ok, thanks for the explanation. Acked-by: Aurelien Jarno -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net
[Qemu-devel] [PATCH 3/3] qdev-properties: add PROP_TYPE_ENUM
Example usage: EnumTable foo_enum_table[] = { {"bar", 1}, {"buz", 2}, {NULL, 0}, }; DEFINE_PROP_ENUM("foo", State, foo, 1, foo_enum_table) When using qemu -device foodev,? it will appear as: foodev.foo=bar/buz --- hw/qdev-properties.c | 60 ++ hw/qdev.h| 15 2 files changed, 75 insertions(+), 0 deletions(-) diff --git a/hw/qdev-properties.c b/hw/qdev-properties.c index a493087..3157721 100644 --- a/hw/qdev-properties.c +++ b/hw/qdev-properties.c @@ -63,6 +63,66 @@ PropertyInfo qdev_prop_bit = { .print = print_bit, }; +/* --- Enumeration --- */ +/* Example usage: +EnumTable foo_enum_table[] = { +{"bar", 1}, +{"buz", 2}, +{NULL, 0}, +}; +DEFINE_PROP_ENUM("foo", State, foo, 1, foo_enum_table), + */ +static int parse_enum(DeviceState *dev, Property *prop, const char *str) +{ +uint8_t *ptr = qdev_get_prop_ptr(dev, prop); +EnumTable *option = (EnumTable*)prop->data; + +while (option->name != NULL) { +if (!strncmp(str, option->name, strlen(option->name))) { +*ptr = option->value; +return 0; +} +option++; +} +return -EINVAL; +} + +static int print_enum(DeviceState *dev, Property *prop, char *dest, size_t len) +{ +uint32_t *p = qdev_get_prop_ptr(dev, prop); +EnumTable *option = (EnumTable*)prop->data; +while (option->name != NULL) { +if (*p == option->value) { +return snprintf(dest, len, "%s", option->name); +} +option++; +} +return 0; +} + +static int print_enum_options(DeviceInfo *info, Property *prop, char *dest, size_t len) +{ +int ret = 0; +EnumTable *option = (EnumTable*)prop->data; +while (option->name != NULL) { +ret += snprintf(dest + ret, len - ret, "%s", option->name); +if (option[1].name != NULL) { +ret += snprintf(dest + ret, len - ret, "/"); +} +option++; +} +return ret; +} + +PropertyInfo qdev_prop_enum = { +.name = "enum", +.type = PROP_TYPE_ENUM, +.size = sizeof(uint32_t), +.parse = parse_enum, +.print = print_enum, +.print_options = print_enum_options, +}; + /* --- 8bit integer --- */ static int parse_uint8(DeviceState *dev, Property *prop, const char *str) diff --git a/hw/qdev.h b/hw/qdev.h index f6d5279..26b3d9e 100644 --- a/hw/qdev.h +++ b/hw/qdev.h @@ -102,6 +102,7 @@ enum PropertyType { PROP_TYPE_VLAN, PROP_TYPE_PTR, PROP_TYPE_BIT, +PROP_TYPE_ENUM, }; struct PropertyInfo { @@ -121,6 +122,11 @@ typedef struct GlobalProperty { QTAILQ_ENTRY(GlobalProperty) next; } GlobalProperty; +typedef struct EnumTable { +const char *name; +uint32_tvalue; +} EnumTable; + /*** Board API. This should go away once we have a machine config file. ***/ DeviceState *qdev_create(BusState *bus, const char *name); @@ -237,6 +243,7 @@ extern PropertyInfo qdev_prop_drive; extern PropertyInfo qdev_prop_netdev; extern PropertyInfo qdev_prop_vlan; extern PropertyInfo qdev_prop_pci_devfn; +extern PropertyInfo qdev_prop_enum; #define DEFINE_PROP(_name, _state, _field, _prop, _type) { \ .name = (_name),\ @@ -259,6 +266,14 @@ extern PropertyInfo qdev_prop_pci_devfn; + type_check(uint32_t,typeof_field(_state, _field)), \ .defval= (bool[]) { (_defval) }, \ } +#define DEFINE_PROP_ENUM(_name, _state, _field, _defval, _options) {\ +.name = (_name), \ +.info = &(qdev_prop_enum), \ +.offset= offsetof(_state, _field) \ ++ type_check(uint32_t,typeof_field(_state, _field)),\ +.defval= (uint32_t[]) { (_defval) },\ +.data = (void*)(_options), \ +} #define DEFINE_PROP_UINT8(_n, _s, _f, _d) \ DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_uint8, uint8_t) -- 1.7.3.4
[Qemu-devel] [PATCH 0/3] add PROP_TYPE_ENUM and print_options callback
This patchset allows a new property type called PROP_TYPE_ENUM, I want to use it for the backend property in the ccid patches (will send the patchset that uses it after this), libvirt is the ultimate user. The first patch adds a print_options callback that works with this property type to print the optional values. The second patch allows storing the name/value mapping in the property, using a void ptr for later different uses. The third patch adds the property itself. Alon Levy (3): qdev: add print_options callback qdev: add data pointer to Property qdev-properties: add PROP_TYPE_ENUM hw/qdev-properties.c | 60 ++ hw/qdev.c| 10 +++- hw/qdev.h| 17 ++ 3 files changed, 86 insertions(+), 1 deletions(-) -- 1.7.3.4
[Qemu-devel] [PATCH 2/3] qdev: add data pointer to Property
For later use by PROP_TYPE_ENUM, will store enumeration name/value table there. --- hw/qdev.h |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/hw/qdev.h b/hw/qdev.h index 530fc5d..f6d5279 100644 --- a/hw/qdev.h +++ b/hw/qdev.h @@ -83,6 +83,7 @@ struct Property { int offset; int bitnr; void *defval; +void *data; }; enum PropertyType { -- 1.7.3.4
[Qemu-devel] [PATCH 1/3] qdev: add print_options callback
another callback added to PropertyInfo, for later use by PROP_TYPE_ENUM. Allows printing of runtime computed options when doing: qemu -device foo,? --- hw/qdev.c | 10 +- hw/qdev.h |1 + 2 files changed, 10 insertions(+), 1 deletions(-) diff --git a/hw/qdev.c b/hw/qdev.c index 5b8d374..d1550b9 100644 --- a/hw/qdev.c +++ b/hw/qdev.c @@ -188,7 +188,15 @@ int qdev_device_help(QemuOpts *opts) if (!prop->info->parse) { continue; /* no way to set it, don't show */ } -error_printf("%s.%s=%s\n", info->name, prop->name, prop->info->name); +if (prop->info->print_options) { +char buf[256]; +int ret; +ret = prop->info->print_options(info, prop, buf, sizeof(buf) - 3); +error_printf("%s.%s=%s%s\n", info->name, prop->name, buf, +ret == sizeof(buf) - 3 ? "..." : "" ); +} else { +error_printf("%s.%s=%s\n", info->name, prop->name, prop->info->name); +} } return 1; } diff --git a/hw/qdev.h b/hw/qdev.h index e520aaa..530fc5d 100644 --- a/hw/qdev.h +++ b/hw/qdev.h @@ -109,6 +109,7 @@ struct PropertyInfo { enum PropertyType type; int (*parse)(DeviceState *dev, Property *prop, const char *str); int (*print)(DeviceState *dev, Property *prop, char *dest, size_t len); +int (*print_options)(DeviceInfo *info, Property *prop, char *dest, size_t len); void (*free)(DeviceState *dev, Property *prop); }; -- 1.7.3.4
Re: [Qemu-devel] [PATCH v2 1/3] scsi-disk: Allow overriding SCSI INQUIRY removable bit
Am 18.01.2011 11:10, schrieb Stefan Hajnoczi: > Provide the "removable" qdev property bit to override the SCSI INQUIRY > removable (RMB) bit for non-CDROM devices. This will be used by USB > Mass Storage Devices, which sometimes have this guest-visible bit set > and sometimes do not. They therefore requires a means for user > configuration. > > Signed-off-by: Stefan Hajnoczi Should we print an error message when the user tries to make a CD-ROM non-removable instead of silently ignoring the option? Kevin
[Qemu-devel] Re: Fwd: Re: port-sparc/44389: awk failure during m4 installation with pkgsrc
Blue Swirl gmail.com> writes: > Mateusz Loskot loskot.net> wrote: > > Hi, > > > > I emulate SPARC with NetBSD 5.0 installed and I've experienced a problem > > with pkgsrc installing one of packages. > > I submitted bug report to NetBSD team and received short response > > suggesting it is likely QEMU problem (see original message below). > > > > Shortly, the problem is that AWK throws "floating point exception" > > Here is the bug report with details: > > > > http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44389 > > > > Given the nature of the problem, it "feels" the issue is unrelated to > > QEMU, but I'm not on position to judge the suggestion posted by NetBSD > > team is pointless. > > > > I have no idea how to isolate the problem on NetBSD, so I replied to > > NetBSD team asking: > > === > > Could you suggest how to isolate the problem, find exact awk command > > causing the error, so I can debug it or forward to QEMU developers with > > details necessary to reproduce it? > > === > > > > In the meantime, I thought I will attack qemu-devel hoping it may ring > > some bells here as well. > > > > Any ideas? > > It's entirely possible that the floating point support for Sparc can > be buggy, there have been a few fixes to softfloat recently for other > architectures. > > I'd check if NaN handling or floating point exception support works > correctly, regular programs don't use those. But a reproducible test > case is essential. I think I have managed to isolate reproducible test for this problem and it seems the issue is in NetBSD userland. See my last update to the problem reprot: http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44389 Long story short, the problem boils down to the following command: # echo NaN | awk '{print "test"}' awk: floating point exception 8 source line number 1 which interprets "NaN" as a number when it isn't asked to. Best regards, -- Mateusz Loskot, http://mateusz.loskot.net Charter Member of OSGeo, http://osgeo.org Member of ACCU, http://accu.org
Re: [Qemu-devel] [PATCH 1/3] mips: Break TBs after mfc0_count
On Tue, Jan 18, 2011 at 11:34:28AM +0100, Aurelien Jarno wrote: > On Tue, Jan 18, 2011 at 12:29:40AM +0100, edgar.igles...@gmail.com wrote: > > From: Edgar E. Iglesias > > > > Break the TB after reading the count register. This makes it > > possible to take timer interrupts immediately after a read of > > a possibly expired timer. > > > > Signed-off-by: Edgar E. Iglesias > > --- > > target-mips/translate.c |4 +++- > > 1 files changed, 3 insertions(+), 1 deletions(-) > > > > diff --git a/target-mips/translate.c b/target-mips/translate.c > > index cce77be..313cc29 100644 > > --- a/target-mips/translate.c > > +++ b/target-mips/translate.c > > @@ -3410,8 +3410,10 @@ static void gen_mfc0 (CPUState *env, DisasContext > > *ctx, TCGv arg, int reg, int s > > gen_helper_mfc0_count(arg); > > if (use_icount) { > > gen_io_end(); > > -ctx->bstate = BS_STOP; > > } > > +/* Break the TB to be able to take timer interrupts immediately > > + after reading count. */ > > +ctx->bstate = BS_STOP; > > rn = "Count"; > > break; > > /* 6,7 are implementation dependent */ > > This looks fine, however it should probably be done the same way for > dmfc0 on 64-bit MIPS. Thanks for the quick review, I've pushed the series with this suggested change. Cheers
[Qemu-devel] Re: [PATCH v2 0/5] iohandlers: Add support for enabling/disabling individual handlers
On (Mon) Jan 17 2011 [08:57:16], Anthony Liguori wrote: > >Also -- this patchset was prompted by a bug in qemu chardevs that > >freezes guests if they write faster than the chardevs can consume. > >What should the strategy on fixing that bug be? > > Fix the loop API such that we're not just fixing one bug but that we > can address a bunch of other bugs that are out there. But what's the right fix? * Pull in glib or some other library * Mimic API from some other library Amit
[Qemu-devel] [PATCH v3 0/3] qcow2 metadata cache
block-queue turned out to be too big effort to be useful for quickly fixing the performance problems that qcow2 got since we introduced the metadata flushes. While I still think the idea is right, it needs more time and qcow2 doesn't have more time. Let's come back to block-queue later when the most urgent qcow2 problems are fixed. So this is the idea of block-queue applied to the very specific case of qcow2. Whereas block-queue tried to be a generic solution for all kind of things and tried to make all writes asynchronous at the same time, this is only about batching writes to refcount blocks and L2 tables in qcow2 and getting the dependencies right. (Yes, the L1 table and refcount table is left alone. They are almost never written to anyway.) This should be much easier to understand and review, and I myself feel a bit more confident about it than with block-queue, too. v1: - Don't read newly allocated tables from the disk before memsetting them to zero v2: - Addressed Stefan's review comments - Added patch 3 to avoid an unnecessary bdrv_flush after COW v3: - Some error path fixes (esp. missing or double qcow2_cache_put) Kevin Wolf (3): qcow2: Add QcowCache qcow2: Use QcowCache qcow2: Batch flushes for COW Makefile.objs |2 +- block/qcow2-cache.c| 304 block/qcow2-cluster.c | 206 - block/qcow2-refcount.c | 249 block/qcow2.c | 48 +++- block/qcow2.h | 32 +- 6 files changed, 546 insertions(+), 295 deletions(-) create mode 100644 block/qcow2-cache.c -- 1.7.2.3
[Qemu-devel] [PATCH v3 1/3] qcow2: Add QcowCache
This adds some new cache functions to qcow2 which can be used for caching refcount blocks and L2 tables. When used with cache=writethrough they work like the old caching code which is spread all over qcow2, so for this case we have merely a cleanup. The interesting case is with writeback caching (this includes cache=none) where data isn't written to disk immediately but only kept in cache initially. This leads to some form of metadata write batching which avoids the current "write to refcount block, flush, write to L2 table" pattern for each single request when a lot of cluster allocations happen. Instead, cache entries are only written out if its required to maintain the right order. In the pure cluster allocation case this means that all metadata updates for requests are done in memory initially and on sync, first the refcount blocks are written to disk, then fsync, then L2 tables. This improves performance of scenarios with lots of cluster allocations noticably (e.g. installation or after taking a snapshot). Signed-off-by: Kevin Wolf --- Makefile.objs |2 +- block/qcow2-cache.c | 290 +++ block/qcow2.h | 19 3 files changed, 310 insertions(+), 1 deletions(-) create mode 100644 block/qcow2-cache.c diff --git a/Makefile.objs b/Makefile.objs index c3e52c5..65889c9 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -19,7 +19,7 @@ block-obj-$(CONFIG_POSIX) += posix-aio-compat.o block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat.o -block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o +block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o block-nested-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o block-nested-y += qed-check.o block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c new file mode 100644 index 000..f7c4e2a --- /dev/null +++ b/block/qcow2-cache.c @@ -0,0 +1,290 @@ +/* + * L2/refcount table cache for the QCOW2 format + * + * Copyright (c) 2010 Kevin Wolf + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. + */ + +#include "block_int.h" +#include "qemu-common.h" +#include "qcow2.h" + +typedef struct Qcow2CachedTable { +void* table; +int64_t offset; +booldirty; +int cache_hits; +int ref; +} Qcow2CachedTable; + +struct Qcow2Cache { +int size; +Qcow2CachedTable* entries; +struct Qcow2Cache* depends; +boolwritethrough; +}; + +Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables, +bool writethrough) +{ +BDRVQcowState *s = bs->opaque; +Qcow2Cache *c; +int i; + +c = qemu_mallocz(sizeof(*c)); +c->size = num_tables; +c->entries = qemu_mallocz(sizeof(*c->entries) * num_tables); +c->writethrough = writethrough; + +for (i = 0; i < c->size; i++) { +c->entries[i].table = qemu_blockalign(bs, s->cluster_size); +} + +return c; +} + +int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c) +{ +int i; + +for (i = 0; i < c->size; i++) { +assert(c->entries[i].ref == 0); +qemu_vfree(c->entries[i].table); +} + +qemu_free(c->entries); +qemu_free(c); + +return 0; +} + +static int qcow2_cache_flush_dependency(BlockDriverState *bs, Qcow2Cache *c) +{ +int ret; + +ret = qcow2_cache_flush(bs, c->depends); +if (ret < 0) { +return ret; +} + +c->depends = NULL; +return 0; +} + +static int qcow2_cache_entry_flush(BlockDriverState *bs, Qcow2Cache *c, int i) +{ +BDRVQcowState *s = bs->opaque; +int ret; + +if (!c->entries[i].dirty || !c->entries[i].offset) { +return 0; +} + +if (c->depends) { +ret = qcow2_cache_flush_dependency(
[Qemu-devel] [PATCH v3 3/3] qcow2: Batch flushes for COW
qcow2 calls bdrv_flush() after performing COW in order to ensure that the L2 table change is never written before the copy is safe on disk. Now that the L2 table is cached, we can wait with flushing until we write out the next L2 table. Signed-off-by: Kevin Wolf --- block/qcow2-cache.c | 20 +--- block/qcow2-cluster.c |2 +- block/qcow2.h |1 + 3 files changed, 19 insertions(+), 4 deletions(-) diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c index f7c4e2a..57626c1 100644 --- a/block/qcow2-cache.c +++ b/block/qcow2-cache.c @@ -38,6 +38,7 @@ struct Qcow2Cache { int size; Qcow2CachedTable* entries; struct Qcow2Cache* depends; +booldepends_on_flush; boolwritethrough; }; @@ -85,13 +86,15 @@ static int qcow2_cache_flush_dependency(BlockDriverState *bs, Qcow2Cache *c) } c->depends = NULL; +c->depends_on_flush = false; + return 0; } static int qcow2_cache_entry_flush(BlockDriverState *bs, Qcow2Cache *c, int i) { BDRVQcowState *s = bs->opaque; -int ret; +int ret = 0; if (!c->entries[i].dirty || !c->entries[i].offset) { return 0; @@ -99,11 +102,17 @@ static int qcow2_cache_entry_flush(BlockDriverState *bs, Qcow2Cache *c, int i) if (c->depends) { ret = qcow2_cache_flush_dependency(bs, c); -if (ret < 0) { -return ret; +} else if (c->depends_on_flush) { +ret = bdrv_flush(bs->file); +if (ret >= 0) { +c->depends_on_flush = false; } } +if (ret < 0) { +return ret; +} + ret = bdrv_pwrite(bs->file, c->entries[i].offset, c->entries[i].table, s->cluster_size); if (ret < 0) { @@ -161,6 +170,11 @@ int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c, return 0; } +void qcow2_cache_depends_on_flush(Qcow2Cache *c) +{ +c->depends_on_flush = true; +} + static int qcow2_cache_find_entry_to_replace(Qcow2Cache *c) { int i; diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c index 5eaf8e2..10c40ba 100644 --- a/block/qcow2-cluster.c +++ b/block/qcow2-cluster.c @@ -638,7 +638,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m) * handled. */ if (cow) { -bdrv_flush(bs->file); +qcow2_cache_depends_on_flush(s->l2_table_cache); } qcow2_cache_set_dependency(bs, s->l2_table_cache, s->refcount_block_cache); diff --git a/block/qcow2.h b/block/qcow2.h index 11cbce3..6d80120 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -229,6 +229,7 @@ void qcow2_cache_entry_mark_dirty(Qcow2Cache *c, void *table); int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c); int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c, Qcow2Cache *dependency); +void qcow2_cache_depends_on_flush(Qcow2Cache *c); int qcow2_cache_get(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset, void **table); -- 1.7.2.3
[Qemu-devel] [PATCH v3 2/3] qcow2: Use QcowCache
Use the new functions of qcow2-cache.c for everything that works on refcount block and L2 tables. Signed-off-by: Kevin Wolf --- block/qcow2-cluster.c | 206 ++-- block/qcow2-refcount.c | 249 +++- block/qcow2.c | 48 +- block/qcow2.h | 12 ++- 4 files changed, 221 insertions(+), 294 deletions(-) diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c index c3ef550..5eaf8e2 100644 --- a/block/qcow2-cluster.c +++ b/block/qcow2-cluster.c @@ -67,7 +67,11 @@ int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size) qemu_free(new_l1_table); return new_l1_table_offset; } -bdrv_flush(bs->file); + +ret = qcow2_cache_flush(bs, s->refcount_block_cache); +if (ret < 0) { +return ret; +} BLKDBG_EVENT(bs->file, BLKDBG_L1_GROW_WRITE_TABLE); for(i = 0; i < s->l1_size; i++) @@ -98,63 +102,6 @@ int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size) return ret; } -void qcow2_l2_cache_reset(BlockDriverState *bs) -{ -BDRVQcowState *s = bs->opaque; - -memset(s->l2_cache, 0, s->l2_size * L2_CACHE_SIZE * sizeof(uint64_t)); -memset(s->l2_cache_offsets, 0, L2_CACHE_SIZE * sizeof(uint64_t)); -memset(s->l2_cache_counts, 0, L2_CACHE_SIZE * sizeof(uint32_t)); -} - -static inline int l2_cache_new_entry(BlockDriverState *bs) -{ -BDRVQcowState *s = bs->opaque; -uint32_t min_count; -int min_index, i; - -/* find a new entry in the least used one */ -min_index = 0; -min_count = 0x; -for(i = 0; i < L2_CACHE_SIZE; i++) { -if (s->l2_cache_counts[i] < min_count) { -min_count = s->l2_cache_counts[i]; -min_index = i; -} -} -return min_index; -} - -/* - * seek_l2_table - * - * seek l2_offset in the l2_cache table - * if not found, return NULL, - * if found, - * increments the l2 cache hit count of the entry, - * if counter overflow, divide by two all counters - * return the pointer to the l2 cache entry - * - */ - -static uint64_t *seek_l2_table(BDRVQcowState *s, uint64_t l2_offset) -{ -int i, j; - -for(i = 0; i < L2_CACHE_SIZE; i++) { -if (l2_offset == s->l2_cache_offsets[i]) { -/* increment the hit count */ -if (++s->l2_cache_counts[i] == 0x) { -for(j = 0; j < L2_CACHE_SIZE; j++) { -s->l2_cache_counts[j] >>= 1; -} -} -return s->l2_cache + (i << s->l2_bits); -} -} -return NULL; -} - /* * l2_load * @@ -169,33 +116,12 @@ static int l2_load(BlockDriverState *bs, uint64_t l2_offset, uint64_t **l2_table) { BDRVQcowState *s = bs->opaque; -int min_index; int ret; -/* seek if the table for the given offset is in the cache */ - -*l2_table = seek_l2_table(s, l2_offset); -if (*l2_table != NULL) { -return 0; -} - -/* not found: load a new entry in the least used one */ - -min_index = l2_cache_new_entry(bs); -*l2_table = s->l2_cache + (min_index << s->l2_bits); - BLKDBG_EVENT(bs->file, BLKDBG_L2_LOAD); -ret = bdrv_pread(bs->file, l2_offset, *l2_table, -s->l2_size * sizeof(uint64_t)); -if (ret < 0) { -qcow2_l2_cache_reset(bs); -return ret; -} +ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset, (void**) l2_table); -s->l2_cache_offsets[min_index] = l2_offset; -s->l2_cache_counts[min_index] = 1; - -return 0; +return ret; } /* @@ -238,7 +164,6 @@ static int write_l1_entry(BlockDriverState *bs, int l1_index) static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table) { BDRVQcowState *s = bs->opaque; -int min_index; uint64_t old_l2_offset; uint64_t *l2_table; int64_t l2_offset; @@ -252,29 +177,48 @@ static int l2_allocate(BlockDriverState *bs, int l1_index, uint64_t **table) if (l2_offset < 0) { return l2_offset; } -bdrv_flush(bs->file); + +ret = qcow2_cache_flush(bs, s->refcount_block_cache); +if (ret < 0) { +goto fail; +} /* allocate a new entry in the l2 cache */ -min_index = l2_cache_new_entry(bs); -l2_table = s->l2_cache + (min_index << s->l2_bits); +ret = qcow2_cache_get_empty(bs, s->l2_table_cache, l2_offset, (void**) table); +if (ret < 0) { +return ret; +} + +l2_table = *table; if (old_l2_offset == 0) { /* if there was no old l2 table, clear the new table */ memset(l2_table, 0, s->l2_size * sizeof(uint64_t)); } else { +uint64_t* old_table; + /* if there was an old l2 table, read it from the disk */ BLKDBG_EVENT(bs->file, BLKDBG_L2_ALLOC_COW_READ); -ret = bdrv_pread(bs->file, old_l2_offset, l2_table, -s->l2_size * sizeof(uint64_t)); +ret = qcow2_cache
[Qemu-devel] Re: [PATCH 8/8] ahci: fix !msi interrupts
On 18.01.2011, at 10:08, Gerd Hoffmann wrote: > Hi, > >>> Worse might also be that unknown issue that force you to inject an IRQ >>> here. We don't know. That's probably worst. >> >> Well, IIRC the issue was that usually a level high interrupt line would >> simply retrigger an interrupt after enabling the interrupt line in the >> APIC again. > > edge triggered interrupts wouldn't though. The code change doesn't change anything for edge triggered interrupts - they work fine. Only !msi (== level) ones are broken. > >> Unless my memory completely fails on me, that didn't happen, >> so I added the manual retrigger on a partial command ACK in ahci code. > > That re-trigger smells like you are not clearing the interrupt line where you > should. For starters try calling ahci_check_irq() after guest writes to > PORT_IRQ_STAT. The problem happened when I had the following: ahci irq bits = 0 ahci irq bits = 1 | 2 irq line trigger guest clears 1 ahci bits = 2 On normal hardware, the high state of the irq line would simply trigger another interrupt in the guest when it gets ACKed on the LAPIC. Somehow it doesn't get triggered here. I don't see why clearing the interrupt line would help? It's not what real hardware would do, no? Alex
Re: [Qemu-devel] [PATCH v2 1/3] scsi-disk: Allow overriding SCSI INQUIRY removable bit
On Tue, Jan 18, 2011 at 11:39 AM, Kevin Wolf wrote: > Am 18.01.2011 11:10, schrieb Stefan Hajnoczi: >> Provide the "removable" qdev property bit to override the SCSI INQUIRY >> removable (RMB) bit for non-CDROM devices. This will be used by USB >> Mass Storage Devices, which sometimes have this guest-visible bit set >> and sometimes do not. They therefore requires a means for user >> configuration. >> >> Signed-off-by: Stefan Hajnoczi > > Should we print an error message when the user tries to make a CD-ROM > non-removable instead of silently ignoring the option? Good point. I will add a check in scsi_disk_initfn() for v3. Stefan
[Qemu-devel] Re: [PATCH 2/8] ahci: split ICH and AHCI even more
Am 20.12.2010 22:13, schrieb Alexander Graf: > Sebastian's patch already did a pretty good job at splitting up ICH-9 > AHCI code and the AHCI core. We need some more though. Copyright was missing, > the lspci dump belongs to ICH-9, we don't need the AHCI core to have its > own qdev device duplicate. > > So let's split them a bit more in this patch, making things easier to > read an understand. > > Signed-off-by: Alexander Graf The license header is still missing in ahci.h. Kevin
[Qemu-devel] Re: [PATCH 3/8] ahci: send init d2h fis on fis enable
Am 20.12.2010 22:13, schrieb Alexander Graf: > The drive sends a d2h init fis on initialization. Usually, the guest doesn't > receive fises yet at that point though, so the delivery is deferred. > > Let's reflect that by sending the init fis on fis receive enablement. > > Signed-off-by: Alexander Graf Hm... If I read the spec right, the real solution wouldn't be an init_d2h_sent flag, but implementing a queue for FISes that are received when the guest hasn't set FRE yet. I'm not against taking a hack like this, but maybe leave a comment somewhere at least. Kevin
[Qemu-devel] Re: [PATCH V2 1/3] Introduce log_start/log_stop in CPUPhysMemoryClient
From: Anthony PERARD In order to use log_start/log_stop with Xen as well in the vga code, this two operations have been put in CPUPhysMemoryClient. The two new functions cpu_physical_log_start,cpu_physical_log_stop are used in hw/vga.c and replace the kvm_log_start/stop. With this, vga does no longer depends on kvm header. Signed-off-by: Anthony PERARD --- cpu-all.h|6 ++ cpu-common.h |4 exec.c | 28 hw/vga.c | 31 --- hw/vhost.c |2 ++ kvm-all.c|8 ++-- kvm-stub.c | 10 -- kvm.h|3 --- 8 files changed, 62 insertions(+), 30 deletions(-) diff --git a/cpu-all.h b/cpu-all.h index 30ae17d..aaf5442 100644 --- a/cpu-all.h +++ b/cpu-all.h @@ -957,6 +957,12 @@ int cpu_physical_memory_get_dirty_tracking(void); int cpu_physical_sync_dirty_bitmap(target_phys_addr_t start_addr, target_phys_addr_t end_addr); +int cpu_physical_log_start(target_phys_addr_t start_addr, + ram_addr_t size); + +int cpu_physical_log_stop(target_phys_addr_t start_addr, + ram_addr_t size); + void dump_exec_info(FILE *f, fprintf_function cpu_fprintf); #endif /* !CONFIG_USER_ONLY */ diff --git a/cpu-common.h b/cpu-common.h index 8ec01f4..2344842 100644 --- a/cpu-common.h +++ b/cpu-common.h @@ -88,6 +88,10 @@ struct CPUPhysMemoryClient { target_phys_addr_t end_addr); int (*migration_log)(struct CPUPhysMemoryClient *client, int enable); +int (*log_start)(struct CPUPhysMemoryClient *client, + target_phys_addr_t phys_addr, ram_addr_t size); +int (*log_stop)(struct CPUPhysMemoryClient *client, +target_phys_addr_t phys_addr, ram_addr_t size); QLIST_ENTRY(CPUPhysMemoryClient) list; }; diff --git a/exec.c b/exec.c index c6ed96d..59a6426 100644 --- a/exec.c +++ b/exec.c @@ -2073,6 +2073,34 @@ int cpu_physical_sync_dirty_bitmap(target_phys_addr_t start_addr, return ret; } +int cpu_physical_log_start(target_phys_addr_t start_addr, + ram_addr_t size) +{ +CPUPhysMemoryClient *client; +QLIST_FOREACH(client, &memory_client_list, list) { +if (client->log_start) { +int r = client->log_start(client, start_addr, size); +if (r < 0) +return r; +} +} +return 0; +} + +int cpu_physical_log_stop(target_phys_addr_t start_addr, + ram_addr_t size) +{ +CPUPhysMemoryClient *client; +QLIST_FOREACH(client, &memory_client_list, list) { +if (client->log_stop) { +int r = client->log_stop(client, start_addr, size); +if (r < 0) +return r; +} +} +return 0; +} + static inline void tlb_update_dirty(CPUTLBEntry *tlb_entry) { ram_addr_t ram_addr; diff --git a/hw/vga.c b/hw/vga.c index c057f4f..a9bf172 100644 --- a/hw/vga.c +++ b/hw/vga.c @@ -28,7 +28,6 @@ #include "vga_int.h" #include "pixel_ops.h" #include "qemu-timer.h" -#include "kvm.h" //#define DEBUG_VGA //#define DEBUG_VGA_MEM @@ -1597,34 +1596,36 @@ static void vga_sync_dirty_bitmap(VGACommonState *s) void vga_dirty_log_start(VGACommonState *s) { -if (kvm_enabled() && s->map_addr) -kvm_log_start(s->map_addr, s->map_end - s->map_addr); +if (s->map_addr) { +cpu_physical_log_start(s->map_addr, s->map_end - s->map_addr); +} -if (kvm_enabled() && s->lfb_vram_mapped) { -kvm_log_start(isa_mem_base + 0xa, 0x8000); -kvm_log_start(isa_mem_base + 0xa8000, 0x8000); +if (s->lfb_vram_mapped) { +cpu_physical_log_start(isa_mem_base + 0xa, 0x8000); +cpu_physical_log_start(isa_mem_base + 0xa8000, 0x8000); } #ifdef CONFIG_BOCHS_VBE -if (kvm_enabled() && s->vbe_mapped) { -kvm_log_start(VBE_DISPI_LFB_PHYSICAL_ADDRESS, s->vram_size); +if (s->vbe_mapped) { +cpu_physical_log_start(VBE_DISPI_LFB_PHYSICAL_ADDRESS, s->vram_size); } #endif } void vga_dirty_log_stop(VGACommonState *s) { -if (kvm_enabled() && s->map_addr) - kvm_log_stop(s->map_addr, s->map_end - s->map_addr); +if (s->map_addr) { + cpu_physical_log_stop(s->map_addr, s->map_end - s->map_addr); +} -if (kvm_enabled() && s->lfb_vram_mapped) { - kvm_log_stop(isa_mem_base + 0xa, 0x8000); - kvm_log_stop(isa_mem_base + 0xa8000, 0x8000); +if (s->lfb_vram_mapped) { + cpu_physical_log_stop(isa_mem_base + 0xa, 0x8000); + cpu_physical_log_stop(isa_mem_base + 0xa8000, 0x8000); } #ifdef CONFIG_BOCHS_VBE -if (kvm_enabled() && s->vbe_mapped) { - kvm_log_stop(VBE_DISPI_LFB_PHYSICAL_ADDRESS, s->vram_size); +if (s->vbe_mapped) { + cpu_physical_log_stop(VBE_DISPI_LFB_PHYSICAL_ADDRESS, s->vram_size); } #endif } diff --git a/hw/vhost.
Re: [Qemu-devel] [PATCH 1/3] block: add resize monitor command
On Mon, Jan 17, 2011 at 06:36:15PM +0100, Kevin Wolf wrote: > size should be 'o' instead of 'l'. The latter may be too small on 32 bit > hosts and doesn't support convenient suffixes: Fixed. > Hm, is there a real reason except that CD-ROMs are read-only? The code > below seems to take read-only devices into account. I've removed the CDROM check for the next version.
[Qemu-devel] Re: [PATCH 4/8] ahci: use qiov instead of dma helpers
Am 20.12.2010 22:13, schrieb Alexander Graf: > The DMA helpers incur additional overhead on data transfers. I'm not > sure we need the additional complexity provided by them. So let's just > use qiovs directly when running in the fast path (ncq). > > Signed-off-by: Alexander Graf > --- > hw/ide/ahci.c | 100 > hw/ide/ahci.h |3 ++ > 2 files changed, 95 insertions(+), 8 deletions(-) I don't feel comfortable with this one, and I think a while ago we discussed on IRC why the DMA helpers even exist. If AHCI doesn't need them, probably nobody needed them. However, I'm inclined to think that AHCI actually _does_ need them in corner cases, even though it might not break in all the common cases that you have tested. Can you explain why only AHCI doesn't need it or is it just "didn't break for me in practice"? Where does the overhead in the DMA helpers come from? Can we optimize this code instead of making the device emulation less correct? Kevin
Re: [Qemu-devel] [PATCH 0/3] allow online resizing of block devices
On Fri, 14 Jan 2011 17:20:44 +0100 Christoph Hellwig wrote: > This patchset adds support for online resizing of block devices. > > The first patch adds a new resize monitor command which call into > the existing image resize code. This is the meat of the series > and probably needs quite a bit of review and help as I'm not sure > about how to implement the error handling for monitor commands > correctly. Am I really supposed to add a new QERR_ definition > for each possible error? And if yes how am I supposed to define > them? The macros for them aren't exactly self-explaining. Well, what happens is this: we screwed up with that interface and we should replace it soon. I see you're not adding the new command in QMP (only in the human monitor), is that intentional? (qmp commands are added to the qmp-commands.hx file). If it's intentional, then using only error_report() should be ok. If you plan to have a qmp version, then we'll have to choose between reporting a generic error version to the client, which is what's going to happen if you use error_report(), or add the QERR_ macros. Markus, what do you think? I feel it's pretty urgent for us to replace that interface.
[Qemu-devel] Re: [PATCH 6/8] ahci: make number of ports runtime determined
Am 20.12.2010 22:13, schrieb Alexander Graf: > Different AHCI controllers have a different number of ports, so the core > shouldn't care about the amount of ports available. > > This patch makes the number of ports available to the AHCI core runtime > configurable, allowing us to have multiple different AHCI implementations > with different amounts of ports. > > Signed-off-by: Alexander Graf > --- > hw/ide/ahci.c | 29 +++-- > hw/ide/ahci.h | 10 +- > hw/ide/ich.c |3 ++- > 3 files changed, 26 insertions(+), 16 deletions(-) > > diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c > index bd4f8a4..c0bc5ff 100644 > --- a/hw/ide/ahci.c > +++ b/hw/ide/ahci.c > @@ -146,7 +146,7 @@ static void ahci_check_irq(AHCIState *s) > > DPRINTF(-1, "check irq %#x\n", s->control_regs.irqstatus); > > -for (i = 0; i < SATA_PORTS; i++) { > +for (i = 0; i < s->ports; i++) { > AHCIPortRegs *pr = &s->dev[i].port_regs; > if (pr->irq_stat & pr->irq_mask) { > s->control_regs.irqstatus |= (1 << i); > @@ -300,7 +300,8 @@ static uint32_t ahci_mem_readl(void *ptr, > target_phys_addr_t addr) > > DPRINTF(-1, "(addr 0x%08X), val 0x%08X\n", (unsigned) addr, val); > } else if ((addr >= AHCI_PORT_REGS_START_ADDR) && > - (addr < AHCI_PORT_REGS_END_ADDR)) { > + (addr < (AHCI_PORT_REGS_START_ADDR + > +(s->ports * AHCI_PORT_ADDR_OFFSET_LEN { > val = ahci_port_read(s, (addr - AHCI_PORT_REGS_START_ADDR) >> 7, > addr & AHCI_PORT_ADDR_OFFSET_MASK); > } > @@ -352,7 +353,8 @@ static void ahci_mem_writel(void *ptr, target_phys_addr_t > addr, uint32_t val) > DPRINTF(-1, "write to unknown register 0x%x\n", > (unsigned)addr); > } > } else if ((addr >= AHCI_PORT_REGS_START_ADDR) && > - (addr < AHCI_PORT_REGS_END_ADDR)) { > + (addr < (AHCI_PORT_REGS_START_ADDR + > +(s->ports * AHCI_PORT_ADDR_OFFSET_LEN { > ahci_port_write(s, (addr - AHCI_PORT_REGS_START_ADDR) >> 7, > addr & AHCI_PORT_ADDR_OFFSET_MASK, val); > } > @@ -375,16 +377,16 @@ static void ahci_reg_init(AHCIState *s) > { > int i; > > -s->control_regs.cap = (SATA_PORTS - 1) | > +s->control_regs.cap = (s->ports - 1) | >(AHCI_NUM_COMMAND_SLOTS << 8) | >(AHCI_SUPPORTED_SPEED_GEN1 << > AHCI_SUPPORTED_SPEED) | >HOST_CAP_NCQ | HOST_CAP_AHCI; > > -s->control_regs.impl = (1 << SATA_PORTS) - 1; > +s->control_regs.impl = (1 << s->ports) - 1; > > s->control_regs.version = AHCI_VERSION_1_0; > > -for (i = 0; i < SATA_PORTS; i++) { > +for (i = 0; i < s->ports; i++) { > s->dev[i].port_state = STATE_RUN; > } > } > @@ -1177,17 +1179,19 @@ static const IDEDMAOps ahci_dma_ops = { > .reset = ahci_dma_reset, > }; > > -void ahci_init(AHCIState *s, DeviceState *qdev) > +void ahci_init(AHCIState *s, DeviceState *qdev, int ports) > { > qemu_irq *irqs; > int i; > > +s->ports = ports; > +s->dev = qemu_mallocz(sizeof(AHCIDevice) * ports); > ahci_reg_init(s); > s->mem = cpu_register_io_memory(ahci_readfn, ahci_writefn, s, > DEVICE_LITTLE_ENDIAN); > -irqs = qemu_allocate_irqs(ahci_irq_set, s, SATA_PORTS); > +irqs = qemu_allocate_irqs(ahci_irq_set, s, s->ports); > > -for (i = 0; i < SATA_PORTS; i++) { > +for (i = 0; i < s->ports; i++) { > AHCIDevice *ad = &s->dev[i]; > > ide_bus_new(&ad->port, qdev, i); > @@ -1201,6 +1205,11 @@ void ahci_init(AHCIState *s, DeviceState *qdev) > } > } > > +void ahci_uninit(AHCIState *s) > +{ > +qemu_free(s->dev); > +} > + > void ahci_pci_map(PCIDevice *pci_dev, int region_num, > pcibus_t addr, pcibus_t size, int type) > { > @@ -1218,7 +1227,7 @@ void ahci_reset(void *opaque) > d->ahci.control_regs.irqstatus = 0; > d->ahci.control_regs.ghc = 0; > > -for (i = 0; i < SATA_PORTS; i++) { > +for (i = 0; i < d->ahci.ports; i++) { > ahci_reset_port(&d->ahci, i); > } > } > diff --git a/hw/ide/ahci.h b/hw/ide/ahci.h > index 0824064..eb87dcd 100644 > --- a/hw/ide/ahci.h > +++ b/hw/ide/ahci.h > @@ -165,11 +165,9 @@ > #define AHCI_GENERIC_HOST_CONTROL_REGS_MAX_ADDR 0x20 > /* Shouldn't this be 0x2c? */ > > -#define SATA_PORTS 4 > - > #define AHCI_PORT_REGS_START_ADDR 0x100 > -#define AHCI_PORT_REGS_END_ADDR (AHCI_PORT_REGS_START_ADDR + SATA_PORTS * > 0x80) > #define AHCI_PORT_ADDR_OFFSET_MASK 0x7f > +#define AHCI_PORT_ADDR_OFFSET_LEN 0x80 > > #define AHCI_NUM_COMMAND_SLOTS 31 > #define AHCI_SUPPORTED_SPEED 20 > @@ -269,9 +267,10 @@ struct AHCIDevice { > }; > > typede
[Qemu-devel] Re: [PATCH 7/8] ahci: free dynamically allocated iovs
Am 20.12.2010 22:13, schrieb Alexander Graf: > We allocate iovs on the fly now, but also need to free them on uninit. > This patch does that. > > Signed-off-by: Alexander Graf Isn't this a fix for patch 4? If so, please merge it there. Kevin
[Qemu-devel] Re: [PATCH 3/8] ahci: send init d2h fis on fis enable
On 18.01.2011, at 13:25, Kevin Wolf wrote: > Am 20.12.2010 22:13, schrieb Alexander Graf: >> The drive sends a d2h init fis on initialization. Usually, the guest doesn't >> receive fises yet at that point though, so the delivery is deferred. >> >> Let's reflect that by sending the init fis on fis receive enablement. >> >> Signed-off-by: Alexander Graf > > Hm... If I read the spec right, the real solution wouldn't be an > init_d2h_sent flag, but implementing a queue for FISes that are received > when the guest hasn't set FRE yet. > > I'm not against taking a hack like this, but maybe leave a comment > somewhere at least. Yes, they'd get queued. In practice it doesn't really matter that much, which is why the hack works :). But you're right - a comment would be nice. Alex
[Qemu-devel] Re: [PATCH 4/8] ahci: use qiov instead of dma helpers
On 18.01.2011, at 13:35, Kevin Wolf wrote: > Am 20.12.2010 22:13, schrieb Alexander Graf: >> The DMA helpers incur additional overhead on data transfers. I'm not >> sure we need the additional complexity provided by them. So let's just >> use qiovs directly when running in the fast path (ncq). >> >> Signed-off-by: Alexander Graf >> --- >> hw/ide/ahci.c | 100 >> hw/ide/ahci.h |3 ++ >> 2 files changed, 95 insertions(+), 8 deletions(-) > > I don't feel comfortable with this one, and I think a while ago we > discussed on IRC why the DMA helpers even exist. If AHCI doesn't need > them, probably nobody needed them. > > However, I'm inclined to think that AHCI actually _does_ need them in > corner cases, even though it might not break in all the common cases > that you have tested. Can you explain why only AHCI doesn't need it or > is it just "didn't break for me in practice"? > It's the latter. > Where does the overhead in the DMA helpers come from? Can we optimize > this code instead of making the device emulation less correct? Well, dma helpers involve another malloc which is probably the biggest hog. I frankly don't see the point in making it correct for the fast path though. I'd rather like to have a fast block emulation that works with all OSs than an accurate one that emulates something nobody cares about. Virtio for example doesn't use dma helpers either - they just claim it's not defined in the spec. So if virtio-blk gets away with it, it means that all OSs should never make use of the additional complexity. Alex
[Qemu-devel] Re: [PATCH 6/8] ahci: make number of ports runtime determined
On 18.01.2011, at 13:40, Kevin Wolf wrote: > Am 20.12.2010 22:13, schrieb Alexander Graf: >> Different AHCI controllers have a different number of ports, so the core >> shouldn't care about the amount of ports available. >> >> This patch makes the number of ports available to the AHCI core runtime >> configurable, allowing us to have multiple different AHCI implementations >> with different amounts of ports. >> >> Signed-off-by: Alexander Graf >> --- >> hw/ide/ahci.c | 29 +++-- >> hw/ide/ahci.h | 10 +- >> hw/ide/ich.c |3 ++- >> 3 files changed, 26 insertions(+), 16 deletions(-) >> >> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c >> index bd4f8a4..c0bc5ff 100644 >> --- a/hw/ide/ahci.c >> +++ b/hw/ide/ahci.c >> @@ -146,7 +146,7 @@ static void ahci_check_irq(AHCIState *s) >> >> DPRINTF(-1, "check irq %#x\n", s->control_regs.irqstatus); >> >> -for (i = 0; i < SATA_PORTS; i++) { >> +for (i = 0; i < s->ports; i++) { >> AHCIPortRegs *pr = &s->dev[i].port_regs; >> if (pr->irq_stat & pr->irq_mask) { >> s->control_regs.irqstatus |= (1 << i); >> @@ -300,7 +300,8 @@ static uint32_t ahci_mem_readl(void *ptr, >> target_phys_addr_t addr) >> >> DPRINTF(-1, "(addr 0x%08X), val 0x%08X\n", (unsigned) addr, val); >> } else if ((addr >= AHCI_PORT_REGS_START_ADDR) && >> - (addr < AHCI_PORT_REGS_END_ADDR)) { >> + (addr < (AHCI_PORT_REGS_START_ADDR + >> +(s->ports * AHCI_PORT_ADDR_OFFSET_LEN { >> val = ahci_port_read(s, (addr - AHCI_PORT_REGS_START_ADDR) >> 7, >> addr & AHCI_PORT_ADDR_OFFSET_MASK); >> } >> @@ -352,7 +353,8 @@ static void ahci_mem_writel(void *ptr, >> target_phys_addr_t addr, uint32_t val) >> DPRINTF(-1, "write to unknown register 0x%x\n", >> (unsigned)addr); >> } >> } else if ((addr >= AHCI_PORT_REGS_START_ADDR) && >> - (addr < AHCI_PORT_REGS_END_ADDR)) { >> + (addr < (AHCI_PORT_REGS_START_ADDR + >> +(s->ports * AHCI_PORT_ADDR_OFFSET_LEN { >> ahci_port_write(s, (addr - AHCI_PORT_REGS_START_ADDR) >> 7, >> addr & AHCI_PORT_ADDR_OFFSET_MASK, val); >> } >> @@ -375,16 +377,16 @@ static void ahci_reg_init(AHCIState *s) >> { >> int i; >> >> -s->control_regs.cap = (SATA_PORTS - 1) | >> +s->control_regs.cap = (s->ports - 1) | >> (AHCI_NUM_COMMAND_SLOTS << 8) | >> (AHCI_SUPPORTED_SPEED_GEN1 << >> AHCI_SUPPORTED_SPEED) | >> HOST_CAP_NCQ | HOST_CAP_AHCI; >> >> -s->control_regs.impl = (1 << SATA_PORTS) - 1; >> +s->control_regs.impl = (1 << s->ports) - 1; >> >> s->control_regs.version = AHCI_VERSION_1_0; >> >> -for (i = 0; i < SATA_PORTS; i++) { >> +for (i = 0; i < s->ports; i++) { >> s->dev[i].port_state = STATE_RUN; >> } >> } >> @@ -1177,17 +1179,19 @@ static const IDEDMAOps ahci_dma_ops = { >> .reset = ahci_dma_reset, >> }; >> >> -void ahci_init(AHCIState *s, DeviceState *qdev) >> +void ahci_init(AHCIState *s, DeviceState *qdev, int ports) >> { >> qemu_irq *irqs; >> int i; >> >> +s->ports = ports; >> +s->dev = qemu_mallocz(sizeof(AHCIDevice) * ports); >> ahci_reg_init(s); >> s->mem = cpu_register_io_memory(ahci_readfn, ahci_writefn, s, >> DEVICE_LITTLE_ENDIAN); >> -irqs = qemu_allocate_irqs(ahci_irq_set, s, SATA_PORTS); >> +irqs = qemu_allocate_irqs(ahci_irq_set, s, s->ports); >> >> -for (i = 0; i < SATA_PORTS; i++) { >> +for (i = 0; i < s->ports; i++) { >> AHCIDevice *ad = &s->dev[i]; >> >> ide_bus_new(&ad->port, qdev, i); >> @@ -1201,6 +1205,11 @@ void ahci_init(AHCIState *s, DeviceState *qdev) >> } >> } >> >> +void ahci_uninit(AHCIState *s) >> +{ >> +qemu_free(s->dev); >> +} >> + >> void ahci_pci_map(PCIDevice *pci_dev, int region_num, >> pcibus_t addr, pcibus_t size, int type) >> { >> @@ -1218,7 +1227,7 @@ void ahci_reset(void *opaque) >> d->ahci.control_regs.irqstatus = 0; >> d->ahci.control_regs.ghc = 0; >> >> -for (i = 0; i < SATA_PORTS; i++) { >> +for (i = 0; i < d->ahci.ports; i++) { >> ahci_reset_port(&d->ahci, i); >> } >> } >> diff --git a/hw/ide/ahci.h b/hw/ide/ahci.h >> index 0824064..eb87dcd 100644 >> --- a/hw/ide/ahci.h >> +++ b/hw/ide/ahci.h >> @@ -165,11 +165,9 @@ >> #define AHCI_GENERIC_HOST_CONTROL_REGS_MAX_ADDR 0x20 >> /* Shouldn't this be 0x2c? */ >> >> -#define SATA_PORTS 4 >> - >> #define AHCI_PORT_REGS_START_ADDR 0x100 >> -#define AHCI_PORT_REGS_END_ADDR (AHCI_PORT_REGS_START_ADDR + SATA_PORTS * >> 0x80) >> #define AHCI_PORT_ADDR_OFFSET_MASK 0x7f >> +#define AHCI_PORT_ADDR_OFFSET_LEN 0x80 >> >> #define AHCI_NUM_COMMAND_SLOTS 31
Re: [Qemu-devel] [PATCH 0/3] allow online resizing of block devices
On Tue, Jan 18, 2011 at 10:35:39AM -0200, Luiz Capitulino wrote: > Well, what happens is this: we screwed up with that interface and we > should replace it soon. > > I see you're not adding the new command in QMP (only in the human monitor), > is that intentional? (qmp commands are added to the qmp-commands.hx file). > > If it's intentional, then using only error_report() should be ok. If you > plan to have a qmp version, then we'll have to choose between reporting > a generic error version to the client, which is what's going to happen if > you use error_report(), or add the QERR_ macros. No, I hoped this would also add the QMP interface. Why do we need to declare the commands twice? Especially as hmp-commands.hx is full of magic that makes little sense only for human readable commands. Any help on how to define the QERR_ macros?
[Qemu-devel] Re: [PATCH 8/8] ahci: fix !msi interrupts
On 2011-01-18 13:05, Alexander Graf wrote: > > On 18.01.2011, at 10:08, Gerd Hoffmann wrote: > >> Hi, >> Worse might also be that unknown issue that force you to inject an IRQ here. We don't know. That's probably worst. >>> >>> Well, IIRC the issue was that usually a level high interrupt line would >>> simply retrigger an interrupt after enabling the interrupt line in the >>> APIC again. >> >> edge triggered interrupts wouldn't though. > > The code change doesn't change anything for edge triggered interrupts - they > work fine. Only !msi (== level) ones are broken. > >> >>> Unless my memory completely fails on me, that didn't happen, >>> so I added the manual retrigger on a partial command ACK in ahci code. >> >> That re-trigger smells like you are not clearing the interrupt line where >> you should. For starters try calling ahci_check_irq() after guest writes to >> PORT_IRQ_STAT. > > The problem happened when I had the following: > > ahci irq bits = 0 > > ahci irq bits = 1 | 2 > irq line trigger > guest clears 1 > ahci bits = 2 > > > On normal hardware, the high state of the irq line would simply trigger > another interrupt in the guest when it gets ACKed on the LAPIC. Somehow it > doesn't get triggered here. I don't see why clearing the interrupt line would > help? It's not what real hardware would do, no? > No, real devices would simply leave the line asserted as long as there is a reason to. So again my question: With which irqchips do you see this effect, kvm's in-kernel model and/or qemu's user space model? If there is an issue with retriggering a CPU interrupt while the source is still asserted, that probably needs to be fixed. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] Re: [PATCH 8/8] ahci: fix !msi interrupts
Hi, edge triggered interrupts wouldn't though. The code change doesn't change anything for edge triggered interrupts - they work fine. Only !msi (== level) ones are broken. apic irqs can be both edge and level triggered too ... That re-trigger smells like you are not clearing the interrupt line where you should. For starters try calling ahci_check_irq() after guest writes to PORT_IRQ_STAT. The problem happened when I had the following: ahci irq bits = 0 ahci irq bits = 1 | 2 irq line trigger guest clears 1 ahci bits = 2 On normal hardware, the high state of the irq line would simply trigger another interrupt in the guest when it gets ACKed on the LAPIC. Somehow it doesn't get triggered here. I don't see why clearing the interrupt line would help? It's not what real hardware would do, no? I think the guest and the ahci emulation simply disagree on whenever a irq is pending or not. Guest thinks it has cleared the IRQ, but the code path it has taken didn't trigger ahci_irq_lower(). It is probably related to how the per-port and global irq status play together. It isn't covered very well in the specs :-( If an interrupt condition happens a bit in pr->irq_stat will be set. When the contition is enabled the port bit in irqstatus will be set, which in turn will trigger an irq. That is the easy part. Now the guest checks irqstatus to find the port, then checks pr->irq_stat to find the condition, acks it by clearing the bits it has seen in pr->irq_stat. What does happen with the port bit in irqstatus now? Will it be cleared automatically? Must the guest clear it explicitly? What happens in case another irq condition happened which the guest didn't ack (yet), will the guest be able to clear the bit? cheers, Gerd
[Qemu-devel] [PATCH] target-arm: Log instruction start in TCG code
Add support for logging the start of instructions in TCG code debug dumps for ARM targets. Signed-off-by: Peter Maydell --- target-arm/translate.c |4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/target-arm/translate.c b/target-arm/translate.c index 907d73a..c60cd18 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -9199,6 +9199,10 @@ static inline void gen_intermediate_code_internal(CPUState *env, if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) gen_io_start(); +if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP))) { +tcg_gen_debug_insn_start(dc->pc); +} + if (dc->thumb) { disas_thumb_insn(env, dc); if (dc->condexec_mask) { -- 1.7.1
Re: [Qemu-devel] Re: [PATCH 6/8] ahci: make number of ports runtime determined
Hi, diff --git a/hw/ide/ich.c b/hw/ide/ich.c index 70cb766..f242d7a 100644 --- a/hw/ide/ich.c +++ b/hw/ide/ich.c @@ -100,7 +100,7 @@ static int pci_ich9_ahci_init(PCIDevice *dev) msi_init(dev, 0x50, 1, true, false); -ahci_init(&d->ahci,&dev->qdev); +ahci_init(&d->ahci,&dev->qdev, 6); d->ahci.irq = d->card.irq[0]; What about a qdev property instead of just hardcoding the value somewhere else? That particular piece of emulated hardware (ich9-ahci) actually has 6 ports. ich7 has 4 ports. A thin hardware-specific wrapper which hardcodes this stuff (and pci ids and maybe some other minor differences) looks fine to me. Maybe alex should add a ich7 variant just to prove the point ;) cheers, Gerd
[Qemu-devel] Re: [PATCH 4/8] ahci: use qiov instead of dma helpers
On Tue, Jan 18, 2011 at 01:45:40PM +0100, Alexander Graf wrote: > > On 18.01.2011, at 13:35, Kevin Wolf wrote: > > > Am 20.12.2010 22:13, schrieb Alexander Graf: > >> The DMA helpers incur additional overhead on data transfers. I'm not > >> sure we need the additional complexity provided by them. So let's just > >> use qiovs directly when running in the fast path (ncq). > >> > >> Signed-off-by: Alexander Graf > >> --- > >> hw/ide/ahci.c | 100 > >> > >> hw/ide/ahci.h |3 ++ > >> 2 files changed, 95 insertions(+), 8 deletions(-) > > > > I don't feel comfortable with this one, and I think a while ago we > > discussed on IRC why the DMA helpers even exist. If AHCI doesn't need > > them, probably nobody needed them. > > > > However, I'm inclined to think that AHCI actually _does_ need them in > > corner cases, even though it might not break in all the common cases > > that you have tested. Can you explain why only AHCI doesn't need it or > > is it just "didn't break for me in practice"? > > > > It's the latter. > > > Where does the overhead in the DMA helpers come from? Can we optimize > > this code instead of making the device emulation less correct? > > Well, dma helpers involve another malloc which is probably the biggest hog. I > frankly don't see the point in making it correct for the fast path though. > I'd rather like to have a fast block emulation that works with all OSs than > an accurate one that emulates something nobody cares about. > > Virtio for example doesn't use dma helpers either - they just claim it's not > defined in the spec. So if virtio-blk gets away with it, it means that all > OSs should never make use of the additional complexity. >From what I can tell DMA helpers is common AIO request code plus: 1. It handles map failure using cpu_register_map_client(). 2. It handles short maps that are unable to map a full sglist element. These two requirements are due to QEMU's guest memory mapping API. IIUC the limitations on that API are due to a limited amount of bounce buffer space being used for some targets allowing mapping of non-RAM memory. Perhaps this means that AHCI does not work on those targets if you decide to send non-RAM pages to disk? I'd be interested in understanding how this all works and how the QEMU RAM API that Anthony and Alex Williamson have been playing with comes into play. Stefan
Re: [Qemu-devel] [PATCH 0/3] allow online resizing of block devices
On Tue, 18 Jan 2011 13:48:06 +0100 Christoph Hellwig wrote: > On Tue, Jan 18, 2011 at 10:35:39AM -0200, Luiz Capitulino wrote: > > Well, what happens is this: we screwed up with that interface and we > > should replace it soon. > > > > I see you're not adding the new command in QMP (only in the human monitor), > > is that intentional? (qmp commands are added to the qmp-commands.hx file). > > > > If it's intentional, then using only error_report() should be ok. If you > > plan to have a qmp version, then we'll have to choose between reporting > > a generic error version to the client, which is what's going to happen if > > you use error_report(), or add the QERR_ macros. > > No, I hoped this would also add the QMP interface. Why do we need to > declare the commands twice? Because they're different interfaces and should not be tied to each other, they have different name spaces for example and commands in QMP might have no relation to commands in HMP and vice-versa. > Especially as hmp-commands.hx is full of > magic that makes little sense only for human readable commands. Yes, the reason for that is that we're in the middle of a refactoring, but I have around 100 patches that makes a clean separation between the two and creates a cleaner interface (specially for HMP). > Any help on how to define the QERR_ macros? I think we should emit a generic error in QMP for now, I don't think it's a good idea to expend the bad interface we have today.
[Qemu-devel] [PATCH v2 3/3] usb-msd: Propagate removable bit to SCSI device
USB Mass Storage Devices sometimes have the RMB (removable) bit set in the SCSI INQUIRY response. Thumbdrives tend to have the bit set whereas hard disks do not. Operating systems differentiate between removable devices and fixed devices. Under Linux, the anaconda installer looks for removable devices. Under Windows, only fixed devices may have more than one partition and AutoRun is also affected by the removable bit. For these reasons, allow USB Mass Storage Devices to override the removable bit: qemu -usb -drive if=none,file=test.img,cache=none,id=disk0 -device usb-storage,drive=disk0,removable=on The default is off. Signed-off-by: Stefan Hajnoczi --- hw/usb-msd.c |4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/hw/usb-msd.c b/hw/usb-msd.c index ee897b6..f593c46 100644 --- a/hw/usb-msd.c +++ b/hw/usb-msd.c @@ -50,6 +50,7 @@ typedef struct { SCSIBus bus; BlockConf conf; SCSIDevice *scsi_dev; +uint32_t removable; int result; /* For async completion. */ USBPacket *packet; @@ -544,7 +545,7 @@ static int usb_msd_initfn(USBDevice *dev) s->dev.speed = USB_SPEED_FULL; scsi_bus_new(&s->bus, &s->dev.qdev, 0, 1, usb_msd_command_complete); -s->scsi_dev = scsi_bus_legacy_add_drive(&s->bus, bs, 0, false); +s->scsi_dev = scsi_bus_legacy_add_drive(&s->bus, bs, 0, !!s->removable); if (!s->scsi_dev) { return -1; } @@ -634,6 +635,7 @@ static struct USBDeviceInfo msd_info = { .usbdevice_init = usb_msd_init, .qdev.props = (Property[]) { DEFINE_BLOCK_PROPERTIES(MSDState, conf), +DEFINE_PROP_BIT("removable", MSDState, removable, 1, false), DEFINE_PROP_END_OF_LIST(), }, }; -- 1.7.2.3
[Qemu-devel] Re: [PATCH 4/8] ahci: use qiov instead of dma helpers
Am 18.01.2011 14:14, schrieb Stefan Hajnoczi: > On Tue, Jan 18, 2011 at 01:45:40PM +0100, Alexander Graf wrote: >> >> On 18.01.2011, at 13:35, Kevin Wolf wrote: >> >>> Am 20.12.2010 22:13, schrieb Alexander Graf: The DMA helpers incur additional overhead on data transfers. I'm not sure we need the additional complexity provided by them. So let's just use qiovs directly when running in the fast path (ncq). Signed-off-by: Alexander Graf --- hw/ide/ahci.c | 100 hw/ide/ahci.h |3 ++ 2 files changed, 95 insertions(+), 8 deletions(-) >>> >>> I don't feel comfortable with this one, and I think a while ago we >>> discussed on IRC why the DMA helpers even exist. If AHCI doesn't need >>> them, probably nobody needed them. >>> >>> However, I'm inclined to think that AHCI actually _does_ need them in >>> corner cases, even though it might not break in all the common cases >>> that you have tested. Can you explain why only AHCI doesn't need it or >>> is it just "didn't break for me in practice"? >>> >> >> It's the latter. >> >>> Where does the overhead in the DMA helpers come from? Can we optimize >>> this code instead of making the device emulation less correct? >> >> Well, dma helpers involve another malloc which is probably the biggest hog. If you can get around this malloc in AHCI (IIUC, you do it by reusing the old sglist buffer?), we can probably do something similar in the generic DMA helper code and have IDE etc. benefit from it as well. > I frankly don't see the point in making it correct for the fast path though. > I'd rather like to have a fast block emulation that works with all OSs than > an accurate one that emulates something nobody cares about. Sorry, but "Windows and Linux" != "all OSes". I'm sure there is a way that gives us correctness _and_ performance. >> Virtio for example doesn't use dma helpers either - they just claim it's not >> defined in the spec. So if virtio-blk gets away with it, it means that all >> OSs should never make use of the additional complexity. > > From what I can tell DMA helpers is common AIO request code plus: > > 1. It handles map failure using cpu_register_map_client(). > 2. It handles short maps that are unable to map a full sglist element. > > These two requirements are due to QEMU's guest memory mapping API. IIUC > the limitations on that API are due to a limited amount of bounce buffer > space being used for some targets allowing mapping of non-RAM memory. > > Perhaps this means that AHCI does not work on those targets if you > decide to send non-RAM pages to disk? > > I'd be interested in understanding how this all works and how the QEMU > RAM API that Anthony and Alex Williamson have been playing with comes > into play. Yeah, I don't know the details either. I hope that Anthony will comment on it. Kevin
[Qemu-devel] Re: [PATCH 6/8] ahci: make number of ports runtime determined
Am 18.01.2011 13:46, schrieb Alexander Graf: > > On 18.01.2011, at 13:40, Kevin Wolf wrote: > >> Am 20.12.2010 22:13, schrieb Alexander Graf: >>> Different AHCI controllers have a different number of ports, so the core >>> shouldn't care about the amount of ports available. >>> >>> This patch makes the number of ports available to the AHCI core runtime >>> configurable, allowing us to have multiple different AHCI implementations >>> with different amounts of ports. >>> >>> Signed-off-by: Alexander Graf >>> --- >>> hw/ide/ahci.c | 29 +++-- >>> hw/ide/ahci.h | 10 +- >>> hw/ide/ich.c |3 ++- >>> 3 files changed, 26 insertions(+), 16 deletions(-) >>> >>> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c >>> index bd4f8a4..c0bc5ff 100644 >>> --- a/hw/ide/ahci.c >>> +++ b/hw/ide/ahci.c >>> @@ -146,7 +146,7 @@ static void ahci_check_irq(AHCIState *s) >>> >>> DPRINTF(-1, "check irq %#x\n", s->control_regs.irqstatus); >>> >>> -for (i = 0; i < SATA_PORTS; i++) { >>> +for (i = 0; i < s->ports; i++) { >>> AHCIPortRegs *pr = &s->dev[i].port_regs; >>> if (pr->irq_stat & pr->irq_mask) { >>> s->control_regs.irqstatus |= (1 << i); >>> @@ -300,7 +300,8 @@ static uint32_t ahci_mem_readl(void *ptr, >>> target_phys_addr_t addr) >>> >>> DPRINTF(-1, "(addr 0x%08X), val 0x%08X\n", (unsigned) addr, val); >>> } else if ((addr >= AHCI_PORT_REGS_START_ADDR) && >>> - (addr < AHCI_PORT_REGS_END_ADDR)) { >>> + (addr < (AHCI_PORT_REGS_START_ADDR + >>> +(s->ports * AHCI_PORT_ADDR_OFFSET_LEN { >>> val = ahci_port_read(s, (addr - AHCI_PORT_REGS_START_ADDR) >> 7, >>> addr & AHCI_PORT_ADDR_OFFSET_MASK); >>> } >>> @@ -352,7 +353,8 @@ static void ahci_mem_writel(void *ptr, >>> target_phys_addr_t addr, uint32_t val) >>> DPRINTF(-1, "write to unknown register 0x%x\n", >>> (unsigned)addr); >>> } >>> } else if ((addr >= AHCI_PORT_REGS_START_ADDR) && >>> - (addr < AHCI_PORT_REGS_END_ADDR)) { >>> + (addr < (AHCI_PORT_REGS_START_ADDR + >>> +(s->ports * AHCI_PORT_ADDR_OFFSET_LEN { >>> ahci_port_write(s, (addr - AHCI_PORT_REGS_START_ADDR) >> 7, >>> addr & AHCI_PORT_ADDR_OFFSET_MASK, val); >>> } >>> @@ -375,16 +377,16 @@ static void ahci_reg_init(AHCIState *s) >>> { >>> int i; >>> >>> -s->control_regs.cap = (SATA_PORTS - 1) | >>> +s->control_regs.cap = (s->ports - 1) | >>> (AHCI_NUM_COMMAND_SLOTS << 8) | >>> (AHCI_SUPPORTED_SPEED_GEN1 << >>> AHCI_SUPPORTED_SPEED) | >>> HOST_CAP_NCQ | HOST_CAP_AHCI; >>> >>> -s->control_regs.impl = (1 << SATA_PORTS) - 1; >>> +s->control_regs.impl = (1 << s->ports) - 1; >>> >>> s->control_regs.version = AHCI_VERSION_1_0; >>> >>> -for (i = 0; i < SATA_PORTS; i++) { >>> +for (i = 0; i < s->ports; i++) { >>> s->dev[i].port_state = STATE_RUN; >>> } >>> } >>> @@ -1177,17 +1179,19 @@ static const IDEDMAOps ahci_dma_ops = { >>> .reset = ahci_dma_reset, >>> }; >>> >>> -void ahci_init(AHCIState *s, DeviceState *qdev) >>> +void ahci_init(AHCIState *s, DeviceState *qdev, int ports) >>> { >>> qemu_irq *irqs; >>> int i; >>> >>> +s->ports = ports; >>> +s->dev = qemu_mallocz(sizeof(AHCIDevice) * ports); >>> ahci_reg_init(s); >>> s->mem = cpu_register_io_memory(ahci_readfn, ahci_writefn, s, >>> DEVICE_LITTLE_ENDIAN); >>> -irqs = qemu_allocate_irqs(ahci_irq_set, s, SATA_PORTS); >>> +irqs = qemu_allocate_irqs(ahci_irq_set, s, s->ports); >>> >>> -for (i = 0; i < SATA_PORTS; i++) { >>> +for (i = 0; i < s->ports; i++) { >>> AHCIDevice *ad = &s->dev[i]; >>> >>> ide_bus_new(&ad->port, qdev, i); >>> @@ -1201,6 +1205,11 @@ void ahci_init(AHCIState *s, DeviceState *qdev) >>> } >>> } >>> >>> +void ahci_uninit(AHCIState *s) >>> +{ >>> +qemu_free(s->dev); >>> +} >>> + >>> void ahci_pci_map(PCIDevice *pci_dev, int region_num, >>> pcibus_t addr, pcibus_t size, int type) >>> { >>> @@ -1218,7 +1227,7 @@ void ahci_reset(void *opaque) >>> d->ahci.control_regs.irqstatus = 0; >>> d->ahci.control_regs.ghc = 0; >>> >>> -for (i = 0; i < SATA_PORTS; i++) { >>> +for (i = 0; i < d->ahci.ports; i++) { >>> ahci_reset_port(&d->ahci, i); >>> } >>> } >>> diff --git a/hw/ide/ahci.h b/hw/ide/ahci.h >>> index 0824064..eb87dcd 100644 >>> --- a/hw/ide/ahci.h >>> +++ b/hw/ide/ahci.h >>> @@ -165,11 +165,9 @@ >>> #define AHCI_GENERIC_HOST_CONTROL_REGS_MAX_ADDR 0x20 >>> /* Shouldn't this be 0x2c? */ >>> >>> -#define SATA_PORTS 4 >>> - >>> #define AHCI_PORT_REGS_START_ADDR 0x100 >>> -#define AHCI_PORT_REGS_END_ADDR (AHCI_PORT_REGS_START_ADDR + SATA_PORTS * >>>
Re: [Qemu-devel] paravirtual tablet v3
Hi, Sure, someone needs to map multitouch to non-multitouch. I'd leave that job to the guest driver tough. Why do you think doing it in the host is better? My assumptions are 1) the host is capable of doing the mapping just as easily as the guest Probably. 2) the host can do something useful with the information that the guest is not multitouch capable. What do you think of here? cheers, Gerd
Re: [Qemu-devel] [RFC][PATCH v6 00/23] virtagent: host/guest RPC communication agent
On 01/17/11 15:53, Michael Roth wrote: On 01/17/2011 07:53 AM, Gerd Hoffmann wrote: What is your plan to handle system-level queries+actions (such as reboot) vs. per-user stuff (such as cut+paste)? This is an area that hasn't been well-defined yet and is definitely open for suggestions. One option would be to have two virtio-serial channels, one for the system and one for the user stuff. gdm could grant the desktop user access to the user channel like it does with sound devices and simliar stuff, so the user agent has access to it. Another option is to have some socket where the user agent can talk to the system agent and have it relay the requests. Maybe it is also possible to use dbus for communication between the system agent and user agent (and maybe other components). Maybe it even makes sense to run the dbus protocol over the virtio-serial line? Disclaimer: I know next to nothing about dbus details. For host->guest RPCs the current plan is to always have the RPC executed at the system level, but for situations involving a specific user we fork and drop privileges with the RPC, and report back the status of the fork/exec. The fork'd process would then do what it needs to do, then if needed, communicate status back to the system-level daemon via some IPC mechanism (most likely a socket we listen to in addition to the serial channel) that can be used to send an event. The system-level daemon then communicates these events back to the host with a guest->host RPC. Hmm. A bit heavy to fork+exec on every rpc. might be ok for rare events though. Processes created independently of the system-level daemon could report events in the same manner, via this socket. I think this might suit the vdagent client model for Spice as well, Yes, vdagent works this way, except that *all* communication goes through that socket, i.e. events/requests coming from the host for the user-level agent are routed through that socket too. It is the only sane way to handle clipboard support IMHO as there is quite some message ping-pong involved to get a clipboard transaction done. How does xmlrpm transmit binary blobs btw? cheers, Gerd
Re: [Qemu-devel] [RFC][PATCH v6 00/23] virtagent: host/guest RPC communication agent
On 01/18/2011 08:02 AM, Gerd Hoffmann wrote: On 01/17/11 15:53, Michael Roth wrote: On 01/17/2011 07:53 AM, Gerd Hoffmann wrote: What is your plan to handle system-level queries+actions (such as reboot) vs. per-user stuff (such as cut+paste)? This is an area that hasn't been well-defined yet and is definitely open for suggestions. One option would be to have two virtio-serial channels, one for the system and one for the user stuff. gdm could grant the desktop user access to the user channel like it does with sound devices and simliar stuff, so the user agent has access to it. Another option is to have some socket where the user agent can talk to the system agent and have it relay the requests. I think this is the best approach. One requirement we've been working with is that all actions from guest agents are logged. This is to give an administrator confidence that the hypervisor isn't doing anything stupid. If you route all of the user traffic through a privileged daemon, you can log everything to syslog or an appropriate log file. Maybe it is also possible to use dbus for communication between the system agent and user agent (and maybe other components). Maybe it even makes sense to run the dbus protocol over the virtio-serial line? Disclaimer: I know next to nothing about dbus details. The way I'd prefer to think about it is that the transport and protocol used are separate layers that may have multiple implementations over time. For instance, we currently support virtio-serial and isa-serial. Supporting another protocol wouldn't be a big deal. The part that's needs to remain consistent is the API supported by the transport/protocol combinations. For host->guest RPCs the current plan is to always have the RPC executed at the system level, but for situations involving a specific user we fork and drop privileges with the RPC, and report back the status of the fork/exec. The fork'd process would then do what it needs to do, then if needed, communicate status back to the system-level daemon via some IPC mechanism (most likely a socket we listen to in addition to the serial channel) that can be used to send an event. The system-level daemon then communicates these events back to the host with a guest->host RPC. Hmm. A bit heavy to fork+exec on every rpc. might be ok for rare events though. Processes created independently of the system-level daemon could report events in the same manner, via this socket. I think this might suit the vdagent client model for Spice as well, Yes, vdagent works this way, except that *all* communication goes through that socket, i.e. events/requests coming from the host for the user-level agent are routed through that socket too. It is the only sane way to handle clipboard support IMHO as there is quite some message ping-pong involved to get a clipboard transaction done. How does xmlrpm transmit binary blobs btw? XML-RPC has a base64 encoding that's part of the standard for encoding binary data. It also supports UTF-8 encoded strings. Regards, Anthony Liguori cheers, Gerd
[Qemu-devel] Re: [PATCH v2 0/5] iohandlers: Add support for enabling/disabling individual handlers
On 01/18/2011 05:56 AM, Amit Shah wrote: On (Mon) Jan 17 2011 [08:57:16], Anthony Liguori wrote: Also -- this patchset was prompted by a bug in qemu chardevs that freezes guests if they write faster than the chardevs can consume. What should the strategy on fixing that bug be? Fix the loop API such that we're not just fixing one bug but that we can address a bunch of other bugs that are out there. But what's the right fix? * Pull in glib or some other library * Mimic API from some other library 1) get rid of poll callbacks These kill us right now because it makes it impossible to use poll/epoll or to use some other API. 2) switch to a single callback and an event mask We can't do this without (1) because the can_read callback takes a different signature. But read/write callbacks can be a single callback (but it needs to be changed to accept an event mask). 3) allow the event mask to be changed via a new API This gives the functionality you're looking for while making a huge improvement to the internal API. A quick python/perl script can probably do 90% of the work here. Regards, Anthony Liguori Amit
Re: [Qemu-devel] [PATCH] target-arm: Log instruction start in TCG code
On Tue, Jan 18, 2011 at 01:08:40PM +, Peter Maydell wrote: > Add support for logging the start of instructions in TCG > code debug dumps for ARM targets. > > Signed-off-by: Peter Maydell Applied, thanks. > --- > target-arm/translate.c |4 > 1 files changed, 4 insertions(+), 0 deletions(-) > > diff --git a/target-arm/translate.c b/target-arm/translate.c > index 907d73a..c60cd18 100644 > --- a/target-arm/translate.c > +++ b/target-arm/translate.c > @@ -9199,6 +9199,10 @@ static inline void > gen_intermediate_code_internal(CPUState *env, > if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) > gen_io_start(); > > +if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP))) { > +tcg_gen_debug_insn_start(dc->pc); > +} > + > if (dc->thumb) { > disas_thumb_insn(env, dc); > if (dc->condexec_mask) { > -- > 1.7.1 > >
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On 2011-01-12 11:31, Jan Kiszka wrote: > Am 12.01.2011 11:22, Avi Kivity wrote: >> On 01/11/2011 03:54 PM, Anthony Liguori wrote: >>> >>> Right, we should introduce a KVMBus that KVM devices are created on. >>> The devices can get at KVMState through the BusState. >> >> There is no kvm bus in a PC (I looked). We're bending the device model >> here because a device is implemented in the kernel and not in >> userspace. An implementation detail is magnified beyond all proportions. >> >> An ioapic that is implemented by kvm lives in exactly the same place >> that the qemu ioapic lives in. An assigned pci device lives on the PCI >> bus, not a KVMBus. If we need a pointer to KVMState, then we must find >> it elsewhere, not through creating imaginary buses that don't exist. >> > > Exactly. > > So we can either "infect" the whole device tree with kvm (or maybe a > more generic accelerator structure that also deals with Xen) or we need > to pull the reference inside the device's init function from some global > service (kvm_get_state). Note that this topic is still waiting for good suggestions, specifically from those who believe in kvm_state references :). This is not only blocking kvmstate merge but will affect KVM irqchips as well. It boils down to how we reasonably pass a kvm_state reference from machine init code to a sysbus device. I'm probably biased, but I don't see any way that does not work against the idea of confining access to kvm_state or breaks device instantiation from the command line or a config file. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
[Qemu-devel] Re: KVM call agenda for Jan 18
* Chris Wright (chr...@redhat.com) wrote: > Please send in any agenda items you are interested in covering. No agenda, this week's call is cancelled. thanks, -chris
[Qemu-devel] [PATCH] target-arm: Fix garbage collection of temporaries in Neon emulation.
Fix garbage collection of temporaries in Neon emulation. Signed-off-by: Christophe Lyon --- target-arm/translate.c | 22 +- 1 files changed, 17 insertions(+), 5 deletions(-) diff --git a/target-arm/translate.c b/target-arm/translate.c index 57664bc..363351e 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -4176,6 +4176,18 @@ static inline void gen_neon_mull(TCGv_i64 dest, TCGv a, TCGv b, int size, int u) break; default: abort(); } + +/* gen_helper_neon_mull_[su]{8|16} do not free their parameters. + Don't forget to clean them now. */ +switch ((size << 1) | u) { +case 0: +case 1: +case 2: +case 3: + dead_tmp(a); + dead_tmp(b); + break; +} } /* Translate a NEON data processing instruction. Return nonzero if the @@ -4840,7 +4852,7 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn) if (size == 3) { tcg_temp_free_i64(tmp64); } else { -dead_tmp(tmp2); +tcg_temp_free_i32(tmp2); } } else if (op == 10) { /* VSHLL */ @@ -5076,8 +5088,6 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn) case 8: case 9: case 10: case 11: case 12: case 13: /* VMLAL, VQDMLAL, VMLSL, VQDMLSL, VMULL, VQDMULL */ gen_neon_mull(cpu_V0, tmp, tmp2, size, u); -dead_tmp(tmp2); -dead_tmp(tmp); break; case 14: /* Polynomial VMULL */ cpu_abort(env, "Polynomial VMULL not implemented"); @@ -5235,9 +5245,12 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn) tmp = neon_load_reg(rn, 0); } else { tmp = tmp3; + /* tmp2 has been discarded in + gen_neon_mull during pass 0, we need to + recreate it. */ + tmp2 = neon_get_scalar(size, rm); } gen_neon_mull(cpu_V0, tmp, tmp2, size, u); -dead_tmp(tmp); if (op == 6 || op == 7) { gen_neon_negl(cpu_V0, size); } @@ -5264,7 +5277,6 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn) neon_store_reg64(cpu_V0, rd + pass); } -dead_tmp(tmp2); break; default: /* 14 and 15 are RESERVED */ -- 1.7.2.3
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On 01/18/2011 08:28 AM, Jan Kiszka wrote: On 2011-01-12 11:31, Jan Kiszka wrote: Am 12.01.2011 11:22, Avi Kivity wrote: On 01/11/2011 03:54 PM, Anthony Liguori wrote: Right, we should introduce a KVMBus that KVM devices are created on. The devices can get at KVMState through the BusState. There is no kvm bus in a PC (I looked). We're bending the device model here because a device is implemented in the kernel and not in userspace. An implementation detail is magnified beyond all proportions. An ioapic that is implemented by kvm lives in exactly the same place that the qemu ioapic lives in. An assigned pci device lives on the PCI bus, not a KVMBus. If we need a pointer to KVMState, then we must find it elsewhere, not through creating imaginary buses that don't exist. Exactly. So we can either "infect" the whole device tree with kvm (or maybe a more generic accelerator structure that also deals with Xen) or we need to pull the reference inside the device's init function from some global service (kvm_get_state). Note that this topic is still waiting for good suggestions, specifically from those who believe in kvm_state references :). This is not only blocking kvmstate merge but will affect KVM irqchips as well. It boils down to how we reasonably pass a kvm_state reference from machine init code to a sysbus device. I'm probably biased, but I don't see any way that does not work against the idea of confining access to kvm_state or breaks device instantiation from the command line or a config file. A KVM device should sit on a KVM specific bus that hangs off of sysbus. It can get to kvm_state through that bus. That bus doesn't get instantiated through qdev so requiring a pointer argument should not be an issue. Regards, Anthony Liguori Jan
Re: [Qemu-devel] paravirtual tablet v3
On 01/18/2011 07:46 AM, Gerd Hoffmann wrote: Hi, Sure, someone needs to map multitouch to non-multitouch. I'd leave that job to the guest driver tough. Why do you think doing it in the host is better? My assumptions are 1) the host is capable of doing the mapping just as easily as the guest Probably. 2) the host can do something useful with the information that the guest is not multitouch capable. What do you think of here? Imagine a nice looking GUI, you would likely have icons representing the mouse status. You've got at least PS/2 mouse, PS/2 mouse with capture, PV mouse in absolute mode, PV mouse in multitouch mode, etc. VirtualBox has an interface like this FWIW (although not multitouch aware). Regards, Anthony Liguori cheers, Gerd
[Qemu-devel] [sparc] Floating point exception issue
Hi, Recently, I have reported mysterious issues on NetBSD 5.1 emulated on SPARC. The whole first thread is here: http://lists.gnu.org/archive/html/qemu-devel/2011-01/msg01509.html I decided to investigate the problem deeper and with great help from NetBSD folks, I managed to find reproducible test case. Initially, it was AWK command: # echo NaN | awk '{print "test"}' awk: floating point exception 8 source line number 1 and next it boiled down to simple C program (see below). Details of the investigation are archived in the NetBSD Problem Report #44389 here: http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44389 Here is final version of the test program which reproduces the problem: #include #include #include #include int is_number(const char *s) { double r; char *ep; errno = 0; r = strtod(s, &ep); if (r == HUGE_VAL) printf("X:%g\n", r); if (ep == s || r == HUGE_VAL || errno == ERANGE) return 0; while (*ep == ' ' || *ep == '\t' || *ep == '\n') ep++; if (*ep == '\0') return 1; else return 0; } int main(int argc, char **argv) { double v; if (is_number("NaN")) { printf("is a number\n"); v = atof("NaN"); } else { printf("not a number\n"); v = 0.0; } printf("%.4f\n", v); return 0; } On NetBSD/SPARC, the program receives SIGFPE: $ gcc ./nan_test_2.c $ ./a.out [1] Floating point exception (core dumped) ./a.out Specifically, it's caused by r == HUGE_VAL condition in if (ep == s || r == HUGE_VAL || errno == ERANGE) where r is NaN. All the signs indicate there is a bug in QEMU. Best regards, -- Mateusz Loskot, http://mateusz.loskot.net Charter Member of OSGeo, http://osgeo.org Member of ACCU, http://accu.org
Re: [Qemu-devel] [PATCH 2/2] strtosz(): Use suffix macros in switch() statement
On Tue, 2011-01-18 at 10:20 +0100, Markus Armbruster wrote: > jes.soren...@redhat.com writes: > > > From: Jes Sorensen > > > > Signed-off-by: Jes Sorensen > > --- > > cutils.c | 10 +- > > 1 files changed, 5 insertions(+), 5 deletions(-) > > > > diff --git a/cutils.c b/cutils.c > > index 328738c..f2c8bbd 100644 > > --- a/cutils.c > > +++ b/cutils.c > > @@ -324,26 +324,26 @@ ssize_t strtosz_suffix(const char *nptr, char **end, > > const char default_suffix) > > } > > } > > switch (toupper(d)) { > > -case 'B': > > +case STRTOSZ_DEFSUFFIX_B: > > mul = 1; > > if (mul_required) { > > goto fail; > > } > > break; > > -case 'K': > > +case STRTOSZ_DEFSUFFIX_KB: > > mul = 1 << 10; > > break; > > case 0: > > if (mul_required) { > > goto fail; > > } > > -case 'M': > > +case STRTOSZ_DEFSUFFIX_MB: > > mul = 1ULL << 20; > > break; > > -case 'G': > > +case STRTOSZ_DEFSUFFIX_GB: > > mul = 1ULL << 30; > > break; > > -case 'T': > > +case STRTOSZ_DEFSUFFIX_TB: > > mul = 1ULL << 40; > > break; > > default: > > And this improves what? Certainly not clarity. > > In my opinion, the STRTOSZ_DEFSUFFIX_TB are useless chaff. Chacun à son > goût. I prefer it. Better than having 'B'/'K'/'M'/'G'/'T' sprinkled throughout the code. Easier to search for, easier to update, more consistent. Alex
Re: [Qemu-devel] [PATCH] target-arm: Fix garbage collection of temporaries in Neon emulation.
On 18 January 2011 14:34, Christophe Lyon wrote: > + > + /* gen_helper_neon_mull_[su]{8|16} do not free their parameters. > + Don't forget to clean them now. */ > + switch ((size << 1) | u) { > + case 0: > + case 1: > + case 2: > + case 3: > + dead_tmp(a); > + dead_tmp(b); > + break; > + } > } This seems a rather convoluted way to write "if (size < 2) { ... }" > @@ -5235,9 +5245,12 @@ static int disas_neon_data_insn(CPUState * env, > DisasContext *s, uint32_t insn) > tmp = neon_load_reg(rn, 0); > } else { > tmp = tmp3; > + /* tmp2 has been discarded in > + gen_neon_mull during pass 0, we need to > + recreate it. */ > + tmp2 = neon_get_scalar(size, rm); > } I think this will give the wrong results for instructions where the scalar operand is in the same Neon register as the destination for the first pass, because calling neon_get_scalar() again will do a reload from the Neon register and it might have changed. (Also loading it once at the start rather than in every pass is more efficient as well as being correct :-)) Also your patch has hard-coded tabs in it (please see CODING_STYLE on the subject of whitespace) and your mail client or server has line-wrapped long lines in the patch so it doesn't apply cleanly... -- PMM
Re: [Qemu-devel] [PATCH] target-arm: Fix garbage collection of temporaries in Neon emulation.
Incidentally there are some correctness fixes for the multiply-by-scalar neon insns from the qemu-meego tree which are on my list to push upstream. So you probably aren't getting the right results even if you've managed to shut up qemu's warnings :-) -- PMM
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On 2011-01-18 16:04, Anthony Liguori wrote: > On 01/18/2011 08:28 AM, Jan Kiszka wrote: >> On 2011-01-12 11:31, Jan Kiszka wrote: >> >>> Am 12.01.2011 11:22, Avi Kivity wrote: >>> On 01/11/2011 03:54 PM, Anthony Liguori wrote: > Right, we should introduce a KVMBus that KVM devices are created on. > The devices can get at KVMState through the BusState. > There is no kvm bus in a PC (I looked). We're bending the device model here because a device is implemented in the kernel and not in userspace. An implementation detail is magnified beyond all proportions. An ioapic that is implemented by kvm lives in exactly the same place that the qemu ioapic lives in. An assigned pci device lives on the PCI bus, not a KVMBus. If we need a pointer to KVMState, then we must find it elsewhere, not through creating imaginary buses that don't exist. >>> Exactly. >>> >>> So we can either "infect" the whole device tree with kvm (or maybe a >>> more generic accelerator structure that also deals with Xen) or we need >>> to pull the reference inside the device's init function from some global >>> service (kvm_get_state). >>> >> Note that this topic is still waiting for good suggestions, specifically >> from those who believe in kvm_state references :). This is not only >> blocking kvmstate merge but will affect KVM irqchips as well. >> >> It boils down to how we reasonably pass a kvm_state reference from >> machine init code to a sysbus device. I'm probably biased, but I don't >> see any way that does not work against the idea of confining access to >> kvm_state or breaks device instantiation from the command line or a >> config file. >> > > A KVM device should sit on a KVM specific bus that hangs off of sysbus. > It can get to kvm_state through that bus. > > That bus doesn't get instantiated through qdev so requiring a pointer > argument should not be an issue. > This design is in conflict with the requirement to attach KVM-assisted devices also to their home bus, e.g. an assigned PCI device to the PCI bus. We don't support multi-homed qdev devices. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On 01/18/2011 09:43 AM, Jan Kiszka wrote: On 2011-01-18 16:04, Anthony Liguori wrote: On 01/18/2011 08:28 AM, Jan Kiszka wrote: On 2011-01-12 11:31, Jan Kiszka wrote: Am 12.01.2011 11:22, Avi Kivity wrote: On 01/11/2011 03:54 PM, Anthony Liguori wrote: Right, we should introduce a KVMBus that KVM devices are created on. The devices can get at KVMState through the BusState. There is no kvm bus in a PC (I looked). We're bending the device model here because a device is implemented in the kernel and not in userspace. An implementation detail is magnified beyond all proportions. An ioapic that is implemented by kvm lives in exactly the same place that the qemu ioapic lives in. An assigned pci device lives on the PCI bus, not a KVMBus. If we need a pointer to KVMState, then we must find it elsewhere, not through creating imaginary buses that don't exist. Exactly. So we can either "infect" the whole device tree with kvm (or maybe a more generic accelerator structure that also deals with Xen) or we need to pull the reference inside the device's init function from some global service (kvm_get_state). Note that this topic is still waiting for good suggestions, specifically from those who believe in kvm_state references :). This is not only blocking kvmstate merge but will affect KVM irqchips as well. It boils down to how we reasonably pass a kvm_state reference from machine init code to a sysbus device. I'm probably biased, but I don't see any way that does not work against the idea of confining access to kvm_state or breaks device instantiation from the command line or a config file. A KVM device should sit on a KVM specific bus that hangs off of sysbus. It can get to kvm_state through that bus. That bus doesn't get instantiated through qdev so requiring a pointer argument should not be an issue. This design is in conflict with the requirement to attach KVM-assisted devices also to their home bus, e.g. an assigned PCI device to the PCI bus. We don't support multi-homed qdev devices. With vfio, would an assigned PCI device even need kvm_state? Regards, Anthony Liguori Jan
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On 01/18/2011 09:43 AM, Jan Kiszka wrote: On 2011-01-18 16:04, Anthony Liguori wrote: On 01/18/2011 08:28 AM, Jan Kiszka wrote: On 2011-01-12 11:31, Jan Kiszka wrote: Am 12.01.2011 11:22, Avi Kivity wrote: On 01/11/2011 03:54 PM, Anthony Liguori wrote: Right, we should introduce a KVMBus that KVM devices are created on. The devices can get at KVMState through the BusState. There is no kvm bus in a PC (I looked). We're bending the device model here because a device is implemented in the kernel and not in userspace. An implementation detail is magnified beyond all proportions. An ioapic that is implemented by kvm lives in exactly the same place that the qemu ioapic lives in. An assigned pci device lives on the PCI bus, not a KVMBus. If we need a pointer to KVMState, then we must find it elsewhere, not through creating imaginary buses that don't exist. Exactly. So we can either "infect" the whole device tree with kvm (or maybe a more generic accelerator structure that also deals with Xen) or we need to pull the reference inside the device's init function from some global service (kvm_get_state). Note that this topic is still waiting for good suggestions, specifically from those who believe in kvm_state references :). This is not only blocking kvmstate merge but will affect KVM irqchips as well. It boils down to how we reasonably pass a kvm_state reference from machine init code to a sysbus device. I'm probably biased, but I don't see any way that does not work against the idea of confining access to kvm_state or breaks device instantiation from the command line or a config file. A KVM device should sit on a KVM specific bus that hangs off of sysbus. It can get to kvm_state through that bus. That bus doesn't get instantiated through qdev so requiring a pointer argument should not be an issue. This design is in conflict with the requirement to attach KVM-assisted devices also to their home bus, e.g. an assigned PCI device to the PCI bus. We don't support multi-homed qdev devices. The bus topology reflects how I/O flows in and out of a device. We do not model a perfect PC bus architecture and I don't think we ever intend to. Instead, we model a functional architecture. I/O from an assigned device does not flow through the emulated PCI bus. Therefore, it does not belong on the emulated PCI bus. Assigned devices need to interact with the emulated PCI bus, but they shouldn't be children of it. Regards, Anthony Liguori Jan
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On 2011-01-18 16:48, Anthony Liguori wrote: > On 01/18/2011 09:43 AM, Jan Kiszka wrote: >> On 2011-01-18 16:04, Anthony Liguori wrote: >> >>> On 01/18/2011 08:28 AM, Jan Kiszka wrote: >>> On 2011-01-12 11:31, Jan Kiszka wrote: > Am 12.01.2011 11:22, Avi Kivity wrote: > > >> On 01/11/2011 03:54 PM, Anthony Liguori wrote: >> >> >>> Right, we should introduce a KVMBus that KVM devices are created on. >>> The devices can get at KVMState through the BusState. >>> >>> >> There is no kvm bus in a PC (I looked). We're bending the device model >> here because a device is implemented in the kernel and not in >> userspace. An implementation detail is magnified beyond all proportions. >> >> An ioapic that is implemented by kvm lives in exactly the same place >> that the qemu ioapic lives in. An assigned pci device lives on the PCI >> bus, not a KVMBus. If we need a pointer to KVMState, then we must find >> it elsewhere, not through creating imaginary buses that don't exist. >> >> >> > Exactly. > > So we can either "infect" the whole device tree with kvm (or maybe a > more generic accelerator structure that also deals with Xen) or we need > to pull the reference inside the device's init function from some global > service (kvm_get_state). > > Note that this topic is still waiting for good suggestions, specifically from those who believe in kvm_state references :). This is not only blocking kvmstate merge but will affect KVM irqchips as well. It boils down to how we reasonably pass a kvm_state reference from machine init code to a sysbus device. I'm probably biased, but I don't see any way that does not work against the idea of confining access to kvm_state or breaks device instantiation from the command line or a config file. >>> A KVM device should sit on a KVM specific bus that hangs off of sysbus. >>> It can get to kvm_state through that bus. >>> >>> That bus doesn't get instantiated through qdev so requiring a pointer >>> argument should not be an issue. >>> >>> >> This design is in conflict with the requirement to attach KVM-assisted >> devices also to their home bus, e.g. an assigned PCI device to the PCI >> bus. We don't support multi-homed qdev devices. >> > > With vfio, would an assigned PCI device even need kvm_state? IIUC: Yes, for establishing the irqfd link. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On 2011-01-18 16:50, Anthony Liguori wrote: > On 01/18/2011 09:43 AM, Jan Kiszka wrote: >> On 2011-01-18 16:04, Anthony Liguori wrote: >> >>> On 01/18/2011 08:28 AM, Jan Kiszka wrote: >>> On 2011-01-12 11:31, Jan Kiszka wrote: > Am 12.01.2011 11:22, Avi Kivity wrote: > > >> On 01/11/2011 03:54 PM, Anthony Liguori wrote: >> >> >>> Right, we should introduce a KVMBus that KVM devices are created on. >>> The devices can get at KVMState through the BusState. >>> >>> >> There is no kvm bus in a PC (I looked). We're bending the device model >> here because a device is implemented in the kernel and not in >> userspace. An implementation detail is magnified beyond all proportions. >> >> An ioapic that is implemented by kvm lives in exactly the same place >> that the qemu ioapic lives in. An assigned pci device lives on the PCI >> bus, not a KVMBus. If we need a pointer to KVMState, then we must find >> it elsewhere, not through creating imaginary buses that don't exist. >> >> >> > Exactly. > > So we can either "infect" the whole device tree with kvm (or maybe a > more generic accelerator structure that also deals with Xen) or we need > to pull the reference inside the device's init function from some global > service (kvm_get_state). > > Note that this topic is still waiting for good suggestions, specifically from those who believe in kvm_state references :). This is not only blocking kvmstate merge but will affect KVM irqchips as well. It boils down to how we reasonably pass a kvm_state reference from machine init code to a sysbus device. I'm probably biased, but I don't see any way that does not work against the idea of confining access to kvm_state or breaks device instantiation from the command line or a config file. >>> A KVM device should sit on a KVM specific bus that hangs off of sysbus. >>> It can get to kvm_state through that bus. >>> >>> That bus doesn't get instantiated through qdev so requiring a pointer >>> argument should not be an issue. >>> >>> >> This design is in conflict with the requirement to attach KVM-assisted >> devices also to their home bus, e.g. an assigned PCI device to the PCI >> bus. We don't support multi-homed qdev devices. >> > > The bus topology reflects how I/O flows in and out of a device. We do > not model a perfect PC bus architecture and I don't think we ever intend > to. Instead, we model a functional architecture. > > I/O from an assigned device does not flow through the emulated PCI bus. > Therefore, it does not belong on the emulated PCI bus. > > Assigned devices need to interact with the emulated PCI bus, but they > shouldn't be children of it. You should be able to find assigned devices on some PCI bus, so you either have to hack up the existing bus to host devices that are, on the other side, not part of it or branch off a pci-kvm sub-bus, just like you would have to create a sysbus-kvm. I guess, if at all, we want the latter. Is that acceptable for everyone? Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On 01/18/2011 10:01 AM, Jan Kiszka wrote: On 2011-01-18 16:50, Anthony Liguori wrote: On 01/18/2011 09:43 AM, Jan Kiszka wrote: On 2011-01-18 16:04, Anthony Liguori wrote: On 01/18/2011 08:28 AM, Jan Kiszka wrote: On 2011-01-12 11:31, Jan Kiszka wrote: Am 12.01.2011 11:22, Avi Kivity wrote: On 01/11/2011 03:54 PM, Anthony Liguori wrote: Right, we should introduce a KVMBus that KVM devices are created on. The devices can get at KVMState through the BusState. There is no kvm bus in a PC (I looked). We're bending the device model here because a device is implemented in the kernel and not in userspace. An implementation detail is magnified beyond all proportions. An ioapic that is implemented by kvm lives in exactly the same place that the qemu ioapic lives in. An assigned pci device lives on the PCI bus, not a KVMBus. If we need a pointer to KVMState, then we must find it elsewhere, not through creating imaginary buses that don't exist. Exactly. So we can either "infect" the whole device tree with kvm (or maybe a more generic accelerator structure that also deals with Xen) or we need to pull the reference inside the device's init function from some global service (kvm_get_state). Note that this topic is still waiting for good suggestions, specifically from those who believe in kvm_state references :). This is not only blocking kvmstate merge but will affect KVM irqchips as well. It boils down to how we reasonably pass a kvm_state reference from machine init code to a sysbus device. I'm probably biased, but I don't see any way that does not work against the idea of confining access to kvm_state or breaks device instantiation from the command line or a config file. A KVM device should sit on a KVM specific bus that hangs off of sysbus. It can get to kvm_state through that bus. That bus doesn't get instantiated through qdev so requiring a pointer argument should not be an issue. This design is in conflict with the requirement to attach KVM-assisted devices also to their home bus, e.g. an assigned PCI device to the PCI bus. We don't support multi-homed qdev devices. The bus topology reflects how I/O flows in and out of a device. We do not model a perfect PC bus architecture and I don't think we ever intend to. Instead, we model a functional architecture. I/O from an assigned device does not flow through the emulated PCI bus. Therefore, it does not belong on the emulated PCI bus. Assigned devices need to interact with the emulated PCI bus, but they shouldn't be children of it. You should be able to find assigned devices on some PCI bus, so you either have to hack up the existing bus to host devices that are, on the other side, not part of it or branch off a pci-kvm sub-bus, just like you would have to create a sysbus-kvm. Management tools should never transverse the device tree to find devices. This is a recipe for disaster in the long term because the device tree will not remain stable. So yes, a management tool should be able to enumerate assigned devices as they would enumerate any other PCI device but that has almost nothing to do with what the tree layout is. Regards, Anthony Liguori I guess, if at all, we want the latter. Is that acceptable for everyone? Jan
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On 2011-01-18 17:04, Anthony Liguori wrote: > A KVM device should sit on a KVM specific bus that hangs off of > sysbus. > It can get to kvm_state through that bus. > > That bus doesn't get instantiated through qdev so requiring a pointer > argument should not be an issue. > > > This design is in conflict with the requirement to attach KVM-assisted devices also to their home bus, e.g. an assigned PCI device to the PCI bus. We don't support multi-homed qdev devices. >>> The bus topology reflects how I/O flows in and out of a device. We do >>> not model a perfect PC bus architecture and I don't think we ever intend >>> to. Instead, we model a functional architecture. >>> >>> I/O from an assigned device does not flow through the emulated PCI bus. >>> Therefore, it does not belong on the emulated PCI bus. >>> >>> Assigned devices need to interact with the emulated PCI bus, but they >>> shouldn't be children of it. >>> >> You should be able to find assigned devices on some PCI bus, so you >> either have to hack up the existing bus to host devices that are, on the >> other side, not part of it or branch off a pci-kvm sub-bus, just like >> you would have to create a sysbus-kvm. > > Management tools should never transverse the device tree to find > devices. This is a recipe for disaster in the long term because the > device tree will not remain stable. > > So yes, a management tool should be able to enumerate assigned devices > as they would enumerate any other PCI device but that has almost nothing > to do with what the tree layout is. I'm probably misunderstanding you, but if the bus topology as the guest sees it is not properly reflected in an object tree on the qemu side, we are creating hacks again. Management and analysis tools must be able to traverse the system buses and find guest devices this way. If they create a device on bus X, it must never end up on bus Y just because it happens to be KVM-assisted or has some other property. On the other hand, trying to hide this dependency will likely cause severe damage to the qdev design. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
[Qemu-devel] [PATCH] pxa2xx_lcd: restore updating of display
Recently PXA2xx lcd have stopped to be updated incrementally (picture frozen). This patch fixes that by passing non min/max x/y, but rather (correctly) x/y and w/h. Signed-off-by: Dmitry Eremin-Solenikov --- hw/pxa2xx_lcd.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/pxa2xx_lcd.c b/hw/pxa2xx_lcd.c index 1f2211a..5b2b07e 100644 --- a/hw/pxa2xx_lcd.c +++ b/hw/pxa2xx_lcd.c @@ -796,9 +796,9 @@ static void pxa2xx_update_display(void *opaque) if (miny >= 0) { if (s->orientation) -dpy_update(s->ds, miny, 0, maxy, s->xres); +dpy_update(s->ds, miny, 0, maxy - miny, s->xres); else -dpy_update(s->ds, 0, miny, s->xres, maxy); +dpy_update(s->ds, 0, miny, s->xres, maxy - miny); } pxa2xx_lcdc_int_update(s); -- 1.7.2.3
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On 01/18/2011 10:17 AM, Jan Kiszka wrote: On 2011-01-18 17:04, Anthony Liguori wrote: A KVM device should sit on a KVM specific bus that hangs off of sysbus. It can get to kvm_state through that bus. That bus doesn't get instantiated through qdev so requiring a pointer argument should not be an issue. This design is in conflict with the requirement to attach KVM-assisted devices also to their home bus, e.g. an assigned PCI device to the PCI bus. We don't support multi-homed qdev devices. The bus topology reflects how I/O flows in and out of a device. We do not model a perfect PC bus architecture and I don't think we ever intend to. Instead, we model a functional architecture. I/O from an assigned device does not flow through the emulated PCI bus. Therefore, it does not belong on the emulated PCI bus. Assigned devices need to interact with the emulated PCI bus, but they shouldn't be children of it. You should be able to find assigned devices on some PCI bus, so you either have to hack up the existing bus to host devices that are, on the other side, not part of it or branch off a pci-kvm sub-bus, just like you would have to create a sysbus-kvm. Management tools should never transverse the device tree to find devices. This is a recipe for disaster in the long term because the device tree will not remain stable. So yes, a management tool should be able to enumerate assigned devices as they would enumerate any other PCI device but that has almost nothing to do with what the tree layout is. I'm probably misunderstanding you, but if the bus topology as the guest sees it is not properly reflected in an object tree on the qemu side, we are creating hacks again. There is no such thing as the "bus topology as the guest sees it". The guest just sees a bunch of devices. The guest can only infer things like ISA busses. The guest sees a bunch of devices: an i8254, i8259, RTC, etc. Whether those devices are on an ISA bus, and LPC bus, or all in a SuperI/O chip that's part of the southbridge is all invisible to the guest. The device model topology is 100% a hidden architectural detail. Management and analysis tools must be able to traverse the system buses and find guest devices this way. We need to provide a compatible interface to the guest. If you agree with my above statements, then you'll also agree that we can do this without keeping the device model topology stable. But we also need to provide a compatible interface to management tools. Exposing the device model topology as a compatible interface artificially limits us. It's far better to provide higher level supported interfaces to give us the flexibility to change the device model as we need to. If they create a device on bus X, it must never end up on bus Y just because it happens to be KVM-assisted or has some other property. Nope. This is exactly what should happen. 90% of the devices in the device model are not created by management tools. They're part of a chipset. The chipset has well defined extension points and we provide management interfaces to create devices on those extension points. That is, interfaces to create PCI devices. Regards, Anthony Liguori On the other hand, trying to hide this dependency will likely cause severe damage to the qdev design. Jan
Re: [Qemu-devel] [PATCH 2/2] strtosz(): Use suffix macros in switch() statement
On 01/18/2011 03:20 AM, Markus Armbruster wrote: jes.soren...@redhat.com writes: From: Jes Sorensen Signed-off-by: Jes Sorensen --- cutils.c | 10 +- 1 files changed, 5 insertions(+), 5 deletions(-) diff --git a/cutils.c b/cutils.c index 328738c..f2c8bbd 100644 --- a/cutils.c +++ b/cutils.c @@ -324,26 +324,26 @@ ssize_t strtosz_suffix(const char *nptr, char **end, const char default_suffix) } } switch (toupper(d)) { -case 'B': +case STRTOSZ_DEFSUFFIX_B: mul = 1; if (mul_required) { goto fail; } break; -case 'K': +case STRTOSZ_DEFSUFFIX_KB: mul = 1<< 10; break; case 0: if (mul_required) { goto fail; } -case 'M': +case STRTOSZ_DEFSUFFIX_MB: mul = 1ULL<< 20; break; -case 'G': +case STRTOSZ_DEFSUFFIX_GB: mul = 1ULL<< 30; break; -case 'T': +case STRTOSZ_DEFSUFFIX_TB: mul = 1ULL<< 40; break; default: And this improves what? Certainly not clarity. In my opinion, the STRTOSZ_DEFSUFFIX_TB are useless chaff. Chacun à son goût. Yeah, I have to agree. I'm not of the literals are evil camp. BTW, a useful change would be to accept both upper and lower case letters. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH 2/2] strtosz(): Use suffix macros in switch() statement
On 01/18/11 17:50, Anthony Liguori wrote: > On 01/18/2011 03:20 AM, Markus Armbruster wrote: >> jes.soren...@redhat.com writes: >> >> >>> From: Jes Sorensen >>> >>> Signed-off-by: Jes Sorensen >>> --- >>> cutils.c | 10 +- >>> 1 files changed, 5 insertions(+), 5 deletions(-) >>> >>> diff --git a/cutils.c b/cutils.c >>> index 328738c..f2c8bbd 100644 >>> --- a/cutils.c >>> +++ b/cutils.c >>> @@ -324,26 +324,26 @@ ssize_t strtosz_suffix(const char *nptr, char >>> **end, const char default_suffix) >>> } >>> } >>> switch (toupper(d)) { ^^^ >>> >> And this improves what? Certainly not clarity. >> >> In my opinion, the STRTOSZ_DEFSUFFIX_TB are useless chaff. Chacun à son >> goût. >> > > Yeah, I have to agree. I'm not of the literals are evil camp. > > BTW, a useful change would be to accept both upper and lower case letters. It already supports both, it's handle in the toupper() call. Cheers, Jes
Re: [Qemu-devel] [PATCH 2/2] strtosz(): Use suffix macros in switch() statement
On 01/18/2011 09:50 AM, Anthony Liguori wrote: >>> @@ -324,26 +324,26 @@ ssize_t strtosz_suffix(const char *nptr, char >>> **end, const char default_suffix) >>> } >>> } >>> switch (toupper(d)) { > BTW, a useful change would be to accept both upper and lower case letters. And it does, via the toupper() added earlier in the series (and which has separately been pointed out that using qemu_toupper() might be nicer). -- Eric Blake ebl...@redhat.com+1-801-349-2682 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On 2011-01-18 17:37, Anthony Liguori wrote: > On 01/18/2011 10:17 AM, Jan Kiszka wrote: >> On 2011-01-18 17:04, Anthony Liguori wrote: >> >>> A KVM device should sit on a KVM specific bus that hangs off of >>> sysbus. >>> It can get to kvm_state through that bus. >>> >>> That bus doesn't get instantiated through qdev so requiring a pointer >>> argument should not be an issue. >>> >>> >>> >>> >> This design is in conflict with the requirement to attach KVM-assisted >> devices also to their home bus, e.g. an assigned PCI device to the PCI >> bus. We don't support multi-homed qdev devices. >> >> >> > The bus topology reflects how I/O flows in and out of a device. We do > not model a perfect PC bus architecture and I don't think we ever intend > to. Instead, we model a functional architecture. > > I/O from an assigned device does not flow through the emulated PCI bus. > Therefore, it does not belong on the emulated PCI bus. > > Assigned devices need to interact with the emulated PCI bus, but they > shouldn't be children of it. > > You should be able to find assigned devices on some PCI bus, so you either have to hack up the existing bus to host devices that are, on the other side, not part of it or branch off a pci-kvm sub-bus, just like you would have to create a sysbus-kvm. >>> Management tools should never transverse the device tree to find >>> devices. This is a recipe for disaster in the long term because the >>> device tree will not remain stable. >>> >>> So yes, a management tool should be able to enumerate assigned devices >>> as they would enumerate any other PCI device but that has almost nothing >>> to do with what the tree layout is. >>> >> I'm probably misunderstanding you, but if the bus topology as the guest >> sees it is not properly reflected in an object tree on the qemu side, we >> are creating hacks again. >> > > There is no such thing as the "bus topology as the guest sees it". > > The guest just sees a bunch of devices. The guest can only infer things > like ISA busses. The guest sees a bunch of devices: an i8254, i8259, > RTC, etc. Whether those devices are on an ISA bus, and LPC bus, or all > in a SuperI/O chip that's part of the southbridge is all invisible to > the guest. > > The device model topology is 100% a hidden architectural detail. This is true for the sysbus, it is obviously not the case for PCI and similarly discoverable buses. There we have a guest-explorable topology that is currently equivalent to the the qdev layout. > >> Management and analysis tools must be able to traverse the system buses >> and find guest devices this way. > > We need to provide a compatible interface to the guest. If you agree > with my above statements, then you'll also agree that we can do this > without keeping the device model topology stable. > > But we also need to provide a compatible interface to management tools. > Exposing the device model topology as a compatible interface > artificially limits us. It's far better to provide higher level > supported interfaces to give us the flexibility to change the device > model as we need to. How do you want to change qdev to keep the guest and management tool view stable while branching off kvm sub-buses? Please propose such extensions so that they can be discussed. IIUC, that would be second relation between qdev and qbus instances besides the physical topology. What further use cases (besides passing kvm_state around) do you have in mind? > >> If they create a device on bus X, it >> must never end up on bus Y just because it happens to be KVM-assisted or >> has some other property. > > Nope. This is exactly what should happen. > > 90% of the devices in the device model are not created by management > tools. They're part of a chipset. The chipset has well defined > extension points and we provide management interfaces to create devices > on those extension points. That is, interfaces to create PCI devices. > Creating kvm irqchips via the management tool would be one simple way (not the only one, though) to enable/disable kvm assistance for those devices. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On Tue, 2011-01-18 at 16:54 +0100, Jan Kiszka wrote: > On 2011-01-18 16:48, Anthony Liguori wrote: > > On 01/18/2011 09:43 AM, Jan Kiszka wrote: > >> On 2011-01-18 16:04, Anthony Liguori wrote: > >> > >>> On 01/18/2011 08:28 AM, Jan Kiszka wrote: > >>> > On 2011-01-12 11:31, Jan Kiszka wrote: > > > > Am 12.01.2011 11:22, Avi Kivity wrote: > > > > > >> On 01/11/2011 03:54 PM, Anthony Liguori wrote: > >> > >> > >>> Right, we should introduce a KVMBus that KVM devices are created on. > >>> The devices can get at KVMState through the BusState. > >>> > >>> > >> There is no kvm bus in a PC (I looked). We're bending the device model > >> here because a device is implemented in the kernel and not in > >> userspace. An implementation detail is magnified beyond all > >> proportions. > >> > >> An ioapic that is implemented by kvm lives in exactly the same place > >> that the qemu ioapic lives in. An assigned pci device lives on the PCI > >> bus, not a KVMBus. If we need a pointer to KVMState, then we must find > >> it elsewhere, not through creating imaginary buses that don't exist. > >> > >> > >> > > Exactly. > > > > So we can either "infect" the whole device tree with kvm (or maybe a > > more generic accelerator structure that also deals with Xen) or we need > > to pull the reference inside the device's init function from some global > > service (kvm_get_state). > > > > > Note that this topic is still waiting for good suggestions, specifically > from those who believe in kvm_state references :). This is not only > blocking kvmstate merge but will affect KVM irqchips as well. > > It boils down to how we reasonably pass a kvm_state reference from > machine init code to a sysbus device. I'm probably biased, but I don't > see any way that does not work against the idea of confining access to > kvm_state or breaks device instantiation from the command line or a > config file. > > > >>> A KVM device should sit on a KVM specific bus that hangs off of sysbus. > >>> It can get to kvm_state through that bus. > >>> > >>> That bus doesn't get instantiated through qdev so requiring a pointer > >>> argument should not be an issue. > >>> > >>> > >> This design is in conflict with the requirement to attach KVM-assisted > >> devices also to their home bus, e.g. an assigned PCI device to the PCI > >> bus. We don't support multi-homed qdev devices. > >> > > > > With vfio, would an assigned PCI device even need kvm_state? > > IIUC: Yes, for establishing the irqfd link. We abstract this through the msi/msix layer though, so the vfio driver doesn't directly know anything about kvm_state. Alex
Re: [Qemu-devel] [PATCH] target-arm: Fix garbage collection of temporaries in Neon emulation.
On 18.01.2011 16:26, Peter Maydell wrote: > On 18 January 2011 14:34, Christophe Lyon wrote: >> + >> +/* gen_helper_neon_mull_[su]{8|16} do not free their parameters. >> + Don't forget to clean them now. */ >> +switch ((size << 1) | u) { >> +case 0: >> +case 1: >> +case 2: >> +case 3: >> + dead_tmp(a); >> + dead_tmp(b); >> + break; >> +} >> } > > This seems a rather convoluted way to write "if (size < 2) { ... }" > It was for consistency/readability with the preceding paragraph. >> @@ -5235,9 +5245,12 @@ static int disas_neon_data_insn(CPUState * env, >> DisasContext *s, uint32_t insn) >> tmp = neon_load_reg(rn, 0); >> } else { >> tmp = tmp3; >> + /* tmp2 has been discarded in >> + gen_neon_mull during pass 0, we need to >> + recreate it. */ >> + tmp2 = neon_get_scalar(size, rm); >> } > > I think this will give the wrong results for instructions where the > scalar operand is in the same Neon register as the destination > for the first pass, because calling neon_get_scalar() again will > do a reload from the Neon register and it might have changed. > (Also loading it once at the start rather than in every pass is > more efficient as well as being correct :-)) I agree it's more efficient, but as the temporary is freed by gen_neon_mull, how can I make an efficient copy? If we decide not to free the temporary in gen_mul[us]_i64_i32, we'll have to make sure clean up is performed correctly in many places. > Also your patch has hard-coded tabs in it (please see > CODING_STYLE on the subject of whitespace) and your > mail client or server has line-wrapped long lines in the patch > so it doesn't apply cleanly... Sorry, I know we have some trouble with the mail client or server. Is it possible to send patches as attachments on this list?
Re: [Qemu-devel] [V3 PATCH 2/8] virtio-9p: Provide chroot environment server side interfaces
On Tue, Jan 18, 2011 at 6:25 AM, M. Mohan Kumar wrote: > Implement chroot server side interfaces like sending the file > descriptor to qemu process, reading the object request from socket etc. > Also add chroot main function and other helper routines. > > Signed-off-by: M. Mohan Kumar > --- > Makefile.objs | 1 + > hw/9pfs/virtio-9p-chroot.c | 183 > > hw/9pfs/virtio-9p-chroot.h | 39 + > hw/9pfs/virtio-9p.c | 23 ++ > hw/file-op-9p.h | 2 + > 5 files changed, 248 insertions(+), 0 deletions(-) > create mode 100644 hw/9pfs/virtio-9p-chroot.c > create mode 100644 hw/9pfs/virtio-9p-chroot.h > > diff --git a/Makefile.objs b/Makefile.objs > index bc0344c..3007b6d 100644 > --- a/Makefile.objs > +++ b/Makefile.objs > @@ -273,6 +273,7 @@ hw-obj-$(CONFIG_SOUND) += $(sound-obj-y) > 9pfs-nested-$(CONFIG_VIRTFS) = virtio-9p-debug.o > 9pfs-nested-$(CONFIG_VIRTFS) += virtio-9p-local.o virtio-9p-xattr.o > 9pfs-nested-$(CONFIG_VIRTFS) += virtio-9p-xattr-user.o > virtio-9p-posix-acl.o > +9pfs-nested-$(CONFIG_VIRTFS) += virtio-9p-chroot.o > > hw-obj-$(CONFIG_REALLY_VIRTFS) += $(addprefix 9pfs/, $(9pfs-nested-y)) > $(addprefix 9pfs/, $(9pfs-nested-y)): CFLAGS += -I$(SRC_PATH)/hw/ > diff --git a/hw/9pfs/virtio-9p-chroot.c b/hw/9pfs/virtio-9p-chroot.c > new file mode 100644 > index 000..dcde2cc > --- /dev/null > +++ b/hw/9pfs/virtio-9p-chroot.c > @@ -0,0 +1,183 @@ > +/* > + * Virtio 9p chroot environment for contained access to exported path > + * > + * Copyright IBM, Corp. 2011 > + * > + * Authors: > + * M. Mohan Kumar > + * > + * This work is licensed under the terms of the GNU GPL, version 2. See > + * the copying file in the top-level directory > + * > + */ > + > +#include > +#include > +#include > +#include "virtio.h" > +#include "qemu_socket.h" > +#include "qemu-thread.h" > +#include "qerror.h" > +#include "virtio-9p.h" > +#include "virtio-9p-chroot.h" > + > +/* > + * Structure used by chroot functions to transmit file descriptor and > + * error info > + */ > +typedef struct { > + int fi_fd; > + int fi_error; > +} FdInfo; > + > +union MsgControl { > + struct cmsghdr cmsg; > + char control[CMSG_SPACE(sizeof(int))]; > +}; > + > +/* Send file descriptor and error status to qemu process */ > +static int chroot_sendfd(int sockfd, FdInfo *fd_info) const FdInfo *? > +{ > + struct msghdr msg = { }; > + struct iovec iov; > + struct cmsghdr *cmsg; > + int retval; > + union MsgControl msg_control; > + > + iov.iov_base = fd_info; > + iov.iov_len = sizeof(*fd_info); > + > + memset(&msg, 0, sizeof(msg)); > + msg.msg_iov = &iov; > + msg.msg_iovlen = 1; > + /* No ancillary data on error */ > + if (!fd_info->fi_error) { > + msg.msg_control = &msg_control; > + msg.msg_controllen = sizeof(msg_control); > + > + cmsg = &msg_control.cmsg; > + cmsg->cmsg_len = CMSG_LEN(sizeof(fd_info->fi_fd)); > + cmsg->cmsg_level = SOL_SOCKET; > + cmsg->cmsg_type = SCM_RIGHTS; > + memcpy(CMSG_DATA(cmsg), &fd_info->fi_fd, sizeof(fd_info->fi_fd)); > + } > + retval = sendmsg(sockfd, &msg, 0); > + close(fd_info->fi_fd); > + return retval; > +} > + > +/* Read V9fsFileObjectRequest written by QEMU process */ > +static void chroot_read_request(int sockfd, V9fsFileObjectRequest *request) > +{ > + int retval; > + retval = qemu_read_full(sockfd, request, sizeof(request->data)); > + if (retval <= 0 || request->data.path_len <= 0) { > + _exit(-1); > + } > + request->path.path = qemu_mallocz(request->data.path_len + 1); > + retval = qemu_read_full(sockfd, (void *)request->path.path, > + request->data.path_len); > + if (retval <= 0) { > + _exit(-1); > + } > + if (request->data.oldpath_len > 0) { > + request->path.old_path = > + qemu_mallocz(request->data.oldpath_len + 1); > + retval = qemu_read_full(sockfd, (void *)request->path.old_path, > + request->data.oldpath_len); > + if (retval <= 0) { > + _exit(-1); > + } > + } > +} > + > +static int chroot_daemonize(int chroot_sock) > +{ > + sigset_t sigset; > + struct rlimit nr_fd; > + int fd; > + > + /* Block all signals for this process */ > + sigprocmask(SIG_SETMASK, &sigset, NULL); > + > + /* Daemonize */ > + if (setsid() < 0) { > + error_report("setsid %s", strerror(errno)); > + return -1; > + } > + > + /* Close other file descriptors */ > + getrlimit(RLIMIT_NOFILE, &nr_fd); > + for (fd = 0; fd < nr_fd.rlim_cur; fd++) { > + if (fd != chroot_sock) { > + close(fd); > + } > + } > + > + /* Create files with mode as requested by client */ > + umask(0); > + return 0; > +} > + > +static void chroot_dummy(void) > +{ > + (void)chroot_sendfd; > +} > + > +/* > + * Fork a pr
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On 2011-01-18 18:02, Alex Williamson wrote: > On Tue, 2011-01-18 at 16:54 +0100, Jan Kiszka wrote: >> On 2011-01-18 16:48, Anthony Liguori wrote: >>> On 01/18/2011 09:43 AM, Jan Kiszka wrote: On 2011-01-18 16:04, Anthony Liguori wrote: > On 01/18/2011 08:28 AM, Jan Kiszka wrote: > >> On 2011-01-12 11:31, Jan Kiszka wrote: >> >> >>> Am 12.01.2011 11:22, Avi Kivity wrote: >>> >>> On 01/11/2011 03:54 PM, Anthony Liguori wrote: > Right, we should introduce a KVMBus that KVM devices are created on. > The devices can get at KVMState through the BusState. > > There is no kvm bus in a PC (I looked). We're bending the device model here because a device is implemented in the kernel and not in userspace. An implementation detail is magnified beyond all proportions. An ioapic that is implemented by kvm lives in exactly the same place that the qemu ioapic lives in. An assigned pci device lives on the PCI bus, not a KVMBus. If we need a pointer to KVMState, then we must find it elsewhere, not through creating imaginary buses that don't exist. >>> Exactly. >>> >>> So we can either "infect" the whole device tree with kvm (or maybe a >>> more generic accelerator structure that also deals with Xen) or we need >>> to pull the reference inside the device's init function from some global >>> service (kvm_get_state). >>> >>> >> Note that this topic is still waiting for good suggestions, specifically >> from those who believe in kvm_state references :). This is not only >> blocking kvmstate merge but will affect KVM irqchips as well. >> >> It boils down to how we reasonably pass a kvm_state reference from >> machine init code to a sysbus device. I'm probably biased, but I don't >> see any way that does not work against the idea of confining access to >> kvm_state or breaks device instantiation from the command line or a >> config file. >> >> > A KVM device should sit on a KVM specific bus that hangs off of sysbus. > It can get to kvm_state through that bus. > > That bus doesn't get instantiated through qdev so requiring a pointer > argument should not be an issue. > > This design is in conflict with the requirement to attach KVM-assisted devices also to their home bus, e.g. an assigned PCI device to the PCI bus. We don't support multi-homed qdev devices. >>> >>> With vfio, would an assigned PCI device even need kvm_state? >> >> IIUC: Yes, for establishing the irqfd link. > > We abstract this through the msi/msix layer though, so the vfio driver > doesn't directly know anything about kvm_state. Which version/tree are you referring to? It wasn't the case in the last version I found on the list. Does the msi layer use irqfd for every source in kvm mode then? Of course, the key question will be how that layer will once obtain kvm_state. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On 01/18/2011 10:56 AM, Jan Kiszka wrote: The device model topology is 100% a hidden architectural detail. This is true for the sysbus, it is obviously not the case for PCI and similarly discoverable buses. There we have a guest-explorable topology that is currently equivalent to the the qdev layout. But we also don't do PCI passthrough so we really haven't even explored how that maps in qdev. I don't know if qemu-kvm has attempted to qdev-ify it. Management and analysis tools must be able to traverse the system buses and find guest devices this way. We need to provide a compatible interface to the guest. If you agree with my above statements, then you'll also agree that we can do this without keeping the device model topology stable. But we also need to provide a compatible interface to management tools. Exposing the device model topology as a compatible interface artificially limits us. It's far better to provide higher level supported interfaces to give us the flexibility to change the device model as we need to. How do you want to change qdev to keep the guest and management tool view stable while branching off kvm sub-buses? The qdev device model is not a stable interface. I think that's been clear from the very beginning. Please propose such extensions so that they can be discussed. IIUC, that would be second relation between qdev and qbus instances besides the physical topology. What further use cases (besides passing kvm_state around) do you have in mind? The -device interface is a stable interface. Right now, you don't specify any type of identifier of the pci bus when you create a PCI device. It's implied in the interface. If they create a device on bus X, it must never end up on bus Y just because it happens to be KVM-assisted or has some other property. Nope. This is exactly what should happen. 90% of the devices in the device model are not created by management tools. They're part of a chipset. The chipset has well defined extension points and we provide management interfaces to create devices on those extension points. That is, interfaces to create PCI devices. Creating kvm irqchips via the management tool would be one simple way (not the only one, though) to enable/disable kvm assistance for those devices. It's automatically created as part of the CPUs or as part of the chipset. How to enable/disable kvm assistance is a property of the CPU and/or chipset. Regards, Anthony Liguori Jan
Re: [Qemu-devel] [V3 PATCH 5/8] virtio-9p: Create support in chroot environment
On Tue, Jan 18, 2011 at 6:25 AM, M. Mohan Kumar wrote: > Add both server & client side interfaces to create regular files in > chroot environment > > Signed-off-by: M. Mohan Kumar > --- > hw/9pfs/virtio-9p-chroot.c | 42 ++ > hw/9pfs/virtio-9p-local.c | 22 -- > 2 files changed, 62 insertions(+), 2 deletions(-) > > diff --git a/hw/9pfs/virtio-9p-chroot.c b/hw/9pfs/virtio-9p-chroot.c > index b599e23..e7f85e2 100644 > --- a/hw/9pfs/virtio-9p-chroot.c > +++ b/hw/9pfs/virtio-9p-chroot.c > @@ -193,6 +193,42 @@ static void chroot_do_open(V9fsFileObjectRequest > *request, FdInfo *fd_info) > } > } > > +/* > + * Helper routine to create a file and return the file descriptor and > + * error status in FdInfo structure. > + */ > +static void chroot_do_create(V9fsFileObjectRequest *request, FdInfo *fd_info) > +{ > + int cur_uid, cur_gid; uid_t cur_uid; gid_t cur_gid; > + > + cur_uid = geteuid(); > + cur_gid = getegid(); > + > + fd_info->fi_fd = -1; > + > + if (setfsuid(request->data.uid) < 0) { > + fd_info->fi_error = errno; > + return; > + } > + if (setfsgid(request->data.gid) < 0) { > + fd_info->fi_error = errno; > + goto unset_uid; > + } > + > + fd_info->fi_fd = open(request->path.path, request->data.flags, > + request->data.mode); > + > + if (fd_info->fi_fd < 0) { > + fd_info->fi_error = errno; > + } else { > + fd_info->fi_error = 0; > + } > + > + setfsgid(cur_gid); > +unset_uid: > + setfsuid(cur_uid); > +} > + > static int chroot_daemonize(int chroot_sock) > { > sigset_t sigset; > @@ -276,6 +312,12 @@ int v9fs_chroot(FsContext *fs_ctx) > error = -2; > } > break; > + case T_CREATE: > + chroot_do_create(&request, &fd_info); > + if (chroot_sendfd(chroot_sock, &fd_info) <= 0) { > + error = -2; > + } > + break; > default: > break; > } > diff --git a/hw/9pfs/virtio-9p-local.c b/hw/9pfs/virtio-9p-local.c > index 2376ec2..7f39b40 100644 > --- a/hw/9pfs/virtio-9p-local.c > +++ b/hw/9pfs/virtio-9p-local.c > @@ -52,6 +52,23 @@ static int __open(FsContext *fs_ctx, const char *path, int > flags) > return fd; > } > > +static int __create(FsContext *fs_ctx, const char *path, int flags, Please don't use identifiers starting with underscores.
Re: [Qemu-devel] [V3 PATCH 6/8] virtio-9p: Support for creating special files
On Tue, Jan 18, 2011 at 6:26 AM, M. Mohan Kumar wrote: > Add both server and client side interfaces to create special files > (directory, device nodes, links and symbolic links) > > Signed-off-by: M. Mohan Kumar > --- > hw/9pfs/virtio-9p-chroot.c | 84 ++- > hw/9pfs/virtio-9p-chroot.h | 2 + > hw/9pfs/virtio-9p-local.c | 141 > ++-- > 3 files changed, 195 insertions(+), 32 deletions(-) > > diff --git a/hw/9pfs/virtio-9p-chroot.c b/hw/9pfs/virtio-9p-chroot.c > index e7f85e2..92a4917 100644 > --- a/hw/9pfs/virtio-9p-chroot.c > +++ b/hw/9pfs/virtio-9p-chroot.c > @@ -193,6 +193,29 @@ static void chroot_do_open(V9fsFileObjectRequest > *request, FdInfo *fd_info) > } > } > > +int v9fs_create_special(FsContext *fs_ctx, > + V9fsFileObjectRequest *request, int *error) > +{ > + int retval; > + > + pthread_mutex_lock(&fs_ctx->chroot_mutex); > + > + *error = 0; > + v9fs_write_request(fs_ctx->chroot_socket, request); > + retval = qemu_read_full(fs_ctx->chroot_socket, error, sizeof(*error)); > + if (retval != sizeof(*error)) { > + error_report("reading from socket failed: %s", strerror(errno)); > + exit(1); > + } > + > + pthread_mutex_unlock(&fs_ctx->chroot_mutex); > + if (*error) { > + return -1; > + } else { > + return 0; > + } > +} > + > /* > * Helper routine to create a file and return the file descriptor and > * error status in FdInfo structure. > @@ -229,6 +252,56 @@ unset_uid: > setfsuid(cur_uid); > } > > +/* > + * Create directory, symbolic link, link, device node and regular files > + * Similar to create, but it does not return the fd of created object > + * Returns 0 on success, returns errno on failure > + */ > +static int chroot_do_create_special(V9fsFileObjectRequest *request) > +{ > + int cur_uid, cur_gid; uid_t cur_uid; gid_t cur_gid; > + int retval, error; > + > + cur_uid = geteuid(); > + cur_gid = getegid(); > + > + if (setfsuid(request->data.uid) < 0) { > + return errno; > + } > + if (setfsgid(request->data.gid) < 0) { > + error = errno; > + goto unset_uid; > + } > + > + switch (request->data.type) { > + case T_MKDIR: > + retval = mkdir(request->path.path, request->data.mode); > + break; > + case T_SYMLINK: > + retval = symlink(request->path.old_path, request->path.path); > + break; > + case T_LINK: > + retval = link(request->path.old_path, request->path.path); > + break; > + default: > + retval = mknod(request->path.path, request->data.mode, > + request->data.dev); > + break; > + } > + > + if (retval < 0) { > + error = errno; > + } else { > + error = 0; > + } > + > + setfsgid(cur_gid); > +unset_uid: > + setfsuid(cur_uid); > + > + return error; > +} > + > static int chroot_daemonize(int chroot_sock) > { > sigset_t sigset; > @@ -263,7 +336,7 @@ static int chroot_daemonize(int chroot_sock) > */ > int v9fs_chroot(FsContext *fs_ctx) > { > - int fd_pair[2], pid, chroot_sock, error; > + int fd_pair[2], pid, chroot_sock, error, retval; > V9fsFileObjectRequest request; > FdInfo fd_info; > > @@ -318,6 +391,15 @@ int v9fs_chroot(FsContext *fs_ctx) > error = -2; > } > break; > + case T_MKDIR: > + case T_SYMLINK: > + case T_LINK: > + case T_MKNOD: > + retval = chroot_do_create_special(&request); > + if (qemu_write_full(chroot_sock, &retval, sizeof(retval)) < 0) { > + error = -2; > + } > + break; > default: > break; > } > diff --git a/hw/9pfs/virtio-9p-chroot.h b/hw/9pfs/virtio-9p-chroot.h > index f5a2ca0..9a0ba88 100644 > --- a/hw/9pfs/virtio-9p-chroot.h > +++ b/hw/9pfs/virtio-9p-chroot.h > @@ -36,5 +36,7 @@ typedef struct V9fsFileObjectRequest > > int v9fs_chroot(FsContext *fs_ctx); > int v9fs_request(FsContext *fs_ctx, V9fsFileObjectRequest *or, int *error); > +int v9fs_create_special(FsContext *fs_ctx, > + V9fsFileObjectRequest *request, int *error); > > #endif /* _QEMU_VIRTIO_9P_CHROOT_H */ > diff --git a/hw/9pfs/virtio-9p-local.c b/hw/9pfs/virtio-9p-local.c > index 7f39b40..08fd67f 100644 > --- a/hw/9pfs/virtio-9p-local.c > +++ b/hw/9pfs/virtio-9p-local.c > @@ -69,6 +69,78 @@ static int __create(FsContext *fs_ctx, const char *path, > int flags, > return fd; > } > > +static int __mknod(FsContext *fs_ctx, const char *path, FsCred *credp) underscores
Re: [Qemu-devel] [PATCH] linux-user: Fix possible realloc memory leak
Am 18.01.2011 09:26, schrieb Markus Armbruster: Stefan Weil writes: Extract from "man realloc": "If realloc() fails the original block is left untouched; it is not freed or moved." Fix a possible memory leak (reported by cppcheck). Cc: Riku Voipio Signed-off-by: Stefan Weil Sidestep the problem via qemu_realloc() instead? The same change was applied to bsd-user/elfload.c. As symbol loading is not essential in most applications, returning after out-of-memory should be better than aborting (that's what qemu_realloc does).
Re: [Qemu-devel] Changing the content of target cpu registers
On Tue, Jan 18, 2011 at 9:29 AM, Stefano Bonifazi wrote: > Hi all! > I am working on qemu-user (qemu-ppc). > I'd like to edit the values of target registers during the execution. Can I > do that by simply changing the content of env->gpr[] or do these only > contain a copy of the values of the registers? > In this last case, where are the real values of the target registers stored > so that by modifying them I can alter the behavior of the target code > execution? env->gpr is the canonical location, but the translator assigns TCG variables to them (cpu_gpr[] in translate.c), so GPR contents may be cached to these. But when helpers are called or the TB finishes, env->gpr should be valid again.
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On 2011-01-18 18:09, Anthony Liguori wrote: > On 01/18/2011 10:56 AM, Jan Kiszka wrote: >> >>> The device model topology is 100% a hidden architectural detail. >>> >> This is true for the sysbus, it is obviously not the case for PCI and >> similarly discoverable buses. There we have a guest-explorable topology >> that is currently equivalent to the the qdev layout. >> > > But we also don't do PCI passthrough so we really haven't even explored > how that maps in qdev. I don't know if qemu-kvm has attempted to > qdev-ify it. It is. And even if it weren't or the current version in qemu-kvm was not perfect, we need to consider those uses cases now as we are about to define a generic model for kvm device integration. That's the point of this discussion. > Management and analysis tools must be able to traverse the system buses and find guest devices this way. >>> We need to provide a compatible interface to the guest. If you agree >>> with my above statements, then you'll also agree that we can do this >>> without keeping the device model topology stable. >>> >>> But we also need to provide a compatible interface to management tools. >>> Exposing the device model topology as a compatible interface >>> artificially limits us. It's far better to provide higher level >>> supported interfaces to give us the flexibility to change the device >>> model as we need to. >>> >> How do you want to change qdev to keep the guest and management tool >> view stable while branching off kvm sub-buses? > > The qdev device model is not a stable interface. I think that's been > clear from the very beginning. Internals aren't stable, but they should only be changed for a good reason, specifically when the change may impact the whole set of device models. > >> Please propose such >> extensions so that they can be discussed. IIUC, that would be second >> relation between qdev and qbus instances besides the physical topology. >> What further use cases (besides passing kvm_state around) do you have in >> mind? >> > > The -device interface is a stable interface. Right now, you don't > specify any type of identifier of the pci bus when you create a PCI > device. It's implied in the interface. Which only works as along as we expose a single bus. You don't need to be an oracle to predict that this is not a stable interface. > >> >>> If they create a device on bus X, it must never end up on bus Y just because it happens to be KVM-assisted or has some other property. >>> Nope. This is exactly what should happen. >>> >>> 90% of the devices in the device model are not created by management >>> tools. They're part of a chipset. The chipset has well defined >>> extension points and we provide management interfaces to create devices >>> on those extension points. That is, interfaces to create PCI devices. >>> >>> >> Creating kvm irqchips via the management tool would be one simple way >> (not the only one, though) to enable/disable kvm assistance for those >> devices. >> > > It's automatically created as part of the CPUs or as part of the > chipset. How to enable/disable kvm assistance is a property of the CPU > and/or chipset. If we exclude creation via command line / config files, we could also pass the kvm_state directly from the machine or chipset setup code and save us at least the kvm system buses. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH] target-arm: Fix garbage collection of temporaries in Neon emulation.
On 18 January 2011 17:00, Christophe Lyon wrote: > On 18.01.2011 16:36, Peter Maydell wrote: >> Incidentally there are some correctness fixes for the multiply-by-scalar >> neon insns from the qemu-meego tree which are on my list to push >> upstream. So you probably aren't getting the right results even if >> you've managed to shut up qemu's warnings :-) > And yes I can confirm there are many other wrong results in the > Neon support, which I am currently fixing. Please coordinate this with me! I have a big pile of fixes which I am working through, testing and submitting upstream, so you are in significant danger of duplicating work, which would be unfortunate. -- PMM
Re: [Qemu-devel] [PATCH] target-arm: Fix garbage collection of temporaries in Neon emulation.
On 18.01.2011 16:36, Peter Maydell wrote: > Incidentally there are some correctness fixes for the multiply-by-scalar > neon insns from the qemu-meego tree which are on my list to push > upstream. So you probably aren't getting the right results even if > you've managed to shut up qemu's warnings :-) > Actually it did not only shut up qemu's warnings. It was asserting. After fixing the asserts, it did warn a lot about resource leakage indeed, which I tried to fix with this patch. And yes I can confirm there are many other wrong results in the Neon support, which I am currently fixing. Christophe.
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On 01/18/2011 11:20 AM, Jan Kiszka wrote: Which only works as along as we expose a single bus. You don't need to be an oracle to predict that this is not a stable interface. Today we only have a very low level factory interface--device creation and deletion. I think we should move to higher level bus factory interfaces. An interface to create a PCI device and to delete PCI devices. This is the only sane way to do hot plug. This also makes supporting multiple busses a lot more reasonable since this factory interface could be a method of the controller. It's automatically created as part of the CPUs or as part of the chipset. How to enable/disable kvm assistance is a property of the CPU and/or chipset. If we exclude creation via command line / config files, we could also pass the kvm_state directly from the machine or chipset setup code and save us at least the kvm system buses. Which is fine in the short term. This is exactly why we don't want the device model to be an ABI. It gives us the ability to make changes as they make sense instead of trying to be perfect from the start (which we never will be). Regards, Anthony Liguori Jan
Re: [Qemu-devel] [sparc] Floating point exception issue
On Tue, Jan 18, 2011 at 3:27 PM, Mateusz Loskot wrote: > Hi, > > Recently, I have reported mysterious issues on NetBSD 5.1 > emulated on SPARC. The whole first thread is here: > > http://lists.gnu.org/archive/html/qemu-devel/2011-01/msg01509.html > > I decided to investigate the problem deeper and with great help > from NetBSD folks, I managed to find reproducible test case. > Initially, it was AWK command: > > # echo NaN | awk '{print "test"}' > awk: floating point exception 8 > source line number 1 > > and next it boiled down to simple C program (see below). > Details of the investigation are archived in the NetBSD Problem > Report #44389 here: > > http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44389 > > > Here is final version of the test program which reproduces the problem: > > #include > #include > #include > #include > > int is_number(const char *s) > { > double r; > char *ep; > errno = 0; > r = strtod(s, &ep); > if (r == HUGE_VAL) > printf("X:%g\n", r); > > if (ep == s || r == HUGE_VAL || errno == ERANGE) > return 0; > while (*ep == ' ' || *ep == '\t' || *ep == '\n') > ep++; > if (*ep == '\0') > return 1; > else > return 0; > } > > int main(int argc, char **argv) > { > double v; > > if (is_number("NaN")) { > printf("is a number\n"); > v = atof("NaN"); > } else { > printf("not a number\n"); > v = 0.0; > } > printf("%.4f\n", v); > > return 0; > } > > > On NetBSD/SPARC, the program receives SIGFPE: > > $ gcc ./nan_test_2.c > $ ./a.out > [1] Floating point exception (core dumped) ./a.out > > Specifically, it's caused by r == HUGE_VAL condition in > if (ep == s || r == HUGE_VAL || errno == ERANGE) > where r is NaN. > > All the signs indicate there is a bug in QEMU. I'll install 5.1, but on 4.0 which I had installed, the program works fine: $ ./sigfpe is a number nan
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On Tue, 2011-01-18 at 18:08 +0100, Jan Kiszka wrote: > On 2011-01-18 18:02, Alex Williamson wrote: > > On Tue, 2011-01-18 at 16:54 +0100, Jan Kiszka wrote: > >> On 2011-01-18 16:48, Anthony Liguori wrote: > >>> On 01/18/2011 09:43 AM, Jan Kiszka wrote: > On 2011-01-18 16:04, Anthony Liguori wrote: > > > On 01/18/2011 08:28 AM, Jan Kiszka wrote: > > > >> On 2011-01-12 11:31, Jan Kiszka wrote: > >> > >> > >>> Am 12.01.2011 11:22, Avi Kivity wrote: > >>> > >>> > On 01/11/2011 03:54 PM, Anthony Liguori wrote: > > > > Right, we should introduce a KVMBus that KVM devices are created on. > > The devices can get at KVMState through the BusState. > > > > > There is no kvm bus in a PC (I looked). We're bending the device > model > here because a device is implemented in the kernel and not in > userspace. An implementation detail is magnified beyond all > proportions. > > An ioapic that is implemented by kvm lives in exactly the same place > that the qemu ioapic lives in. An assigned pci device lives on the > PCI > bus, not a KVMBus. If we need a pointer to KVMState, then we must > find > it elsewhere, not through creating imaginary buses that don't exist. > > > > >>> Exactly. > >>> > >>> So we can either "infect" the whole device tree with kvm (or maybe a > >>> more generic accelerator structure that also deals with Xen) or we > >>> need > >>> to pull the reference inside the device's init function from some > >>> global > >>> service (kvm_get_state). > >>> > >>> > >> Note that this topic is still waiting for good suggestions, > >> specifically > >> from those who believe in kvm_state references :). This is not only > >> blocking kvmstate merge but will affect KVM irqchips as well. > >> > >> It boils down to how we reasonably pass a kvm_state reference from > >> machine init code to a sysbus device. I'm probably biased, but I don't > >> see any way that does not work against the idea of confining access to > >> kvm_state or breaks device instantiation from the command line or a > >> config file. > >> > >> > > A KVM device should sit on a KVM specific bus that hangs off of sysbus. > > It can get to kvm_state through that bus. > > > > That bus doesn't get instantiated through qdev so requiring a pointer > > argument should not be an issue. > > > > > This design is in conflict with the requirement to attach KVM-assisted > devices also to their home bus, e.g. an assigned PCI device to the PCI > bus. We don't support multi-homed qdev devices. > > >>> > >>> With vfio, would an assigned PCI device even need kvm_state? > >> > >> IIUC: Yes, for establishing the irqfd link. > > > > We abstract this through the msi/msix layer though, so the vfio driver > > doesn't directly know anything about kvm_state. > > Which version/tree are you referring to? It wasn't the case in the last > version I found on the list. > > Does the msi layer use irqfd for every source in kvm mode then? Of > course, the key question will be how that layer will once obtain kvm_state. Looking at "[RFC PATCH v2] VFIO based device assignment" sent on Nov 5th, I guess we do call kvm_set_irqfd. Maybe I'm just wishing that the msi layer abstracted it better. I'd like to be able to pass in a userspace interrupt handler function pointer and an event notifier fd and let the interrupt layers worry about how it's hooked up. Alex
Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
On 2011-01-18 18:31, Anthony Liguori wrote: >>> It's automatically created as part of the CPUs or as part of the >>> chipset. How to enable/disable kvm assistance is a property of the CPU >>> and/or chipset. >>> >> If we exclude creation via command line / config files, we could also >> pass the kvm_state directly from the machine or chipset setup code and >> save us at least the kvm system buses. >> > > Which is fine in the short term. Unless we want to abuse the pointer property for this, and there was some resistance, we would have to change the sysbus init function signature. I don't want to propose this for a short-term workaround, we need a clearer vision and roadmap to avoid multiple invasive changes to the device model. > This is exactly why we don't want the > device model to be an ABI. It gives us the ability to make changes as > they make sense instead of trying to be perfect from the start (which we > never will be). The device model will always consist of a stable part, the guest and management visible topology. That beast needs to be modeled as well, likely via some new bus objects. If that's the way to go, starting now is probably the right time as we have an urgent use case, right? Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH] linux-user: Fix possible realloc memory leak
Stefan Weil writes: > Am 18.01.2011 09:26, schrieb Markus Armbruster: >> Stefan Weil writes: >> >>> Extract from "man realloc": >>> "If realloc() fails the original block is left untouched; >>> it is not freed or moved." >>> >>> Fix a possible memory leak (reported by cppcheck). >>> >>> Cc: Riku Voipio >>> Signed-off-by: Stefan Weil >> >> Sidestep the problem via qemu_realloc() instead? > > The same change was applied to bsd-user/elfload.c. > > As symbol loading is not essential in most applications, > returning after out-of-memory should be better than > aborting (that's what qemu_realloc does). Unless the requested size is *really* large, I'd expect this to stave off the out-of-memory failure for a few microseconds at best.
Re: [Qemu-devel] Re: Fwd: Re: port-sparc/44389: awk failure during m4 installation with pkgsrc
On Tue, Jan 18, 2011 at 11:40 AM, Mateusz Loskot wrote: > Blue Swirl gmail.com> writes: >> Mateusz Loskot loskot.net> wrote: >> > Hi, >> > >> > I emulate SPARC with NetBSD 5.0 installed and I've experienced a problem >> > with pkgsrc installing one of packages. >> > I submitted bug report to NetBSD team and received short response >> > suggesting it is likely QEMU problem (see original message below). >> > >> > Shortly, the problem is that AWK throws "floating point exception" >> > Here is the bug report with details: >> > >> > http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44389 >> > >> > Given the nature of the problem, it "feels" the issue is unrelated to >> > QEMU, but I'm not on position to judge the suggestion posted by NetBSD >> > team is pointless. >> > >> > I have no idea how to isolate the problem on NetBSD, so I replied to >> > NetBSD team asking: >> > === >> > Could you suggest how to isolate the problem, find exact awk command >> > causing the error, so I can debug it or forward to QEMU developers with >> > details necessary to reproduce it? >> > === >> > >> > In the meantime, I thought I will attack qemu-devel hoping it may ring >> > some bells here as well. >> > >> > Any ideas? >> >> It's entirely possible that the floating point support for Sparc can >> be buggy, there have been a few fixes to softfloat recently for other >> architectures. >> >> I'd check if NaN handling or floating point exception support works >> correctly, regular programs don't use those. But a reproducible test >> case is essential. > > > I think I have managed to isolate reproducible test for this problem > and it seems the issue is in NetBSD userland. > See my last update to the problem reprot: > > http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44389 > > Long story short, the problem boils down to the following command: > > # echo NaN | awk '{print "test"}' > awk: floating point exception 8 > source line number 1 > > which interprets "NaN" as a number when it isn't asked to. FYI: also this succeeds on 4.0: $ echo NaN | awk '{print "test"}' test
Re: [Qemu-devel] [sparc] Floating point exception issue
On 18/01/11 17:36, Blue Swirl wrote: On Tue, Jan 18, 2011 at 3:27 PM, Mateusz Loskot wrote: Hi, Recently, I have reported mysterious issues on NetBSD 5.1 emulated on SPARC. The whole first thread is here: http://lists.gnu.org/archive/html/qemu-devel/2011-01/msg01509.html I decided to investigate the problem deeper and with great help from NetBSD folks, I managed to find reproducible test case. Initially, it was AWK command: # echo NaN | awk '{print "test"}' awk: floating point exception 8 source line number 1 and next it boiled down to simple C program (see below). Details of the investigation are archived in the NetBSD Problem Report #44389 here: http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44389 Here is final version of the test program which reproduces the problem: #include #include #include #include int is_number(const char *s) { double r; char *ep; errno = 0; r = strtod(s,&ep); if (r == HUGE_VAL) printf("X:%g\n", r); if (ep == s || r == HUGE_VAL || errno == ERANGE) return 0; while (*ep == ' ' || *ep == '\t' || *ep == '\n') ep++; if (*ep == '\0') return 1; else return 0; } int main(int argc, char **argv) { double v; if (is_number("NaN")) { printf("is a number\n"); v = atof("NaN"); } else { printf("not a number\n"); v = 0.0; } printf("%.4f\n", v); return 0; } On NetBSD/SPARC, the program receives SIGFPE: $ gcc ./nan_test_2.c $ ./a.out [1] Floating point exception (core dumped) ./a.out Specifically, it's caused by r == HUGE_VAL condition in if (ep == s || r == HUGE_VAL || errno == ERANGE) where r is NaN. All the signs indicate there is a bug in QEMU. I'll install 5.1, but on 4.0 which I had installed, the program works fine: $ ./sigfpe is a number nan I just tested on NetBSD 5.0/SPARC under QEMU 0.13 (same version I use with NetBSD 5.1/SPARC) and it works well indeed: mloskot@qemu-netbsd-50-sparc:~/tmp# ./a.out is a number nan mloskot@qemu-netbsd-50-sparc:~/tmp# Hmm, this is becoming interesting. I run QEMU 0.13 on Windows Vista (64-bit). Perhaps host system and QEMU binaries are relevant here. I will try on Linux host system later tonight. BTW, here are my images: http://mateusz.loskot.net/tmp/qemu/ Best regards, -- Mateusz Loskot, http://mateusz.loskot.net Charter Member of OSGeo, http://osgeo.org Member of ACCU, http://accu.org