Re: [Qemu-devel] [PATCH] cpu-exec: make TBs generated codes unlinked when -singlestep
On 07/24/2014 06:37 PM, Jincheng Miao wrote: > '-singlestep' option will make TB contains only one instruction, > so that the qemu_log could output trace log when CPU_LOG_EXEC sets, > and it could help developers to debug control flow. > > But currently, in cpu_exec(), it doesn't check singlestep when > tb_add_jump(), so the TB linked is executed siliently. > Therefore, this patch adds singlestep check before tb_add_jump(). > > Signed-off-by: Jincheng Miao Reasonable. I've been thinking that we simply shoudn't emit goto_tb under single-step. That does require fixes to all but 2 or 3 of the backends though, and this patch attacks the problem all in one place. Reviewed-by: Richard Henderson r~
Re: [Qemu-devel] [PATCH] cpu-exec: make TBs generated codes unlinked when -singlestep
On 25 July 2014 07:58, Richard Henderson wrote: > On 07/24/2014 06:37 PM, Jincheng Miao wrote: >> '-singlestep' option will make TB contains only one instruction, >> so that the qemu_log could output trace log when CPU_LOG_EXEC sets, >> and it could help developers to debug control flow. >> >> But currently, in cpu_exec(), it doesn't check singlestep when >> tb_add_jump(), so the TB linked is executed siliently. >> Therefore, this patch adds singlestep check before tb_add_jump(). >> >> Signed-off-by: Jincheng Miao > > Reasonable. I've been thinking that we simply shoudn't emit goto_tb under > single-step. That does require fixes to all but 2 or 3 of the backends > though, > and this patch attacks the problem all in one place. Huh? We already don't emit goto_tb if single-stepping, surely? (Well, I guess some of the backends might well be broken, but in that case they probably don't get the other bits of singlestep support right either...) -- PMM
Re: [Qemu-devel] [PATCH] cpu-exec: make TBs generated codes unlinked when -singlestep
On 07/24/2014 09:37 PM, Peter Maydell wrote: > Huh? We already don't emit goto_tb if single-stepping, surely? > (Well, I guess some of the backends might well be broken, but > in that case they probably don't get the other bits of singlestep > support right either...) Indeed. I noticed this a month or so ago. Almost all backends check the gdb env->single_step to prevent goto_tb, but forget about the tcg debugging singlestep. r~
Re: [Qemu-devel] [PATCH] cpu-exec: make TBs generated codes unlinked when -singlestep
On 25 July 2014 08:41, Richard Henderson wrote: > On 07/24/2014 09:37 PM, Peter Maydell wrote: >> Huh? We already don't emit goto_tb if single-stepping, surely? >> (Well, I guess some of the backends might well be broken, but >> in that case they probably don't get the other bits of singlestep >> support right either...) > > Indeed. I noticed this a month or so ago. > > Almost all backends check the gdb env->single_step to prevent goto_tb, but > forget about the tcg debugging singlestep. Oh, we have two flavours of singlestep? That's confusing... (I'm currently working on the ARMv8 architectural singlestep, which will make 3 for target-arm.) thanks -- PMM
Re: [Qemu-devel] [PULL for-2.1 0/2] vnc: fix two vnc update issues.
On Do, 2014-07-24 at 15:22 +0100, Peter Maydell wrote: > On 24 July 2014 15:10, Gerd Hoffmann wrote: > > Hi, > > > >> > So are these *really* release critical bugs, if they've been > >> > only found in code review? We're really close to release now > >> > and so my preference is not to include changes unless they're > >> > really necessary... > >> > >> These are fixing openQA breakage (os-autoinst), > > > > In more detail: > > [snip] > > Thanks for the clarification; these do seem worth putting in 2.1. > I'll need to get you to respin with the missing signed-off-by > that Andreas pointed out, though. Done: git://git.kraxel.org/qemu tags/pull-vnc-20140725-1 cheers, Gerd
Re: [Qemu-devel] [PATCH] cpu-exec: make TBs generated codes unlinked when -singlestep
On 07/25/2014 03:45 PM, Peter Maydell wrote: On 25 July 2014 08:41, Richard Henderson wrote: On 07/24/2014 09:37 PM, Peter Maydell wrote: Huh? We already don't emit goto_tb if single-stepping, surely? (Well, I guess some of the backends might well be broken, but in that case they probably don't get the other bits of singlestep support right either...) Indeed. I noticed this a month or so ago. Almost all backends check the gdb env->single_step to prevent goto_tb, but forget about the tcg debugging singlestep. Oh, we have two flavours of singlestep? That's confusing... IMHO, CPUState->singlestep_enabled is a cpu execute mode, for emulating it, an exception should be raised. But '-singlestep' from command line rules qemu how to generate TBs and their generated codes. In this situation, a TB only contains one instruction, and should be unlinked. Am I right? (I'm currently working on the ARMv8 architectural singlestep, which will make 3 for target-arm.) thanks -- PMM
Re: [Qemu-devel] [PATCH 2/3] qemu-char: add -chardev exit-on-eof option
Stefan Hajnoczi writes: > When QEMU is executed as part of a test case or from a script, it is > usually desirable to exit if the parent process terminates. This > ensures that "leaked" QEMU processes do not continue consuming resources > after their parent has died. > > This patch adds the -chardev exit-on-eof option causing socket and pipe > chardevs to exit QEMU upon close. This happens when a parent process > deliberately closes its file descriptor but also when the kernel cleans > up a crashed process. > > Signed-off-by: Stefan Hajnoczi > --- > include/sysemu/char.h | 1 + > qapi-schema.json | 23 --- > qemu-char.c | 34 -- > qemu-options.hx | 19 +-- > 4 files changed, 58 insertions(+), 19 deletions(-) > > diff --git a/include/sysemu/char.h b/include/sysemu/char.h > index 0bbd631..382b320 100644 > --- a/include/sysemu/char.h > +++ b/include/sysemu/char.h > @@ -86,6 +86,7 @@ struct CharDriverState { > guint fd_in_tag; > QemuOpts *opts; > QTAILQ_ENTRY(CharDriverState) next; > +bool exit_on_eof; > }; > > /** > diff --git a/qapi-schema.json b/qapi-schema.json > index b11aad2..9b13da1 100644 > --- a/qapi-schema.json > +++ b/qapi-schema.json > @@ -2630,10 +2630,13 @@ > # @device: The name of the special file for the device, > # i.e. /dev/ttyS0 on Unix or COM1: on Windows > # @type: What kind of device this is. > +# @exit-on-eof: #optional terminate when other side closes the pipe > +# (default: false, since: 2.2) > # > # Since: 1.4 > ## > -{ 'type': 'ChardevHostdev', 'data': { 'device' : 'str' } } > +{ 'type': 'ChardevHostdev', 'data': { 'device' : 'str', > + '*exit-on-eof' : 'bool' } } > > ## > # @ChardevSocket: Any use cases beyond libqtest? If no, should this be x-exit-on-eof? Hmm, looks like there's no precedence for x- in QAPI.
Re: [Qemu-devel] [Bug 1344320] Re: qemu-aarch64 cannot execute glibc
On Tue, Jul 22, 2014 at 10:22:15PM -, Peter Maydell wrote: > On 22 July 2014 20:46, Richard Henderson wrote: > > On 07/21/2014 10:37 AM, Peter Maydell wrote: > >>> It's trying to measure clock cycles required to perform the startup > >>> relocations. > >> > >> That's a neat trick, given that the generic timers are not cycle > >> counters! They're a fixed frequency counter which is generally > >> unrelated and rather slower than the CPU frequency (and > >> which doesn't scale up and down with CPU frequency either). > > > > Even better. > > In any case the kernel guys say you can't guarantee they > exist unless you get them to define an ELF hwcap for > "timers exist and have a sane value in the 'what frequency > are they' register". So this is a glibc bug and I'm > not fixing QEMU... It's used by openssl for rdtsc emulation as well: http://git.openssl.org/gitweb/?p=openssl.git;a=blob;f=crypto/arm64cpuid.S#l17 Riku > -- PMM > > Title: > qemu-aarch64 cannot execute glibc > > Status in QEMU: > New > > Bug description: > $ aarch64-linux-user/qemu-aarch64 -version > qemu-aarch64 version 2.0.92, Copyright (c) 2003-2008 Fabrice Bellard > $ aarch64-linux-user/qemu-aarch64 -d in_asm > /daten/build/build-root/home/abuild/rpmbuild/BUILD/glibc-2.19.90/cc-base/elf/ld-linux-aarch64.so.1 > > host mmap_min_addr=0x1 > guest_base 0x0 > startend size prot > 0040-0041e000 0001e000 r-x > 0041e000-0042e000 0001 --- > 0042e000-00431000 3000 rw- > 00431000-00432000 1000 --- > 00432000-004000832000 0080 rw- > start_brk 0x > end_code0x0041dbe0 > start_code 0x0040 > start_data 0x0042eba8 > end_data0x00430008 > start_stack 0x004000830a10 > brk 0x00430170 > entry 0x004012c0 > > IN: > 0x004012c0: 910003e0 mov x0, sp > 0x004012c4: 94000d4f bl #+0x353c (addr 0x7fffb5bdad68) > > > IN: _dl_start > 0x00404800: d11243ff sub sp, sp, #0x490 (1168) > 0x00404804: a9ba7bfd stp x29, x30, [sp, #-96]! > 0x00404808: 910003fd mov x29, sp > 0x0040480c: a9046bf9 stp x25, x26, [sp, #64] > 0x00404810: a90153f3 stp x19, x20, [sp, #16] > 0x00404814: a9025bf5 stp x21, x22, [sp, #32] > 0x00404818: a90363f7 stp x23, x24, [sp, #48] > 0x0040481c: a90573fb stp x27, x28, [sp, #80] > 0x00404820: aa0003fa mov x26, x0 > 0x00404824: d5033fdf isb > 0x00404828: d53be040 mrs x0, (unknown) > > qemu: uncaught target signal 4 (Illegal instruction) - core dumped > Illegal instruction > $ objdump -d > /daten/build/build-root/home/abuild/rpmbuild/BUILD/glibc-2.19.90/cc-base/elf/ld-linux-aarch64.so.1 > | grep ' 4828:' > 4828: d53be040mrs x0, cntvct_el0 > > To manage notifications about this bug go to: > https://bugs.launchpad.net/qemu/+bug/1344320/+subscriptions
Re: [Qemu-devel] [Bug 1344320] Re: qemu-aarch64 cannot execute glibc
On 25 July 2014 10:54, Riku Voipio wrote: > On Tue, Jul 22, 2014 at 10:22:15PM -, Peter Maydell wrote: >> On 22 July 2014 20:46, Richard Henderson wrote: >> > On 07/21/2014 10:37 AM, Peter Maydell wrote: >> >>> It's trying to measure clock cycles required to perform the startup >> >>> relocations. >> >> >> >> That's a neat trick, given that the generic timers are not cycle >> >> counters! They're a fixed frequency counter which is generally >> >> unrelated and rather slower than the CPU frequency (and >> >> which doesn't scale up and down with CPU frequency either). >> > >> > Even better. >> >> In any case the kernel guys say you can't guarantee they >> exist unless you get them to define an ELF hwcap for >> "timers exist and have a sane value in the 'what frequency >> are they' register". So this is a glibc bug and I'm >> not fixing QEMU... > > It's used by openssl for rdtsc emulation as well: > > http://git.openssl.org/gitweb/?p=openssl.git;a=blob;f=crypto/arm64cpuid.S#l17 > It is well known to me and to the upstream author that the timer count is meaningless without also reading the frequency. However, the piece of code you refer to is only used for internal development by the upstream author, so I don't think we should care about it. (The snippet in question is executed with a SIGILL trap set as a kind of poor man's elf_hwcap) -- Ard. >> -- PMM >> >> Title: >> qemu-aarch64 cannot execute glibc >> >> Status in QEMU: >> New >> >> Bug description: >> $ aarch64-linux-user/qemu-aarch64 -version >> qemu-aarch64 version 2.0.92, Copyright (c) 2003-2008 Fabrice Bellard >> $ aarch64-linux-user/qemu-aarch64 -d in_asm >> /daten/build/build-root/home/abuild/rpmbuild/BUILD/glibc-2.19.90/cc-base/elf/ld-linux-aarch64.so.1 >> host mmap_min_addr=0x1 >> guest_base 0x0 >> startend size prot >> 0040-0041e000 0001e000 r-x >> 0041e000-0042e000 0001 --- >> 0042e000-00431000 3000 rw- >> 00431000-00432000 1000 --- >> 00432000-004000832000 0080 rw- >> start_brk 0x >> end_code0x0041dbe0 >> start_code 0x0040 >> start_data 0x0042eba8 >> end_data0x00430008 >> start_stack 0x004000830a10 >> brk 0x00430170 >> entry 0x004012c0 >> >> IN: >> 0x004012c0: 910003e0 mov x0, sp >> 0x004012c4: 94000d4f bl #+0x353c (addr 0x7fffb5bdad68) >> >> >> IN: _dl_start >> 0x00404800: d11243ff sub sp, sp, #0x490 (1168) >> 0x00404804: a9ba7bfd stp x29, x30, [sp, #-96]! >> 0x00404808: 910003fd mov x29, sp >> 0x0040480c: a9046bf9 stp x25, x26, [sp, #64] >> 0x00404810: a90153f3 stp x19, x20, [sp, #16] >> 0x00404814: a9025bf5 stp x21, x22, [sp, #32] >> 0x00404818: a90363f7 stp x23, x24, [sp, #48] >> 0x0040481c: a90573fb stp x27, x28, [sp, #80] >> 0x00404820: aa0003fa mov x26, x0 >> 0x00404824: d5033fdf isb >> 0x00404828: d53be040 mrs x0, (unknown) >> >> qemu: uncaught target signal 4 (Illegal instruction) - core dumped >> Illegal instruction >> $ objdump -d >> /daten/build/build-root/home/abuild/rpmbuild/BUILD/glibc-2.19.90/cc-base/elf/ld-linux-aarch64.so.1 >> | grep ' 4828:' >> 4828: d53be040mrs x0, cntvct_el0 >> >> To manage notifications about this bug go to: >> https://bugs.launchpad.net/qemu/+bug/1344320/+subscriptions
Re: [Qemu-devel] [PATCH 2/3] qemu-char: add -chardev exit-on-eof option
On Fri, Jul 25, 2014 at 9:12 AM, Markus Armbruster wrote: > Stefan Hajnoczi writes: > >> When QEMU is executed as part of a test case or from a script, it is >> usually desirable to exit if the parent process terminates. This >> ensures that "leaked" QEMU processes do not continue consuming resources >> after their parent has died. >> >> This patch adds the -chardev exit-on-eof option causing socket and pipe >> chardevs to exit QEMU upon close. This happens when a parent process >> deliberately closes its file descriptor but also when the kernel cleans >> up a crashed process. >> >> Signed-off-by: Stefan Hajnoczi >> --- >> include/sysemu/char.h | 1 + >> qapi-schema.json | 23 --- >> qemu-char.c | 34 -- >> qemu-options.hx | 19 +-- >> 4 files changed, 58 insertions(+), 19 deletions(-) >> >> diff --git a/include/sysemu/char.h b/include/sysemu/char.h >> index 0bbd631..382b320 100644 >> --- a/include/sysemu/char.h >> +++ b/include/sysemu/char.h >> @@ -86,6 +86,7 @@ struct CharDriverState { >> guint fd_in_tag; >> QemuOpts *opts; >> QTAILQ_ENTRY(CharDriverState) next; >> +bool exit_on_eof; >> }; >> >> /** >> diff --git a/qapi-schema.json b/qapi-schema.json >> index b11aad2..9b13da1 100644 >> --- a/qapi-schema.json >> +++ b/qapi-schema.json >> @@ -2630,10 +2630,13 @@ >> # @device: The name of the special file for the device, >> # i.e. /dev/ttyS0 on Unix or COM1: on Windows >> # @type: What kind of device this is. >> +# @exit-on-eof: #optional terminate when other side closes the pipe >> +# (default: false, since: 2.2) >> # >> # Since: 1.4 >> ## >> -{ 'type': 'ChardevHostdev', 'data': { 'device' : 'str' } } >> +{ 'type': 'ChardevHostdev', 'data': { 'device' : 'str', >> + '*exit-on-eof' : 'bool' } } >> >> ## >> # @ChardevSocket: > > Any use cases beyond libqtest? qemu-iotests should use it too. Basically any script that drives QEMU directly will need to clean up the QEMU process. Stefan
Re: [Qemu-devel] [PATCH 1/3] libqemustub: add qemu_system_shutdown_request() and no_shutdown
On Thu, Jul 24, 2014 at 6:07 PM, Peter Maydell wrote: > On 24 July 2014 17:39, Stefan Hajnoczi wrote: >> +++ b/stubs/shutdown.c >> @@ -0,0 +1,7 @@ >> +#include "sysemu/sysemu.h" >> + >> +int no_shutdown; >> + >> +void qemu_system_shutdown_request(void) >> +{ >> +} > > Tangential, but every use of "no_shutdown" outside vl.c > is in the sequence: > no_shutdown = 0; > qemu_system_shutdown_request(); > > so maybe we should have a >qemu_system_shutdown_demand(); > and make the no_shutdown variable private to vl.c ? > > (feel free to pick a better name :-)) Sure, I will find a solution for v2. Stefan
Re: [Qemu-devel] [PATCH v2 1/7] bootindex: add modify_boot_device_path function
Hi, > +void modify_boot_device_path(int32_t bootindex, DeviceState *dev, > + const char *suffix) > +{ > +FWBootEntry *node, *i; > + > +assert(dev != NULL || suffix != NULL); > + > +QTAILQ_FOREACH(i, &fw_boot_order, link) { > +if (i->bootindex == bootindex) { > +qerror_report(ERROR_CLASS_GENERIC_ERROR, > + "The bootindex %d has already been used", bootindex); > +return; > +} > +/* delete the same original dev */ > +if (i->dev->id && !strcmp(i->dev->id, dev->id)) { > +QTAILQ_REMOVE(&fw_boot_order, i, link); Ok ... > +g_free(i->suffix); > +g_free(i); ... but you should free the old entry later ... > + > +break; > +} > +} > + > +if (bootindex >= 0) { > +node = g_malloc0(sizeof(FWBootEntry)); > +node->bootindex = bootindex; > +node->suffix = g_strdup(suffix); ... because you can just copy the suffix from the old entry here, instead of expecting the caller pass it in. cheers, Gerd
Re: [Qemu-devel] [PATCH v2 4/7] bootindex: delete bootindex when device is removed
Hi, > +del_boot_device_path(dev); You can call this from device_finalize() instead of placing it into each individual device. cheers, Gerd
[Qemu-devel] [PATCH V5 0/6] icount: Implement delay algorithm between guest and host clocks
The icount option already implemented in QEMU allows the guest to run at a theoretical frequency of 1/(2^N) GHz (N is the icount parameter). The goal of this patch is to have a real guest frequency close to the one imposed by using the icount option. The main idea behind the algorithm is that we compare the virtual monotonic clock and the host monotonic clock. For big icounts (on our test machine, an i5 CPU @ 3.10GHz, icounts starting at 6) the guest clock will be ahead of the host clock. In this case, we try to sleep QEMU for the difference between the 2 clocks. Therefore, the guest would have executed for a period almost equally to the one imposed by icount. We should point out that the algorithm works only for those icounts that allow the guest clock to be in front of the host clock. The first patch adds support for QemuOpts for the 'icount' parameter. It also adds a suboption called 'shift' that will hold the value for 'icount'. Therefore we now have -icount shift=N|auto or -icount N|auto. The second patch adds the 'align' suboption for icount. The third patch exports 'icount_time_shift' so that it can be used in places other than cpus.c; we need it in cpu-exec.c for calculating for how long we want QEMU to sleep. The forth patch implements the algorithm used for calculating the delay we want to sleep. It uses the number of instructions executed by the virtual cpu and also the icount_time_shift. The fifth patch prints to the console whenever the guest clock runs behind the host clock. The fastest printing speed is every 2 seconds, and we only print if the align option is enabled. We also have a limit to 100 printed messages. The sixth patch adds information about the difference between the host and guest clocks (taking into account the offset) in the 'info jit' command. We also print the maximum delay and advance of the guest clock compared to the host clock. v4 -> v5 * Use existing offset value (timers_state.cpu_clock_offset) for computation -> patch 4 * Use cpu_get_clock and cpu_get_icount for showing drift information -> patch 6 Important note: This patch series relies on the bug fix discussed here http://lists.gnu.org/archive/html/qemu-devel/2014-07/msg03208.html Sebastian Tanase (6): icount: Add QemuOpts for icount icount: Add align option to icount icount: Make icount_time_shift available everywhere cpu_exec: Add sleeping algorithm cpu_exec: Print to console if the guest is late monitor: Add drift info to 'info jit' cpu-exec.c| 127 ++ cpus.c| 59 +-- include/qemu-common.h | 9 +++- include/qemu/timer.h | 1 + monitor.c | 1 + qemu-options.hx | 17 +-- qtest.c | 13 +- vl.c | 39 +--- 8 files changed, 248 insertions(+), 18 deletions(-) -- 2.0.0.rc2
[Qemu-devel] [PATCH V5 1/6] icount: Add QemuOpts for icount
Make icount parameter use QemuOpts style options in order to easily add other suboptions. Signed-off-by: Sebastian Tanase Tested-by: Camille Bégué Signed-off-by: Paolo Bonzini --- cpus.c| 10 +- include/qemu-common.h | 3 ++- qemu-options.hx | 4 ++-- qtest.c | 13 +++-- vl.c | 35 --- 5 files changed, 52 insertions(+), 13 deletions(-) diff --git a/cpus.c b/cpus.c index 5e7f2cf..dcca96a 100644 --- a/cpus.c +++ b/cpus.c @@ -440,13 +440,21 @@ static const VMStateDescription vmstate_timers = { } }; -void configure_icount(const char *option) +void configure_icount(QemuOpts *opts, Error **errp) { +const char *option; + seqlock_init(&timers_state.vm_clock_seqlock, NULL); vmstate_register(NULL, 0, &vmstate_timers, &timers_state); +option = qemu_opt_get(opts, "shift"); if (!option) { return; } +/* When using -icount shift, the shift option will be + misinterpreted as a boolean */ +if (strcmp(option, "on") == 0 || strcmp(option, "off") == 0) { +error_setg(errp, "The shift option must be a number or auto"); +} icount_warp_timer = timer_new_ns(QEMU_CLOCK_REALTIME, icount_warp_rt, NULL); diff --git a/include/qemu-common.h b/include/qemu-common.h index ae76197..cc346ec 100644 --- a/include/qemu-common.h +++ b/include/qemu-common.h @@ -41,6 +41,7 @@ #include #include #include "glib-compat.h" +#include "qemu/option.h" #ifdef _WIN32 #include "sysemu/os-win32.h" @@ -105,7 +106,7 @@ static inline char *realpath(const char *path, char *resolved_path) #endif /* icount */ -void configure_icount(const char *option); +void configure_icount(QemuOpts *opts, Error **errp); extern int use_icount; #include "qemu/osdep.h" diff --git a/qemu-options.hx b/qemu-options.hx index 9e54686..143def4 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -3011,11 +3011,11 @@ re-inject them. ETEXI DEF("icount", HAS_ARG, QEMU_OPTION_icount, \ -"-icount [N|auto]\n" \ +"-icount [shift=N|auto]\n" \ "enable virtual instruction counter with 2^N clock ticks per\n" \ "instruction\n", QEMU_ARCH_ALL) STEXI -@item -icount [@var{N}|auto] +@item -icount [shift=@var{N}|auto] @findex -icount Enable virtual instruction counter. The virtual cpu will execute one instruction every 2^@var{N} ns of virtual time. If @code{auto} is specified diff --git a/qtest.c b/qtest.c index 04a6dc1..ef0d991 100644 --- a/qtest.c +++ b/qtest.c @@ -19,6 +19,9 @@ #include "hw/irq.h" #include "sysemu/sysemu.h" #include "sysemu/cpus.h" +#include "qemu/config-file.h" +#include "qemu/option.h" +#include "qemu/error-report.h" #define MAX_IRQ 256 @@ -509,10 +512,16 @@ static void qtest_event(void *opaque, int event) } } -int qtest_init_accel(MachineClass *mc) +static void configure_qtest_icount(const char *options) { -configure_icount("0"); +QemuOpts *opts = qemu_opts_parse(qemu_find_opts("icount"), options, 1); +configure_icount(opts, &error_abort); +qemu_opts_del(opts); +} +int qtest_init_accel(MachineClass *mc) +{ +configure_qtest_icount("0"); return 0; } diff --git a/vl.c b/vl.c index 41ddcd2..103027f 100644 --- a/vl.c +++ b/vl.c @@ -537,6 +537,20 @@ static QemuOptsList qemu_mem_opts = { }, }; +static QemuOptsList qemu_icount_opts = { +.name = "icount", +.implied_opt_name = "shift", +.merge_lists = true, +.head = QTAILQ_HEAD_INITIALIZER(qemu_icount_opts.head), +.desc = { +{ +.name = "shift", +.type = QEMU_OPT_STRING, +}, +{ /* end of list */ } +}, +}; + /** * Get machine options * @@ -2896,13 +2910,12 @@ int main(int argc, char **argv, char **envp) { int i; int snapshot, linux_boot; -const char *icount_option = NULL; const char *initrd_filename; const char *kernel_filename, *kernel_cmdline; const char *boot_order; DisplayState *ds; int cyls, heads, secs, translation; -QemuOpts *hda_opts = NULL, *opts, *machine_opts; +QemuOpts *hda_opts = NULL, *opts, *machine_opts, *icount_opts = NULL; QemuOptsList *olist; int optind; const char *optarg; @@ -2967,6 +2980,7 @@ int main(int argc, char **argv, char **envp) qemu_add_opts(&qemu_msg_opts); qemu_add_opts(&qemu_name_opts); qemu_add_opts(&qemu_numa_opts); +qemu_add_opts(&qemu_icount_opts); runstate_init(); @@ -3817,7 +3831,11 @@ int main(int argc, char **argv, char **envp) } break; case QEMU_OPTION_icount: -icount_option = optarg; +icount_opts = qemu_opts_parse(qemu_find_opts("icount"), + optarg, 1); +if (!icount_opts) { +exit(1); +}
[Qemu-devel] [PATCH V5 2/6] icount: Add align option to icount
The align option is used for activating the align algorithm in order to synchronise the host clock and the guest clock. Signed-off-by: Sebastian Tanase Tested-by: Camille Bégué --- cpus.c| 19 --- include/qemu-common.h | 1 + qemu-options.hx | 15 +-- vl.c | 4 4 files changed, 30 insertions(+), 9 deletions(-) diff --git a/cpus.c b/cpus.c index dcca96a..34cc4c8 100644 --- a/cpus.c +++ b/cpus.c @@ -443,25 +443,30 @@ static const VMStateDescription vmstate_timers = { void configure_icount(QemuOpts *opts, Error **errp) { const char *option; +char *rem_str = NULL; seqlock_init(&timers_state.vm_clock_seqlock, NULL); vmstate_register(NULL, 0, &vmstate_timers, &timers_state); option = qemu_opt_get(opts, "shift"); if (!option) { +if (qemu_opt_get(opts, "align") != NULL) { +error_setg(errp, "Please specify shift option when using align"); +} return; } -/* When using -icount shift, the shift option will be - misinterpreted as a boolean */ -if (strcmp(option, "on") == 0 || strcmp(option, "off") == 0) { -error_setg(errp, "The shift option must be a number or auto"); -} - +icount_align_option = qemu_opt_get_bool(opts, "align", false); icount_warp_timer = timer_new_ns(QEMU_CLOCK_REALTIME, icount_warp_rt, NULL); if (strcmp(option, "auto") != 0) { -icount_time_shift = strtol(option, NULL, 0); +errno = 0; +icount_time_shift = strtol(option, &rem_str, 0); +if (errno != 0 || *rem_str != '\0' || !strlen(option)) { +error_setg(errp, "icount: Invalid shift value"); +} use_icount = 1; return; +} else if (icount_align_option) { +error_setg(errp, "shift=auto and align=on are incompatible"); } use_icount = 2; diff --git a/include/qemu-common.h b/include/qemu-common.h index cc346ec..860bb15 100644 --- a/include/qemu-common.h +++ b/include/qemu-common.h @@ -108,6 +108,7 @@ static inline char *realpath(const char *path, char *resolved_path) /* icount */ void configure_icount(QemuOpts *opts, Error **errp); extern int use_icount; +extern int icount_align_option; #include "qemu/osdep.h" #include "qemu/bswap.h" diff --git a/qemu-options.hx b/qemu-options.hx index 143def4..379d932 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -3011,9 +3011,9 @@ re-inject them. ETEXI DEF("icount", HAS_ARG, QEMU_OPTION_icount, \ -"-icount [shift=N|auto]\n" \ +"-icount [shift=N|auto][,align=on|off]\n" \ "enable virtual instruction counter with 2^N clock ticks per\n" \ -"instruction\n", QEMU_ARCH_ALL) +"instruction and enable aligning the host and virtual clocks\n", QEMU_ARCH_ALL) STEXI @item -icount [shift=@var{N}|auto] @findex -icount @@ -3026,6 +3026,17 @@ Note that while this option can give deterministic behavior, it does not provide cycle accurate emulation. Modern CPUs contain superscalar out of order cores with complex cache hierarchies. The number of instructions executed often has little or no correlation with actual performance. + +@option{align=on} will activate the delay algorithm which will try to +to synchronise the host clock and the virtual clock. The goal is to +have a guest running at the real frequency imposed by the shift option. +Whenever the guest clock is behind the host clock and if +@option{align=on} is specified then we print a messsage to the user +to inform about the delay. +Currently this option does not work when @option{shift} is @code{auto}. +Note: The sync algorithm will work for those shift values for which +the guest clock runs ahead of the host clock. Typically this happens +when the shift value is high (how high depends on the host machine). ETEXI DEF("watchdog", HAS_ARG, QEMU_OPTION_watchdog, \ diff --git a/vl.c b/vl.c index 103027f..d270070 100644 --- a/vl.c +++ b/vl.c @@ -183,6 +183,7 @@ uint8_t *boot_splash_filedata; size_t boot_splash_filedata_size; uint8_t qemu_extra_params_fw[2]; +int icount_align_option; typedef struct FWBootEntry FWBootEntry; struct FWBootEntry { @@ -546,6 +547,9 @@ static QemuOptsList qemu_icount_opts = { { .name = "shift", .type = QEMU_OPT_STRING, +}, { +.name = "align", +.type = QEMU_OPT_BOOL, }, { /* end of list */ } }, -- 2.0.0.rc2
[Qemu-devel] [PATCH V5 3/6] icount: Make icount_time_shift available everywhere
icount_time_shift is used for calculting the delay qemu has to sleep in order to synchronise the host and guest clocks. Therefore, we need it in cpu-exec.c. Signed-off-by: Sebastian Tanase Tested-by: Camille Bégué Signed-off-by: Paolo Bonzini --- cpus.c| 8 ++-- include/qemu-common.h | 1 + 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/cpus.c b/cpus.c index 34cc4c8..de0a8b2 100644 --- a/cpus.c +++ b/cpus.c @@ -105,8 +105,12 @@ static bool all_cpu_threads_idle(void) /* Compensate for varying guest execution speed. */ static int64_t qemu_icount_bias; static int64_t vm_clock_warp_start; -/* Conversion factor from emulated instructions to virtual clock ticks. */ -static int icount_time_shift; +/* Conversion factor from emulated instructions to virtual clock ticks. + * icount_time_shift is defined as extern in include/qemu-common.h because + * it is used (in cpu-exec.c) for calculating the delay for sleeping + * qemu in order to align the host and virtual clocks. + */ +int icount_time_shift; /* Arbitrarily pick 1MIPS as the minimum allowable speed. */ #define MAX_ICOUNT_SHIFT 10 diff --git a/include/qemu-common.h b/include/qemu-common.h index 860bb15..d47aa02 100644 --- a/include/qemu-common.h +++ b/include/qemu-common.h @@ -109,6 +109,7 @@ static inline char *realpath(const char *path, char *resolved_path) void configure_icount(QemuOpts *opts, Error **errp); extern int use_icount; extern int icount_align_option; +extern int icount_time_shift; #include "qemu/osdep.h" #include "qemu/bswap.h" -- 2.0.0.rc2
[Qemu-devel] [PATCH V5 4/6] cpu_exec: Add sleeping algorithm
The goal is to sleep qemu whenever the guest clock is in advance compared to the host clock (we use the monotonic clocks). The amount of time to sleep is calculated in the execution loop in cpu_exec. At first, we tried to approximate at each for loop the real time elapsed while searching for a TB (generating or retrieving from cache) and executing it. We would then approximate the virtual time corresponding to the number of virtual instructions executed. The difference between these 2 values would allow us to know if the guest is in advance or delayed. However, the function used for measuring the real time (qemu_clock_get_ns(QEMU_CLOCK_REALTIME)) proved to be very expensive. We had an added overhead of 13% of the total run time. Therefore, we modified the algorithm and only take into account the difference between the 2 clocks at the begining of the cpu_exec function. During the for loop we try to reduce the advance of the guest only by computing the virtual time elapsed and sleeping if necessary. The overhead is thus reduced to 3%. Even though this method still has a noticeable overhead, it no longer is a bottleneck in trying to achieve a better guest frequency for which the guest clock is faster than the host one. As for the the alignement of the 2 clocks, with the first algorithm the guest clock was oscillating between -1 and 1ms compared to the host clock. Using the second algorithm we notice that the guest is 5ms behind the host, which is still acceptable for our use case. The tests where conducted using fio and stress. The host machine in an i5 CPU at 3.10GHz running Debian Jessie (kernel 3.12). The guest machine is an arm versatile-pb built with buildroot. Currently, on our test machine, the lowest icount we can achieve that is suitable for aligning the 2 clocks is 6. However, we observe that the IO tests (using fio) are slower than the cpu tests (using stress). Signed-off-by: Sebastian Tanase Tested-by: Camille Bégué Signed-off-by: Paolo Bonzini --- cpu-exec.c | 91 cpus.c | 17 ++ include/qemu/timer.h | 1 + 3 files changed, 109 insertions(+) diff --git a/cpu-exec.c b/cpu-exec.c index 38e5f02..1a725b6 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -22,6 +22,84 @@ #include "tcg.h" #include "qemu/atomic.h" #include "sysemu/qtest.h" +#include "qemu/timer.h" + +/* -icount align implementation. */ + +typedef struct SyncClocks { +int64_t diff_clk; +int64_t original_instr_counter; +} SyncClocks; + +#if !defined(CONFIG_USER_ONLY) +/* Allow the guest to have a max 3ms advance. + * The difference between the 2 clocks could therefore + * oscillate around 0. + */ +#define VM_CLOCK_ADVANCE 300 + +static int64_t delay_host(int64_t diff_clk) +{ +if (diff_clk > VM_CLOCK_ADVANCE) { +#ifndef _WIN32 +struct timespec sleep_delay, rem_delay; +sleep_delay.tv_sec = diff_clk / 10LL; +sleep_delay.tv_nsec = diff_clk % 10LL; +if (nanosleep(&sleep_delay, &rem_delay) < 0) { +diff_clk -= (sleep_delay.tv_sec - rem_delay.tv_sec) * 10LL; +diff_clk -= sleep_delay.tv_nsec - rem_delay.tv_nsec; +} else { +diff_clk = 0; +} +#else +Sleep(diff_clk / SCALE_MS); +diff_clk = 0; +#endif +} +return diff_clk; +} + +static int64_t instr_to_vtime(int64_t instr_counter, const CPUState *cpu) +{ +int64_t instr_exec_time; +instr_exec_time = instr_counter - + (cpu->icount_extra + + cpu->icount_decr.u16.low); +instr_exec_time = instr_exec_time << icount_time_shift; + +return instr_exec_time; +} + +static void align_clocks(SyncClocks *sc, const CPUState *cpu) +{ +if (!icount_align_option) { +return; +} +sc->diff_clk += instr_to_vtime(sc->original_instr_counter, cpu); +sc->original_instr_counter = cpu->icount_extra + cpu->icount_decr.u16.low; +sc->diff_clk = delay_host(sc->diff_clk); +} + +static void init_delay_params(SyncClocks *sc, + const CPUState *cpu) +{ +if (!icount_align_option) { +return; +} +sc->diff_clk = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) - + qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + + cpu_get_clock_offset(); +sc->original_instr_counter = cpu->icount_extra + cpu->icount_decr.u16.low; +} +#else +static void align_clocks(SyncClocks *sc, const CPUState *cpu) +{ +} + +static void init_delay_params(SyncClocks *sc, const CPUState *cpu) +{ +} +#endif /* CONFIG USER ONLY */ void cpu_loop_exit(CPUState *cpu) { @@ -227,6 +305,8 @@ int cpu_exec(CPUArchState *env) TranslationBlock *tb; uint8_t *tc_ptr; uintptr_t next_tb; +SyncClocks sc; + /* This must be volatile so it is not trashed by longjmp() */ volatile bool have_tb_lock = false; @@ -283,6 +363,13 @@ int cpu_exec(CPUArchState *env) #endif
[Qemu-devel] [PATCH V5 6/6] monitor: Add drift info to 'info jit'
Show in 'info jit' the current delay between the host clock and the guest clock. In addition, print the maximum advance and delay of the guest compared to the host. Signed-off-by: Sebastian Tanase Tested-by: Camille Bégué --- cpu-exec.c| 6 ++ cpus.c| 15 +++ include/qemu-common.h | 4 monitor.c | 1 + 4 files changed, 26 insertions(+) diff --git a/cpu-exec.c b/cpu-exec.c index 7259c94..533e0bf 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -117,6 +117,12 @@ static void init_delay_params(SyncClocks *sc, sc->realtime_clock + cpu_get_clock_offset(); sc->original_instr_counter = cpu->icount_extra + cpu->icount_decr.u16.low; +if (sc->diff_clk < max_delay) { +max_delay = sc->diff_clk; +} +if (sc->diff_clk > max_advance) { +max_advance = sc->diff_clk; +} /* Print every 2s max if the guest is late. We limit the number of printed messages to NB_PRINT_MAX(currently 100) */ print_delay(sc); diff --git a/cpus.c b/cpus.c index fcc5308..c98036e 100644 --- a/cpus.c +++ b/cpus.c @@ -64,6 +64,8 @@ #endif /* CONFIG_LINUX */ static CPUState *next_cpu; +int64_t max_delay; +int64_t max_advance; bool cpu_is_stopped(CPUState *cpu) { @@ -1521,3 +1523,16 @@ void qmp_inject_nmi(Error **errp) error_set(errp, QERR_UNSUPPORTED); #endif } + +void dump_drift_info(FILE *f, fprintf_function cpu_fprintf) +{ +cpu_fprintf(f, "Host - Guest clock %ld(ms)\n", +(cpu_get_clock() - cpu_get_icount())/SCALE_MS); +if (icount_align_option) { +cpu_fprintf(f, "Max guest delay %ld(ms)\n", -max_delay/SCALE_MS); +cpu_fprintf(f, "Max guest advance %ld(ms)\n", max_advance/SCALE_MS); +} else { +cpu_fprintf(f, "Max guest delay NA\n"); +cpu_fprintf(f, "Max guest advance NA\n"); +} +} diff --git a/include/qemu-common.h b/include/qemu-common.h index d47aa02..55971df 100644 --- a/include/qemu-common.h +++ b/include/qemu-common.h @@ -110,6 +110,10 @@ void configure_icount(QemuOpts *opts, Error **errp); extern int use_icount; extern int icount_align_option; extern int icount_time_shift; +/* drift information for info jit command */ +extern int64_t max_delay; +extern int64_t max_advance; +void dump_drift_info(FILE *f, fprintf_function cpu_fprintf); #include "qemu/osdep.h" #include "qemu/bswap.h" diff --git a/monitor.c b/monitor.c index 5bc70a6..cdbaa60 100644 --- a/monitor.c +++ b/monitor.c @@ -1047,6 +1047,7 @@ static void do_info_registers(Monitor *mon, const QDict *qdict) static void do_info_jit(Monitor *mon, const QDict *qdict) { dump_exec_info((FILE *)mon, monitor_fprintf); +dump_drift_info((FILE *)mon, monitor_fprintf); } static void do_info_history(Monitor *mon, const QDict *qdict) -- 2.0.0.rc2
[Qemu-devel] [PATCH V5 5/6] cpu_exec: Print to console if the guest is late
If the align option is enabled, we print to the user whenever the guest clock is behind the host clock in order for he/she to have a hint about the actual performance. The maximum print interval is 2s and we limit the number of messages to 100. If desired, this can be changed in cpu-exec.c Signed-off-by: Sebastian Tanase Tested-by: Camille Bégué Conflicts: cpu-exec.c --- cpu-exec.c | 32 +++- 1 file changed, 31 insertions(+), 1 deletion(-) diff --git a/cpu-exec.c b/cpu-exec.c index 1a725b6..7259c94 100644 --- a/cpu-exec.c +++ b/cpu-exec.c @@ -29,6 +29,7 @@ typedef struct SyncClocks { int64_t diff_clk; int64_t original_instr_counter; +int64_t realtime_clock; } SyncClocks; #if !defined(CONFIG_USER_ONLY) @@ -37,6 +38,9 @@ typedef struct SyncClocks { * oscillate around 0. */ #define VM_CLOCK_ADVANCE 300 +#define THRESHOLD_REDUCE 1.5 +#define MAX_DELAY_PRINT_RATE 2 +#define MAX_NB_PRINTS 100 static int64_t delay_host(int64_t diff_clk) { @@ -80,16 +84,42 @@ static void align_clocks(SyncClocks *sc, const CPUState *cpu) sc->diff_clk = delay_host(sc->diff_clk); } +static void print_delay(const SyncClocks *sc) +{ +static float threshold_delay; +static int64_t last_realtime_clock; +static int nb_prints; + +if (icount_align_option && +(sc->realtime_clock - last_realtime_clock) / 10LL +>= MAX_DELAY_PRINT_RATE && nb_prints < MAX_NB_PRINTS) { +if ((-sc->diff_clk / (float)10LL > threshold_delay) || +(-sc->diff_clk / (float)10LL < + (threshold_delay - THRESHOLD_REDUCE))) { +threshold_delay = (-sc->diff_clk / 10LL) + 1; +printf("Warning: The guest is now late by %.1f to %.1f seconds\n", + threshold_delay - 1, + threshold_delay); +nb_prints++; +last_realtime_clock = sc->realtime_clock; +} +} +} + static void init_delay_params(SyncClocks *sc, const CPUState *cpu) { if (!icount_align_option) { return; } +sc->realtime_clock = qemu_clock_get_ns(QEMU_CLOCK_REALTIME); sc->diff_clk = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) - - qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + + sc->realtime_clock + cpu_get_clock_offset(); sc->original_instr_counter = cpu->icount_extra + cpu->icount_decr.u16.low; +/* Print every 2s max if the guest is late. We limit the number + of printed messages to NB_PRINT_MAX(currently 100) */ +print_delay(sc); } #else static void align_clocks(SyncClocks *sc, const CPUState *cpu) -- 2.0.0.rc2
Re: [Qemu-devel] [PATCH V5 0/6] icount: Implement delay algorithm between guest and host clocks
Am 25.07.2014 11:56, schrieb Sebastian Tanase: > Sebastian Tanase (6): > icount: Add QemuOpts for icount > icount: Add align option to icount > icount: Make icount_time_shift available everywhere > cpu_exec: Add sleeping algorithm > cpu_exec: Print to console if the guest is late FWIW the file name is cpu-exec, please fix if you need to resend. Thanks, Andreas > monitor: Add drift info to 'info jit' > > cpu-exec.c| 127 > ++ > cpus.c| 59 +-- > include/qemu-common.h | 9 +++- > include/qemu/timer.h | 1 + > monitor.c | 1 + > qemu-options.hx | 17 +-- > qtest.c | 13 +- > vl.c | 39 +--- > 8 files changed, 248 insertions(+), 18 deletions(-) -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH V5 4/6] cpu_exec: Add sleeping algorithm
Il 25/07/2014 11:56, Sebastian Tanase ha scritto: > The goal is to sleep qemu whenever the guest clock > is in advance compared to the host clock (we use > the monotonic clocks). The amount of time to sleep > is calculated in the execution loop in cpu_exec. > > At first, we tried to approximate at each for loop the real time elapsed > while searching for a TB (generating or retrieving from cache) and > executing it. We would then approximate the virtual time corresponding > to the number of virtual instructions executed. The difference between > these 2 values would allow us to know if the guest is in advance or delayed. > However, the function used for measuring the real time > (qemu_clock_get_ns(QEMU_CLOCK_REALTIME)) proved to be very expensive. > We had an added overhead of 13% of the total run time. > > Therefore, we modified the algorithm and only take into account the > difference between the 2 clocks at the begining of the cpu_exec function. > During the for loop we try to reduce the advance of the guest only by > computing the virtual time elapsed and sleeping if necessary. The overhead > is thus reduced to 3%. Even though this method still has a noticeable > overhead, it no longer is a bottleneck in trying to achieve a better > guest frequency for which the guest clock is faster than the host one. > > As for the the alignement of the 2 clocks, with the first algorithm > the guest clock was oscillating between -1 and 1ms compared to the host clock. > Using the second algorithm we notice that the guest is 5ms behind the host, > which > is still acceptable for our use case. > > The tests where conducted using fio and stress. The host machine in an i5 CPU > at > 3.10GHz running Debian Jessie (kernel 3.12). The guest machine is an arm > versatile-pb > built with buildroot. > > Currently, on our test machine, the lowest icount we can achieve that is > suitable for > aligning the 2 clocks is 6. However, we observe that the IO tests (using fio) > are > slower than the cpu tests (using stress). > > Signed-off-by: Sebastian Tanase > Tested-by: Camille Bégué > Signed-off-by: Paolo Bonzini > --- > cpu-exec.c | 91 > > cpus.c | 17 ++ > include/qemu/timer.h | 1 + > 3 files changed, 109 insertions(+) > > diff --git a/cpu-exec.c b/cpu-exec.c > index 38e5f02..1a725b6 100644 > --- a/cpu-exec.c > +++ b/cpu-exec.c > @@ -22,6 +22,84 @@ > #include "tcg.h" > #include "qemu/atomic.h" > #include "sysemu/qtest.h" > +#include "qemu/timer.h" > + > +/* -icount align implementation. */ > + > +typedef struct SyncClocks { > +int64_t diff_clk; > +int64_t original_instr_counter; > +} SyncClocks; > + > +#if !defined(CONFIG_USER_ONLY) > +/* Allow the guest to have a max 3ms advance. > + * The difference between the 2 clocks could therefore > + * oscillate around 0. > + */ > +#define VM_CLOCK_ADVANCE 300 > + > +static int64_t delay_host(int64_t diff_clk) > +{ > +if (diff_clk > VM_CLOCK_ADVANCE) { > +#ifndef _WIN32 > +struct timespec sleep_delay, rem_delay; > +sleep_delay.tv_sec = diff_clk / 10LL; > +sleep_delay.tv_nsec = diff_clk % 10LL; > +if (nanosleep(&sleep_delay, &rem_delay) < 0) { > +diff_clk -= (sleep_delay.tv_sec - rem_delay.tv_sec) * > 10LL; > +diff_clk -= sleep_delay.tv_nsec - rem_delay.tv_nsec; > +} else { > +diff_clk = 0; > +} > +#else > +Sleep(diff_clk / SCALE_MS); > +diff_clk = 0; > +#endif > +} > +return diff_clk; > +} > + > +static int64_t instr_to_vtime(int64_t instr_counter, const CPUState *cpu) > +{ > +int64_t instr_exec_time; > +instr_exec_time = instr_counter - > + (cpu->icount_extra + > + cpu->icount_decr.u16.low); > +instr_exec_time = instr_exec_time << icount_time_shift; > + > +return instr_exec_time; > +} > + > +static void align_clocks(SyncClocks *sc, const CPUState *cpu) > +{ > +if (!icount_align_option) { > +return; > +} > +sc->diff_clk += instr_to_vtime(sc->original_instr_counter, cpu); > +sc->original_instr_counter = cpu->icount_extra + > cpu->icount_decr.u16.low; > +sc->diff_clk = delay_host(sc->diff_clk); > +} Just two comments: 1) perhaps s/original/last/ in original_instr_counter? 2) I think I prefer this to be written like: instr_counter = cpu->icount_extra + cpu->icount_decr.u16.low; instr_exec_time = sc->original_instr_counter - instr_counter; sc->original_instr_counter = instr_counter sc->diff_clk += instr_exec_time << icount_time_shift; sc->diff_clk = delay_host(sc->diff_clk); If you agree, I can do it when applying the patches. Thanks for your persistence, I'm very happy with this version! As a follow up, do you think it's possible to modify the places where you run align_clocks, so that you sleep with the iothread mutex *not* ta
Re: [Qemu-devel] [PATCH 2/3] target-arm: A64: fix TLB flush instructions
Peter Maydell writes: > On 24 July 2014 16:52, Alex Bennée wrote: >> +/* See: D4.7.2 TLB maintenance requirements and the TLB maintenance >> instructions >> + * Page D4-1736 (DDI0487A.b) "For TLB maintenance instructions that >> + * take an address, the maintenance of VA[63:56] is interpreted as >> + * being the same as the maintenance of VA[55]" >> + */ > > I'd rather we didn't quote this bit of the ARM ARM, because it's > obviously mangled (I'm pretty sure it should say "the value of > VA[..]"). Is it OK to still reference the ARM ARM because otherwise the sign extension would look a little weird without context (although obviously we have a commit message to say we fixed something). > > Otherwise > Reviewed-by: Peter Maydell > > thanks > -- PMM -- Alex Bennée
Re: [Qemu-devel] [questions] about qemu log
Zhang Haoyu writes: > Hi, all > > If I use qemu command directly to run vm, bypass libvirt, how to configure > qemu to assure that each vm has its own log file, like vmname.log? > For example, VM: rhel7-net has its own log file, rhel7-net.log, > VM:rhel7-stor has its own log file, rhel7-stor.log. -D /path/to/unique/file/name.log Or am I misunderstanding what you want? -- Alex Bennée
[Qemu-devel] [PATCH] target-i386/fpu_helper.c: fbld instruction doesn't set minus sign
Obviously, there is a misprint in function implementation. From: Dmitry Poletaev Signed-off-by: Dmitry Poletaev --- target-i386/fpu_helper.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/target-i386/fpu_helper.c b/target-i386/fpu_helper.c index 1b2900d..be1e545 100644 --- a/target-i386/fpu_helper.c +++ b/target-i386/fpu_helper.c @@ -622,7 +622,7 @@ void helper_fbld_ST0(CPUX86State *env, target_ulong ptr) } tmp = int64_to_floatx80(val, &env->fp_status); if (cpu_ldub_data(env, ptr + 9) & 0x80) { -floatx80_chs(tmp); +tmp = floatx80_chs(tmp); } fpush(env); ST0 = tmp; -- 1.8.4.msysgit.0
Re: [Qemu-devel] [questions] about qemu log
>> Hi, all >> >> If I use qemu command directly to run vm, bypass libvirt, how to configure >> qemu to assure that each vm has its own log file, like vmname.log? >> For example, VM: rhel7-net has its own log file, rhel7-net.log, >> VM:rhel7-stor has its own log file, rhel7-stor.log. > >-D /path/to/unique/file/name.log > -D option is to configure qemu_logfile for the output logs which are controlled by qmp command "logfile", which can be enabled/disabled on run time. I want to configure the log file for the output of fprintf(stderr, fmt, ...), .etc, i.e., how to redirect the output of fprintf(stderr, fmt, ...), or some other log-interface to a specified file? I saw the configuration in qemuStateInitialize(), libvirt code, but now I run the vm directly through qemu command, bypass libvirt. Thanks, Zhang Haoyu >Or am I misunderstanding what you want?
Re: [Qemu-devel] [PULL for-2.1 0/2] vnc: fix two vnc update issues.
On 25 July 2014 08:45, Gerd Hoffmann wrote: > On Do, 2014-07-24 at 15:22 +0100, Peter Maydell wrote: >> On 24 July 2014 15:10, Gerd Hoffmann wrote: >> > Hi, >> > >> >> > So are these *really* release critical bugs, if they've been >> >> > only found in code review? We're really close to release now >> >> > and so my preference is not to include changes unless they're >> >> > really necessary... >> >> >> >> These are fixing openQA breakage (os-autoinst), >> > >> > In more detail: >> > [snip] >> >> Thanks for the clarification; these do seem worth putting in 2.1. >> I'll need to get you to respin with the missing signed-off-by >> that Andreas pointed out, though. > > Done: git://git.kraxel.org/qemu tags/pull-vnc-20140725-1 Applied, thanks. -- PMM
Re: [Qemu-devel] [PATCH] cpu-exec: make TBs generated codes unlinked when -singlestep
Hello, On Fri, Jul 25, 2014 at 6:37 AM, Jincheng Miao wrote: > '-singlestep' option will make TB contains only one instruction, > so that the qemu_log could output trace log when CPU_LOG_EXEC sets, > and it could help developers to debug control flow. > > But currently, in cpu_exec(), it doesn't check singlestep when > tb_add_jump(), so the TB linked is executed siliently. > Therefore, this patch adds singlestep check before tb_add_jump(). > > Signed-off-by: Jincheng Miao I tested your patch in an environment generating run time traces and it works fine. Tested-by: Laurent Desnogues Thanks, Laurent > --- > cpu-exec.c |4 ++-- > 1 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/cpu-exec.c b/cpu-exec.c > index 38e5f02..64b7289 100644 > --- a/cpu-exec.c > +++ b/cpu-exec.c > @@ -622,8 +622,8 @@ int cpu_exec(CPUArchState *env) > } > /* see if we can patch the calling TB. When the TB > spans two pages, we cannot safely do a direct > - jump. */ > -if (next_tb != 0 && tb->page_addr[1] == -1) { > + jump. So as when singlestep is enabled. */ > +if (next_tb != 0 && tb->page_addr[1] == -1 && !singlestep) { > tb_add_jump((TranslationBlock *)(next_tb & > ~TB_EXIT_MASK), > next_tb & TB_EXIT_MASK, tb); > } > -- > 1.7.1 > >
[Qemu-devel] [PATCH RFC 3/3] dataplane: stop trying on notifier error
If we fail to set up guest or host notifiers, there's no use trying again every time the guest kicks, so disable dataplane in that case. Acked-by: Christian Borntraeger Signed-off-by: Cornelia Huck --- hw/block/dataplane/virtio-blk.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index 94e1a29..24a6b71 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -28,6 +28,7 @@ struct VirtIOBlockDataPlane { bool started; bool starting; bool stopping; +bool disabled; VirtIOBlkConf *blk; @@ -220,7 +221,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) VirtQueue *vq; int r; -if (s->started) { +if (s->started || s->disabled) { return; } @@ -274,6 +275,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) k->set_guest_notifiers(qbus->parent, 1, false); fail_guest_notifiers: vring_teardown(&s->vring, s->vdev, 0); +s->disabled = true; fail_vring: s->starting = false; } @@ -284,6 +286,13 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s) BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s->vdev))); VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus); VirtIOBlock *vblk = VIRTIO_BLK(s->vdev); + + +/* Better luck next time. */ +if (s->disabled) { +s->disabled = false; +return; +} if (!s->started || s->stopping) { return; } -- 1.7.9.5
[Qemu-devel] [PATCH RFC 0/3] dataplane: dataplane: more graceful error handling
Currently, qemu will take a hard exit if it fails to set up guest or host notifiers, giving no real clue as to what went wrong (e.g., when out of file descriptors). This patchset tries to make this more manageable: Both by improving the error message and by gracefully falling back to non-dataplane in case of errors. Patches are also available on git://github.com/cohuck/qemu dataplane-graceful-fail Thoughts? Cornelia Huck (3): dataplane: print why starting failed dataplane: fail notifier setting gracefully dataplane: stop trying on notifier error hw/block/dataplane/virtio-blk.c | 39 +-- 1 file changed, 29 insertions(+), 10 deletions(-) -- 1.7.9.5
[Qemu-devel] [PATCH RFC 1/3] dataplane: print why starting failed
Setting up guest or host notifiers may fail, but the user will have no idea why: Let's print the error returned by the callback. Acked-by: Christian Borntraeger Signed-off-by: Cornelia Huck --- hw/block/dataplane/virtio-blk.c | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index d6ba65c..527a53c 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -218,6 +218,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus); VirtIOBlock *vblk = VIRTIO_BLK(s->vdev); VirtQueue *vq; +int r; if (s->started) { return; @@ -236,16 +237,18 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) } /* Set up guest notifier (irq) */ -if (k->set_guest_notifiers(qbus->parent, 1, true) != 0) { -fprintf(stderr, "virtio-blk failed to set guest notifier, " -"ensure -enable-kvm is set\n"); +r = k->set_guest_notifiers(qbus->parent, 1, true); +if (r != 0) { +fprintf(stderr, "virtio-blk failed to set guest notifier (%d), " +"ensure -enable-kvm is set\n", r); exit(1); } s->guest_notifier = virtio_queue_get_guest_notifier(vq); /* Set up virtqueue notify */ -if (k->set_host_notifier(qbus->parent, 0, true) != 0) { -fprintf(stderr, "virtio-blk failed to set host notifier\n"); +r = k->set_host_notifier(qbus->parent, 0, true); +if (r != 0) { +fprintf(stderr, "virtio-blk failed to set host notifier (%d)\n", r); exit(1); } s->host_notifier = *virtio_queue_get_host_notifier(vq); -- 1.7.9.5
[Qemu-devel] [PATCH RFC 2/3] dataplane: fail notifier setting gracefully
The dataplane code is currently doing a hard exit if it fails to set up either guest or host notifiers. In practice, this may mean that a guest suddenly dies after a dataplane device failed to come up (e.g., when a file descriptor limit is hit for tne nth device). Let's just try to unwind the setup instead and return. Acked-by: Christian Borntraeger Signed-off-by: Cornelia Huck --- hw/block/dataplane/virtio-blk.c | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index 527a53c..94e1a29 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -232,8 +232,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) vq = virtio_get_queue(s->vdev, 0); if (!vring_setup(&s->vring, s->vdev, 0)) { -s->starting = false; -return; +goto fail_vring; } /* Set up guest notifier (irq) */ @@ -241,7 +240,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) if (r != 0) { fprintf(stderr, "virtio-blk failed to set guest notifier (%d), " "ensure -enable-kvm is set\n", r); -exit(1); +goto fail_guest_notifiers; } s->guest_notifier = virtio_queue_get_guest_notifier(vq); @@ -249,7 +248,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) r = k->set_host_notifier(qbus->parent, 0, true); if (r != 0) { fprintf(stderr, "virtio-blk failed to set host notifier (%d)\n", r); -exit(1); +goto fail_host_notifier; } s->host_notifier = *virtio_queue_get_host_notifier(vq); @@ -269,6 +268,14 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s) aio_context_acquire(s->ctx); aio_set_event_notifier(s->ctx, &s->host_notifier, handle_notify); aio_context_release(s->ctx); +return; + + fail_host_notifier: +k->set_guest_notifiers(qbus->parent, 1, false); + fail_guest_notifiers: +vring_teardown(&s->vring, s->vdev, 0); + fail_vring: +s->starting = false; } /* Context: QEMU global mutex held */ -- 1.7.9.5
[Qemu-devel] [PULL for-2.1 0/3] Last minute patches
The following changes since commit f368c33d5ab09dd5656924185cd975b11838cd25: Update version for v2.1.0-rc3 release (2014-07-22 18:17:03 +0100) are available in the git repository at: git://github.com/bonzini/qemu.git tags/for-upstream for you to fetch changes up to d975e28437377bc9a65fb5f4f9486a74c7a124cc: qemu-char: ignore flow control if a PTY's slave is not connected (2014-07-24 18:31:50 +0200) Here are my patches for the last 2.1 rc. Paolo Bonzini (3): acpi-dsdt: procedurally generate _PRT pc: hack for migration compatibility from QEMU 2.0 qemu-char: ignore flow control if a PTY's slave is not connected hw/i386/acpi-build.c| 75 +- hw/i386/acpi-dsdt.dsl | 90 +- hw/i386/acpi-dsdt.hex.generated | 1910 +++ hw/i386/pc_piix.c | 19 + hw/i386/pc_q35.c|5 + include/hw/i386/pc.h|1 + qemu-char.c |7 + 7 files changed, 250 insertions(+), 1857 deletions(-) -- 1.8.3.1
[Qemu-devel] [PULL 2/3] pc: hack for migration compatibility from QEMU 2.0
Changing the ACPI table size causes migration to break, and the memory hotplug work opened our eyes on how horribly we were breaking things in 2.0 already. The ACPI table size is rounded to the next 4k, which one would think gives some headroom. In practice this is not the case, because the user can control the ACPI table size (each CPU adds 97 bytes to the SSDT and 8 to the MADT) and so some "-smp" values will break the 4k boundary and fail to migrate. Similarly, PCI bridges add ~1870 bytes to the SSDT. To fix this, hard-code 64k as the maximum ACPI table size, which (despite being an order of magnitude smaller than 640k) should be enough for everyone. To fix migration from QEMU 2.0, compute the payload size of QEMU 2.0 and always use that one. The previous patch shrunk the ACPI tables enough that the QEMU 2.0 size should always be enough; non-AML tables can change depending on the configuration (especially MADT, SRAT, HPET) but they remain the same between QEMU 2.0 and 2.1, so we only compute our padding based on the sizes of the SSDT and DSDT. Migration from QEMU 1.7 should work for guests that have a number of CPUs other than 12, 13, 14, 54, 55, 56, 97, 98, 139, 140. It was already broken from QEMU 1.7 to QEMU 2.0 in the same way, though. Even with this patch, QEMU 1.7 and 2.0 have two different ideas of "-M pc-i440fx-2.0" when there are PCI bridges. Igor sent a patch to adopt the QEMU 1.7 definition. I think distributions should apply it if they move directly from QEMU 1.7 to 2.1+ without ever packaging version 2.0. Reviewed-by: Laszlo Ersek Tested-by: Igor Mammedov Signed-off-by: Paolo Bonzini --- hw/i386/acpi-build.c | 75 hw/i386/pc_piix.c| 19 + hw/i386/pc_q35.c | 5 include/hw/i386/pc.h | 1 + 4 files changed, 95 insertions(+), 5 deletions(-) diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index ebc5f03..7c6a7e3 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -25,7 +25,9 @@ #include #include "qemu-common.h" #include "qemu/bitmap.h" +#include "qemu/osdep.h" #include "qemu/range.h" +#include "qemu/error-report.h" #include "hw/pci/pci.h" #include "qom/cpu.h" #include "hw/i386/pc.h" @@ -52,6 +54,17 @@ #include "qapi/qmp/qint.h" #include "qom/qom-qobject.h" +/* These are used to size the ACPI tables for -M pc-i440fx-1.7 and + * -M pc-i440fx-2.0. Even if the actual amount of AML generated grows + * a little bit, there should be plenty of free space since the DSDT + * shrunk by ~1.5k between QEMU 2.0 and QEMU 2.1. + */ +#define ACPI_BUILD_CPU_AML_SIZE97 +#define ACPI_BUILD_BRIDGE_AML_SIZE 1875 +#define ACPI_BUILD_ALIGN_SIZE 0x1000 + +#define ACPI_BUILD_TABLE_SIZE 0x1 + typedef struct AcpiCpuInfo { DECLARE_BITMAP(found_cpus, ACPI_CPU_HOTPLUG_ID_LIMIT); } AcpiCpuInfo; @@ -87,6 +100,8 @@ typedef struct AcpiBuildPciBusHotplugState { struct AcpiBuildPciBusHotplugState *parent; } AcpiBuildPciBusHotplugState; +unsigned bsel_alloc; + static void acpi_get_dsdt(AcpiMiscInfo *info) { uint16_t *applesmc_sta; @@ -759,8 +774,8 @@ static void *acpi_set_bsel(PCIBus *bus, void *opaque) static void acpi_set_pci_info(void) { PCIBus *bus = find_i440fx(); /* TODO: Q35 support */ -unsigned bsel_alloc = 0; +assert(bsel_alloc == 0); if (bus) { /* Scan all PCI buses. Set property to enable acpi based hotplug. */ pci_for_each_bus_depth_first(bus, acpi_set_bsel, NULL, &bsel_alloc); @@ -1440,13 +1455,14 @@ static void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables) { GArray *table_offsets; -unsigned facs, dsdt, rsdt; +unsigned facs, ssdt, dsdt, rsdt; AcpiCpuInfo cpu; AcpiPmInfo pm; AcpiMiscInfo misc; AcpiMcfgInfo mcfg; PcPciInfo pci; uint8_t *u; +size_t aml_len = 0; acpi_get_cpu_info(&cpu); acpi_get_pm_info(&pm); @@ -1474,13 +1490,20 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables) dsdt = tables->table_data->len; build_dsdt(tables->table_data, tables->linker, &misc); +/* Count the size of the DSDT and SSDT, we will need it for legacy + * sizing of ACPI tables. + */ +aml_len += tables->table_data->len - dsdt; + /* ACPI tables pointed to by RSDT */ acpi_add_table(table_offsets, tables->table_data); build_fadt(tables->table_data, tables->linker, &pm, facs, dsdt); +ssdt = tables->table_data->len; acpi_add_table(table_offsets, tables->table_data); build_ssdt(tables->table_data, tables->linker, &cpu, &pm, &misc, &pci, guest_info); +aml_len += tables->table_data->len - ssdt; acpi_add_table(table_offsets, tables->table_data); build_madt(tables->table_data, tables->linker, &cpu, guest_info); @@ -1513,14 +1536,56 @@ void acpi_build(PcGuestInfo *guest_info, AcpiBuildTables *tables) /* RSDP is in FSEG memory, so allocate it separately */
[Qemu-devel] [PULL 1/3] acpi-dsdt: procedurally generate _PRT
This replaces the _PRT constant with a method that computes it. The problem is that the DSDT+SSDT have grown from 2.0 to 2.1, enough to cross the 8k barrier (we align the ACPI tables to 4k before putting them in fw_cfg). This causes problems with migration and the pc-i440fx-2.0 machine type. The solution to the problem is to hardcode 64k as the limit, but this doesn't solve the bug with pc-i440fx-2.0. The fix will be for QEMU 2.1 to use exactly the same size as QEMU 2.0 for the ACPI tables. First, however, we must make the actual AML equal or smaller; to do this, rewrite _PRT in a way that saves over 1k of bytecode. Reviewed-by: Laszlo Ersek Tested-by: Igor Mammedov Signed-off-by: Paolo Bonzini --- hw/i386/acpi-dsdt.dsl | 90 +- hw/i386/acpi-dsdt.hex.generated | 1910 +++ 2 files changed, 148 insertions(+), 1852 deletions(-) diff --git a/hw/i386/acpi-dsdt.dsl b/hw/i386/acpi-dsdt.dsl index 3cc0ea0..6ba0170 100644 --- a/hw/i386/acpi-dsdt.dsl +++ b/hw/i386/acpi-dsdt.dsl @@ -181,57 +181,45 @@ DefinitionBlock ( Scope(\_SB) { Scope(PCI0) { -Name(_PRT, Package() { -/* PCI IRQ routing table, example from ACPI 2.0a specification, - section 6.2.8.1 */ -/* Note: we provide the same info as the PCI routing - table of the Bochs BIOS */ - -#define prt_slot(nr, lnk0, lnk1, lnk2, lnk3) \ -Package() { nr##, 0, lnk0, 0 }, \ -Package() { nr##, 1, lnk1, 0 }, \ -Package() { nr##, 2, lnk2, 0 }, \ -Package() { nr##, 3, lnk3, 0 } - -#define prt_slot0(nr) prt_slot(nr, LNKD, LNKA, LNKB, LNKC) -#define prt_slot1(nr) prt_slot(nr, LNKA, LNKB, LNKC, LNKD) -#define prt_slot2(nr) prt_slot(nr, LNKB, LNKC, LNKD, LNKA) -#define prt_slot3(nr) prt_slot(nr, LNKC, LNKD, LNKA, LNKB) - -prt_slot0(0x), -/* Device 1 is power mgmt device, and can only use irq 9 */ -prt_slot(0x0001, LNKS, LNKB, LNKC, LNKD), -prt_slot2(0x0002), -prt_slot3(0x0003), -prt_slot0(0x0004), -prt_slot1(0x0005), -prt_slot2(0x0006), -prt_slot3(0x0007), -prt_slot0(0x0008), -prt_slot1(0x0009), -prt_slot2(0x000a), -prt_slot3(0x000b), -prt_slot0(0x000c), -prt_slot1(0x000d), -prt_slot2(0x000e), -prt_slot3(0x000f), -prt_slot0(0x0010), -prt_slot1(0x0011), -prt_slot2(0x0012), -prt_slot3(0x0013), -prt_slot0(0x0014), -prt_slot1(0x0015), -prt_slot2(0x0016), -prt_slot3(0x0017), -prt_slot0(0x0018), -prt_slot1(0x0019), -prt_slot2(0x001a), -prt_slot3(0x001b), -prt_slot0(0x001c), -prt_slot1(0x001d), -prt_slot2(0x001e), -prt_slot3(0x001f), -}) +Method (_PRT, 0) { +Store(Package(128) {}, Local0) +Store(Zero, Local1) +While(LLess(Local1, 128)) { +// slot = pin >> 2 +Store(ShiftRight(Local1, 2), Local2) + +// lnk = (slot + pin) & 3 +Store(And(Add(Local1, Local2), 3), Local3) +If (LEqual(Local3, 0)) { +Store(Package(4) { Zero, Zero, LNKD, Zero }, Local4) +} +If (LEqual(Local3, 1)) { +// device 1 is the power-management device, needs SCI +If (LEqual(Local1, 4)) { +Store(Package(4) { Zero, Zero, LNKS, Zero }, Local4) +} Else { +Store(Package(4) { Zero, Zero, LNKA, Zero }, Local4) +} +} +If (LEqual(Local3, 2)) { +Store(Package(4) { Zero, Zero, LNKB, Zero }, Local4) +} +If (LEqual(Local3, 3)) { +Store(Package(4) { Zero, Zero, LNKC, Zero }, Local4) +} + +// Complete the interrupt routing entry: +//Package(4) { 0x[slot], [pin], [link], 0) } + +Store(Or(ShiftLeft(Local2, 16), 0x), Index(Local4, 0)) +Store(And(Local1, 3),Index(Local4, 1)) +Store(Local4,Index(Local0, Local1)) + +Increment(Local1) +} + +Return(Local0) +} } Field(PCI0.ISA.P40C, ByteAcc, NoLock, Preserve) { diff --git a/hw/i386/acpi-dsdt.hex.generated b/hw/i386/acpi-dsdt
[Qemu-devel] [PULL 3/3] qemu-char: ignore flow control if a PTY's slave is not connected
After commit f702e62 (serial: change retry logic to avoid concurrency, 2014-07-11), guest boot hangs if the backend is an unconnected PTY. The reason is that PTYs do not support G_IO_HUP, and serial_xmit is never called. To fix this, simply invoke serial_xmit immediately (via g_idle_source_new) when this happens. Tested-by: Pavel Hrdina Signed-off-by: Paolo Bonzini --- qemu-char.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/qemu-char.c b/qemu-char.c index 7acc03f..956be49 100644 --- a/qemu-char.c +++ b/qemu-char.c @@ -1168,6 +1168,9 @@ static int pty_chr_write(CharDriverState *chr, const uint8_t *buf, int len) static GSource *pty_chr_add_watch(CharDriverState *chr, GIOCondition cond) { PtyCharDriver *s = chr->opaque; +if (!s->connected) { +return NULL; +} return g_io_create_watch(s->fd, cond); } @@ -3664,6 +3667,10 @@ int qemu_chr_fe_add_watch(CharDriverState *s, GIOCondition cond, } src = s->chr_add_watch(s, cond); +if (!src) { +return -EINVAL; +} + g_source_set_callback(src, (GSourceFunc)func, user_data, NULL); tag = g_source_attach(src, NULL); g_source_unref(src); -- 1.8.3.1
[Qemu-devel] [PULL v2 0/1] Fix for "-serial pty" regression
The following changes since commit f368c33d5ab09dd5656924185cd975b11838cd25: Update version for v2.1.0-rc3 release (2014-07-22 18:17:03 +0100) are available in the git repository at: git://github.com/bonzini/qemu.git tags/for-upstream for you to fetch changes up to 62c339c5272ce8fbe8ca52695cee8ff40da7872e: qemu-char: ignore flow control if a PTY's slave is not connected (2014-07-25 14:36:07 +0200) Here is the serial fix for 2.1. Paolo Bonzini (1): qemu-char: ignore flow control if a PTY's slave is not connected qemu-char.c | 7 +++ 1 file changed, 7 insertions(+) -- 1.8.3.1
[Qemu-devel] [PULL v2 1/1] qemu-char: ignore flow control if a PTY's slave is not connected
After commit f702e62 (serial: change retry logic to avoid concurrency, 2014-07-11), guest boot hangs if the backend is an unconnected PTY. The reason is that PTYs do not support G_IO_HUP, and serial_xmit is never called. To fix this, simply invoke serial_xmit immediately (via g_idle_source_new) when this happens. Tested-by: Pavel Hrdina Signed-off-by: Paolo Bonzini --- qemu-char.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/qemu-char.c b/qemu-char.c index 7acc03f..956be49 100644 --- a/qemu-char.c +++ b/qemu-char.c @@ -1168,6 +1168,9 @@ static int pty_chr_write(CharDriverState *chr, const uint8_t *buf, int len) static GSource *pty_chr_add_watch(CharDriverState *chr, GIOCondition cond) { PtyCharDriver *s = chr->opaque; +if (!s->connected) { +return NULL; +} return g_io_create_watch(s->fd, cond); } @@ -3664,6 +3667,10 @@ int qemu_chr_fe_add_watch(CharDriverState *s, GIOCondition cond, } src = s->chr_add_watch(s, cond); +if (!src) { +return -EINVAL; +} + g_source_set_callback(src, (GSourceFunc)func, user_data, NULL); tag = g_source_attach(src, NULL); g_source_unref(src); -- 1.8.3.1
Re: [Qemu-devel] [PATCH 1/4] block/parallels: extend parallels format header with actual data values
On 24/07/14 22:34, Jeff Cody wrote: On Tue, Jul 22, 2014 at 05:19:34PM +0400, Denis V. Lunev wrote: Parallels image format has several additional fields inside: - nb_sectors is actually 64 bit wide. Upper 32bits are not used for images with signature "WithoutFreeSpace" and must be explicitely s/explicitely/explicitly zeroed according to Parallels. They will be used for images with signature "WithouFreSpacExt" - inuse is magic which means that the image is currently opened for read/write or was not closed correctly, the magic is 0x746f6e59 - data_off is the location of the first data block. It can be zero and in this case I think you may have forgotten to finish this sentence :) ok This patch adds these values to struct parallels_header and adds proper handling of nb_sectors for currently supported WithoutFreeSpace images. WithouFreSpacExt will be covered in the next patch. Signed-off-by: Denis V. Lunev CC: Kevin Wolf CC: Stefan Hajnoczi --- block/parallels.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/block/parallels.c b/block/parallels.c index 1a5bd35..c44df87 100644 --- a/block/parallels.c +++ b/block/parallels.c @@ -41,8 +41,10 @@ struct parallels_header { uint32_t cylinders; uint32_t tracks; uint32_t catalog_entries; -uint32_t nb_sectors; -char padding[24]; +uint64_t nb_sectors; +uint32_t inuse; +uint32_t data_off; +char padding[12]; } QEMU_PACKED; typedef struct BDRVParallelsState { @@ -90,7 +92,7 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags, goto fail; } -bs->total_sectors = le32_to_cpu(ph.nb_sectors); +bs->total_sectors = (uint32_t)le64_to_cpu(ph.nb_sectors); I think an explicit bit mask on the upper 32 bits would fit better here than a cast, especially since neither 'bs->total_sectors' nor 'ph.nb_sectors' is a uint32_t. E.g.: bs->total_sectors = 0x & le64_to_cpu(ph.nb_sectors); ok, will do
[Qemu-devel] [Bug 1307225] Re: Running a virtual machine on a Haswell system produces machine check events
I can confirm this. Using qemu-kvm for three virtual machines on Ubuntu 14.04 LTS using a Intel i7-4770 Haswell based server. dmesg: [63429.847437] mce: [Hardware Error]: Machine check events logged [65996.795630] mce: [Hardware Error]: Machine check events logged mcelog: Hardware event. This is not a software error. MCE 0 CPU 2 BANK 0 TIME 1406265172 Fri Jul 25 07:12:52 2014 MCG status: MCi status: Corrected error Error enabled MCA: Internal parity error STATUS 904f0005 MCGSTATUS 0 MCGCAP c09 APICID 4 SOCKETID 0 CPUID Vendor Intel Family 6 Model 60 It's the same error everytime, only APICID and CPU numbers are different. The mce errors did not happen until i migrated the virtual machines from another system, the haswell-server was up for three days without any incidents, now, while running qemu-kvm there is a mce error every 6-12 hours. After the first errors, i called the support of my server provider, they first exchanged RAM, upgraded BIOS... Then, they replaced the whole server, only swapping my harddisks to the new one. But even that didn't help, i still got MCE errors. The harddisks where replaced too, one at a time (to resync raid). Now, i have a completely swapped hardware, but the MCE errors are still popping up. system information attached -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1307225 Title: Running a virtual machine on a Haswell system produces machine check events Status in QEMU: New Bug description: I'm running a virtual Windows SBS 2003 installation on a Xeon E3 Haswell system running Gentoo Linux. First, I used Qemu 1.5.3 (the latest stable version on Gentoo). I got a lot of machine check events ("mce: [Hardware Error]: Machine check events logged") in dmesg that always looked like (using mcelog): Hardware event. This is not a software error. MCE 0 CPU 3 BANK 0 TIME 1397455091 Mon Apr 14 07:58:11 2014 MCG status: MCi status: Corrected error Error enabled MCA: Internal parity error STATUS 904f0005 MCGSTATUS 0 MCGCAP c09 APICID 6 SOCKETID 0 CPUID Vendor Intel Family 6 Model 60 I found this discussion on the vmware community: https://communities.vmware.com/thread/452344 It seems that this is (at least partly) caused by the Qemu machine. I switched to Qemu 1.7.0, the first version to use "pc-i440fx-1.7". With this version, the errors almost disappeared, but from time to time, I still get machine check events. Anyways, they so not seem to affect neither the vm, nor the host. The Haswell machine has been set up and running for several days without a single error message. They only appear when the VM is running. so I think this is actually some problem with the Haswell architecture (and not a real hardware error). To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1307225/+subscriptions
Re: [Qemu-devel] [PATCH 4/4] block/parallels: 2TB+ parallels images support
On 24/07/14 23:25, Jeff Cody wrote: On Tue, Jul 22, 2014 at 05:19:37PM +0400, Denis V. Lunev wrote: Parallels has released in the recent updates of Parallels Server 5/6 new addition to his image format. Images with signature WithouFreSpacExt have offsets in the catalog coded not as offsets in sectors (multiple of 512 bytes) but offsets coded in blocks (i.e. header->tracks * 512) In this case all 64 bits of header->nb_sectors are used for image size. This patch implements support of this for qemu-img. Signed-off-by: Denis V. Lunev CC: Kevin Wolf CC: Stefan Hajnoczi --- block/parallels.c | 20 +++- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/block/parallels.c b/block/parallels.c index 02739cf..d9cb04f 100644 --- a/block/parallels.c +++ b/block/parallels.c @@ -30,6 +30,7 @@ /**/ #define HEADER_MAGIC "WithoutFreeSpace" +#define HEADER_MAGIC2 "WithouFreSpacExt" #define HEADER_VERSION 2 #define HEADER_SIZE 64 @@ -54,6 +55,8 @@ typedef struct BDRVParallelsState { unsigned int catalog_size; unsigned int tracks; + +unsigned int off_multiplier; } BDRVParallelsState; static int parallels_probe(const uint8_t *buf, int buf_size, const char *filename) @@ -63,7 +66,8 @@ static int parallels_probe(const uint8_t *buf, int buf_size, const char *filenam if (buf_size < HEADER_SIZE) return 0; -if (!memcmp(ph->magic, HEADER_MAGIC, 16) && +if ((!memcmp(ph->magic, HEADER_MAGIC, 16) || +!memcmp(ph->magic, HEADER_MAGIC2, 16)) && (le32_to_cpu(ph->version) == HEADER_VERSION)) return 100; @@ -85,13 +89,18 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags, goto fail; } +bs->total_sectors = le64_to_cpu(ph.nb_sectors); + bs->total_sectors is a int64_t, and ph.nb_sectors is a uint64_t. Should we do a sanity check here on the max number of sectors? in reality such image will not be usable as we will later have to multiply this count with 512 to calculate offset in the file if (le32_to_cpu(ph.version) != HEADER_VERSION) goto fail_format; -if (memcmp(ph.magic, HEADER_MAGIC, 16)) +if (!memcmp(ph.magic, HEADER_MAGIC, 16)) { +s->off_multiplier = 1; +bs->total_sectors = (uint32_t)bs->total_sectors; (same comment as in patch 1, w.r.t. cast vs. bitmask) ok +} else if (!memcmp(ph.magic, HEADER_MAGIC2, 16)) +s->off_multiplier = le32_to_cpu(ph.tracks); Is there a maximum size in the specification for ph.tracks? no, there is no limitation. Though in reality I have seen the following images only: 252 sectors, 63 sectors, 256 sectors, 2048 sectors and only 2048 was used with this new header. This is just an observation. +else goto fail_format; (same comment as the last patch, w.r.t. braces) -bs->total_sectors = (uint32_t)le64_to_cpu(ph.nb_sectors); - s->tracks = le32_to_cpu(ph.tracks); if (s->tracks == 0) { error_setg(errp, "Invalid image: Zero sectors per track"); @@ -137,7 +146,8 @@ static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num) /* not allocated */ if ((index > s->catalog_size) || (s->catalog_bitmap[index] == 0)) return -1; -return (uint64_t)(s->catalog_bitmap[index] + offset) * 512; +return +((uint64_t)s->catalog_bitmap[index] * s->off_multiplier + offset) * 512; Do we need to worry about overflow here, depending on the value of off_multiplier? Also, (and this existed prior to this patch), - we are casting to uint64_t, although the function returns int64_t. good catch. I'll change the declaration to uint64_t and will return 0 from previous if aka if ((index > s->catalog_size) || (s->catalog_bitmap[index] == 0)) It is not possible to obtain 0 as a real offset of any sector as this is an offset of the header. Regarding overflow check. Do you think that we should return error to the upper layer or we could silently fill result data with zeroes as was done originally for the following case 'index > s->catalog_size' ? } static int parallels_read(BlockDriverState *bs, int64_t sector_num, -- 1.9.1
Re: [Qemu-devel] [PATCH 3/4] block/parallels: split check for parallels format in parallels_open
On 24/07/14 22:50, Jeff Cody wrote: On Tue, Jul 22, 2014 at 05:19:36PM +0400, Denis V. Lunev wrote: and rework error path a bit. There is no difference at the moment, but the code will be definitely shorter when additional processing will be required for WithouFreSpacExt Signed-off-by: Denis V. Lunev CC: Kevin Wolf CC: Stefan Hajnoczi --- block/parallels.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/block/parallels.c b/block/parallels.c index 8f9ec8a..02739cf 100644 --- a/block/parallels.c +++ b/block/parallels.c @@ -85,12 +85,10 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags, goto fail; } -if (memcmp(ph.magic, HEADER_MAGIC, 16) || -(le32_to_cpu(ph.version) != HEADER_VERSION)) { -error_setg(errp, "Image not in Parallels format"); -ret = -EINVAL; -goto fail; -} +if (le32_to_cpu(ph.version) != HEADER_VERSION) +goto fail_format; +if (memcmp(ph.magic, HEADER_MAGIC, 16)) +goto fail_format; QEMU coding style dictates these statements have curly braces, even though they are just one liners. (If you run your patches against scripts/checkpatch.pl, it should catch most style issues). ok, I just used a kernel convention. Will redo this here and in the next path. This is not a problem :) Thank you for pointing this out. I think this patch could also just be squashed into patch 4, if desired. I'd prefer not to do this to have behavior changing and behavior non-changing stuff separated. bs->total_sectors = (uint32_t)le64_to_cpu(ph.nb_sectors); @@ -120,6 +118,9 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags, qemu_co_mutex_init(&s->lock); return 0; +fail_format: +error_setg(errp, "Image not in Parallels format"); +ret = -EINVAL; fail: g_free(s->catalog_bitmap); return ret; -- 1.9.1
[Qemu-devel] [Bug 1307225] Re: Running a virtual machine on a Haswell system produces machine check events
attachment logfiles, dmidecode, system information ** Attachment added: "logfiles, dmidecode, system information" https://bugs.launchpad.net/qemu/+bug/1307225/+attachment/4162599/+files/logfiles-mce.txt -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1307225 Title: Running a virtual machine on a Haswell system produces machine check events Status in QEMU: New Bug description: I'm running a virtual Windows SBS 2003 installation on a Xeon E3 Haswell system running Gentoo Linux. First, I used Qemu 1.5.3 (the latest stable version on Gentoo). I got a lot of machine check events ("mce: [Hardware Error]: Machine check events logged") in dmesg that always looked like (using mcelog): Hardware event. This is not a software error. MCE 0 CPU 3 BANK 0 TIME 1397455091 Mon Apr 14 07:58:11 2014 MCG status: MCi status: Corrected error Error enabled MCA: Internal parity error STATUS 904f0005 MCGSTATUS 0 MCGCAP c09 APICID 6 SOCKETID 0 CPUID Vendor Intel Family 6 Model 60 I found this discussion on the vmware community: https://communities.vmware.com/thread/452344 It seems that this is (at least partly) caused by the Qemu machine. I switched to Qemu 1.7.0, the first version to use "pc-i440fx-1.7". With this version, the errors almost disappeared, but from time to time, I still get machine check events. Anyways, they so not seem to affect neither the vm, nor the host. The Haswell machine has been set up and running for several days without a single error message. They only appear when the VM is running. so I think this is actually some problem with the Haswell architecture (and not a real hardware error). To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1307225/+subscriptions
Re: [Qemu-devel] [PATCH 4/4] block/parallels: 2TB+ parallels images support
On Fri, Jul 25, 2014 at 07:51:47AM +0400, Denis V. Lunev wrote: > On 24/07/14 23:25, Jeff Cody wrote: > >On Tue, Jul 22, 2014 at 05:19:37PM +0400, Denis V. Lunev wrote: > >>Parallels has released in the recent updates of Parallels Server 5/6 > >>new addition to his image format. Images with signature WithouFreSpacExt > >>have offsets in the catalog coded not as offsets in sectors (multiple > >>of 512 bytes) but offsets coded in blocks (i.e. header->tracks * 512) > >> > >>In this case all 64 bits of header->nb_sectors are used for image size. > >> > >>This patch implements support of this for qemu-img. > >> > >>Signed-off-by: Denis V. Lunev > >>CC: Kevin Wolf > >>CC: Stefan Hajnoczi > >>--- > >> block/parallels.c | 20 +++- > >> 1 file changed, 15 insertions(+), 5 deletions(-) > >> > >>diff --git a/block/parallels.c b/block/parallels.c > >>index 02739cf..d9cb04f 100644 > >>--- a/block/parallels.c > >>+++ b/block/parallels.c > >>@@ -30,6 +30,7 @@ > >> /**/ > >> #define HEADER_MAGIC "WithoutFreeSpace" > >>+#define HEADER_MAGIC2 "WithouFreSpacExt" > >> #define HEADER_VERSION 2 > >> #define HEADER_SIZE 64 > >>@@ -54,6 +55,8 @@ typedef struct BDRVParallelsState { > >> unsigned int catalog_size; > >> unsigned int tracks; > >>+ > >>+unsigned int off_multiplier; > >> } BDRVParallelsState; > >> static int parallels_probe(const uint8_t *buf, int buf_size, const char > >> *filename) > >>@@ -63,7 +66,8 @@ static int parallels_probe(const uint8_t *buf, int > >>buf_size, const char *filenam > >> if (buf_size < HEADER_SIZE) > >> return 0; > >>-if (!memcmp(ph->magic, HEADER_MAGIC, 16) && > >>+if ((!memcmp(ph->magic, HEADER_MAGIC, 16) || > >>+!memcmp(ph->magic, HEADER_MAGIC2, 16)) && > >> (le32_to_cpu(ph->version) == HEADER_VERSION)) > >> return 100; > >>@@ -85,13 +89,18 @@ static int parallels_open(BlockDriverState *bs, QDict > >>*options, int flags, > >> goto fail; > >> } > >>+bs->total_sectors = le64_to_cpu(ph.nb_sectors); > >>+ > >bs->total_sectors is a int64_t, and ph.nb_sectors is a uint64_t. > >Should we do a sanity check here on the max number of sectors? > in reality such image will not be usable as we will later have to multiply > this count with 512 to calculate offset in the file > >> if (le32_to_cpu(ph.version) != HEADER_VERSION) > >> goto fail_format; > >>-if (memcmp(ph.magic, HEADER_MAGIC, 16)) > >>+if (!memcmp(ph.magic, HEADER_MAGIC, 16)) { > >>+s->off_multiplier = 1; > >>+bs->total_sectors = (uint32_t)bs->total_sectors; > >(same comment as in patch 1, w.r.t. cast vs. bitmask) > ok > >>+} else if (!memcmp(ph.magic, HEADER_MAGIC2, 16)) > >>+s->off_multiplier = le32_to_cpu(ph.tracks); > >Is there a maximum size in the specification for ph.tracks? > no, there is no limitation. Though in reality I have seen the > following images only: > 252 sectors, 63 sectors, 256 sectors, 2048 sectors > and only 2048 was used with this new header. > > This is just an observation. > > >>+else > >> goto fail_format; > >(same comment as the last patch, w.r.t. braces) > > > >>-bs->total_sectors = (uint32_t)le64_to_cpu(ph.nb_sectors); > >>- > >> s->tracks = le32_to_cpu(ph.tracks); > >> if (s->tracks == 0) { > >> error_setg(errp, "Invalid image: Zero sectors per track"); > >>@@ -137,7 +146,8 @@ static int64_t seek_to_sector(BlockDriverState *bs, > >>int64_t sector_num) > >> /* not allocated */ > >> if ((index > s->catalog_size) || (s->catalog_bitmap[index] == 0)) > >> return -1; > >>-return (uint64_t)(s->catalog_bitmap[index] + offset) * 512; > >>+return > >>+((uint64_t)s->catalog_bitmap[index] * s->off_multiplier + offset) > >>* 512; > >Do we need to worry about overflow here, depending on the value of > >off_multiplier? > > > >Also, (and this existed prior to this patch), - we are casting to > >uint64_t, although the function returns int64_t. > good catch. I'll change the declaration to uint64_t and will return 0 > from previous if aka > Thanks. I'm not sure that is the best option though - the only place the return value from this function is used is in parallels_read(), which passes the output into bdrv_pread() as the offset. The offset parameter for bdrv_pread/pwrite is an int64_t, not a uint64_t. > if ((index > s->catalog_size) || (s->catalog_bitmap[index] == 0)) > > It is not possible to obtain 0 as a real offset of any sector as > this is an offset of the header. > > Regarding overflow check. Do you think that we should return > error to the upper layer or we could silently fill result data > with zeroes as was done originally for the following case > 'index > s->catalog_size' ? I think it would be best to return error (anything < 0 will also cause bdrv_pread to return -EIO, and it is also checked for in parallels_read). I a
Re: [Qemu-devel] [PATCH 4/4] block/parallels: 2TB+ parallels images support
On 25/07/14 17:08, Jeff Cody wrote: On Fri, Jul 25, 2014 at 07:51:47AM +0400, Denis V. Lunev wrote: On 24/07/14 23:25, Jeff Cody wrote: On Tue, Jul 22, 2014 at 05:19:37PM +0400, Denis V. Lunev wrote: Parallels has released in the recent updates of Parallels Server 5/6 new addition to his image format. Images with signature WithouFreSpacExt have offsets in the catalog coded not as offsets in sectors (multiple of 512 bytes) but offsets coded in blocks (i.e. header->tracks * 512) In this case all 64 bits of header->nb_sectors are used for image size. This patch implements support of this for qemu-img. Signed-off-by: Denis V. Lunev CC: Kevin Wolf CC: Stefan Hajnoczi --- block/parallels.c | 20 +++- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/block/parallels.c b/block/parallels.c index 02739cf..d9cb04f 100644 --- a/block/parallels.c +++ b/block/parallels.c @@ -30,6 +30,7 @@ /**/ #define HEADER_MAGIC "WithoutFreeSpace" +#define HEADER_MAGIC2 "WithouFreSpacExt" #define HEADER_VERSION 2 #define HEADER_SIZE 64 @@ -54,6 +55,8 @@ typedef struct BDRVParallelsState { unsigned int catalog_size; unsigned int tracks; + +unsigned int off_multiplier; } BDRVParallelsState; static int parallels_probe(const uint8_t *buf, int buf_size, const char *filename) @@ -63,7 +66,8 @@ static int parallels_probe(const uint8_t *buf, int buf_size, const char *filenam if (buf_size < HEADER_SIZE) return 0; -if (!memcmp(ph->magic, HEADER_MAGIC, 16) && +if ((!memcmp(ph->magic, HEADER_MAGIC, 16) || +!memcmp(ph->magic, HEADER_MAGIC2, 16)) && (le32_to_cpu(ph->version) == HEADER_VERSION)) return 100; @@ -85,13 +89,18 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags, goto fail; } +bs->total_sectors = le64_to_cpu(ph.nb_sectors); + bs->total_sectors is a int64_t, and ph.nb_sectors is a uint64_t. Should we do a sanity check here on the max number of sectors? in reality such image will not be usable as we will later have to multiply this count with 512 to calculate offset in the file if (le32_to_cpu(ph.version) != HEADER_VERSION) goto fail_format; -if (memcmp(ph.magic, HEADER_MAGIC, 16)) +if (!memcmp(ph.magic, HEADER_MAGIC, 16)) { +s->off_multiplier = 1; +bs->total_sectors = (uint32_t)bs->total_sectors; (same comment as in patch 1, w.r.t. cast vs. bitmask) ok +} else if (!memcmp(ph.magic, HEADER_MAGIC2, 16)) +s->off_multiplier = le32_to_cpu(ph.tracks); Is there a maximum size in the specification for ph.tracks? no, there is no limitation. Though in reality I have seen the following images only: 252 sectors, 63 sectors, 256 sectors, 2048 sectors and only 2048 was used with this new header. This is just an observation. +else goto fail_format; (same comment as the last patch, w.r.t. braces) -bs->total_sectors = (uint32_t)le64_to_cpu(ph.nb_sectors); - s->tracks = le32_to_cpu(ph.tracks); if (s->tracks == 0) { error_setg(errp, "Invalid image: Zero sectors per track"); @@ -137,7 +146,8 @@ static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num) /* not allocated */ if ((index > s->catalog_size) || (s->catalog_bitmap[index] == 0)) return -1; -return (uint64_t)(s->catalog_bitmap[index] + offset) * 512; +return +((uint64_t)s->catalog_bitmap[index] * s->off_multiplier + offset) * 512; Do we need to worry about overflow here, depending on the value of off_multiplier? Also, (and this existed prior to this patch), - we are casting to uint64_t, although the function returns int64_t. good catch. I'll change the declaration to uint64_t and will return 0 from previous if aka Thanks. I'm not sure that is the best option though - the only place the return value from this function is used is in parallels_read(), which passes the output into bdrv_pread() as the offset. The offset parameter for bdrv_pread/pwrite is an int64_t, not a uint64_t. if ((index > s->catalog_size) || (s->catalog_bitmap[index] == 0)) It is not possible to obtain 0 as a real offset of any sector as this is an offset of the header. Regarding overflow check. Do you think that we should return error to the upper layer or we could silently fill result data with zeroes as was done originally for the following case 'index > s->catalog_size' ? I think it would be best to return error (anything < 0 will also cause bdrv_pread to return -EIO, and it is also checked for in parallels_read). I also don't mean to over-complicate things here for you, which is why I asked about the max value of ph.tracks. If that has a reasonable bounds check, then we don't need to worry about overflow at all, and can just change the cast to an int64_t instead of uint64_t. I think we are safe so long as ph.tracks <= (INT32_M
[Qemu-devel] [PATCH v2 2/4] libqemustub: add qemu_system_shutdown_force()
These sysemu functions will be needed by qemu-char.c, which is linked into tests/vhost-user-test without system emulation functionality. Signed-off-by: Stefan Hajnoczi --- stubs/Makefile.objs | 1 + stubs/shutdown.c| 5 + 2 files changed, 6 insertions(+) create mode 100644 stubs/shutdown.c diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs index 528e161..f17d517 100644 --- a/stubs/Makefile.objs +++ b/stubs/Makefile.objs @@ -39,3 +39,4 @@ stub-obj-$(CONFIG_WIN32) += fd-register.o stub-obj-y += cpus.o stub-obj-y += kvm.o stub-obj-y += qmp_pc_dimm_device_list.o +stub-obj-y += shutdown.o diff --git a/stubs/shutdown.c b/stubs/shutdown.c new file mode 100644 index 000..37c96c0 --- /dev/null +++ b/stubs/shutdown.c @@ -0,0 +1,5 @@ +#include "sysemu/sysemu.h" + +void qemu_system_shutdown_force(void) +{ +} -- 1.9.3
[Qemu-devel] [PATCH v2 0/4] libqtest: solve QEMU process cleanup problem
v2: * Added qemu_system_shutdown_force() API [Peter] Test cases are supposed to clean up even if they fail. Historically libqtest has leaked QEMU processes and files. This caused annoyances and buildbot failures so it was gradually fixed. The solution we have for terminating the QEMU process if the test case fails was not very satisfactory. A SIGABRT handler in the test case sends SIGTERM to QEMU. This only works if the test case receives SIGABRT, not other ways in which it could die. The approach is ugly because it installs a global signal handler although libqtest is supposed to support multiple simultaneous instances. This patch series adds the new -chardev exit-on-eof option. QEMU will terminate if the socket/pipe/stdio receives EOF. That happens when the test case process terminates for any reason. By adding this option to -chardev we can use it with both -qtest and -qmp. Stefan Hajnoczi (4): vl: add qemu_system_shutdown_force() libqemustub: add qemu_system_shutdown_force() qemu-char: add -chardev exit-on-eof option libqtest: use -chardev exit-on-eof to clean up QEMU include/sysemu/char.h | 1 + include/sysemu/sysemu.h | 2 +- qapi-schema.json| 23 --- qemu-char.c | 33 +++-- qemu-options.hx | 19 +-- qmp.c | 3 +-- stubs/Makefile.objs | 1 + stubs/shutdown.c| 5 + tests/libqtest.c| 48 +++- ui/sdl.c| 3 +-- ui/sdl2.c | 6 ++ vl.c| 11 --- 12 files changed, 79 insertions(+), 76 deletions(-) create mode 100644 stubs/shutdown.c -- 1.9.3
[Qemu-devel] [PATCH v2 4/4] libqtest: use -chardev exit-on-eof to clean up QEMU
When the test case aborts it is important to terminate the QEMU process so it does not leak. This was implemented using a SIGABRT handler function in libqtest that sent SIGTERM to QEMU. The SIGABRT approach is messy because it requires a global signal handler but libqtest should support multiple simultaneous instances. Simplify the code using the new -chardev exit-on-eof option. QEMU will automatically exit when our qtest socket closes. Signed-off-by: Stefan Hajnoczi --- tests/libqtest.c | 48 +++- 1 file changed, 3 insertions(+), 45 deletions(-) diff --git a/tests/libqtest.c b/tests/libqtest.c index 98e8f4b..6c3dd27 100644 --- a/tests/libqtest.c +++ b/tests/libqtest.c @@ -46,12 +46,8 @@ struct QTestState bool irq_level[MAX_IRQ]; GString *rx; pid_t qemu_pid; /* our child QEMU process */ -struct sigaction sigact_old; /* restored on exit */ }; -static GList *qtest_instances; -static struct sigaction sigact_old; - #define g_assert_no_errno(ret) do { \ g_assert_cmpint(ret, !=, -1); \ } while (0) @@ -110,32 +106,6 @@ static void kill_qemu(QTestState *s) } } -static void sigabrt_handler(int signo) -{ -GList *elem; -for (elem = qtest_instances; elem; elem = elem->next) { -kill_qemu(elem->data); -} -} - -static void setup_sigabrt_handler(void) -{ -struct sigaction sigact; - -/* Catch SIGABRT to clean up on g_assert() failure */ -sigact = (struct sigaction){ -.sa_handler = sigabrt_handler, -.sa_flags = SA_RESETHAND, -}; -sigemptyset(&sigact.sa_mask); -sigaction(SIGABRT, &sigact, &sigact_old); -} - -static void cleanup_sigabrt_handler(void) -{ -sigaction(SIGABRT, &sigact_old, NULL); -} - QTestState *qtest_init(const char *extra_args) { QTestState *s; @@ -156,17 +126,12 @@ QTestState *qtest_init(const char *extra_args) sock = init_socket(socket_path); qmpsock = init_socket(qmp_socket_path); -/* Only install SIGABRT handler once */ -if (!qtest_instances) { -setup_sigabrt_handler(); -} - -qtest_instances = g_list_prepend(qtest_instances, s); - s->qemu_pid = fork(); if (s->qemu_pid == 0) { command = g_strdup_printf("exec %s " - "-qtest unix:%s,nowait " + "-chardev socket,id=qtestdev,path=%s,nowait," + "exit-on-eof " + "-qtest chardev:qtestdev " "-qtest-log /dev/null " "-qmp unix:%s,nowait " "-machine accel=qtest " @@ -207,13 +172,6 @@ QTestState *qtest_init(const char *extra_args) void qtest_quit(QTestState *s) { -/* Uninstall SIGABRT handler on last instance */ -if (qtest_instances && !qtest_instances->next) { -cleanup_sigabrt_handler(); -} - -qtest_instances = g_list_remove(qtest_instances, s); - kill_qemu(s); close(s->fd); close(s->qmp_fd); -- 1.9.3
[Qemu-devel] [PATCH v2 1/4] vl: add qemu_system_shutdown_force()
The QEMU --no-shutdown option prevents guests from shutting down. There are several cases where shutdown should be forced, even if --no-shutdown was given. This patch adds an API for forcing shutdown. This cleans up the code and hides the extern int no_shutdown variable which is currently being touched in various files. Signed-off-by: Stefan Hajnoczi --- include/sysemu/sysemu.h | 2 +- qmp.c | 3 +-- ui/sdl.c| 3 +-- ui/sdl2.c | 6 ++ vl.c| 11 --- 5 files changed, 13 insertions(+), 12 deletions(-) diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h index d8539fd..171d7d3 100644 --- a/include/sysemu/sysemu.h +++ b/include/sysemu/sysemu.h @@ -58,6 +58,7 @@ void qemu_system_wakeup_request(WakeupReason reason); void qemu_system_wakeup_enable(WakeupReason reason, bool enabled); void qemu_register_wakeup_notifier(Notifier *notifier); void qemu_system_shutdown_request(void); +void qemu_system_shutdown_force(void); void qemu_system_powerdown_request(void); void qemu_register_powerdown_notifier(Notifier *notifier); void qemu_system_debug_request(void); @@ -126,7 +127,6 @@ extern int max_cpus; extern int cursor_hide; extern int graphic_rotate; extern int no_quit; -extern int no_shutdown; extern int semihosting_enabled; extern int old_param; extern int boot_menu; diff --git a/qmp.c b/qmp.c index 0d2553a..cd4d6a7 100644 --- a/qmp.c +++ b/qmp.c @@ -86,8 +86,7 @@ UuidInfo *qmp_query_uuid(Error **errp) void qmp_quit(Error **errp) { -no_shutdown = 0; -qemu_system_shutdown_request(); +qemu_system_shutdown_force(); } void qmp_stop(Error **errp) diff --git a/ui/sdl.c b/ui/sdl.c index 4e7f920..7c3d91c 100644 --- a/ui/sdl.c +++ b/ui/sdl.c @@ -791,8 +791,7 @@ static void sdl_refresh(DisplayChangeListener *dcl) break; case SDL_QUIT: if (!no_quit) { -no_shutdown = 0; -qemu_system_shutdown_request(); +qemu_system_shutdown_force(); } break; case SDL_MOUSEMOTION: diff --git a/ui/sdl2.c b/ui/sdl2.c index fcac87b..f392528 100644 --- a/ui/sdl2.c +++ b/ui/sdl2.c @@ -711,8 +711,7 @@ static void handle_windowevent(DisplayChangeListener *dcl, SDL_Event *ev) break; case SDL_WINDOWEVENT_CLOSE: if (!no_quit) { -no_shutdown = 0; -qemu_system_shutdown_request(); +qemu_system_shutdown_force(); } break; } @@ -743,8 +742,7 @@ static void sdl_refresh(DisplayChangeListener *dcl) break; case SDL_QUIT: if (!no_quit) { -no_shutdown = 0; -qemu_system_shutdown_request(); +qemu_system_shutdown_force(); } break; case SDL_MOUSEMOTION: diff --git a/vl.c b/vl.c index fe451aa..c733e04 100644 --- a/vl.c +++ b/vl.c @@ -164,7 +164,7 @@ int acpi_enabled = 1; int no_hpet = 0; int fd_bootchk = 1; static int no_reboot; -int no_shutdown = 0; +static int no_shutdown = 0; int cursor_hide = 1; int graphic_rotate = 0; const char *watchdog; @@ -1915,8 +1915,7 @@ void qemu_system_killed(int signal, pid_t pid) { shutdown_signal = signal; shutdown_pid = pid; -no_shutdown = 0; -qemu_system_shutdown_request(); +qemu_system_shutdown_force(); } void qemu_system_shutdown_request(void) @@ -1926,6 +1925,12 @@ void qemu_system_shutdown_request(void) qemu_notify_event(); } +void qemu_system_shutdown_force(void) +{ +no_shutdown = 0; +qemu_system_shutdown_request(); +} + static void qemu_system_powerdown(void) { qapi_event_send_powerdown(&error_abort); -- 1.9.3
[Qemu-devel] [PATCH v2 3/4] qemu-char: add -chardev exit-on-eof option
When QEMU is executed as part of a test case or from a script, it is usually desirable to exit if the parent process terminates. This ensures that "leaked" QEMU processes do not continue consuming resources after their parent has died. This patch adds the -chardev exit-on-eof option causing socket and pipe chardevs to exit QEMU upon close. This happens when a parent process deliberately closes its file descriptor but also when the kernel cleans up a crashed process. Signed-off-by: Stefan Hajnoczi --- include/sysemu/char.h | 1 + qapi-schema.json | 23 --- qemu-char.c | 33 +++-- qemu-options.hx | 19 +-- 4 files changed, 57 insertions(+), 19 deletions(-) diff --git a/include/sysemu/char.h b/include/sysemu/char.h index 0bbd631..382b320 100644 --- a/include/sysemu/char.h +++ b/include/sysemu/char.h @@ -86,6 +86,7 @@ struct CharDriverState { guint fd_in_tag; QemuOpts *opts; QTAILQ_ENTRY(CharDriverState) next; +bool exit_on_eof; }; /** diff --git a/qapi-schema.json b/qapi-schema.json index b11aad2..9b13da1 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -2630,10 +2630,13 @@ # @device: The name of the special file for the device, # i.e. /dev/ttyS0 on Unix or COM1: on Windows # @type: What kind of device this is. +# @exit-on-eof: #optional terminate when other side closes the pipe +# (default: false, since: 2.2) # # Since: 1.4 ## -{ 'type': 'ChardevHostdev', 'data': { 'device' : 'str' } } +{ 'type': 'ChardevHostdev', 'data': { 'device' : 'str', + '*exit-on-eof' : 'bool' } } ## # @ChardevSocket: @@ -2648,14 +2651,17 @@ # @nodelay: #optional set TCP_NODELAY socket option (default: false) # @telnet: #optional enable telnet protocol on server # sockets (default: false) +# @exit-on-eof: #optional terminate when other side closes socket +# (default: false, since: 2.2) # # Since: 1.4 ## -{ 'type': 'ChardevSocket', 'data': { 'addr' : 'SocketAddress', - '*server' : 'bool', - '*wait': 'bool', - '*nodelay' : 'bool', - '*telnet' : 'bool' } } +{ 'type': 'ChardevSocket', 'data': { 'addr' : 'SocketAddress', + '*server' : 'bool', + '*wait': 'bool', + '*nodelay' : 'bool', + '*telnet' : 'bool', + '*exit-on-eof' : 'bool' } } ## # @ChardevUdp: @@ -2689,10 +2695,13 @@ # @signal: #optional Allow signals (such as SIGINT triggered by ^C) # be delivered to qemu. Default: true in -nographic mode, # false otherwise. +# @exit-on-eof: #optional terminate when other side sends EOF +# (default: false, since: 2.2) # # Since: 1.5 ## -{ 'type': 'ChardevStdio', 'data': { '*signal' : 'bool' } } +{ 'type': 'ChardevStdio', 'data': { '*signal' : 'bool', +'*exit-on-eof' : 'bool' } } ## # @ChardevSpiceChannel: diff --git a/qemu-char.c b/qemu-char.c index 7acc03f..1974b8f 100644 --- a/qemu-char.c +++ b/qemu-char.c @@ -110,9 +110,15 @@ void qemu_chr_be_event(CharDriverState *s, int event) break; } -if (!s->chr_event) -return; -s->chr_event(s->handler_opaque, event); +if (s->chr_event) { +s->chr_event(s->handler_opaque, event); +} + +if (s->exit_on_eof && event == CHR_EVENT_CLOSED) { +fprintf(stderr, "qemu: terminating due to eof on chardev '%s'\n", +s->label); +qemu_system_shutdown_force(); +} } void qemu_chr_be_generic_open(CharDriverState *s) @@ -991,6 +997,7 @@ static CharDriverState *qemu_chr_open_pipe(ChardevHostdev *opts) int fd_in, fd_out; char filename_in[256], filename_out[256]; const char *filename = opts->device; +CharDriverState *chr; if (filename == NULL) { fprintf(stderr, "chardev: pipe: no filename given\n"); @@ -1011,7 +1018,9 @@ static CharDriverState *qemu_chr_open_pipe(ChardevHostdev *opts) return NULL; } } -return qemu_chr_open_fd(fd_in, fd_out); +chr = qemu_chr_open_fd(fd_in, fd_out); +chr->exit_on_eof = opts->has_exit_on_eof && opts->exit_on_eof; +return chr; } /* init terminal so that we can grab keys */ @@ -2893,6 +2902,7 @@ static void tcp_chr_close(CharDriverState *chr) static CharDriverState *qemu_chr_open_socket_fd(int fd, bool do_nodelay, bool is_listen, bool is_telnet, bool is_waitconnect, +bool is_exit_on_eof,
Re: [Qemu-devel] [questions] about qemu log
Am 25.07.2014 13:07, schrieb Zhang Haoyu: >>> Hi, all >>> >>> If I use qemu command directly to run vm, bypass libvirt, how to configure >>> qemu to assure that each vm has its own log file, like vmname.log? >>> For example, VM: rhel7-net has its own log file, rhel7-net.log, >>> VM:rhel7-stor has its own log file, rhel7-stor.log. >> >> -D /path/to/unique/file/name.log >> > -D option is to configure qemu_logfile for the output logs which are > controlled by qmp command "logfile", which can be enabled/disabled on run > time. > > I want to configure the log file for the output of fprintf(stderr, fmt, ...), > .etc, > i.e., how to redirect the output of fprintf(stderr, fmt, ...), or some other > log-interface to a specified file? In a shell you would write something like: 2> stderr.log You may also want to toggle QEMU's -msg timestamp=on option. Cheers, Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
[Qemu-devel] AArch64 ELF File Loading
Hi, I think the AArch64 port has a problem with a self-modifying code sequence that appears to run fine on other simulators, but I can't get QEMU to run the small bare metal test case I created to try to reproduce the issue. Any help would be appreciated. qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt -semihosting -d exec,in_asm /tmp/test-nooverwrite 2>&1 | less qemu: fatal: Trying to execute code outside RAM or ROM at 0x qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt -semihosting -d exec,in_asm -bios /tmp/test-nooverwrite 2>&1 | less qemu: fatal: Trying to execute code outside RAM or ROM at 0x qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt -semihosting -d exec,in_asm -kernel /tmp/test-nooverwrite 2>&1 | less IN: 0x4000: e3a0 mov r0, #0 ; 0x0 0x4004: e59f1004 ldr r1, [pc, #4]; 0x4010 0x4008: e59f2004 ldr r2, [pc, #4]; 0x4014 0x400c: e59ff004 ldr pc, [pc, #4]; 0x4018 Trace 0x7f309f012000 [4000] Note that the above are A32 instructions, but my ELF is A64 and this is not the specified entry point. aarch64-linux-gnu-readelf -h /tmp/test-nooverwrite ELF Header: Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 Class: ELF64 Data: 2's complement, little endian Version: 1 (current) OS/ABI:UNIX - System V ABI Version: 0 Type: EXEC (Executable file) Machine: AArch64 Version: 0x1 Entry point address: 0x80001140 Start of program headers: 64 (bytes into file) Start of section headers: 186600 (bytes into file) Flags: 0x0 Size of this header: 64 (bytes) Size of program headers: 56 (bytes) Number of program headers: 3 Size of section headers: 64 (bytes) Number of section headers: 17 Section header string table index: 14 To generate a test bare metal executable, you can download the aarch64-none-elf toolchain from Linaro and: echo '#include int main() { printf("Hello, world!\n"); return 0; }' > hello.c aarch64-none-elf-gcc -specs=aem-ve.specs hello.c -o hello Thanks, Christopher -- Employee of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation.
Re: [Qemu-devel] AArch64 ELF File Loading
On 25 July 2014 15:01, Christopher Covington wrote: > Hi, > > I think the AArch64 port has a problem with a self-modifying code sequence > that appears to run fine on other simulators, but I can't get QEMU to run the > small bare metal test case I created to try to reproduce the issue. Any help > would be appreciated. > > qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt -semihosting > -d exec,in_asm /tmp/test-nooverwrite 2>&1 | less > > qemu: fatal: Trying to execute code outside RAM or ROM at 0x > > qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt -semihosting > -d exec,in_asm -bios /tmp/test-nooverwrite 2>&1 | less > > qemu: fatal: Trying to execute code outside RAM or ROM at 0x > > qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt -semihosting > -d exec,in_asm -kernel /tmp/test-nooverwrite 2>&1 | less You haven't specified a CPU type, and virt defaults to cortex-a15. Try "-cpu cortex-a57". You haven't specified a memory size, and QEMU defaults to 128MB, which (given the start address of RAM) means there won't be any RAM at the load address you're trying to load your test program at. Try "-m 3G", and/or make your ELF file load at an address closer to the start of RAM. thanks -- PMM
Re: [Qemu-devel] [PULL for-2.1 0/3] Last minute patches
Il 25/07/2014 14:15, Paolo Bonzini ha scritto: > The following changes since commit f368c33d5ab09dd5656924185cd975b11838cd25: > > Update version for v2.1.0-rc3 release (2014-07-22 18:17:03 +0100) > > are available in the git repository at: > > git://github.com/bonzini/qemu.git tags/for-upstream > > for you to fetch changes up to d975e28437377bc9a65fb5f4f9486a74c7a124cc: > > qemu-char: ignore flow control if a PTY's slave is not connected > (2014-07-24 18:31:50 +0200) > > > Here are my patches for the last 2.1 rc. > > > Paolo Bonzini (3): > acpi-dsdt: procedurally generate _PRT > pc: hack for migration compatibility from QEMU 2.0 > qemu-char: ignore flow control if a PTY's slave is not connected Since Igor hasn't sent his patches, and I'm leaving the office, I pushed this to git://github.com/bonzini/qemu.git tags/for-upstream-full I don't know about tests/acpi-test-data/pc. It makes sense that this patch should modify something there, but: * "make check" passes * the test warns even before patch 1, for both the DSDT (modified by the patch) and SSDT (which this series doesn't touch at all) * I cannot get it to pass, except by blindly copying the "actual" output on the "expected" files * mst is on vacation and Marcel is off on Fridays Based on my understanding of the problem, it is not possible to fix the bug without hacks like this one, and even reverting all patches in this area would be more risky. Paolo
Re: [Qemu-devel] [PATCH V5 4/6] cpu_exec: Add sleeping algorithm
- Mail original - > De: "Paolo Bonzini" > À: "Sebastian Tanase" , qemu-devel@nongnu.org > Cc: aligu...@amazon.com, afaer...@suse.de, r...@twiddle.net, "peter maydell" > , > mich...@walle.cc, a...@alex.org.uk, stefa...@redhat.com, > lcapitul...@redhat.com, crobi...@redhat.com, > arm...@redhat.com, wenchaoq...@gmail.com, quint...@redhat.com, > kw...@redhat.com, m...@redhat.com, "pierre lemagourou" > , "jeremy rosen" , > "camille begue" > > Envoyé: Vendredi 25 Juillet 2014 12:13:44 > Objet: Re: [PATCH V5 4/6] cpu_exec: Add sleeping algorithm > > Il 25/07/2014 11:56, Sebastian Tanase ha scritto: > > The goal is to sleep qemu whenever the guest clock > > is in advance compared to the host clock (we use > > the monotonic clocks). The amount of time to sleep > > is calculated in the execution loop in cpu_exec. > > > > At first, we tried to approximate at each for loop the real time > > elapsed > > while searching for a TB (generating or retrieving from cache) and > > executing it. We would then approximate the virtual time > > corresponding > > to the number of virtual instructions executed. The difference > > between > > these 2 values would allow us to know if the guest is in advance or > > delayed. > > However, the function used for measuring the real time > > (qemu_clock_get_ns(QEMU_CLOCK_REALTIME)) proved to be very > > expensive. > > We had an added overhead of 13% of the total run time. > > > > Therefore, we modified the algorithm and only take into account the > > difference between the 2 clocks at the begining of the cpu_exec > > function. > > During the for loop we try to reduce the advance of the guest only > > by > > computing the virtual time elapsed and sleeping if necessary. The > > overhead > > is thus reduced to 3%. Even though this method still has a > > noticeable > > overhead, it no longer is a bottleneck in trying to achieve a > > better > > guest frequency for which the guest clock is faster than the host > > one. > > > > As for the the alignement of the 2 clocks, with the first algorithm > > the guest clock was oscillating between -1 and 1ms compared to the > > host clock. > > Using the second algorithm we notice that the guest is 5ms behind > > the host, which > > is still acceptable for our use case. > > > > The tests where conducted using fio and stress. The host machine in > > an i5 CPU at > > 3.10GHz running Debian Jessie (kernel 3.12). The guest machine is > > an arm versatile-pb > > built with buildroot. > > > > Currently, on our test machine, the lowest icount we can achieve > > that is suitable for > > aligning the 2 clocks is 6. However, we observe that the IO tests > > (using fio) are > > slower than the cpu tests (using stress). > > > > Signed-off-by: Sebastian Tanase > > Tested-by: Camille Bégué > > Signed-off-by: Paolo Bonzini > > --- > > cpu-exec.c | 91 > > > > cpus.c | 17 ++ > > include/qemu/timer.h | 1 + > > 3 files changed, 109 insertions(+) > > > > diff --git a/cpu-exec.c b/cpu-exec.c > > index 38e5f02..1a725b6 100644 > > --- a/cpu-exec.c > > +++ b/cpu-exec.c > > @@ -22,6 +22,84 @@ > > #include "tcg.h" > > #include "qemu/atomic.h" > > #include "sysemu/qtest.h" > > +#include "qemu/timer.h" > > + > > +/* -icount align implementation. */ > > + > > +typedef struct SyncClocks { > > +int64_t diff_clk; > > +int64_t original_instr_counter; > > +} SyncClocks; > > + > > +#if !defined(CONFIG_USER_ONLY) > > +/* Allow the guest to have a max 3ms advance. > > + * The difference between the 2 clocks could therefore > > + * oscillate around 0. > > + */ > > +#define VM_CLOCK_ADVANCE 300 > > + > > +static int64_t delay_host(int64_t diff_clk) > > +{ > > +if (diff_clk > VM_CLOCK_ADVANCE) { > > +#ifndef _WIN32 > > +struct timespec sleep_delay, rem_delay; > > +sleep_delay.tv_sec = diff_clk / 10LL; > > +sleep_delay.tv_nsec = diff_clk % 10LL; > > +if (nanosleep(&sleep_delay, &rem_delay) < 0) { > > +diff_clk -= (sleep_delay.tv_sec - rem_delay.tv_sec) * > > 10LL; > > +diff_clk -= sleep_delay.tv_nsec - rem_delay.tv_nsec; > > +} else { > > +diff_clk = 0; > > +} > > +#else > > +Sleep(diff_clk / SCALE_MS); > > +diff_clk = 0; > > +#endif > > +} > > +return diff_clk; > > +} > > + > > +static int64_t instr_to_vtime(int64_t instr_counter, const > > CPUState *cpu) > > +{ > > +int64_t instr_exec_time; > > +instr_exec_time = instr_counter - > > + (cpu->icount_extra + > > + cpu->icount_decr.u16.low); > > +instr_exec_time = instr_exec_time << icount_time_shift; > > + > > +return instr_exec_time; > > +} > > + > > +static void align_clocks(SyncClocks *sc, const CPUState *cpu) > > +{ > > +if (!icount_align_option) { > > +return; > > +} > > +sc->diff_clk += instr_to_vtime(
Re: [Qemu-devel] AArch64 ELF File Loading
Hi Peter, On 07/25/2014 10:07 AM, Peter Maydell wrote: > On 25 July 2014 15:01, Christopher Covington wrote: >> Hi, >> >> I think the AArch64 port has a problem with a self-modifying code sequence >> that appears to run fine on other simulators, but I can't get QEMU to run the >> small bare metal test case I created to try to reproduce the issue. Any help >> would be appreciated. >> >> qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt -semihosting >> -d exec,in_asm /tmp/test-nooverwrite 2>&1 | less >> >> qemu: fatal: Trying to execute code outside RAM or ROM at 0x >> >> qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt -semihosting >> -d exec,in_asm -bios /tmp/test-nooverwrite 2>&1 | less >> >> qemu: fatal: Trying to execute code outside RAM or ROM at 0x >> >> qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt -semihosting >> -d exec,in_asm -kernel /tmp/test-nooverwrite 2>&1 | less > > You haven't specified a CPU type, and virt defaults > to cortex-a15. Try "-cpu cortex-a57". That explains the A32 instructions. > You haven't specified a memory size, and QEMU defaults > to 128MB, which (given the start address of RAM) means > there won't be any RAM at the load address you're trying to > load your test program at. Try "-m 3G", and/or make your > ELF file load at an address closer to the start of RAM. Thanks for the suggestions. aarch64-none-elf-gcc -specs=rdimon.specs -Ttext=0x4000 hello.c -o hello aarch64-none-elf-readelf -h hello ELF Header: Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 Class: ELF64 Data: 2's complement, little endian Version: 1 (current) OS/ABI:UNIX - System V ABI Version: 0 Type: EXEC (Executable file) Machine: AArch64 Version: 0x1 Entry point address: 0x4158 Start of program headers: 64 (bytes into file) Start of section headers: 157432 (bytes into file) Flags: 0x0 Size of this header: 64 (bytes) Size of program headers: 56 (bytes) Number of program headers: 4 Size of section headers: 64 (bytes) Number of section headers: 17 Section header string table index: 14 qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt \ -cpu cortex-a57 -m 3G -semihosting -kernel hello qemu: fatal: Trying to execute code outside RAM or ROM at 0x Thanks, Christopher -- Employee of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation.
Re: [Qemu-devel] AArch64 ELF File Loading
On 25 July 2014 15:35, Christopher Covington wrote: > qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt \ > -cpu cortex-a57 -m 3G -semihosting -kernel hello > > qemu: fatal: Trying to execute code outside RAM or ROM at 0x This means your code took an exception (and there's no RAM at the low address where the vector table is by default). Try "-d in_asm,exec,int" to get a better idea of what's being executed. Also, where do you expect the output from that printf to be going? Does your gcc/bare metal libc write to the UART? How does it know what address the UART is? If it's expecting to do semihosting for output, then you're running into the fact that we don't implement semihosting for AArch64 yet. thanks -- PMM
Re: [Qemu-devel] [PATCH 1/7] tests: Functions bus_foreach and device_find from libqos virtio API
On Thu, Jul 24, 2014 at 08:30:59PM +0200, Marc Marí wrote: > +static QPCIBus *test_start(void) > +{ > +char cmdline[100]; > +char tmp_path[] = "/tmp/qtest.XX"; > +int fd, ret; > + > +/* Create a temporary raw image */ > +fd = mkstemp(tmp_path); > +g_assert_cmpint(fd, >=, 0); > +ret = ftruncate(fd, TEST_IMAGE_SIZE); > +g_assert_cmpint(ret, ==, 0); > +close(fd); > + > +last_tmp_path = g_malloc0(strlen(tmp_path)); > +strcpy(last_tmp_path, tmp_path); > + > +snprintf(cmdline, 100, "-drive if=none,id=drive0,file=%s " > +"-device virtio-blk-pci,drive=drive0,addr=%x.%x", > +tmp_path, PCI_SLOT, PCI_FN); > +qtest_start(cmdline); Please unlink the temporary disk image file here. The QEMU process has a file descriptor open when we reach this point, so it's safe to delete it on disk (the file stays allocated until the last file descriptor is closed). This is important so that the temporary file is always deleted in failure cases. We do not reach test_end() when an assertion fails. pgpjYo2EWJU2Q.pgp Description: PGP signature
Re: [Qemu-devel] AArch64 ELF File Loading
Hi Peter, On 07/25/2014 10:41 AM, Peter Maydell wrote: > On 25 July 2014 15:35, Christopher Covington wrote: >> qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt \ >> -cpu cortex-a57 -m 3G -semihosting -kernel hello >> >> qemu: fatal: Trying to execute code outside RAM or ROM at 0x > > This means your code took an exception (and there's > no RAM at the low address where the vector table is > by default). Try "-d in_asm,exec,int" to get a better idea > of what's being executed. qemu-system-aarch64 -nodefaults -nographic -monitor none \ -M virt -cpu cortex-a57 -m 3G -semihosting -kernel hello \ -d in_asm,exec,int qemu: fatal: Trying to execute code outside RAM or ROM at 0x PC= SP= X00= X01= X02= ... The binary runs fine on at least one other simulator. > Also, where do you expect the output from that printf > to be going? Does your gcc/bare metal libc write to > the UART? How does it know what address the UART is? > If it's expecting to do semihosting for output, then you're > running into the fact that we don't implement semihosting > for AArch64 yet. I have local patches adding semihosting for AArch64. I hope eventually be able to share them and other changes, but the approvals will likely take a while longer. Here is an example that doesn't use semihosting. wget http://releases.linaro.org/14.06/components/toolchain/binaries/gcc-linaro-aarch64-none-elf-4.9-2014.06-02_linux.tar.xz tar xf gcc-linaro-aarch64-none-elf-4.9-2014.06-02_linux.tar.xz echo '.global _start _start: mrs x0, midr_el1 b _start' > hello.S gcc-linaro-aarch64-none-elf-4.9-2014.06-02_linux/bin/aarch64-none-elf-gcc \ -nostdlib -Ttext=0x4000 hello.S -o hello qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt \ -cpu cortex-a57 -m 3G -semihosting -kernel hello -d in_asm,exec,int qemu: fatal: Trying to execute code outside RAM or ROM at 0x Thanks, Christopher -- Employee of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation.
Re: [Qemu-devel] AArch64 ELF File Loading
On 25 July 2014 16:05, Christopher Covington wrote: > I have local patches adding semihosting for AArch64. I hope eventually be able > to share them and other changes, but the approvals will likely take a while > longer. That would be good; there is demand from other quarters for semihosting support. > wget > http://releases.linaro.org/14.06/components/toolchain/binaries/gcc-linaro-aarch64-none-elf-4.9-2014.06-02_linux.tar.xz > tar xf gcc-linaro-aarch64-none-elf-4.9-2014.06-02_linux.tar.xz > echo '.global _start > _start: > mrs x0, midr_el1 > b _start' > hello.S > gcc-linaro-aarch64-none-elf-4.9-2014.06-02_linux/bin/aarch64-none-elf-gcc \ > -nostdlib -Ttext=0x4000 hello.S -o hello > qemu-system-aarch64 -nodefaults -nographic -monitor none -M virt \ > -cpu cortex-a57 -m 3G -semihosting -kernel hello -d in_asm,exec,int > > qemu: fatal: Trying to execute code outside RAM or ROM at 0x Doh. We never implemented the AArch64 support for booting plain ELF files in hw/arm/boot.c:do_cpu_reset(). Patch coming up in a second; with that I can see via the gdbstub that we're going round the loop in the guest code. thanks -- PMM
Re: [Qemu-devel] [PATCH 2/7] tests: Add virtio device initialization
On Thu, Jul 24, 2014 at 08:31:00PM +0200, Marc Marí wrote: > +static uint32_t qvirtio_pci_get_features(QVirtioDevice *d) > +{ > +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; > +return qpci_io_readl(dev->pdev, dev->addr + QVIRTIO_DEVICE_FEATURES); > +} Unused? If it's unused, then it's untested. > + > +static uint8_t qvirtio_pci_get_status(QVirtioDevice *d) > +{ > +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; > +return qpci_io_readb(dev->pdev, dev->addr + QVIRTIO_DEVICE_STATUS); > +} Unused? > + > +static void qvirtio_pci_set_status(QVirtioDevice *d, uint8_t val) > +{ > +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; > +qpci_io_writeb(dev->pdev, dev->addr + QVIRTIO_DEVICE_STATUS, val); Unused? > @@ -73,3 +97,11 @@ QVirtioPCIDevice *qvirtio_pci_device_find(QPCIBus *bus, > uint16_t device_type) > > return dev; > } > + > +void qvirtio_pci_enable_device(QVirtioPCIDevice *d) > +{ > +qpci_device_enable(d->pdev); > +d->addr = qpci_iomap(d->pdev, 0); > +g_assert(d->addr != NULL); > +} Where is qpci_iounmap() called to clean up? > @@ -67,6 +69,18 @@ static void pci_basic(void) > g_assert_cmphex(dev->vdev.device_type, ==, QVIRTIO_BLK_DEVICE_ID); > g_assert_cmphex(dev->pdev->devfn, ==, ((PCI_SLOT << 3) | PCI_FN)); > > +qvirtio_pci_enable_device(dev); > +qvirtio_reset(&qvirtio_pci, &dev->vdev); > +qvirtio_set_acknowledge(&qvirtio_pci, &dev->vdev); > +qvirtio_set_driver(&qvirtio_pci, &dev->vdev); > + > +/* MSI-X is not enabled */ > +addr = dev->addr + QVIRTIO_DEVICE_SPECIFIC_NO_MSIX; > + > +capacity = qpci_io_readl(dev->pdev, addr) | > +qpci_io_readl(dev->pdev, addr+4); > +g_assert_cmpint(capacity, ==, TEST_IMAGE_SIZE/512); Please add a qvirtio_config_read() function instead of directly accessing the virtio configuration space via PCI. Stefan pgpJkubtbLrqG.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH 5/7] libqos: Change free function called in malloc
On Thu, Jul 24, 2014 at 08:31:03PM +0200, Marc Marí wrote: > Signed-off-by: Marc Marí > --- > tests/libqos/malloc.h |2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/tests/libqos/malloc.h b/tests/libqos/malloc.h > index 46f6000..5565381 100644 > --- a/tests/libqos/malloc.h > +++ b/tests/libqos/malloc.h > @@ -32,7 +32,7 @@ static inline uint64_t guest_alloc(QGuestAllocator > *allocator, size_t size) > > static inline void guest_free(QGuestAllocator *allocator, uint64_t addr) > { > -allocator->alloc(allocator, addr); > +allocator->free(allocator, addr); > } Hahahahahahaha! Reviewed-by: Stefan Hajnoczi pgp_Qq7L8ivx8.pgp Description: PGP signature
[Qemu-devel] [PATCH] hw/arm/boot: Set PC correctly when loading AArch64 ELF files
The code in do_cpu_reset() correctly handled AArch64 CPUs when running Linux kernels, but was missing code in the branch of the if() that deals with loading ELF files. Correctly jump to the ELF entry point on reset rather than leaving the reset PC at zero. Reported-by: Christopher Covington Signed-off-by: Peter Maydell Cc: qemu-sta...@nongnu.org --- hw/arm/boot.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/hw/arm/boot.c b/hw/arm/boot.c index 3d1f4a2..1241761 100644 --- a/hw/arm/boot.c +++ b/hw/arm/boot.c @@ -417,8 +417,12 @@ static void do_cpu_reset(void *opaque) if (info) { if (!info->is_linux) { /* Jump to the entry point. */ -env->regs[15] = info->entry & 0xfffe; -env->thumb = info->entry & 1; +if (env->aarch64) { +env->pc = info->entry; +} else { +env->regs[15] = info->entry & 0xfffe; +env->thumb = info->entry & 1; +} } else { if (CPU(cpu) == first_cpu) { if (env->aarch64) { -- 1.9.1
Re: [Qemu-devel] [PATCH 3/7] libqtest: add QTEST_LOG for debugging qtest testcases
On Thu, Jul 24, 2014 at 08:31:01PM +0200, Marc Marí wrote: > @@ -397,10 +398,18 @@ QDict *qtest_qmpv(QTestState *s, const char *fmt, > va_list ap) > > /* No need to send anything for an empty QObject. */ > if (qobj) { > +size_t len; > +int log = getenv("QTEST_LOG") != NULL; > QString *qstr = qobject_to_json(qobj); > const char *str = qstring_get_str(qstr); > size_t size = qstring_get_length(qstr); > > +if (log) { > +len = write(2, str, size); > +if (len != size) { > +fprintf(stderr, "Could not log\n"); It's a bit funny that we print an error message to stderr after failing to write to stderr. Why not just fprintf(stderr, "%s", str) instead of using write()? pgpL2e_NNatZz.pgp Description: PGP signature
[Qemu-devel] [PATCH RFC v2 00/12] VMState testing
Hi, The following patch introduce a mechanism to test the correctness of the vmstate's information. This is achieved by saving the device states' information to a memory buffer and then clearing the states, followed by loading the data from the saved memory buffer. v1 --> v2: * Added a list containing all the devices that have been qdevified and gets registered with the SaveStateEntry. * Have provided a way to use either qemu_system_reset functionality or my own version of qdev entries untill all the devices have been qdevified. This gives us the privilege to test any device we want. * Rename some of the variables. I am very bad at naming convention, thanks to community, specially Eric, I try to improve it with every version. * On Eric's advice, I have separated all of the qmp and hmp interface patches. * Changed the DPRINTF statements as required. Dr. David Alan Gilbert (1): QEMUSizedBuffer/QEMUFile Sanidhya Kashyap (11): reset handler for qdevified devices VMState test: query command to extract the qdevified device names VMState test: hmp interface for showing qdevified devices VMstate test: basic VMState testing mechanism VMState test: hmp interface for vmstate testing VMState test: qmp interface for querying the vmstate testing process VMState test: hmp interface for querying the vmstate testing process VMState test: update period of vmstate testing process VMState test: hmp interface for period update VMState test: cancel mechanism for an already running vmstate testing process VMState test: hmp interface for cancel mechanism hmp-commands.hx | 48 + hmp.c | 73 hmp.h | 5 + include/migration/qemu-file.h | 27 +++ monitor.c | 14 ++ qapi-schema.json | 103 +++ qemu-file.c | 411 ++ qmp-commands.hx | 129 + savevm.c | 357 9 files changed, 1167 insertions(+) -- 1.9.3
[Qemu-devel] [PATCH RFC v2 01/12] QEMUSizedBuffer/QEMUFile
From: "Dr. David Alan Gilbert" Stefan Berger's to create a QEMUFile that goes to a memory buffer; from: http://lists.gnu.org/archive/html/qemu-devel/2013-03/msg05036.html Using the QEMUFile interface, this patch adds support functions for operating on in-memory sized buffers that can be written to or read from. Signed-off-by: Stefan Berger Signed-off-by: Joel Schopp --- include/migration/qemu-file.h | 27 +++ qemu-file.c | 411 ++ 2 files changed, 438 insertions(+) diff --git a/include/migration/qemu-file.h b/include/migration/qemu-file.h index c90f529..14e1f4d 100644 --- a/include/migration/qemu-file.h +++ b/include/migration/qemu-file.h @@ -25,6 +25,8 @@ #define QEMU_FILE_H 1 #include "exec/cpu-common.h" +#include + /* This function writes a chunk of data to a file at the given position. * The pos argument can be ignored if the file is only being used for * streaming. The handler should try to write all of the data it can. @@ -94,11 +96,31 @@ typedef struct QEMUFileOps { QEMURamSaveFunc *save_page; } QEMUFileOps; +struct QEMUSizedBuffer { +struct iovec *iov; +size_t n_iov; +size_t size; /* total allocated size in all iov's */ +size_t used; /* number of used bytes */ +}; + +typedef struct QEMUSizedBuffer QEMUSizedBuffer; +QEMUSizedBuffer *qsb_create(const uint8_t *buffer, size_t len); +QEMUSizedBuffer *qsb_clone(const QEMUSizedBuffer *); +void qsb_free(QEMUSizedBuffer *); +size_t qsb_set_length(QEMUSizedBuffer *qsb, size_t length); +size_t qsb_get_length(const QEMUSizedBuffer *qsb); +ssize_t qsb_get_buffer(const QEMUSizedBuffer *, off_t start, size_t count, + uint8_t **buf); +ssize_t qsb_write_at(QEMUSizedBuffer *qsb, const uint8_t *buf, + off_t pos, size_t count); + + QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops); QEMUFile *qemu_fopen(const char *filename, const char *mode); QEMUFile *qemu_fdopen(int fd, const char *mode); QEMUFile *qemu_fopen_socket(int fd, const char *mode); QEMUFile *qemu_popen_cmd(const char *command, const char *mode); +QEMUFile *qemu_bufopen(const char *mode, QEMUSizedBuffer *input); int qemu_get_fd(QEMUFile *f); int qemu_fclose(QEMUFile *f); int64_t qemu_ftell(QEMUFile *f); @@ -111,6 +133,11 @@ void qemu_put_byte(QEMUFile *f, int v); void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, int size); bool qemu_file_mode_is_not_valid(const char *mode); +/* + * For use on files opened with qemu_bufopen + */ +const QEMUSizedBuffer *qemu_buf_get(QEMUFile *f); + static inline void qemu_put_ubyte(QEMUFile *f, unsigned int v) { qemu_put_byte(f, (int)v); diff --git a/qemu-file.c b/qemu-file.c index a8e3912..9fd1675 100644 --- a/qemu-file.c +++ b/qemu-file.c @@ -878,3 +878,414 @@ uint64_t qemu_get_be64(QEMUFile *f) v |= qemu_get_be32(f); return v; } + + +#define QSB_CHUNK_SIZE (1 << 10) +#define QSB_MAX_CHUNK_SIZE (10 * QSB_CHUNK_SIZE) + +/** + * Create a QEMUSizedBuffer + * This type of buffer uses scatter-gather lists internally and + * can grow to any size. Any data array in the scatter-gather list + * can hold different amount of bytes. + * + * @buffer: Optional buffer to copy into the QSB + * @len: size of initial buffer; if @buffer is given, buffer must + * hold at least len bytes + * + * Returns a pointer to a QEMUSizedBuffer + */ +QEMUSizedBuffer *qsb_create(const uint8_t *buffer, size_t len) +{ +QEMUSizedBuffer *qsb; +size_t alloc_len, num_chunks, i, to_copy; +size_t chunk_size = (len > QSB_MAX_CHUNK_SIZE) +? QSB_MAX_CHUNK_SIZE +: QSB_CHUNK_SIZE; + +if (len == 0) { +/* we want to allocate at least one chunk */ +len = QSB_CHUNK_SIZE; +} + +num_chunks = DIV_ROUND_UP(len, chunk_size); +alloc_len = num_chunks * chunk_size; + +qsb = g_new0(QEMUSizedBuffer, 1); +qsb->iov = g_new0(struct iovec, num_chunks); +qsb->n_iov = num_chunks; + +for (i = 0; i < num_chunks; i++) { +qsb->iov[i].iov_base = g_malloc0(chunk_size); +qsb->iov[i].iov_len = chunk_size; +if (buffer) { +to_copy = (len - qsb->used) > chunk_size + ? chunk_size : (len - qsb->used); +memcpy(qsb->iov[i].iov_base, &buffer[qsb->used], to_copy); +qsb->used += to_copy; +} +} + +qsb->size = alloc_len; + +return qsb; +} + +/** + * Free the QEMUSizedBuffer + * + * @qsb: The QEMUSizedBuffer to free + */ +void qsb_free(QEMUSizedBuffer *qsb) +{ +size_t i; + +if (!qsb) { +return; +} + +for (i = 0; i < qsb->n_iov; i++) { +g_free(qsb->iov[i].iov_base); +} +g_free(qsb->iov); +g_free(qsb); +} + +/** + * Get the number of of used bytes in the QEMUSizedBuffer + * + * @qsb: A QEMUSizedBuffer + * + * Returns the number of bytes currently used in this buffer + */ +size_t qsb_get_length(const QEMUSi
[Qemu-devel] [PATCH RFC v2 03/12] VMState test: query command to extract the qdevified device names
I have provided a qmp interface for getting the list of qdevified devices that have been registered with SaveVMHandlers. Signed-off-by: Sanidhya Kashyap --- qapi-schema.json | 22 ++ qmp-commands.hx | 25 + savevm.c | 34 ++ 3 files changed, 81 insertions(+) diff --git a/qapi-schema.json b/qapi-schema.json index b11aad2..996e6b5 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -3480,3 +3480,25 @@ # Since: 2.1 ## { 'command': 'rtc-reset-reinjection' } + +## +# @VMstatesQdevDevices +# +# list of qdevified devices that are registered with SaveStateEntry +# +# @device: list of qdevified device names +# +# Since 2.2 +## +{ 'type': 'VMStatesQdevDevices', + 'data': { 'device': ['str'] } } + +## +# @query-qdev-devices +# +# returns the VMStatesQdevDevices that have the associated value +# +# Since 2.2 +## +{ 'command': 'query-qdev-devices', + 'returns': 'VMStatesQdevDevices' } diff --git a/qmp-commands.hx b/qmp-commands.hx index 4be4765..2e20032 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -3755,3 +3755,28 @@ Example: <- { "return": {} } EQMP + +{ +.name = "query-qdev-devices", +.args_type = "", +.mhandler.cmd_new = qmp_marshal_input_query_qdev_devices, +}, + +SQMP +query-qdev-devices +-- + +Shows registered Qdevified devices + + +Example (1): + +-> { "execute": "query-qdev-devices" } +<- { "return": [ + { + "devices": [ "kvm-tpr-opt", "piix4_pm" ] + } + ] + } + +EQMP diff --git a/savevm.c b/savevm.c index 0255fa0..7c1600a 100644 --- a/savevm.c +++ b/savevm.c @@ -1167,6 +1167,40 @@ void do_savevm(Monitor *mon, const QDict *qdict) } } +static strList *create_qdev_list(const char *name, strList *list) +{ +strList *temp_list; +int len; + +if (!list) { +list = g_malloc0(sizeof(strList)); +len = strlen(name); +list->value = g_malloc0(sizeof(char)*(len+1)); +strcpy(list->value, name); +list->next = NULL; +return list; +} +temp_list = g_malloc0(sizeof(strList)); +len = strlen(name); +temp_list->value = g_malloc0(sizeof(char)*(len+1)); +strcpy(temp_list->value, name); +temp_list->next = list; +list = temp_list; +return list; +} + +VMStatesQdevDevices *qmp_query_qdev_devices(Error **errp) +{ +VMStatesQdevResetEntry *qre; +VMStatesQdevDevices *qdev_devices = g_malloc0(sizeof(VMStatesQdevDevices)); + +QTAILQ_FOREACH(qre, &vmstate_reset_handlers, entry) { +qdev_devices->device = create_qdev_list(qre->device_name, + qdev_devices->device); +} +return qdev_devices; +} + void qmp_xen_save_devices_state(const char *filename, Error **errp) { QEMUFile *f; -- 1.9.3
[Qemu-devel] [PATCH RFC v2 05/12] VMstate test: basic VMState testing mechanism
In this patch, I have made the following changes: * changed the DPRINT statement. * renamed the variables. * added noqdev variable which decides which option to use for resetting. * added devices option which can help in resetting one or many devices (only qdevified ones). * updated the documentation. Signed-off-by: Sanidhya Kashyap --- qapi-schema.json | 26 ++ qmp-commands.hx | 37 savevm.c | 251 +++ 3 files changed, 314 insertions(+) diff --git a/qapi-schema.json b/qapi-schema.json index 996e6b5..ec48977 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -3502,3 +3502,29 @@ ## { 'command': 'query-qdev-devices', 'returns': 'VMStatesQdevDevices' } + +## +# @test-vmstates +# +# tests the vmstates' value by dumping and loading in memory +# +# @iterations: (optional) The total iterations for vmstate testing. +# The min and max defind value is 10 and 100 respectively. +# +# @period: (optional) sleep interval between iteration (in milliseconds). +# The default interval is 100 milliseconds with min and max being +# 1 and 1 respectively. +# +# @noqdev: boolean variable which decides whether to use qdevified devices +# or not. Will be removed when all the devices have been qdevified. +# +# @devices: (optional) helps in resetting particular qdevified decices +# that have been registered with SaveStateEntry +# +# Since 2.2 +## +{ 'command': 'test-vmstates', + 'data': {'*iterations': 'int', + '*period': 'int', + 'noqdev': 'bool', + '*qdevices': 'VMStatesQdevDevices' } } diff --git a/qmp-commands.hx b/qmp-commands.hx index 2e20032..6210f56 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -3778,5 +3778,42 @@ Example (1): } ] } +EQMP + +{ +.name = "test-vmstates", +.args_type = "iterations:i?,period:i?,noqdev:b,qdevices:O?", +.mhandler.cmd_new = qmp_marshal_input_test_vmstates, +}, + +SQMP +test-vmstates +- + +Tests the vmstates' entry by dumping and loading in/from memory + +Arguments: +- "iterations": (optional) The total iterations for vmstate testing. + The min and max defined value is 10 and 100 respectively. + +- "period": (optional) sleep interval between iteration (in milliseconds). +The default interval is 100 milliseconds with min and max being +1 and 1 respectively. + +- "noqdev": boolean variable which decides whether to use qdev or not. +Will be removed when all the devices have been qdevified. + +- "devices": (optional) helps in resetting particular qdevified decices + that have been registered with SaveStateEntry + + +Example: + +-> { "execute": "test-vmstates", + "arguments": { +"iterations": 10, +"period": 100, +"noqdev": false } } +<- { "return": {} } EQMP diff --git a/savevm.c b/savevm.c index 7c1600a..304d8fc 100644 --- a/savevm.c +++ b/savevm.c @@ -1201,6 +1201,257 @@ VMStatesQdevDevices *qmp_query_qdev_devices(Error **errp) return qdev_devices; } +#define DEBUG_TEST_VMSTATES 1 + +#ifndef DEBUG_TEST_VMSTATES +#define DEBUG_TEST_VMSTATES 0 +#endif + +#define DPRINTF(fmt, ...) \ +do { \ +if (DEBUG_TEST_VMSTATES) { \ +printf("vmstate_test: " fmt, ## __VA_ARGS__); \ +} \ +} while (0) + +#define TEST_VMSTATE_MIN_TIMES 10 +#define TEST_VMSTATE_MAX_TIMES 1000 + +#define TEST_VMSTATE_MIN_INTERVAL_MS 1 +#define TEST_VMSTATE_DEFAULT_INTERVAL_MS 100 +#define TEST_VMSTATE_MAX_INTERVAL_MS 1 + +typedef struct VMStateLogState VMStateLogState; + +struct VMStateLogState { +int64_t current_iteration; +int64_t iterations; +int64_t period; +bool active_state; +bool noqdev; +VMStatesQdevDevices *qdevices; +QEMUTimer *timer; + +QTAILQ_HEAD(qdev_list, VMStatesQdevResetEntry) qdev_list; +}; + +static VMStateLogState *vmstate_current_state(void) +{ +static VMStateLogState current_state = { +.active_state = false, +}; + +return ¤t_state; +} + +static inline void test_vmstates_clear_qdev_entries(VMStateLogState *v) +{ +VMStatesQdevResetEntry *qre, *new_qre; +QTAILQ_FOREACH_SAFE(qre, &v->qdev_list, entry, new_qre) { +QTAILQ_REMOVE(&v->qdev_list, qre, entry); +} +} + +static inline bool check_device_name(VMStateLogState *v, + VMStatesQdevDevices *qdevices, + Error **errp) +{ +VMStatesQdevResetEntry *qre; +strList *devices_name = qdevices->device; +QTAILQ_INIT(&v->qdev_list); +bool device_present; + +/* now, checking against each one */ +for (; devices_name; devices_name = devices_name->next) { +device_present = false; +VMStatesQdevResetEntry *new_qre; +QTAILQ_FOREACH(qre, &vmstate_reset_handlers, entry) { +if (!strcmp(qre->device_name, devices_name->value)) { + +
[Qemu-devel] [PATCH RFC v2 04/12] VMState test: hmp interface for showing qdevified devices
This patch provides the hmp interface for qdevified devices list. Signed-off-by: Sanidhya Kashyap --- hmp-commands.hx | 2 ++ hmp.c | 21 + hmp.h | 1 + monitor.c | 7 +++ 4 files changed, 31 insertions(+) diff --git a/hmp-commands.hx b/hmp-commands.hx index d0943b1..4603de5 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -1780,6 +1780,8 @@ show qdev device model list show roms @item info tpm show the TPM device +@item info qdev_devices +show the qdevified devices registered with migration capability @end table ETEXI diff --git a/hmp.c b/hmp.c index 4d1838e..d1dd7d2 100644 --- a/hmp.c +++ b/hmp.c @@ -1714,3 +1714,24 @@ void hmp_info_memdev(Monitor *mon, const QDict *qdict) monitor_printf(mon, "\n"); } + +void hmp_info_qdev_devices(Monitor *mon, const QDict *qdict) +{ +Error *err = NULL; +VMStatesQdevDevices *qdev_devices = qmp_query_qdev_devices(&err); +strList *list = NULL; + +if (qdev_devices) { +list = qdev_devices->device; +while (list) { +monitor_printf(mon, "%s\n", list->value); +list = list->next; +} +} + +if (err) { +hmp_handle_error(mon, &err); +} + +qapi_free_strList(list); +} diff --git a/hmp.h b/hmp.h index 4fd3c4a..d179454 100644 --- a/hmp.h +++ b/hmp.h @@ -38,6 +38,7 @@ void hmp_info_balloon(Monitor *mon, const QDict *qdict); void hmp_info_pci(Monitor *mon, const QDict *qdict); void hmp_info_block_jobs(Monitor *mon, const QDict *qdict); void hmp_info_tpm(Monitor *mon, const QDict *qdict); +void hmp_info_qdev_devices(Monitor *mon, const QDict *qdict); void hmp_quit(Monitor *mon, const QDict *qdict); void hmp_stop(Monitor *mon, const QDict *qdict); void hmp_system_reset(Monitor *mon, const QDict *qdict); diff --git a/monitor.c b/monitor.c index 5bc70a6..bf828d6 100644 --- a/monitor.c +++ b/monitor.c @@ -2918,6 +2918,13 @@ static mon_cmd_t info_cmds[] = { .mhandler.cmd = hmp_info_memdev, }, { +.name = "qdev_devices", +.args_type = "", +.params = "", +.help = "show registered qdevified devices", +.mhandler.cmd = hmp_info_qdev_devices, +}, +{ .name = NULL, }, }; -- 1.9.3
[Qemu-devel] [PATCH RFC v2 02/12] reset handler for qdevified devices
I have added a structure containing the list of qdevified devices which have been added to the SaveVMHandlers. Since, I was unable to find any particular struct containing the information about all the qdevified devices. So, I have created my own version, which is very very specific. I shall be grateful if anyone can give some pointers on this. Signed-off-by: Sanidhya Kashyap --- savevm.c | 30 ++ 1 file changed, 30 insertions(+) diff --git a/savevm.c b/savevm.c index e19ae0a..0255fa0 100644 --- a/savevm.c +++ b/savevm.c @@ -503,12 +503,29 @@ void unregister_savevm(DeviceState *dev, const char *idstr, void *opaque) } } +/* + * Reset entry for qdevified devices. + * Contains all those devices which have been qdevified and are + * part of the SaveVMHandlers. This one allows resetting of + * single device at any time. + */ + +typedef struct VMStatesQdevResetEntry { +QTAILQ_ENTRY(VMStatesQdevResetEntry) entry; +DeviceState *dev; +char device_name[256]; +} VMStatesQdevResetEntry; + +static QTAILQ_HEAD(vmstate_reset_handlers, VMStatesQdevResetEntry) + vmstate_reset_handlers = QTAILQ_HEAD_INITIALIZER(vmstate_reset_handlers); + int vmstate_register_with_alias_id(DeviceState *dev, int instance_id, const VMStateDescription *vmsd, void *opaque, int alias_id, int required_for_version) { SaveStateEntry *se; +VMStatesQdevResetEntry *qre; /* If this triggers, alias support can be dropped for the vmsd. */ assert(alias_id == -1 || required_for_version >= vmsd->minimum_version_id); @@ -521,6 +538,12 @@ int vmstate_register_with_alias_id(DeviceState *dev, int instance_id, se->alias_id = alias_id; if (dev) { + +qre = g_malloc0(sizeof(VMStatesQdevResetEntry)); +qre->dev = dev; +strcpy(qre->device_name, vmsd->name); +QTAILQ_INSERT_TAIL(&vmstate_reset_handlers, qre, entry); + char *id = qdev_get_dev_path(dev); if (id) { pstrcpy(se->idstr, sizeof(se->idstr), id); @@ -551,6 +574,7 @@ void vmstate_unregister(DeviceState *dev, const VMStateDescription *vmsd, void *opaque) { SaveStateEntry *se, *new_se; +VMStatesQdevResetEntry *qre, *new_qre; QTAILQ_FOREACH_SAFE(se, &savevm_handlers, entry, new_se) { if (se->vmsd == vmsd && se->opaque == opaque) { @@ -561,6 +585,12 @@ void vmstate_unregister(DeviceState *dev, const VMStateDescription *vmsd, g_free(se); } } + +QTAILQ_FOREACH_SAFE(qre, &vmstate_reset_handlers, entry, new_qre) { +if (dev && qre->dev && !strcmp(vmsd->name, qre->device_name)) { +QTAILQ_REMOVE(&vmstate_reset_handlers, qre, entry); +} +} } static int vmstate_load(QEMUFile *f, SaveStateEntry *se, int version_id) -- 1.9.3
[Qemu-devel] [PATCH RFC v2 06/12] VMState test: hmp interface for vmstate testing
I have added the hmp interface for vmstate testing. Currently, the patch does not support the qdev list, since I could not figure out how to the pass the VMStatesQdevDevices struct which can be parsed and used. Signed-off-by: Sanidhya Kashyap --- hmp-commands.hx | 15 +++ hmp.c | 18 ++ hmp.h | 1 + 3 files changed, 34 insertions(+) diff --git a/hmp-commands.hx b/hmp-commands.hx index 4603de5..6af72a6 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -1790,6 +1790,21 @@ STEXI show available trace events and their state ETEXI + { +.name = "test-vmstates", +.args_type = "iterations:i?,period:i?", +.params = "total_iterations sleep_interval", +.help = "test the vmstates by dumping and loading form memory\n\t\t\t" + "iterations: (optional) number of times, the vmstates will be tested\n\t\t\t" + "period: (optional) sleep interval in milliseconds between each iteration", +.mhandler.cmd = hmp_test_vmstates, +}, +STEXI +@item test-vmstates +@findex test-vmstates +dumps and reads the device state's data from the memory for testing purpose +ETEXI + STEXI @end table ETEXI diff --git a/hmp.c b/hmp.c index d1dd7d2..6c998d2 100644 --- a/hmp.c +++ b/hmp.c @@ -1735,3 +1735,21 @@ void hmp_info_qdev_devices(Monitor *mon, const QDict *qdict) qapi_free_strList(list); } + +void hmp_test_vmstates(Monitor *mon, const QDict *qdict) +{ +int has_iterations = qdict_haskey(qdict, "iterations"); +int64_t iterations = qdict_get_try_int(qdict, "iterations", 10); +int has_period = qdict_haskey(qdict, "period"); +int64_t period = qdict_get_try_int(qdict, "period", 100); + +Error *err = NULL; + +qmp_test_vmstates(has_iterations, iterations, has_period, period, + true, false, NULL, &err); + +if (err) { +monitor_printf(mon, "test-vmstates: %s\n", error_get_pretty(err)); +error_free(err); +} +} diff --git a/hmp.h b/hmp.h index d179454..41bc781 100644 --- a/hmp.h +++ b/hmp.h @@ -95,6 +95,7 @@ void hmp_cpu_add(Monitor *mon, const QDict *qdict); void hmp_object_add(Monitor *mon, const QDict *qdict); void hmp_object_del(Monitor *mon, const QDict *qdict); void hmp_info_memdev(Monitor *mon, const QDict *qdict); +void hmp_test_vmstates(Monitor *mon, const QDict *qdict); void object_add_completion(ReadLineState *rs, int nb_args, const char *str); void object_del_completion(ReadLineState *rs, int nb_args, const char *str); void device_add_completion(ReadLineState *rs, int nb_args, const char *str); -- 1.9.3
[Qemu-devel] [PATCH RFC v2 10/12] VMState test: hmp interface for period update
Signed-off-by: Sanidhya Kashyap --- hmp-commands.hx | 15 +++ hmp.c | 14 ++ hmp.h | 1 + 3 files changed, 30 insertions(+) diff --git a/hmp-commands.hx b/hmp-commands.hx index c1dc6a2..6d15184 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -1807,6 +1807,21 @@ STEXI dumps and reads the device state's data from the memory for testing purpose ETEXI +{ +.name = "test_vmstates_set_period", +.args_type = "period:i", +.params = "period", +.help = "set the sleep interval for vmstates testing process\n\t\t\t" + "period: the new sleep interval value to replace the existing", +.mhandler.cmd = hmp_test_vmstates_set_period, +}, + +STEXI +@item test_vmstates_set_period @var{period} +@findex test_vmstates_set_period +Set the period to @var{period} (int) for vmstate testing process. +ETEXI + STEXI @end table ETEXI diff --git a/hmp.c b/hmp.c index 385fb99..f54b0b9 100644 --- a/hmp.c +++ b/hmp.c @@ -1767,3 +1767,17 @@ void hmp_info_test_vmstates(Monitor *mon, const QDict *qdict) qapi_free_VMStateLogStateInfo(log_info); } + +void hmp_test_vmstates_set_period(Monitor *mon, const QDict *qdict) +{ +int64_t period = qdict_get_int(qdict, "period"); +Error *err = NULL; + +qmp_test_vmstates_set_period(period, &err); + +if (err) { +monitor_printf(mon, "test-vmstates-set-period: %s\n", + error_get_pretty(err)); +error_free(err); +} +} diff --git a/hmp.h b/hmp.h index b77f14c..e1afde8 100644 --- a/hmp.h +++ b/hmp.h @@ -97,6 +97,7 @@ void hmp_object_add(Monitor *mon, const QDict *qdict); void hmp_object_del(Monitor *mon, const QDict *qdict); void hmp_info_memdev(Monitor *mon, const QDict *qdict); void hmp_test_vmstates(Monitor *mon, const QDict *qdict); +void hmp_test_vmstates_set_period(Monitor *mon, const QDict *qdict); void object_add_completion(ReadLineState *rs, int nb_args, const char *str); void object_del_completion(ReadLineState *rs, int nb_args, const char *str); void device_add_completion(ReadLineState *rs, int nb_args, const char *str); -- 1.9.3
[Qemu-devel] [PATCH RFC v2 08/12] VMState test: hmp interface for querying the vmstate testing process
Added a hmp interface for providing the information about the testing process. I have used the underscore as a separater on Eric's advice. But, I have found some of the commands having hyphen. Signed-off-by: Sanidhya Kashyap --- hmp-commands.hx | 2 ++ hmp.c | 14 ++ hmp.h | 1 + monitor.c | 7 +++ 4 files changed, 24 insertions(+) diff --git a/hmp-commands.hx b/hmp-commands.hx index 6af72a6..c1dc6a2 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -1770,6 +1770,8 @@ show migration status show current migration capabilities @item info migrate_cache_size show current migration XBZRLE cache size +@item info test_vmstates +show current vmstates testing process info @item info balloon show balloon information @item info qtree diff --git a/hmp.c b/hmp.c index 9e01127..385fb99 100644 --- a/hmp.c +++ b/hmp.c @@ -1753,3 +1753,17 @@ void hmp_test_vmstates(Monitor *mon, const QDict *qdict) error_free(err); } } + +void hmp_info_test_vmstates(Monitor *mon, const QDict *qdict) +{ +VMStateLogStateInfo *log_info = qmp_query_test_vmstates(NULL); + +if (log_info) { +monitor_printf(mon, "current-iteration: %"PRId64 "\n" +"iterations: %"PRId64 "\n" +"period: %"PRId64 "\n", log_info->current_iteration, +log_info->iterations, log_info->period); +} + +qapi_free_VMStateLogStateInfo(log_info); +} diff --git a/hmp.h b/hmp.h index 41bc781..b77f14c 100644 --- a/hmp.h +++ b/hmp.h @@ -39,6 +39,7 @@ void hmp_info_pci(Monitor *mon, const QDict *qdict); void hmp_info_block_jobs(Monitor *mon, const QDict *qdict); void hmp_info_tpm(Monitor *mon, const QDict *qdict); void hmp_info_qdev_devices(Monitor *mon, const QDict *qdict); +void hmp_info_test_vmstates(Monitor *mon, const QDict *qdict); void hmp_quit(Monitor *mon, const QDict *qdict); void hmp_stop(Monitor *mon, const QDict *qdict); void hmp_system_reset(Monitor *mon, const QDict *qdict); diff --git a/monitor.c b/monitor.c index bf828d6..427eef1 100644 --- a/monitor.c +++ b/monitor.c @@ -2862,6 +2862,13 @@ static mon_cmd_t info_cmds[] = { .mhandler.cmd = hmp_info_migrate_capabilities, }, { +.name = "test_vmstates", +.args_type = "", +.params = "", +.help = "show current vmstates testing process info", +.mhandler.cmd = hmp_info_test_vmstates, +}, +{ .name = "migrate_cache_size", .args_type = "", .params = "", -- 1.9.3
[Qemu-devel] [PATCH RFC v2 07/12] VMState test: qmp interface for querying the vmstate testing process
This patch provides the information about an already executing testing process. I have modified the qmp command to query-test-vmstates from test-vmstates-get-info. Signed-off-by: Sanidhya Kashyap --- qapi-schema.json | 34 ++ qmp-commands.hx | 25 + savevm.c | 18 ++ 3 files changed, 77 insertions(+) diff --git a/qapi-schema.json b/qapi-schema.json index ec48977..a12872f 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -3528,3 +3528,37 @@ '*period': 'int', 'noqdev': 'bool', '*qdevices': 'VMStatesQdevDevices' } } + +## +# @VMStateLogStateInfo +# +# VMState testing information +# Tells about the current iteration, the total iterations +# that have been provided and the sleep interval +# +# @current-iteration: shows the current iteration at which +# the test is in. +# +# @iterations: the allocated total iterations for the vmstate +# testing process. +# +# @period: the allowed sleep interval between each iteration +# (in milliseconds). +# +# Since 2.2 +## +{ 'type': 'VMStateLogStateInfo', + 'data': { 'current-iteration': 'int', +'iterations':'int', +'period':'int' } } + +## +# @query-test-vmstates +# +# Get the current parameters value of the vmstate testing process. +# +# Returns VMStateLogStateInfo structure. +# +# Since 2.2 +## +{ 'command': 'query-test-vmstates', 'returns': 'VMStateLogStateInfo' } diff --git a/qmp-commands.hx b/qmp-commands.hx index 6210f56..0a40a88 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -3817,3 +3817,28 @@ Example: "noqdev": false } } <- { "return": {} } EQMP + +{ +.name = "query-test-vmstates", +.args_type = "", +.mhandler.cmd_new = qmp_marshal_input_query_test_vmstates, +}, + +SQMP +query-test-vmstates-info + + +Get the parameters information + +- "current_iteration": the current iteration going on +- "iterations:" the total number of assigned iterations +- "period": the sleep interval between the iteration + +Example: + +-> { "execute": "query-test-vmstates" } +<- { "return": { +"current_iteration": 3, +"iterations": 100, +"period": 100 } } +EQMP diff --git a/savevm.c b/savevm.c index b5e53b8..793fee7 100644 --- a/savevm.c +++ b/savevm.c @@ -1451,6 +1451,24 @@ void qmp_test_vmstates(bool has_iterations, int64_t iterations, timer_mod(v->timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME)); } +VMStateLogStateInfo *qmp_query_test_vmstates(Error **errp) +{ +VMStateLogState *v = vmstate_current_state(); +VMStateLogStateInfo *log_info = NULL; + +if (!v->active_state) { +return log_info; +} + +log_info = g_malloc0(sizeof(VMStateLogStateInfo)); + +log_info->current_iteration = v->current_iteration; +log_info->iterations = v->iterations; +log_info->period = v->period; + +return log_info; +} + void qmp_xen_save_devices_state(const char *filename, Error **errp) { QEMUFile *f; -- 1.9.3
[Qemu-devel] [PATCH RFC v2 11/12] VMState test: cancel mechanism for an already running vmstate testing process
Signed-off-by: Sanidhya Kashyap --- qapi-schema.json | 9 + qmp-commands.hx | 19 +++ savevm.c | 16 ++-- 3 files changed, 42 insertions(+), 2 deletions(-) diff --git a/qapi-schema.json b/qapi-schema.json index 13e922e..91f1672 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -3574,3 +3574,12 @@ ## { 'command' : 'test-vmstates-set-period', 'data': { 'period': 'int' } } + +## +# @log-dirty-bitmap-cancel +# +# cancel the testing vmstates process +# +# Since 2.2 +## +{ 'command': 'test-vmstates-cancel' } diff --git a/qmp-commands.hx b/qmp-commands.hx index 2f019b0..1035885 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -3865,3 +3865,22 @@ Example: <- { "return": {} } EQMP + { +.name = "test-vmstates-cancel", +.args_type = "", +.mhandler.cmd_new = qmp_marshal_input_test_vmstates_cancel, +}, + +SQMP +test-vmstates-cancel +-- + +Cancel the current vmstate testing process. + +Arguments: None. + +Example: + +-> { "execute": "test-vmstates-cancel" } +<- { "return": {} } +EQMP diff --git a/savevm.c b/savevm.c index 8b75691..66597db 100644 --- a/savevm.c +++ b/savevm.c @@ -1365,8 +1365,12 @@ static void vmstate_test_cb(void *opaque) if (saved_vm_running) { vm_start(); } -timer_mod(v->timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + - v->period); + if (v->active_state) { +timer_mod(v->timer, v->period + + qemu_clock_get_ms(QEMU_CLOCK_REALTIME)); +} else { +goto testing_end; +} return; } @@ -1482,6 +1486,14 @@ void qmp_test_vmstates_set_period(int64_t period, Error **errp) v->period = period; } +void qmp_test_vmstates_cancel(Error **errp) +{ +VMStateLogState *v = vmstate_current_state(); +if (v->active_state) { +v->active_state = false; +} +} + void qmp_xen_save_devices_state(const char *filename, Error **errp) { QEMUFile *f; -- 1.9.3
[Qemu-devel] [PATCH RFC v2 12/12] VMState test: hmp interface for cancel mechanism
Signed-off-by: Sanidhya Kashyap --- hmp-commands.hx | 14 ++ hmp.c | 6 ++ hmp.h | 1 + 3 files changed, 21 insertions(+) diff --git a/hmp-commands.hx b/hmp-commands.hx index 6d15184..fe224fc 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -1822,6 +1822,20 @@ STEXI Set the period to @var{period} (int) for vmstate testing process. ETEXI + { + .name = "test_vmstates_cancel", + .args_type = "", + .params = "", + .help = "cancel the current vmstates testing process", + .mhandler.cmd = hmp_test_vmstates_cancel, +}, + +STEXI +@item test_vmstates_cancel +@findex test_vmstates_cancel +Cancel the current vmstates testing process +ETEXI + STEXI @end table ETEXI diff --git a/hmp.c b/hmp.c index f54b0b9..bbff92a 100644 --- a/hmp.c +++ b/hmp.c @@ -1781,3 +1781,9 @@ void hmp_test_vmstates_set_period(Monitor *mon, const QDict *qdict) error_free(err); } } + +void hmp_test_vmstates_cancel(Monitor *mon, const QDict *qdict) +{ + qmp_test_vmstates_cancel(NULL); +} + diff --git a/hmp.h b/hmp.h index e1afde8..1277dbc 100644 --- a/hmp.h +++ b/hmp.h @@ -98,6 +98,7 @@ void hmp_object_del(Monitor *mon, const QDict *qdict); void hmp_info_memdev(Monitor *mon, const QDict *qdict); void hmp_test_vmstates(Monitor *mon, const QDict *qdict); void hmp_test_vmstates_set_period(Monitor *mon, const QDict *qdict); +void hmp_test_vmstates_cancel(Monitor *mon, const QDict *qdict); void object_add_completion(ReadLineState *rs, int nb_args, const char *str); void object_del_completion(ReadLineState *rs, int nb_args, const char *str); void device_add_completion(ReadLineState *rs, int nb_args, const char *str); -- 1.9.3
[Qemu-devel] [PATCH RFC v2 09/12] VMState test: update period of vmstate testing process
No particular change, except variable name. Since I am not modifying other variables, so I have not made the command generic. Signed-off-by: Sanidhya Kashyap --- qapi-schema.json | 12 qmp-commands.hx | 23 +++ savevm.c | 13 + 3 files changed, 48 insertions(+) diff --git a/qapi-schema.json b/qapi-schema.json index a12872f..13e922e 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -3562,3 +3562,15 @@ # Since 2.2 ## { 'command': 'query-test-vmstates', 'returns': 'VMStateLogStateInfo' } + +## +# @test-vmstates-set-period +# +# sets the sleep interval between iterations of the vmstate testing process +# +# @period: the updated sleep interval value (in milliseconds) +# +# Since 2.2 +## +{ 'command' : 'test-vmstates-set-period', + 'data': { 'period': 'int' } } diff --git a/qmp-commands.hx b/qmp-commands.hx index 0a40a88..2f019b0 100644 --- a/qmp-commands.hx +++ b/qmp-commands.hx @@ -3842,3 +3842,26 @@ Example: "iterations": 100, "period": 100 } } EQMP + +{ +.name = "test-vmstates-set-period", +.args_type = "period:i", +.mhandler.cmd_new = qmp_marshal_input_test_vmstates_set_period, +}, + +SQMP +test-vmstates-set-period + + +Update the sleep interval for the remaining iterations + +Arguments: + +- "period": the updated sleep interval between iterations (json-int) + +Example: + +-> { "execute": "test-vmstates-set-period", "arguments": { "period": 1024 } } +<- { "return": {} } +EQMP + diff --git a/savevm.c b/savevm.c index 797be57..d5fb93b 100644 --- a/savevm.c +++ b/savevm.c @@ -1470,6 +1470,19 @@ VMStateLogStateInfo *qmp_query_test_vmstates(Error **errp) return log_info; } +void qmp_test_vmstates_set_period(int64_t period, Error **errp) +{ +VMStateLogState *v = vmstate_current_state(); +if (period < TEST_VMSTATE_MIN_INTERVAL_MS || +period > TEST_VMSTATE_MAX_INTERVAL_MS) { +error_setg(errp, "sleep interval (period) value must be " + "in the defined range [%d, %d](ms)\n", + TEST_VMSTATE_MIN_INTERVAL_MS, TEST_VMSTATE_MAX_INTERVAL_MS); +return; +} +v->period = period; +} + void qmp_xen_save_devices_state(const char *filename, Error **errp) { QEMUFile *f; -- 1.9.3
[Qemu-devel] [PATCH for-2.1 0/2] Fix migration failure due to ACPI tables size changes
Changing the ACPI table size causes migration to break, and the memory hotplug work opened our eyes on how horribly we were breaking things in 2.0 already. To trigger issue start QEMU-1.7 with -M pc-i440fx-1.7 -device pci-bridge,chassis_nr=1 and try to migrate to QEMU-2.1 or QEMU-2.0 as result target will fail with: qemu-system-x86_64: Length mismatch: /rom@etc/acpi/tables: 2000 in != 3000 This fix allows target QEMU to load smaller RAMBlock into a bigger one and fixes regression which was introduced since 2.0, allowing forward migration from 1.7/2.0 to 2.1 Fix is also suitable for stable-2.0. Igor Mammedov (2): migration: load smaller RAMBlock to a bigger one if permitted acpi: mark ACPI tables ROM blob as extend-able on migration arch_init.c | 19 ++- exec.c | 15 +-- hw/core/loader.c| 6 +- hw/i386/acpi-build.c| 2 +- include/exec/cpu-all.h | 15 ++- include/exec/memory.h | 10 ++ include/exec/ram_addr.h | 1 + include/hw/loader.h | 5 +++-- memory.c| 5 + 9 files changed, 62 insertions(+), 16 deletions(-) -- 1.8.3.1
[Qemu-devel] [PATCH for-2.1 2/2] acpi: mark ACPI tables ROM blob as extend-able on migration
It fixes migration failure for machine type pc-i440fx-1.7 from QEMU 1.7/2.0 to QEMU 2.1 Migration fails due to ACPI tables size grows across 1.7-2.1 versions. That causes ACPI tables ROM blob to change its size differently for the same configurations on different QEMU versions. As result migration code bails out when attempting to load smaller ROM blob into a bigger one on a newer QEMU. To trigger failure it's enough to add pci-bridge device to QEMU CLI Marking ACPI tables ROM blob as extend-able on migration allows QEMU to load smaller ACPI tables from QEMU 1.7/2.0, fixing forward migration failure introduced since 2.0 which affects only configurations that cause ACPI ROM blob size change. Signed-off-by: Igor Mammedov --- hw/core/loader.c | 6 +- hw/i386/acpi-build.c | 2 +- include/hw/loader.h | 5 +++-- 3 files changed, 9 insertions(+), 4 deletions(-) diff --git a/hw/core/loader.c b/hw/core/loader.c index 2bf6b8f..d09b894 100644 --- a/hw/core/loader.c +++ b/hw/core/loader.c @@ -722,7 +722,8 @@ err: void *rom_add_blob(const char *name, const void *blob, size_t len, hwaddr addr, const char *fw_file_name, - FWCfgReadCallback fw_callback, void *callback_opaque) + FWCfgReadCallback fw_callback, void *callback_opaque, + bool extendable) { Rom *rom; void *data = NULL; @@ -742,6 +743,9 @@ void *rom_add_blob(const char *name, const void *blob, size_t len, if (rom_file_has_mr) { data = rom_set_mr(rom, OBJECT(fw_cfg), devpath); +if (extendable) { +memory_region_permit_extendable_migration(rom->mr); +} } else { data = rom->data; } diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index ebc5f03..651c06b 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -1558,7 +1558,7 @@ static void *acpi_add_rom_blob(AcpiBuildState *build_state, GArray *blob, const char *name) { return rom_add_blob(name, blob->data, acpi_data_len(blob), -1, name, -acpi_build_update, build_state); +acpi_build_update, build_state, true); } static const VMStateDescription vmstate_acpi_build = { diff --git a/include/hw/loader.h b/include/hw/loader.h index 796cbf9..e5a24ac 100644 --- a/include/hw/loader.h +++ b/include/hw/loader.h @@ -57,7 +57,8 @@ int rom_add_file(const char *file, const char *fw_dir, bool option_rom); void *rom_add_blob(const char *name, const void *blob, size_t len, hwaddr addr, const char *fw_file_name, - FWCfgReadCallback fw_callback, void *callback_opaque); + FWCfgReadCallback fw_callback, void *callback_opaque, + bool extendable); int rom_add_elf_program(const char *name, void *data, size_t datasize, size_t romsize, hwaddr addr); int rom_load_all(void); @@ -70,7 +71,7 @@ void do_info_roms(Monitor *mon, const QDict *qdict); #define rom_add_file_fixed(_f, _a, _i) \ rom_add_file(_f, NULL, _a, _i, false) #define rom_add_blob_fixed(_f, _b, _l, _a) \ -rom_add_blob(_f, _b, _l, _a, NULL, NULL, NULL) +rom_add_blob(_f, _b, _l, _a, NULL, NULL, NULL, false) #define PC_ROM_MIN_VGA 0xc #define PC_ROM_MIN_OPTION 0xc8000 -- 1.8.3.1
Re: [Qemu-devel] [PATCH 4/7] libqos: Correct mask to align size to PAGE_SIZE in malloc-pc
On Thu, Jul 24, 2014 at 08:31:02PM +0200, Marc Marí wrote: > Signed-off-by: Paolo Bonzini > Signed-off-by: Marc Marí > --- > tests/libqos/malloc-pc.c |2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) Reviewed-by: Stefan Hajnoczi pgpfJeOahV0Cb.pgp Description: PGP signature
[Qemu-devel] [PATCH for-2.1 1/2] migration: load smaller RAMBlock to a bigger one if permitted
Add API to mark memory region as extend-able on migration, to allow migration code to load smaller RAMBlock into a bigger one on destination QEMU instance. This will allow to fix broken migration from QEMU 1.7/2.0 to QEMU 2.1 due to ACPI tables size changes across 1.7/2.0/2.1 versions by marking ACPI tables ROM blob as extend-able. So that smaller tables from previos version could be always migrated to a bigger rom blob on new version. Credits-for-idea: Michael S. Tsirkin Signed-off-by: Igor Mammedov --- arch_init.c | 19 ++- exec.c | 15 +-- include/exec/cpu-all.h | 15 ++- include/exec/memory.h | 10 ++ include/exec/ram_addr.h | 1 + memory.c| 5 + 6 files changed, 53 insertions(+), 12 deletions(-) diff --git a/arch_init.c b/arch_init.c index 8ddaf35..812f8b5 100644 --- a/arch_init.c +++ b/arch_init.c @@ -1071,11 +1071,20 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id) QTAILQ_FOREACH(block, &ram_list.blocks, next) { if (!strncmp(id, block->idstr, sizeof(id))) { -if (block->length != length) { -error_report("Length mismatch: %s: " RAM_ADDR_FMT - " in != " RAM_ADDR_FMT, id, length, - block->length); -ret = -EINVAL; +if (block->flags & RAM_EXTENDABLE_ON_MIGRATE) { +if (block->length < length) { +error_report("Length too big: %s: "RAM_ADDR_FMT + " > " RAM_ADDR_FMT, id, length, + block->length); +ret = -EINVAL; +} +} else { +if (block->length != length) { +error_report("Length mismatch: %s: "RAM_ADDR_FMT + " in != " RAM_ADDR_FMT, id, length, + block->length); +ret = -EINVAL; +} } break; } diff --git a/exec.c b/exec.c index 765bd94..755b1d3 100644 --- a/exec.c +++ b/exec.c @@ -69,12 +69,6 @@ AddressSpace address_space_memory; MemoryRegion io_mem_rom, io_mem_notdirty; static MemoryRegion io_mem_unassigned; -/* RAM is pre-allocated and passed into qemu_ram_alloc_from_ptr */ -#define RAM_PREALLOC (1 << 0) - -/* RAM is mmap-ed with MAP_SHARED */ -#define RAM_SHARED (1 << 1) - #endif struct CPUTailQ cpus = QTAILQ_HEAD_INITIALIZER(cpus); @@ -1214,6 +1208,15 @@ void qemu_ram_unset_idstr(ram_addr_t addr) } } +void qemu_ram_set_extendable_on_migration(ram_addr_t addr) +{ +RAMBlock *block = find_ram_block(addr); + +if (block) { +block->flags |= RAM_EXTENDABLE_ON_MIGRATE; +} +} + static int memory_try_enable_merging(void *addr, size_t len) { if (!qemu_opt_get_bool(qemu_get_machine_opts(), "mem-merge", true)) { diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h index f91581f..2b9aea3 100644 --- a/include/exec/cpu-all.h +++ b/include/exec/cpu-all.h @@ -296,13 +296,26 @@ CPUArchState *cpu_copy(CPUArchState *env); #if !defined(CONFIG_USER_ONLY) /* memory API */ +typedef enum { +/* RAM is pre-allocated and passed into qemu_ram_alloc_from_ptr */ +RAM_PREALLOC = 1 << 0, +/* RAM is mmap-ed with MAP_SHARED */ +RAM_SHARED = 1 << 1, +/* + * Allow at migration time to load RAMBlock of smaller size than + * destination RAMBlock is + */ +RAM_EXTENDABLE_ON_MIGRATE = 1 << 2, +RAM_FLAGS_END = 1 << 31 +} RAMBlockFlags; + typedef struct RAMBlock { struct MemoryRegion *mr; uint8_t *host; ram_addr_t offset; ram_addr_t length; -uint32_t flags; +RAMBlockFlags flags; char idstr[256]; /* Reads can take either the iothread or the ramlist lock. * Writes must take both locks. diff --git a/include/exec/memory.h b/include/exec/memory.h index e2c8e3e..6c03f70 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -894,6 +894,16 @@ bool memory_region_present(MemoryRegion *container, hwaddr addr); bool memory_region_is_mapped(MemoryRegion *mr); /** + * memory_region_permit_extendable_migration: allows to mark + * #MemoryRegion as extendable on migrartion, which permits to + * load source memory block of smaller size than destination memory block + * at migration time + * + * @mr: a #MemoryRegion which should be marked as extendable on migration + */ +void memory_region_permit_extendable_migration(MemoryRegion *mr); + +/** * memory_region_find: translate an address/size relative to a * MemoryRegion into a #MemoryRegionSection. * diff
Re: [Qemu-devel] [PULL for-2.1 0/3] Last minute patches
On 25 July 2014 15:22, Paolo Bonzini wrote: > Since Igor hasn't sent his patches, and I'm leaving the office, I pushed > this to > >git://github.com/bonzini/qemu.git tags/for-upstream-full > > I don't know about tests/acpi-test-data/pc. It makes sense that this > patch should modify something there, but: > > * "make check" passes > > * the test warns even before patch 1, for both the DSDT (modified by the > patch) and SSDT (which this series doesn't touch at all) > > * I cannot get it to pass, except by blindly copying the "actual" output > on the "expected" files > > * mst is on vacation and Marcel is off on Fridays > > Based on my understanding of the problem, it is not possible to fix the > bug without hacks like this one, and even reverting all patches in this > area would be more risky. Hmm. I'm not really sure what the right thing is, so what I'm planning to do is: * just apply the qemu-char fix for now * not tag -rc4 today * see if things are clearer on Monday (I see Igor has now sent out a patchset) * tag -rc4 Mon or Tues * slip the release date a few days (not a big deal, I think) thanks -- PMM
Re: [Qemu-devel] [PATCH] [RFC] Add machine type pc-1.0-qemu-kvm for live migrate compatibility with qemu-kvm
ping: anyone got any views on this one? On 22 Jul 2014, at 19:43, Alex Bligh wrote: > Add a machine type pc-1.0-qemu-kvm for live migrate compatibility > with qemu-kvm version 1.0. > > Signed-off-by: Alex Bligh > --- > hw/acpi/piix4.c | 49 -- > hw/i386/pc_piix.c| 31 + > hw/timer/i8254_common.c | 41 ++ > include/hw/acpi/piix4.h |1 + > include/hw/timer/i8254.h |2 ++ > 5 files changed, 122 insertions(+), 2 deletions(-) > > This RFC patch adds inbound migrate capability from qemu-kvm version > 1.0. The main ideas are those set out in Cole Robinson's patch here: > http://pkgs.fedoraproject.org/cgit/qemu.git/tree/0001-Fix-migration-from-qemu-kvm.patch?h=f20 > however, rather than patching statically (and breaking inbound > migration on existing machine types), I have added a new machine > type (pc-1.0-qemu-kvm) without affecting any other machine types. > > This requires 'hot patching' the VMStateDescription in a couple of > places, which in turn is less than obvious as there may be (indeed > are for i8259) derived classes. Whilst pretty nausea-inducing, this > approach has the benefit of being entirely self-contained. > > I developed this on qemu 2.0 but have forward ported it (trivially) > to master. My testing has been on a VM live-migrated-to-file from > Ubuntu Precise qemu-kvm 1.0. Testing has been light to date (i.e. > can I migrate it inbound with -S without anything complaining). > > I have not (yet) brought forward the qxl rom size (and possibly > video ram) changes in Cole's patches as I'd prefer an assessment > of whether this is the right approach first. > > diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c > index b72b34e..708db79 100644 > --- a/hw/acpi/piix4.c > +++ b/hw/acpi/piix4.c > @@ -267,8 +267,9 @@ static const VMStateDescription vmstate_memhp_state = { > }; > > /* qemu-kvm 1.2 uses version 3 but advertised as 2 > - * To support incoming qemu-kvm 1.2 migration, change version_id > - * and minimum_version_id to 2 below (which breaks migration from > + * To support incoming qemu-kvm 1.2 migration, we support > + * via a command line option a change to minimum_version_id > + * of 2 in a _compat structure (which breaks migration from > * qemu 1.2). > * > */ > @@ -307,6 +308,34 @@ static const VMStateDescription vmstate_acpi = { > } > }; > > +static const VMStateDescription vmstate_acpi_compat = { > +.name = "piix4_pm", > +.version_id = 3, > +.minimum_version_id = 2, > +.minimum_version_id_old = 1, > +.load_state_old = acpi_load_old, > +.post_load = vmstate_acpi_post_load, > +.fields = (VMStateField[]) { > +VMSTATE_PCI_DEVICE(parent_obj, PIIX4PMState), > +VMSTATE_UINT16(ar.pm1.evt.sts, PIIX4PMState), > +VMSTATE_UINT16(ar.pm1.evt.en, PIIX4PMState), > +VMSTATE_UINT16(ar.pm1.cnt.cnt, PIIX4PMState), > +VMSTATE_STRUCT(apm, PIIX4PMState, 0, vmstate_apm, APMState), > +VMSTATE_TIMER(ar.tmr.timer, PIIX4PMState), > +VMSTATE_INT64(ar.tmr.overflow_time, PIIX4PMState), > +VMSTATE_STRUCT(ar.gpe, PIIX4PMState, 2, vmstate_gpe, ACPIGPE), > +VMSTATE_STRUCT_TEST( > +acpi_pci_hotplug.acpi_pcihp_pci_status[ACPI_PCIHP_BSEL_DEFAULT], > +PIIX4PMState, > +vmstate_test_no_use_acpi_pci_hotplug, > +2, vmstate_pci_status, > +struct AcpiPciHpPciStatus), > +VMSTATE_PCI_HOTPLUG(acpi_pci_hotplug, PIIX4PMState, > +vmstate_test_use_acpi_pci_hotplug), > +VMSTATE_END_OF_LIST() > +} > +}; > + > static void piix4_reset(void *opaque) > { > PIIX4PMState *s = opaque; > @@ -619,6 +648,22 @@ static void piix4_pm_class_init(ObjectClass *klass, void > *data) > adevc->ospm_status = piix4_ospm_status; > } > > +void piix4_pm_class_fix_compat(void) > +{ > +GSList *el, *devices = object_class_get_list(TYPE_DEVICE, false); > +/* > + * Change the vmstate field in this class and any derived classes > + * if not overriden. As no other classes should be using this > + * vmstate, we can simply iterate the class list > + */ > +for (el = devices; el; el = el->next) { > +DeviceClass *dc = el->data; > +if (dc->vmsd == &vmstate_acpi) { > +dc->vmsd = &vmstate_acpi_compat; > +} > +} > +} > + > static const TypeInfo piix4_pm_info = { > .name = TYPE_PIIX4_PM, > .parent= TYPE_PCI_DEVICE, > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c > index 7081c08..e400ea6 100644 > --- a/hw/i386/pc_piix.c > +++ b/hw/i386/pc_piix.c > @@ -49,6 +49,8 @@ > #include "hw/acpi/acpi.h" > #include "cpu.h" > #include "qemu/error-report.h" > +#include "hw/acpi/piix4.h" > +#include "hw/timer/i8254.h" > #ifdef CONFIG_XEN > # include > #endif > @@ -386,6 +388,15 @@ static void pc_init_pci_1_2(MachineState *machine) >
Re: [Qemu-devel] [PATCH] hw/arm/boot: Set PC correctly when loading AArch64 ELF files
On 07/25/2014 11:23 AM, Peter Maydell wrote: > The code in do_cpu_reset() correctly handled AArch64 CPUs > when running Linux kernels, but was missing code in the > branch of the if() that deals with loading ELF files. > Correctly jump to the ELF entry point on reset rather than > leaving the reset PC at zero. Thanks Peter! With this patch I can see the first few instructions being executed. Tested-by: Christopher Covington (The default Newlib/libgloss wants to touch EL3 registers that QEMU doesn't yet have, but I can probably make my test case work with -nostdlib.) Thanks, Christopher -- Employee of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation.
Re: [Qemu-devel] [PATCH 7/7] libqos: Added basic virtqueue support to virtio implementation
On Thu, Jul 24, 2014 at 08:31:05PM +0200, Marc Marí wrote: > +static void qvirtio_pci_set_queue_address(QVirtioDevice *d, uint16_t addr) > +{ > +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; > +qpci_io_writel(dev->pdev, dev->addr + QVIRTIO_QUEUE_ADDRESS, addr); > +} Why is addr uint16_t? It should be a 32-bit Page Frame Number). It's probably clearer to name it "pfn" instead of "addr" since it's not an address. pgp1kUPsDwLVV.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH for-2.1 0/2] Fix migration failure due to ACPI tables size changes
On 07/25/14 17:48, Igor Mammedov wrote: > Changing the ACPI table size causes migration to break, and the memory > hotplug work opened our eyes on how horribly we were breaking things in > 2.0 already. > > To trigger issue start > QEMU-1.7 with -M pc-i440fx-1.7 -device pci-bridge,chassis_nr=1 > and try to migrate to QEMU-2.1 or QEMU-2.0 as result target will fail with: > qemu-system-x86_64: Length mismatch: /rom@etc/acpi/tables: 2000 in != 3000 > > This fix allows target QEMU to load smaller RAMBlock into a bigger one > and fixes regression which was introduced since 2.0, allowing > forward migration from 1.7/2.0 to 2.1 > Fix is also suitable for stable-2.0. > > Igor Mammedov (2): > migration: load smaller RAMBlock to a bigger one if permitted > acpi: mark ACPI tables ROM blob as extend-able on migration > > arch_init.c | 19 ++- > exec.c | 15 +-- > hw/core/loader.c| 6 +- > hw/i386/acpi-build.c| 2 +- > include/exec/cpu-all.h | 15 ++- > include/exec/memory.h | 10 ++ > include/exec/ram_addr.h | 1 + > include/hw/loader.h | 5 +++-- > memory.c| 5 + > 9 files changed, 62 insertions(+), 16 deletions(-) I can name things in favor of both approaches, Paolo's and yours. Paolo's approach keeps the hack where the problem was introduced, in the ACPI generator. The local nature of the migration-related pig sty is very attractive for me. As I said on IRC, if you fix the migration problem in the APCI generator by elegant compat flags, or such an emergency hack as Paolo's, that's an implementation detail; it's important that it stay local to the ACPI generator. Your patchset leaks the problem into RAMBlocks. In favor of your idea OTOH, as you explained, it would work for all possible ACPI configurations, and not break for a random few ones like Paolo's approach (talking about patch #2 of that set, the procedural _PRT is great in any case). You could argue that by customizing the RAMBlocks as extensible (for ACPI / fw_cfg blob backing only) the leakage is contained. I don't know how to decide between these two approaches. The optimal one would be to reimplement the 2.0 --> 2.1 feature additions from scratch, introducing compat flags afresh. Won't happen I guess. In any case I'll try to review this set for the technical things. Thanks Laszlo
Re: [Qemu-devel] [PULL v2 0/1] Fix for "-serial pty" regression
On 25 July 2014 13:38, Paolo Bonzini wrote: > The following changes since commit f368c33d5ab09dd5656924185cd975b11838cd25: > > Update version for v2.1.0-rc3 release (2014-07-22 18:17:03 +0100) > > are available in the git repository at: > > git://github.com/bonzini/qemu.git tags/for-upstream > > for you to fetch changes up to 62c339c5272ce8fbe8ca52695cee8ff40da7872e: > > qemu-char: ignore flow control if a PTY's slave is not connected > (2014-07-25 14:36:07 +0200) > > > Here is the serial fix for 2.1. > > > Paolo Bonzini (1): > qemu-char: ignore flow control if a PTY's slave is not connected Applied this pullreq rather than the other (see comments on that one). thanks -- PMM
[Qemu-devel] [Bug 1348719] [NEW] arm64: -smp 2 hangs qemu
Public bug reported: It appears that smp is broken on qemu for arm64. I'm looking into the root cause but am curious if others can reproduce in their environments. Tested with commit f368c33d5ab09dd5656924185cd975b11838cd25 (July 22) from https://github.com/qemu/qemu.git [root@joelaarch64 ~]# /usr/local/bin/qemu-system-aarch64 --version QEMU emulator version 2.0.93, Copyright (c) 2003-2008 Fabrice Bellard works fine: qemu --enable-kvm -nographic -netdev tap,id=t0,ifname=tap0,script=no,downscript=no,vhost=on -device virtio-net-device,netdev=t0,id=nic0 -kernel /extra/rootfs/boot/Image -drive file=/extra/Styx-Acadia-42-2014-07-07_10-09-27.img,id=fs -device virtio-blk-device,drive=fs -m 1024 -M virt -cpu host -smp 1 -append "console=ttyAMA0 console=ttyS0 root=/dev/vda2" hangs: qemu --enable-kvm -nographic -netdev tap,id=t0,ifname=tap0,script=no,downscript=no,vhost=on -device virtio-net-device,netdev=t0,id=nic0 -kernel /extra/rootfs/boot/Image -drive file=/extra/Styx-Acadia-42-2014-07-07_10-09-27.img,id=fs -device virtio-blk-device,drive=fs -m 1024 -M virt -cpu host -smp 2 -append "console=ttyAMA0 console=ttyS0 root=/dev/vda2" (gdb) t [Current thread is 1 (Thread 0x3ffb6787cc0 (LWP 7619))] (gdb) bt #0 0x03ffb6e50330 in ppoll () from /lib64/libc.so.6 #1 0x006631a0 in ppoll (__ss=0x0, __timeout=0x0, __nfds=, __fds=) at /usr/include/bits/poll2.h:77 #2 qemu_poll_ns (fds=, nfds=, timeout=) at qemu-timer.c:314 #3 0x00662878 in os_host_main_loop_wait (timeout=) at main-loop.c:229 #4 main_loop_wait (nonblocking=) at main-loop.c:484 #5 0x0040fdf4 in main_loop () at vl.c:2010 #6 main (argc=, argv=, envp=) at vl.c:4541 (gdb) t [Current thread is 1 (Thread 0x3ffb6787cc0 (LWP 7619))] (gdb) t 2 [Switching to thread 2 (Thread 0x3ffb64beef0 (LWP 7622))] #0 0x03ffb6f10a44 in sigwait () from /lib64/libpthread.so.0 (gdb) bt #0 0x03ffb6f10a44 in sigwait () from /lib64/libpthread.so.0 #1 0x0069d78c in sigwait_compat (opaque=0xd752c0) at util/compatfd.c:36 #2 0x03ffb6f07c20 in start_thread () from /lib64/libpthread.so.0 #3 0x03ffb6e5a80c in clone () from /lib64/libc.so.6 ** Affects: qemu Importance: Undecided Status: New -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1348719 Title: arm64: -smp 2 hangs qemu Status in QEMU: New Bug description: It appears that smp is broken on qemu for arm64. I'm looking into the root cause but am curious if others can reproduce in their environments. Tested with commit f368c33d5ab09dd5656924185cd975b11838cd25 (July 22) from https://github.com/qemu/qemu.git [root@joelaarch64 ~]# /usr/local/bin/qemu-system-aarch64 --version QEMU emulator version 2.0.93, Copyright (c) 2003-2008 Fabrice Bellard works fine: qemu --enable-kvm -nographic -netdev tap,id=t0,ifname=tap0,script=no,downscript=no,vhost=on -device virtio-net-device,netdev=t0,id=nic0 -kernel /extra/rootfs/boot/Image -drive file=/extra/Styx-Acadia-42-2014-07-07_10-09-27.img,id=fs -device virtio-blk-device,drive=fs -m 1024 -M virt -cpu host -smp 1 -append "console=ttyAMA0 console=ttyS0 root=/dev/vda2" hangs: qemu --enable-kvm -nographic -netdev tap,id=t0,ifname=tap0,script=no,downscript=no,vhost=on -device virtio-net-device,netdev=t0,id=nic0 -kernel /extra/rootfs/boot/Image -drive file=/extra/Styx-Acadia-42-2014-07-07_10-09-27.img,id=fs -device virtio-blk-device,drive=fs -m 1024 -M virt -cpu host -smp 2 -append "console=ttyAMA0 console=ttyS0 root=/dev/vda2" (gdb) t [Current thread is 1 (Thread 0x3ffb6787cc0 (LWP 7619))] (gdb) bt #0 0x03ffb6e50330 in ppoll () from /lib64/libc.so.6 #1 0x006631a0 in ppoll (__ss=0x0, __timeout=0x0, __nfds=, __fds=) at /usr/include/bits/poll2.h:77 #2 qemu_poll_ns (fds=, nfds=, timeout=) at qemu-timer.c:314 #3 0x00662878 in os_host_main_loop_wait (timeout=) at main-loop.c:229 #4 main_loop_wait (nonblocking=) at main-loop.c:484 #5 0x0040fdf4 in main_loop () at vl.c:2010 #6 main (argc=, argv=, envp=) at vl.c:4541 (gdb) t [Current thread is 1 (Thread 0x3ffb6787cc0 (LWP 7619))] (gdb) t 2 [Switching to thread 2 (Thread 0x3ffb64beef0 (LWP 7622))] #0 0x03ffb6f10a44 in sigwait () from /lib64/libpthread.so.0 (gdb) bt #0 0x03ffb6f10a44 in sigwait () from /lib64/libpthread.so.0 #1 0x0069d78c in sigwait_compat (opaque=0xd752c0) at util/compatfd.c:36 #2 0x03ffb6f07c20 in start_thread () from /lib64/libpthread.so.0 #3 0x03ffb6e5a80c in clone () from /lib64/libc.so.6 To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1348719/+subscriptions
[Qemu-devel] [Bug 1348719] Re: arm64: -smp 2 hangs qemu
** Changed in: qemu Assignee: (unassigned) => Joel Schopp (joel-schopp) -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1348719 Title: arm64: -smp 2 hangs qemu Status in QEMU: New Bug description: It appears that smp is broken on qemu for arm64. I'm looking into the root cause but am curious if others can reproduce in their environments. Tested with commit f368c33d5ab09dd5656924185cd975b11838cd25 (July 22) from https://github.com/qemu/qemu.git [root@joelaarch64 ~]# /usr/local/bin/qemu-system-aarch64 --version QEMU emulator version 2.0.93, Copyright (c) 2003-2008 Fabrice Bellard works fine: qemu --enable-kvm -nographic -netdev tap,id=t0,ifname=tap0,script=no,downscript=no,vhost=on -device virtio-net-device,netdev=t0,id=nic0 -kernel /extra/rootfs/boot/Image -drive file=/extra/Styx-Acadia-42-2014-07-07_10-09-27.img,id=fs -device virtio-blk-device,drive=fs -m 1024 -M virt -cpu host -smp 1 -append "console=ttyAMA0 console=ttyS0 root=/dev/vda2" hangs: qemu --enable-kvm -nographic -netdev tap,id=t0,ifname=tap0,script=no,downscript=no,vhost=on -device virtio-net-device,netdev=t0,id=nic0 -kernel /extra/rootfs/boot/Image -drive file=/extra/Styx-Acadia-42-2014-07-07_10-09-27.img,id=fs -device virtio-blk-device,drive=fs -m 1024 -M virt -cpu host -smp 2 -append "console=ttyAMA0 console=ttyS0 root=/dev/vda2" (gdb) t [Current thread is 1 (Thread 0x3ffb6787cc0 (LWP 7619))] (gdb) bt #0 0x03ffb6e50330 in ppoll () from /lib64/libc.so.6 #1 0x006631a0 in ppoll (__ss=0x0, __timeout=0x0, __nfds=, __fds=) at /usr/include/bits/poll2.h:77 #2 qemu_poll_ns (fds=, nfds=, timeout=) at qemu-timer.c:314 #3 0x00662878 in os_host_main_loop_wait (timeout=) at main-loop.c:229 #4 main_loop_wait (nonblocking=) at main-loop.c:484 #5 0x0040fdf4 in main_loop () at vl.c:2010 #6 main (argc=, argv=, envp=) at vl.c:4541 (gdb) t [Current thread is 1 (Thread 0x3ffb6787cc0 (LWP 7619))] (gdb) t 2 [Switching to thread 2 (Thread 0x3ffb64beef0 (LWP 7622))] #0 0x03ffb6f10a44 in sigwait () from /lib64/libpthread.so.0 (gdb) bt #0 0x03ffb6f10a44 in sigwait () from /lib64/libpthread.so.0 #1 0x0069d78c in sigwait_compat (opaque=0xd752c0) at util/compatfd.c:36 #2 0x03ffb6f07c20 in start_thread () from /lib64/libpthread.so.0 #3 0x03ffb6e5a80c in clone () from /lib64/libc.so.6 To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1348719/+subscriptions
Re: [Qemu-devel] [Intel-gfx] ResettRe: [Xen-devel] [v5][PATCH 0/5] xen: add Intel IGD passthrough support
On Thu, Jul 24, 2014 at 09:44:41AM +0800, Chen, Tiejun wrote: > On 2014/7/24 4:54, Konrad Rzeszutek Wilk wrote: > >On Sat, Jul 19, 2014 at 12:27:21AM +, Kay, Allen M wrote: > >>>For the MCH PCI registers that do need to be read - can you tell us which > >>>ones those are? > >> > >>In qemu/hw/xen_pt_igd.c/igd_pci_read(), following MCH PCI config register > >>reads are passthrough to the host HW. Some of the registers are needed by > >>Ironlake GFX driver which we probably can remove. I did a trace recently > >>on Broadwell, the number of register accessed are even smaller (0, 2, 2c, > >>2e, 50, 52, a0, a4). Given that we now have integrated MCH and GPU in the > >>same socket, looks like driver can easily remove reads for offsets 0 - 0x2e. > >> > >>case 0x00:/* vendor id */ > >>case 0x02:/* device id */ > >>case 0x08:/* revision id */ > >>case 0x2c:/* sybsystem vendor id */ > >>case 0x2e:/* sybsystem id */ > > > >Right. We can fix the i915 to use the mechanism that Michael mentioned. > >(see attached RFC patches) > > > >>case 0x50:/* SNB: processor graphics control register */ > >>case 0x52:/* processor graphics control register */ > >>case 0xa0:/* top of memory */ > >>case 0xb0:/* ILK: BSM: should read from dev 2 offset > >> 0x5c */ > >>case 0x58:/* SNB: PAVPC Offset */ > >>case 0xa4:/* SNB: graphics base of stolen memory */ > >>case 0xa8:/* SNB: base of GTT stolen memory */ > > > >I dug in the intel-gtt.c (see ironlake_gtt_driver) to figure out this > >a bit more. As in, I speculated, what if we returned 0 (and not implement > >any support for reading from these registers). What would happen? > > > >0x52 for Ironlake (g5): > >-- > >It looks like intel_gmch_probe is called when i915_gem_gtt_init > >starts (there is a lot of code that looks to be used between > >intel-gtt.c and i915.c). > > > >Anyhow the interesting parts are that i9xx_setup ends up calling > >ioremap the i915 BAR to setup some of these registers for older generations. > > > >Then i965_gtt_total_entries gets which reads 0x52, but it is only > >needed for v5 generation. For other (v4 and G33) it reads it from the GPU's > >0x2020 register. > > > >If there is a mismatch, it writes to the GPU at 0x2020 to update the > >the size based on the bridge. And then it reads from 0x2020 and that > >is returned and stuck in intel_private.gtt_total_entries. > > > >So 0x52 in the emulated bridge could be populated with what the > >GPU has at 0x2020. And the writes go in the emulated area. > > > >0x52 for v6 -> v8: > >- > >We seem to go to gen6_gmch_probe which just figures out the > >the GTT based on the GPU's BAR sizes. The stolen values > >are read from 0x50 from the GPU. So no need to touch the bridge > >(see gen6_gmch_probe) > > > >OK, so no need to have 0x52 or 0x50 in the bridge. > > > >0xA0: > >- > >Could not find any reference in the Linux code. Why would > >Windows driver need this? If we returned the _real_ TOM would > >it matter? Is it used to figure out the device should use 32-bit > >DMA operations or 40-bit? > > > >0xb0 or 0x5c: > >- > >No mention of them in the Linux code. > > > >0x58, 0xa4, 0xa8: > >- > >No usage of them in the Linux code. We seem to be using the 0x52 > >from the bridge and 0x2020 from the GPU for this functionality. > > > > > >So in theory*, if using Ironlake we need to have a proper value > >in 0x52 in the bridge. But for the later chipsets we do not need > >these values (I am assuming that intel_setup_mchbar can safely > >return as it does that for Ironlake and could very well for later > >generations). > > > >Can this be reflected in the Windows driver as well? > > > >P.S. > >*theory: That is assuming we modify the Linux i915_drv.c:intel_detect_pch > >to pick up the id as suggested earlier. See the RFC patches attached. > >(Not compile tested at all!) > > I take a look these patches, looks we still scan all PCI Bridge to walk all > PCHs. So this means we still need to fake a PCI bridge, right? Or maybe you > don't cover this problem this time. I totally forgot. Feel free to fix that. > > I prefer we should check dev slot to get that PCH like my previous patch, Uh? Your patch was checking for 0:1f.0, not the slot of the device. (see https://lkml.org/lkml/2014/6/19/121) > "gpu:drm:i915:intel_detect_pch: back to check devfn instead of check class > type". Because Windows always use this way, so I think this point should be > same between Linux and Windows. Didn't we discuss that we did not want to pass in PCH at all if we can? And from the observation I made above it seems that we can safely do it under Linux. With Windows I don't know - but I presume the answer is yes too. > > Or we need anthe
Re: [Qemu-devel] [PATCH 2/7] tests: Add virtio device initialization
El Fri, 25 Jul 2014 16:19:41 +0100 Stefan Hajnoczi escribió: > On Thu, Jul 24, 2014 at 08:31:00PM +0200, Marc Marí wrote: > > +static uint32_t qvirtio_pci_get_features(QVirtioDevice *d) > > +{ > > +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; > > +return qpci_io_readl(dev->pdev, dev->addr + > > QVIRTIO_DEVICE_FEATURES); +} > > Unused? If it's unused, then it's untested. Yes, moved to the other patch for v2 > > > + > > +static uint8_t qvirtio_pci_get_status(QVirtioDevice *d) > > +{ > > +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; > > +return qpci_io_readb(dev->pdev, dev->addr + > > QVIRTIO_DEVICE_STATUS); +} > > Unused? Used in virtio.c (qvirtio_reset / qvirtio_set_acknowledge / qvirtio_set_driver). > > + > > +static void qvirtio_pci_set_status(QVirtioDevice *d, uint8_t val) > > +{ > > +QVirtioPCIDevice *dev = (QVirtioPCIDevice *)d; > > +qpci_io_writeb(dev->pdev, dev->addr + QVIRTIO_DEVICE_STATUS, > > val); > > Unused? Also in virtio.c > > > @@ -73,3 +97,11 @@ QVirtioPCIDevice > > *qvirtio_pci_device_find(QPCIBus *bus, uint16_t device_type) > > return dev; > > } > > + > > +void qvirtio_pci_enable_device(QVirtioPCIDevice *d) > > +{ > > +qpci_device_enable(d->pdev); > > +d->addr = qpci_iomap(d->pdev, 0); > > +g_assert(d->addr != NULL); > > +} > > Where is qpci_iounmap() called to clean up? Missed. Also, it is unimplemented. Marc
Re: [Qemu-devel] [PATCH for-2.1 2/2] acpi: mark ACPI tables ROM blob as extend-able on migration
Il 25/07/2014 17:48, Igor Mammedov ha scritto: > It fixes migration failure for machine type pc-i440fx-1.7 from > QEMU 1.7/2.0 to QEMU 2.1 > > Migration fails due to ACPI tables size grows across 1.7-2.1 > versions. That causes ACPI tables ROM blob to change its size > differently for the same configurations on different QEMU versions. > As result migration code bails out when attempting to load > smaller ROM blob into a bigger one on a newer QEMU. > To trigger failure it's enough to add pci-bridge device to QEMU CLI > > Marking ACPI tables ROM blob as extend-able on migration allows > QEMU to load smaller ACPI tables from QEMU 1.7/2.0, fixing > forward migration failure introduced since 2.0 which affects > only configurations that cause ACPI ROM blob size change. This works in this case, and it is more friendly to downstreams indeed. It also is simpler, at least on the surface. I think the ramifications could be a bit larger than with my own patches, but still I guess it's more appropriate at this point of the release cycle. It doesn't handle the case of ACPI tables that shrink, which can happen as well. I guess if this ever happens we can just hard-code the table size of the "old" versions to something big enough (64K?) and keep using fine-grained sizing. I'd like a day or two to mull about it, but I have it even if the patches are applied. Peter, feel free to go ahead with Igor's patches. Paolo
Re: [Qemu-devel] [PATCH for-2.1 1/2] migration: load smaller RAMBlock to a bigger one if permitted
On 07/25/14 17:48, Igor Mammedov wrote: > Add API to mark memory region as extend-able on migration, > to allow migration code to load smaller RAMBlock into > a bigger one on destination QEMU instance. > > This will allow to fix broken migration from QEMU 1.7/2.0 to > QEMU 2.1 due to ACPI tables size changes across 1.7/2.0/2.1 > versions by marking ACPI tables ROM blob as extend-able. > So that smaller tables from previos version could be always ("previous") > migrated to a bigger rom blob on new version. > > Credits-for-idea: Michael S. Tsirkin > Signed-off-by: Igor Mammedov > --- > arch_init.c | 19 ++- > exec.c | 15 +-- > include/exec/cpu-all.h | 15 ++- > include/exec/memory.h | 10 ++ > include/exec/ram_addr.h | 1 + > memory.c| 5 + > 6 files changed, 53 insertions(+), 12 deletions(-) (Please pass the -O flag to git-format-patch, with an order file that moves header files (*.h) to the front. Header hunks should lead in a patch. Thanks.) > diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h > index 6593be1..248c075 100644 > --- a/include/exec/ram_addr.h > +++ b/include/exec/ram_addr.h > @@ -33,6 +33,7 @@ void *qemu_get_ram_block_host_ptr(ram_addr_t addr); > void *qemu_get_ram_ptr(ram_addr_t addr); > void qemu_ram_free(ram_addr_t addr); > void qemu_ram_free_from_ptr(ram_addr_t addr); > +void qemu_ram_set_extendable_on_migration(ram_addr_t addr); > > static inline bool cpu_physical_memory_get_dirty(ram_addr_t start, > ram_addr_t length, (Ugh. The declarations (prototypes) of qemu_ram_*() functions are scattered between "ram_addr.h" and "cpu-common.h" (both under include/exec). I can't see why that is a good thing (the function definitions all seem to be in exec.c).) > diff --git a/exec.c b/exec.c > index 765bd94..755b1d3 100644 > --- a/exec.c > +++ b/exec.c > @@ -69,12 +69,6 @@ AddressSpace address_space_memory; > MemoryRegion io_mem_rom, io_mem_notdirty; > static MemoryRegion io_mem_unassigned; > > -/* RAM is pre-allocated and passed into qemu_ram_alloc_from_ptr */ > -#define RAM_PREALLOC (1 << 0) > - > -/* RAM is mmap-ed with MAP_SHARED */ > -#define RAM_SHARED (1 << 1) > - > #endif I'm not sure these macros should be replaced with an enum; the new flag could be introduced just following the existent pattern. > > struct CPUTailQ cpus = QTAILQ_HEAD_INITIALIZER(cpus); > @@ -1214,6 +1208,15 @@ void qemu_ram_unset_idstr(ram_addr_t addr) > } > } > > +void qemu_ram_set_extendable_on_migration(ram_addr_t addr) > +{ > +RAMBlock *block = find_ram_block(addr); > + > +if (block) { > +block->flags |= RAM_EXTENDABLE_ON_MIGRATE; > +} > +} > + Let's see how some oher qemu_ram_*() functions search for their blocks. We can form two classes; the first class takes a ram_addr_t into some RAMBlock, the second class only accepts/finds a ram_addr_t only at the beginning of some RAMBlock. (1) containment: qemu_get_ram_fd(): calls qemu_get_ram_block() qemu_get_ram_block_host_ptr: calls qemu_get_ram_block() qemu_get_ram_ptr(): calls qemu_get_ram_block() qemu_ram_remap():needs containment (2) exact block start match: qemu_ram_free(): needs (addr == block->offset) qemu_ram_free_from_ptr():needs (addr == block->offset) qemu_ram_set_idstr():calls find_ram_block() qemu_ram_unset_idstr(): calls find_ram_block() Your function will belong to the 2nd class. The definition is close to that of qemu_ram_unset_idstr(), another class member, OK. The declaration is close to qemu_ram_free_from_ptr(), which is another class member. OK. I'd throw in an assert(), rather than an "if", just like qemu_ram_set_idstr() does. > static int memory_try_enable_merging(void *addr, size_t len) > { > if (!qemu_opt_get_bool(qemu_get_machine_opts(), "mem-merge", true)) { Let's see the caller: > diff --git a/memory.c b/memory.c > index 64d7176..744c746 100644 > --- a/memory.c > +++ b/memory.c > @@ -1791,6 +1791,11 @@ bool memory_region_is_mapped(MemoryRegion *mr) > return mr->container ? true : false; > } > > +void memory_region_permit_extendable_migration(MemoryRegion *mr) > +{ > +qemu_ram_set_extendable_on_migration(mr->ram_addr); > +} > + > MemoryRegionSection memory_region_find(MemoryRegion *mr, > hwaddr addr, uint64_t size) > { > Looks matching, and consistent with other callers of the 2nd family functions. OK. > diff --git a/include/exec/memory.h b/include/exec/memory.h > index e2c8e3e..6c03f70 100644 > --- a/include/exec/memory.h > +++ b/include/exec/memory.h > @@ -894,6 +894,16 @@ bool memory_region_present(MemoryRegion *container, > hwaddr addr); > bool memory_region_is_mapped(MemoryRegion *mr); > > /** > + * memory_region_permit_extendable_migration: allows to mark Please refer to commit 9
Re: [Qemu-devel] [PATCH for-2.1 2/2] acpi: mark ACPI tables ROM blob as extend-able on migration
On 07/25/14 17:48, Igor Mammedov wrote: > It fixes migration failure for machine type pc-i440fx-1.7 from > QEMU 1.7/2.0 to QEMU 2.1 > > Migration fails due to ACPI tables size grows across 1.7-2.1 > versions. That causes ACPI tables ROM blob to change its size > differently for the same configurations on different QEMU versions. > As result migration code bails out when attempting to load > smaller ROM blob into a bigger one on a newer QEMU. > To trigger failure it's enough to add pci-bridge device to QEMU CLI > > Marking ACPI tables ROM blob as extend-able on migration allows > QEMU to load smaller ACPI tables from QEMU 1.7/2.0, fixing > forward migration failure introduced since 2.0 which affects > only configurations that cause ACPI ROM blob size change. > > Signed-off-by: Igor Mammedov > --- > hw/core/loader.c | 6 +- > hw/i386/acpi-build.c | 2 +- > include/hw/loader.h | 5 +++-- > 3 files changed, 9 insertions(+), 4 deletions(-) > > diff --git a/hw/core/loader.c b/hw/core/loader.c > index 2bf6b8f..d09b894 100644 > --- a/hw/core/loader.c > +++ b/hw/core/loader.c > @@ -722,7 +722,8 @@ err: > > void *rom_add_blob(const char *name, const void *blob, size_t len, > hwaddr addr, const char *fw_file_name, > - FWCfgReadCallback fw_callback, void *callback_opaque) > + FWCfgReadCallback fw_callback, void *callback_opaque, > + bool extendable) > { > Rom *rom; > void *data = NULL; > @@ -742,6 +743,9 @@ void *rom_add_blob(const char *name, const void *blob, > size_t len, > > if (rom_file_has_mr) { > data = rom_set_mr(rom, OBJECT(fw_cfg), devpath); > +if (extendable) { > +memory_region_permit_extendable_migration(rom->mr); > +} > } else { > data = rom->data; > } > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c > index ebc5f03..651c06b 100644 > --- a/hw/i386/acpi-build.c > +++ b/hw/i386/acpi-build.c > @@ -1558,7 +1558,7 @@ static void *acpi_add_rom_blob(AcpiBuildState > *build_state, GArray *blob, > const char *name) > { > return rom_add_blob(name, blob->data, acpi_data_len(blob), -1, name, > -acpi_build_update, build_state); > +acpi_build_update, build_state, true); > } > > static const VMStateDescription vmstate_acpi_build = { > diff --git a/include/hw/loader.h b/include/hw/loader.h > index 796cbf9..e5a24ac 100644 > --- a/include/hw/loader.h > +++ b/include/hw/loader.h > @@ -57,7 +57,8 @@ int rom_add_file(const char *file, const char *fw_dir, > bool option_rom); > void *rom_add_blob(const char *name, const void *blob, size_t len, > hwaddr addr, const char *fw_file_name, > - FWCfgReadCallback fw_callback, void *callback_opaque); > + FWCfgReadCallback fw_callback, void *callback_opaque, > + bool extendable); > int rom_add_elf_program(const char *name, void *data, size_t datasize, > size_t romsize, hwaddr addr); > int rom_load_all(void); > @@ -70,7 +71,7 @@ void do_info_roms(Monitor *mon, const QDict *qdict); > #define rom_add_file_fixed(_f, _a, _i) \ > rom_add_file(_f, NULL, _a, _i, false) > #define rom_add_blob_fixed(_f, _b, _l, _a) \ > -rom_add_blob(_f, _b, _l, _a, NULL, NULL, NULL) > +rom_add_blob(_f, _b, _l, _a, NULL, NULL, NULL, false) > > #define PC_ROM_MIN_VGA 0xc > #define PC_ROM_MIN_OPTION 0xc8000 > Reviewed-by: Laszlo Ersek
[Qemu-devel] [PATCH 2/8] qemu-img: Add progress output for amend
Now that bdrv_amend_options() supports a status callback, use it to display a progress report. Signed-off-by: Max Reitz --- qemu-img.c | 26 +++--- 1 file changed, 23 insertions(+), 3 deletions(-) diff --git a/qemu-img.c b/qemu-img.c index 90d6b79..a06f425 100644 --- a/qemu-img.c +++ b/qemu-img.c @@ -2732,6 +2732,13 @@ out: return 0; } +static void amend_status_cb(BlockDriverState *bs, +int64_t offset, int64_t total_work_size) +{ +(void)bs; +qemu_progress_print(100.f * offset / total_work_size, 0); +} + static int img_amend(int argc, char **argv) { int c, ret = 0; @@ -2740,12 +2747,12 @@ static int img_amend(int argc, char **argv) QemuOpts *opts = NULL; const char *fmt = NULL, *filename, *cache; int flags; -bool quiet = false; +bool quiet = false, progress = false; BlockDriverState *bs = NULL; cache = BDRV_DEFAULT_CACHE; for (;;) { -c = getopt(argc, argv, "ho:f:t:q"); +c = getopt(argc, argv, "ho:f:t:pq"); if (c == -1) { break; } @@ -2775,6 +2782,9 @@ static int img_amend(int argc, char **argv) case 't': cache = optarg; break; +case 'p': +progress = true; +break; case 'q': quiet = true; break; @@ -2785,6 +2795,11 @@ static int img_amend(int argc, char **argv) error_exit("Must specify options (-o)"); } +if (quiet) { +progress = false; +} +qemu_progress_init(progress, 1.0); + filename = (optind == argc - 1) ? argv[argc - 1] : NULL; if (fmt && has_help_option(options)) { /* If a format is explicitly specified (and possibly no filename is @@ -2827,13 +2842,18 @@ static int img_amend(int argc, char **argv) goto out; } -ret = bdrv_amend_options(bs, opts, NULL); +/* In case the driver does not call amend_status_cb() */ +qemu_progress_print(0.f, 0); +ret = bdrv_amend_options(bs, opts, &amend_status_cb); +qemu_progress_print(100.f, 0); if (ret < 0) { error_report("Error while amending options: %s", strerror(-ret)); goto out; } out: +qemu_progress_end(); + if (bs) { bdrv_unref(bs); } -- 2.0.1
[Qemu-devel] [PATCH 0/8] block/qcow2: Improve (?) zero cluster expansion
The main purpose of this series is to add a progress report to qemu-img amend. This is achieved by adding a callback function to bdrv_amend_options() - the reasons for this choice are explained in patch 1. While adapting qcow2's expand_zero_clusters_in_l1() accordingly, I noticed a way to simplify it and get rid of the rather ugly bitmap used there (patch 6) and even found a way to optimize it (patch 7). However, Kevin already expressed strong dislike about that patch 7, as it complicates things in a function which should rarely ever be used. I personally don't have a strong opinion on the topic. The optimization should significantly increase the expansion performance; on the other hand, it makes the code equally more complicated. Patch 8 does not really depend on patch 7; it contains a test case specifically for patch 7, but of course it works without that patch just as well. Therefore, we can simply drop patch 7 if we don't need it. In this version, the total size of the "zero cluster expansion job" is the number of zero cluster entries in all L2 tables multiplied by the cluster size. This of course requires that number to be calculated beforehand, which is complicated as well. An easier way (suggested by Kevin) would be to simply use the number of all L2 entries as a basis. This would not be as exact, but much easier to implement. Since I already implemented the more complicated version of everything, I'm just sending it as-is. I'll write an alternative to patch 5 which uses the simpler variant and send it for comparison later. This series depends on v2 of my "qemu-img: Allow source cache mode specification" seires. Max Reitz (8): block: Add status callback to bdrv_amend_options() qemu-img: Add progress output for amend qemu-img: Fix insignifcant memleak block/qcow2: Make get_refcount() global block/qcow2: Implement status CB for amend block/qcow2: Simplify shared L2 handling in amend block/qcow2: Speed up zero cluster expansion iotests: Expand test 061 block.c| 5 +- block/qcow2-cluster.c | 319 ++--- block/qcow2-refcount.c | 23 ++-- block/qcow2.c | 10 +- block/qcow2.h | 5 +- include/block/block.h | 5 +- include/block/block_int.h | 3 +- qemu-img.c | 30 - tests/qemu-iotests/061 | 41 ++ tests/qemu-iotests/061.out | 40 ++ tests/qemu-iotests/group | 2 +- 11 files changed, 379 insertions(+), 104 deletions(-) -- 2.0.1
[Qemu-devel] [PATCH 5/8] block/qcow2: Implement status CB for amend
The only really time-consuming operation potentially performed by qcow2_amend_options() is zero cluster expansion when downgrading qcow2 images from compat=1.1 to compat=0.10, so report status of that operation and that operation only through the status CB. For this, count the number of L2 zero entries, use this as the basis for the total "amend job length" and increase the current offset by the cluster size multiplied by the refcount of the L2 table when expanding zero clusters. Signed-off-by: Max Reitz --- block/qcow2-cluster.c | 145 -- block/qcow2.c | 9 ++-- block/qcow2.h | 3 +- 3 files changed, 147 insertions(+), 10 deletions(-) diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c index 4208dc0..0f52ef6 100644 --- a/block/qcow2-cluster.c +++ b/block/qcow2-cluster.c @@ -1548,10 +1548,20 @@ fail: * zero expansion (i.e., has been filled with zeroes and is referenced from an * L2 table). nb_clusters contains the total cluster count of the image file, * i.e., the number of bits in expanded_clusters. + * + * zero_clusters_count and *expanded_count are used to keep track of progress + * for status_cb(). zero_clusters_count contains the total number of L2 zero + * cluster entries. Accordingly, *expanded_count counts all visited L2 zero + * cluster entries; shared L2 tables are counted accordingly, that is, for each + * L2 zero cluster entry the count is increased by the refcount of the L2 table + * cluster. */ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table, int l1_size, uint8_t **expanded_clusters, - uint64_t *nb_clusters) + uint64_t *nb_clusters, + int64_t *expanded_count, + int64_t zero_clusters_count, + BlockDriverAmendStatusCB *status_cb) { BDRVQcowState *s = bs->opaque; bool is_active_l1 = (l1_table == s->l1_table); @@ -1568,6 +1578,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table, for (i = 0; i < l1_size; i++) { uint64_t l2_offset = l1_table[i] & L1E_OFFSET_MASK; bool l2_dirty = false; +int l2_refcount; if (!l2_offset) { /* unallocated */ @@ -1587,6 +1598,12 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table, goto fail; } +l2_refcount = qcow2_get_refcount(bs, l2_offset >> s->cluster_bits); +if (l2_refcount < 0) { +ret = l2_refcount; +goto fail; +} + for (j = 0; j < s->l2_size; j++) { uint64_t l2_entry = be64_to_cpu(l2_table[j]); int64_t offset = l2_entry & L2E_OFFSET_MASK, cluster_index; @@ -1617,6 +1634,8 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table, continue; } +*expanded_count += l2_refcount; + if (!preallocated) { if (!bs->backing_hd) { /* not backed; therefore we can simply deallocate the @@ -1703,6 +1722,11 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table, } } } + +if (status_cb) { +status_cb(bs, *expanded_count << s->cluster_bits, + zero_clusters_count << s->cluster_bits); +} } ret = 0; @@ -1724,26 +1748,137 @@ fail: } /* + * Counts all zero clusters in a specific L1 table. + */ +static int64_t count_zero_clusters_in_l1(BlockDriverState *bs, + uint64_t *l1_table, int l1_size) +{ +BDRVQcowState *s = bs->opaque; +bool is_active_l1 = (l1_table == s->l1_table); +uint64_t *l2_table = NULL; +int64_t count = 0; +int ret; +int i, j; + +if (!is_active_l1) { +l2_table = qemu_blockalign(bs, s->cluster_size); +} + +for (i = 0; i < l1_size; i++) { +uint64_t l2_offset = l1_table[i] & L1E_OFFSET_MASK; + +if (!l2_offset) { +continue; +} + +if (is_active_l1) { +ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset, + (void **)&l2_table); +} else { +ret = bdrv_read(bs->file, l2_offset / BDRV_SECTOR_SIZE, +(void *)l2_table, s->cluster_sectors); +} +if (ret < 0) { +goto fail; +} + +for (j = 0; j < s->l2_size; j++) { +if (qcow2_get_cluster_type(be64_to_cpu(l2_table[j])) +== QCOW2_CLUSTER_ZERO) +{ +count++; +} +} + +if (is_active_l1) { +ret = qcow2_cache_put(bs, s->l2_table_cache, (void **)&l2_table); +l2_tabl
[Qemu-devel] [PATCH 4/8] block/qcow2: Make get_refcount() global
Reading the refcount of a cluster is an operation which can be useful in all of the qcow2 code, so make that function globally available. Signed-off-by: Max Reitz --- block/qcow2-refcount.c | 23 --- block/qcow2.h | 2 ++ 2 files changed, 14 insertions(+), 11 deletions(-) diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c index cc6cf74..0c9887b 100644 --- a/block/qcow2-refcount.c +++ b/block/qcow2-refcount.c @@ -87,7 +87,7 @@ static int load_refcount_block(BlockDriverState *bs, * return value is the refcount of the cluster, negative values are -errno * and indicate an error. */ -static int get_refcount(BlockDriverState *bs, int64_t cluster_index) +int qcow2_get_refcount(BlockDriverState *bs, int64_t cluster_index) { BDRVQcowState *s = bs->opaque; uint64_t refcount_table_index, block_index; @@ -625,7 +625,7 @@ int qcow2_update_cluster_refcount(BlockDriverState *bs, return ret; } -return get_refcount(bs, cluster_index); +return qcow2_get_refcount(bs, cluster_index); } @@ -646,7 +646,7 @@ static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size) retry: for(i = 0; i < nb_clusters; i++) { uint64_t next_cluster_index = s->free_cluster_index++; -refcount = get_refcount(bs, next_cluster_index); +refcount = qcow2_get_refcount(bs, next_cluster_index); if (refcount < 0) { return refcount; @@ -710,7 +710,7 @@ int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset, /* Check how many clusters there are free */ cluster_index = offset >> s->cluster_bits; for(i = 0; i < nb_clusters; i++) { -refcount = get_refcount(bs, cluster_index++); +refcount = qcow2_get_refcount(bs, cluster_index++); if (refcount < 0) { return refcount; @@ -927,7 +927,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs, cluster_index, addend, QCOW2_DISCARD_SNAPSHOT); } else { -refcount = get_refcount(bs, cluster_index); +refcount = qcow2_get_refcount(bs, cluster_index); } if (refcount < 0) { @@ -967,7 +967,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs, refcount = qcow2_update_cluster_refcount(bs, l2_offset >> s->cluster_bits, addend, QCOW2_DISCARD_SNAPSHOT); } else { -refcount = get_refcount(bs, l2_offset >> s->cluster_bits); +refcount = qcow2_get_refcount(bs, l2_offset >> s->cluster_bits); } if (refcount < 0) { ret = refcount; @@ -1243,8 +1243,8 @@ fail: * Checks the OFLAG_COPIED flag for all L1 and L2 entries. * * This function does not print an error message nor does it increment - * check_errors if get_refcount fails (this is because such an error will have - * been already detected and sufficiently signaled by the calling function + * check_errors if qcow2_get_refcount fails (this is because such an error will + * have been already detected and sufficiently signaled by the calling function * (qcow2_check_refcounts) by the time this function is called). */ static int check_oflag_copied(BlockDriverState *bs, BdrvCheckResult *res, @@ -1265,7 +1265,7 @@ static int check_oflag_copied(BlockDriverState *bs, BdrvCheckResult *res, continue; } -refcount = get_refcount(bs, l2_offset >> s->cluster_bits); +refcount = qcow2_get_refcount(bs, l2_offset >> s->cluster_bits); if (refcount < 0) { /* don't print message nor increment check_errors */ continue; @@ -1307,7 +1307,8 @@ static int check_oflag_copied(BlockDriverState *bs, BdrvCheckResult *res, if ((cluster_type == QCOW2_CLUSTER_NORMAL) || ((cluster_type == QCOW2_CLUSTER_ZERO) && (data_offset != 0))) { -refcount = get_refcount(bs, data_offset >> s->cluster_bits); +refcount = qcow2_get_refcount(bs, + data_offset >> s->cluster_bits); if (refcount < 0) { /* don't print message nor increment check_errors */ continue; @@ -1597,7 +1598,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res, /* compare ref counts */ for (i = 0, highest_cluster = 0; i < nb_clusters; i++) { -refcount1 = get_refcount(bs, i); +refcount1 = qcow2_get_refcount(bs, i); if (refcount1 < 0) { fprintf(stderr, "Can't get refcount for cluster %" PRId64 ": %s\n", i, strerror(-refcount1)); diff --git a/block/qcow2.h b/block/qcow2.h index b49424b..b423b71 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -4
[Qemu-devel] [PATCH 3/8] qemu-img: Fix insignifcant memleak
As soon as options is set in img_amend(), it needs to be freed before the function returns. This leak is rather insignifcant, as qemu-img will exit subsequently anyway, but there's no point in not fixing it. Signed-off-by: Max Reitz --- qemu-img.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/qemu-img.c b/qemu-img.c index a06f425..d73a68a 100644 --- a/qemu-img.c +++ b/qemu-img.c @@ -2809,7 +2809,9 @@ static int img_amend(int argc, char **argv) } if (optind != argc - 1) { -error_exit("Expecting one image file name"); +error_report("Expecting one image file name"); +ret = -1; +goto out; } flags = BDRV_O_FLAGS | BDRV_O_RDWR; -- 2.0.1
[Qemu-devel] [PATCH 7/8] block/qcow2: Speed up zero cluster expansion
Actually, we do not need to allocate a new data cluster for every zero cluster to be expanded: It is completely sufficient to rely on qcow2's COW part and instead create a single zero cluster and reuse it as much as possible. Signed-off-by: Max Reitz --- block/qcow2-cluster.c | 119 ++ 1 file changed, 92 insertions(+), 27 deletions(-) diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c index 905beb6..867db03 100644 --- a/block/qcow2-cluster.c +++ b/block/qcow2-cluster.c @@ -1558,6 +1558,9 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table, BDRVQcowState *s = bs->opaque; bool is_active_l1 = (l1_table == s->l1_table); uint64_t *l2_table = NULL; +int64_t zeroed_cluster_offset = 0; +int zeroed_cluster_refcount = 0; +int last_zeroed_cluster_l1i = 0, last_zeroed_cluster_l2i = 0; int ret; int i, j; @@ -1617,47 +1620,79 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table, continue; } -offset = qcow2_alloc_clusters(bs, s->cluster_size); -if (offset < 0) { -ret = offset; -goto fail; +if (zeroed_cluster_offset) { +zeroed_cluster_refcount += l2_refcount; +if (zeroed_cluster_refcount > 0x) { +zeroed_cluster_refcount = 0; +zeroed_cluster_offset = 0; +} } +if (!zeroed_cluster_offset) { +offset = qcow2_alloc_clusters(bs, s->cluster_size); +if (offset < 0) { +ret = offset; +goto fail; +} -if (l2_refcount > 1) { -/* For shared L2 tables, set the refcount accordingly (it is - * already 1 and needs to be l2_refcount) */ -ret = qcow2_update_cluster_refcount(bs, -offset >> s->cluster_bits, l2_refcount - 1, -QCOW2_DISCARD_OTHER); +ret = qcow2_pre_write_overlap_check(bs, 0, offset, +s->cluster_size); +if (ret < 0) { +qcow2_free_clusters(bs, offset, s->cluster_size, +QCOW2_DISCARD_OTHER); +goto fail; +} + +ret = bdrv_write_zeroes(bs->file, offset / BDRV_SECTOR_SIZE, +s->cluster_sectors, 0); if (ret < 0) { qcow2_free_clusters(bs, offset, s->cluster_size, QCOW2_DISCARD_OTHER); goto fail; } + +if (l2_refcount > 1) { +ret = qcow2_update_cluster_refcount(bs, +offset >> s->cluster_bits, l2_refcount - 1, +QCOW2_DISCARD_OTHER); +if (ret < 0) { +qcow2_free_clusters(bs, offset, s->cluster_size, +QCOW2_DISCARD_OTHER); +goto fail; +} +} + +zeroed_cluster_offset = offset; +zeroed_cluster_refcount = l2_refcount; +} else { +ret = qcow2_update_cluster_refcount(bs, +zeroed_cluster_offset >> s->cluster_bits, +l2_refcount, QCOW2_DISCARD_OTHER); +if (ret < 0) { +goto fail; +} } + +offset = zeroed_cluster_offset; +last_zeroed_cluster_l1i = i; +last_zeroed_cluster_l2i = j; } -ret = qcow2_pre_write_overlap_check(bs, 0, offset, s->cluster_size); -if (ret < 0) { -if (!preallocated) { -qcow2_free_clusters(bs, offset, s->cluster_size, -QCOW2_DISCARD_ALWAYS); +if (preallocated) { +ret = qcow2_pre_write_overlap_check(bs, 0, offset, +s->cluster_size); +if (ret < 0) { +goto fail; } -goto fail; -} -ret = bdrv_write_zeroes(bs->file, offset / BDRV_SECTOR_SIZE, -s->cluster_sectors, 0); -if (ret < 0) { -if (!preallocated) { -qcow2_free_clusters(bs, offset, s->cluster_size, -