[Xen-devel] [distros-debian-sid test] 66791: trouble: blocked/broken
flight 66791 distros-debian-sid real [real] http://osstest.xs.citrite.net/~osstest/testlogs/logs/66791/ Failures and problems with tests :-( Tests which did not succeed and are blocking, including tests which could not be run: build-armhf-pvops 4 capture-logsbroken REGR. vs. 66612 build-armhf 4 capture-logsbroken REGR. vs. 66612 Regressions which are regarded as allowable (not blocking): build-armhf-pvops 3 host-install(3) broken like 66612 build-armhf 3 host-install(3) broken like 66612 build-amd64 3 host-install(3) broken like 66612 build-amd64-pvops 3 host-install(3) broken like 66612 build-i3863 host-install(3) broken like 66612 build-i386-pvops 3 host-install(3) broken like 66612 Tests which did not succeed, but are not blocking: test-armhf-armhf-armhf-sid-netboot-pygrub 1 build-check(1)blocked n/a test-amd64-amd64-i386-sid-netboot-pygrub 1 build-check(1) blocked n/a test-amd64-amd64-amd64-sid-netboot-pvgrub 1 build-check(1)blocked n/a test-amd64-i386-i386-sid-netboot-pvgrub 1 build-check(1) blocked n/a test-amd64-i386-amd64-sid-netboot-pygrub 1 build-check(1) blocked n/a baseline version: flight 66612 jobs: build-amd64 broken build-armhf broken build-i386 broken build-amd64-pvopsbroken build-armhf-pvopsbroken build-i386-pvops broken test-amd64-amd64-amd64-sid-netboot-pvgrubblocked test-amd64-i386-i386-sid-netboot-pvgrub blocked test-amd64-i386-amd64-sid-netboot-pygrub blocked test-armhf-armhf-armhf-sid-netboot-pygrubblocked test-amd64-amd64-i386-sid-netboot-pygrub blocked sg-report-flight on osstest.xs.citrite.net logs: /home/osstest/logs images: /home/osstest/images Logs, config files, etc. are available at http://osstest.xs.citrite.net/~osstest/testlogs/logs Test harness code can be found at http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary Push not applicable. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] page content
Hi, Is there an API in Xen to get the page content of a known gfn ? I want to calculate the checksum of the content of the page and wanna see if it is possible ? ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] page content
On 07/25/2016 11:29 AM, sepanta s wrote: > Hi, > Is there an API in Xen to get the page content of a known gfn ? > I want to calculate the checksum of the content of the page and wanna > see if it > is possible ? You can map the page with xc_map_foreign_range(), then just inspect the memory as if it's local to your application. Cheers, Razvan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 16/18] arm/altp2m: Extended libxl to activate altp2m on ARM.
On Sun, Jul 24, 2016 at 06:06:00PM +0200, Sergej Proskurin wrote: > Hi Wei, > > On 07/07/2016 06:27 PM, Wei Liu wrote: > > On Mon, Jul 04, 2016 at 01:45:45PM +0200, Sergej Proskurin wrote: > >> The current implementation allows to set the parameter HVM_PARAM_ALTP2M. > >> This parameter allows further usage of altp2m on ARM. For this, we > >> define an additional altp2m field for PV domains as part of the > >> libxl_domain_type struct. This field can be set only on ARM systems > >> through the "altp2m" switch in the domain's configuration file (i.e. > >> set altp2m=1). > >> > >> Signed-off-by: Sergej Proskurin > >> --- > >> Cc: Ian Jackson > >> Cc: Wei Liu > >> --- > >> tools/libxl/libxl_create.c | 1 + > >> tools/libxl/libxl_dom.c | 14 ++ > >> tools/libxl/libxl_types.idl | 1 + > >> tools/libxl/xl_cmdimpl.c| 5 + > >> 4 files changed, 21 insertions(+) > >> > >> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c > >> index 1b99472..40b5f61 100644 > >> --- a/tools/libxl/libxl_create.c > >> +++ b/tools/libxl/libxl_create.c > >> @@ -400,6 +400,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc, > >> b_info->cmdline = b_info->u.pv.cmdline; > >> b_info->u.pv.cmdline = NULL; > >> } > >> +libxl_defbool_setdefault(&b_info->u.pv.altp2m, false); > >> break; > >> default: > >> LOG(ERROR, "invalid domain type %s in create info", > >> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c > >> index ec29060..ab023a2 100644 > >> --- a/tools/libxl/libxl_dom.c > >> +++ b/tools/libxl/libxl_dom.c > >> @@ -277,6 +277,16 @@ err: > >> } > >> #endif > >> > >> +#if defined(__arm__) || defined(__aarch64__) > >> +static void pv_set_conf_params(xc_interface *handle, uint32_t domid, > >> + libxl_domain_build_info *const info) > >> +{ > >> +if ( libxl_defbool_val(info->u.pv.altp2m) ) > > > > Coding style. > > > > I am not sure which part does not fulfill Xen's coding style. Could you > be more specific please? Thank you. > There is a CODING_STYLE file in libxl as well. The coding style in tools/ is generally different from the coding style in hypervisor with the exception of libxc. Sorry for being too terse on this. Do ask questions! :-) > >> +xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2M, > >> + libxl_defbool_val(info->u.pv.altp2m)); > >> +} > >> +#endif > >> + > >> static void hvm_set_conf_params(xc_interface *handle, uint32_t domid, > >> libxl_domain_build_info *const info) > >> { > >> @@ -433,6 +443,10 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid, > >> return rc; > >> #endif > >> } > >> +#if defined(__arm__) || defined(__aarch64__) > >> +else /* info->type == LIBXL_DOMAIN_TYPE_PV */ > >> +pv_set_conf_params(ctx->xch, domid, info); > >> +#endif > >> > >> rc = libxl__arch_domain_create(gc, d_config, domid); > >> > >> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl > >> index ef614be..0a164f9 100644 > >> --- a/tools/libxl/libxl_types.idl > >> +++ b/tools/libxl/libxl_types.idl > >> @@ -554,6 +554,7 @@ libxl_domain_build_info = Struct("domain_build_info",[ > >>("features", string, {'const': > >> True}), > >># Use host's E820 for PCI > >> passthrough. > >>("e820_host", libxl_defbool), > >> + ("altp2m", libxl_defbool), > > > > Why is this placed in PV instead of arch_arm? > > > > The current implementation mirrors the x86's altp2m, where the altp2m > field represents a property of HVM guests in b_info->u.hvm.altp2m. Since > guests on ARM are marked as PV, I have placed the field altp2m into > b_info->u.pv.altp2m. > > However, if you believe that it would make more sense to place altp2m > into b_info->arch_arm.altp2m, I will try to adapt the affected fields. > OK. I don't think that one should be placed in u.pv because that union is for x86. However it is also not suitable to just promote that to a common field because x86 pv doesn't have altp2m support. I think arch_arm would be a better location. > > You would also need a LIBXL_HAVE macro in libxl.h for the new field. > > > > There is already a LIBXL_HAVE_ALTP2M macro defined in libxl.h. Or did > you mean using an explicit one for ARM? > Yes, we need a new one for ARM. Wei. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 16/17] libxc/xc_dom_arm: Copy ACPI tables to guest space
On Thu, Jul 21, 2016 at 10:15 PM, Stefano Stabellini wrote: >> You are assuming that the guest will map the ACPI blob with the same >> attributes as the rest of the superpage. >> >> IHMO, a sane operating system will want to map the ACPI blob read-only. > > That's true. But there are other things which might be mapped > differently and could shatter a stage-1 superpage mapping (especially on > x86 that has a much more complex memory map than ARM). Obviously adding > one more is not doing it any good, but it might not make a difference in > practice. > > Anyway, I agree with Julien that his suggestion is the best for ARM. If > the libxl maintainers are willing to accept two different code paths for > this on ARM and x86, then I am fine with it too. Sorry to be a bit late to this thread -- there's a interface principle that I think we should at some point have a larger discussion about: whether "maxmem" means the amount of RAM which the guest sees as RAM, or whether "maxmem" means the amount of RAM that the administrator sees as used by the guest. At the moment tnhere's no consistent answer actually; but I am strongly of the opinion that for usability the best answer is for "memory" to be the *total* amount of *host* memory used by the guest. In an ideal world, the admin should be able to do "xl info", see that there is 3000MiB free, and then start a guest with 3000MiB and expect it to succeed. At the moment he has to guess. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 16/17] libxc/xc_dom_arm: Copy ACPI tables to guest space
On 2016/7/20 17:32, Wei Liu wrote: > On Wed, Jul 20, 2016 at 02:52:05PM +0800, Shannon Zhao wrote: >> > >> > >> > On 2016/7/19 18:38, Wei Liu wrote: >>> > > On Fri, Jul 15, 2016 at 05:39:32PM +0800, Shannon Zhao wrote: >>> > > [...] >>> > >>> > > >>> > >>> > > It would be trivial to have another option in xl.cfg to allow >>> > >>> > > MB >>> > >>> > > granularity. But I don't think that's a good idea. Asking for >>> > >>> > > more >>> > >>> > > memory when you don't really know how much is enough is not >>> > >>> > > very useful. >>> > >>> > > If an admin can know how much is needed, surely the library >>> > >>> > > can be >>> > >>> > > taught to obtain that knowledge, too. >>> > >>> > > >>> > >>> > > We need to decide which model we should go with. And, if we >>> > >>> > > decide to >>> > >>> > > diverge, document the difference between x86 and ARM model. >>> > >>> > > > > >> > Hi Wei, > > >> > > > >> > Do you decide how to add the size of ACPI blob to max_memkb? > > >> > >>> > > AFAICT ARM and x86 maintainers hold different opinions on how memory >>> > > should be accounted. >>> > > >>> > > I would like to have a unified memory accounting model. But if we can't >>> > > have that at the moment, I'm fine with divergence, but please document >>> > > it somewhere (comment near code snippet, in header, or a file under docs >>> > > etc). And the amount added to max_memkb needs to be properly calculated, >>> > > not some magic number, so that we have a chance in the future to >>> > > confidently change how we do thing. >> > If it's only allowed to add the size to max_memkb in >> > libxl__domain_build_info_setdefault(), it only can use a const number >> > since the tables are not generted and we don;t know the size. But the >> > const number could be chosen properly since we could know the maximum >> > ACPI tables size of current ARM approach. >> > >> > But maybe in the future, if we add some new ACPI tables, it should >> > increase the size as well. So I think this should be documented. >> > >> > As I asked before, is it ok to add the size to max_memkb after >> > generating the ACPI tables and before loading the tables to guest space? >> > > Yes. I don't think shoehorning everything into setdefault is a good > idea. > > I think libxl_arm.c:libxl__arch_domain_create would be the right place > to do it. > > I am thinking about calling xc_domain_setmaxmem there, but not adding a > number to d_config->b_info.max_memkb. Adding that to ->max_memkb would > be wrong because the bigger ->max_memkb will be recored and the same > algorithm will be applied every time you migrate your guest, so the > max_memkb will grow bigger and bigger. > > Given the different approach taken by ARM and x86, maybe we need to also > record the size of acpi blobs somewhere in xenstore (also needs to be > documented) so that subsequent libxl_domain_setmaxmem can extract that > number again. > > Please wait a bit for Stefano and Julien to comment before you do work. > Stefano, Julien, Any comments regarding how to add the ACPI size to max_memkb? Thanks, -- Shannon ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [xen-unstable test] 97737: regressions - FAIL
On Fri, Jul 22, 2016 at 03:27:30AM +, osstest service owner wrote: > flight 97737 xen-unstable real [real] > http://logs.test-lab.xenproject.org/osstest/logs/97737/ > > Regressions :-( > > Tests which did not succeed and are blocking, > including tests which could not be run: > test-armhf-armhf-xl 15 guest-start/debian.repeat fail REGR. vs. > 97664 From http://logs.test-lab.xenproject.org/osstest/logs/97737/test-armhf-armhf-xl/serial-cubietruck-picasso.log Jul 21 17:08:59.405183 [ 4479.814529] [ cut here ] Jul 21 17:09:16.961529 [ 4479.814600] kernel BUG at drivers/xen/grant-table.c:923! Jul 21 17:09:16.966838 [ 4479.814628] Internal error: Oops - BUG: 0 [#1] SMP ARM Jul 21 17:09:16.972090 [ 4479.814656] Modules linked in: xen_gntalloc bridge stp ipv6 llc brcmfmac brcmutil cfg80211 Jul 21 17:09:16.980340 [ 4479.814759] CPU: 1 PID: 24761 Comm: vif5.0-q0-guest Not tainted 3.16.7-ckt12+ #1 Jul 21 17:09:16.987841 [ 4479.814795] task: d8ef7600 ti: d85bc000 task.ti: d85bc000 Jul 21 17:09:16.993339 [ 4479.814833] PC is at gnttab_batch_copy+0xd0/0xe4 Jul 21 17:09:16.997963 [ 4479.814860] LR is at gnttab_batch_copy+0x1c/0xe4 Jul 21 17:09:17.002718 [ 4479.814888] pc : []lr : [] psr: a0070013 Jul 21 17:09:17.008962 [ 4479.814888] sp : d85bdea0 ip : deadbeef fp : c0c8e140 Jul 21 17:09:17.014341 [ 4479.814935] r10: r9 : e1bec000 r8 : Jul 21 17:09:17.019595 [ 4479.814960] r7 : 0002 r6 : 0002 r5 : d85bdf20 r4 : e1bf4d30 Jul 21 17:09:17.026095 [ 4479.814990] r3 : 0001 r2 : deadbeef r1 : deadbeef r0 : fff2 Jul 21 17:09:17.032717 [ 4479.815021] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel Jul 21 17:09:17.040091 [ 4479.815055] Control: 10c5387d Table: 78d8406a DAC: 0015 Jul 21 17:09:17.045964 [ 4479.815084] Process vif5.0-q0-guest (pid: 24761, stack limit = 0xd85bc248) Jul 21 17:09:17.052840 [ 4479.815114] Stack: (0xd85bdea0 to 0xd85be000) Jul 21 17:09:17.057218 [ 4479.815145] dea0: 0001 d8b11388 d85bdf20 d85bdf04 0002 c05eb054 0388 Jul 21 17:09:17.065469 [ 4479.815183] dec0: d85bdf04 c0b7ea80 db0995c0 c05e86e4 e1bf4000 003c Jul 21 17:09:17.073753 [ 4479.815221] dee0: c0b8849c e1bf4cfc c0c8e140 e1bf4d30 e1bf4cc4 Jul 21 17:09:17.082001 [ 4479.815260] df00: db0c3e80 d85bdf08 d85bdf08 d8c5cb40 d8c5cb40 0001 Jul 21 17:09:17.090217 [ 4479.815298] df20: 0002 0001 e1bf4d30 e1c1f530 04c6 023c Jul 21 17:09:17.098466 [ 4479.815337] df40: d84aab80 e1bec000 c05ea990 Jul 21 17:09:17.106720 [ 4479.815375] df60: c0266238 00f8 e1bec000 Jul 21 17:09:17.114844 [ 4479.815414] df80: d85bdf80 d85bdf80 d85bdf90 d85bdf90 d85bdfac d84aab80 Jul 21 17:09:17.123093 [ 4479.815451] dfa0: c0266168 c020f138 Jul 21 17:09:17.131345 [ 4479.815489] dfc0: Jul 21 17:09:17.139596 [ 4479.815527] dfe0: 0013 Jul 21 17:09:17.147841 [ 4479.815583] [] (gnttab_batch_copy) from [] (xenvif_kthread_guest_rx+0x6c4/0xb58) Jul 21 17:09:17.156969 [ 4479.815636] [] (xenvif_kthread_guest_rx) from [] (kthread+0xd0/0xe8) Jul 21 17:09:17.165217 [ 4479.815681] [] (kthread) from [] (ret_from_fork+0x14/0x3c) Jul 21 17:09:17.172467 [ 4479.815721] Code: e1c432b4 eae0 e7f001f2 e8bd80f8 (e7f001f2) Jul 21 17:09:17.178595 [ 4479.815766] ---[ end trace 6ba7d172d52e24e2 ]--- ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 16/18] arm/altp2m: Extended libxl to activate altp2m on ARM.
Hi Wei, On 07/25/2016 10:32 AM, Wei Liu wrote: > On Sun, Jul 24, 2016 at 06:06:00PM +0200, Sergej Proskurin wrote: >> Hi Wei, >> >> On 07/07/2016 06:27 PM, Wei Liu wrote: >>> On Mon, Jul 04, 2016 at 01:45:45PM +0200, Sergej Proskurin wrote: The current implementation allows to set the parameter HVM_PARAM_ALTP2M. This parameter allows further usage of altp2m on ARM. For this, we define an additional altp2m field for PV domains as part of the libxl_domain_type struct. This field can be set only on ARM systems through the "altp2m" switch in the domain's configuration file (i.e. set altp2m=1). Signed-off-by: Sergej Proskurin --- Cc: Ian Jackson Cc: Wei Liu --- tools/libxl/libxl_create.c | 1 + tools/libxl/libxl_dom.c | 14 ++ tools/libxl/libxl_types.idl | 1 + tools/libxl/xl_cmdimpl.c| 5 + 4 files changed, 21 insertions(+) diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index 1b99472..40b5f61 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -400,6 +400,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc, b_info->cmdline = b_info->u.pv.cmdline; b_info->u.pv.cmdline = NULL; } +libxl_defbool_setdefault(&b_info->u.pv.altp2m, false); break; default: LOG(ERROR, "invalid domain type %s in create info", diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c index ec29060..ab023a2 100644 --- a/tools/libxl/libxl_dom.c +++ b/tools/libxl/libxl_dom.c @@ -277,6 +277,16 @@ err: } #endif +#if defined(__arm__) || defined(__aarch64__) +static void pv_set_conf_params(xc_interface *handle, uint32_t domid, + libxl_domain_build_info *const info) +{ +if ( libxl_defbool_val(info->u.pv.altp2m) ) >>> Coding style. >>> >> I am not sure which part does not fulfill Xen's coding style. Could you >> be more specific please? Thank you. >> > There is a CODING_STYLE file in libxl as well. The coding style in > tools/ is generally different from the coding style in hypervisor with > the exception of libxc. > > Sorry for being too terse on this. Do ask questions! :-) It's all good, no worries :) Thank you for the hint (it must be the indentation inside the if-clause brackets). I was still in the Xen coding style, when I wrote this part: Now, I know better, thank you :) +xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2M, + libxl_defbool_val(info->u.pv.altp2m)); +} +#endif + static void hvm_set_conf_params(xc_interface *handle, uint32_t domid, libxl_domain_build_info *const info) { @@ -433,6 +443,10 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid, return rc; #endif } +#if defined(__arm__) || defined(__aarch64__) +else /* info->type == LIBXL_DOMAIN_TYPE_PV */ +pv_set_conf_params(ctx->xch, domid, info); +#endif rc = libxl__arch_domain_create(gc, d_config, domid); diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index ef614be..0a164f9 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -554,6 +554,7 @@ libxl_domain_build_info = Struct("domain_build_info",[ ("features", string, {'const': True}), # Use host's E820 for PCI passthrough. ("e820_host", libxl_defbool), + ("altp2m", libxl_defbool), >>> Why is this placed in PV instead of arch_arm? >>> >> The current implementation mirrors the x86's altp2m, where the altp2m >> field represents a property of HVM guests in b_info->u.hvm.altp2m. Since >> guests on ARM are marked as PV, I have placed the field altp2m into >> b_info->u.pv.altp2m. >> >> However, if you believe that it would make more sense to place altp2m >> into b_info->arch_arm.altp2m, I will try to adapt the affected fields. >> > OK. I don't think that one should be placed in u.pv because that union > is for x86. However it is also not suitable to just promote that to > a common field because x86 pv doesn't have altp2m support. > > I think arch_arm would be a better location. > Alright, I will move the field altp2m to arch_arm and adopt the initialization routines. >>> You would also need a LIBXL_HAVE macro in libxl.h for the new field. >>> >> There is already a LIBXL_HAVE_ALTP2M macro defined in libxl.h. Or did >> you mean using an explicit one for ARM? >> > Yes, we need a new one for ARM. > Ok. I will provide a LIBXL_HAVE_ARM_ALTP2M macro. Thank you. Best regards, ~Serg
Re: [Xen-devel] [PATCH linux v2 0/9] xen: pvhvm: support bootup on secondary vCPUs
Vitaly Kuznetsov writes: > It may happen that Xen's and Linux's ideas of vCPU id diverge. In > particular, when we crash on a secondary vCPU we may want to do kdump > and unlike plain kexec where we do migrate_to_reboot_cpu() we try booting > on the vCPU which crashed. This doesn't work very well for PVHVM guests as > we have a number of hypercalls where we pass vCPU id as a parameter. These > hypercalls either fail or do something unexpected. To solve the issue we > need to have a mapping between Linux's and Xen's vCPU ids. > > This series solves the issue for x86 PVHVM guests. PV guests don't (and > probably won't) support kdump so I always assume Xen's vCPU id == Linux's > vCPU id. ARM guests will probably need to get proper mapping once we start > supporting kexec/kdump there. > > Changes since v1: > - "x86/acpi: store ACPI ids from MADT for future usage" patch added. > - Use ACPI ids instead of vLAPIC ids/2 [Andrew Cooper, Jan Beulich] > - Introduce xen_vcpu_nr() helper [David Vrabel]. > - Modify all callers of HYPERVISOR_vcpu_op() instead of modifying > HYPERVISOR_vcpu_op() [David Vrabel] David, Boris, I didn't get any feedback for the v2 of this patchset, is it ready to go in for 4.8 or is there anything else I need to do to make this happen? BTW, the patch describing the de facto "ACPI id == Xen's VCPU id" policy is in xen.git: commit ea210c52abb6458e39f5365f7f2c3abb9c191c47 Author: Vitaly Kuznetsov Date: Tue Jul 12 13:44:42 2016 +0200 x86, hvm: document the de facto policy for vCPU ids Thanks, -- Vitaly ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/2] systemd: use standard dependencies for xendriverdomain.service
On Sun, Jul 24, 2016 at 09:26:57PM +0200, Marek Marczykowski-Górecki wrote: > Having DefaultDependencies=no means it can be started before / is > remounted read-write, which will result in various failures (to start > with opening the log). > Since "libxl: trigger attach events for devices attached before xl devd > startup" it is no longer important to start it as early as possible, > because it will process devices created before its startup. > > Cc: Ian Jackson > Cc: Wei Liu > Signed-off-by: Marek Marczykowski-Górecki The reasoning makes sense. Acked-by: Wei Liu Also CC Rusty (author of this file) > --- > tools/hotplug/Linux/systemd/xendriverdomain.service.in | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/tools/hotplug/Linux/systemd/xendriverdomain.service.in > b/tools/hotplug/Linux/systemd/xendriverdomain.service.in > index c0cd454..0afb54d 100644 > --- a/tools/hotplug/Linux/systemd/xendriverdomain.service.in > +++ b/tools/hotplug/Linux/systemd/xendriverdomain.service.in > @@ -1,6 +1,5 @@ > [Unit] > Description=Xen driver domain device daemon > -DefaultDependencies=no > Requires=proc-xen.mount > After=proc-xen.mount > ConditionVirtualization=xen > -- > 2.5.5 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/3] xen-blkfront: dynamic configuration of per-vbd resources
On Sat, Jul 23, 2016 at 06:18:23AM +0800, Bob Liu wrote: > > On 07/22/2016 07:45 PM, Roger Pau Monné wrote: > > On Fri, Jul 22, 2016 at 05:43:32PM +0800, Bob Liu wrote: > >> > >> On 07/22/2016 05:34 PM, Roger Pau Monné wrote: > >>> On Fri, Jul 22, 2016 at 04:17:48PM +0800, Bob Liu wrote: > > On 07/22/2016 03:45 PM, Roger Pau Monné wrote: > > On Thu, Jul 21, 2016 at 06:08:05PM +0800, Bob Liu wrote: > >> > >> On 07/21/2016 04:57 PM, Roger Pau Monné wrote: > ..[snip].. > + > +static ssize_t dynamic_reconfig_device(struct blkfront_info *info, > ssize_t count) > +{ > +unsigned int i; > +int err = -EBUSY; > + > +/* > + * Make sure no migration in parallel, device lock is actually a > + * mutex. > + */ > +if (!device_trylock(&info->xbdev->dev)) { > +pr_err("Fail to acquire dev:%s lock, may be in > migration.\n", > +dev_name(&info->xbdev->dev)); > +return err; > +} > + > +/* > + * Prevent new requests and guarantee no uncompleted reqs. > + */ > +blk_mq_freeze_queue(info->rq); > +if (part_in_flight(&info->gd->part0)) > +goto out; > + > +/* > + * FrontBackend > + * Switch to XenbusStateClosed > + * frontend_changed(): > + * case XenbusStateClosed: > + * > xen_blkif_disconnect() > + * Switch to > XenbusStateClosed > + * blkfront_resume(): > + * frontend_changed(): > + * reconnect > + * Wait until XenbusStateConnected > + */ > +info->reconfiguring = true; > +xenbus_switch_state(info->xbdev, XenbusStateClosed); > + > +/* Poll every 100ms, 1 minute timeout. */ > +for (i = 0; i < 600; i++) { > +/* > + * Wait backend enter XenbusStateClosed, > blkback_changed() > + * will clear reconfiguring. > + */ > +if (!info->reconfiguring) > +goto resume; > +schedule_timeout_interruptible(msecs_to_jiffies(100)); > +} > >>> > >>> Instead of having this wait, could you just set info->reconfiguring = > >>> 1, set > >>> the frontend state to XenbusStateClosed and mimic exactly what a > >>> resume from > >>> suspension does? blkback_changed would have to set the frontend state > >>> to > >>> InitWait when it detects that the backend has switched to Closed, and > >>> call > >>> blkfront_resume. > >> > >> > >> I think that won't work. > >> In the real "resume" case, the power management system will trigger > >> all ->resume() path. > >> But there is no place for dynamic configuration. > > > > Hello, > > > > I think it should be possible to set info->reconfiguring and wait for > > the > > backend to switch to state Closed, at that point we should call > > blkif_resume > > (from blkback_changed) and the backend will follow the reconection. > > > > Okay, I get your point. Yes, that's an option. > > But this will make 'dynamic configuration' to be async, I'm worry about > the end-user will get panic. > E.g > A end-user "echo > /sys/devices/vbd-xxx/max_indirect_segs", > but then the device will be Closed and disappeared, the user have to > wait for a random time so that the device can resume. > >>> > >>> That should not happen, AFAICT on migration the device never dissapears. > >> > >> Oh, yes. > >> > >>> alloc_disk and friends should not be called on resume from migration (see > >>> the switch in blkfront_connect, you should take the BLKIF_STATE_SUSPENDED > >>> path for the reconfiguration). > >>> > >> > >> What about if the end-user starts I/O immediately after writing new value > >> to /sys? > >> But the resume is still in progress. > > > > blkif_free already stops the queues by calling blk_mq_stop_hw_queues, and > > blkif_queue_request will refuse to put anything on the ring if the state > > is different than connected, which in turn makes blkif_queue_rq call > > blk_mq_stop_hw_queue to stop the queue, so it should be safe. > > > > But this will surprise the end-user, our user script is like this: > 1) echo > /sys/ > 2) Start I/O im
Re: [Xen-devel] [PATCH 2/2] systemd: drop dependency on proc-xen.mount in xendriverdomain.service
On Sun, Jul 24, 2016 at 09:27:13PM +0200, Marek Marczykowski-Górecki wrote: > It is no longer required since xl devd use /dev/xen interface. > How would this unit work when there is no /dev/xen interface? To be precise, we prefer /dev/xen interfaces whenever possible but there is a fallback to /proc/xen. Note that a lot of other unit files have this dependency on proc-xen.mount. I'm inclined to say we should keep this dependency but I'm not sure if I missed some obvious things. > Cc: Ian Jackson > Cc: Wei Liu > Signed-off-by: Marek Marczykowski-Górecki > --- > tools/hotplug/Linux/systemd/xendriverdomain.service.in | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/tools/hotplug/Linux/systemd/xendriverdomain.service.in > b/tools/hotplug/Linux/systemd/xendriverdomain.service.in > index 0afb54d..a100309 100644 > --- a/tools/hotplug/Linux/systemd/xendriverdomain.service.in > +++ b/tools/hotplug/Linux/systemd/xendriverdomain.service.in > @@ -1,7 +1,5 @@ > [Unit] > Description=Xen driver domain device daemon > -Requires=proc-xen.mount > -After=proc-xen.mount > ConditionVirtualization=xen > > [Service] > -- > 2.5.5 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4] x86/cpuid: AVX-512 Feature Detection
On 25/07/16 07:09, Kang, Luwei wrote: First of all - please don't top post. > What about remove the dependency between AVX2 and AVX512F >> ( AVX2: [AVX512F], ) ? Yes, that's what I think we want, but we need Andrew's agreement here. >>> Hi Andrew, what is your opinion ? >> You are in a better position to answer than me. >> >> For a specific instruction which may be VEX and EVEX encoded, is the >> circuitry >> for a specific instruction shared, or discrete? Is there a plausible future >> where you might support only the EVEX variant of an instruction, and not the >> VEX variant? >> >> These dependences are about what may be reasonably assumed about the >> way the environment is structured. It doesn't look reasonable to advertise >> an AVX512 environment to guests while at the same time stating that AVX2 is >> not present. If this is correct, then the dependency should stay. >> If Intel plausibly things it might release hardware with !AVX2 but AVX512, >> then the dependency should be dropped. > Regarding the dependency between AVX2 and AVX512F, I have ask some hardware > architecture engineer. > > AVX512 is a superset of AVX2, the most important item being the state. If the > state for AVX2 isn't enabled (in XCR0), then AVX512 also can't function. > > So if we want to use AVX512, AVX2 must be supported and enabled. The > dependence between AVX512F and AVX2 may need be reserved. Ok, so AVX512 strictly depends on AVX2. In which case, the python code was correct. The meaning of the key/value relationship is "if the feature in the key is not present, all features in the value must also be disabled". ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/2] systemd: drop dependency on proc-xen.mount in xendriverdomain.service
On Mon, Jul 25, 2016 at 10:27:47AM +0100, Wei Liu wrote: > On Sun, Jul 24, 2016 at 09:27:13PM +0200, Marek Marczykowski-Górecki wrote: > > It is no longer required since xl devd use /dev/xen interface. > > > > How would this unit work when there is no /dev/xen interface? Does it happen in reality? I thought /proc/xen is deprecated for a long time... > To be precise, we prefer /dev/xen interfaces whenever possible but there > is a fallback to /proc/xen. Note that a lot of other unit files have > this dependency on proc-xen.mount. > > I'm inclined to say we should keep this dependency but I'm not sure if I > missed some obvious things. > > > Cc: Ian Jackson > > Cc: Wei Liu > > Signed-off-by: Marek Marczykowski-Górecki > > --- > > tools/hotplug/Linux/systemd/xendriverdomain.service.in | 2 -- > > 1 file changed, 2 deletions(-) > > > > diff --git a/tools/hotplug/Linux/systemd/xendriverdomain.service.in > > b/tools/hotplug/Linux/systemd/xendriverdomain.service.in > > index 0afb54d..a100309 100644 > > --- a/tools/hotplug/Linux/systemd/xendriverdomain.service.in > > +++ b/tools/hotplug/Linux/systemd/xendriverdomain.service.in > > @@ -1,7 +1,5 @@ > > [Unit] > > Description=Xen driver domain device daemon > > -Requires=proc-xen.mount > > -After=proc-xen.mount > > ConditionVirtualization=xen > > > > [Service] > > -- > > 2.5.5 > > -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? signature.asc Description: PGP signature ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/2] systemd: drop dependency on proc-xen.mount in xendriverdomain.service
On Mon, Jul 25, 2016 at 11:30:32AM +0200, Marek Marczykowski-Górecki wrote: > On Mon, Jul 25, 2016 at 10:27:47AM +0100, Wei Liu wrote: > > On Sun, Jul 24, 2016 at 09:27:13PM +0200, Marek Marczykowski-Górecki wrote: > > > It is no longer required since xl devd use /dev/xen interface. > > > > > > > How would this unit work when there is no /dev/xen interface? > > Does it happen in reality? I thought /proc/xen is deprecated for a long > time... > I can't tell whether it happens in reality or not. I also hope to get rid of deprecated interface but they are here to stay for as long as it takes. I guess the question is, do you have a compelling reason against keeping this dependency? Wei. > > To be precise, we prefer /dev/xen interfaces whenever possible but there > > is a fallback to /proc/xen. Note that a lot of other unit files have > > this dependency on proc-xen.mount. > > > > I'm inclined to say we should keep this dependency but I'm not sure if I > > missed some obvious things. > > > > > Cc: Ian Jackson > > > Cc: Wei Liu > > > Signed-off-by: Marek Marczykowski-Górecki > > > > > > --- > > > tools/hotplug/Linux/systemd/xendriverdomain.service.in | 2 -- > > > 1 file changed, 2 deletions(-) > > > > > > diff --git a/tools/hotplug/Linux/systemd/xendriverdomain.service.in > > > b/tools/hotplug/Linux/systemd/xendriverdomain.service.in > > > index 0afb54d..a100309 100644 > > > --- a/tools/hotplug/Linux/systemd/xendriverdomain.service.in > > > +++ b/tools/hotplug/Linux/systemd/xendriverdomain.service.in > > > @@ -1,7 +1,5 @@ > > > [Unit] > > > Description=Xen driver domain device daemon > > > -Requires=proc-xen.mount > > > -After=proc-xen.mount > > > ConditionVirtualization=xen > > > > > > [Service] > > > -- > > > 2.5.5 > > > > > > -- > Best Regards, > Marek Marczykowski-Górecki > Invisible Things Lab > A: Because it messes up the order in which people normally read text. > Q: Why is top-posting such a bad thing? ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] Kernel panic on Xen virtualisation in Debian
On 22.07.2016 12:21, Ingo Jürgensmann wrote: On 22.07.2016 11:03, Ingo Jürgensmann wrote: In the meanwhile, I activated IPv6 again on Tuesday evening and today the server crashed again some minutes ago. Here's the output from netconsole: ... and the second subsequent crash: ... and another crash below... But in the meanwhile, I wonder if anyone except Andreas and I has some interest in fixing this issue, as there were absolutely no comments or feedback on my previous mails. Quite disappointing. :-/ Jul 25 10:50:00 31.172.31.251 [257646.574819] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-4-amd64 #1 Debian 3.16.7-ckt25-2+deb8u3 Jul 25 10:50:00 31.172.31.251 [257646.574874] Hardware name: Supermicro X9SRE/X9SRE-3F/X9SRi/X9SRi-3F/X9SRE/X9SRE-3F/X9SRi/X9SRi-3F, BIOS 3.0a 01/03/2014 Jul 25 10:50:00 31.172.31.251 [257646.574934] task: 8181a460 ti: 8180 task.ti: 8180 Jul 25 10:50:00 31.172.31.251 [257646.575080] RIP: e030:[] Jul 25 10:50:00 31.172.31.251 [] memcpy+0x6/0x110 Jul 25 10:50:00 31.172.31.251 [257646.575140] RSP: e02b:880280e03b58 EFLAGS: 00010286 Jul 25 10:50:00 31.172.31.251 [257646.575171] RAX: 880269992870 RBX: 88026e0ceb00 RCX: 014e Jul 25 10:50:00 31.172.31.251 [257646.575221] RDX: 0300 RSI: 88026f42a000 RDI: 880269992a22 Jul 25 10:50:00 31.172.31.251 [257646.575271] RBP: 880280e03c38 R08: 06c0 R09: 880269992862 Jul 25 10:50:00 31.172.31.251 [257646.575321] R10: 0100 R11: 01c83452 R12: 8801d7cf7500 Jul 25 10:50:00 31.172.31.251 [257646.575370] R13: 88026f429e66 R14: 880259ca34c0 R15: 0308 Jul 25 10:50:00 31.172.31.251 [257646.575425] FS: () GS:880280e0() knlGS:880280ee Jul 25 10:50:00 31.172.31.251 [257646.575477] CS: e033 DS: ES: CR0: 80050033 Jul 25 10:50:00 31.172.31.251 [257646.575509] CR2: 88026f42a000 CR3: 000270edb000 CR4: 00042660 Jul 25 10:50:00 31.172.31.251 [257646.575559] Stack: Jul 25 10:50:00 31.172.31.251 [257646.575585] 814d319f Jul 25 10:50:00 31.172.31.251 88007c6c4a00 Jul 25 10:50:00 31.172.31.251 880280e03ba8 Jul 25 10:50:00 31.172.31.251 9401294600a7012a Jul 25 10:50:00 31.172.31.251 Jul 25 10:50:00 31.172.31.251 [257646.575652] 0100 Jul 25 10:50:00 31.172.31.251 814a000a Jul 25 10:50:00 31.172.31.251 6f1d3e90 Jul 25 10:50:00 31.172.31.251 80fe Jul 25 10:50:00 31.172.31.251 Jul 25 10:50:00 31.172.31.251 [257646.575719] 1ad902feff7ac40e Jul 25 10:50:00 31.172.31.251 88026826eb00 Jul 25 10:50:00 31.172.31.251 224afc3e1600 Jul 25 10:50:00 31.172.31.251 8801d7cf7500 Jul 25 10:50:00 31.172.31.251 Jul 25 10:50:00 31.172.31.251 [257646.575786] Call Trace: Jul 25 10:50:00 31.172.31.251 [257646.575813] Jul 25 10:50:00 31.172.31.251 Jul 25 10:50:00 31.172.31.251 [257646.575822] Jul 25 10:50:00 31.172.31.251 [] ? ndisc_send_redirect+0x3bf/0x410 Jul 25 10:50:00 31.172.31.251 [257646.575889] [] ? reg_vif_xmit+0xca/0xf0 Jul 25 10:50:00 31.172.31.251 [257646.575926] [] ? ip6_forward+0x71c/0x850 Jul 25 10:50:00 31.172.31.251 [257646.575961] [] ? ip6_route_input+0xa4/0xd0 Jul 25 10:50:00 31.172.31.251 [257646.575997] [] ? __netif_receive_skb_core+0x543/0x750 Jul 25 10:50:00 31.172.31.251 [257646.576034] [] ? netif_receive_skb_internal+0x1f/0x80 Jul 25 10:50:00 31.172.31.251 [257646.576078] [] ? br_handle_frame_finish+0x1c2/0x3c0 [bridge] Jul 25 10:50:00 31.172.31.251 [257646.576134] [] ? br_handle_frame+0x147/0x240 [bridge] Jul 25 10:50:00 31.172.31.251 [257646.576170] [] ? __netif_receive_skb_core+0x1c4/0x750 Jul 25 10:50:00 31.172.31.251 [257646.576209] [] ? xen_clocksource_get_cycles+0x1c/0x20 Jul 25 10:50:00 31.172.31.251 [257646.576244] [] ? netif_receive_skb_internal+0x1f/0x80 Jul 25 10:50:00 31.172.31.251 [257646.576282] [] ? xenvif_tx_action+0x49f/0x920 [xen_netback] Jul 25 10:50:00 31.172.31.251 [257646.576335] [] ? xenvif_poll+0x28/0x70 [xen_netback] Jul 25 10:50:00 31.172.31.251 [257646.576370] [] ? net_rx_action+0x140/0x240 Jul 25 10:50:00 31.172.31.251 [257646.576405] [] ? __do_softirq+0xf1/0x290 Jul 25 10:50:00 31.172.31.251 [257646.576439] [] ? irq_exit+0x95/0xa0 Jul 25 10:50:00 31.172.31.251 [257646.576475] [] ? xen_evtchn_do_upcall+0x35/0x50 Jul 25 10:50:00 31.172.31.251 [257646.576510] [] ? xen_do_hypervisor_callback+0x1e/0x30 Jul 25 10:50:00 31.172.31.251 [257646.576544] Jul 25 10:50:00 31.172.31.251 Jul 25 10:50:00 31.172.31.251 [257646.576553] Jul 25 10:50:00 31.172.31.251 [] ? xen_hypercall_sched_op+0xa/0x20 Jul 25 10:50:00 31.172.31.251 [257646.576615] [] ? xen_hypercall_sched_op+0xa/0x20 Jul 25 10:50:00 31.172.31.251 [257646.576651] [] ? xen_safe_halt+0xc/0x20 Jul 25 10:50:00 31.172.31.251 [257646.576686] [] ? default_idle+0x19/0xb0 Jul 25 10:50:00 31.172.31.251 [257646.576721] [] ? cpu_startup
Re: [Xen-devel] [PATCH 2/2] systemd: drop dependency on proc-xen.mount in xendriverdomain.service
On Mon, Jul 25, 2016 at 10:34:21AM +0100, Wei Liu wrote: > On Mon, Jul 25, 2016 at 11:30:32AM +0200, Marek Marczykowski-Górecki wrote: > > On Mon, Jul 25, 2016 at 10:27:47AM +0100, Wei Liu wrote: > > > On Sun, Jul 24, 2016 at 09:27:13PM +0200, Marek Marczykowski-Górecki > > > wrote: > > > > It is no longer required since xl devd use /dev/xen interface. > > > > > > > > > > How would this unit work when there is no /dev/xen interface? > > > > Does it happen in reality? I thought /proc/xen is deprecated for a long > > time... > > > > I can't tell whether it happens in reality or not. I also hope to get > rid of deprecated interface but they are here to stay for as long as it > takes. > > I guess the question is, do you have a compelling reason against keeping > this dependency? Just cleaning up old stuff, so - no. -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? signature.asc Description: PGP signature ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/4] docs: Clarify the expected behaviour of zero length records
On Thu, Jul 21, 2016 at 06:17:34PM +0100, Andrew Cooper wrote: > The sending side shouldn't send any variable sized records which end up having > zero content, but the receiving side will need to tolerate such records for > compatibility purposes. > > Signed-off-by: Andrew Cooper Reviewed-by: Wei Liu ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/4] tools/libxc: Tolerate zero-length records in migration v2 streams
On Thu, Jul 21, 2016 at 06:17:35PM +0100, Andrew Cooper wrote: > Under some circumstances, the migration v2 save code would generate valid > records with zero content, when the intended behaviour was to omit the record > entirely. > > As the stream is otherwise fine, tolerate these records and avoid failing the > migration. > > Signed-off-by: Andrew Cooper Reviewed-by: Wei Liu ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 16/17] libxc/xc_dom_arm: Copy ACPI tables to guest space
Hi George, On 25/07/16 09:38, George Dunlap wrote: On Thu, Jul 21, 2016 at 10:15 PM, Stefano Stabellini wrote: You are assuming that the guest will map the ACPI blob with the same attributes as the rest of the superpage. IHMO, a sane operating system will want to map the ACPI blob read-only. That's true. But there are other things which might be mapped differently and could shatter a stage-1 superpage mapping (especially on x86 that has a much more complex memory map than ARM). Obviously adding one more is not doing it any good, but it might not make a difference in practice. Anyway, I agree with Julien that his suggestion is the best for ARM. If the libxl maintainers are willing to accept two different code paths for this on ARM and x86, then I am fine with it too. Sorry to be a bit late to this thread -- there's a interface principle that I think we should at some point have a larger discussion about: whether "maxmem" means the amount of RAM which the guest sees as RAM, or whether "maxmem" means the amount of RAM that the administrator sees as used by the guest. At the moment tnhere's no consistent answer actually; but I am strongly of the opinion that for usability the best answer is for "memory" to be the *total* amount of *host* memory used by the guest. In an ideal world, the admin should be able to do "xl info", see that there is 3000MiB free, and then start a guest with 3000MiB and expect it to succeed. At the moment he has to guess. To confirm, do you include memory allocated by the hypervisor to keep track of the guest (i.e struc domain, struct vcpu...)? If not, the problem stays the same because the admin will have to know how much memory Xen will allocate to keep track of the guest. So if "xl info" tells you that 3000MiB is free, you will only be able to use 3000MiB - few kilobytes. If yes, this would be a problem for migration because a newer version of Xen may allocate less/more memory. Furthermore, with this suggestion, we will likely not be able to take advantage of 1GB superpage unless the admin knows how much memory will be used by Xen outside of the guest RAM (such as the console page, xenstore page, ). IHMO, it would be easier if we define "maxmem" as the usable for guest RAM and then let the toolstack take add the extra memory (ACPI blob, special pages...). Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/2] systemd: drop dependency on proc-xen.mount in xendriverdomain.service
On 25/07/16 10:34, Wei Liu wrote: > On Mon, Jul 25, 2016 at 11:30:32AM +0200, Marek Marczykowski-Górecki wrote: >> On Mon, Jul 25, 2016 at 10:27:47AM +0100, Wei Liu wrote: >>> On Sun, Jul 24, 2016 at 09:27:13PM +0200, Marek Marczykowski-Górecki wrote: It is no longer required since xl devd use /dev/xen interface. >>> How would this unit work when there is no /dev/xen interface? >> Does it happen in reality? I thought /proc/xen is deprecated for a long >> time... It really should be, but isn't. /proc/xen/capabilities and /proc/xen/allsyms (?) don't have any equivalent in /dev/ >> > I can't tell whether it happens in reality or not. I also hope to get > rid of deprecated interface but they are here to stay for as long as it > takes. > > I guess the question is, do you have a compelling reason against keeping > this dependency? We really should be making xenfs deprecated and able to be compiled out of Linux. It is a hangover of the early classic kernels which isn't necessary, and causes subtle/weird failures if it isn't mounted. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/4] tools/libxc: Avoid generating inappropriate zero-length records
On Thu, Jul 21, 2016 at 06:17:36PM +0100, Andrew Cooper wrote: > It was never intended for records such as these to be sent with zero content. > Wouldn't it be better to modify write_split_record to ignore zero content instead of patching up different places? > Signed-off-by: Andrew Cooper > --- > CC: Ian Jackson > CC: Wei Liu > --- > tools/libxc/xc_sr_save_x86_hvm.c | 4 > tools/libxc/xc_sr_save_x86_pv.c | 12 > 2 files changed, 16 insertions(+) > > diff --git a/tools/libxc/xc_sr_save_x86_hvm.c > b/tools/libxc/xc_sr_save_x86_hvm.c > index ba50a43..5401bf9 100644 > --- a/tools/libxc/xc_sr_save_x86_hvm.c > +++ b/tools/libxc/xc_sr_save_x86_hvm.c > @@ -112,6 +112,10 @@ static int write_hvm_params(struct xc_sr_context *ctx) > } > } > > +/* No params? Skip this record. */ > +if ( hdr.count == 0 ) > +return 0; > + > rc = write_split_record(ctx, &rec, entries, hdr.count * > sizeof(*entries)); > if ( rc ) > PERROR("Failed to write HVM_PARAMS record"); > diff --git a/tools/libxc/xc_sr_save_x86_pv.c b/tools/libxc/xc_sr_save_x86_pv.c > index 4a29460..5fb9f2f 100644 > --- a/tools/libxc/xc_sr_save_x86_pv.c > +++ b/tools/libxc/xc_sr_save_x86_pv.c > @@ -607,6 +607,10 @@ static int write_one_vcpu_extended(struct xc_sr_context > *ctx, uint32_t id) > return -1; > } > > +/* No content? Skip the record. */ > +if ( domctl.u.ext_vcpucontext.size == 0 ) > +return 0; > + > return write_split_record(ctx, &rec, &domctl.u.ext_vcpucontext, >domctl.u.ext_vcpucontext.size); > } > @@ -662,6 +666,10 @@ static int write_one_vcpu_xsave(struct xc_sr_context > *ctx, uint32_t id) > goto err; > } > > +/* No xsave state? Skip this record. */ > +if ( domctl.u.vcpuextstate.size == 0 ) > +goto out; > + > rc = write_split_record(ctx, &rec, buffer, domctl.u.vcpuextstate.size); > if ( rc ) > goto err; > @@ -728,6 +736,10 @@ static int write_one_vcpu_msrs(struct xc_sr_context > *ctx, uint32_t id) > goto err; > } > > +/* No MSRs? Skip this record. */ > +if ( domctl.u.vcpu_msrs.msr_count == 0 ) > +goto out; > + > rc = write_split_record(ctx, &rec, buffer, > domctl.u.vcpu_msrs.msr_count * > sizeof(xen_domctl_vcpu_msr_t)); > -- > 2.1.4 > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/4] tools/python: Adjust migration v2 library to warn about zero-length records
On Thu, Jul 21, 2016 at 06:17:37PM +0100, Andrew Cooper wrote: > These records shouldn't be in a stream, but accidentally are. Warn about > them, but don't abort the verification. > > While here, add a missing length check to the X86_PV_P2M_FRAMES record > checker. > > Signed-off-by: Andrew Cooper Reviewed-by: Wei Liu ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/2] systemd: drop dependency on proc-xen.mount in xendriverdomain.service
On Mon, Jul 25, 2016 at 10:45:54AM +0100, Andrew Cooper wrote: > On 25/07/16 10:34, Wei Liu wrote: > > On Mon, Jul 25, 2016 at 11:30:32AM +0200, Marek Marczykowski-Górecki wrote: > >> On Mon, Jul 25, 2016 at 10:27:47AM +0100, Wei Liu wrote: > >>> On Sun, Jul 24, 2016 at 09:27:13PM +0200, Marek Marczykowski-Górecki > >>> wrote: > It is no longer required since xl devd use /dev/xen interface. > > >>> How would this unit work when there is no /dev/xen interface? > >> Does it happen in reality? I thought /proc/xen is deprecated for a long > >> time... > > It really should be, but isn't. > > /proc/xen/capabilities and /proc/xen/allsyms (?) don't have any > equivalent in /dev/ > > >> > > I can't tell whether it happens in reality or not. I also hope to get > > rid of deprecated interface but they are here to stay for as long as it > > takes. > > > > I guess the question is, do you have a compelling reason against keeping > > this dependency? > > We really should be making xenfs deprecated and able to be compiled out > of Linux. It is a hangover of the early classic kernels which isn't > necessary, and causes subtle/weird failures if it isn't mounted. > +1 from me of course! When the kernel is ready, I shall remove the fallback in libraries and then we can remove all references to proc-xen.mount and even proc-xen.mount itself. Wei. > ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 16/18] arm/altp2m: Extended libxl to activate altp2m on ARM.
Hello, On 25/07/16 10:04, Sergej Proskurin wrote: On 07/25/2016 10:32 AM, Wei Liu wrote: On Sun, Jul 24, 2016 at 06:06:00PM +0200, Sergej Proskurin wrote: +xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2M, + libxl_defbool_val(info->u.pv.altp2m)); +} +#endif + static void hvm_set_conf_params(xc_interface *handle, uint32_t domid, libxl_domain_build_info *const info) { @@ -433,6 +443,10 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid, return rc; #endif } +#if defined(__arm__) || defined(__aarch64__) +else /* info->type == LIBXL_DOMAIN_TYPE_PV */ +pv_set_conf_params(ctx->xch, domid, info); +#endif rc = libxl__arch_domain_create(gc, d_config, domid); diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index ef614be..0a164f9 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -554,6 +554,7 @@ libxl_domain_build_info = Struct("domain_build_info",[ ("features", string, {'const': True}), # Use host's E820 for PCI passthrough. ("e820_host", libxl_defbool), + ("altp2m", libxl_defbool), Why is this placed in PV instead of arch_arm? The current implementation mirrors the x86's altp2m, where the altp2m field represents a property of HVM guests in b_info->u.hvm.altp2m. Since guests on ARM are marked as PV, I have placed the field altp2m into b_info->u.pv.altp2m. However, if you believe that it would make more sense to place altp2m into b_info->arch_arm.altp2m, I will try to adapt the affected fields. OK. I don't think that one should be placed in u.pv because that union is for x86. However it is also not suitable to just promote that to a common field because x86 pv doesn't have altp2m support. I think arch_arm would be a better location. Alright, I will move the field altp2m to arch_arm and adopt the initialization routines. Would not it make more sense to introduce a generic field 'altp2m' (i.e outside of hvm and arch_arm)? Otherwise, toolstack such as libvirt will need to have specific code to handle ARM and x86. Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/4] tools/libxc: Avoid generating inappropriate zero-length records
On 25/07/16 10:45, Wei Liu wrote: > On Thu, Jul 21, 2016 at 06:17:36PM +0100, Andrew Cooper wrote: >> It was never intended for records such as these to be sent with zero content. >> > Wouldn't it be better to modify write_split_record to ignore zero > content instead of patching up different places? That would catch the HVM_PARAMS, but not the other 3, which have an extra 8 byte header. However, having write_split_record() conditionally drop records is a violation of the principle of least surprise. Choosing to skip a record is an important decision and needs to be easy to spot in the code. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 16/18] arm/altp2m: Extended libxl to activate altp2m on ARM.
On Mon, Jul 25, 2016 at 10:49:41AM +0100, Julien Grall wrote: > Hello, > > On 25/07/16 10:04, Sergej Proskurin wrote: > >On 07/25/2016 10:32 AM, Wei Liu wrote: > >>On Sun, Jul 24, 2016 at 06:06:00PM +0200, Sergej Proskurin wrote: > >+xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2M, > >+ libxl_defbool_val(info->u.pv.altp2m)); > >+} > >+#endif > >+ > > static void hvm_set_conf_params(xc_interface *handle, uint32_t domid, > > libxl_domain_build_info *const info) > > { > >@@ -433,6 +443,10 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid, > > return rc; > > #endif > > } > >+#if defined(__arm__) || defined(__aarch64__) > >+else /* info->type == LIBXL_DOMAIN_TYPE_PV */ > >+pv_set_conf_params(ctx->xch, domid, info); > >+#endif > > > > rc = libxl__arch_domain_create(gc, d_config, domid); > > > >diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl > >index ef614be..0a164f9 100644 > >--- a/tools/libxl/libxl_types.idl > >+++ b/tools/libxl/libxl_types.idl > >@@ -554,6 +554,7 @@ libxl_domain_build_info = > >Struct("domain_build_info",[ > > ("features", string, {'const': > > True}), > > # Use host's E820 for PCI > > passthrough. > > ("e820_host", libxl_defbool), > >+ ("altp2m", libxl_defbool), > Why is this placed in PV instead of arch_arm? > > >>>The current implementation mirrors the x86's altp2m, where the altp2m > >>>field represents a property of HVM guests in b_info->u.hvm.altp2m. Since > >>>guests on ARM are marked as PV, I have placed the field altp2m into > >>>b_info->u.pv.altp2m. > >>> > >>>However, if you believe that it would make more sense to place altp2m > >>>into b_info->arch_arm.altp2m, I will try to adapt the affected fields. > >>> > >>OK. I don't think that one should be placed in u.pv because that union > >>is for x86. However it is also not suitable to just promote that to > >>a common field because x86 pv doesn't have altp2m support. > >> > >>I think arch_arm would be a better location. > >> > > > >Alright, I will move the field altp2m to arch_arm and adopt the > >initialization routines. > > Would not it make more sense to introduce a generic field 'altp2m' (i.e > outside of hvm and arch_arm)? > > Otherwise, toolstack such as libvirt will need to have specific code to > handle ARM and x86. > Hmm... this is a compelling reason. After mulling over this issue a bit more, I think I'm fine with promoting altp2m to a common field. Sergej, this is a patch to get you started. Please make sure setting the old field still works. If I'm not clear enough, please ask. Wei. ---8<--- diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index ef614be..81d3ae5 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -494,6 +494,7 @@ libxl_domain_build_info = Struct("domain_build_info",[ # Note that the partial device tree should avoid to use the phandle # 65000 which is reserved by the toolstack. ("device_tree", string), +("altp2m", libxl_defbool), ("u", KeyedUnion(None, libxl_domain_type, "type", [("hvm", Struct(None, [("firmware", string), ("bios", libxl_bios_type), @@ -512,7 +513,7 @@ libxl_domain_build_info = Struct("domain_build_info",[ ("mmio_hole_memkb", MemKB), ("timer_mode", libxl_timer_mode), ("nested_hvm", libxl_defbool), - ("altp2m", libxl_defbool), + ("altp2m", libxl_defbool), # deprecated, please use common field ("smbios_firmware", string), ("acpi_firmware",string), ("hdtype", libxl_hdtype), ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/4] tools/libxc: Avoid generating inappropriate zero-length records
On Mon, Jul 25, 2016 at 10:57:13AM +0100, Andrew Cooper wrote: > On 25/07/16 10:45, Wei Liu wrote: > > On Thu, Jul 21, 2016 at 06:17:36PM +0100, Andrew Cooper wrote: > >> It was never intended for records such as these to be sent with zero > >> content. > >> > > Wouldn't it be better to modify write_split_record to ignore zero > > content instead of patching up different places? > > That would catch the HVM_PARAMS, but not the other 3, which have an > extra 8 byte header. > > However, having write_split_record() conditionally drop records is a > violation of the principle of least surprise. Choosing to skip a record > is an important decision and needs to be easy to spot in the code. > Fair enough. > ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] x86/vMSI-X: Fix host crash when shutting down guests with MSI capable devices
On 22/07/16 09:50, Sander Eikelenboom wrote: > Thursday, July 21, 2016, 12:18:37 PM, you wrote: > >> c/s 74c6dc2d "x86/vMSI-X: defer intercept handler registration" caused MSI-X >> table infrastructure not to always be initialised, but it missed one path >> which needed an is-initialised check. >> If a devices is passed through to a domain which is MSI capable but not MSI-X >> capable, the call to msixtbl_init() is omitted, but a >> XEN_DOMCTL_unbind_pt_irq >> hypercall still calls into msixtbl_pt_unregister(). This follows the linked >> list pointer which is still NULL. >> Introduce an is-initalised check to msixtbl_pt_unregister(). >> Furthermore, the purpose of the open-coded msixtbl_list.next check is rather >> subtle. Introduce an msixtbl_initialised() predicate instead, which makes >> its >> purpose far more obvious. >> Reported-by: Sander Eikelenboom >> Signed-off-by: Andrew Cooper >> --- >> CC: Jan Beulich >> CC: Sander Eikelenboom >> Sander - would you mind double checking this patch? >> --- > Hi Andrew, > > Just got the chance to test and it works for me ! > > Thanks, May I take that as a Test-by: then please? ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] x86/vMSI-X: Fix host crash when shutting down guests with MSI capable devices
On 25/07/16 11:16, Andrew Cooper wrote: > On 22/07/16 09:50, Sander Eikelenboom wrote: >> Thursday, July 21, 2016, 12:18:37 PM, you wrote: >> >>> c/s 74c6dc2d "x86/vMSI-X: defer intercept handler registration" caused MSI-X >>> table infrastructure not to always be initialised, but it missed one path >>> which needed an is-initialised check. >>> If a devices is passed through to a domain which is MSI capable but not >>> MSI-X >>> capable, the call to msixtbl_init() is omitted, but a >>> XEN_DOMCTL_unbind_pt_irq >>> hypercall still calls into msixtbl_pt_unregister(). This follows the linked >>> list pointer which is still NULL. >>> Introduce an is-initalised check to msixtbl_pt_unregister(). >>> Furthermore, the purpose of the open-coded msixtbl_list.next check is rather >>> subtle. Introduce an msixtbl_initialised() predicate instead, which makes >>> its >>> purpose far more obvious. >>> Reported-by: Sander Eikelenboom >>> Signed-off-by: Andrew Cooper >>> --- >>> CC: Jan Beulich >>> CC: Sander Eikelenboom >>> Sander - would you mind double checking this patch? >>> --- >> Hi Andrew, >> >> Just got the chance to test and it works for me ! >> >> Thanks, > May I take that as a Test-by: then please? And of course, I meant Tested-by: ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/4] docs: Clarify the expected behaviour of zero length records
On 21/07/16 18:17, Andrew Cooper wrote: > The sending side shouldn't send any variable sized records which end up having > zero content, but the receiving side will need to tolerate such records for > compatibility purposes. > > Signed-off-by: Andrew Cooper > --- > CC: Ian Jackson > CC: Wei Liu > --- > docs/specs/libxc-migration-stream.pandoc | 16 +++- > 1 file changed, 15 insertions(+), 1 deletion(-) > > diff --git a/docs/specs/libxc-migration-stream.pandoc > b/docs/specs/libxc-migration-stream.pandoc > index 31eba10..a90bc5d 100644 > --- a/docs/specs/libxc-migration-stream.pandoc > +++ b/docs/specs/libxc-migration-stream.pandoc > @@ -3,7 +3,7 @@ >Andrew Cooper <> >Wen Congyang <> >Yang Hongyang <> > -% Revision 1 > +% Revision 2 > > Introduction > > @@ -631,6 +631,10 @@ The set of valid records depends on the guest > architecture and type. No > assumptions should be made about the ordering or interleaving of > independent records. Record dependencies are noted below. > > +Some records have an exactly specified size. Some records have variable size > +depending on their content. A record with variable size which ends up being > +zero should be omitted entirely from the stream by the sending side. I disagree. I think the stream should include the records with the empty content. This gives better consistency and does not require changes to the stream. David ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] Kernel panic on Xen virtualisation in Debian
On Mon, Jul 25, 2016 at 11:38:20AM +0200, Ingo Jürgensmann wrote: > On 22.07.2016 12:21, Ingo Jürgensmann wrote: > >On 22.07.2016 11:03, Ingo Jürgensmann wrote: > > > >>In the meanwhile, I activated IPv6 again on Tuesday evening and today > >>the server crashed again some minutes ago. Here's the output from > >>netconsole: > >... and the second subsequent crash: > > ... and another crash below... > > But in the meanwhile, I wonder if anyone except Andreas and I has some > interest in fixing this issue, as there were absolutely no comments or > feedback on my previous mails. Quite disappointing. :-/ > I did skim your emails. But the oops was happening in memcpy+0x6 which indicated it came back to the origin question why would it got an exception there. Just by staring at the code doesn't get me anywhere. Without a concrete reproduction of the issue, I'm afraid I can't provide more input here. There are several moving parts: 0. Hardware 1. Xen hypervisor 2. Dom0 kernel Your report and the debian report suggested that Dom0 kernel is less likely to be the culprit because you've tried different Dom0 kernels. As for Xen, not sure if you would be up for trying a debug build from source tree. That would help provide information on whether this is a bug in Xen or not. As for hardware, it would be worth trying whether this issue happens on other hardware platform. Wei. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] x86/vMSI-X: Fix host crash when shutting down guests with MSI capable devices
Monday, July 25, 2016, 12:19:55 PM, you wrote: > On 25/07/16 11:16, Andrew Cooper wrote: >> On 22/07/16 09:50, Sander Eikelenboom wrote: >>> Thursday, July 21, 2016, 12:18:37 PM, you wrote: >>> c/s 74c6dc2d "x86/vMSI-X: defer intercept handler registration" caused MSI-X table infrastructure not to always be initialised, but it missed one path which needed an is-initialised check. If a devices is passed through to a domain which is MSI capable but not MSI-X capable, the call to msixtbl_init() is omitted, but a XEN_DOMCTL_unbind_pt_irq hypercall still calls into msixtbl_pt_unregister(). This follows the linked list pointer which is still NULL. Introduce an is-initalised check to msixtbl_pt_unregister(). Furthermore, the purpose of the open-coded msixtbl_list.next check is rather subtle. Introduce an msixtbl_initialised() predicate instead, which makes its purpose far more obvious. Reported-by: Sander Eikelenboom Signed-off-by: Andrew Cooper --- CC: Jan Beulich CC: Sander Eikelenboom Sander - would you mind double checking this patch? --- >>> Hi Andrew, >>> >>> Just got the chance to test and it works for me ! >>> >>> Thanks, >> May I take that as a Test-by: then please? > And of course, I meant Tested-by: Yes, thanks for the quick fix ! -- Sander > ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/4] docs: Clarify the expected behaviour of zero length records
On 25/07/16 11:21, David Vrabel wrote: > On 21/07/16 18:17, Andrew Cooper wrote: >> The sending side shouldn't send any variable sized records which end up >> having >> zero content, but the receiving side will need to tolerate such records for >> compatibility purposes. >> >> Signed-off-by: Andrew Cooper >> --- >> CC: Ian Jackson >> CC: Wei Liu >> --- >> docs/specs/libxc-migration-stream.pandoc | 16 +++- >> 1 file changed, 15 insertions(+), 1 deletion(-) >> >> diff --git a/docs/specs/libxc-migration-stream.pandoc >> b/docs/specs/libxc-migration-stream.pandoc >> index 31eba10..a90bc5d 100644 >> --- a/docs/specs/libxc-migration-stream.pandoc >> +++ b/docs/specs/libxc-migration-stream.pandoc >> @@ -3,7 +3,7 @@ >>Andrew Cooper <> >>Wen Congyang <> >>Yang Hongyang <> >> -% Revision 1 >> +% Revision 2 >> >> Introduction >> >> @@ -631,6 +631,10 @@ The set of valid records depends on the guest >> architecture and type. No >> assumptions should be made about the ordering or interleaving of >> independent records. Record dependencies are noted below. >> >> +Some records have an exactly specified size. Some records have variable >> size >> +depending on their content. A record with variable size which ends up being >> +zero should be omitted entirely from the stream by the sending side. > I disagree. I think the stream should include the records with the empty > content. This gives better consistency and does not require changes to > the stream. There are already some which are properly omitted, like the vcpu records for offline vcpus. There is no point having empty records; omitting them is an optimisation which we absolutely shouldn't preclude. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] x86/vMSI-X: Fix host crash when shutting down guests with MSI capable devices
On Thu, Jul 21, 2016 at 11:18 AM, Andrew Cooper wrote: > c/s 74c6dc2d "x86/vMSI-X: defer intercept handler registration" caused MSI-X > table infrastructure not to always be initialised, but it missed one path > which needed an is-initialised check. > > If a devices is passed through to a domain which is MSI capable but not MSI-X > capable, the call to msixtbl_init() is omitted, but a XEN_DOMCTL_unbind_pt_irq > hypercall still calls into msixtbl_pt_unregister(). This follows the linked > list pointer which is still NULL. > > Introduce an is-initalised check to msixtbl_pt_unregister(). > > Furthermore, the purpose of the open-coded msixtbl_list.next check is rather > subtle. Introduce an msixtbl_initialised() predicate instead, which makes its > purpose far more obvious. Thanks for this bit. > Reported-by: Sander Eikelenboom > Signed-off-by: Andrew Cooper Reviewed-by: George Dunlap ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/3] xen-blkfront: dynamic configuration of per-vbd resources
On 07/25/2016 05:20 PM, Roger Pau Monné wrote: > On Sat, Jul 23, 2016 at 06:18:23AM +0800, Bob Liu wrote: >> >> On 07/22/2016 07:45 PM, Roger Pau Monné wrote: >>> On Fri, Jul 22, 2016 at 05:43:32PM +0800, Bob Liu wrote: On 07/22/2016 05:34 PM, Roger Pau Monné wrote: > On Fri, Jul 22, 2016 at 04:17:48PM +0800, Bob Liu wrote: >> >> On 07/22/2016 03:45 PM, Roger Pau Monné wrote: >>> On Thu, Jul 21, 2016 at 06:08:05PM +0800, Bob Liu wrote: On 07/21/2016 04:57 PM, Roger Pau Monné wrote: >> ..[snip].. >> + >> +static ssize_t dynamic_reconfig_device(struct blkfront_info *info, >> ssize_t count) >> +{ >> +unsigned int i; >> +int err = -EBUSY; >> + >> +/* >> + * Make sure no migration in parallel, device lock is actually a >> + * mutex. >> + */ >> +if (!device_trylock(&info->xbdev->dev)) { >> +pr_err("Fail to acquire dev:%s lock, may be in >> migration.\n", >> +dev_name(&info->xbdev->dev)); >> +return err; >> +} >> + >> +/* >> + * Prevent new requests and guarantee no uncompleted reqs. >> + */ >> +blk_mq_freeze_queue(info->rq); >> +if (part_in_flight(&info->gd->part0)) >> +goto out; >> + >> +/* >> + * FrontBackend >> + * Switch to XenbusStateClosed >> + * frontend_changed(): >> + * case XenbusStateClosed: >> + * >> xen_blkif_disconnect() >> + * Switch to >> XenbusStateClosed >> + * blkfront_resume(): >> + * frontend_changed(): >> + * reconnect >> + * Wait until XenbusStateConnected >> + */ >> +info->reconfiguring = true; >> +xenbus_switch_state(info->xbdev, XenbusStateClosed); >> + >> +/* Poll every 100ms, 1 minute timeout. */ >> +for (i = 0; i < 600; i++) { >> +/* >> + * Wait backend enter XenbusStateClosed, >> blkback_changed() >> + * will clear reconfiguring. >> + */ >> +if (!info->reconfiguring) >> +goto resume; >> +schedule_timeout_interruptible(msecs_to_jiffies(100)); >> +} > > Instead of having this wait, could you just set info->reconfiguring = > 1, set > the frontend state to XenbusStateClosed and mimic exactly what a > resume from > suspension does? blkback_changed would have to set the frontend state > to > InitWait when it detects that the backend has switched to Closed, and > call > blkfront_resume. I think that won't work. In the real "resume" case, the power management system will trigger all ->resume() path. But there is no place for dynamic configuration. >>> >>> Hello, >>> >>> I think it should be possible to set info->reconfiguring and wait for >>> the >>> backend to switch to state Closed, at that point we should call >>> blkif_resume >>> (from blkback_changed) and the backend will follow the reconection. >>> >> >> Okay, I get your point. Yes, that's an option. >> >> But this will make 'dynamic configuration' to be async, I'm worry about >> the end-user will get panic. >> E.g >> A end-user "echo > /sys/devices/vbd-xxx/max_indirect_segs", >> but then the device will be Closed and disappeared, the user have to >> wait for a random time so that the device can resume. > > That should not happen, AFAICT on migration the device never dissapears. Oh, yes. > alloc_disk and friends should not be called on resume from migration (see > the switch in blkfront_connect, you should take the BLKIF_STATE_SUSPENDED > path for the reconfiguration). > What about if the end-user starts I/O immediately after writing new value to /sys? But the resume is still in progress. >>> >>> blkif_free already stops the queues by calling blk_mq_stop_hw_queues, and >>> blkif_queue_request will refuse to put anything on the ring if the state >>> is different than connected, which in turn makes blkif_queue_rq call >>> blk_mq_stop_hw_queue to stop the queue, so it should be safe. >>> >> >> But this will surprise the end-user, our user script is l
Re: [Xen-devel] [PATCH 3/4] tools/libxc: Avoid generating inappropriate zero-length records
On 21/07/16 18:17, Andrew Cooper wrote: > It was never intended for records such as these to be sent with zero content. As the original author of the specification I'm perhaps best placed to say what the original intention is. For records such as HVM_PARAMS which consist of a set of N items, the intention was to most definitely send a record with 0 items. For records that fetch an opaque blob from the hypervisor, again the intention was to sent this blob as-is with no sort of processing or other checking. i.e., if the hypervisor gives us a zero-length blob we sent that as-is. This makes all the streams look the same with all the same records, regardless of what hardware platform it was run on. Including zero-length/count records also makes diagnosing problems easier -- the empty record is visible in the stream instead of having to remember that sometimes these records are deliberately omitted. As such, this series should be limited to making the restore side handle the zero count sets or zero length blobs if it does not do so already. The specification should be clarified to note that some records may have zero-length blobs or contain zero items. David ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/4] docs: Clarify the expected behaviour of zero length records
On 25/07/16 11:25, Andrew Cooper wrote: > On 25/07/16 11:21, David Vrabel wrote: >> On 21/07/16 18:17, Andrew Cooper wrote: >>> The sending side shouldn't send any variable sized records which end up >>> having >>> zero content, but the receiving side will need to tolerate such records for >>> compatibility purposes. >>> >>> Signed-off-by: Andrew Cooper >>> --- >>> CC: Ian Jackson >>> CC: Wei Liu >>> --- >>> docs/specs/libxc-migration-stream.pandoc | 16 +++- >>> 1 file changed, 15 insertions(+), 1 deletion(-) >>> >>> diff --git a/docs/specs/libxc-migration-stream.pandoc >>> b/docs/specs/libxc-migration-stream.pandoc >>> index 31eba10..a90bc5d 100644 >>> --- a/docs/specs/libxc-migration-stream.pandoc >>> +++ b/docs/specs/libxc-migration-stream.pandoc >>> @@ -3,7 +3,7 @@ >>>Andrew Cooper <> >>>Wen Congyang <> >>>Yang Hongyang <> >>> -% Revision 1 >>> +% Revision 2 >>> >>> Introduction >>> >>> @@ -631,6 +631,10 @@ The set of valid records depends on the guest >>> architecture and type. No >>> assumptions should be made about the ordering or interleaving of >>> independent records. Record dependencies are noted below. >>> >>> +Some records have an exactly specified size. Some records have variable >>> size >>> +depending on their content. A record with variable size which ends up >>> being >>> +zero should be omitted entirely from the stream by the sending side. >> I disagree. I think the stream should include the records with the empty >> content. This gives better consistency and does not require changes to >> the stream. > > There are already some which are properly omitted, like the vcpu records > for offline vcpus. > > There is no point having empty records; omitting them is an optimisation > which we absolutely shouldn't preclude. The optimization doesn't matter since these records are so tiny. I've expanded on why these should be included in another reply. David ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [OSSTEST PATCH 1/3] cr-ensure-disk-space: Fix reference to iteration_proceed
8738d60622c1 "cr-ensure-disk-space: Break out check_space" renamed iteration_continue to iteration_proceed but did not change the call site. Signed-off-by: Ian Jackson --- cr-ensure-disk-space | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cr-ensure-disk-space b/cr-ensure-disk-space index 2d299dd..bb98835 100755 --- a/cr-ensure-disk-space +++ b/cr-ensure-disk-space @@ -152,7 +152,7 @@ db_retry($dbh_tests,[], sub { @flights = (); for (;;) { last if check_space(); - iteration_continue(); + iteration_proceed(); } }); -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [OSSTEST PATCH 3/3] cr-ensure-disk-space: Correct stdout output
d221996eea64 "cr-ensure-disk-space: Run check_space before taking lock" introduced an additional call to check_space but check_space prints the start of a message (with no newline) expecting iteration_proceed to print the rest. Move $|=1 up appropriately and add a couple of messages in the right place. This involves calling quit_ok rather than exit 0. Signed-off-by: Ian Jackson --- cr-ensure-disk-space | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/cr-ensure-disk-space b/cr-ensure-disk-space index 6a9781a..ff6b01b 100755 --- a/cr-ensure-disk-space +++ b/cr-ensure-disk-space @@ -68,13 +68,15 @@ sub quit_ok () { $|=1; -exit 0 if check_space; +check_space if check_space; + +print "taking lock..."; my $lock = "$c{GlobalLockDir}/publish-lock"; open LOCK, "> $lock" or die "$lock $!"; flock LOCK, LOCK_EX or die $!; -$|=1; +print "proceeding...\n"; my $chkq= db_prepare("SELECT * FROM flights WHERE flight=?"); -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [OSSTEST PATCH 2/3] cr-ensure-disk-space: Break out quit_ok
We are going to want another call site. No functional change. Signed-off-by: Ian Jackson --- cr-ensure-disk-space | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/cr-ensure-disk-space b/cr-ensure-disk-space index bb98835..6a9781a 100755 --- a/cr-ensure-disk-space +++ b/cr-ensure-disk-space @@ -61,6 +61,13 @@ sub check_space () { return $space >= logcfg('MinSpaceMby'); } +sub quit_ok () { +printf "ok.\n"; +exit 0; +} + +$|=1; + exit 0 if check_space; my $lock = "$c{GlobalLockDir}/publish-lock"; @@ -156,5 +163,4 @@ db_retry($dbh_tests,[], sub { } }); -printf "ok.\n"; -exit 0; +quit_ok(); -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/4] docs: Clarify the expected behaviour of zero length records
On 25/07/16 11:35, David Vrabel wrote: > On 25/07/16 11:25, Andrew Cooper wrote: >> On 25/07/16 11:21, David Vrabel wrote: >>> On 21/07/16 18:17, Andrew Cooper wrote: The sending side shouldn't send any variable sized records which end up having zero content, but the receiving side will need to tolerate such records for compatibility purposes. Signed-off-by: Andrew Cooper --- CC: Ian Jackson CC: Wei Liu --- docs/specs/libxc-migration-stream.pandoc | 16 +++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/docs/specs/libxc-migration-stream.pandoc b/docs/specs/libxc-migration-stream.pandoc index 31eba10..a90bc5d 100644 --- a/docs/specs/libxc-migration-stream.pandoc +++ b/docs/specs/libxc-migration-stream.pandoc @@ -3,7 +3,7 @@ Andrew Cooper <> Wen Congyang <> Yang Hongyang <> -% Revision 1 +% Revision 2 Introduction @@ -631,6 +631,10 @@ The set of valid records depends on the guest architecture and type. No assumptions should be made about the ordering or interleaving of independent records. Record dependencies are noted below. +Some records have an exactly specified size. Some records have variable size +depending on their content. A record with variable size which ends up being +zero should be omitted entirely from the stream by the sending side. >>> I disagree. I think the stream should include the records with the empty >>> content. This gives better consistency and does not require changes to >>> the stream. >> There are already some which are properly omitted, like the vcpu records >> for offline vcpus. >> >> There is no point having empty records; omitting them is an optimisation >> which we absolutely shouldn't preclude. > The optimization doesn't matter since these records are so tiny. > > I've expanded on why these should be included in another reply. We already omit most zero-length records. That ship has already sailed. This bug only presents itself because of two hypercalls in Xen returning different size characteristics. We already checked for zero first, and fail to check for zero the second time. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] Kernel panic on Xen virtualisation in Debian
> On Mon, Jul 25, 2016 at 11:38:20AM +0200, Ingo Jürgensmann wrote: >> On 22.07.2016 12:21, Ingo Jürgensmann wrote: >>> On 22.07.2016 11:03, Ingo Jürgensmann wrote: >>> In the meanwhile, I activated IPv6 again on Tuesday evening and today the server crashed again some minutes ago. Here's the output from netconsole: >>> ... and the second subsequent crash: >> >> ... and another crash below... >> >> But in the meanwhile, I wonder if anyone except Andreas and I has some >> interest in fixing this issue, as there were absolutely no comments or >> feedback on my previous mails. Quite disappointing. :-/ >> > > I did skim your emails. But the oops was happening in memcpy+0x6 which > indicated it came back to the origin question why would it got an > exception there. > [...] > Your report and the debian report suggested that Dom0 kernel is less > likely to be the culprit because you've tried different Dom0 kernels. yes we did. but nothing newer than 3.16 so far, we could try that, too. > As for Xen, not sure if you would be up for trying a debug build from > source tree. That would help provide information on whether this is a > bug in Xen or not. ok, we'll try! > As for hardware, it would be worth trying whether this issue happens on > other hardware platform. as i wrote earlier on the list, we have two platforms which only have the Intel Xeon CPU in common, nothing else - and even that is from a very different generation. Andreas. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/4] docs: Clarify the expected behaviour of zero length records
On 25/07/16 11:38, Andrew Cooper wrote: > On 25/07/16 11:35, David Vrabel wrote: >> On 25/07/16 11:25, Andrew Cooper wrote: >>> On 25/07/16 11:21, David Vrabel wrote: On 21/07/16 18:17, Andrew Cooper wrote: > The sending side shouldn't send any variable sized records which end up > having > zero content, but the receiving side will need to tolerate such records > for > compatibility purposes. > > Signed-off-by: Andrew Cooper > --- > CC: Ian Jackson > CC: Wei Liu > --- > docs/specs/libxc-migration-stream.pandoc | 16 +++- > 1 file changed, 15 insertions(+), 1 deletion(-) > > diff --git a/docs/specs/libxc-migration-stream.pandoc > b/docs/specs/libxc-migration-stream.pandoc > index 31eba10..a90bc5d 100644 > --- a/docs/specs/libxc-migration-stream.pandoc > +++ b/docs/specs/libxc-migration-stream.pandoc > @@ -3,7 +3,7 @@ >Andrew Cooper <> >Wen Congyang <> >Yang Hongyang <> > -% Revision 1 > +% Revision 2 > > Introduction > > @@ -631,6 +631,10 @@ The set of valid records depends on the guest > architecture and type. No > assumptions should be made about the ordering or interleaving of > independent records. Record dependencies are noted below. > > +Some records have an exactly specified size. Some records have variable > size > +depending on their content. A record with variable size which ends up > being > +zero should be omitted entirely from the stream by the sending side. I disagree. I think the stream should include the records with the empty content. This gives better consistency and does not require changes to the stream. >>> There are already some which are properly omitted, like the vcpu records >>> for offline vcpus. >>> >>> There is no point having empty records; omitting them is an optimisation >>> which we absolutely shouldn't preclude. >> The optimization doesn't matter since these records are so tiny. >> >> I've expanded on why these should be included in another reply. > > We already omit most zero-length records. That ship has already sailed. > > This bug only presents itself because of two hypercalls in Xen returning > different size characteristics. We already checked for zero first, and > fail to check for zero the second time. Then document that those specific records may be omitted, but don't spread the brokenness to other (future) records. David ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/4] docs: Clarify the expected behaviour of zero length records
On 25/07/16 11:44, David Vrabel wrote: > On 25/07/16 11:38, Andrew Cooper wrote: >> On 25/07/16 11:35, David Vrabel wrote: >>> On 25/07/16 11:25, Andrew Cooper wrote: On 25/07/16 11:21, David Vrabel wrote: > On 21/07/16 18:17, Andrew Cooper wrote: >> The sending side shouldn't send any variable sized records which end up >> having >> zero content, but the receiving side will need to tolerate such records >> for >> compatibility purposes. >> >> Signed-off-by: Andrew Cooper >> --- >> CC: Ian Jackson >> CC: Wei Liu >> --- >> docs/specs/libxc-migration-stream.pandoc | 16 +++- >> 1 file changed, 15 insertions(+), 1 deletion(-) >> >> diff --git a/docs/specs/libxc-migration-stream.pandoc >> b/docs/specs/libxc-migration-stream.pandoc >> index 31eba10..a90bc5d 100644 >> --- a/docs/specs/libxc-migration-stream.pandoc >> +++ b/docs/specs/libxc-migration-stream.pandoc >> @@ -3,7 +3,7 @@ >>Andrew Cooper <> >>Wen Congyang <> >>Yang Hongyang <> >> -% Revision 1 >> +% Revision 2 >> >> Introduction >> >> @@ -631,6 +631,10 @@ The set of valid records depends on the guest >> architecture and type. No >> assumptions should be made about the ordering or interleaving of >> independent records. Record dependencies are noted below. >> >> +Some records have an exactly specified size. Some records have >> variable size >> +depending on their content. A record with variable size which ends up >> being >> +zero should be omitted entirely from the stream by the sending side. > I disagree. I think the stream should include the records with the empty > content. This gives better consistency and does not require changes to > the stream. There are already some which are properly omitted, like the vcpu records for offline vcpus. There is no point having empty records; omitting them is an optimisation which we absolutely shouldn't preclude. >>> The optimization doesn't matter since these records are so tiny. >>> >>> I've expanded on why these should be included in another reply. >> We already omit most zero-length records. That ship has already sailed. >> >> This bug only presents itself because of two hypercalls in Xen returning >> different size characteristics. We already checked for zero first, and >> fail to check for zero the second time. > Then document that those specific records may be omitted, but don't > spread the brokenness to other (future) records. I disagree, and do not consider this broken. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/6] libxl: add "pv device mode needed" support to device type framework
On Tue, Jul 12, 2016 at 05:30:40PM +0200, Juergen Gross wrote: > Add another callback to the device type framework in order to aid > decision whether a pv domain needs a device model. > > Signed-off-by: Juergen Gross Acked-by: Wei Liu ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/6] libxl: add "merge" function to generic device type support
On Tue, Jul 12, 2016 at 05:30:39PM +0200, Juergen Gross wrote: > Instead of using a macro generating the code to merge xenstore and > json configuration data, use the generic device type support for > this purpose. > > This requires to add some accessor functions to the framework and > a structure for disks (as disks are added separately they didn't need > such a structure up to now). > > Signed-off-by: Juergen Gross I definitely think this is a good idea. The code looks good to me: Acked-by: Wei Liu I didn't do a line-by-line review though because most code looks very mechanical. I'm confident that we can sort out issues should they arise. Wei. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 6/6] libxl: move common nic stuff into one source
On Tue, Jul 12, 2016 at 05:30:44PM +0200, Juergen Gross wrote: > Put all nic related stuff of libxl form common files into a dedicated > source file. > > Signed-off-by: Juergen Gross Assuming this is pure code motion: Acked-by: Wei Liu ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/6] libxl: split libxl vtpm code into one source
On Tue, Jul 12, 2016 at 05:30:42PM +0200, Juergen Gross wrote: > Put all vtpm related stuff of libxl into a dedicated source file. > > Signed-off-by: Juergen Gross Acked-by: Wei Liu ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/6] libxl: move library pvusb specific code into libxl_pvusb.c
On Tue, Jul 12, 2016 at 05:30:41PM +0200, Juergen Gross wrote: > Outside libxl_pvusb.c only libxl_util.c still contains some pvusb code. > > Move it to libxl_pvusb.c. > > Signed-off-by: Juergen Gross Acked-by: Wei Liu ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 5/6] libxl: add config update callback to device type framework
On Tue, Jul 12, 2016 at 05:30:43PM +0200, Juergen Gross wrote: > Some device types require a configuration update after resume of > domain. Add a callback for this purpose. > > Signed-off-by: Juergen Gross Acked-by: Wei Liu ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] Kernel panic on Xen virtualisation in Debian
On 25/07/16 10:38, Ingo Jürgensmann wrote: > On 22.07.2016 12:21, Ingo Jürgensmann wrote: >> On 22.07.2016 11:03, Ingo Jürgensmann wrote: >> >>> In the meanwhile, I activated IPv6 again on Tuesday evening and today >>> the server crashed again some minutes ago. Here's the output from >>> netconsole: >> ... and the second subsequent crash: > > ... and another crash below... > > But in the meanwhile, I wonder if anyone except Andreas and I has some > interest in fixing this issue, as there were absolutely no comments or > feedback on my previous mails. Quite disappointing. :-/ The level of technical support is somewhat proportional to what you pay. > Jul 25 10:50:00 31.172.31.251 [257646.574819] CPU: 0 PID: 0 Comm: > swapper/0 Not tainted 3.16.0-4-amd64 #1 Debian 3.16.7-ckt25-2+deb8u3 > Jul 25 10:50:00 31.172.31.251 [257646.574874] Hardware name: Supermicro > X9SRE/X9SRE-3F/X9SRi/X9SRi-3F/X9SRE/X9SRE-3F/X9SRi/X9SRi-3F, BIOS 3.0a > 01/03/2014 > Jul 25 10:50:00 31.172.31.251 [257646.574934] task: 8181a460 ti: > 8180 task.ti: 8180 > Jul 25 10:50:00 31.172.31.251 [257646.575080] RIP: > e030:[] > Jul 25 10:50:00 31.172.31.251 [] memcpy+0x6/0x110 I think that there's a path in the network stack that is cloning frags without orphaning them. The original frags are freed (and unmapped) and the cloned frags now have unmapped pages. I think newer kernels address at least one of these paths. David ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] Kernel panic on Xen virtualisation in Debian
>> Your report and the debian report suggested that Dom0 kernel is less >> likely to be the culprit because you've tried different Dom0 kernels. > > yes we did. but nothing newer than 3.16 so far, we could try that, too. i have to correct myself: we also tried 4.4 a while ago, maybe we should try 4.6 now ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [Embedded-pv-devel] Xen Project Developer Summit : program Published
On Fri, 2016-07-15 at 10:56 -0400, Meng Xu wrote: > On Fri, Jul 15, 2016 at 8:01 AM, Dario Faggioli > wrote: > > > > * "Consideration of Real Time GPU Scheduling of XenGT in Automotive > > Embedded System" > > Sangyun Lee, LG Electronics http://sched.co/7Lh3 > > > > * "Xenbedded: Xen-based client virtualization for phones and > > tablets" > > Chris Patterson & Kyle Temkin, AIS, Inc. http://sched.co/7LhI > > > > * "Review and Analysis of Performance Metrics of the Xen Hypervisor > > on > > Zynq UltraScale+ MPSoC" > > Jarvis Roach & Benjamin Sanda, DornerWorks http://sched.co/7Lh0 > > > > How cool is all that, eh? > > Very cool~ :-) > :-) > > We better continue to work on putting RTDS into as good as possible > > shape... World may need it soon! :-P :-P > Yes. I totally agree! One most important TODO for RTDS is to convert > it from non-work-conserving model to work-conserving model. I have > been keeping this in mind for a while. I can discuss this in our > internal RT-Xen meeting and try to get a plan for this. > Yes, that would be one good thing. > BTW, I may go to the summit if my Canada visa got approved on time. > Are you coming to the summit? > Not this year. So, if you go, responsibility of talking with (not only!) these people, listen to their needs, and lobby in favour of RTDS will be entirely yours. And you have all my trust, I'm sure you can do it! :-D Regards, Dario -- <> (Raistlin Majere) - Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) signature.asc Description: This is a digitally signed message part ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] Kernel panic on Xen virtualisation in Debian
On Mon, Jul 25, 2016 at 12:42:39PM +0200, Andreas Ziegler wrote: > > On Mon, Jul 25, 2016 at 11:38:20AM +0200, Ingo Jürgensmann wrote: > >> On 22.07.2016 12:21, Ingo Jürgensmann wrote: > >>> On 22.07.2016 11:03, Ingo Jürgensmann wrote: > >>> > In the meanwhile, I activated IPv6 again on Tuesday evening and today > the server crashed again some minutes ago. Here's the output from > netconsole: > >>> ... and the second subsequent crash: > >> > >> ... and another crash below... > >> > >> But in the meanwhile, I wonder if anyone except Andreas and I has some > >> interest in fixing this issue, as there were absolutely no comments or > >> feedback on my previous mails. Quite disappointing. :-/ > >> > > > > I did skim your emails. But the oops was happening in memcpy+0x6 which > > indicated it came back to the origin question why would it got an > > exception there. > > > [...] > > Your report and the debian report suggested that Dom0 kernel is less > > likely to be the culprit because you've tried different Dom0 kernels. > > yes we did. but nothing newer than 3.16 so far, we could try that, too. > There is a 4.5 backport kernel for Jessie. Or you can compile a Dom0 kernel with your config file. If you go down that route, I suggest you either use the newest long term support Linux kernel (4.4 according to kernel.org) or the latest stable kernel. > > > As for Xen, not sure if you would be up for trying a debug build from > > source tree. That would help provide information on whether this is a > > bug in Xen or not. > > ok, we'll try! > > > > As for hardware, it would be worth trying whether this issue happens on > > other hardware platform. > > as i wrote earlier on the list, we have two platforms which only have > the Intel Xeon CPU in common, nothing else - and even that is from a > very different generation. > > OK. Wei. > Andreas. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/3] xen-blkfront: dynamic configuration of per-vbd resources
On Mon, Jul 25, 2016 at 06:29:22PM +0800, Bob Liu wrote: > > On 07/25/2016 05:20 PM, Roger Pau Monné wrote: > > On Sat, Jul 23, 2016 at 06:18:23AM +0800, Bob Liu wrote: > >> > >> On 07/22/2016 07:45 PM, Roger Pau Monné wrote: > >>> On Fri, Jul 22, 2016 at 05:43:32PM +0800, Bob Liu wrote: > > On 07/22/2016 05:34 PM, Roger Pau Monné wrote: > > On Fri, Jul 22, 2016 at 04:17:48PM +0800, Bob Liu wrote: > >> > >> On 07/22/2016 03:45 PM, Roger Pau Monné wrote: > >>> On Thu, Jul 21, 2016 at 06:08:05PM +0800, Bob Liu wrote: > > On 07/21/2016 04:57 PM, Roger Pau Monné wrote: > >> + blk_mq_freeze_queue(info->rq); > >> + if (part_in_flight(&info->gd->part0)) > >> + goto out; > >> + > >> + /* > >> + * FrontBackend > >> + * Switch to XenbusStateClosed > >> + * frontend_changed(): > >> + * case XenbusStateClosed: > >> + * > >> xen_blkif_disconnect() > >> + * Switch to > >> XenbusStateClosed > >> + * blkfront_resume(): > >> + * frontend_changed(): > >> + * reconnect > >> + * Wait until XenbusStateConnected > >> + */ > >> + info->reconfiguring = true; > >> + xenbus_switch_state(info->xbdev, XenbusStateClosed); > >> + > >> + /* Poll every 100ms, 1 minute timeout. */ > >> + for (i = 0; i < 600; i++) { > >> + /* > >> + * Wait backend enter XenbusStateClosed, > >> blkback_changed() > >> + * will clear reconfiguring. > >> + */ > >> + if (!info->reconfiguring) > >> + goto resume; > >> + schedule_timeout_interruptible(msecs_to_jiffies(100)); > >> + } > > > > Instead of having this wait, could you just set info->reconfiguring > > = 1, set > > the frontend state to XenbusStateClosed and mimic exactly what a > > resume from > > suspension does? blkback_changed would have to set the frontend > > state to > > InitWait when it detects that the backend has switched to Closed, > > and call > > blkfront_resume. > > > I think that won't work. > In the real "resume" case, the power management system will trigger > all ->resume() path. > But there is no place for dynamic configuration. > >>> > >>> Hello, > >>> > >>> I think it should be possible to set info->reconfiguring and wait for > >>> the > >>> backend to switch to state Closed, at that point we should call > >>> blkif_resume > >>> (from blkback_changed) and the backend will follow the reconection. > >>> > >> > >> Okay, I get your point. Yes, that's an option. > >> > >> But this will make 'dynamic configuration' to be async, I'm worry > >> about the end-user will get panic. > >> E.g > >> A end-user "echo > /sys/devices/vbd-xxx/max_indirect_segs", > >> but then the device will be Closed and disappeared, the user have to > >> wait for a random time so that the device can resume. > > > > That should not happen, AFAICT on migration the device never > > dissapears. > > Oh, yes. > > > alloc_disk and friends should not be called on resume from migration > > (see > > the switch in blkfront_connect, you should take the > > BLKIF_STATE_SUSPENDED > > path for the reconfiguration). > > > > What about if the end-user starts I/O immediately after writing new > value to /sys? > But the resume is still in progress. > >>> > >>> blkif_free already stops the queues by calling blk_mq_stop_hw_queues, and > >>> blkif_queue_request will refuse to put anything on the ring if the state > >>> is different than connected, which in turn makes blkif_queue_rq call > >>> blk_mq_stop_hw_queue to stop the queue, so it should be safe. > >>> > >> > >> But this will surprise the end-user, our user script is like this: > >> 1) echo > /sys/ > >> 2) Start I/O immediately. > >>^^^ Fail because requests would be refused(even software queue was > >> still freezed). > > > > Which error do you get? AFAICT reads/writes should either block when the > > You are right, the reads/writes will just block which is fine. > > > queue is full, or if the fd is in non-blocking mode it should return EAGAIN > > (which a properly coded application should be prepared to deal with > > gracefully if it's using non-blocking fds anyway). > > > >> It's
Re: [Xen-devel] [xen-unstable test] 97737: regressions - FAIL
Hi Wei, On 25/07/16 09:53, Wei Liu wrote: On Fri, Jul 22, 2016 at 03:27:30AM +, osstest service owner wrote: flight 97737 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/97737/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-armhf-armhf-xl 15 guest-start/debian.repeat fail REGR. vs. 97664 From \ Jul 21 17:08:59.405183 [ 4479.814529] [ cut here ] Jul 21 17:09:16.961529 [ 4479.814600] kernel BUG at drivers/xen/grant-table.c:923! Jul 21 17:09:16.966838 [ 4479.814628] Internal error: Oops - BUG: 0 [#1] SMP ARM Jul 21 17:09:16.972090 [ 4479.814656] Modules linked in: xen_gntalloc bridge stp ipv6 llc brcmfmac brcmutil cfg80211 Jul 21 17:09:16.980340 [ 4479.814759] CPU: 1 PID: 24761 Comm: vif5.0-q0-guest Not tainted 3.16.7-ckt12+ #1 Jul 21 17:09:16.987841 [ 4479.814795] task: d8ef7600 ti: d85bc000 task.ti: d85bc000 Jul 21 17:09:16.993339 [ 4479.814833] PC is at gnttab_batch_copy+0xd0/0xe4 Jul 21 17:09:16.997963 [ 4479.814860] LR is at gnttab_batch_copy+0x1c/0xe4 Jul 21 17:09:17.002718 [ 4479.814888] pc : []lr : [] psr: a0070013 Jul 21 17:09:17.008962 [ 4479.814888] sp : d85bdea0 ip : deadbeef fp : c0c8e140 Jul 21 17:09:17.014341 [ 4479.814935] r10: r9 : e1bec000 r8 : Jul 21 17:09:17.019595 [ 4479.814960] r7 : 0002 r6 : 0002 r5 : d85bdf20 r4 : e1bf4d30 Jul 21 17:09:17.026095 [ 4479.814990] r3 : 0001 r2 : deadbeef r1 : deadbeef r0 : fff2 Jul 21 17:09:17.032717 [ 4479.815021] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel Jul 21 17:09:17.040091 [ 4479.815055] Control: 10c5387d Table: 78d8406a DAC: 0015 Jul 21 17:09:17.045964 [ 4479.815084] Process vif5.0-q0-guest (pid: 24761, stack limit = 0xd85bc248) Jul 21 17:09:17.052840 [ 4479.815114] Stack: (0xd85bdea0 to 0xd85be000) Jul 21 17:09:17.057218 [ 4479.815145] dea0: 0001 d8b11388 d85bdf20 d85bdf04 0002 c05eb054 0388 Jul 21 17:09:17.065469 [ 4479.815183] dec0: d85bdf04 c0b7ea80 db0995c0 c05e86e4 e1bf4000 003c Jul 21 17:09:17.073753 [ 4479.815221] dee0: c0b8849c e1bf4cfc c0c8e140 e1bf4d30 e1bf4cc4 Jul 21 17:09:17.082001 [ 4479.815260] df00: db0c3e80 d85bdf08 d85bdf08 d8c5cb40 d8c5cb40 0001 Jul 21 17:09:17.090217 [ 4479.815298] df20: 0002 0001 e1bf4d30 e1c1f530 04c6 023c Jul 21 17:09:17.098466 [ 4479.815337] df40: d84aab80 e1bec000 c05ea990 Jul 21 17:09:17.106720 [ 4479.815375] df60: c0266238 00f8 e1bec000 Jul 21 17:09:17.114844 [ 4479.815414] df80: d85bdf80 d85bdf80 d85bdf90 d85bdf90 d85bdfac d84aab80 Jul 21 17:09:17.123093 [ 4479.815451] dfa0: c0266168 c020f138 Jul 21 17:09:17.131345 [ 4479.815489] dfc0: Jul 21 17:09:17.139596 [ 4479.815527] dfe0: 0013 Jul 21 17:09:17.147841 [ 4479.815583] [] (gnttab_batch_copy) from [] (xenvif_kthread_guest_rx+0x6c4/0xb58) From my understanding the hypercall can only return a non-zero value if copy_*_guest helpers fails. Those helpers will only fail when it is not possible to retrieve the page associated to a virtual address. The value is in r0 (-EFAULT), seem to confirm that. So this looks very suspicious. Looking at the other parameters and the assembly code (see [1]): count = 2 (saved in r6) batch = 0xe1bf4d30 (saved in r4) They looks valid to me. Also, there was no major change around that code recently. I don't have much ideas what is going on. And unfortunately Xen ARM does not print much information when the translation fail. I have CCed few more people to see if they have a clue. Jul 21 17:09:17.156969 [ 4479.815636] [] (xenvif_kthread_guest_rx) from [] (kthread+0xd0/0xe8) Jul 21 17:09:17.165217 [ 4479.815681] [] (kthread) from [] (ret_from_fork+0x14/0x3c) Jul 21 17:09:17.172467 [ 4479.815721] Code: e1c432b4 eae0 e7f001f2 e8bd80f8 (e7f001f2) Jul 21 17:09:17.178595 [ 4479.815766] ---[ end trace 6ba7d172d52e24e2 ]--- Regards, [1] http://logs.test-lab.xenproject.org/osstest/logs/97737/build-armhf-pvops/info.html c04bb0c0 : c04bb0c0: e92d40f8push{r3, r4, r5, r6, r7, lr} c04bb0c4: e1a02001mov r2, r1 c04bb0c8: e1a04000mov r4, r0 c04bb0cc: e1a06001mov r6, r1 c04bb0d0: e1a01000mov r1, r0 c04bb0d4: e3a5mov r0, #5 c04bb0d8: ebf54e39bl c020e9c4 c04bb0dc: e350cmp r0, #0 c04bb0e0: 1a2abne c04bb190 -- Julien Grall _
Re: [Xen-devel] [PATCH v3 1/4] tools: remove systemd xenstore socket definitions
On Mon, Jul 25, 2016 at 06:33:35AM +0200, Juergen Gross wrote: > On 22/07/16 20:51, Wei Liu wrote: > > On Fri, Jul 22, 2016 at 08:49:17PM +0200, Juergen Gross wrote: > >> On 22/07/16 18:31, Wei Liu wrote: > >>> Only skim-read this patch, will do proper review later. > >>> > >>> On Fri, Jul 22, 2016 at 05:09:28PM +0200, Juergen Gross wrote: > >>> [...] > CAMLprim value ocaml_launched_by_systemd(value ignore) > { > -CAMLparam1(ignore); > -CAMLlocal1(ret); > + CAMLparam1(ignore); > + CAMLlocal1(ret); > > -ret = Val_false; > + ret = Val_false; > > -if (sd_listen_fds(0) > 0) > -ret = Val_true; > + if (sd_booted() > 0) > + ret = Val_true; > >>> > >>> I think this may be problematic. > >>> > >>> sd_booted returns true if system is booted with systemd, but it has no > >>> bearing whether this particular process is launched by systemd. > >>> > >>> IIRC using sd_booted would cause oxenstored thinks it is launched by > >>> systemd even if the user launches it by hand in a shell. That caused > >>> it's initialisation to fail. 81d758afca7c3c1e3ccbd78154b33d64fd7757fb > >>> was written to address that issue. > >>> > >>> So, what would happen if you start oxenstored by hand with your patch > >>> apply? Maybe we can just remove this launched_by_systemd check all > >>> together -- i.e. we always call sd_notify? > > Sure we could. I'll remove the checks in both xenstored variants if > nobody objects. > > >> > >> So you are concerned sd_notify() will be called too often, but you are > >> suggesting to call it always? I don't understand your concerns then. > >> > > > > No, my concern is that you won't be able to start oxenstored from > > command line manually if you boot with systemd. At least that was the > > bug that caused me to write that patch. > > I believe the main problem was xenstored not calling daemonize() in that > case, right? This problem is being remove with my patch as daemonize() > will be called always. The systemd service file is modified to reflect > this change in behavior. > I'm afraid I can't remember all the details. But as long as you can launch [o/c]xenstored by hand I think we're fine. Wei. > > Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/3] xen-blkfront: dynamic configuration of per-vbd resources
On 07/25/2016 06:53 PM, Roger Pau Monné wrote: ..[snip..] * We get the device lock and blk_mq_freeze_queue() in dynamic_reconfig_device(), and then have to release in blkif_recover() asynchronously which makes the code more difficult to readable. >>> >>> I'm not sure I'm following here, do you mean that you will pick the lock in >>> dynamic_reconfig_device and release it in blkif_recover? Why wouldn't you >> >> Yes. >> >>> release the lock in dynamic_reconfig_device after doing whatever is needed? >>> >> >> Both 'dynamic configuration' and migration:xenbus_dev_resume() use { >> blkfront_resume(); blkif_recover() } and depends on the change of >> xbdev->state. >> If they happen simultaneously, the State machine of xbdev->state is going to >> be a mess and very difficult to track. >> >> The lock(actually a mutex) is like a big lock to make sure no race would >> happen at all. > > So the only thing that you should do is set the frontend state to closed and > wait for the backend to also switch to state closed, and then switch the > frontend state to init again in order to trigger a reconnection. > But if migration:xenbus_dev_resume() is triggered at the same time, any state be set might be changed. = E.g Dynamic_reconfig_device: Migration:xenbus_dev_resume() Set the frontend state to closed frontend state will be reset to XenbusStateInitialising by xenbus_dev_resume() Wait for the backend to also switch to state closed = Similar situation may happen at any other place regarding set the state. > You are right that all this process depends on the state being updated > correctly, but I don't see how's that different from a normal connection or > resume, and we don't seem to have races there. > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [xen-unstable test] 97737: regressions - FAIL
Thanks for investigating. There are only two arm related changes in the range being tested: * a43cc8f - (origin/smoke) arm/traps: fix bug in dump_guest_s1_walk handling of level 2 page tables (5 days ago) * 60e06f2 - arm/traps: fix bug in dump_guest_s1_walk L1 page table offset computation (5 days ago) They don't look very suspicious. If you need help navigating osstest test report, please let me know. Wei. On Mon, Jul 25, 2016 at 12:05:08PM +0100, Julien Grall wrote: > Hi Wei, > > On 25/07/16 09:53, Wei Liu wrote: > >On Fri, Jul 22, 2016 at 03:27:30AM +, osstest service owner wrote: > >>flight 97737 xen-unstable real [real] > >>http://logs.test-lab.xenproject.org/osstest/logs/97737/ > >> > >>Regressions :-( > >> > >>Tests which did not succeed and are blocking, > >>including tests which could not be run: > >> test-armhf-armhf-xl 15 guest-start/debian.repeat fail REGR. vs. > >> 97664 > > > >From > > > >\ > > > > > >Jul 21 17:08:59.405183 [ 4479.814529] [ cut here ] > > > >Jul 21 17:09:16.961529 [ 4479.814600] kernel BUG at > >drivers/xen/grant-table.c:923! > > > >Jul 21 17:09:16.966838 [ 4479.814628] Internal error: Oops - BUG: 0 [#1] SMP > >ARM > > > >Jul 21 17:09:16.972090 [ 4479.814656] Modules linked in: xen_gntalloc bridge > >stp ipv6 llc brcmfmac brcmutil cfg80211 > > > >Jul 21 17:09:16.980340 [ 4479.814759] CPU: 1 PID: 24761 Comm: > >vif5.0-q0-guest Not tainted 3.16.7-ckt12+ #1 > > > >Jul 21 17:09:16.987841 [ 4479.814795] task: d8ef7600 ti: d85bc000 task.ti: > >d85bc000 > > > >Jul 21 17:09:16.993339 [ 4479.814833] PC is at gnttab_batch_copy+0xd0/0xe4 > > > >Jul 21 17:09:16.997963 [ 4479.814860] LR is at gnttab_batch_copy+0x1c/0xe4 > > > >Jul 21 17:09:17.002718 [ 4479.814888] pc : []lr : [] > > psr: a0070013 > > > >Jul 21 17:09:17.008962 [ 4479.814888] sp : d85bdea0 ip : deadbeef fp : > >c0c8e140 > > > >Jul 21 17:09:17.014341 [ 4479.814935] r10: r9 : e1bec000 r8 : > > > > > >Jul 21 17:09:17.019595 [ 4479.814960] r7 : 0002 r6 : 0002 r5 : > >d85bdf20 r4 : e1bf4d30 > > > >Jul 21 17:09:17.026095 [ 4479.814990] r3 : 0001 r2 : deadbeef r1 : > >deadbeef r0 : fff2 > > > >Jul 21 17:09:17.032717 [ 4479.815021] Flags: NzCv IRQs on FIQs on Mode > >SVC_32 ISA ARM Segment kernel > > > >Jul 21 17:09:17.040091 [ 4479.815055] Control: 10c5387d Table: 78d8406a > >DAC: 0015 > > > >Jul 21 17:09:17.045964 [ 4479.815084] Process vif5.0-q0-guest (pid: 24761, > >stack limit = 0xd85bc248) > > > >Jul 21 17:09:17.052840 [ 4479.815114] Stack: (0xd85bdea0 to 0xd85be000) > > > >Jul 21 17:09:17.057218 [ 4479.815145] dea0: 0001 d8b11388 d85bdf20 > >d85bdf04 0002 c05eb054 0388 > > > >Jul 21 17:09:17.065469 [ 4479.815183] dec0: d85bdf04 > >c0b7ea80 db0995c0 c05e86e4 e1bf4000 003c > > > >Jul 21 17:09:17.073753 [ 4479.815221] dee0: > >c0b8849c e1bf4cfc c0c8e140 e1bf4d30 e1bf4cc4 > > > >Jul 21 17:09:17.082001 [ 4479.815260] df00: db0c3e80 d85bdf08 > >d85bdf08 d8c5cb40 d8c5cb40 0001 > > > >Jul 21 17:09:17.090217 [ 4479.815298] df20: 0002 0001 > > e1bf4d30 e1c1f530 04c6 023c > > > >Jul 21 17:09:17.098466 [ 4479.815337] df40: d84aab80 > >e1bec000 c05ea990 > > > >Jul 21 17:09:17.106720 [ 4479.815375] df60: c0266238 > > 00f8 e1bec000 > > > >Jul 21 17:09:17.114844 [ 4479.815414] df80: d85bdf80 d85bdf80 > > d85bdf90 d85bdf90 d85bdfac d84aab80 > > > >Jul 21 17:09:17.123093 [ 4479.815451] dfa0: c0266168 > >c020f138 > > > >Jul 21 17:09:17.131345 [ 4479.815489] dfc0: > > > > > >Jul 21 17:09:17.139596 [ 4479.815527] dfe0: > > 0013 > > > >Jul 21 17:09:17.147841 [ 4479.815583] [] (gnttab_batch_copy) from > >[] (xenvif_kthread_guest_rx+0x6c4/0xb58) > > From my understanding the hypercall can only return a non-zero value if > copy_*_guest helpers fails. > > Those helpers will only fail when it is not possible to retrieve the page > associated to a virtual address. The value is in r0 (-EFAULT), seem to > confirm that. So this looks very suspicious. > > Looking at the other parameters and the assembly code (see [1]): > count = 2 (saved in r6) > batch = 0xe1bf4d30 (saved in r4) > > They looks valid to me. Also, there was no major change around that code > recently. > > I don't have much ideas what is going on. And unfortunately Xen ARM does not > print much information when the translation fail. > > I have CCed few more people to see if they have a clue. > > > > >Jul 21 17:09:17.156969 [ 4479.815636] [] (xenvif_kthread_guest_rx) > >from [] (kthread+0xd0/0xe8) > > > >Jul 21 17:09:17.1
Re: [Xen-devel] [PATCH 2/3] xen: Have schedulers revise initial placement
On 18/07/16 11:28, Dario Faggioli wrote: > On Fri, 2016-07-15 at 19:02 +0100, George Dunlap wrote: >> The generic domain creation logic in >> xen/common/domctl.c:default_vcpu0_location() attempts to try to do >> initial placement load-balancing by placing vcpu 0 on the least-busy >> non-primary hyperthread available. Unfortunately, the logic can end >> up picking a pcpu that's not in the online mask. When this is passed >> to a scheduler such which assumes that the initial assignment is >> valid, it causes a null pointer dereference looking up the runqueue. >> >> Furthermore, this initial placement doesn't take into account hard or >> soft affinity, or any scheduler-specific knowledge (such as historic >> runqueue load, as in credit2). >> >> To solve this, when inserting a vcpu, always call the per-scheduler >> "pick" function to revise the initial placement. This will >> automatically take all knowledge the scheduler has into account. >> >> Signed-off-by: George Dunlap >> --- >> CC: Dario Faggioli >> CC: Anshul Makkar >> CC: Meng Xu >> CC: Jan Beulich >> > Reviewed-by: Dario Faggioli Just to clarify - does this cover the sched_rt.c changes as well? Thanks, -George ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/4] docs: Clarify the expected behaviour of zero length records
Andrew Cooper writes ("[PATCH 1/4] docs: Clarify the expected behaviour of zero length records"): > The sending side shouldn't send any variable sized records which end > up having zero content, but the receiving side will need to tolerate > such records for compatibility purposes. ... > +Some records have an exactly specified size. Some records have > +variable size depending on their content. A record with variable > +size which ends up being zero should be omitted entirely from the > +stream by the sending side. I'm not sure this is best stated as a general rule. Consider a record type which specifies `explicit wombat layout' and consists of a list of `wombat locations'. Absence of the record would be semantically different from an empty record. In the absence of the record, the receiver would do automatic choice of number and layout of wombats. With an empty record, the receiver would create zero wombats (which might be different to the default). I think the underlying spec bug here is that the specification for `X86_PV_VCPU_BASIC, EXTENDED, XSAVE, MSRS' is too vague. It simply specifies context Binary data for this VCPU. and `The format of these records are identical. They are all binary blobs of data which are accessed using specific pairs of domctl hypercalls.'. I wasn't able to immediately find the corresponding hypercalls (which is another docs bug). But if the hypercalls generate and consume 0-length blocks, then I think it is fine for the migration stream to contain them. Have I misunderstood ? Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH XTF 3/3] xtf-runner: regularise runner exit code
Andrew Cooper writes ("Re: [PATCH XTF 3/3] xtf-runner: regularise runner exit code"): > On 22/07/16 10:29, Wei Liu wrote: > > This would cause a FAILURE to overwrite previous ERROR result. Is that > > what you want? > > When running more than one test, the overall result should be the most > severe. So yes, a subsequent FAILURE should override an ERROR. "Error" is surely a more severe problem than "Failure". In particular, Error might mask a failure. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 16/18] arm/altp2m: Extended libxl to activate altp2m on ARM.
On 07/25/2016 12:08 PM, Wei Liu wrote: > On Mon, Jul 25, 2016 at 10:49:41AM +0100, Julien Grall wrote: >> Hello, >> >> On 25/07/16 10:04, Sergej Proskurin wrote: >>> On 07/25/2016 10:32 AM, Wei Liu wrote: On Sun, Jul 24, 2016 at 06:06:00PM +0200, Sergej Proskurin wrote: >>> +xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2M, >>> + libxl_defbool_val(info->u.pv.altp2m)); >>> +} >>> +#endif >>> + >>> static void hvm_set_conf_params(xc_interface *handle, uint32_t domid, >>> libxl_domain_build_info *const info) >>> { >>> @@ -433,6 +443,10 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid, >>> return rc; >>> #endif >>> } >>> +#if defined(__arm__) || defined(__aarch64__) >>> +else /* info->type == LIBXL_DOMAIN_TYPE_PV */ >>> +pv_set_conf_params(ctx->xch, domid, info); >>> +#endif >>> >>> rc = libxl__arch_domain_create(gc, d_config, domid); >>> >>> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl >>> index ef614be..0a164f9 100644 >>> --- a/tools/libxl/libxl_types.idl >>> +++ b/tools/libxl/libxl_types.idl >>> @@ -554,6 +554,7 @@ libxl_domain_build_info = >>> Struct("domain_build_info",[ >>> ("features", string, {'const': >>> True}), >>> # Use host's E820 for PCI >>> passthrough. >>> ("e820_host", libxl_defbool), >>> + ("altp2m", libxl_defbool), >> Why is this placed in PV instead of arch_arm? >> > The current implementation mirrors the x86's altp2m, where the altp2m > field represents a property of HVM guests in b_info->u.hvm.altp2m. Since > guests on ARM are marked as PV, I have placed the field altp2m into > b_info->u.pv.altp2m. > > However, if you believe that it would make more sense to place altp2m > into b_info->arch_arm.altp2m, I will try to adapt the affected fields. > OK. I don't think that one should be placed in u.pv because that union is for x86. However it is also not suitable to just promote that to a common field because x86 pv doesn't have altp2m support. I think arch_arm would be a better location. >>> Alright, I will move the field altp2m to arch_arm and adopt the >>> initialization routines. >> Would not it make more sense to introduce a generic field 'altp2m' (i.e >> outside of hvm and arch_arm)? >> >> Otherwise, toolstack such as libvirt will need to have specific code to >> handle ARM and x86. >> > Hmm... this is a compelling reason. > > After mulling over this issue a bit more, I think I'm fine with > promoting altp2m to a common field. > > Sergej, this is a patch to get you started. Please make sure setting the > old field still works. If I'm not clear enough, please ask. > Alright, thank you. One question up front: Do we still need two different LIBXL_HAVE macros then? Also, concerning the dom configuration parameters: Do you think we should use two different dom configuration parameters (the legacy "altp2mhvm" and the new "altp2m") or shall we merge both into "altp2m" as we will address only one common altp2m field in the struct? Best regards ~Sergej ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] Kernel panic on Xen virtualisation in Debian
On 25.07.2016 12:51, Wei Liu wrote: [...] > Your report and the debian report suggested that Dom0 kernel is less > likely to be the culprit because you've tried different Dom0 kernels. yes we did. but nothing newer than 3.16 so far, we could try that, too. There is a 4.5 backport kernel for Jessie. Or you can compile a Dom0 kernel with your config file. If you go down that route, I suggest you either use the newest long term support Linux kernel (4.4 according to kernel.org) or the latest stable kernel. I already tried kernel 4.5 from backports, but it was still crashing: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=804079#39 -- Ciao... //http://blog.windfluechter.net Ingo \X/ XMPP: i...@jabber.windfluechter.net gpg pubkey: http://www.juergensmann.de/ij_public_key.asc ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] tools/libxc: Properly increment ApicIdCoreSize field on AMD
Wei Liu writes ("Re: [PATCH] tools/libxc: Properly increment ApicIdCoreSize field on AMD"): > On Fri, Jul 22, 2016 at 01:45:05PM -0400, Boris Ostrovsky wrote: > > On 07/22/2016 01:38 PM, Wei Liu wrote: > > But this actually fixes a regression that was triggered by recent > > hvmloader change on one particular processor that we have in the test farm. > > Ian, this is a backport candidate and needs to go back as far as > possible. The bogus calculation was introduced in 2008 (!). Thanks, but: Not queued for backport because the patch is not in staging. Thanks, Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [xen-unstable test] 97737: regressions - FAIL
On 25/07/16 12:11, Wei Liu wrote: Thanks for investigating. There are only two arm related changes in the range being tested: * a43cc8f - (origin/smoke) arm/traps: fix bug in dump_guest_s1_walk handling of level 2 page tables (5 days ago) * 60e06f2 - arm/traps: fix bug in dump_guest_s1_walk L1 page table offset computation (5 days ago) They don't look very suspicious. The modified function is not called in the hypervisor at all. It's only here for manual debugging. Although, this may change the offset of some function (assuming we have an hidden bug). If you need help navigating osstest test report, please let me know. I have noticed that there is 2 kernel BUG in the logs (with one host reboot in the middle). Can you detail what the exact the test? It looks to me that you are trying to power cycle multiple time a guest. Cheers, Wei. On Mon, Jul 25, 2016 at 12:05:08PM +0100, Julien Grall wrote: Hi Wei, On 25/07/16 09:53, Wei Liu wrote: On Fri, Jul 22, 2016 at 03:27:30AM +, osstest service owner wrote: flight 97737 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/97737/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-armhf-armhf-xl 15 guest-start/debian.repeat fail REGR. vs. 97664 From \ Jul 21 17:08:59.405183 [ 4479.814529] [ cut here ] Jul 21 17:09:16.961529 [ 4479.814600] kernel BUG at drivers/xen/grant-table.c:923! Jul 21 17:09:16.966838 [ 4479.814628] Internal error: Oops - BUG: 0 [#1] SMP ARM Jul 21 17:09:16.972090 [ 4479.814656] Modules linked in: xen_gntalloc bridge stp ipv6 llc brcmfmac brcmutil cfg80211 Jul 21 17:09:16.980340 [ 4479.814759] CPU: 1 PID: 24761 Comm: vif5.0-q0-guest Not tainted 3.16.7-ckt12+ #1 Jul 21 17:09:16.987841 [ 4479.814795] task: d8ef7600 ti: d85bc000 task.ti: d85bc000 Jul 21 17:09:16.993339 [ 4479.814833] PC is at gnttab_batch_copy+0xd0/0xe4 Jul 21 17:09:16.997963 [ 4479.814860] LR is at gnttab_batch_copy+0x1c/0xe4 Jul 21 17:09:17.002718 [ 4479.814888] pc : []lr : [] psr: a0070013 Jul 21 17:09:17.008962 [ 4479.814888] sp : d85bdea0 ip : deadbeef fp : c0c8e140 Jul 21 17:09:17.014341 [ 4479.814935] r10: r9 : e1bec000 r8 : Jul 21 17:09:17.019595 [ 4479.814960] r7 : 0002 r6 : 0002 r5 : d85bdf20 r4 : e1bf4d30 Jul 21 17:09:17.026095 [ 4479.814990] r3 : 0001 r2 : deadbeef r1 : deadbeef r0 : fff2 Jul 21 17:09:17.032717 [ 4479.815021] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel Jul 21 17:09:17.040091 [ 4479.815055] Control: 10c5387d Table: 78d8406a DAC: 0015 Jul 21 17:09:17.045964 [ 4479.815084] Process vif5.0-q0-guest (pid: 24761, stack limit = 0xd85bc248) Jul 21 17:09:17.052840 [ 4479.815114] Stack: (0xd85bdea0 to 0xd85be000) Jul 21 17:09:17.057218 [ 4479.815145] dea0: 0001 d8b11388 d85bdf20 d85bdf04 0002 c05eb054 0388 Jul 21 17:09:17.065469 [ 4479.815183] dec0: d85bdf04 c0b7ea80 db0995c0 c05e86e4 e1bf4000 003c Jul 21 17:09:17.073753 [ 4479.815221] dee0: c0b8849c e1bf4cfc c0c8e140 e1bf4d30 e1bf4cc4 Jul 21 17:09:17.082001 [ 4479.815260] df00: db0c3e80 d85bdf08 d85bdf08 d8c5cb40 d8c5cb40 0001 Jul 21 17:09:17.090217 [ 4479.815298] df20: 0002 0001 e1bf4d30 e1c1f530 04c6 023c Jul 21 17:09:17.098466 [ 4479.815337] df40: d84aab80 e1bec000 c05ea990 Jul 21 17:09:17.106720 [ 4479.815375] df60: c0266238 00f8 e1bec000 Jul 21 17:09:17.114844 [ 4479.815414] df80: d85bdf80 d85bdf80 d85bdf90 d85bdf90 d85bdfac d84aab80 Jul 21 17:09:17.123093 [ 4479.815451] dfa0: c0266168 c020f138 Jul 21 17:09:17.131345 [ 4479.815489] dfc0: Jul 21 17:09:17.139596 [ 4479.815527] dfe0: 0013 Jul 21 17:09:17.147841 [ 4479.815583] [] (gnttab_batch_copy) from [] (xenvif_kthread_guest_rx+0x6c4/0xb58) From my understanding the hypercall can only return a non-zero value if copy_*_guest helpers fails. Those helpers will only fail when it is not possible to retrieve the page associated to a virtual address. The value is in r0 (-EFAULT), seem to confirm that. So this looks very suspicious. Looking at the other parameters and the assembly code (see [1]): count = 2 (saved in r6) batch = 0xe1bf4d30 (saved in r4) They looks valid to me. Also, there was no major change around that code recently. I don't have much ideas what is going on. And unfortunately Xen ARM does not print much information when the translation fail. I have CCed few more people to see if they have a clue. Jul 21 17:09:17.15696
Re: [Xen-devel] [PATCH 16/18] arm/altp2m: Extended libxl to activate altp2m on ARM.
On Mon, Jul 25, 2016 at 01:26:29PM +0200, Sergej Proskurin wrote: > > On 07/25/2016 12:08 PM, Wei Liu wrote: > > On Mon, Jul 25, 2016 at 10:49:41AM +0100, Julien Grall wrote: > >> Hello, > >> > >> On 25/07/16 10:04, Sergej Proskurin wrote: > >>> On 07/25/2016 10:32 AM, Wei Liu wrote: > On Sun, Jul 24, 2016 at 06:06:00PM +0200, Sergej Proskurin wrote: > >>> +xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2M, > >>> + libxl_defbool_val(info->u.pv.altp2m)); > >>> +} > >>> +#endif > >>> + > >>> static void hvm_set_conf_params(xc_interface *handle, uint32_t domid, > >>> libxl_domain_build_info *const info) > >>> { > >>> @@ -433,6 +443,10 @@ int libxl__build_pre(libxl__gc *gc, uint32_t > >>> domid, > >>> return rc; > >>> #endif > >>> } > >>> +#if defined(__arm__) || defined(__aarch64__) > >>> +else /* info->type == LIBXL_DOMAIN_TYPE_PV */ > >>> +pv_set_conf_params(ctx->xch, domid, info); > >>> +#endif > >>> > >>> rc = libxl__arch_domain_create(gc, d_config, domid); > >>> > >>> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl > >>> index ef614be..0a164f9 100644 > >>> --- a/tools/libxl/libxl_types.idl > >>> +++ b/tools/libxl/libxl_types.idl > >>> @@ -554,6 +554,7 @@ libxl_domain_build_info = > >>> Struct("domain_build_info",[ > >>> ("features", string, {'const': > >>> True}), > >>> # Use host's E820 for PCI > >>> passthrough. > >>> ("e820_host", libxl_defbool), > >>> + ("altp2m", libxl_defbool), > >> Why is this placed in PV instead of arch_arm? > >> > > The current implementation mirrors the x86's altp2m, where the altp2m > > field represents a property of HVM guests in b_info->u.hvm.altp2m. Since > > guests on ARM are marked as PV, I have placed the field altp2m into > > b_info->u.pv.altp2m. > > > > However, if you believe that it would make more sense to place altp2m > > into b_info->arch_arm.altp2m, I will try to adapt the affected fields. > > > OK. I don't think that one should be placed in u.pv because that union > is for x86. However it is also not suitable to just promote that to > a common field because x86 pv doesn't have altp2m support. > > I think arch_arm would be a better location. > > >>> Alright, I will move the field altp2m to arch_arm and adopt the > >>> initialization routines. > >> Would not it make more sense to introduce a generic field 'altp2m' (i.e > >> outside of hvm and arch_arm)? > >> > >> Otherwise, toolstack such as libvirt will need to have specific code to > >> handle ARM and x86. > >> > > Hmm... this is a compelling reason. > > > > After mulling over this issue a bit more, I think I'm fine with > > promoting altp2m to a common field. > > > > Sergej, this is a patch to get you started. Please make sure setting the > > old field still works. If I'm not clear enough, please ask. > > > > Alright, thank you. One question up front: Do we still need two > different LIBXL_HAVE macros then? > Yes, we still need one. Something like LIBXL_HAVE_COMMON_ALTP2M ? I'm bad at naming things though. > Also, concerning the dom configuration parameters: Do you think we > should use two different dom configuration parameters (the legacy > "altp2mhvm" and the new "altp2m") or shall we merge both into "altp2m" > as we will address only one common altp2m field in the struct? > I think using altp2m is fine. But altp2mhvm needs to continue to work because it is trivial to keep the old one. In any case, document the new parameter, deprecate the old one and document and the relationship between the two. Wei. > Best regards > ~Sergej > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] Kernel panic on Xen virtualisation in Debian
On 25.07.2016 12:23, Wei Liu wrote: First, thank you for replying! Very much appreciated! :) I did skim your emails. But the oops was happening in memcpy+0x6 which indicated it came back to the origin question why would it got an exception there. Just by staring at the code doesn't get me anywhere. Without a concrete reproduction of the issue, I'm afraid I can't provide more input here. Well, from my point of view, it happens quite often when accessing the server via SSH. For example today it crashed when I wanted to add something and after I clicked into putty and typed the first char. In another putty, where I have my netconsole log open, I instantly saw the oops. But what exactly causing these kinds of reboots, I'm clueless as you too. Only that I do experience far more frequent crashes when accessing the server from workplace via putty on Windows than via SSH on OSX or Debian Linux. There are several moving parts: 0. Hardware 1. Xen hypervisor 2. Dom0 kernel Your report and the debian report suggested that Dom0 kernel is less likely to be the culprit because you've tried different Dom0 kernels. As just written in the other mail, I already tried kernel 4.5 from backports. Still crashing. As for Xen, not sure if you would be up for trying a debug build from source tree. That would help provide information on whether this is a bug in Xen or not. Will try to build from Debian source, but how to enable debug build? -- Ciao... //http://blog.windfluechter.net Ingo \X/ XMPP: i...@jabber.windfluechter.net gpg pubkey: http://www.juergensmann.de/ij_public_key.asc ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/4] tools/libxc: Avoid generating inappropriate zero-length records
David Vrabel writes ("Re: [Xen-devel] [PATCH 3/4] tools/libxc: Avoid generating inappropriate zero-length records"): > For records such as HVM_PARAMS which consist of a set of N items, the > intention was to most definitely send a record with 0 items. > > For records that fetch an opaque blob from the hypervisor, again the > intention was to sent this blob as-is with no sort of processing or > other checking. i.e., if the hypervisor gives us a zero-length blob we > sent that as-is. > > This makes all the streams look the same with all the same records, > regardless of what hardware platform it was run on. Including > zero-length/count records also makes diagnosing problems easier -- the > empty record is visible in the stream instead of having to remember that > sometimes these records are deliberately omitted. > > As such, this series should be limited to making the restore side handle > the zero count sets or zero length blobs if it does not do so already. > > The specification should be clarified to note that some records may have > zero-length blobs or contain zero items. I think I prefer David's view here, but I don't quite feel I understand what the underlying bug is. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [xen-unstable test] 97737: regressions - FAIL
On Mon, Jul 25, 2016 at 12:34:53PM +0100, Julien Grall wrote: > > > On 25/07/16 12:11, Wei Liu wrote: > >Thanks for investigating. > > > >There are only two arm related changes in the range being tested: > > > >* a43cc8f - (origin/smoke) arm/traps: fix bug in dump_guest_s1_walk handling > >of level 2 page tables (5 days ago) > >* 60e06f2 - arm/traps: fix bug in dump_guest_s1_walk L1 page table offset > >computation (5 days ago) > > > >They don't look very suspicious. > > The modified function is not called in the hypervisor at all. It's only here > for manual debugging. > > Although, this may change the offset of some function (assuming we have an > hidden bug). > > >If you need help navigating osstest test report, please let me know. > > I have noticed that there is 2 kernel BUG in the logs (with one host reboot > in the middle). Can you detail what the exact the test? What I normally do is to look at the summary page of the failed test to identify the failed step and the time. In this case: http://logs.test-lab.xenproject.org/osstest/logs/97737/test-armhf-armhf-xl/info.html The time stamp said the failed step started at 2016-07-21 19:30:10 Z, and then I look at the output of failed step log to look for time stamp that the test failed. Then I would look for output between these two time stamps in various log. I now realise the log I pasted in was not from the failed test. I wanted to paste in the second kernel oops, which should be the culprit that the test failed. The two oops was the same one, though. To identify which test step was running when the first oops happened, the same technique applies. It seems that the oops happened during ts-debian-install according to time stamps. Wei. > > It looks to me that you are trying to power cycle multiple time a guest. > > Cheers, > > >Wei. > > > > > >On Mon, Jul 25, 2016 at 12:05:08PM +0100, Julien Grall wrote: > >>Hi Wei, > >> > >>On 25/07/16 09:53, Wei Liu wrote: > >>>On Fri, Jul 22, 2016 at 03:27:30AM +, osstest service owner wrote: > flight 97737 xen-unstable real [real] > http://logs.test-lab.xenproject.org/osstest/logs/97737/ > > Regressions :-( > > Tests which did not succeed and are blocking, > including tests which could not be run: > test-armhf-armhf-xl 15 guest-start/debian.repeat fail REGR. vs. > 97664 > >>> > >>>From > >>> > >>>\ > >>> > >>> > >>>Jul 21 17:08:59.405183 [ 4479.814529] [ cut here ] > >>> > >>>Jul 21 17:09:16.961529 [ 4479.814600] kernel BUG at > >>>drivers/xen/grant-table.c:923! > >>> > >>>Jul 21 17:09:16.966838 [ 4479.814628] Internal error: Oops - BUG: 0 [#1] > >>>SMP ARM > >>> > >>>Jul 21 17:09:16.972090 [ 4479.814656] Modules linked in: xen_gntalloc > >>>bridge stp ipv6 llc brcmfmac brcmutil cfg80211 > >>> > >>>Jul 21 17:09:16.980340 [ 4479.814759] CPU: 1 PID: 24761 Comm: > >>>vif5.0-q0-guest Not tainted 3.16.7-ckt12+ #1 > >>> > >>>Jul 21 17:09:16.987841 [ 4479.814795] task: d8ef7600 ti: d85bc000 task.ti: > >>>d85bc000 > >>> > >>>Jul 21 17:09:16.993339 [ 4479.814833] PC is at gnttab_batch_copy+0xd0/0xe4 > >>> > >>>Jul 21 17:09:16.997963 [ 4479.814860] LR is at gnttab_batch_copy+0x1c/0xe4 > >>> > >>>Jul 21 17:09:17.002718 [ 4479.814888] pc : []lr : > >>>[]psr: a0070013 > >>> > >>>Jul 21 17:09:17.008962 [ 4479.814888] sp : d85bdea0 ip : deadbeef fp : > >>>c0c8e140 > >>> > >>>Jul 21 17:09:17.014341 [ 4479.814935] r10: r9 : e1bec000 r8 : > >>> > >>> > >>>Jul 21 17:09:17.019595 [ 4479.814960] r7 : 0002 r6 : 0002 r5 : > >>>d85bdf20 r4 : e1bf4d30 > >>> > >>>Jul 21 17:09:17.026095 [ 4479.814990] r3 : 0001 r2 : deadbeef r1 : > >>>deadbeef r0 : fff2 > >>> > >>>Jul 21 17:09:17.032717 [ 4479.815021] Flags: NzCv IRQs on FIQs on Mode > >>>SVC_32 ISA ARM Segment kernel > >>> > >>>Jul 21 17:09:17.040091 [ 4479.815055] Control: 10c5387d Table: 78d8406a > >>>DAC: 0015 > >>> > >>>Jul 21 17:09:17.045964 [ 4479.815084] Process vif5.0-q0-guest (pid: 24761, > >>>stack limit = 0xd85bc248) > >>> > >>>Jul 21 17:09:17.052840 [ 4479.815114] Stack: (0xd85bdea0 to 0xd85be000) > >>> > >>>Jul 21 17:09:17.057218 [ 4479.815145] dea0: 0001 d8b11388 d85bdf20 > >>>d85bdf04 0002 c05eb054 0388 > >>> > >>>Jul 21 17:09:17.065469 [ 4479.815183] dec0: d85bdf04 > >>>c0b7ea80 db0995c0 c05e86e4 e1bf4000 003c > >>> > >>>Jul 21 17:09:17.073753 [ 4479.815221] dee0: > >>>c0b8849c e1bf4cfc c0c8e140 e1bf4d30 e1bf4cc4 > >>> > >>>Jul 21 17:09:17.082001 [ 4479.815260] df00: db0c3e80 d85bdf08 > >>>d85bdf08 d8c5cb40 d8c5cb40 0001 > >>> > >>>Jul 21 17:09:17.090217 [ 4479.815298] df20: 0002 0001 > >>> e1bf4d30 e1c1f530 04c6 023c > >>> > >>>Jul 21 17:09:17.098466 [ 4479.815337] df40: d84aab80 > >>>e1bec000 c05ea990 > >>> > >>>Jul 21 17:09:17.106
Re: [Xen-devel] [PATCH XTF 3/3] xtf-runner: regularise runner exit code
On 25/07/16 12:25, Ian Jackson wrote: > Andrew Cooper writes ("Re: [PATCH XTF 3/3] xtf-runner: regularise runner exit > code"): >> On 22/07/16 10:29, Wei Liu wrote: >>> This would cause a FAILURE to overwrite previous ERROR result. Is that >>> what you want? >> When running more than one test, the overall result should be the most >> severe. So yes, a subsequent FAILURE should override an ERROR. > "Error" is surely a more severe problem than "Failure". No. Error means "something unexpected went wrong", while Failure is "the test worked, and identified that the area under test is defective". Error is typically a test case bug or infrastructure issue. > In particular, Error might mask a failure. Indeed it could. Either should cause a prompt investigation and effort to resolve the issues, but one must be more severe than the other and this is the way things are specified. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/3] xen-blkfront: dynamic configuration of per-vbd resources
On Mon, Jul 25, 2016 at 07:08:36PM +0800, Bob Liu wrote: > > On 07/25/2016 06:53 PM, Roger Pau Monné wrote: > ..[snip..] > * We get the device lock and blk_mq_freeze_queue() in > dynamic_reconfig_device(), > and then have to release in blkif_recover() asynchronously which > makes the code more difficult to readable. > >>> > >>> I'm not sure I'm following here, do you mean that you will pick the lock > >>> in > >>> dynamic_reconfig_device and release it in blkif_recover? Why wouldn't you > >> > >> Yes. > >> > >>> release the lock in dynamic_reconfig_device after doing whatever is > >>> needed? > >>> > >> > >> Both 'dynamic configuration' and migration:xenbus_dev_resume() use { > >> blkfront_resume(); blkif_recover() } and depends on the change of > >> xbdev->state. > >> If they happen simultaneously, the State machine of xbdev->state is going > >> to be a mess and very difficult to track. > >> > >> The lock(actually a mutex) is like a big lock to make sure no race would > >> happen at all. > > > > So the only thing that you should do is set the frontend state to closed > > and > > wait for the backend to also switch to state closed, and then switch the > > frontend state to init again in order to trigger a reconnection. > > > > But if migration:xenbus_dev_resume() is triggered at the same time, any state > be set might be changed. > = > E.g > Dynamic_reconfig_device: > Migration:xenbus_dev_resume() > > Set the frontend state to closed > > > frontend state will be > reset to XenbusStateInitialising by xenbus_dev_resume() > > Wait for the backend to also switch to state closed This is not really a race, the backend will not switch to state closed, and instead will appear at state init again, which is what we wanted anyway in order to reconnect, so I don't see an issue with this flow. > = > Similar situation may happen at any other place regarding set the state. Right, but I don't see how your proposed patch also avoids this issues. I see that you pick the device lock in dynamic_reconfig_device, but I don't see xenbus_dev_resume picking the lock at all, and in any case I don't think you should prevent the VM from migrating. Depending on the toolstack it might decide to kill the VM because it has not been able to migrate it, in which case the result is not better. Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/4] tools/libxc: Tolerate zero-length records in migration v2 streams
On 21/07/16 18:17, Andrew Cooper wrote: > Under some circumstances, the migration v2 save code would generate valid > records with zero content, when the intended behaviour was to omit the record As explained, this is not the intended behaviour. I would appreciate it if you did not misrepresent me here. > entirely. > > As the stream is otherwise fine, tolerate these records and avoid failing the > migration. [...] > --- a/tools/libxc/xc_sr_restore_x86_hvm.c > +++ b/tools/libxc/xc_sr_restore_x86_hvm.c [...] > +/* > + * Tolerate empty records. Older sending sides used to accidentally > + * generate them. > + */ > +if ( hdr->count == 0 ) > +{ > +DBGPRINTF("Skipping empty HVM_PARAMS record\n"); > +return 0; > +} > + > for ( i = 0; i < hdr->count; i++, entry++ ) This loop already handles hdr->count == 0. The additional check is not required. > --- a/tools/libxc/xc_sr_restore_x86_pv.c > +++ b/tools/libxc/xc_sr_restore_x86_pv.c > @@ -753,15 +753,26 @@ static int handle_x86_pv_vcpu_blob(struct xc_sr_context > *ctx, > } > > /* Confirm that there is a complete header. */ > -if ( rec->length <= sizeof(*vhdr) ) > +if ( rec->length < sizeof(*vhdr) ) > { > -ERROR("%s record truncated: length %u, min %zu", > - rec_name, rec->length, sizeof(*vhdr) + 1); > +ERROR("%s record truncated: length %u, header size %zu", > + rec_name, rec->length, sizeof(*vhdr)); > goto out; > } > > blobsz = rec->length - sizeof(*vhdr); > > +/* > + * Tolerate empty records. Older sending sides used to accidentally > + * generate them. > + */ > +if ( blobsz == 0 ) > +{ > +DBGPRINTF("Skipping empty %s record for vcpu %u\n", > + rec_type_to_str(rec->type), vhdr->vcpu_id); > +goto out; > +} This check for a zero-sized blob should be immediately prior to the blob = malloc(blobsz); So all the other length validation is not skipped. In particular, some record types may wish to make zero-length blobs invalid. David ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 1/4] tools: remove systemd xenstore socket definitions
On 25/07/16 13:05, Wei Liu wrote: > On Mon, Jul 25, 2016 at 06:33:35AM +0200, Juergen Gross wrote: >> On 22/07/16 20:51, Wei Liu wrote: >>> On Fri, Jul 22, 2016 at 08:49:17PM +0200, Juergen Gross wrote: On 22/07/16 18:31, Wei Liu wrote: > Only skim-read this patch, will do proper review later. > > On Fri, Jul 22, 2016 at 05:09:28PM +0200, Juergen Gross wrote: > [...] >> CAMLprim value ocaml_launched_by_systemd(value ignore) >> { >> -CAMLparam1(ignore); >> -CAMLlocal1(ret); >> + CAMLparam1(ignore); >> + CAMLlocal1(ret); >> >> -ret = Val_false; >> + ret = Val_false; >> >> -if (sd_listen_fds(0) > 0) >> -ret = Val_true; >> + if (sd_booted() > 0) >> + ret = Val_true; > > I think this may be problematic. > > sd_booted returns true if system is booted with systemd, but it has no > bearing whether this particular process is launched by systemd. > > IIRC using sd_booted would cause oxenstored thinks it is launched by > systemd even if the user launches it by hand in a shell. That caused > it's initialisation to fail. 81d758afca7c3c1e3ccbd78154b33d64fd7757fb > was written to address that issue. > > So, what would happen if you start oxenstored by hand with your patch > apply? Maybe we can just remove this launched_by_systemd check all > together -- i.e. we always call sd_notify? >> >> Sure we could. I'll remove the checks in both xenstored variants if >> nobody objects. >> So you are concerned sd_notify() will be called too often, but you are suggesting to call it always? I don't understand your concerns then. >>> >>> No, my concern is that you won't be able to start oxenstored from >>> command line manually if you boot with systemd. At least that was the >>> bug that caused me to write that patch. >> >> I believe the main problem was xenstored not calling daemonize() in that >> case, right? This problem is being remove with my patch as daemonize() >> will be called always. The systemd service file is modified to reflect >> this change in behavior. >> > > I'm afraid I can't remember all the details. But as long as you can > launch [o/c]xenstored by hand I think we're fine. Verified for both xenstored variants. Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/3] xen-blkfront: dynamic configuration of per-vbd resources
On 07/25/2016 08:11 PM, Roger Pau Monné wrote: > On Mon, Jul 25, 2016 at 07:08:36PM +0800, Bob Liu wrote: >> >> On 07/25/2016 06:53 PM, Roger Pau Monné wrote: >> ..[snip..] >> * We get the device lock and blk_mq_freeze_queue() in >> dynamic_reconfig_device(), >>and then have to release in blkif_recover() asynchronously which >> makes the code more difficult to readable. > > I'm not sure I'm following here, do you mean that you will pick the lock > in > dynamic_reconfig_device and release it in blkif_recover? Why wouldn't you Yes. > release the lock in dynamic_reconfig_device after doing whatever is > needed? > Both 'dynamic configuration' and migration:xenbus_dev_resume() use { blkfront_resume(); blkif_recover() } and depends on the change of xbdev->state. If they happen simultaneously, the State machine of xbdev->state is going to be a mess and very difficult to track. The lock(actually a mutex) is like a big lock to make sure no race would happen at all. >>> >>> So the only thing that you should do is set the frontend state to closed >>> and >>> wait for the backend to also switch to state closed, and then switch the >>> frontend state to init again in order to trigger a reconnection. >>> >> >> But if migration:xenbus_dev_resume() is triggered at the same time, any >> state be set might be changed. >> = >> E.g >> Dynamic_reconfig_device: >> Migration:xenbus_dev_resume() >> >> Set the frontend state to closed >> >> >> frontend state will be >> reset to XenbusStateInitialising by xenbus_dev_resume() >> >> Wait for the backend to also switch to state closed > > This is not really a race, the backend will not switch to state closed, and > instead will appear at state init again, which is what we wanted anyway in > order to reconnect, so I don't see an issue with this flow. > >> = >> Similar situation may happen at any other place regarding set the state. > > Right, but I don't see how your proposed patch also avoids this issues. I > see that you pick the device lock in dynamic_reconfig_device, but I don't > see xenbus_dev_resume picking the lock at all, and in any case I don't think The lock is picked from the power management framework. Anyway, I'm convinced and will follow your suggestion. Thank you! > you should prevent the VM from migrating. > > Depending on the toolstack it might decide to kill the VM because it has not > been able to migrate it, in which case the result is not better. > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] Kernel panic on Xen virtualisation in Debian
Monday, July 25, 2016, 1:27:20 PM, you wrote: > On 25.07.2016 12:51, Wei Liu wrote: >>> [...] >>> > Your report and the debian report suggested that Dom0 kernel is less >>> > likely to be the culprit because you've tried different Dom0 kernels. >>> yes we did. but nothing newer than 3.16 so far, we could try that, >>> too. >> There is a 4.5 backport kernel for Jessie. Or you can compile a Dom0 >> kernel with your config file. If you go down that route, I suggest you >> either use the newest long term support Linux kernel (4.4 according to >> kernel.org) or the latest stable kernel. > I already tried kernel 4.5 from backports, but it was still crashing: > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=804079#39 Hi Ingo, Couldn't this just be a more generic Linux kernel issue with the combination of ipv6, bridging and ndisc ? Wouldn't be the first time that a config more common on systems running Xen, would more prominently expose a kernel bug in the networking stack. Have you brought this up on the LKML / netdev mailinglists already ? -- Sander ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH linux v2 0/9] xen: pvhvm: support bootup on secondary vCPUs
On 30/06/16 16:56, Vitaly Kuznetsov wrote: > It may happen that Xen's and Linux's ideas of vCPU id diverge. In > particular, when we crash on a secondary vCPU we may want to do kdump > and unlike plain kexec where we do migrate_to_reboot_cpu() we try booting > on the vCPU which crashed. This doesn't work very well for PVHVM guests as > we have a number of hypercalls where we pass vCPU id as a parameter. These > hypercalls either fail or do something unexpected. To solve the issue we > need to have a mapping between Linux's and Xen's vCPU ids. > > This series solves the issue for x86 PVHVM guests. PV guests don't (and > probably won't) support kdump so I always assume Xen's vCPU id == Linux's > vCPU id. ARM guests will probably need to get proper mapping once we start > supporting kexec/kdump there. Applied to for-linus-4.8, thanks. David ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/4] tools/libxc: Tolerate zero-length records in migration v2 streams
On 25/07/16 13:21, David Vrabel wrote: > On 21/07/16 18:17, Andrew Cooper wrote: >> Under some circumstances, the migration v2 save code would generate valid >> records with zero content, when the intended behaviour was to omit the record > > As explained, this is not the intended behaviour. And as previously stated, I still disagree. > I would appreciate it > if you did not misrepresent me here. WTF? I am the author of this code, and I very definitely intended for the record to be omitted. I am not representing you in at all, let alone mis-representation. There is clearly a difference of opinion, and there is no guidance from the spec, but accusations like this are not warranted. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [ovmf test] 99608: regressions - FAIL
flight 99608 ovmf real [real] http://logs.test-lab.xenproject.org/osstest/logs/99608/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-amd64 5 xen-build fail REGR. vs. 94748 build-i386-xsm5 xen-build fail REGR. vs. 94748 build-i3865 xen-build fail REGR. vs. 94748 build-amd64-xsm 5 xen-build fail REGR. vs. 94748 Tests which did not succeed, but are not blocking: build-amd64-libvirt 1 build-check(1) blocked n/a build-i386-libvirt1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a version targeted for testing: ovmf 227a1ac2d04504eee4c624f534ea3fcbb8140029 baseline version: ovmf dc99315b8732b6e3032d01319d3f534d440b43d0 Last test of basis94748 2016-05-24 22:43:25 Z 61 days Failing since 94750 2016-05-25 03:43:08 Z 61 days 122 attempts Testing same since99608 2016-07-25 10:57:02 Z0 days1 attempts People who touched revisions under test: Anandakrishnan Loganathan Ard Biesheuvel Bi, Dandan Bret Barkelew Bruce Cran Bruce Cran Chao Zhang Cinnamon Shia Cohen, Eugene Dandan Bi Darbin Reyes david wei Eric Dong Eugene Cohen Evan Lloyd Evgeny Yakovlev Feng Tian Fu Siyuan Fu, Siyuan Gary Li Gary Lin Giri P Mudusuru Graeme Gregory Hao Wu Hegde Nagaraj P Hegde, Nagaraj P hegdenag Heyi Guo Jan D?bro? Jan Dabros Jeff Fan Jeremy Linton Jiaxin Wu Jiewen Yao Joe Zhou Jordan Justen Katie Dellaquila Laszlo Ersek Liming Gao Lu, ShifeiX A lushifex Marcin Wojtas Mark Rutland Marvin H?user Marvin Haeuser Maurice Ma Michael Chang Michael Zimmermann Mudusuru, Giri P Ni, Ruiyu Qin Long Qiu Shumin Ruiyu Ni Ruiyu Niã Ryan Harkin Sami Mujawar Satya Yarlagadda Shannon Zhao Shi, Steven Sriram Subramanian Star Zeng Steven Shi Subramanian, Sriram (EG Servers Platform SW) Sunny Wang Tapan Shah Thomas Palmer Yarlagadda, Satya P Yonghong Zhu Zhang Lubo Zhang, Chao B Zhang, Lubo jobs: build-amd64-xsm fail build-i386-xsm fail build-amd64 fail build-i386 fail build-amd64-libvirt blocked build-i386-libvirt blocked build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-xl-qemuu-ovmf-amd64 blocked test-amd64-i386-xl-qemuu-ovmf-amd64 blocked sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Not pushing. (No revision log; it would be 11446 lines long.) ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/4] tools/libxc: Tolerate zero-length records in migration v2 streams
On 25/07/16 13:46, Andrew Cooper wrote: > On 25/07/16 13:21, David Vrabel wrote: >> On 21/07/16 18:17, Andrew Cooper wrote: >>> Under some circumstances, the migration v2 save code would generate valid >>> records with zero content, when the intended behaviour was to omit the >>> record >> >> As explained, this is not the intended behaviour. > > And as previously stated, I still disagree. > >> I would appreciate it >> if you did not misrepresent me here. > > WTF? > > I am the author of this code, and I very definitely intended for the > record to be omitted. I am not representing you in at all, let alone > mis-representation. As the original and principle author of the specification I definitely intended for empty records to be included. In the context above (given the lack of clarity in the text) it is not unreasonable for someone to conclude that its talking about the specification author's intentions. Thus without clarifying whose intentions you're talking about, you are mis-representing my profession opinion. So, yes. I am serious about you changing the commit message. I do not want your design decisions (good or bad) attributed to me. David ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] Kernel panic on Xen virtualisation in Debian
On Mon, Jul 25, 2016 at 01:41:41PM +0200, Ingo Jürgensmann wrote: > On 25.07.2016 12:23, Wei Liu wrote: > > First, thank you for replying! Very much appreciated! :) > > >I did skim your emails. But the oops was happening in memcpy+0x6 which > >indicated it came back to the origin question why would it got an > >exception there. > > > >Just by staring at the code doesn't get me anywhere. Without a concrete > >reproduction of the issue, I'm afraid I can't provide more input here. > > Well, from my point of view, it happens quite often when accessing the > server via SSH. For example today it crashed when I wanted to add something > and after I clicked into putty and typed the first char. In another putty, > where I have my netconsole log open, I instantly saw the oops. > > But what exactly causing these kinds of reboots, I'm clueless as you too. > Only that I do experience far more frequent crashes when accessing the > server from workplace via putty on Windows than via SSH on OSX or Debian > Linux. > > >There are several moving parts: > >0. Hardware > >1. Xen hypervisor > >2. Dom0 kernel > >Your report and the debian report suggested that Dom0 kernel is less > >likely to be the culprit because you've tried different Dom0 kernels. > > As just written in the other mail, I already tried kernel 4.5 from > backports. Still crashing. > > >As for Xen, not sure if you would be up for trying a debug build from > >source tree. That would help provide information on whether this is a > >bug in Xen or not. > > Will try to build from Debian source, but how to enable debug build? > I was thinking about building directly from xen.git. http://wiki.xenproject.org/wiki/Compiling_Xen_From_Source Probably try the Xen 4.7 release. Wei. > -- > Ciao... //http://blog.windfluechter.net > Ingo \X/ XMPP: i...@jabber.windfluechter.net > > > gpg pubkey: http://www.juergensmann.de/ij_public_key.asc ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH linux v2 0/9] xen: pvhvm: support bootup on secondary vCPUs
Hi David, On 25/07/16 13:38, David Vrabel wrote: On 30/06/16 16:56, Vitaly Kuznetsov wrote: It may happen that Xen's and Linux's ideas of vCPU id diverge. In particular, when we crash on a secondary vCPU we may want to do kdump and unlike plain kexec where we do migrate_to_reboot_cpu() we try booting on the vCPU which crashed. This doesn't work very well for PVHVM guests as we have a number of hypercalls where we pass vCPU id as a parameter. These hypercalls either fail or do something unexpected. To solve the issue we need to have a mapping between Linux's and Xen's vCPU ids. This series solves the issue for x86 PVHVM guests. PV guests don't (and probably won't) support kdump so I always assume Xen's vCPU id == Linux's vCPU id. ARM guests will probably need to get proper mapping once we start supporting kexec/kdump there. Applied to for-linus-4.8, thanks. It would have been nice to send a ping before applying. This patch series is containing Xen ARM code which has not been acked by Stefano, nor had feedback from ARM side. For instance given that all the hypercalls are representing a "vcpu id" using "uint32_t" it is a bit weird to use "int" to define xen_vcpu_id (see patch #3). Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/2] systemd: use standard dependencies for xendriverdomain.service
On Mon, Jul 25, 2016 at 10:17:44AM +0100, Wei Liu wrote: > On Sun, Jul 24, 2016 at 09:26:57PM +0200, Marek Marczykowski-Górecki wrote: > > Having DefaultDependencies=no means it can be started before / is > > remounted read-write, which will result in various failures (to start > > with opening the log). > > Since "libxl: trigger attach events for devices attached before xl devd > > startup" it is no longer important to start it as early as possible, > > because it will process devices created before its startup. > > > > Cc: Ian Jackson > > Cc: Wei Liu > > Signed-off-by: Marek Marczykowski-Górecki > > The reasoning makes sense. > > Acked-by: Wei Liu > > Also CC Rusty (author of this file) > Pushed. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] tools/libxc: Properly increment ApicIdCoreSize field on AMD
On Mon, Jul 25, 2016 at 12:33:37PM +0100, Ian Jackson wrote: > Wei Liu writes ("Re: [PATCH] tools/libxc: Properly increment ApicIdCoreSize > field on AMD"): > > On Fri, Jul 22, 2016 at 01:45:05PM -0400, Boris Ostrovsky wrote: > > > On 07/22/2016 01:38 PM, Wei Liu wrote: > > > But this actually fixes a regression that was triggered by recent > > > hvmloader change on one particular processor that we have in the test > > > farm. > > > > Ian, this is a backport candidate and needs to go back as far as > > possible. The bogus calculation was introduced in 2008 (!). > > Thanks, but: > > Not queued for backport because the patch is not in staging. > Pushed to staging. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v3 00/13] linux: generalize sections, ranges and linker tables
Hi Luis, On Fri, 22 Jul 2016 14:24:34 -0700 "Luis R. Rodriguez" wrote: > This RFC v3 builds off the last RFC v2 series [0] for adding linker tables. > The largest amount of work here was to take Russell King's feedback on > using linker table for kprobes text not being appropriate -- and providing > another lightweight API for simple section ranges: read-only stitched pieces > of executable code. This required also generalizing common building blocks > for both linker tables and section ranges, these building blocks are defined > now in include/linux/sections.h and asm-generic/section.h. The other last > thing > decided was to not support sub-sections. In the hunt for this I could think of > anything that really required this, and if it was needed it did not seem > impossible to port over to avoid its use. Please let me know if there are > valid > uses cases for sub-sections. > > Other significant effort here was to provide a set of common assembly helpers > which could be used across architectures, this starts off some of this work > for generic helpers which carve out and define custom Linux sections. > > Lastly, this now also goes with two ports which required module support when > using linker tables: jump labels, and dynamic debug support. A few > extensions have been made to the original series in order to provide > support for that. > > Since kprobes actually had both a linker table and a section range the > patch that dealt with kprobes is now split off in two patches, one > that deals with its linker table and another for its section ranges. > > More elaborate uses for linker tables are possible, I'll hold off on any > of this type of work until at least the basic building blocks are fleshed > out. To review how this work came about, and more elaborate uses being > evaluated check out the userspace linker-tables mockup solution [1]. > Hopefully most of the possible bikeshedding was already dealt with through > that tree. Thanks to hpa for tons of feedback. Great! so table and ranges completely replace the old-style(add-hoc) _kprobe and NOKPROBE_SYMBOL() implementation, good job! :) > > Should you need it, the code here is also available on my linux-next > 20160722-linker-table-v3-try2 branch on kernel.org [2]. Lastly, ranges and > table development go under copyleft-next, Rusty recently asked for code > to go in prior to the license tag being added denoting this license as > GPL-compatible [3] -- I had noted in the patch submission which annotated > copyleft-next's compatibility to GPLv2 that copyleft-next is the license > of choice for ongoing kernel development on my end [4]. If this is > objectionable I'm happy to change it to GPLv2 however I'd like a reason > provided as I've gone through all possible channels to ensure this is kosher, > including vetting by 3 attorneys now, 2 at SUSE. I'm not a lawyer, so I don't know really it is compatible with GPLv2, and if it is, I'm not sure the reason why we need another license. AFAICS the license terms, most of parts looks reasonable. I just concern clause 8, after fifteen years, is that still GPLv2 compatible? (I'd like see FAQ about this license...) Thank you! > > This all goes tested by 0-day... however since I found an issue with linker > tables and blackfin this series remains as RFC -- I'll try to debug the > issue with Steven Miao. The issue with blackfin is using a config that > enables CONFIG_FW_LOADER=y and CONFIG_DYNAMIC_DEBUG=y [5] we end up with > the following at final link time: > > LD init/built-in.o > lib/built-in.o: In function `dynamic_debug_init': > lib/dynamic_debug.c:(.init.text+0x156): undefined reference to > `__verbose__end' > lib/dynamic_debug.c:(.init.text+0x15a): undefined reference to > `__verbose__end' > lib/dynamic_debug.c:(.init.text+0x160): undefined reference to `__verbose' > lib/dynamic_debug.c:(.init.text+0x164): undefined reference to `__verbose' > lib/dynamic_debug.c:(.init.text+0x214): undefined reference to > `__verbose__end' > lib/dynamic_debug.c:(.init.text+0x218): undefined reference to > `__verbose__end' > lib/dynamic_debug.c:(.init.text+0x252): undefined reference to > `__verbose__end' > lib/dynamic_debug.c:(.init.text+0x258): undefined reference to `__verbose' > lib/dynamic_debug.c:(.init.text+0x25c): undefined reference to > `__verbose__end' > lib/dynamic_debug.c:(.init.text+0x260): undefined reference to `__verbose' > drivers/built-in.o: In function `release_firmware': > (.text+0x22dc2): undefined reference to `builtin_fw' > drivers/built-in.o: In function `release_firmware': > (.text+0x22dc6): undefined reference to `builtin_fw__end' > drivers/built-in.o: In function `release_firmware': > (.text+0x22dca): undefined reference to `builtin_fw' > drivers/built-in.o: In function `release_firmware': > (.text+0x22dce): undefined reference to `builtin_fw__end' > drivers/built-in.o: In function `fw_get_builtin_firmware': > drivers/base/firmware_class.c:(.text+0x23008): undefined ref
Re: [Xen-devel] [PATCH linux v2 0/9] xen: pvhvm: support bootup on secondary vCPUs
Julien Grall writes: > Hi David, > > On 25/07/16 13:38, David Vrabel wrote: >> On 30/06/16 16:56, Vitaly Kuznetsov wrote: >>> It may happen that Xen's and Linux's ideas of vCPU id diverge. In >>> particular, when we crash on a secondary vCPU we may want to do kdump >>> and unlike plain kexec where we do migrate_to_reboot_cpu() we try booting >>> on the vCPU which crashed. This doesn't work very well for PVHVM guests as >>> we have a number of hypercalls where we pass vCPU id as a parameter. These >>> hypercalls either fail or do something unexpected. To solve the issue we >>> need to have a mapping between Linux's and Xen's vCPU ids. >>> >>> This series solves the issue for x86 PVHVM guests. PV guests don't (and >>> probably won't) support kdump so I always assume Xen's vCPU id == Linux's >>> vCPU id. ARM guests will probably need to get proper mapping once we start >>> supporting kexec/kdump there. >> >> Applied to for-linus-4.8, thanks. > > It would have been nice to send a ping before applying. This patch > series is containing Xen ARM code which has not been acked by Stefano, > nor had feedback from ARM side. > > For instance given that all the hypercalls are representing a "vcpu > id" using "uint32_t" it is a bit weird to use "int" to define > xen_vcpu_id (see patch #3). CPU id is usually 'int' in linux and now we pass it to all hypercalls as it is. It is a bit more convenient in the mapping I introduce as we can set it to a negative value to indicate there is no mapping available. I can definitely change that and use something like U32_MAX-1 to instead but I'm not sure it is worth it... -- Vitaly ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [Qemu-devel] [PATCH 01/19] xen: Create a new file xen_pvdev.c
On Sun, Jul 10, 2016 at 02:47:32PM +0300, Emil Condrea wrote: > The purpose of the new file is to store generic functions shared by frontend > and backends such as xenstore operations, xendevs. > > Signed-off-by: Quan Xu > Signed-off-by: Emil Condrea > --- > hw/xen/Makefile.objs | 2 +- > hw/xen/xen_backend.c | 125 +--- > hw/xen/xen_pvdev.c | 149 > +++ > include/hw/xen/xen_backend.h | 63 +- > include/hw/xen/xen_pvdev.h | 71 + > 5 files changed, 223 insertions(+), 187 deletions(-) > create mode 100644 hw/xen/xen_pvdev.c > create mode 100644 include/hw/xen/xen_pvdev.h > > diff --git a/include/hw/xen/xen_pvdev.h b/include/hw/xen/xen_pvdev.h > new file mode 100644 > index 000..f60bfae > --- /dev/null > +++ b/include/hw/xen/xen_pvdev.h > @@ -0,0 +1,71 @@ > +#ifndef QEMU_HW_XEN_PVDEV_H > +#define QEMU_HW_XEN_PVDEV_H 1 No need for the 1. Just "#define X" is enough. > + > +#include "hw/xen/xen_common.h" > +#include I don't think this include is needed. -- Anthony PERARD ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 4/4] tools: make xenstore domain easy configurable
On Fri, Jul 22, 2016 at 05:09:31PM +0200, Juergen Gross wrote: [...] > diff --git a/tools/hotplug/Linux/launch-xenstore.in > b/tools/hotplug/Linux/launch-xenstore.in > index 2bd9f64..fdfa33a 100644 > --- a/tools/hotplug/Linux/launch-xenstore.in > +++ b/tools/hotplug/Linux/launch-xenstore.in > @@ -48,18 +48,40 @@ test_xenstore && exit 0 > > test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . > @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons > > -test -z "$XENSTORED_ROOTDIR" && XENSTORED_ROOTDIR="@XEN_LIB_STORED@" > -rm -f "$XENSTORED_ROOTDIR"/tdb* 2>/dev/null > -test -z "$XENSTORED_TRACE" || XENSTORED_ARGS=" -T > @XEN_LOG_DIR@/xenstored-trace.log" > +[ "$XENSTORETYPE" = "" ] && XENSTORETYPE=daemon > + > +/bin/mkdir -p @XEN_RUN_DIR@ > + > +[ "$XENSTORETYPE" = "daemon" ] && { > + [ -z "$XENSTORED_ROOTDIR" ] && XENSTORED_ROOTDIR="@XEN_LIB_STORED@" > + /bin/rm -f $XENSTORED_ROOTDIR/tdb* 2>/dev/null > + [ -z "$XENSTORED_TRACE" ] || XENSTORED_ARGS="$XENSTORED_ARGS -T > @XEN_LOG_DIR@/xenstored-trace.log" > + [ -z "$XENSTORED" ] && XENSTORED=@XENSTORED@ > + [ -x "$XENSTORED" ] || { > + echo "No xenstored found" > + exit 1 > + } > > -if [ -n "$XENSTORED" ] ; then > echo -n Starting $XENSTORED... > $XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS > -else > - echo "No xenstored found" > - exit 1 > -fi > > -timeout_xenstore $XENSTORED || exit 1 > + systemd-notify --booted 2>/dev/null || timeout_xenstore $XENSTORED || > exit 1 > > -exit 0 > + exit 0 > +} > + > +[ "$XENSTORETYPE" = "domain" ] && { > + [ -z "$XENSTORE_DOMAIN_KERNEL" ] && > XENSTORE_DOMAIN_KERNEL=@LIBEXEC@/boot/xenstore-stubdom.gz > + XENSTORE_DOMAIN_ARGS="$XENSTORE_DOMAIN_ARGS --kernel > $XENSTORE_DOMAIN_KERNEL" > + [ -z "$XENSTORE_DOMAIN_SIZE" ] && XENSTORE_DOMAIN_SIZE=8 > + XENSTORE_DOMAIN_ARGS="$XENSTORE_DOMAIN_ARGS --memory > $XENSTORE_DOMAIN_SIZE" > + > + echo -n Starting $XENSTORE_DOMAIN_KERNEL... > + ${LIBEXEC_BIN}/init-xenstore-domain $XENSTORE_DOMAIN_ARGS || exit 1 > + systemd-notify --ready 2>/dev/null Please test if there is systemd-notify before trying to invoke it and then you can properly log the failure of the invocation. Wei. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 4/4] tools: make xenstore domain easy configurable
On 25/07/16 15:43, Wei Liu wrote: > On Fri, Jul 22, 2016 at 05:09:31PM +0200, Juergen Gross wrote: > [...] >> diff --git a/tools/hotplug/Linux/launch-xenstore.in >> b/tools/hotplug/Linux/launch-xenstore.in >> index 2bd9f64..fdfa33a 100644 >> --- a/tools/hotplug/Linux/launch-xenstore.in >> +++ b/tools/hotplug/Linux/launch-xenstore.in >> @@ -48,18 +48,40 @@ test_xenstore && exit 0 >> >> test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . >> @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons >> >> -test -z "$XENSTORED_ROOTDIR" && XENSTORED_ROOTDIR="@XEN_LIB_STORED@" >> -rm -f "$XENSTORED_ROOTDIR"/tdb* 2>/dev/null >> -test -z "$XENSTORED_TRACE" || XENSTORED_ARGS=" -T >> @XEN_LOG_DIR@/xenstored-trace.log" >> +[ "$XENSTORETYPE" = "" ] && XENSTORETYPE=daemon >> + >> +/bin/mkdir -p @XEN_RUN_DIR@ >> + >> +[ "$XENSTORETYPE" = "daemon" ] && { >> +[ -z "$XENSTORED_ROOTDIR" ] && XENSTORED_ROOTDIR="@XEN_LIB_STORED@" >> +/bin/rm -f $XENSTORED_ROOTDIR/tdb* 2>/dev/null >> +[ -z "$XENSTORED_TRACE" ] || XENSTORED_ARGS="$XENSTORED_ARGS -T >> @XEN_LOG_DIR@/xenstored-trace.log" >> +[ -z "$XENSTORED" ] && XENSTORED=@XENSTORED@ >> +[ -x "$XENSTORED" ] || { >> +echo "No xenstored found" >> +exit 1 >> +} >> >> -if [ -n "$XENSTORED" ] ; then >> echo -n Starting $XENSTORED... >> $XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS >> -else >> -echo "No xenstored found" >> -exit 1 >> -fi >> >> -timeout_xenstore $XENSTORED || exit 1 >> +systemd-notify --booted 2>/dev/null || timeout_xenstore $XENSTORED || >> exit 1 >> >> -exit 0 >> +exit 0 >> +} >> + >> +[ "$XENSTORETYPE" = "domain" ] && { >> +[ -z "$XENSTORE_DOMAIN_KERNEL" ] && >> XENSTORE_DOMAIN_KERNEL=@LIBEXEC@/boot/xenstore-stubdom.gz >> +XENSTORE_DOMAIN_ARGS="$XENSTORE_DOMAIN_ARGS --kernel >> $XENSTORE_DOMAIN_KERNEL" >> +[ -z "$XENSTORE_DOMAIN_SIZE" ] && XENSTORE_DOMAIN_SIZE=8 >> +XENSTORE_DOMAIN_ARGS="$XENSTORE_DOMAIN_ARGS --memory >> $XENSTORE_DOMAIN_SIZE" >> + >> +echo -n Starting $XENSTORE_DOMAIN_KERNEL... >> +${LIBEXEC_BIN}/init-xenstore-domain $XENSTORE_DOMAIN_ARGS || exit 1 >> +systemd-notify --ready 2>/dev/null > > Please test if there is systemd-notify before trying to invoke it and > then you can properly log the failure of the invocation. I thought about that. What would be the purpose doing so? Following cases are possible: - system has no systemd installed: systemd-notify will fail, but calling it was not necessary -> no harm done - system is with systemd, but not booted under control of it, and systemd-notify is not found: same as above -> no harm done - system is with systemd, but not booted under control of it, and systemd-notify is found: calling systemd-notify isn't really needed, but it won't harm - system is booted under control of systemd, systemd-notify is not found: I could log it, but I can't know that I'm under control of systemd (standard way to tell from a script is calling "systemd-notify --booted" which is kind of chicken and egg problem here) so I can't know whether not finding systemd-notify is an error or not - system is booted under control of systemd, systemd-notify is found: everything is nice, systemd receives the notification it is waiting for Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 07/19] xen: Rename xen_be_unbind_evtchn
On Sun, Jul 10, 2016 at 02:47:38PM +0300, Emil Condrea wrote: > Prepare xen_be_unbind_evtchn to be shared with frontends: > * xen_be_unbind_evtchn -> xen_pv_unbind_evtchn > > Signed-off-by: Emil Condrea > --- > hw/block/xen_disk.c| 2 +- > hw/char/xen_console.c | 2 +- > hw/display/xenfb.c | 2 +- > hw/net/xen_nic.c | 2 +- > hw/xen/xen_pvdev.c | 2 +- > include/hw/xen/xen_pvdev.h | 2 +- > 6 files changed, 6 insertions(+), 6 deletions(-) There is one change missing in hw/usb/xen-usb.c. -- Anthony PERARD ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 2/4] tools: split out xenstored starting form xencommons
On Fri, Jul 22, 2016 at 05:09:29PM +0200, Juergen Gross wrote: > In order to prepare starting a xenstore domain split out the starting > of the xenstore daemon from the xencommons script into a dedicated > launch-xenstore script. > > Correct one error: don't remove old tdb files in background, as this > could lead to very subtle races. > > A rerun of autogen.sh is required. > > Signed-off-by: Juergen Gross Acked-by: Wei Liu ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH linux v2 0/9] xen: pvhvm: support bootup on secondary vCPUs
On 25/07/16 14:17, Julien Grall wrote: > Hi David, > > On 25/07/16 13:38, David Vrabel wrote: >> On 30/06/16 16:56, Vitaly Kuznetsov wrote: >>> It may happen that Xen's and Linux's ideas of vCPU id diverge. In >>> particular, when we crash on a secondary vCPU we may want to do kdump >>> and unlike plain kexec where we do migrate_to_reboot_cpu() we try >>> booting >>> on the vCPU which crashed. This doesn't work very well for PVHVM >>> guests as >>> we have a number of hypercalls where we pass vCPU id as a parameter. >>> These >>> hypercalls either fail or do something unexpected. To solve the issue we >>> need to have a mapping between Linux's and Xen's vCPU ids. >>> >>> This series solves the issue for x86 PVHVM guests. PV guests don't (and >>> probably won't) support kdump so I always assume Xen's vCPU id == >>> Linux's >>> vCPU id. ARM guests will probably need to get proper mapping once we >>> start >>> supporting kexec/kdump there. >> >> Applied to for-linus-4.8, thanks. > > It would have been nice to send a ping before applying. This patch > series is containing Xen ARM code which has not been acked by Stefano, > nor had feedback from ARM side. The ARM parts are trivial and such a small part of this series, I saw no need to wait. David ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v3 00/13] linux: generalize sections, ranges and linker tables
On 07/25/2016 09:32 AM, Masami Hiramatsu wrote: > I'm not a lawyer, so I don't know really it is compatible with GPLv2, > and if it is, I'm not sure the reason why we need another license. > AFAICS the license terms, most of parts looks reasonable. I just concern > clause 8, after fifteen years, is that still GPLv2 compatible? > (I'd like see FAQ about this license...) Yes, after fifteen years the restrictions in the license that make it copyleft rather than permissive disappear, and the license is transformed into a simple, essentially unconditional, permissive open source license. Richard ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 02/19] xen: Create a new file xen_frontend.c
On Sun, Jul 10, 2016 at 02:47:33PM +0300, Emil Condrea wrote: > Its purpose is to store frontend related functions. > > Signed-off-by: Quan Xu > Signed-off-by: Emil Condrea > --- > hw/block/xen_disk.c | 1 + > hw/display/xenfb.c| 1 + > hw/net/xen_nic.c | 1 + > hw/xen/Makefile.objs | 2 +- > hw/xen/xen_backend.c | 47 + > hw/xen/xen_frontend.c | 70 > +++ > include/hw/xen/xen_backend.h | 3 -- > include/hw/xen/xen_frontend.h | 10 +++ > 8 files changed, 85 insertions(+), 50 deletions(-) > create mode 100644 hw/xen/xen_frontend.c > create mode 100644 include/hw/xen/xen_frontend.h > > diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c > index 90aca73..28fbf24 100644 > --- a/hw/block/xen_disk.c > +++ b/hw/block/xen_disk.c > @@ -25,6 +25,7 @@ > > #include "hw/hw.h" > #include "hw/xen/xen_backend.h" > +#include "hw/xen/xen_frontend.h" > #include "xen_blkif.h" > #include "sysemu/blockdev.h" > #include "sysemu/block-backend.h" > diff --git a/hw/display/xenfb.c b/hw/display/xenfb.c > index 46b7d5e..5751113 100644 > --- a/hw/display/xenfb.c > +++ b/hw/display/xenfb.c > @@ -30,6 +30,7 @@ > #include "ui/console.h" > #include "sysemu/char.h" > #include "hw/xen/xen_backend.h" > +#include "hw/xen/xen_frontend.h" > > #include > #include > diff --git a/hw/net/xen_nic.c b/hw/net/xen_nic.c > index 0b4ddae..bdfa789 100644 > --- a/hw/net/xen_nic.c > +++ b/hw/net/xen_nic.c > @@ -29,6 +29,7 @@ > #include "net/checksum.h" > #include "net/util.h" > #include "hw/xen/xen_backend.h" > +#include "hw/xen/xen_frontend.h" > > #include > There is also "hw/usb/xen-usb.c" that needs to be updated to include "hw/xen/xen_frontend.h". > diff --git a/include/hw/xen/xen_frontend.h b/include/hw/xen/xen_frontend.h > new file mode 100644 > index 000..46485b9 > --- /dev/null > +++ b/include/hw/xen/xen_frontend.h > @@ -0,0 +1,10 @@ > +#ifndef QEMU_HW_XEN_FRONTEND_H > +#define QEMU_HW_XEN_FRONTEND_H 1 No need for the 1 here. -- Anthony PERARD ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 08/19] xen: Rename xen_be_send_notify
On Sun, Jul 10, 2016 at 02:47:39PM +0300, Emil Condrea wrote: > Prepare xen_be_send_notify to be shared with frontends: > * xen_be_send_notify -> xen_pv_send_notify > > Signed-off-by: Emil Condrea > --- > hw/block/xen_disk.c| 4 ++-- > hw/char/xen_console.c | 4 ++-- > hw/display/xenfb.c | 8 > hw/net/xen_nic.c | 4 ++-- > hw/xen/xen_pvdev.c | 2 +- > include/hw/xen/xen_pvdev.h | 2 +- > 6 files changed, 12 insertions(+), 12 deletions(-) hw/usb/xen-usb.c needs to be updated as well. -- Anthony PERARD ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 0/4] tools: make xenstore domain/daemon configurable
I went back to the long thread in v1. I don't think I can figure out if all the comments are addressed. Ian asked for something to be rectified, but I'm not sure if this series satisfies him. I will let him comment on that. Wei. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH linux v2 0/9] xen: pvhvm: support bootup on secondary vCPUs
Hello, On 25/07/16 14:39, Vitaly Kuznetsov wrote: Julien Grall writes: Hi David, On 25/07/16 13:38, David Vrabel wrote: On 30/06/16 16:56, Vitaly Kuznetsov wrote: It may happen that Xen's and Linux's ideas of vCPU id diverge. In particular, when we crash on a secondary vCPU we may want to do kdump and unlike plain kexec where we do migrate_to_reboot_cpu() we try booting on the vCPU which crashed. This doesn't work very well for PVHVM guests as we have a number of hypercalls where we pass vCPU id as a parameter. These hypercalls either fail or do something unexpected. To solve the issue we need to have a mapping between Linux's and Xen's vCPU ids. This series solves the issue for x86 PVHVM guests. PV guests don't (and probably won't) support kdump so I always assume Xen's vCPU id == Linux's vCPU id. ARM guests will probably need to get proper mapping once we start supporting kexec/kdump there. Applied to for-linus-4.8, thanks. It would have been nice to send a ping before applying. This patch series is containing Xen ARM code which has not been acked by Stefano, nor had feedback from ARM side. For instance given that all the hypercalls are representing a "vcpu id" using "uint32_t" it is a bit weird to use "int" to define xen_vcpu_id (see patch #3). CPU id is usually 'int' in linux and now we pass it to all hypercalls as it is. Well, we need to differentiate between the internal representation of the CPU which is based on the boot order and the logical CPU ID. For instance on ARM, the logical CPU ID may not be contiguous nor 0 for the first CPU. From my understanding, the macros in patch #3 will be used at the last minute when prepare the hypercall data. IHMO this is very similar to a logical ID and defined as uint32_t by the hypercall ABI. Although, I agree that currently we use the internal CPU id on ARM which is very unfortunate because this value is based on the order of the nodes in the device tree. One way to abolish it on ARM would be to use the MPIDR (or at least a part) for the VCPU ID. It is a bit more convenient in the mapping I introduce as we can set it to a negative value to indicate there is no mapping available. I can definitely change that and use something like U32_MAX-1 to instead but I'm not sure it is worth it... I looked at the definition of cpu_acpi_id on x86 which return x86_cpu_to_acpiid that has been defined to an uint32_t. So you are assuming that it will never be possible to have an ID > 0x8000. Also, this may not be true on ARM depending how we define the VCPU mapping. We could decide to use the MPIDR which is in this case may be considered as "negative". Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 4/4] tools: make xenstore domain easy configurable
On Mon, Jul 25, 2016 at 03:56:17PM +0200, Juergen Gross wrote: > On 25/07/16 15:43, Wei Liu wrote: > > On Fri, Jul 22, 2016 at 05:09:31PM +0200, Juergen Gross wrote: > > [...] > >> diff --git a/tools/hotplug/Linux/launch-xenstore.in > >> b/tools/hotplug/Linux/launch-xenstore.in > >> index 2bd9f64..fdfa33a 100644 > >> --- a/tools/hotplug/Linux/launch-xenstore.in > >> +++ b/tools/hotplug/Linux/launch-xenstore.in > >> @@ -48,18 +48,40 @@ test_xenstore && exit 0 > >> > >> test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . > >> @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons > >> > >> -test -z "$XENSTORED_ROOTDIR" && XENSTORED_ROOTDIR="@XEN_LIB_STORED@" > >> -rm -f "$XENSTORED_ROOTDIR"/tdb* 2>/dev/null > >> -test -z "$XENSTORED_TRACE" || XENSTORED_ARGS=" -T > >> @XEN_LOG_DIR@/xenstored-trace.log" > >> +[ "$XENSTORETYPE" = "" ] && XENSTORETYPE=daemon > >> + > >> +/bin/mkdir -p @XEN_RUN_DIR@ > >> + > >> +[ "$XENSTORETYPE" = "daemon" ] && { > >> + [ -z "$XENSTORED_ROOTDIR" ] && XENSTORED_ROOTDIR="@XEN_LIB_STORED@" > >> + /bin/rm -f $XENSTORED_ROOTDIR/tdb* 2>/dev/null > >> + [ -z "$XENSTORED_TRACE" ] || XENSTORED_ARGS="$XENSTORED_ARGS -T > >> @XEN_LOG_DIR@/xenstored-trace.log" > >> + [ -z "$XENSTORED" ] && XENSTORED=@XENSTORED@ > >> + [ -x "$XENSTORED" ] || { > >> + echo "No xenstored found" > >> + exit 1 > >> + } > >> > >> -if [ -n "$XENSTORED" ] ; then > >>echo -n Starting $XENSTORED... > >>$XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS > >> -else > >> - echo "No xenstored found" > >> - exit 1 > >> -fi > >> > >> -timeout_xenstore $XENSTORED || exit 1 > >> + systemd-notify --booted 2>/dev/null || timeout_xenstore $XENSTORED || > >> exit 1 > >> > >> -exit 0 > >> + exit 0 > >> +} > >> + > >> +[ "$XENSTORETYPE" = "domain" ] && { > >> + [ -z "$XENSTORE_DOMAIN_KERNEL" ] && > >> XENSTORE_DOMAIN_KERNEL=@LIBEXEC@/boot/xenstore-stubdom.gz > >> + XENSTORE_DOMAIN_ARGS="$XENSTORE_DOMAIN_ARGS --kernel > >> $XENSTORE_DOMAIN_KERNEL" > >> + [ -z "$XENSTORE_DOMAIN_SIZE" ] && XENSTORE_DOMAIN_SIZE=8 > >> + XENSTORE_DOMAIN_ARGS="$XENSTORE_DOMAIN_ARGS --memory > >> $XENSTORE_DOMAIN_SIZE" > >> + > >> + echo -n Starting $XENSTORE_DOMAIN_KERNEL... > >> + ${LIBEXEC_BIN}/init-xenstore-domain $XENSTORE_DOMAIN_ARGS || exit 1 > >> + systemd-notify --ready 2>/dev/null > > > > Please test if there is systemd-notify before trying to invoke it and > > then you can properly log the failure of the invocation. > > I thought about that. What would be the purpose doing so? Following > cases are possible: > This > - system is booted under control of systemd, systemd-notify is found: > everything is nice, systemd receives the notification it is waiting > for > Here you assume systemd-notify always succeed. It can fail due to some reasons. That's what its manpage suggests. We need to handle this. Wei. > > Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 4/4] tools: make xenstore domain easy configurable
On 25/07/16 16:01, Wei Liu wrote: > On Mon, Jul 25, 2016 at 03:56:17PM +0200, Juergen Gross wrote: >> On 25/07/16 15:43, Wei Liu wrote: >>> On Fri, Jul 22, 2016 at 05:09:31PM +0200, Juergen Gross wrote: >>> [...] diff --git a/tools/hotplug/Linux/launch-xenstore.in b/tools/hotplug/Linux/launch-xenstore.in index 2bd9f64..fdfa33a 100644 --- a/tools/hotplug/Linux/launch-xenstore.in +++ b/tools/hotplug/Linux/launch-xenstore.in @@ -48,18 +48,40 @@ test_xenstore && exit 0 test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons -test -z "$XENSTORED_ROOTDIR" && XENSTORED_ROOTDIR="@XEN_LIB_STORED@" -rm -f "$XENSTORED_ROOTDIR"/tdb* 2>/dev/null -test -z "$XENSTORED_TRACE" || XENSTORED_ARGS=" -T @XEN_LOG_DIR@/xenstored-trace.log" +[ "$XENSTORETYPE" = "" ] && XENSTORETYPE=daemon + +/bin/mkdir -p @XEN_RUN_DIR@ + +[ "$XENSTORETYPE" = "daemon" ] && { + [ -z "$XENSTORED_ROOTDIR" ] && XENSTORED_ROOTDIR="@XEN_LIB_STORED@" + /bin/rm -f $XENSTORED_ROOTDIR/tdb* 2>/dev/null + [ -z "$XENSTORED_TRACE" ] || XENSTORED_ARGS="$XENSTORED_ARGS -T @XEN_LOG_DIR@/xenstored-trace.log" + [ -z "$XENSTORED" ] && XENSTORED=@XENSTORED@ + [ -x "$XENSTORED" ] || { + echo "No xenstored found" + exit 1 + } -if [ -n "$XENSTORED" ] ; then echo -n Starting $XENSTORED... $XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS -else - echo "No xenstored found" - exit 1 -fi -timeout_xenstore $XENSTORED || exit 1 + systemd-notify --booted 2>/dev/null || timeout_xenstore $XENSTORED || exit 1 -exit 0 + exit 0 +} + +[ "$XENSTORETYPE" = "domain" ] && { + [ -z "$XENSTORE_DOMAIN_KERNEL" ] && XENSTORE_DOMAIN_KERNEL=@LIBEXEC@/boot/xenstore-stubdom.gz + XENSTORE_DOMAIN_ARGS="$XENSTORE_DOMAIN_ARGS --kernel $XENSTORE_DOMAIN_KERNEL" + [ -z "$XENSTORE_DOMAIN_SIZE" ] && XENSTORE_DOMAIN_SIZE=8 + XENSTORE_DOMAIN_ARGS="$XENSTORE_DOMAIN_ARGS --memory $XENSTORE_DOMAIN_SIZE" + + echo -n Starting $XENSTORE_DOMAIN_KERNEL... + ${LIBEXEC_BIN}/init-xenstore-domain $XENSTORE_DOMAIN_ARGS || exit 1 + systemd-notify --ready 2>/dev/null >>> >>> Please test if there is systemd-notify before trying to invoke it and >>> then you can properly log the failure of the invocation. >> >> I thought about that. What would be the purpose doing so? Following >> cases are possible: >> > > This > >> - system is booted under control of systemd, systemd-notify is found: >> everything is nice, systemd receives the notification it is waiting >> for >> > > Here you assume systemd-notify always succeed. It can fail due to some > reasons. That's what its manpage suggests. > > We need to handle this. May I repeat the sd_notify() man page here? "In order to support both, init systems that implement this scheme and those which do not, it is generally recommended to ignore the return value of this call." Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel