date:20160725

[Xen-devel] [distros-debian-sid test] 66791: trouble: blocked/broken

2016-07-25 Thread Platform Team regression test user

flight 66791 distros-debian-sid real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/66791/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf-pvops 4 capture-logsbroken REGR. vs. 66612
 build-armhf   4 capture-logsbroken REGR. vs. 66612

Regressions which are regarded as allowable (not blocking):
 build-armhf-pvops 3 host-install(3)  broken like 66612
 build-armhf   3 host-install(3)  broken like 66612
 build-amd64   3 host-install(3)  broken like 66612
 build-amd64-pvops 3 host-install(3)  broken like 66612
 build-i3863 host-install(3)  broken like 66612
 build-i386-pvops  3 host-install(3)  broken like 66612

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-armhf-sid-netboot-pygrub  1 build-check(1)blocked n/a
 test-amd64-amd64-i386-sid-netboot-pygrub  1 build-check(1) blocked n/a
 test-amd64-amd64-amd64-sid-netboot-pvgrub  1 build-check(1)blocked n/a
 test-amd64-i386-i386-sid-netboot-pvgrub  1 build-check(1)  blocked n/a
 test-amd64-i386-amd64-sid-netboot-pygrub  1 build-check(1) blocked n/a

baseline version:
 flight   66612

jobs:
 build-amd64  broken  
 build-armhf  broken  
 build-i386   broken  
 build-amd64-pvopsbroken  
 build-armhf-pvopsbroken  
 build-i386-pvops broken  
 test-amd64-amd64-amd64-sid-netboot-pvgrubblocked 
 test-amd64-i386-i386-sid-netboot-pvgrub  blocked 
 test-amd64-i386-amd64-sid-netboot-pygrub blocked 
 test-armhf-armhf-armhf-sid-netboot-pygrubblocked 
 test-amd64-amd64-i386-sid-netboot-pygrub blocked 



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Push not applicable.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

[Xen-devel] page content

2016-07-25 Thread sepanta s

Hi,
Is there an API in Xen to get the page content of a known gfn ?
I want to calculate the checksum of the content of the page and wanna see
if it
is possible ?
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] page content

2016-07-25 Thread Razvan Cojocaru

On 07/25/2016 11:29 AM, sepanta s wrote:
> Hi,
> Is there an API in Xen to get the page content of a known gfn ?
> I want to calculate the checksum of the content of the page and wanna
> see if it
> is possible ?

You can map the page with xc_map_foreign_range(), then just inspect the
memory as if it's local to your application.


Cheers,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 16/18] arm/altp2m: Extended libxl to activate altp2m on ARM.

2016-07-25 Thread Wei Liu

On Sun, Jul 24, 2016 at 06:06:00PM +0200, Sergej Proskurin wrote:
> Hi Wei,
> 
> On 07/07/2016 06:27 PM, Wei Liu wrote:
> > On Mon, Jul 04, 2016 at 01:45:45PM +0200, Sergej Proskurin wrote:
> >> The current implementation allows to set the parameter HVM_PARAM_ALTP2M.
> >> This parameter allows further usage of altp2m on ARM. For this, we
> >> define an additional altp2m field for PV domains as part of the
> >> libxl_domain_type struct. This field can be set only on ARM systems
> >> through the "altp2m" switch in the domain's configuration file (i.e.
> >> set altp2m=1).
> >>
> >> Signed-off-by: Sergej Proskurin 
> >> ---
> >> Cc: Ian Jackson 
> >> Cc: Wei Liu 
> >> ---
> >>  tools/libxl/libxl_create.c  |  1 +
> >>  tools/libxl/libxl_dom.c | 14 ++
> >>  tools/libxl/libxl_types.idl |  1 +
> >>  tools/libxl/xl_cmdimpl.c|  5 +
> >>  4 files changed, 21 insertions(+)
> >>
> >> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> >> index 1b99472..40b5f61 100644
> >> --- a/tools/libxl/libxl_create.c
> >> +++ b/tools/libxl/libxl_create.c
> >> @@ -400,6 +400,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
> >>  b_info->cmdline = b_info->u.pv.cmdline;
> >>  b_info->u.pv.cmdline = NULL;
> >>  }
> >> +libxl_defbool_setdefault(&b_info->u.pv.altp2m, false);
> >>  break;
> >>  default:
> >>  LOG(ERROR, "invalid domain type %s in create info",
> >> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> >> index ec29060..ab023a2 100644
> >> --- a/tools/libxl/libxl_dom.c
> >> +++ b/tools/libxl/libxl_dom.c
> >> @@ -277,6 +277,16 @@ err:
> >>  }
> >>  #endif
> >>  
> >> +#if defined(__arm__) || defined(__aarch64__)
> >> +static void pv_set_conf_params(xc_interface *handle, uint32_t domid,
> >> +   libxl_domain_build_info *const info)
> >> +{
> >> +if ( libxl_defbool_val(info->u.pv.altp2m) )
> > 
> > Coding style.
> > 
> 
> I am not sure which part does not fulfill Xen's coding style. Could you
> be more specific please? Thank you.
> 

There is a CODING_STYLE file in libxl as well. The coding style in
tools/ is generally different from the coding style  in hypervisor with
the exception of libxc.

Sorry for being too terse on this. Do ask questions! :-)

> >> +xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2M,
> >> + libxl_defbool_val(info->u.pv.altp2m));
> >> +}
> >> +#endif
> >> +
> >>  static void hvm_set_conf_params(xc_interface *handle, uint32_t domid,
> >>  libxl_domain_build_info *const info)
> >>  {
> >> @@ -433,6 +443,10 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid,
> >>  return rc;
> >>  #endif
> >>  }
> >> +#if defined(__arm__) || defined(__aarch64__)
> >> +else /* info->type == LIBXL_DOMAIN_TYPE_PV */
> >> +pv_set_conf_params(ctx->xch, domid, info);
> >> +#endif
> >>  
> >>  rc = libxl__arch_domain_create(gc, d_config, domid);
> >>  
> >> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> >> index ef614be..0a164f9 100644
> >> --- a/tools/libxl/libxl_types.idl
> >> +++ b/tools/libxl/libxl_types.idl
> >> @@ -554,6 +554,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
> >>("features", string, {'const': 
> >> True}),
> >># Use host's E820 for PCI 
> >> passthrough.
> >>("e820_host", libxl_defbool),
> >> +  ("altp2m", libxl_defbool),
> > 
> > Why is this placed in PV instead of arch_arm?
> > 
> 
> The current implementation mirrors the x86's altp2m, where the altp2m
> field represents a property of HVM guests in b_info->u.hvm.altp2m. Since
> guests on ARM are marked as PV, I have placed the field altp2m into
> b_info->u.pv.altp2m.
> 
> However, if you believe that it would make more sense to place altp2m
> into b_info->arch_arm.altp2m, I will try to adapt the affected fields.
> 

OK. I don't think that one should be placed in u.pv because that union
is for x86. However it is also not suitable to just promote that to
a common field because x86 pv doesn't have altp2m support.

I think arch_arm would be a better location.

> > You would also need a LIBXL_HAVE macro in libxl.h for the new field.
> > 
> 
> There is already a LIBXL_HAVE_ALTP2M macro defined in libxl.h. Or did
> you mean using an explicit one for ARM?
> 

Yes, we need a new one for ARM.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 16/17] libxc/xc_dom_arm: Copy ACPI tables to guest space

2016-07-25 Thread George Dunlap

On Thu, Jul 21, 2016 at 10:15 PM, Stefano Stabellini
 wrote:
>> You are assuming that the guest will map the ACPI blob with the same
>> attributes as the rest of the superpage.
>>
>> IHMO, a sane operating system will want to map the ACPI blob read-only.
>
> That's true. But there are other things which might be mapped
> differently and could shatter a stage-1 superpage mapping (especially on
> x86 that has a much more complex memory map than ARM). Obviously adding
> one more is not doing it any good, but it might not make a difference in
> practice.
>
> Anyway, I agree with Julien that his suggestion is the best for ARM. If
> the libxl maintainers are willing to accept two different code paths for
> this on ARM and x86, then I am fine with it too.

Sorry to be a bit late to this thread -- there's a interface principle
that I think we should at some point have a larger discussion about:
whether "maxmem" means the amount of RAM which the guest sees as RAM,
or whether "maxmem" means the amount of RAM that the administrator
sees as used by the guest.  At the moment tnhere's no consistent
answer actually; but I am strongly of the opinion that for usability
the best answer is for "memory" to be the *total* amount of *host*
memory used by the guest.  In an ideal world, the admin should be able
to do "xl info", see that there is 3000MiB free, and then start a
guest with 3000MiB and expect it to succeed.  At the moment he has to
guess.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 16/17] libxc/xc_dom_arm: Copy ACPI tables to guest space

2016-07-25 Thread Shannon Zhao



On 2016/7/20 17:32, Wei Liu wrote:
> On Wed, Jul 20, 2016 at 02:52:05PM +0800, Shannon Zhao wrote:
>> > 
>> > 
>> > On 2016/7/19 18:38, Wei Liu wrote:
>>> > > On Fri, Jul 15, 2016 at 05:39:32PM +0800, Shannon Zhao wrote:
>>> > > [...]
>>> > >>> > > 
>>> > >>> > > It would be trivial to have another option in xl.cfg to allow 
>>> > >>> > > MB
>>> > >>> > > granularity. But I don't think that's a good idea. Asking for 
>>> > >>> > > more
>>> > >>> > > memory when you don't really know how much is enough is not 
>>> > >>> > > very useful.
>>> > >>> > > If an admin can know how much is needed, surely the library 
>>> > >>> > > can be
>>> > >>> > > taught to obtain that knowledge, too.
>>> > >>> > > 
>>> > >>> > > We need to decide which model we should go with. And, if we 
>>> > >>> > > decide to
>>> > >>> > > diverge, document the difference between x86 and ARM model.
>>> > >>> > > 
> > >> > Hi Wei,
> > >> > 
> > >> > Do you decide how to add the size of ACPI blob to max_memkb?
> > >> > 
>>> > > AFAICT ARM and x86 maintainers hold different opinions on how memory
>>> > > should be accounted.
>>> > > 
>>> > > I would like to have a unified memory accounting model. But if we can't
>>> > > have that at the moment, I'm fine with divergence, but please document
>>> > > it somewhere (comment near code snippet, in header, or a file under docs
>>> > > etc). And the amount added to max_memkb needs to be properly calculated,
>>> > > not some magic number, so that we have a chance in the future to
>>> > > confidently change how we do thing.
>> > If it's only allowed to add the size to max_memkb in
>> > libxl__domain_build_info_setdefault(), it only can use a const number
>> > since the tables are not generted and we don;t know the size. But the
>> > const number could be chosen properly since we could know the maximum
>> > ACPI tables size of current ARM approach.
>> > 
>> > But maybe in the future, if we add some new ACPI tables, it should
>> > increase the size as well. So I think this should be documented.
>> > 
>> > As I asked before, is it ok to add the size to max_memkb after
>> > generating the ACPI tables and before loading the tables to guest space?
>> > 
> Yes. I don't think shoehorning everything into setdefault is a good
> idea.
> 
> I think libxl_arm.c:libxl__arch_domain_create would be the right place
> to do it.
> 
> I am thinking about calling xc_domain_setmaxmem there, but not adding a
> number to d_config->b_info.max_memkb. Adding that to ->max_memkb would
> be wrong because the bigger ->max_memkb will be recored and the same
> algorithm will be applied every time you migrate your guest, so the
> max_memkb will grow bigger and bigger.
> 
> Given the different approach taken by ARM and x86, maybe we need to also
> record the size of acpi blobs somewhere in xenstore (also needs to be
> documented) so that subsequent libxl_domain_setmaxmem can extract that
> number again.
> 
> Please wait a bit for Stefano and Julien to comment before you do work.
> 
Stefano, Julien,
Any comments regarding how to add the ACPI size to max_memkb?

Thanks,
-- 
Shannon


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [xen-unstable test] 97737: regressions - FAIL

2016-07-25 Thread Wei Liu

On Fri, Jul 22, 2016 at 03:27:30AM +, osstest service owner wrote:
> flight 97737 xen-unstable real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/97737/
> 
> Regressions :-(
> 
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  test-armhf-armhf-xl  15 guest-start/debian.repeat fail REGR. vs. 
> 97664

From

http://logs.test-lab.xenproject.org/osstest/logs/97737/test-armhf-armhf-xl/serial-cubietruck-picasso.log


Jul 21 17:08:59.405183 [ 4479.814529] [ cut here ]

Jul 21 17:09:16.961529 [ 4479.814600] kernel BUG at 
drivers/xen/grant-table.c:923!

Jul 21 17:09:16.966838 [ 4479.814628] Internal error: Oops - BUG: 0 [#1] SMP ARM

Jul 21 17:09:16.972090 [ 4479.814656] Modules linked in: xen_gntalloc bridge 
stp ipv6 llc brcmfmac brcmutil cfg80211

Jul 21 17:09:16.980340 [ 4479.814759] CPU: 1 PID: 24761 Comm: vif5.0-q0-guest 
Not tainted 3.16.7-ckt12+ #1

Jul 21 17:09:16.987841 [ 4479.814795] task: d8ef7600 ti: d85bc000 task.ti: 
d85bc000

Jul 21 17:09:16.993339 [ 4479.814833] PC is at gnttab_batch_copy+0xd0/0xe4

Jul 21 17:09:16.997963 [ 4479.814860] LR is at gnttab_batch_copy+0x1c/0xe4

Jul 21 17:09:17.002718 [ 4479.814888] pc : []lr : []
psr: a0070013

Jul 21 17:09:17.008962 [ 4479.814888] sp : d85bdea0  ip : deadbeef  fp : 
c0c8e140

Jul 21 17:09:17.014341 [ 4479.814935] r10:   r9 : e1bec000  r8 : 


Jul 21 17:09:17.019595 [ 4479.814960] r7 : 0002  r6 : 0002  r5 : 
d85bdf20  r4 : e1bf4d30

Jul 21 17:09:17.026095 [ 4479.814990] r3 : 0001  r2 : deadbeef  r1 : 
deadbeef  r0 : fff2

Jul 21 17:09:17.032717 [ 4479.815021] Flags: NzCv  IRQs on  FIQs on  Mode 
SVC_32  ISA ARM  Segment kernel

Jul 21 17:09:17.040091 [ 4479.815055] Control: 10c5387d  Table: 78d8406a  DAC: 
0015

Jul 21 17:09:17.045964 [ 4479.815084] Process vif5.0-q0-guest (pid: 24761, 
stack limit = 0xd85bc248)

Jul 21 17:09:17.052840 [ 4479.815114] Stack: (0xd85bdea0 to 0xd85be000)

Jul 21 17:09:17.057218 [ 4479.815145] dea0: 0001 d8b11388 d85bdf20 d85bdf04 
0002 c05eb054 0388 

Jul 21 17:09:17.065469 [ 4479.815183] dec0: d85bdf04   c0b7ea80 
db0995c0 c05e86e4 e1bf4000 003c

Jul 21 17:09:17.073753 [ 4479.815221] dee0:    c0b8849c 
e1bf4cfc c0c8e140 e1bf4d30 e1bf4cc4

Jul 21 17:09:17.082001 [ 4479.815260] df00: db0c3e80  d85bdf08 d85bdf08 
d8c5cb40 d8c5cb40 0001 

Jul 21 17:09:17.090217 [ 4479.815298] df20: 0002  0001  
e1bf4d30 e1c1f530 04c6 023c

Jul 21 17:09:17.098466 [ 4479.815337] df40:   d84aab80 e1bec000 
c05ea990   

Jul 21 17:09:17.106720 [ 4479.815375] df60:  c0266238   
00f8 e1bec000  

Jul 21 17:09:17.114844 [ 4479.815414] df80: d85bdf80 d85bdf80   
d85bdf90 d85bdf90 d85bdfac d84aab80

Jul 21 17:09:17.123093 [ 4479.815451] dfa0: c0266168   c020f138 
   

Jul 21 17:09:17.131345 [ 4479.815489] dfc0:     
   

Jul 21 17:09:17.139596 [ 4479.815527] dfe0:     
0013   

Jul 21 17:09:17.147841 [ 4479.815583] [] (gnttab_batch_copy) from 
[] (xenvif_kthread_guest_rx+0x6c4/0xb58)

Jul 21 17:09:17.156969 [ 4479.815636] [] (xenvif_kthread_guest_rx) 
from [] (kthread+0xd0/0xe8)

Jul 21 17:09:17.165217 [ 4479.815681] [] (kthread) from [] 
(ret_from_fork+0x14/0x3c)

Jul 21 17:09:17.172467 [ 4479.815721] Code: e1c432b4 eae0 e7f001f2 e8bd80f8 
(e7f001f2) 

Jul 21 17:09:17.178595 [ 4479.815766] ---[ end trace 6ba7d172d52e24e2 ]---

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 16/18] arm/altp2m: Extended libxl to activate altp2m on ARM.

2016-07-25 Thread Sergej Proskurin

Hi Wei,


On 07/25/2016 10:32 AM, Wei Liu wrote:
> On Sun, Jul 24, 2016 at 06:06:00PM +0200, Sergej Proskurin wrote:
>> Hi Wei,
>>
>> On 07/07/2016 06:27 PM, Wei Liu wrote:
>>> On Mon, Jul 04, 2016 at 01:45:45PM +0200, Sergej Proskurin wrote:
 The current implementation allows to set the parameter HVM_PARAM_ALTP2M.
 This parameter allows further usage of altp2m on ARM. For this, we
 define an additional altp2m field for PV domains as part of the
 libxl_domain_type struct. This field can be set only on ARM systems
 through the "altp2m" switch in the domain's configuration file (i.e.
 set altp2m=1).

 Signed-off-by: Sergej Proskurin 
 ---
 Cc: Ian Jackson 
 Cc: Wei Liu 
 ---
  tools/libxl/libxl_create.c  |  1 +
  tools/libxl/libxl_dom.c | 14 ++
  tools/libxl/libxl_types.idl |  1 +
  tools/libxl/xl_cmdimpl.c|  5 +
  4 files changed, 21 insertions(+)

 diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
 index 1b99472..40b5f61 100644
 --- a/tools/libxl/libxl_create.c
 +++ b/tools/libxl/libxl_create.c
 @@ -400,6 +400,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
  b_info->cmdline = b_info->u.pv.cmdline;
  b_info->u.pv.cmdline = NULL;
  }
 +libxl_defbool_setdefault(&b_info->u.pv.altp2m, false);
  break;
  default:
  LOG(ERROR, "invalid domain type %s in create info",
 diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
 index ec29060..ab023a2 100644
 --- a/tools/libxl/libxl_dom.c
 +++ b/tools/libxl/libxl_dom.c
 @@ -277,6 +277,16 @@ err:
  }
  #endif
  
 +#if defined(__arm__) || defined(__aarch64__)
 +static void pv_set_conf_params(xc_interface *handle, uint32_t domid,
 +   libxl_domain_build_info *const info)
 +{
 +if ( libxl_defbool_val(info->u.pv.altp2m) )
>>> Coding style.
>>>
>> I am not sure which part does not fulfill Xen's coding style. Could you
>> be more specific please? Thank you.
>>
> There is a CODING_STYLE file in libxl as well. The coding style in
> tools/ is generally different from the coding style  in hypervisor with
> the exception of libxc.
>
> Sorry for being too terse on this. Do ask questions! :-)

It's all good, no worries :) Thank you for the hint (it must be the
indentation inside the if-clause brackets). I was still in the Xen
coding style, when I wrote this part: Now, I know better, thank you :)

 +xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2M,
 + libxl_defbool_val(info->u.pv.altp2m));
 +}
 +#endif
 +
  static void hvm_set_conf_params(xc_interface *handle, uint32_t domid,
  libxl_domain_build_info *const info)
  {
 @@ -433,6 +443,10 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid,
  return rc;
  #endif
  }
 +#if defined(__arm__) || defined(__aarch64__)
 +else /* info->type == LIBXL_DOMAIN_TYPE_PV */
 +pv_set_conf_params(ctx->xch, domid, info);
 +#endif
  
  rc = libxl__arch_domain_create(gc, d_config, domid);
  
 diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
 index ef614be..0a164f9 100644
 --- a/tools/libxl/libxl_types.idl
 +++ b/tools/libxl/libxl_types.idl
 @@ -554,6 +554,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
("features", string, {'const': 
 True}),
# Use host's E820 for PCI 
 passthrough.
("e820_host", libxl_defbool),
 +  ("altp2m", libxl_defbool),
>>> Why is this placed in PV instead of arch_arm?
>>>
>> The current implementation mirrors the x86's altp2m, where the altp2m
>> field represents a property of HVM guests in b_info->u.hvm.altp2m. Since
>> guests on ARM are marked as PV, I have placed the field altp2m into
>> b_info->u.pv.altp2m.
>>
>> However, if you believe that it would make more sense to place altp2m
>> into b_info->arch_arm.altp2m, I will try to adapt the affected fields.
>>
> OK. I don't think that one should be placed in u.pv because that union
> is for x86. However it is also not suitable to just promote that to
> a common field because x86 pv doesn't have altp2m support.
>
> I think arch_arm would be a better location.
>

Alright, I will move the field altp2m to arch_arm and adopt the
initialization routines.

>>> You would also need a LIBXL_HAVE macro in libxl.h for the new field.
>>>
>> There is already a LIBXL_HAVE_ALTP2M macro defined in libxl.h. Or did
>> you mean using an explicit one for ARM?
>>
> Yes, we need a new one for ARM.
>

Ok. I will provide a LIBXL_HAVE_ARM_ALTP2M macro.

Thank you.

Best regards,
~Serg

Re: [Xen-devel] [PATCH linux v2 0/9] xen: pvhvm: support bootup on secondary vCPUs

2016-07-25 Thread Vitaly Kuznetsov

Vitaly Kuznetsov  writes:

> It may happen that Xen's and Linux's ideas of vCPU id diverge. In
> particular, when we crash on a secondary vCPU we may want to do kdump
> and unlike plain kexec where we do migrate_to_reboot_cpu() we try booting
> on the vCPU which crashed. This doesn't work very well for PVHVM guests as
> we have a number of hypercalls where we pass vCPU id as a parameter. These
> hypercalls either fail or do something unexpected. To solve the issue we
> need to have a mapping between Linux's and Xen's vCPU ids.
>
> This series solves the issue for x86 PVHVM guests. PV guests don't (and
> probably won't) support kdump so I always assume Xen's vCPU id == Linux's
> vCPU id. ARM guests will probably need to get proper mapping once we start
> supporting kexec/kdump there.
>
> Changes since v1:
> - "x86/acpi: store ACPI ids from MADT for future usage" patch added.
> - Use ACPI ids instead of vLAPIC ids/2 [Andrew Cooper, Jan Beulich]
> - Introduce xen_vcpu_nr() helper [David Vrabel].
> - Modify all callers of HYPERVISOR_vcpu_op() instead of modifying
>   HYPERVISOR_vcpu_op() [David Vrabel]

David, Boris,

I didn't get any feedback for the v2 of this patchset, is it ready to go
in for 4.8 or is there anything else I need to do to make this happen?

BTW, the patch describing the de facto "ACPI id == Xen's VCPU id" policy
is in xen.git:

commit ea210c52abb6458e39f5365f7f2c3abb9c191c47
Author: Vitaly Kuznetsov 
Date:   Tue Jul 12 13:44:42 2016 +0200

x86, hvm: document the de facto policy for vCPU ids

Thanks,

-- 
  Vitaly

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 1/2] systemd: use standard dependencies for xendriverdomain.service

2016-07-25 Thread Wei Liu

On Sun, Jul 24, 2016 at 09:26:57PM +0200, Marek Marczykowski-Górecki wrote:
> Having DefaultDependencies=no means it can be started before / is
> remounted read-write, which will result in various failures (to start
> with opening the log).
> Since "libxl: trigger attach events for devices attached before xl devd
> startup" it is no longer important to start it as early as possible,
> because it will process devices created before its startup.
> 
> Cc: Ian Jackson 
> Cc: Wei Liu 
> Signed-off-by: Marek Marczykowski-Górecki 

The reasoning makes sense.

Acked-by: Wei Liu 

Also CC Rusty (author of this file)

> ---
>  tools/hotplug/Linux/systemd/xendriverdomain.service.in | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/tools/hotplug/Linux/systemd/xendriverdomain.service.in 
> b/tools/hotplug/Linux/systemd/xendriverdomain.service.in
> index c0cd454..0afb54d 100644
> --- a/tools/hotplug/Linux/systemd/xendriverdomain.service.in
> +++ b/tools/hotplug/Linux/systemd/xendriverdomain.service.in
> @@ -1,6 +1,5 @@
>  [Unit]
>  Description=Xen driver domain device daemon
> -DefaultDependencies=no
>  Requires=proc-xen.mount
>  After=proc-xen.mount
>  ConditionVirtualization=xen
> -- 
> 2.5.5



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 3/3] xen-blkfront: dynamic configuration of per-vbd resources

2016-07-25 Thread Roger Pau Monné

On Sat, Jul 23, 2016 at 06:18:23AM +0800, Bob Liu wrote:
> 
> On 07/22/2016 07:45 PM, Roger Pau Monné wrote:
> > On Fri, Jul 22, 2016 at 05:43:32PM +0800, Bob Liu wrote:
> >>
> >> On 07/22/2016 05:34 PM, Roger Pau Monné wrote:
> >>> On Fri, Jul 22, 2016 at 04:17:48PM +0800, Bob Liu wrote:
> 
>  On 07/22/2016 03:45 PM, Roger Pau Monné wrote:
> > On Thu, Jul 21, 2016 at 06:08:05PM +0800, Bob Liu wrote:
> >>
> >> On 07/21/2016 04:57 PM, Roger Pau Monné wrote:
>  ..[snip]..
>  +
>  +static ssize_t dynamic_reconfig_device(struct blkfront_info *info, 
>  ssize_t count)
>  +{
>  +unsigned int i;
>  +int err = -EBUSY;
>  +
>  +/*
>  + * Make sure no migration in parallel, device lock is actually a
>  + * mutex.
>  + */
>  +if (!device_trylock(&info->xbdev->dev)) {
>  +pr_err("Fail to acquire dev:%s lock, may be in 
>  migration.\n",
>  +dev_name(&info->xbdev->dev));
>  +return err;
>  +}
>  +
>  +/*
>  + * Prevent new requests and guarantee no uncompleted reqs.
>  + */
>  +blk_mq_freeze_queue(info->rq);
>  +if (part_in_flight(&info->gd->part0))
>  +goto out;
>  +
>  +/*
>  + * FrontBackend
>  + * Switch to XenbusStateClosed
>  + *  frontend_changed():
>  + *   case XenbusStateClosed:
>  + *  
>  xen_blkif_disconnect()
>  + *  Switch to 
>  XenbusStateClosed
>  + * blkfront_resume():
>  + *  frontend_changed():
>  + *  reconnect
>  + * Wait until XenbusStateConnected
>  + */
>  +info->reconfiguring = true;
>  +xenbus_switch_state(info->xbdev, XenbusStateClosed);
>  +
>  +/* Poll every 100ms, 1 minute timeout. */
>  +for (i = 0; i < 600; i++) {
>  +/*
>  + * Wait backend enter XenbusStateClosed, 
>  blkback_changed()
>  + * will clear reconfiguring.
>  + */
>  +if (!info->reconfiguring)
>  +goto resume;
>  +schedule_timeout_interruptible(msecs_to_jiffies(100));
>  +}
> >>>
> >>> Instead of having this wait, could you just set info->reconfiguring = 
> >>> 1, set 
> >>> the frontend state to XenbusStateClosed and mimic exactly what a 
> >>> resume from 
> >>> suspension does? blkback_changed would have to set the frontend state 
> >>> to 
> >>> InitWait when it detects that the backend has switched to Closed, and 
> >>> call 
> >>> blkfront_resume.
> >>
> >>
> >> I think that won't work.
> >> In the real "resume" case, the power management system will trigger 
> >> all ->resume() path.
> >> But there is no place for dynamic configuration.
> >
> > Hello,
> >
> > I think it should be possible to set info->reconfiguring and wait for 
> > the 
> > backend to switch to state Closed, at that point we should call 
> > blkif_resume 
> > (from blkback_changed) and the backend will follow the reconection.
> >
> 
>  Okay, I get your point. Yes, that's an option.
> 
>  But this will make 'dynamic configuration' to be async, I'm worry about 
>  the end-user will get panic.
>  E.g
>  A end-user "echo  > /sys/devices/vbd-xxx/max_indirect_segs",
>  but then the device will be Closed and disappeared, the user have to 
>  wait for a random time so that the device can resume.
> >>>
> >>> That should not happen, AFAICT on migration the device never dissapears. 
> >>
> >> Oh, yes.
> >>
> >>> alloc_disk and friends should not be called on resume from migration (see 
> >>> the switch in blkfront_connect, you should take the BLKIF_STATE_SUSPENDED 
> >>> path for the reconfiguration).
> >>>
> >>
> >> What about if the end-user starts I/O immediately after writing new value 
> >> to /sys?
> >> But the resume is still in progress.
> > 
> > blkif_free already stops the queues by calling blk_mq_stop_hw_queues, and 
> > blkif_queue_request will refuse to put anything on the ring if the state 
> > is different than connected, which in turn makes blkif_queue_rq call 
> > blk_mq_stop_hw_queue to stop the queue, so it should be safe.
> > 
> 
> But this will surprise the end-user, our user script is like this:
> 1) echo  > /sys/
> 2) Start I/O im

Re: [Xen-devel] [PATCH 2/2] systemd: drop dependency on proc-xen.mount in xendriverdomain.service

2016-07-25 Thread Wei Liu

On Sun, Jul 24, 2016 at 09:27:13PM +0200, Marek Marczykowski-Górecki wrote:
> It is no longer required since xl devd use /dev/xen interface.
> 

How would this unit work when there is no /dev/xen interface?

To be precise, we prefer /dev/xen interfaces whenever possible but there
is a fallback to /proc/xen. Note that a lot of other unit files have
this dependency on proc-xen.mount.

I'm inclined to say we should keep this dependency but I'm not sure if I
missed some obvious things.

> Cc: Ian Jackson 
> Cc: Wei Liu 
> Signed-off-by: Marek Marczykowski-Górecki 
> ---
>  tools/hotplug/Linux/systemd/xendriverdomain.service.in | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/tools/hotplug/Linux/systemd/xendriverdomain.service.in 
> b/tools/hotplug/Linux/systemd/xendriverdomain.service.in
> index 0afb54d..a100309 100644
> --- a/tools/hotplug/Linux/systemd/xendriverdomain.service.in
> +++ b/tools/hotplug/Linux/systemd/xendriverdomain.service.in
> @@ -1,7 +1,5 @@
>  [Unit]
>  Description=Xen driver domain device daemon
> -Requires=proc-xen.mount
> -After=proc-xen.mount
>  ConditionVirtualization=xen
>  
>  [Service]
> -- 
> 2.5.5



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v4] x86/cpuid: AVX-512 Feature Detection

2016-07-25 Thread Andrew Cooper

On 25/07/16 07:09, Kang, Luwei wrote:
 First of all - please don't top post.

> What about remove the dependency between AVX2 and AVX512F
>> ( AVX2:
 [AVX512F], ) ?

 Yes, that's what I think we want, but we need Andrew's agreement here.

>>> Hi Andrew,  what is your opinion ?
>> You are in a better position to answer than me.
>>
>> For a specific instruction which may be VEX and EVEX encoded, is the 
>> circuitry
>> for a specific instruction shared, or discrete?  Is there a plausible future
>> where you might support only the EVEX variant of an instruction, and not the
>> VEX variant?
>>
>> These dependences are about what may be reasonably assumed about the
>> way the environment is structured.  It doesn't look reasonable to advertise
>> an AVX512 environment to guests while at the same time stating that AVX2 is
>> not present.  If this is correct, then the dependency should stay.
>> If Intel plausibly things it might release hardware with !AVX2 but AVX512,
>> then the dependency should be dropped.
> Regarding the dependency between AVX2 and AVX512F, I have ask some hardware 
> architecture engineer.
>
> AVX512 is a superset of AVX2, the most important item being the state. If the 
> state for AVX2 isn't enabled (in XCR0), then AVX512 also can't function.
>
> So if we want to use AVX512, AVX2 must be supported and enabled. The 
> dependence between AVX512F and AVX2 may need be reserved.

Ok, so AVX512 strictly depends on AVX2.

In which case, the python code was correct.  The meaning of the
key/value relationship is "if the feature in the key is not present, all
features in the value must also be disabled".

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 2/2] systemd: drop dependency on proc-xen.mount in xendriverdomain.service

2016-07-25 Thread Marek Marczykowski-Górecki

On Mon, Jul 25, 2016 at 10:27:47AM +0100, Wei Liu wrote:
> On Sun, Jul 24, 2016 at 09:27:13PM +0200, Marek Marczykowski-Górecki wrote:
> > It is no longer required since xl devd use /dev/xen interface.
> > 
> 
> How would this unit work when there is no /dev/xen interface?

Does it happen in reality? I thought /proc/xen is deprecated for a long
time...

> To be precise, we prefer /dev/xen interfaces whenever possible but there
> is a fallback to /proc/xen. Note that a lot of other unit files have
> this dependency on proc-xen.mount.
> 
> I'm inclined to say we should keep this dependency but I'm not sure if I
> missed some obvious things.
> 
> > Cc: Ian Jackson 
> > Cc: Wei Liu 
> > Signed-off-by: Marek Marczykowski-Górecki 
> > ---
> >  tools/hotplug/Linux/systemd/xendriverdomain.service.in | 2 --
> >  1 file changed, 2 deletions(-)
> > 
> > diff --git a/tools/hotplug/Linux/systemd/xendriverdomain.service.in 
> > b/tools/hotplug/Linux/systemd/xendriverdomain.service.in
> > index 0afb54d..a100309 100644
> > --- a/tools/hotplug/Linux/systemd/xendriverdomain.service.in
> > +++ b/tools/hotplug/Linux/systemd/xendriverdomain.service.in
> > @@ -1,7 +1,5 @@
> >  [Unit]
> >  Description=Xen driver domain device daemon
> > -Requires=proc-xen.mount
> > -After=proc-xen.mount
> >  ConditionVirtualization=xen
> >  
> >  [Service]
> > -- 
> > 2.5.5
> 
> 

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?


signature.asc
Description: PGP signature
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 2/2] systemd: drop dependency on proc-xen.mount in xendriverdomain.service

2016-07-25 Thread Wei Liu

On Mon, Jul 25, 2016 at 11:30:32AM +0200, Marek Marczykowski-Górecki wrote:
> On Mon, Jul 25, 2016 at 10:27:47AM +0100, Wei Liu wrote:
> > On Sun, Jul 24, 2016 at 09:27:13PM +0200, Marek Marczykowski-Górecki wrote:
> > > It is no longer required since xl devd use /dev/xen interface.
> > > 
> > 
> > How would this unit work when there is no /dev/xen interface?
> 
> Does it happen in reality? I thought /proc/xen is deprecated for a long
> time...
> 

I can't tell whether it happens in reality or not. I also hope to get
rid of deprecated interface but they are here to stay for as long as it
takes.

I guess the question is, do you have a compelling reason against keeping
this dependency?

Wei.

> > To be precise, we prefer /dev/xen interfaces whenever possible but there
> > is a fallback to /proc/xen. Note that a lot of other unit files have
> > this dependency on proc-xen.mount.
> > 
> > I'm inclined to say we should keep this dependency but I'm not sure if I
> > missed some obvious things.
> > 
> > > Cc: Ian Jackson 
> > > Cc: Wei Liu 
> > > Signed-off-by: Marek Marczykowski-Górecki 
> > > 
> > > ---
> > >  tools/hotplug/Linux/systemd/xendriverdomain.service.in | 2 --
> > >  1 file changed, 2 deletions(-)
> > > 
> > > diff --git a/tools/hotplug/Linux/systemd/xendriverdomain.service.in 
> > > b/tools/hotplug/Linux/systemd/xendriverdomain.service.in
> > > index 0afb54d..a100309 100644
> > > --- a/tools/hotplug/Linux/systemd/xendriverdomain.service.in
> > > +++ b/tools/hotplug/Linux/systemd/xendriverdomain.service.in
> > > @@ -1,7 +1,5 @@
> > >  [Unit]
> > >  Description=Xen driver domain device daemon
> > > -Requires=proc-xen.mount
> > > -After=proc-xen.mount
> > >  ConditionVirtualization=xen
> > >  
> > >  [Service]
> > > -- 
> > > 2.5.5
> > 
> > 
> 
> -- 
> Best Regards,
> Marek Marczykowski-Górecki
> Invisible Things Lab
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] Kernel panic on Xen virtualisation in Debian

2016-07-25 Thread Ingo Jürgensmann


On 22.07.2016 12:21, Ingo Jürgensmann wrote:

On 22.07.2016 11:03, Ingo Jürgensmann wrote:


In the meanwhile, I activated IPv6 again on Tuesday evening and today
the server crashed again some minutes ago. Here's the output from
netconsole:

... and the second subsequent crash:


... and another crash below...

But in the meanwhile, I wonder if anyone except Andreas and I has some 
interest in fixing this issue, as there were absolutely no comments or 
feedback on my previous mails. Quite disappointing. :-/



Jul 25 10:50:00 31.172.31.251 [257646.574819] CPU: 0 PID: 0 Comm: 
swapper/0 Not tainted 3.16.0-4-amd64 #1 Debian 3.16.7-ckt25-2+deb8u3
Jul 25 10:50:00 31.172.31.251 [257646.574874] Hardware name: Supermicro 
X9SRE/X9SRE-3F/X9SRi/X9SRi-3F/X9SRE/X9SRE-3F/X9SRi/X9SRi-3F, BIOS 3.0a 
01/03/2014
Jul 25 10:50:00 31.172.31.251 [257646.574934] task: 8181a460 ti: 
8180 task.ti: 8180
Jul 25 10:50:00 31.172.31.251 [257646.575080] RIP: 
e030:[]

Jul 25 10:50:00 31.172.31.251  [] memcpy+0x6/0x110
Jul 25 10:50:00 31.172.31.251 [257646.575140] RSP: e02b:880280e03b58 
 EFLAGS: 00010286
Jul 25 10:50:00 31.172.31.251 [257646.575171] RAX: 880269992870 RBX: 
88026e0ceb00 RCX: 014e
Jul 25 10:50:00 31.172.31.251 [257646.575221] RDX: 0300 RSI: 
88026f42a000 RDI: 880269992a22
Jul 25 10:50:00 31.172.31.251 [257646.575271] RBP: 880280e03c38 R08: 
06c0 R09: 880269992862
Jul 25 10:50:00 31.172.31.251 [257646.575321] R10: 0100 R11: 
01c83452 R12: 8801d7cf7500
Jul 25 10:50:00 31.172.31.251 [257646.575370] R13: 88026f429e66 R14: 
880259ca34c0 R15: 0308
Jul 25 10:50:00 31.172.31.251 [257646.575425] FS:  
() GS:880280e0() knlGS:880280ee
Jul 25 10:50:00 31.172.31.251 [257646.575477] CS:  e033 DS:  ES: 
 CR0: 80050033
Jul 25 10:50:00 31.172.31.251 [257646.575509] CR2: 88026f42a000 CR3: 
000270edb000 CR4: 00042660

Jul 25 10:50:00 31.172.31.251 [257646.575559] Stack:
Jul 25 10:50:00 31.172.31.251 [257646.575585]  814d319f
Jul 25 10:50:00 31.172.31.251  88007c6c4a00
Jul 25 10:50:00 31.172.31.251  880280e03ba8
Jul 25 10:50:00 31.172.31.251  9401294600a7012a
Jul 25 10:50:00 31.172.31.251
Jul 25 10:50:00 31.172.31.251 [257646.575652]  0100
Jul 25 10:50:00 31.172.31.251  814a000a
Jul 25 10:50:00 31.172.31.251  6f1d3e90
Jul 25 10:50:00 31.172.31.251  80fe
Jul 25 10:50:00 31.172.31.251
Jul 25 10:50:00 31.172.31.251 [257646.575719]  1ad902feff7ac40e
Jul 25 10:50:00 31.172.31.251  88026826eb00
Jul 25 10:50:00 31.172.31.251  224afc3e1600
Jul 25 10:50:00 31.172.31.251  8801d7cf7500
Jul 25 10:50:00 31.172.31.251
Jul 25 10:50:00 31.172.31.251 [257646.575786] Call Trace:
Jul 25 10:50:00 31.172.31.251 [257646.575813]  
Jul 25 10:50:00 31.172.31.251
Jul 25 10:50:00 31.172.31.251 [257646.575822]
Jul 25 10:50:00 31.172.31.251  [] ? 
ndisc_send_redirect+0x3bf/0x410
Jul 25 10:50:00 31.172.31.251 [257646.575889]  [] ? 
reg_vif_xmit+0xca/0xf0
Jul 25 10:50:00 31.172.31.251 [257646.575926]  [] ? 
ip6_forward+0x71c/0x850
Jul 25 10:50:00 31.172.31.251 [257646.575961]  [] ? 
ip6_route_input+0xa4/0xd0
Jul 25 10:50:00 31.172.31.251 [257646.575997]  [] ? 
__netif_receive_skb_core+0x543/0x750
Jul 25 10:50:00 31.172.31.251 [257646.576034]  [] ? 
netif_receive_skb_internal+0x1f/0x80
Jul 25 10:50:00 31.172.31.251 [257646.576078]  [] ? 
br_handle_frame_finish+0x1c2/0x3c0 [bridge]
Jul 25 10:50:00 31.172.31.251 [257646.576134]  [] ? 
br_handle_frame+0x147/0x240 [bridge]
Jul 25 10:50:00 31.172.31.251 [257646.576170]  [] ? 
__netif_receive_skb_core+0x1c4/0x750
Jul 25 10:50:00 31.172.31.251 [257646.576209]  [] ? 
xen_clocksource_get_cycles+0x1c/0x20
Jul 25 10:50:00 31.172.31.251 [257646.576244]  [] ? 
netif_receive_skb_internal+0x1f/0x80
Jul 25 10:50:00 31.172.31.251 [257646.576282]  [] ? 
xenvif_tx_action+0x49f/0x920 [xen_netback]
Jul 25 10:50:00 31.172.31.251 [257646.576335]  [] ? 
xenvif_poll+0x28/0x70 [xen_netback]
Jul 25 10:50:00 31.172.31.251 [257646.576370]  [] ? 
net_rx_action+0x140/0x240
Jul 25 10:50:00 31.172.31.251 [257646.576405]  [] ? 
__do_softirq+0xf1/0x290
Jul 25 10:50:00 31.172.31.251 [257646.576439]  [] ? 
irq_exit+0x95/0xa0
Jul 25 10:50:00 31.172.31.251 [257646.576475]  [] ? 
xen_evtchn_do_upcall+0x35/0x50
Jul 25 10:50:00 31.172.31.251 [257646.576510]  [] ? 
xen_do_hypervisor_callback+0x1e/0x30

Jul 25 10:50:00 31.172.31.251 [257646.576544]  
Jul 25 10:50:00 31.172.31.251
Jul 25 10:50:00 31.172.31.251 [257646.576553]
Jul 25 10:50:00 31.172.31.251  [] ? 
xen_hypercall_sched_op+0xa/0x20
Jul 25 10:50:00 31.172.31.251 [257646.576615]  [] ? 
xen_hypercall_sched_op+0xa/0x20
Jul 25 10:50:00 31.172.31.251 [257646.576651]  [] ? 
xen_safe_halt+0xc/0x20
Jul 25 10:50:00 31.172.31.251 [257646.576686]  [] ? 
default_idle+0x19/0xb0
Jul 25 10:50:00 31.172.31.251 [257646.576721]  [] ? 
cpu_startup

Re: [Xen-devel] [PATCH 2/2] systemd: drop dependency on proc-xen.mount in xendriverdomain.service

2016-07-25 Thread Marek Marczykowski-Górecki

On Mon, Jul 25, 2016 at 10:34:21AM +0100, Wei Liu wrote:
> On Mon, Jul 25, 2016 at 11:30:32AM +0200, Marek Marczykowski-Górecki wrote:
> > On Mon, Jul 25, 2016 at 10:27:47AM +0100, Wei Liu wrote:
> > > On Sun, Jul 24, 2016 at 09:27:13PM +0200, Marek Marczykowski-Górecki 
> > > wrote:
> > > > It is no longer required since xl devd use /dev/xen interface.
> > > > 
> > > 
> > > How would this unit work when there is no /dev/xen interface?
> > 
> > Does it happen in reality? I thought /proc/xen is deprecated for a long
> > time...
> > 
> 
> I can't tell whether it happens in reality or not. I also hope to get
> rid of deprecated interface but they are here to stay for as long as it
> takes.
> 
> I guess the question is, do you have a compelling reason against keeping
> this dependency?

Just cleaning up old stuff, so - no.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?


signature.asc
Description: PGP signature
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 1/4] docs: Clarify the expected behaviour of zero length records

2016-07-25 Thread Wei Liu

On Thu, Jul 21, 2016 at 06:17:34PM +0100, Andrew Cooper wrote:
> The sending side shouldn't send any variable sized records which end up having
> zero content, but the receiving side will need to tolerate such records for
> compatibility purposes.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Wei Liu 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 2/4] tools/libxc: Tolerate zero-length records in migration v2 streams

2016-07-25 Thread Wei Liu

On Thu, Jul 21, 2016 at 06:17:35PM +0100, Andrew Cooper wrote:
> Under some circumstances, the migration v2 save code would generate valid
> records with zero content, when the intended behaviour was to omit the record
> entirely.
> 
> As the stream is otherwise fine, tolerate these records and avoid failing the
> migration.
> 
> Signed-off-by: Andrew Cooper 


Reviewed-by: Wei Liu 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v2 16/17] libxc/xc_dom_arm: Copy ACPI tables to guest space

2016-07-25 Thread Julien Grall


Hi George,

On 25/07/16 09:38, George Dunlap wrote:

On Thu, Jul 21, 2016 at 10:15 PM, Stefano Stabellini
 wrote:

You are assuming that the guest will map the ACPI blob with the same
attributes as the rest of the superpage.

IHMO, a sane operating system will want to map the ACPI blob read-only.


That's true. But there are other things which might be mapped
differently and could shatter a stage-1 superpage mapping (especially on
x86 that has a much more complex memory map than ARM). Obviously adding
one more is not doing it any good, but it might not make a difference in
practice.

Anyway, I agree with Julien that his suggestion is the best for ARM. If
the libxl maintainers are willing to accept two different code paths for
this on ARM and x86, then I am fine with it too.


Sorry to be a bit late to this thread -- there's a interface principle
that I think we should at some point have a larger discussion about:
whether "maxmem" means the amount of RAM which the guest sees as RAM,
or whether "maxmem" means the amount of RAM that the administrator
sees as used by the guest.  At the moment tnhere's no consistent
answer actually; but I am strongly of the opinion that for usability
the best answer is for "memory" to be the *total* amount of *host*
memory used by the guest.  In an ideal world, the admin should be able
to do "xl info", see that there is 3000MiB free, and then start a
guest with 3000MiB and expect it to succeed.  At the moment he has to
guess.


To confirm, do you include memory allocated by the hypervisor to keep 
track of the guest (i.e struc domain, struct vcpu...)?


If not, the problem stays the same because the admin will have to know 
how much memory Xen will allocate to keep track of the guest. So if "xl 
info" tells you that 3000MiB is free, you will only be able to use 
3000MiB - few kilobytes.


If yes, this would be a problem for migration because a newer version of 
Xen may allocate less/more memory.


Furthermore, with this suggestion, we will likely not be able to take 
advantage of 1GB superpage unless the admin knows how much memory will 
be used by Xen outside of the guest RAM (such as the console page, 
xenstore page, ).


IHMO, it would be easier if we define "maxmem" as the usable for guest 
RAM and then let the toolstack take add the extra memory (ACPI blob, 
special pages...).


Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 2/2] systemd: drop dependency on proc-xen.mount in xendriverdomain.service

2016-07-25 Thread Andrew Cooper

On 25/07/16 10:34, Wei Liu wrote:
> On Mon, Jul 25, 2016 at 11:30:32AM +0200, Marek Marczykowski-Górecki wrote:
>> On Mon, Jul 25, 2016 at 10:27:47AM +0100, Wei Liu wrote:
>>> On Sun, Jul 24, 2016 at 09:27:13PM +0200, Marek Marczykowski-Górecki wrote:
 It is no longer required since xl devd use /dev/xen interface.

>>> How would this unit work when there is no /dev/xen interface?
>> Does it happen in reality? I thought /proc/xen is deprecated for a long
>> time...

It really should be, but isn't.

/proc/xen/capabilities and /proc/xen/allsyms (?) don't have any
equivalent in /dev/

>>
> I can't tell whether it happens in reality or not. I also hope to get
> rid of deprecated interface but they are here to stay for as long as it
> takes.
>
> I guess the question is, do you have a compelling reason against keeping
> this dependency?

We really should be making xenfs deprecated and able to be compiled out
of Linux.  It is a hangover of the early classic kernels which isn't
necessary, and causes subtle/weird failures if it isn't mounted.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 3/4] tools/libxc: Avoid generating inappropriate zero-length records

2016-07-25 Thread Wei Liu

On Thu, Jul 21, 2016 at 06:17:36PM +0100, Andrew Cooper wrote:
> It was never intended for records such as these to be sent with zero content.
> 

Wouldn't it be better to modify write_split_record to ignore zero
content instead of patching up different places?

> Signed-off-by: Andrew Cooper 
> ---
> CC: Ian Jackson 
> CC: Wei Liu 
> ---
>  tools/libxc/xc_sr_save_x86_hvm.c |  4 
>  tools/libxc/xc_sr_save_x86_pv.c  | 12 
>  2 files changed, 16 insertions(+)
> 
> diff --git a/tools/libxc/xc_sr_save_x86_hvm.c 
> b/tools/libxc/xc_sr_save_x86_hvm.c
> index ba50a43..5401bf9 100644
> --- a/tools/libxc/xc_sr_save_x86_hvm.c
> +++ b/tools/libxc/xc_sr_save_x86_hvm.c
> @@ -112,6 +112,10 @@ static int write_hvm_params(struct xc_sr_context *ctx)
>  }
>  }
>  
> +/* No params? Skip this record. */
> +if ( hdr.count == 0 )
> +return 0;
> +
>  rc = write_split_record(ctx, &rec, entries, hdr.count * 
> sizeof(*entries));
>  if ( rc )
>  PERROR("Failed to write HVM_PARAMS record");
> diff --git a/tools/libxc/xc_sr_save_x86_pv.c b/tools/libxc/xc_sr_save_x86_pv.c
> index 4a29460..5fb9f2f 100644
> --- a/tools/libxc/xc_sr_save_x86_pv.c
> +++ b/tools/libxc/xc_sr_save_x86_pv.c
> @@ -607,6 +607,10 @@ static int write_one_vcpu_extended(struct xc_sr_context 
> *ctx, uint32_t id)
>  return -1;
>  }
>  
> +/* No content? Skip the record. */
> +if ( domctl.u.ext_vcpucontext.size == 0 )
> +return 0;
> +
>  return write_split_record(ctx, &rec, &domctl.u.ext_vcpucontext,
>domctl.u.ext_vcpucontext.size);
>  }
> @@ -662,6 +666,10 @@ static int write_one_vcpu_xsave(struct xc_sr_context 
> *ctx, uint32_t id)
>  goto err;
>  }
>  
> +/* No xsave state? Skip this record. */
> +if ( domctl.u.vcpuextstate.size == 0 )
> +goto out;
> +
>  rc = write_split_record(ctx, &rec, buffer, domctl.u.vcpuextstate.size);
>  if ( rc )
>  goto err;
> @@ -728,6 +736,10 @@ static int write_one_vcpu_msrs(struct xc_sr_context 
> *ctx, uint32_t id)
>  goto err;
>  }
>  
> +/* No MSRs? Skip this record. */
> +if ( domctl.u.vcpu_msrs.msr_count == 0 )
> +goto out;
> +
>  rc = write_split_record(ctx, &rec, buffer,
>  domctl.u.vcpu_msrs.msr_count *
>  sizeof(xen_domctl_vcpu_msr_t));
> -- 
> 2.1.4
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 4/4] tools/python: Adjust migration v2 library to warn about zero-length records

2016-07-25 Thread Wei Liu

On Thu, Jul 21, 2016 at 06:17:37PM +0100, Andrew Cooper wrote:
> These records shouldn't be in a stream, but accidentally are.  Warn about
> them, but don't abort the verification.
> 
> While here, add a missing length check to the X86_PV_P2M_FRAMES record
> checker.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Wei Liu 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 2/2] systemd: drop dependency on proc-xen.mount in xendriverdomain.service

2016-07-25 Thread Wei Liu

On Mon, Jul 25, 2016 at 10:45:54AM +0100, Andrew Cooper wrote:
> On 25/07/16 10:34, Wei Liu wrote:
> > On Mon, Jul 25, 2016 at 11:30:32AM +0200, Marek Marczykowski-Górecki wrote:
> >> On Mon, Jul 25, 2016 at 10:27:47AM +0100, Wei Liu wrote:
> >>> On Sun, Jul 24, 2016 at 09:27:13PM +0200, Marek Marczykowski-Górecki 
> >>> wrote:
>  It is no longer required since xl devd use /dev/xen interface.
> 
> >>> How would this unit work when there is no /dev/xen interface?
> >> Does it happen in reality? I thought /proc/xen is deprecated for a long
> >> time...
> 
> It really should be, but isn't.
> 
> /proc/xen/capabilities and /proc/xen/allsyms (?) don't have any
> equivalent in /dev/
> 
> >>
> > I can't tell whether it happens in reality or not. I also hope to get
> > rid of deprecated interface but they are here to stay for as long as it
> > takes.
> >
> > I guess the question is, do you have a compelling reason against keeping
> > this dependency?
> 
> We really should be making xenfs deprecated and able to be compiled out
> of Linux.  It is a hangover of the early classic kernels which isn't
> necessary, and causes subtle/weird failures if it isn't mounted.
> 

+1 from me of course!

When the kernel is ready, I shall remove the fallback in libraries and
then we can remove all references to proc-xen.mount and even
proc-xen.mount itself.

Wei.

> ~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 16/18] arm/altp2m: Extended libxl to activate altp2m on ARM.

2016-07-25 Thread Julien Grall


Hello,

On 25/07/16 10:04, Sergej Proskurin wrote:

On 07/25/2016 10:32 AM, Wei Liu wrote:

On Sun, Jul 24, 2016 at 06:06:00PM +0200, Sergej Proskurin wrote:

+xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2M,
+ libxl_defbool_val(info->u.pv.altp2m));
+}
+#endif
+
 static void hvm_set_conf_params(xc_interface *handle, uint32_t domid,
 libxl_domain_build_info *const info)
 {
@@ -433,6 +443,10 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid,
 return rc;
 #endif
 }
+#if defined(__arm__) || defined(__aarch64__)
+else /* info->type == LIBXL_DOMAIN_TYPE_PV */
+pv_set_conf_params(ctx->xch, domid, info);
+#endif

 rc = libxl__arch_domain_create(gc, d_config, domid);

diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index ef614be..0a164f9 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -554,6 +554,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
   ("features", string, {'const': True}),
   # Use host's E820 for PCI passthrough.
   ("e820_host", libxl_defbool),
+  ("altp2m", libxl_defbool),

Why is this placed in PV instead of arch_arm?


The current implementation mirrors the x86's altp2m, where the altp2m
field represents a property of HVM guests in b_info->u.hvm.altp2m. Since
guests on ARM are marked as PV, I have placed the field altp2m into
b_info->u.pv.altp2m.

However, if you believe that it would make more sense to place altp2m
into b_info->arch_arm.altp2m, I will try to adapt the affected fields.


OK. I don't think that one should be placed in u.pv because that union
is for x86. However it is also not suitable to just promote that to
a common field because x86 pv doesn't have altp2m support.

I think arch_arm would be a better location.



Alright, I will move the field altp2m to arch_arm and adopt the
initialization routines.


Would not it make more sense to introduce a generic field 'altp2m' (i.e 
outside of hvm and arch_arm)?


Otherwise, toolstack such as libvirt will need to have specific code to 
handle ARM and x86.


Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 3/4] tools/libxc: Avoid generating inappropriate zero-length records

2016-07-25 Thread Andrew Cooper

On 25/07/16 10:45, Wei Liu wrote:
> On Thu, Jul 21, 2016 at 06:17:36PM +0100, Andrew Cooper wrote:
>> It was never intended for records such as these to be sent with zero content.
>>
> Wouldn't it be better to modify write_split_record to ignore zero
> content instead of patching up different places?

That would catch the HVM_PARAMS, but not the other 3, which have an
extra 8 byte header.

However, having write_split_record() conditionally drop records is a
violation of the principle of least surprise.  Choosing to skip a record
is an important decision and needs to be easy to spot in the code.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 16/18] arm/altp2m: Extended libxl to activate altp2m on ARM.

2016-07-25 Thread Wei Liu

On Mon, Jul 25, 2016 at 10:49:41AM +0100, Julien Grall wrote:
> Hello,
> 
> On 25/07/16 10:04, Sergej Proskurin wrote:
> >On 07/25/2016 10:32 AM, Wei Liu wrote:
> >>On Sun, Jul 24, 2016 at 06:06:00PM +0200, Sergej Proskurin wrote:
> >+xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2M,
> >+ libxl_defbool_val(info->u.pv.altp2m));
> >+}
> >+#endif
> >+
> > static void hvm_set_conf_params(xc_interface *handle, uint32_t domid,
> > libxl_domain_build_info *const info)
> > {
> >@@ -433,6 +443,10 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid,
> > return rc;
> > #endif
> > }
> >+#if defined(__arm__) || defined(__aarch64__)
> >+else /* info->type == LIBXL_DOMAIN_TYPE_PV */
> >+pv_set_conf_params(ctx->xch, domid, info);
> >+#endif
> >
> > rc = libxl__arch_domain_create(gc, d_config, domid);
> >
> >diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> >index ef614be..0a164f9 100644
> >--- a/tools/libxl/libxl_types.idl
> >+++ b/tools/libxl/libxl_types.idl
> >@@ -554,6 +554,7 @@ libxl_domain_build_info = 
> >Struct("domain_build_info",[
> >   ("features", string, {'const': 
> > True}),
> >   # Use host's E820 for PCI 
> > passthrough.
> >   ("e820_host", libxl_defbool),
> >+  ("altp2m", libxl_defbool),
> Why is this placed in PV instead of arch_arm?
> 
> >>>The current implementation mirrors the x86's altp2m, where the altp2m
> >>>field represents a property of HVM guests in b_info->u.hvm.altp2m. Since
> >>>guests on ARM are marked as PV, I have placed the field altp2m into
> >>>b_info->u.pv.altp2m.
> >>>
> >>>However, if you believe that it would make more sense to place altp2m
> >>>into b_info->arch_arm.altp2m, I will try to adapt the affected fields.
> >>>
> >>OK. I don't think that one should be placed in u.pv because that union
> >>is for x86. However it is also not suitable to just promote that to
> >>a common field because x86 pv doesn't have altp2m support.
> >>
> >>I think arch_arm would be a better location.
> >>
> >
> >Alright, I will move the field altp2m to arch_arm and adopt the
> >initialization routines.
> 
> Would not it make more sense to introduce a generic field 'altp2m' (i.e
> outside of hvm and arch_arm)?
> 
> Otherwise, toolstack such as libvirt will need to have specific code to
> handle ARM and x86.
> 

Hmm... this is a compelling reason.

After mulling over this issue a bit more, I think I'm fine with
promoting altp2m to a common field.

Sergej, this is a patch to get you started. Please make sure setting the
old field still works. If I'm not clear enough, please ask.


Wei.

---8<---
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index ef614be..81d3ae5 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -494,6 +494,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
 # Note that the partial device tree should avoid to use the phandle
 # 65000 which is reserved by the toolstack.
 ("device_tree",  string),
+("altp2m",   libxl_defbool),
 ("u", KeyedUnion(None, libxl_domain_type, "type",
 [("hvm", Struct(None, [("firmware", string),
("bios", libxl_bios_type),
@@ -512,7 +513,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
("mmio_hole_memkb",  MemKB),
("timer_mode",   libxl_timer_mode),
("nested_hvm",   libxl_defbool),
-   ("altp2m",   libxl_defbool),
+   ("altp2m",   libxl_defbool), # 
deprecated, please use common field
("smbios_firmware",  string),
("acpi_firmware",string),
("hdtype",   libxl_hdtype),

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 3/4] tools/libxc: Avoid generating inappropriate zero-length records

2016-07-25 Thread Wei Liu

On Mon, Jul 25, 2016 at 10:57:13AM +0100, Andrew Cooper wrote:
> On 25/07/16 10:45, Wei Liu wrote:
> > On Thu, Jul 21, 2016 at 06:17:36PM +0100, Andrew Cooper wrote:
> >> It was never intended for records such as these to be sent with zero 
> >> content.
> >>
> > Wouldn't it be better to modify write_split_record to ignore zero
> > content instead of patching up different places?
> 
> That would catch the HVM_PARAMS, but not the other 3, which have an
> extra 8 byte header.
> 
> However, having write_split_record() conditionally drop records is a
> violation of the principle of least surprise.  Choosing to skip a record
> is an important decision and needs to be easy to spot in the code.
> 

Fair enough.

> ~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] x86/vMSI-X: Fix host crash when shutting down guests with MSI capable devices

2016-07-25 Thread Andrew Cooper

On 22/07/16 09:50, Sander Eikelenboom wrote:
> Thursday, July 21, 2016, 12:18:37 PM, you wrote:
>
>> c/s 74c6dc2d "x86/vMSI-X: defer intercept handler registration" caused MSI-X
>> table infrastructure not to always be initialised, but it missed one path
>> which needed an is-initialised check.
>> If a devices is passed through to a domain which is MSI capable but not MSI-X
>> capable, the call to msixtbl_init() is omitted, but a 
>> XEN_DOMCTL_unbind_pt_irq
>> hypercall still calls into msixtbl_pt_unregister().  This follows the linked
>> list pointer which is still NULL.
>> Introduce an is-initalised check to msixtbl_pt_unregister().
>> Furthermore, the purpose of the open-coded msixtbl_list.next check is rather
>> subtle.  Introduce an msixtbl_initialised() predicate instead, which makes 
>> its
>> purpose far more obvious.
>> Reported-by: Sander Eikelenboom 
>> Signed-off-by: Andrew Cooper 
>> ---
>> CC: Jan Beulich 
>> CC: Sander Eikelenboom 
>> Sander - would you mind double checking this patch?
>> ---
> Hi Andrew,
>
> Just got the chance to test and it works for me !
>
> Thanks,

May I take that as a Test-by: then please?

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] x86/vMSI-X: Fix host crash when shutting down guests with MSI capable devices

2016-07-25 Thread Andrew Cooper

On 25/07/16 11:16, Andrew Cooper wrote:
> On 22/07/16 09:50, Sander Eikelenboom wrote:
>> Thursday, July 21, 2016, 12:18:37 PM, you wrote:
>>
>>> c/s 74c6dc2d "x86/vMSI-X: defer intercept handler registration" caused MSI-X
>>> table infrastructure not to always be initialised, but it missed one path
>>> which needed an is-initialised check.
>>> If a devices is passed through to a domain which is MSI capable but not 
>>> MSI-X
>>> capable, the call to msixtbl_init() is omitted, but a 
>>> XEN_DOMCTL_unbind_pt_irq
>>> hypercall still calls into msixtbl_pt_unregister().  This follows the linked
>>> list pointer which is still NULL.
>>> Introduce an is-initalised check to msixtbl_pt_unregister().
>>> Furthermore, the purpose of the open-coded msixtbl_list.next check is rather
>>> subtle.  Introduce an msixtbl_initialised() predicate instead, which makes 
>>> its
>>> purpose far more obvious.
>>> Reported-by: Sander Eikelenboom 
>>> Signed-off-by: Andrew Cooper 
>>> ---
>>> CC: Jan Beulich 
>>> CC: Sander Eikelenboom 
>>> Sander - would you mind double checking this patch?
>>> ---
>> Hi Andrew,
>>
>> Just got the chance to test and it works for me !
>>
>> Thanks,
> May I take that as a Test-by: then please?

And of course, I meant Tested-by:

~Andrew


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 1/4] docs: Clarify the expected behaviour of zero length records

2016-07-25 Thread David Vrabel

On 21/07/16 18:17, Andrew Cooper wrote:
> The sending side shouldn't send any variable sized records which end up having
> zero content, but the receiving side will need to tolerate such records for
> compatibility purposes.
> 
> Signed-off-by: Andrew Cooper 
> ---
> CC: Ian Jackson 
> CC: Wei Liu 
> ---
>  docs/specs/libxc-migration-stream.pandoc | 16 +++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/specs/libxc-migration-stream.pandoc 
> b/docs/specs/libxc-migration-stream.pandoc
> index 31eba10..a90bc5d 100644
> --- a/docs/specs/libxc-migration-stream.pandoc
> +++ b/docs/specs/libxc-migration-stream.pandoc
> @@ -3,7 +3,7 @@
>Andrew Cooper <>
>Wen Congyang <>
>Yang Hongyang <>
> -% Revision 1
> +% Revision 2
>  
>  Introduction
>  
> @@ -631,6 +631,10 @@ The set of valid records depends on the guest 
> architecture and type.  No
>  assumptions should be made about the ordering or interleaving of
>  independent records.  Record dependencies are noted below.
>  
> +Some records have an exactly specified size.  Some records have variable size
> +depending on their content.  A record with variable size which ends up being
> +zero should be omitted entirely from the stream by the sending side.

I disagree. I think the stream should include the records with the empty
content.  This gives better consistency and does not require changes to
the stream.

David

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] Kernel panic on Xen virtualisation in Debian

2016-07-25 Thread Wei Liu

On Mon, Jul 25, 2016 at 11:38:20AM +0200, Ingo Jürgensmann wrote:
> On 22.07.2016 12:21, Ingo Jürgensmann wrote:
> >On 22.07.2016 11:03, Ingo Jürgensmann wrote:
> >
> >>In the meanwhile, I activated IPv6 again on Tuesday evening and today
> >>the server crashed again some minutes ago. Here's the output from
> >>netconsole:
> >... and the second subsequent crash:
> 
> ... and another crash below...
> 
> But in the meanwhile, I wonder if anyone except Andreas and I has some
> interest in fixing this issue, as there were absolutely no comments or
> feedback on my previous mails. Quite disappointing. :-/
> 

I did skim your emails. But the oops was happening in memcpy+0x6 which
indicated it came back to the origin question why would it got an
exception there.

Just by staring at the code doesn't get me anywhere. Without a concrete
reproduction of the issue, I'm afraid I can't provide more input here.

There are several moving parts:

0. Hardware
1. Xen hypervisor
2. Dom0 kernel

Your report and the debian report suggested that Dom0 kernel is less
likely to be the culprit because you've tried different Dom0 kernels.

As for Xen, not sure if you would be up for trying a debug build from
source tree. That would help provide information on whether this is a
bug in Xen or not.

As for hardware, it would be worth trying whether this issue happens on
other hardware platform.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] x86/vMSI-X: Fix host crash when shutting down guests with MSI capable devices

2016-07-25 Thread Sander Eikelenboom


Monday, July 25, 2016, 12:19:55 PM, you wrote:

> On 25/07/16 11:16, Andrew Cooper wrote:
>> On 22/07/16 09:50, Sander Eikelenboom wrote:
>>> Thursday, July 21, 2016, 12:18:37 PM, you wrote:
>>>
 c/s 74c6dc2d "x86/vMSI-X: defer intercept handler registration" caused 
 MSI-X
 table infrastructure not to always be initialised, but it missed one path
 which needed an is-initialised check.
 If a devices is passed through to a domain which is MSI capable but not 
 MSI-X
 capable, the call to msixtbl_init() is omitted, but a 
 XEN_DOMCTL_unbind_pt_irq
 hypercall still calls into msixtbl_pt_unregister().  This follows the 
 linked
 list pointer which is still NULL.
 Introduce an is-initalised check to msixtbl_pt_unregister().
 Furthermore, the purpose of the open-coded msixtbl_list.next check is 
 rather
 subtle.  Introduce an msixtbl_initialised() predicate instead, which makes 
 its
 purpose far more obvious.
 Reported-by: Sander Eikelenboom 
 Signed-off-by: Andrew Cooper 
 ---
 CC: Jan Beulich 
 CC: Sander Eikelenboom 
 Sander - would you mind double checking this patch?
 ---
>>> Hi Andrew,
>>>
>>> Just got the chance to test and it works for me !
>>>
>>> Thanks,
>> May I take that as a Test-by: then please?

> And of course, I meant Tested-by:

Yes, thanks for the quick fix !

--
Sander

> ~Andrew



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 1/4] docs: Clarify the expected behaviour of zero length records

2016-07-25 Thread Andrew Cooper

On 25/07/16 11:21, David Vrabel wrote:
> On 21/07/16 18:17, Andrew Cooper wrote:
>> The sending side shouldn't send any variable sized records which end up 
>> having
>> zero content, but the receiving side will need to tolerate such records for
>> compatibility purposes.
>>
>> Signed-off-by: Andrew Cooper 
>> ---
>> CC: Ian Jackson 
>> CC: Wei Liu 
>> ---
>>  docs/specs/libxc-migration-stream.pandoc | 16 +++-
>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/docs/specs/libxc-migration-stream.pandoc 
>> b/docs/specs/libxc-migration-stream.pandoc
>> index 31eba10..a90bc5d 100644
>> --- a/docs/specs/libxc-migration-stream.pandoc
>> +++ b/docs/specs/libxc-migration-stream.pandoc
>> @@ -3,7 +3,7 @@
>>Andrew Cooper <>
>>Wen Congyang <>
>>Yang Hongyang <>
>> -% Revision 1
>> +% Revision 2
>>  
>>  Introduction
>>  
>> @@ -631,6 +631,10 @@ The set of valid records depends on the guest 
>> architecture and type.  No
>>  assumptions should be made about the ordering or interleaving of
>>  independent records.  Record dependencies are noted below.
>>  
>> +Some records have an exactly specified size.  Some records have variable 
>> size
>> +depending on their content.  A record with variable size which ends up being
>> +zero should be omitted entirely from the stream by the sending side.
> I disagree. I think the stream should include the records with the empty
> content.  This gives better consistency and does not require changes to
> the stream.

There are already some which are properly omitted, like the vcpu records
for offline vcpus.

There is no point having empty records; omitting them is an optimisation
which we absolutely shouldn't preclude.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] x86/vMSI-X: Fix host crash when shutting down guests with MSI capable devices

2016-07-25 Thread George Dunlap

On Thu, Jul 21, 2016 at 11:18 AM, Andrew Cooper
 wrote:
> c/s 74c6dc2d "x86/vMSI-X: defer intercept handler registration" caused MSI-X
> table infrastructure not to always be initialised, but it missed one path
> which needed an is-initialised check.
>
> If a devices is passed through to a domain which is MSI capable but not MSI-X
> capable, the call to msixtbl_init() is omitted, but a XEN_DOMCTL_unbind_pt_irq
> hypercall still calls into msixtbl_pt_unregister().  This follows the linked
> list pointer which is still NULL.
>
> Introduce an is-initalised check to msixtbl_pt_unregister().
>
> Furthermore, the purpose of the open-coded msixtbl_list.next check is rather
> subtle.  Introduce an msixtbl_initialised() predicate instead, which makes its
> purpose far more obvious.

Thanks for this bit.

> Reported-by: Sander Eikelenboom 
> Signed-off-by: Andrew Cooper 

Reviewed-by: George Dunlap 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 3/3] xen-blkfront: dynamic configuration of per-vbd resources

2016-07-25 Thread Bob Liu


On 07/25/2016 05:20 PM, Roger Pau Monné wrote:
> On Sat, Jul 23, 2016 at 06:18:23AM +0800, Bob Liu wrote:
>>
>> On 07/22/2016 07:45 PM, Roger Pau Monné wrote:
>>> On Fri, Jul 22, 2016 at 05:43:32PM +0800, Bob Liu wrote:

 On 07/22/2016 05:34 PM, Roger Pau Monné wrote:
> On Fri, Jul 22, 2016 at 04:17:48PM +0800, Bob Liu wrote:
>>
>> On 07/22/2016 03:45 PM, Roger Pau Monné wrote:
>>> On Thu, Jul 21, 2016 at 06:08:05PM +0800, Bob Liu wrote:

 On 07/21/2016 04:57 PM, Roger Pau Monné wrote:
>> ..[snip]..
>> +
>> +static ssize_t dynamic_reconfig_device(struct blkfront_info *info, 
>> ssize_t count)
>> +{
>> +unsigned int i;
>> +int err = -EBUSY;
>> +
>> +/*
>> + * Make sure no migration in parallel, device lock is actually a
>> + * mutex.
>> + */
>> +if (!device_trylock(&info->xbdev->dev)) {
>> +pr_err("Fail to acquire dev:%s lock, may be in 
>> migration.\n",
>> +dev_name(&info->xbdev->dev));
>> +return err;
>> +}
>> +
>> +/*
>> + * Prevent new requests and guarantee no uncompleted reqs.
>> + */
>> +blk_mq_freeze_queue(info->rq);
>> +if (part_in_flight(&info->gd->part0))
>> +goto out;
>> +
>> +/*
>> + * FrontBackend
>> + * Switch to XenbusStateClosed
>> + *  frontend_changed():
>> + *   case XenbusStateClosed:
>> + *  
>> xen_blkif_disconnect()
>> + *  Switch to 
>> XenbusStateClosed
>> + * blkfront_resume():
>> + *  frontend_changed():
>> + *  reconnect
>> + * Wait until XenbusStateConnected
>> + */
>> +info->reconfiguring = true;
>> +xenbus_switch_state(info->xbdev, XenbusStateClosed);
>> +
>> +/* Poll every 100ms, 1 minute timeout. */
>> +for (i = 0; i < 600; i++) {
>> +/*
>> + * Wait backend enter XenbusStateClosed, 
>> blkback_changed()
>> + * will clear reconfiguring.
>> + */
>> +if (!info->reconfiguring)
>> +goto resume;
>> +schedule_timeout_interruptible(msecs_to_jiffies(100));
>> +}
>
> Instead of having this wait, could you just set info->reconfiguring = 
> 1, set 
> the frontend state to XenbusStateClosed and mimic exactly what a 
> resume from 
> suspension does? blkback_changed would have to set the frontend state 
> to 
> InitWait when it detects that the backend has switched to Closed, and 
> call 
> blkfront_resume.


 I think that won't work.
 In the real "resume" case, the power management system will trigger 
 all ->resume() path.
 But there is no place for dynamic configuration.
>>>
>>> Hello,
>>>
>>> I think it should be possible to set info->reconfiguring and wait for 
>>> the 
>>> backend to switch to state Closed, at that point we should call 
>>> blkif_resume 
>>> (from blkback_changed) and the backend will follow the reconection.
>>>
>>
>> Okay, I get your point. Yes, that's an option.
>>
>> But this will make 'dynamic configuration' to be async, I'm worry about 
>> the end-user will get panic.
>> E.g
>> A end-user "echo  > /sys/devices/vbd-xxx/max_indirect_segs",
>> but then the device will be Closed and disappeared, the user have to 
>> wait for a random time so that the device can resume.
>
> That should not happen, AFAICT on migration the device never dissapears. 

 Oh, yes.

> alloc_disk and friends should not be called on resume from migration (see 
> the switch in blkfront_connect, you should take the BLKIF_STATE_SUSPENDED 
> path for the reconfiguration).
>

 What about if the end-user starts I/O immediately after writing new value 
 to /sys?
 But the resume is still in progress.
>>>
>>> blkif_free already stops the queues by calling blk_mq_stop_hw_queues, and 
>>> blkif_queue_request will refuse to put anything on the ring if the state 
>>> is different than connected, which in turn makes blkif_queue_rq call 
>>> blk_mq_stop_hw_queue to stop the queue, so it should be safe.
>>>
>>
>> But this will surprise the end-user, our user script is l

Re: [Xen-devel] [PATCH 3/4] tools/libxc: Avoid generating inappropriate zero-length records

2016-07-25 Thread David Vrabel

On 21/07/16 18:17, Andrew Cooper wrote:
> It was never intended for records such as these to be sent with zero content.

As the original author of the specification I'm perhaps best placed to
say what the original intention is.

For records such as HVM_PARAMS which consist of a set of N items, the
intention was to most definitely send a record with 0 items.

For records that fetch an opaque blob from the hypervisor, again the
intention was to sent this blob as-is with no sort of processing or
other checking. i.e., if the hypervisor gives us a zero-length blob we
sent that as-is.

This makes all the streams look the same with all the same records,
regardless of what hardware platform it was run on.  Including
zero-length/count records also makes diagnosing problems easier -- the
empty record is visible in the stream instead of having to remember that
sometimes these records are deliberately omitted.

As such, this series should be limited to making the restore side handle
the zero count sets or zero length blobs if it does not do so already.

The specification should be clarified to note that some records may have
zero-length blobs or contain zero items.

David


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 1/4] docs: Clarify the expected behaviour of zero length records

2016-07-25 Thread David Vrabel

On 25/07/16 11:25, Andrew Cooper wrote:
> On 25/07/16 11:21, David Vrabel wrote:
>> On 21/07/16 18:17, Andrew Cooper wrote:
>>> The sending side shouldn't send any variable sized records which end up 
>>> having
>>> zero content, but the receiving side will need to tolerate such records for
>>> compatibility purposes.
>>>
>>> Signed-off-by: Andrew Cooper 
>>> ---
>>> CC: Ian Jackson 
>>> CC: Wei Liu 
>>> ---
>>>  docs/specs/libxc-migration-stream.pandoc | 16 +++-
>>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/docs/specs/libxc-migration-stream.pandoc 
>>> b/docs/specs/libxc-migration-stream.pandoc
>>> index 31eba10..a90bc5d 100644
>>> --- a/docs/specs/libxc-migration-stream.pandoc
>>> +++ b/docs/specs/libxc-migration-stream.pandoc
>>> @@ -3,7 +3,7 @@
>>>Andrew Cooper <>
>>>Wen Congyang <>
>>>Yang Hongyang <>
>>> -% Revision 1
>>> +% Revision 2
>>>  
>>>  Introduction
>>>  
>>> @@ -631,6 +631,10 @@ The set of valid records depends on the guest 
>>> architecture and type.  No
>>>  assumptions should be made about the ordering or interleaving of
>>>  independent records.  Record dependencies are noted below.
>>>  
>>> +Some records have an exactly specified size.  Some records have variable 
>>> size
>>> +depending on their content.  A record with variable size which ends up 
>>> being
>>> +zero should be omitted entirely from the stream by the sending side.
>> I disagree. I think the stream should include the records with the empty
>> content.  This gives better consistency and does not require changes to
>> the stream.
> 
> There are already some which are properly omitted, like the vcpu records
> for offline vcpus.
> 
> There is no point having empty records; omitting them is an optimisation
> which we absolutely shouldn't preclude.

The optimization doesn't matter since these records are so tiny.

I've expanded on why these should be included in another reply.

David

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

[Xen-devel] [OSSTEST PATCH 1/3] cr-ensure-disk-space: Fix reference to iteration_proceed

2016-07-25 Thread Ian Jackson

8738d60622c1 "cr-ensure-disk-space: Break out check_space" renamed
iteration_continue to iteration_proceed but did not change the call
site.

Signed-off-by: Ian Jackson 
---
 cr-ensure-disk-space | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cr-ensure-disk-space b/cr-ensure-disk-space
index 2d299dd..bb98835 100755
--- a/cr-ensure-disk-space
+++ b/cr-ensure-disk-space
@@ -152,7 +152,7 @@ db_retry($dbh_tests,[], sub {
 @flights = ();
 for (;;) {
last if check_space();
-   iteration_continue();
+   iteration_proceed();
 }
 });
 
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

[Xen-devel] [OSSTEST PATCH 3/3] cr-ensure-disk-space: Correct stdout output

2016-07-25 Thread Ian Jackson

d221996eea64 "cr-ensure-disk-space: Run check_space before taking
lock" introduced an additional call to check_space but check_space
prints the start of a message (with no newline) expecting
iteration_proceed to print the rest.

Move $|=1 up appropriately and add a couple of messages in the right
place.  This involves calling quit_ok rather than exit 0.

Signed-off-by: Ian Jackson 
---
 cr-ensure-disk-space | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/cr-ensure-disk-space b/cr-ensure-disk-space
index 6a9781a..ff6b01b 100755
--- a/cr-ensure-disk-space
+++ b/cr-ensure-disk-space
@@ -68,13 +68,15 @@ sub quit_ok () {
 
 $|=1;
 
-exit 0 if check_space;
+check_space if check_space;
+
+print "taking lock...";
 
 my $lock = "$c{GlobalLockDir}/publish-lock";
 open LOCK, "> $lock" or die "$lock $!";
 flock LOCK, LOCK_EX or die $!;
 
-$|=1;
+print "proceeding...\n";
 
 my $chkq= db_prepare("SELECT * FROM flights WHERE flight=?");
 
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

[Xen-devel] [OSSTEST PATCH 2/3] cr-ensure-disk-space: Break out quit_ok

2016-07-25 Thread Ian Jackson

We are going to want another call site.

No functional change.

Signed-off-by: Ian Jackson 
---
 cr-ensure-disk-space | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/cr-ensure-disk-space b/cr-ensure-disk-space
index bb98835..6a9781a 100755
--- a/cr-ensure-disk-space
+++ b/cr-ensure-disk-space
@@ -61,6 +61,13 @@ sub check_space () {
 return $space >= logcfg('MinSpaceMby');
 }
 
+sub quit_ok () {
+printf "ok.\n";
+exit 0;
+}
+
+$|=1;
+
 exit 0 if check_space;
 
 my $lock = "$c{GlobalLockDir}/publish-lock";
@@ -156,5 +163,4 @@ db_retry($dbh_tests,[], sub {
 }
 });
 
-printf "ok.\n";
-exit 0;
+quit_ok();
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 1/4] docs: Clarify the expected behaviour of zero length records

2016-07-25 Thread Andrew Cooper

On 25/07/16 11:35, David Vrabel wrote:
> On 25/07/16 11:25, Andrew Cooper wrote:
>> On 25/07/16 11:21, David Vrabel wrote:
>>> On 21/07/16 18:17, Andrew Cooper wrote:
 The sending side shouldn't send any variable sized records which end up 
 having
 zero content, but the receiving side will need to tolerate such records for
 compatibility purposes.

 Signed-off-by: Andrew Cooper 
 ---
 CC: Ian Jackson 
 CC: Wei Liu 
 ---
  docs/specs/libxc-migration-stream.pandoc | 16 +++-
  1 file changed, 15 insertions(+), 1 deletion(-)

 diff --git a/docs/specs/libxc-migration-stream.pandoc 
 b/docs/specs/libxc-migration-stream.pandoc
 index 31eba10..a90bc5d 100644
 --- a/docs/specs/libxc-migration-stream.pandoc
 +++ b/docs/specs/libxc-migration-stream.pandoc
 @@ -3,7 +3,7 @@
Andrew Cooper <>
Wen Congyang <>
Yang Hongyang <>
 -% Revision 1
 +% Revision 2
  
  Introduction
  
 @@ -631,6 +631,10 @@ The set of valid records depends on the guest 
 architecture and type.  No
  assumptions should be made about the ordering or interleaving of
  independent records.  Record dependencies are noted below.
  
 +Some records have an exactly specified size.  Some records have variable 
 size
 +depending on their content.  A record with variable size which ends up 
 being
 +zero should be omitted entirely from the stream by the sending side.
>>> I disagree. I think the stream should include the records with the empty
>>> content.  This gives better consistency and does not require changes to
>>> the stream.
>> There are already some which are properly omitted, like the vcpu records
>> for offline vcpus.
>>
>> There is no point having empty records; omitting them is an optimisation
>> which we absolutely shouldn't preclude.
> The optimization doesn't matter since these records are so tiny.
>
> I've expanded on why these should be included in another reply.

We already omit most zero-length records.  That ship has already sailed.

This bug only presents itself because of two hypercalls in Xen returning
different size characteristics.  We already checked for zero first, and
fail to check for zero the second time.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] Kernel panic on Xen virtualisation in Debian

2016-07-25 Thread Andreas Ziegler

> On Mon, Jul 25, 2016 at 11:38:20AM +0200, Ingo Jürgensmann wrote:
>> On 22.07.2016 12:21, Ingo Jürgensmann wrote:
>>> On 22.07.2016 11:03, Ingo Jürgensmann wrote:
>>>
 In the meanwhile, I activated IPv6 again on Tuesday evening and today
 the server crashed again some minutes ago. Here's the output from
 netconsole:
>>> ... and the second subsequent crash:
>>
>> ... and another crash below...
>>
>> But in the meanwhile, I wonder if anyone except Andreas and I has some
>> interest in fixing this issue, as there were absolutely no comments or
>> feedback on my previous mails. Quite disappointing. :-/
>>
> 
> I did skim your emails. But the oops was happening in memcpy+0x6 which
> indicated it came back to the origin question why would it got an
> exception there.
> 
[...]
> Your report and the debian report suggested that Dom0 kernel is less
> likely to be the culprit because you've tried different Dom0 kernels.

yes we did. but nothing newer than 3.16 so far, we could try that, too.


> As for Xen, not sure if you would be up for trying a debug build from
> source tree. That would help provide information on whether this is a
> bug in Xen or not.

ok, we'll try!


> As for hardware, it would be worth trying whether this issue happens on
> other hardware platform.

as i wrote earlier on the list, we have two platforms which only have
the Intel Xeon CPU in common, nothing else - and even that is from a
very different generation.


Andreas.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 1/4] docs: Clarify the expected behaviour of zero length records

2016-07-25 Thread David Vrabel

On 25/07/16 11:38, Andrew Cooper wrote:
> On 25/07/16 11:35, David Vrabel wrote:
>> On 25/07/16 11:25, Andrew Cooper wrote:
>>> On 25/07/16 11:21, David Vrabel wrote:
 On 21/07/16 18:17, Andrew Cooper wrote:
> The sending side shouldn't send any variable sized records which end up 
> having
> zero content, but the receiving side will need to tolerate such records 
> for
> compatibility purposes.
>
> Signed-off-by: Andrew Cooper 
> ---
> CC: Ian Jackson 
> CC: Wei Liu 
> ---
>  docs/specs/libxc-migration-stream.pandoc | 16 +++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/docs/specs/libxc-migration-stream.pandoc 
> b/docs/specs/libxc-migration-stream.pandoc
> index 31eba10..a90bc5d 100644
> --- a/docs/specs/libxc-migration-stream.pandoc
> +++ b/docs/specs/libxc-migration-stream.pandoc
> @@ -3,7 +3,7 @@
>Andrew Cooper <>
>Wen Congyang <>
>Yang Hongyang <>
> -% Revision 1
> +% Revision 2
>  
>  Introduction
>  
> @@ -631,6 +631,10 @@ The set of valid records depends on the guest 
> architecture and type.  No
>  assumptions should be made about the ordering or interleaving of
>  independent records.  Record dependencies are noted below.
>  
> +Some records have an exactly specified size.  Some records have variable 
> size
> +depending on their content.  A record with variable size which ends up 
> being
> +zero should be omitted entirely from the stream by the sending side.
 I disagree. I think the stream should include the records with the empty
 content.  This gives better consistency and does not require changes to
 the stream.
>>> There are already some which are properly omitted, like the vcpu records
>>> for offline vcpus.
>>>
>>> There is no point having empty records; omitting them is an optimisation
>>> which we absolutely shouldn't preclude.
>> The optimization doesn't matter since these records are so tiny.
>>
>> I've expanded on why these should be included in another reply.
> 
> We already omit most zero-length records.  That ship has already sailed.
> 
> This bug only presents itself because of two hypercalls in Xen returning
> different size characteristics.  We already checked for zero first, and
> fail to check for zero the second time.

Then document that those specific records may be omitted, but don't
spread the brokenness to other (future) records.

David


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 1/4] docs: Clarify the expected behaviour of zero length records

2016-07-25 Thread Andrew Cooper

On 25/07/16 11:44, David Vrabel wrote:
> On 25/07/16 11:38, Andrew Cooper wrote:
>> On 25/07/16 11:35, David Vrabel wrote:
>>> On 25/07/16 11:25, Andrew Cooper wrote:
 On 25/07/16 11:21, David Vrabel wrote:
> On 21/07/16 18:17, Andrew Cooper wrote:
>> The sending side shouldn't send any variable sized records which end up 
>> having
>> zero content, but the receiving side will need to tolerate such records 
>> for
>> compatibility purposes.
>>
>> Signed-off-by: Andrew Cooper 
>> ---
>> CC: Ian Jackson 
>> CC: Wei Liu 
>> ---
>>  docs/specs/libxc-migration-stream.pandoc | 16 +++-
>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/docs/specs/libxc-migration-stream.pandoc 
>> b/docs/specs/libxc-migration-stream.pandoc
>> index 31eba10..a90bc5d 100644
>> --- a/docs/specs/libxc-migration-stream.pandoc
>> +++ b/docs/specs/libxc-migration-stream.pandoc
>> @@ -3,7 +3,7 @@
>>Andrew Cooper <>
>>Wen Congyang <>
>>Yang Hongyang <>
>> -% Revision 1
>> +% Revision 2
>>  
>>  Introduction
>>  
>> @@ -631,6 +631,10 @@ The set of valid records depends on the guest 
>> architecture and type.  No
>>  assumptions should be made about the ordering or interleaving of
>>  independent records.  Record dependencies are noted below.
>>  
>> +Some records have an exactly specified size.  Some records have 
>> variable size
>> +depending on their content.  A record with variable size which ends up 
>> being
>> +zero should be omitted entirely from the stream by the sending side.
> I disagree. I think the stream should include the records with the empty
> content.  This gives better consistency and does not require changes to
> the stream.
 There are already some which are properly omitted, like the vcpu records
 for offline vcpus.

 There is no point having empty records; omitting them is an optimisation
 which we absolutely shouldn't preclude.
>>> The optimization doesn't matter since these records are so tiny.
>>>
>>> I've expanded on why these should be included in another reply.
>> We already omit most zero-length records.  That ship has already sailed.
>>
>> This bug only presents itself because of two hypercalls in Xen returning
>> different size characteristics.  We already checked for zero first, and
>> fail to check for zero the second time.
> Then document that those specific records may be omitted, but don't
> spread the brokenness to other (future) records.

I disagree, and do not consider this broken.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 2/6] libxl: add "pv device mode needed" support to device type framework

2016-07-25 Thread Wei Liu

On Tue, Jul 12, 2016 at 05:30:40PM +0200, Juergen Gross wrote:
> Add another callback to the device type framework in order to aid
> decision whether a pv domain needs a device model.
> 
> Signed-off-by: Juergen Gross 

Acked-by: Wei Liu 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 1/6] libxl: add "merge" function to generic device type support

2016-07-25 Thread Wei Liu

On Tue, Jul 12, 2016 at 05:30:39PM +0200, Juergen Gross wrote:
> Instead of using a macro generating the code to merge xenstore and
> json configuration data, use the generic device type support for
> this purpose.
> 
> This requires to add some accessor functions to the framework and
> a structure for disks (as disks are added separately they didn't need
> such a structure up to now).
> 
> Signed-off-by: Juergen Gross 

I definitely think this is a good idea.

The code looks good to me:

Acked-by: Wei Liu 

I didn't do a line-by-line review though because most code looks very
mechanical. I'm confident that we can sort out issues should they arise.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 6/6] libxl: move common nic stuff into one source

2016-07-25 Thread Wei Liu

On Tue, Jul 12, 2016 at 05:30:44PM +0200, Juergen Gross wrote:
> Put all nic related stuff of libxl form common files into a dedicated
> source file.
> 
> Signed-off-by: Juergen Gross 

Assuming this is pure code motion:

Acked-by: Wei Liu 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 4/6] libxl: split libxl vtpm code into one source

2016-07-25 Thread Wei Liu

On Tue, Jul 12, 2016 at 05:30:42PM +0200, Juergen Gross wrote:
> Put all vtpm related stuff of libxl into a dedicated source file.
> 
> Signed-off-by: Juergen Gross 

Acked-by: Wei Liu 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 3/6] libxl: move library pvusb specific code into libxl_pvusb.c

2016-07-25 Thread Wei Liu

On Tue, Jul 12, 2016 at 05:30:41PM +0200, Juergen Gross wrote:
> Outside libxl_pvusb.c only libxl_util.c still contains some pvusb code.
> 
> Move it to libxl_pvusb.c.
> 
> Signed-off-by: Juergen Gross 

Acked-by: Wei Liu 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 5/6] libxl: add config update callback to device type framework

2016-07-25 Thread Wei Liu

On Tue, Jul 12, 2016 at 05:30:43PM +0200, Juergen Gross wrote:
> Some device types require a configuration update after resume of
> domain. Add a callback for this purpose.
> 
> Signed-off-by: Juergen Gross 

Acked-by: Wei Liu 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] Kernel panic on Xen virtualisation in Debian

2016-07-25 Thread David Vrabel

On 25/07/16 10:38, Ingo Jürgensmann wrote:
> On 22.07.2016 12:21, Ingo Jürgensmann wrote:
>> On 22.07.2016 11:03, Ingo Jürgensmann wrote:
>>
>>> In the meanwhile, I activated IPv6 again on Tuesday evening and today
>>> the server crashed again some minutes ago. Here's the output from
>>> netconsole:
>> ... and the second subsequent crash:
> 
> ... and another crash below...
> 
> But in the meanwhile, I wonder if anyone except Andreas and I has some
> interest in fixing this issue, as there were absolutely no comments or
> feedback on my previous mails. Quite disappointing. :-/

The level of technical support is somewhat proportional to what you pay.

> Jul 25 10:50:00 31.172.31.251 [257646.574819] CPU: 0 PID: 0 Comm:
> swapper/0 Not tainted 3.16.0-4-amd64 #1 Debian 3.16.7-ckt25-2+deb8u3
> Jul 25 10:50:00 31.172.31.251 [257646.574874] Hardware name: Supermicro
> X9SRE/X9SRE-3F/X9SRi/X9SRi-3F/X9SRE/X9SRE-3F/X9SRi/X9SRi-3F, BIOS 3.0a
> 01/03/2014
> Jul 25 10:50:00 31.172.31.251 [257646.574934] task: 8181a460 ti:
> 8180 task.ti: 8180
> Jul 25 10:50:00 31.172.31.251 [257646.575080] RIP:
> e030:[]
> Jul 25 10:50:00 31.172.31.251  [] memcpy+0x6/0x110


I think that there's a path in the network stack that is cloning frags
without orphaning them.  The original frags are freed (and unmapped) and
the cloned frags now have unmapped pages.

I think newer kernels address at least one of these paths.

David


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] Kernel panic on Xen virtualisation in Debian

2016-07-25 Thread Andreas Ziegler

>> Your report and the debian report suggested that Dom0 kernel is less
>> likely to be the culprit because you've tried different Dom0 kernels.
> 
> yes we did. but nothing newer than 3.16 so far, we could try that, too.

i have to correct myself:
we also tried 4.4 a while ago, maybe we should try 4.6 now

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [Embedded-pv-devel] Xen Project Developer Summit : program Published

2016-07-25 Thread Dario Faggioli

On Fri, 2016-07-15 at 10:56 -0400, Meng Xu wrote:
> On Fri, Jul 15, 2016 at 8:01 AM, Dario Faggioli
>  wrote:
> > 
> > * "Consideration of Real Time GPU Scheduling of XenGT in Automotive
> >    Embedded System"
> >   Sangyun Lee, LG Electronics http://sched.co/7Lh3
> > 
> > * "Xenbedded: Xen-based client virtualization for phones and
> > tablets"
> >   Chris Patterson & Kyle Temkin, AIS, Inc. http://sched.co/7LhI
> > 
> > * "Review and Analysis of Performance Metrics of the Xen Hypervisor
> > on
> >    Zynq UltraScale+ MPSoC"
> >    Jarvis Roach & Benjamin Sanda, DornerWorks http://sched.co/7Lh0
> > 
> > How cool is all that, eh?
>
> Very cool~ :-)
> 
:-)

> > We better continue to work on putting RTDS into as good as possible
> > shape... World may need it soon! :-P :-P
> Yes. I totally agree! One most important TODO for RTDS is to convert
> it from non-work-conserving model to work-conserving model. I have
> been keeping this in mind for a while. I can discuss this in our
> internal RT-Xen meeting and try to get a plan for this. 
> 
Yes, that would be one good thing.

> BTW, I may go to the summit if my Canada visa got approved on time. 
> Are you coming to the summit? 
> 
Not this year. So, if you go, responsibility of talking with (not
only!) these people, listen to their needs, and lobby in favour of RTDS
will be entirely yours.

And you have all my trust, I'm sure you can do it! :-D

Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] Kernel panic on Xen virtualisation in Debian

2016-07-25 Thread Wei Liu

On Mon, Jul 25, 2016 at 12:42:39PM +0200, Andreas Ziegler wrote:
> > On Mon, Jul 25, 2016 at 11:38:20AM +0200, Ingo Jürgensmann wrote:
> >> On 22.07.2016 12:21, Ingo Jürgensmann wrote:
> >>> On 22.07.2016 11:03, Ingo Jürgensmann wrote:
> >>>
>  In the meanwhile, I activated IPv6 again on Tuesday evening and today
>  the server crashed again some minutes ago. Here's the output from
>  netconsole:
> >>> ... and the second subsequent crash:
> >>
> >> ... and another crash below...
> >>
> >> But in the meanwhile, I wonder if anyone except Andreas and I has some
> >> interest in fixing this issue, as there were absolutely no comments or
> >> feedback on my previous mails. Quite disappointing. :-/
> >>
> > 
> > I did skim your emails. But the oops was happening in memcpy+0x6 which
> > indicated it came back to the origin question why would it got an
> > exception there.
> > 
> [...]
> > Your report and the debian report suggested that Dom0 kernel is less
> > likely to be the culprit because you've tried different Dom0 kernels.
> 
> yes we did. but nothing newer than 3.16 so far, we could try that, too.
> 

There is a 4.5 backport kernel for Jessie. Or you can compile a Dom0
kernel with your config file. If you go down that route, I suggest you
either use the newest long term support Linux kernel (4.4 according to
kernel.org) or the latest stable kernel.

> 
> > As for Xen, not sure if you would be up for trying a debug build from
> > source tree. That would help provide information on whether this is a
> > bug in Xen or not.
> 
> ok, we'll try!
> 
> 
> > As for hardware, it would be worth trying whether this issue happens on
> > other hardware platform.
> 
> as i wrote earlier on the list, we have two platforms which only have
> the Intel Xeon CPU in common, nothing else - and even that is from a
> very different generation.
> 
> 

OK.

Wei.

> Andreas.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 3/3] xen-blkfront: dynamic configuration of per-vbd resources

2016-07-25 Thread Roger Pau Monné

On Mon, Jul 25, 2016 at 06:29:22PM +0800, Bob Liu wrote:
> 
> On 07/25/2016 05:20 PM, Roger Pau Monné wrote:
> > On Sat, Jul 23, 2016 at 06:18:23AM +0800, Bob Liu wrote:
> >>
> >> On 07/22/2016 07:45 PM, Roger Pau Monné wrote:
> >>> On Fri, Jul 22, 2016 at 05:43:32PM +0800, Bob Liu wrote:
> 
>  On 07/22/2016 05:34 PM, Roger Pau Monné wrote:
> > On Fri, Jul 22, 2016 at 04:17:48PM +0800, Bob Liu wrote:
> >>
> >> On 07/22/2016 03:45 PM, Roger Pau Monné wrote:
> >>> On Thu, Jul 21, 2016 at 06:08:05PM +0800, Bob Liu wrote:
> 
>  On 07/21/2016 04:57 PM, Roger Pau Monné wrote:
> >> +  blk_mq_freeze_queue(info->rq);
> >> +  if (part_in_flight(&info->gd->part0))
> >> +  goto out;
> >> +
> >> +  /*
> >> +   * FrontBackend
> >> +   * Switch to XenbusStateClosed
> >> +   *  frontend_changed():
> >> +   *   case XenbusStateClosed:
> >> +   *  
> >> xen_blkif_disconnect()
> >> +   *  Switch to 
> >> XenbusStateClosed
> >> +   * blkfront_resume():
> >> +   *  frontend_changed():
> >> +   *  reconnect
> >> +   * Wait until XenbusStateConnected
> >> +   */
> >> +  info->reconfiguring = true;
> >> +  xenbus_switch_state(info->xbdev, XenbusStateClosed);
> >> +
> >> +  /* Poll every 100ms, 1 minute timeout. */
> >> +  for (i = 0; i < 600; i++) {
> >> +  /*
> >> +   * Wait backend enter XenbusStateClosed, 
> >> blkback_changed()
> >> +   * will clear reconfiguring.
> >> +   */
> >> +  if (!info->reconfiguring)
> >> +  goto resume;
> >> +  schedule_timeout_interruptible(msecs_to_jiffies(100));
> >> +  }
> >
> > Instead of having this wait, could you just set info->reconfiguring 
> > = 1, set 
> > the frontend state to XenbusStateClosed and mimic exactly what a 
> > resume from 
> > suspension does? blkback_changed would have to set the frontend 
> > state to 
> > InitWait when it detects that the backend has switched to Closed, 
> > and call 
> > blkfront_resume.
> 
> 
>  I think that won't work.
>  In the real "resume" case, the power management system will trigger 
>  all ->resume() path.
>  But there is no place for dynamic configuration.
> >>>
> >>> Hello,
> >>>
> >>> I think it should be possible to set info->reconfiguring and wait for 
> >>> the 
> >>> backend to switch to state Closed, at that point we should call 
> >>> blkif_resume 
> >>> (from blkback_changed) and the backend will follow the reconection.
> >>>
> >>
> >> Okay, I get your point. Yes, that's an option.
> >>
> >> But this will make 'dynamic configuration' to be async, I'm worry 
> >> about the end-user will get panic.
> >> E.g
> >> A end-user "echo  > /sys/devices/vbd-xxx/max_indirect_segs",
> >> but then the device will be Closed and disappeared, the user have to 
> >> wait for a random time so that the device can resume.
> >
> > That should not happen, AFAICT on migration the device never 
> > dissapears. 
> 
>  Oh, yes.
> 
> > alloc_disk and friends should not be called on resume from migration 
> > (see 
> > the switch in blkfront_connect, you should take the 
> > BLKIF_STATE_SUSPENDED 
> > path for the reconfiguration).
> >
> 
>  What about if the end-user starts I/O immediately after writing new 
>  value to /sys?
>  But the resume is still in progress.
> >>>
> >>> blkif_free already stops the queues by calling blk_mq_stop_hw_queues, and 
> >>> blkif_queue_request will refuse to put anything on the ring if the state 
> >>> is different than connected, which in turn makes blkif_queue_rq call 
> >>> blk_mq_stop_hw_queue to stop the queue, so it should be safe.
> >>>
> >>
> >> But this will surprise the end-user, our user script is like this:
> >> 1) echo  > /sys/
> >> 2) Start I/O immediately.
> >>^^^ Fail because requests would be refused(even software queue was 
> >> still freezed).
> > 
> > Which error do you get? AFAICT reads/writes should either block when the 
> 
> You are right, the reads/writes will just block which is fine.
> 
> > queue is full, or if the fd is in non-blocking mode it should return EAGAIN 
> > (which a properly coded application should be prepared to deal with 
> > gracefully if it's using non-blocking fds anyway).
> >  
> >> It's

Re: [Xen-devel] [xen-unstable test] 97737: regressions - FAIL

2016-07-25 Thread Julien Grall


Hi Wei,

On 25/07/16 09:53, Wei Liu wrote:

On Fri, Jul 22, 2016 at 03:27:30AM +, osstest service owner wrote:

flight 97737 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/97737/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl  15 guest-start/debian.repeat fail REGR. vs. 97664


From

\


Jul 21 17:08:59.405183 [ 4479.814529] [ cut here ]

Jul 21 17:09:16.961529 [ 4479.814600] kernel BUG at 
drivers/xen/grant-table.c:923!

Jul 21 17:09:16.966838 [ 4479.814628] Internal error: Oops - BUG: 0 [#1] SMP ARM

Jul 21 17:09:16.972090 [ 4479.814656] Modules linked in: xen_gntalloc bridge 
stp ipv6 llc brcmfmac brcmutil cfg80211

Jul 21 17:09:16.980340 [ 4479.814759] CPU: 1 PID: 24761 Comm: vif5.0-q0-guest 
Not tainted 3.16.7-ckt12+ #1

Jul 21 17:09:16.987841 [ 4479.814795] task: d8ef7600 ti: d85bc000 task.ti: 
d85bc000

Jul 21 17:09:16.993339 [ 4479.814833] PC is at gnttab_batch_copy+0xd0/0xe4

Jul 21 17:09:16.997963 [ 4479.814860] LR is at gnttab_batch_copy+0x1c/0xe4

Jul 21 17:09:17.002718 [ 4479.814888] pc : []lr : []
psr: a0070013

Jul 21 17:09:17.008962 [ 4479.814888] sp : d85bdea0  ip : deadbeef  fp : 
c0c8e140

Jul 21 17:09:17.014341 [ 4479.814935] r10:   r9 : e1bec000  r8 : 


Jul 21 17:09:17.019595 [ 4479.814960] r7 : 0002  r6 : 0002  r5 : 
d85bdf20  r4 : e1bf4d30

Jul 21 17:09:17.026095 [ 4479.814990] r3 : 0001  r2 : deadbeef  r1 : 
deadbeef  r0 : fff2

Jul 21 17:09:17.032717 [ 4479.815021] Flags: NzCv  IRQs on  FIQs on  Mode 
SVC_32  ISA ARM  Segment kernel

Jul 21 17:09:17.040091 [ 4479.815055] Control: 10c5387d  Table: 78d8406a  DAC: 
0015

Jul 21 17:09:17.045964 [ 4479.815084] Process vif5.0-q0-guest (pid: 24761, 
stack limit = 0xd85bc248)

Jul 21 17:09:17.052840 [ 4479.815114] Stack: (0xd85bdea0 to 0xd85be000)

Jul 21 17:09:17.057218 [ 4479.815145] dea0: 0001 d8b11388 d85bdf20 d85bdf04 
0002 c05eb054 0388 

Jul 21 17:09:17.065469 [ 4479.815183] dec0: d85bdf04   c0b7ea80 
db0995c0 c05e86e4 e1bf4000 003c

Jul 21 17:09:17.073753 [ 4479.815221] dee0:    c0b8849c 
e1bf4cfc c0c8e140 e1bf4d30 e1bf4cc4

Jul 21 17:09:17.082001 [ 4479.815260] df00: db0c3e80  d85bdf08 d85bdf08 
d8c5cb40 d8c5cb40 0001 

Jul 21 17:09:17.090217 [ 4479.815298] df20: 0002  0001  
e1bf4d30 e1c1f530 04c6 023c

Jul 21 17:09:17.098466 [ 4479.815337] df40:   d84aab80 e1bec000 
c05ea990   

Jul 21 17:09:17.106720 [ 4479.815375] df60:  c0266238   
00f8 e1bec000  

Jul 21 17:09:17.114844 [ 4479.815414] df80: d85bdf80 d85bdf80   
d85bdf90 d85bdf90 d85bdfac d84aab80

Jul 21 17:09:17.123093 [ 4479.815451] dfa0: c0266168   c020f138 
   

Jul 21 17:09:17.131345 [ 4479.815489] dfc0:     
   

Jul 21 17:09:17.139596 [ 4479.815527] dfe0:     
0013   

Jul 21 17:09:17.147841 [ 4479.815583] [] (gnttab_batch_copy) from 
[] (xenvif_kthread_guest_rx+0x6c4/0xb58)


From my understanding the hypercall can only return a non-zero value if 
copy_*_guest helpers fails.


Those helpers will only fail when it is not possible to retrieve the 
page associated to a virtual address. The value is in r0 (-EFAULT), seem 
to confirm that. So this looks very suspicious.


Looking at the other parameters and the assembly code (see [1]):
count = 2 (saved in r6)
batch = 0xe1bf4d30 (saved in r4)

They looks valid to me. Also, there was no major change around that code 
recently.


I don't have much ideas what is going on. And unfortunately Xen ARM does 
not print much information when the translation fail.


I have CCed few more people to see if they have a clue.



Jul 21 17:09:17.156969 [ 4479.815636] [] (xenvif_kthread_guest_rx) from 
[] (kthread+0xd0/0xe8)

Jul 21 17:09:17.165217 [ 4479.815681] [] (kthread) from [] 
(ret_from_fork+0x14/0x3c)

Jul 21 17:09:17.172467 [ 4479.815721] Code: e1c432b4 eae0 e7f001f2 e8bd80f8 
(e7f001f2)

Jul 21 17:09:17.178595 [ 4479.815766] ---[ end trace 6ba7d172d52e24e2 ]---



Regards,

[1] 
http://logs.test-lab.xenproject.org/osstest/logs/97737/build-armhf-pvops/info.html


c04bb0c0 :
c04bb0c0:   e92d40f8push{r3, r4, r5, r6, r7, lr}
c04bb0c4:   e1a02001mov r2, r1
c04bb0c8:   e1a04000mov r4, r0
c04bb0cc:   e1a06001mov r6, r1
c04bb0d0:   e1a01000mov r1, r0
c04bb0d4:   e3a5mov r0, #5
c04bb0d8:   ebf54e39bl  c020e9c4 
c04bb0dc:   e350cmp r0, #0
c04bb0e0:   1a2abne c04bb190 

--
Julien Grall

_

Re: [Xen-devel] [PATCH v3 1/4] tools: remove systemd xenstore socket definitions

2016-07-25 Thread Wei Liu

On Mon, Jul 25, 2016 at 06:33:35AM +0200, Juergen Gross wrote:
> On 22/07/16 20:51, Wei Liu wrote:
> > On Fri, Jul 22, 2016 at 08:49:17PM +0200, Juergen Gross wrote:
> >> On 22/07/16 18:31, Wei Liu wrote:
> >>> Only skim-read this patch, will do proper review later.
> >>>
> >>> On Fri, Jul 22, 2016 at 05:09:28PM +0200, Juergen Gross wrote:
> >>> [...]
>   CAMLprim value ocaml_launched_by_systemd(value ignore)
>   {
>  -CAMLparam1(ignore);
>  -CAMLlocal1(ret);
>  +   CAMLparam1(ignore);
>  +   CAMLlocal1(ret);
>   
>  -ret = Val_false;
>  +   ret = Val_false;
>   
>  -if (sd_listen_fds(0) > 0)
>  -ret = Val_true;
>  +   if (sd_booted() > 0)
>  +   ret = Val_true;
> >>>
> >>> I think this may be problematic.
> >>>
> >>> sd_booted returns true if system is booted with systemd, but it has no
> >>> bearing whether this particular process is launched by systemd.
> >>>
> >>> IIRC using sd_booted would cause oxenstored thinks it is launched by
> >>> systemd even if the user launches it by hand in a shell. That caused
> >>> it's initialisation to fail.  81d758afca7c3c1e3ccbd78154b33d64fd7757fb
> >>> was written to address that issue.
> >>>
> >>> So, what would happen if you start oxenstored by hand with your patch
> >>> apply? Maybe we can just remove this launched_by_systemd check all
> >>> together -- i.e. we always call sd_notify?
> 
> Sure we could. I'll remove the checks in both xenstored variants if
> nobody objects.
> 
> >>
> >> So you are concerned sd_notify() will be called too often, but you are
> >> suggesting to call it always? I don't understand your concerns then.
> >>
> > 
> > No, my concern is that you won't be able to start oxenstored from
> > command line manually if you boot with systemd. At least that was the
> > bug that caused me to write that patch.
> 
> I believe the main problem was xenstored not calling daemonize() in that
> case, right? This problem is being remove with my patch as daemonize()
> will be called always. The systemd service file is modified to reflect
> this change in behavior.
> 

I'm afraid I can't remember all the details. But as long as you can
launch [o/c]xenstored by hand I think we're fine.

Wei.

> 
> Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 3/3] xen-blkfront: dynamic configuration of per-vbd resources

2016-07-25 Thread Bob Liu


On 07/25/2016 06:53 PM, Roger Pau Monné wrote:
..[snip..]
  * We get the device lock and blk_mq_freeze_queue() in 
 dynamic_reconfig_device(),
and then have to release in blkif_recover() asynchronously which makes 
 the code more difficult to readable.
>>>
>>> I'm not sure I'm following here, do you mean that you will pick the lock in 
>>> dynamic_reconfig_device and release it in blkif_recover? Why wouldn't you 
>>
>> Yes.
>>
>>> release the lock in dynamic_reconfig_device after doing whatever is needed?
>>>
>>
>> Both 'dynamic configuration' and migration:xenbus_dev_resume() use { 
>> blkfront_resume(); blkif_recover() } and depends on the change of 
>> xbdev->state.
>> If they happen simultaneously, the State machine of xbdev->state is going to 
>> be a mess and very difficult to track.
>>
>> The lock(actually a mutex) is like a big lock to make sure no race would 
>> happen at all.
> 
> So the only thing that you should do is set the frontend state to closed and 
> wait for the backend to also switch to state closed, and then switch the
> frontend state to init again in order to trigger a reconnection.
> 

But if migration:xenbus_dev_resume() is triggered at the same time, any state 
be set might be changed.
=
E.g
Dynamic_reconfig_device:
Migration:xenbus_dev_resume()

Set the frontend state to closed   
 

frontend state will be 
reset to XenbusStateInitialising by xenbus_dev_resume()

Wait for the backend to also switch to state closed
=
Similar situation may happen at any other place regarding set the state.

> You are right that all this process depends on the state being updated 
> correctly, but I don't see how's that different from a normal connection or 
> resume, and we don't seem to have races there.
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [xen-unstable test] 97737: regressions - FAIL

2016-07-25 Thread Wei Liu

Thanks for investigating.

There are only two arm related changes in the range being tested:

* a43cc8f - (origin/smoke) arm/traps: fix bug in dump_guest_s1_walk handling of 
level 2 page tables (5 days ago) 
* 60e06f2 - arm/traps: fix bug in dump_guest_s1_walk L1 page table offset 
computation (5 days ago) 

They don't look very suspicious.

If you need help navigating osstest test report, please let me know.

Wei.


On Mon, Jul 25, 2016 at 12:05:08PM +0100, Julien Grall wrote:
> Hi Wei,
> 
> On 25/07/16 09:53, Wei Liu wrote:
> >On Fri, Jul 22, 2016 at 03:27:30AM +, osstest service owner wrote:
> >>flight 97737 xen-unstable real [real]
> >>http://logs.test-lab.xenproject.org/osstest/logs/97737/
> >>
> >>Regressions :-(
> >>
> >>Tests which did not succeed and are blocking,
> >>including tests which could not be run:
> >> test-armhf-armhf-xl  15 guest-start/debian.repeat fail REGR. vs. 
> >> 97664
> >
> >From
> >
> >\
> >
> >
> >Jul 21 17:08:59.405183 [ 4479.814529] [ cut here ]
> >
> >Jul 21 17:09:16.961529 [ 4479.814600] kernel BUG at 
> >drivers/xen/grant-table.c:923!
> >
> >Jul 21 17:09:16.966838 [ 4479.814628] Internal error: Oops - BUG: 0 [#1] SMP 
> >ARM
> >
> >Jul 21 17:09:16.972090 [ 4479.814656] Modules linked in: xen_gntalloc bridge 
> >stp ipv6 llc brcmfmac brcmutil cfg80211
> >
> >Jul 21 17:09:16.980340 [ 4479.814759] CPU: 1 PID: 24761 Comm: 
> >vif5.0-q0-guest Not tainted 3.16.7-ckt12+ #1
> >
> >Jul 21 17:09:16.987841 [ 4479.814795] task: d8ef7600 ti: d85bc000 task.ti: 
> >d85bc000
> >
> >Jul 21 17:09:16.993339 [ 4479.814833] PC is at gnttab_batch_copy+0xd0/0xe4
> >
> >Jul 21 17:09:16.997963 [ 4479.814860] LR is at gnttab_batch_copy+0x1c/0xe4
> >
> >Jul 21 17:09:17.002718 [ 4479.814888] pc : []lr : [] 
> >   psr: a0070013
> >
> >Jul 21 17:09:17.008962 [ 4479.814888] sp : d85bdea0  ip : deadbeef  fp : 
> >c0c8e140
> >
> >Jul 21 17:09:17.014341 [ 4479.814935] r10:   r9 : e1bec000  r8 : 
> >
> >
> >Jul 21 17:09:17.019595 [ 4479.814960] r7 : 0002  r6 : 0002  r5 : 
> >d85bdf20  r4 : e1bf4d30
> >
> >Jul 21 17:09:17.026095 [ 4479.814990] r3 : 0001  r2 : deadbeef  r1 : 
> >deadbeef  r0 : fff2
> >
> >Jul 21 17:09:17.032717 [ 4479.815021] Flags: NzCv  IRQs on  FIQs on  Mode 
> >SVC_32  ISA ARM  Segment kernel
> >
> >Jul 21 17:09:17.040091 [ 4479.815055] Control: 10c5387d  Table: 78d8406a  
> >DAC: 0015
> >
> >Jul 21 17:09:17.045964 [ 4479.815084] Process vif5.0-q0-guest (pid: 24761, 
> >stack limit = 0xd85bc248)
> >
> >Jul 21 17:09:17.052840 [ 4479.815114] Stack: (0xd85bdea0 to 0xd85be000)
> >
> >Jul 21 17:09:17.057218 [ 4479.815145] dea0: 0001 d8b11388 d85bdf20 
> >d85bdf04 0002 c05eb054 0388 
> >
> >Jul 21 17:09:17.065469 [ 4479.815183] dec0: d85bdf04   
> >c0b7ea80 db0995c0 c05e86e4 e1bf4000 003c
> >
> >Jul 21 17:09:17.073753 [ 4479.815221] dee0:    
> >c0b8849c e1bf4cfc c0c8e140 e1bf4d30 e1bf4cc4
> >
> >Jul 21 17:09:17.082001 [ 4479.815260] df00: db0c3e80  d85bdf08 
> >d85bdf08 d8c5cb40 d8c5cb40 0001 
> >
> >Jul 21 17:09:17.090217 [ 4479.815298] df20: 0002  0001 
> > e1bf4d30 e1c1f530 04c6 023c
> >
> >Jul 21 17:09:17.098466 [ 4479.815337] df40:   d84aab80 
> >e1bec000 c05ea990   
> >
> >Jul 21 17:09:17.106720 [ 4479.815375] df60:  c0266238  
> > 00f8 e1bec000  
> >
> >Jul 21 17:09:17.114844 [ 4479.815414] df80: d85bdf80 d85bdf80  
> > d85bdf90 d85bdf90 d85bdfac d84aab80
> >
> >Jul 21 17:09:17.123093 [ 4479.815451] dfa0: c0266168   
> >c020f138    
> >
> >Jul 21 17:09:17.131345 [ 4479.815489] dfc0:    
> >    
> >
> >Jul 21 17:09:17.139596 [ 4479.815527] dfe0:    
> > 0013   
> >
> >Jul 21 17:09:17.147841 [ 4479.815583] [] (gnttab_batch_copy) from 
> >[] (xenvif_kthread_guest_rx+0x6c4/0xb58)
> 
> From my understanding the hypercall can only return a non-zero value if
> copy_*_guest helpers fails.
> 
> Those helpers will only fail when it is not possible to retrieve the page
> associated to a virtual address. The value is in r0 (-EFAULT), seem to
> confirm that. So this looks very suspicious.
> 
> Looking at the other parameters and the assembly code (see [1]):
>   count = 2 (saved in r6)
> batch = 0xe1bf4d30 (saved in r4)
> 
> They looks valid to me. Also, there was no major change around that code
> recently.
> 
> I don't have much ideas what is going on. And unfortunately Xen ARM does not
> print much information when the translation fail.
> 
> I have CCed few more people to see if they have a clue.
> 
> >
> >Jul 21 17:09:17.156969 [ 4479.815636] [] (xenvif_kthread_guest_rx) 
> >from [] (kthread+0xd0/0xe8)
> >
> >Jul 21 17:09:17.1

Re: [Xen-devel] [PATCH 2/3] xen: Have schedulers revise initial placement

2016-07-25 Thread George Dunlap

On 18/07/16 11:28, Dario Faggioli wrote:
> On Fri, 2016-07-15 at 19:02 +0100, George Dunlap wrote:
>> The generic domain creation logic in
>> xen/common/domctl.c:default_vcpu0_location() attempts to try to do
>> initial placement load-balancing by placing vcpu 0 on the least-busy
>> non-primary hyperthread available.  Unfortunately, the logic can end
>> up picking a pcpu that's not in the online mask.  When this is passed
>> to a scheduler such which assumes that the initial assignment is
>> valid, it causes a null pointer dereference looking up the runqueue.
>>
>> Furthermore, this initial placement doesn't take into account hard or
>> soft affinity, or any scheduler-specific knowledge (such as historic
>> runqueue load, as in credit2).
>>
>> To solve this, when inserting a vcpu, always call the per-scheduler
>> "pick" function to revise the initial placement.  This will
>> automatically take all knowledge the scheduler has into account.
>>
>> Signed-off-by: George Dunlap 
>> ---
>> CC: Dario Faggioli 
>> CC: Anshul Makkar 
>> CC: Meng Xu 
>> CC: Jan Beulich 
>>
> Reviewed-by: Dario Faggioli 

Just to clarify - does this cover the sched_rt.c changes as well?

Thanks,
 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 1/4] docs: Clarify the expected behaviour of zero length records

2016-07-25 Thread Ian Jackson

Andrew Cooper writes ("[PATCH 1/4] docs: Clarify the expected behaviour of zero 
length records"):
> The sending side shouldn't send any variable sized records which end
> up having zero content, but the receiving side will need to tolerate
> such records for compatibility purposes.
...
> +Some records have an exactly specified size.  Some records have
> +variable size depending on their content.  A record with variable
> +size which ends up being zero should be omitted entirely from the
> +stream by the sending side.

I'm not sure this is best stated as a general rule.

Consider a record type which specifies `explicit wombat layout' and
consists of a list of `wombat locations'.

Absence of the record would be semantically different from an empty
record.  In the absence of the record, the receiver would do automatic
choice of number and layout of wombats.  With an empty record, the
receiver would create zero wombats (which might be different to the
default).


I think the underlying spec bug here is that the specification for
`X86_PV_VCPU_BASIC, EXTENDED, XSAVE, MSRS' is too vague.  It simply
specifies
  context  Binary data for this VCPU.
and `The format of these records are identical.  They are all binary
blobs of data which are accessed using specific pairs of domctl
hypercalls.'.


I wasn't able to immediately find the corresponding hypercalls (which
is another docs bug).

But if the hypercalls generate and consume 0-length blocks, then I
think it is fine for the migration stream to contain them.


Have I misunderstood ?

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH XTF 3/3] xtf-runner: regularise runner exit code

2016-07-25 Thread Ian Jackson

Andrew Cooper writes ("Re: [PATCH XTF 3/3] xtf-runner: regularise runner exit 
code"):
> On 22/07/16 10:29, Wei Liu wrote:
> > This would cause a FAILURE to overwrite previous ERROR result. Is that
> > what you want?
> 
> When running more than one test, the overall result should be the most 
> severe.  So yes, a subsequent FAILURE should override an ERROR.

"Error" is surely a more severe problem than "Failure".
In particular, Error might mask a failure.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 16/18] arm/altp2m: Extended libxl to activate altp2m on ARM.

2016-07-25 Thread Sergej Proskurin


On 07/25/2016 12:08 PM, Wei Liu wrote:
> On Mon, Jul 25, 2016 at 10:49:41AM +0100, Julien Grall wrote:
>> Hello,
>>
>> On 25/07/16 10:04, Sergej Proskurin wrote:
>>> On 07/25/2016 10:32 AM, Wei Liu wrote:
 On Sun, Jul 24, 2016 at 06:06:00PM +0200, Sergej Proskurin wrote:
>>> +xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2M,
>>> + libxl_defbool_val(info->u.pv.altp2m));
>>> +}
>>> +#endif
>>> +
>>> static void hvm_set_conf_params(xc_interface *handle, uint32_t domid,
>>> libxl_domain_build_info *const info)
>>> {
>>> @@ -433,6 +443,10 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid,
>>> return rc;
>>> #endif
>>> }
>>> +#if defined(__arm__) || defined(__aarch64__)
>>> +else /* info->type == LIBXL_DOMAIN_TYPE_PV */
>>> +pv_set_conf_params(ctx->xch, domid, info);
>>> +#endif
>>>
>>> rc = libxl__arch_domain_create(gc, d_config, domid);
>>>
>>> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
>>> index ef614be..0a164f9 100644
>>> --- a/tools/libxl/libxl_types.idl
>>> +++ b/tools/libxl/libxl_types.idl
>>> @@ -554,6 +554,7 @@ libxl_domain_build_info = 
>>> Struct("domain_build_info",[
>>>   ("features", string, {'const': 
>>> True}),
>>>   # Use host's E820 for PCI 
>>> passthrough.
>>>   ("e820_host", libxl_defbool),
>>> +  ("altp2m", libxl_defbool),
>> Why is this placed in PV instead of arch_arm?
>>
> The current implementation mirrors the x86's altp2m, where the altp2m
> field represents a property of HVM guests in b_info->u.hvm.altp2m. Since
> guests on ARM are marked as PV, I have placed the field altp2m into
> b_info->u.pv.altp2m.
>
> However, if you believe that it would make more sense to place altp2m
> into b_info->arch_arm.altp2m, I will try to adapt the affected fields.
>
 OK. I don't think that one should be placed in u.pv because that union
 is for x86. However it is also not suitable to just promote that to
 a common field because x86 pv doesn't have altp2m support.

 I think arch_arm would be a better location.

>>> Alright, I will move the field altp2m to arch_arm and adopt the
>>> initialization routines.
>> Would not it make more sense to introduce a generic field 'altp2m' (i.e
>> outside of hvm and arch_arm)?
>>
>> Otherwise, toolstack such as libvirt will need to have specific code to
>> handle ARM and x86.
>>
> Hmm... this is a compelling reason.
>
> After mulling over this issue a bit more, I think I'm fine with
> promoting altp2m to a common field.
>
> Sergej, this is a patch to get you started. Please make sure setting the
> old field still works. If I'm not clear enough, please ask.
>

Alright, thank you. One question up front: Do we still need two
different LIBXL_HAVE macros then?

Also, concerning the dom configuration parameters: Do you think we
should use two different dom configuration parameters (the legacy
"altp2mhvm" and the new "altp2m") or shall we merge both into "altp2m"
as we will address only one common altp2m field in the struct?

Best regards
~Sergej


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] Kernel panic on Xen virtualisation in Debian

2016-07-25 Thread Ingo Jürgensmann

On 25.07.2016 12:51, Wei Liu wrote:

[...]
> Your report and the debian report suggested that Dom0 kernel is less
> likely to be the culprit because you've tried different Dom0 kernels.
yes we did. but nothing newer than 3.16 so far, we could try that, 
too.

There is a 4.5 backport kernel for Jessie. Or you can compile a Dom0
kernel with your config file. If you go down that route, I suggest you
either use the newest long term support Linux kernel (4.4 according to
kernel.org) or the latest stable kernel.

I already tried kernel 4.5 from backports, but it was still crashing:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=804079#39

--
Ciao...  //http://blog.windfluechter.net
  Ingo \X/ XMPP: i...@jabber.windfluechter.net

gpg pubkey:  http://www.juergensmann.de/ij_public_key.asc

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] tools/libxc: Properly increment ApicIdCoreSize field on AMD

2016-07-25 Thread Ian Jackson

Wei Liu writes ("Re: [PATCH] tools/libxc: Properly increment ApicIdCoreSize 
field on AMD"):
> On Fri, Jul 22, 2016 at 01:45:05PM -0400, Boris Ostrovsky wrote:
> > On 07/22/2016 01:38 PM, Wei Liu wrote:
> > But this actually fixes a regression that was triggered by recent
> > hvmloader change on one particular processor that we have in the test farm.
> 
> Ian, this is a backport candidate and needs to go back as far as
> possible. The bogus calculation was introduced in 2008 (!). 

Thanks, but:

Not queued for backport because the patch is not in staging.

Thanks,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [xen-unstable test] 97737: regressions - FAIL

2016-07-25 Thread Julien Grall




On 25/07/16 12:11, Wei Liu wrote:

Thanks for investigating.

There are only two arm related changes in the range being tested:

* a43cc8f - (origin/smoke) arm/traps: fix bug in dump_guest_s1_walk handling of level 
2 page tables (5 days ago) 
* 60e06f2 - arm/traps: fix bug in dump_guest_s1_walk L1 page table offset computation 
(5 days ago) 

They don't look very suspicious.


The modified function is not called in the hypervisor at all. It's only 
here for manual debugging.


Although, this may change the offset of some function (assuming we have 
an hidden bug).



If you need help navigating osstest test report, please let me know.


I have noticed that there is 2 kernel BUG in the logs (with one host 
reboot in the middle). Can you detail what the exact the test?


It looks to me that you are trying to power cycle multiple time a guest.

Cheers,


Wei.


On Mon, Jul 25, 2016 at 12:05:08PM +0100, Julien Grall wrote:

Hi Wei,

On 25/07/16 09:53, Wei Liu wrote:

On Fri, Jul 22, 2016 at 03:27:30AM +, osstest service owner wrote:

flight 97737 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/97737/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
test-armhf-armhf-xl  15 guest-start/debian.repeat fail REGR. vs. 97664


From

\


Jul 21 17:08:59.405183 [ 4479.814529] [ cut here ]

Jul 21 17:09:16.961529 [ 4479.814600] kernel BUG at 
drivers/xen/grant-table.c:923!

Jul 21 17:09:16.966838 [ 4479.814628] Internal error: Oops - BUG: 0 [#1] SMP ARM

Jul 21 17:09:16.972090 [ 4479.814656] Modules linked in: xen_gntalloc bridge 
stp ipv6 llc brcmfmac brcmutil cfg80211

Jul 21 17:09:16.980340 [ 4479.814759] CPU: 1 PID: 24761 Comm: vif5.0-q0-guest 
Not tainted 3.16.7-ckt12+ #1

Jul 21 17:09:16.987841 [ 4479.814795] task: d8ef7600 ti: d85bc000 task.ti: 
d85bc000

Jul 21 17:09:16.993339 [ 4479.814833] PC is at gnttab_batch_copy+0xd0/0xe4

Jul 21 17:09:16.997963 [ 4479.814860] LR is at gnttab_batch_copy+0x1c/0xe4

Jul 21 17:09:17.002718 [ 4479.814888] pc : []lr : []
psr: a0070013

Jul 21 17:09:17.008962 [ 4479.814888] sp : d85bdea0  ip : deadbeef  fp : 
c0c8e140

Jul 21 17:09:17.014341 [ 4479.814935] r10:   r9 : e1bec000  r8 : 


Jul 21 17:09:17.019595 [ 4479.814960] r7 : 0002  r6 : 0002  r5 : 
d85bdf20  r4 : e1bf4d30

Jul 21 17:09:17.026095 [ 4479.814990] r3 : 0001  r2 : deadbeef  r1 : 
deadbeef  r0 : fff2

Jul 21 17:09:17.032717 [ 4479.815021] Flags: NzCv  IRQs on  FIQs on  Mode 
SVC_32  ISA ARM  Segment kernel

Jul 21 17:09:17.040091 [ 4479.815055] Control: 10c5387d  Table: 78d8406a  DAC: 
0015

Jul 21 17:09:17.045964 [ 4479.815084] Process vif5.0-q0-guest (pid: 24761, 
stack limit = 0xd85bc248)

Jul 21 17:09:17.052840 [ 4479.815114] Stack: (0xd85bdea0 to 0xd85be000)

Jul 21 17:09:17.057218 [ 4479.815145] dea0: 0001 d8b11388 d85bdf20 d85bdf04 
0002 c05eb054 0388 

Jul 21 17:09:17.065469 [ 4479.815183] dec0: d85bdf04   c0b7ea80 
db0995c0 c05e86e4 e1bf4000 003c

Jul 21 17:09:17.073753 [ 4479.815221] dee0:    c0b8849c 
e1bf4cfc c0c8e140 e1bf4d30 e1bf4cc4

Jul 21 17:09:17.082001 [ 4479.815260] df00: db0c3e80  d85bdf08 d85bdf08 
d8c5cb40 d8c5cb40 0001 

Jul 21 17:09:17.090217 [ 4479.815298] df20: 0002  0001  
e1bf4d30 e1c1f530 04c6 023c

Jul 21 17:09:17.098466 [ 4479.815337] df40:   d84aab80 e1bec000 
c05ea990   

Jul 21 17:09:17.106720 [ 4479.815375] df60:  c0266238   
00f8 e1bec000  

Jul 21 17:09:17.114844 [ 4479.815414] df80: d85bdf80 d85bdf80   
d85bdf90 d85bdf90 d85bdfac d84aab80

Jul 21 17:09:17.123093 [ 4479.815451] dfa0: c0266168   c020f138 
   

Jul 21 17:09:17.131345 [ 4479.815489] dfc0:     
   

Jul 21 17:09:17.139596 [ 4479.815527] dfe0:     
0013   

Jul 21 17:09:17.147841 [ 4479.815583] [] (gnttab_batch_copy) from 
[] (xenvif_kthread_guest_rx+0x6c4/0xb58)


From my understanding the hypercall can only return a non-zero value if
copy_*_guest helpers fails.

Those helpers will only fail when it is not possible to retrieve the page
associated to a virtual address. The value is in r0 (-EFAULT), seem to
confirm that. So this looks very suspicious.

Looking at the other parameters and the assembly code (see [1]):
count = 2 (saved in r6)
batch = 0xe1bf4d30 (saved in r4)

They looks valid to me. Also, there was no major change around that code
recently.

I don't have much ideas what is going on. And unfortunately Xen ARM does not
print much information when the translation fail.

I have CCed few more people to see if they have a clue.



Jul 21 17:09:17.15696

Re: [Xen-devel] [PATCH 16/18] arm/altp2m: Extended libxl to activate altp2m on ARM.

2016-07-25 Thread Wei Liu

On Mon, Jul 25, 2016 at 01:26:29PM +0200, Sergej Proskurin wrote:
> 
> On 07/25/2016 12:08 PM, Wei Liu wrote:
> > On Mon, Jul 25, 2016 at 10:49:41AM +0100, Julien Grall wrote:
> >> Hello,
> >>
> >> On 25/07/16 10:04, Sergej Proskurin wrote:
> >>> On 07/25/2016 10:32 AM, Wei Liu wrote:
>  On Sun, Jul 24, 2016 at 06:06:00PM +0200, Sergej Proskurin wrote:
> >>> +xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2M,
> >>> + libxl_defbool_val(info->u.pv.altp2m));
> >>> +}
> >>> +#endif
> >>> +
> >>> static void hvm_set_conf_params(xc_interface *handle, uint32_t domid,
> >>> libxl_domain_build_info *const info)
> >>> {
> >>> @@ -433,6 +443,10 @@ int libxl__build_pre(libxl__gc *gc, uint32_t 
> >>> domid,
> >>> return rc;
> >>> #endif
> >>> }
> >>> +#if defined(__arm__) || defined(__aarch64__)
> >>> +else /* info->type == LIBXL_DOMAIN_TYPE_PV */
> >>> +pv_set_conf_params(ctx->xch, domid, info);
> >>> +#endif
> >>>
> >>> rc = libxl__arch_domain_create(gc, d_config, domid);
> >>>
> >>> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> >>> index ef614be..0a164f9 100644
> >>> --- a/tools/libxl/libxl_types.idl
> >>> +++ b/tools/libxl/libxl_types.idl
> >>> @@ -554,6 +554,7 @@ libxl_domain_build_info = 
> >>> Struct("domain_build_info",[
> >>>   ("features", string, {'const': 
> >>> True}),
> >>>   # Use host's E820 for PCI 
> >>> passthrough.
> >>>   ("e820_host", libxl_defbool),
> >>> +  ("altp2m", libxl_defbool),
> >> Why is this placed in PV instead of arch_arm?
> >>
> > The current implementation mirrors the x86's altp2m, where the altp2m
> > field represents a property of HVM guests in b_info->u.hvm.altp2m. Since
> > guests on ARM are marked as PV, I have placed the field altp2m into
> > b_info->u.pv.altp2m.
> >
> > However, if you believe that it would make more sense to place altp2m
> > into b_info->arch_arm.altp2m, I will try to adapt the affected fields.
> >
>  OK. I don't think that one should be placed in u.pv because that union
>  is for x86. However it is also not suitable to just promote that to
>  a common field because x86 pv doesn't have altp2m support.
> 
>  I think arch_arm would be a better location.
> 
> >>> Alright, I will move the field altp2m to arch_arm and adopt the
> >>> initialization routines.
> >> Would not it make more sense to introduce a generic field 'altp2m' (i.e
> >> outside of hvm and arch_arm)?
> >>
> >> Otherwise, toolstack such as libvirt will need to have specific code to
> >> handle ARM and x86.
> >>
> > Hmm... this is a compelling reason.
> >
> > After mulling over this issue a bit more, I think I'm fine with
> > promoting altp2m to a common field.
> >
> > Sergej, this is a patch to get you started. Please make sure setting the
> > old field still works. If I'm not clear enough, please ask.
> >
> 
> Alright, thank you. One question up front: Do we still need two
> different LIBXL_HAVE macros then?
> 

Yes, we still need one. Something like

  LIBXL_HAVE_COMMON_ALTP2M

?

I'm bad at naming things though.

> Also, concerning the dom configuration parameters: Do you think we
> should use two different dom configuration parameters (the legacy
> "altp2mhvm" and the new "altp2m") or shall we merge both into "altp2m"
> as we will address only one common altp2m field in the struct?
> 

I think using altp2m is fine. But altp2mhvm needs to continue to work
because it is trivial to keep the old one.

In any case, document the new parameter, deprecate the old one and
document and the relationship between the two.

Wei.

> Best regards
> ~Sergej
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] Kernel panic on Xen virtualisation in Debian

2016-07-25 Thread Ingo Jürgensmann


On 25.07.2016 12:23, Wei Liu wrote:

First, thank you for replying! Very much appreciated! :)


I did skim your emails. But the oops was happening in memcpy+0x6 which
indicated it came back to the origin question why would it got an
exception there.

Just by staring at the code doesn't get me anywhere. Without a concrete
reproduction of the issue, I'm afraid I can't provide more input here.


Well, from my point of view, it happens quite often when accessing the 
server via SSH. For example today it crashed when I wanted to add 
something and after I clicked into putty and typed the first char. In 
another putty, where I have my netconsole log open, I instantly saw the 
oops.


But what exactly causing these kinds of reboots, I'm clueless as you 
too. Only that I do experience far more frequent crashes when accessing 
the server from workplace via putty on Windows than via SSH on OSX or 
Debian Linux.



There are several moving parts:
0. Hardware
1. Xen hypervisor
2. Dom0 kernel
Your report and the debian report suggested that Dom0 kernel is less
likely to be the culprit because you've tried different Dom0 kernels.


As just written in the other mail, I already tried kernel 4.5 from 
backports. Still crashing.



As for Xen, not sure if you would be up for trying a debug build from
source tree. That would help provide information on whether this is a
bug in Xen or not.


Will try to build from Debian source, but how to enable debug build?

--
Ciao...  //http://blog.windfluechter.net
  Ingo \X/ XMPP: i...@jabber.windfluechter.net


gpg pubkey:  http://www.juergensmann.de/ij_public_key.asc

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 3/4] tools/libxc: Avoid generating inappropriate zero-length records

2016-07-25 Thread Ian Jackson

David Vrabel writes ("Re: [Xen-devel] [PATCH 3/4] tools/libxc: Avoid generating 
inappropriate zero-length records"):
> For records such as HVM_PARAMS which consist of a set of N items, the
> intention was to most definitely send a record with 0 items.
> 
> For records that fetch an opaque blob from the hypervisor, again the
> intention was to sent this blob as-is with no sort of processing or
> other checking. i.e., if the hypervisor gives us a zero-length blob we
> sent that as-is.
> 
> This makes all the streams look the same with all the same records,
> regardless of what hardware platform it was run on.  Including
> zero-length/count records also makes diagnosing problems easier -- the
> empty record is visible in the stream instead of having to remember that
> sometimes these records are deliberately omitted.
> 
> As such, this series should be limited to making the restore side handle
> the zero count sets or zero length blobs if it does not do so already.
> 
> The specification should be clarified to note that some records may have
> zero-length blobs or contain zero items.

I think I prefer David's view here, but I don't quite feel I
understand what the underlying bug is.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [xen-unstable test] 97737: regressions - FAIL

2016-07-25 Thread Wei Liu

On Mon, Jul 25, 2016 at 12:34:53PM +0100, Julien Grall wrote:
> 
> 
> On 25/07/16 12:11, Wei Liu wrote:
> >Thanks for investigating.
> >
> >There are only two arm related changes in the range being tested:
> >
> >* a43cc8f - (origin/smoke) arm/traps: fix bug in dump_guest_s1_walk handling 
> >of level 2 page tables (5 days ago) 
> >* 60e06f2 - arm/traps: fix bug in dump_guest_s1_walk L1 page table offset 
> >computation (5 days ago) 
> >
> >They don't look very suspicious.
> 
> The modified function is not called in the hypervisor at all. It's only here
> for manual debugging.
> 
> Although, this may change the offset of some function (assuming we have an
> hidden bug).
> 
> >If you need help navigating osstest test report, please let me know.
> 
> I have noticed that there is 2 kernel BUG in the logs (with one host reboot
> in the middle). Can you detail what the exact the test?

What I normally do is to look at the summary page of the failed test to
identify the failed step and the time.

In this case:

http://logs.test-lab.xenproject.org/osstest/logs/97737/test-armhf-armhf-xl/info.html

The time stamp said the failed step started at 2016-07-21 19:30:10 Z,
and then I look at the output of failed step log to look for time stamp
that the test failed.  Then I would look for output between these two
time stamps in various log.

I now realise the log I pasted in was not from the failed test. I wanted
to paste in the second kernel oops, which should be the culprit that
the test failed. The two oops was the same one, though.

To identify  which test step was running when the first oops happened,
the same technique applies.

It seems that the oops happened during ts-debian-install according to
time stamps.

Wei.

> 
> It looks to me that you are trying to power cycle multiple time a guest.
> 
> Cheers,
> 
> >Wei.
> >
> >
> >On Mon, Jul 25, 2016 at 12:05:08PM +0100, Julien Grall wrote:
> >>Hi Wei,
> >>
> >>On 25/07/16 09:53, Wei Liu wrote:
> >>>On Fri, Jul 22, 2016 at 03:27:30AM +, osstest service owner wrote:
> flight 97737 xen-unstable real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/97737/
> 
> Regressions :-(
> 
> Tests which did not succeed and are blocking,
> including tests which could not be run:
> test-armhf-armhf-xl  15 guest-start/debian.repeat fail REGR. vs. 
> 97664
> >>>
> >>>From
> >>>
> >>>\
> >>>
> >>>
> >>>Jul 21 17:08:59.405183 [ 4479.814529] [ cut here ]
> >>>
> >>>Jul 21 17:09:16.961529 [ 4479.814600] kernel BUG at 
> >>>drivers/xen/grant-table.c:923!
> >>>
> >>>Jul 21 17:09:16.966838 [ 4479.814628] Internal error: Oops - BUG: 0 [#1] 
> >>>SMP ARM
> >>>
> >>>Jul 21 17:09:16.972090 [ 4479.814656] Modules linked in: xen_gntalloc 
> >>>bridge stp ipv6 llc brcmfmac brcmutil cfg80211
> >>>
> >>>Jul 21 17:09:16.980340 [ 4479.814759] CPU: 1 PID: 24761 Comm: 
> >>>vif5.0-q0-guest Not tainted 3.16.7-ckt12+ #1
> >>>
> >>>Jul 21 17:09:16.987841 [ 4479.814795] task: d8ef7600 ti: d85bc000 task.ti: 
> >>>d85bc000
> >>>
> >>>Jul 21 17:09:16.993339 [ 4479.814833] PC is at gnttab_batch_copy+0xd0/0xe4
> >>>
> >>>Jul 21 17:09:16.997963 [ 4479.814860] LR is at gnttab_batch_copy+0x1c/0xe4
> >>>
> >>>Jul 21 17:09:17.002718 [ 4479.814888] pc : []lr : 
> >>>[]psr: a0070013
> >>>
> >>>Jul 21 17:09:17.008962 [ 4479.814888] sp : d85bdea0  ip : deadbeef  fp : 
> >>>c0c8e140
> >>>
> >>>Jul 21 17:09:17.014341 [ 4479.814935] r10:   r9 : e1bec000  r8 : 
> >>>
> >>>
> >>>Jul 21 17:09:17.019595 [ 4479.814960] r7 : 0002  r6 : 0002  r5 : 
> >>>d85bdf20  r4 : e1bf4d30
> >>>
> >>>Jul 21 17:09:17.026095 [ 4479.814990] r3 : 0001  r2 : deadbeef  r1 : 
> >>>deadbeef  r0 : fff2
> >>>
> >>>Jul 21 17:09:17.032717 [ 4479.815021] Flags: NzCv  IRQs on  FIQs on  Mode 
> >>>SVC_32  ISA ARM  Segment kernel
> >>>
> >>>Jul 21 17:09:17.040091 [ 4479.815055] Control: 10c5387d  Table: 78d8406a  
> >>>DAC: 0015
> >>>
> >>>Jul 21 17:09:17.045964 [ 4479.815084] Process vif5.0-q0-guest (pid: 24761, 
> >>>stack limit = 0xd85bc248)
> >>>
> >>>Jul 21 17:09:17.052840 [ 4479.815114] Stack: (0xd85bdea0 to 0xd85be000)
> >>>
> >>>Jul 21 17:09:17.057218 [ 4479.815145] dea0: 0001 d8b11388 d85bdf20 
> >>>d85bdf04 0002 c05eb054 0388 
> >>>
> >>>Jul 21 17:09:17.065469 [ 4479.815183] dec0: d85bdf04   
> >>>c0b7ea80 db0995c0 c05e86e4 e1bf4000 003c
> >>>
> >>>Jul 21 17:09:17.073753 [ 4479.815221] dee0:    
> >>>c0b8849c e1bf4cfc c0c8e140 e1bf4d30 e1bf4cc4
> >>>
> >>>Jul 21 17:09:17.082001 [ 4479.815260] df00: db0c3e80  d85bdf08 
> >>>d85bdf08 d8c5cb40 d8c5cb40 0001 
> >>>
> >>>Jul 21 17:09:17.090217 [ 4479.815298] df20: 0002  0001 
> >>> e1bf4d30 e1c1f530 04c6 023c
> >>>
> >>>Jul 21 17:09:17.098466 [ 4479.815337] df40:   d84aab80 
> >>>e1bec000 c05ea990   
> >>>
> >>>Jul 21 17:09:17.106

Re: [Xen-devel] [PATCH XTF 3/3] xtf-runner: regularise runner exit code

2016-07-25 Thread Andrew Cooper

On 25/07/16 12:25, Ian Jackson wrote:
> Andrew Cooper writes ("Re: [PATCH XTF 3/3] xtf-runner: regularise runner exit 
> code"):
>> On 22/07/16 10:29, Wei Liu wrote:
>>> This would cause a FAILURE to overwrite previous ERROR result. Is that
>>> what you want?
>> When running more than one test, the overall result should be the most 
>> severe.  So yes, a subsequent FAILURE should override an ERROR.
> "Error" is surely a more severe problem than "Failure".

No.  Error means "something unexpected went wrong", while Failure is
"the test worked, and identified that the area under test is defective".

Error is typically a test case bug or infrastructure issue.

> In particular, Error might mask a failure.

Indeed it could.  Either should cause a prompt investigation and effort
to resolve the issues, but one must be more severe than the other and
this is the way things are specified.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 3/3] xen-blkfront: dynamic configuration of per-vbd resources

2016-07-25 Thread Roger Pau Monné

On Mon, Jul 25, 2016 at 07:08:36PM +0800, Bob Liu wrote:
> 
> On 07/25/2016 06:53 PM, Roger Pau Monné wrote:
> ..[snip..]
>   * We get the device lock and blk_mq_freeze_queue() in 
>  dynamic_reconfig_device(),
> and then have to release in blkif_recover() asynchronously which 
>  makes the code more difficult to readable.
> >>>
> >>> I'm not sure I'm following here, do you mean that you will pick the lock 
> >>> in 
> >>> dynamic_reconfig_device and release it in blkif_recover? Why wouldn't you 
> >>
> >> Yes.
> >>
> >>> release the lock in dynamic_reconfig_device after doing whatever is 
> >>> needed?
> >>>
> >>
> >> Both 'dynamic configuration' and migration:xenbus_dev_resume() use { 
> >> blkfront_resume(); blkif_recover() } and depends on the change of 
> >> xbdev->state.
> >> If they happen simultaneously, the State machine of xbdev->state is going 
> >> to be a mess and very difficult to track.
> >>
> >> The lock(actually a mutex) is like a big lock to make sure no race would 
> >> happen at all.
> > 
> > So the only thing that you should do is set the frontend state to closed 
> > and 
> > wait for the backend to also switch to state closed, and then switch the
> > frontend state to init again in order to trigger a reconnection.
> > 
> 
> But if migration:xenbus_dev_resume() is triggered at the same time, any state 
> be set might be changed.
> =
> E.g
> Dynamic_reconfig_device:
> Migration:xenbus_dev_resume()
> 
> Set the frontend state to closed   
>  
>   
>   frontend state will be 
> reset to XenbusStateInitialising by xenbus_dev_resume()
> 
> Wait for the backend to also switch to state closed

This is not really a race, the backend will not switch to state closed, and 
instead will appear at state init again, which is what we wanted anyway in 
order to reconnect, so I don't see an issue with this flow.

> =
> Similar situation may happen at any other place regarding set the state.

Right, but I don't see how your proposed patch also avoids this issues. I 
see that you pick the device lock in dynamic_reconfig_device, but I don't 
see xenbus_dev_resume picking the lock at all, and in any case I don't think 
you should prevent the VM from migrating.

Depending on the toolstack it might decide to kill the VM because it has not 
been able to migrate it, in which case the result is not better.

Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 2/4] tools/libxc: Tolerate zero-length records in migration v2 streams

2016-07-25 Thread David Vrabel

On 21/07/16 18:17, Andrew Cooper wrote:
> Under some circumstances, the migration v2 save code would generate valid
> records with zero content, when the intended behaviour was to omit the record
  
As explained, this is not the intended behaviour.  I would appreciate it
if you did not misrepresent me here.

> entirely.
> 
> As the stream is otherwise fine, tolerate these records and avoid failing the
> migration.
[...]
> --- a/tools/libxc/xc_sr_restore_x86_hvm.c
> +++ b/tools/libxc/xc_sr_restore_x86_hvm.c
[...]
> +/*
> + * Tolerate empty records.  Older sending sides used to accidentally
> + * generate them.
> + */
> +if ( hdr->count == 0 )
> +{
> +DBGPRINTF("Skipping empty HVM_PARAMS record\n");
> +return 0;
> +}
> +
>  for ( i = 0; i < hdr->count; i++, entry++ )

This loop already handles hdr->count == 0.  The additional check is not
required.

> --- a/tools/libxc/xc_sr_restore_x86_pv.c
> +++ b/tools/libxc/xc_sr_restore_x86_pv.c
> @@ -753,15 +753,26 @@ static int handle_x86_pv_vcpu_blob(struct xc_sr_context 
> *ctx,
>  }
>  
>  /* Confirm that there is a complete header. */
> -if ( rec->length <= sizeof(*vhdr) )
> +if ( rec->length < sizeof(*vhdr) )
>  {
> -ERROR("%s record truncated: length %u, min %zu",
> -  rec_name, rec->length, sizeof(*vhdr) + 1);
> +ERROR("%s record truncated: length %u, header size %zu",
> +  rec_name, rec->length, sizeof(*vhdr));
>  goto out;
>  }
>  
>  blobsz = rec->length - sizeof(*vhdr);
>  
> +/*
> + * Tolerate empty records.  Older sending sides used to accidentally
> + * generate them.
> + */
> +if ( blobsz == 0 )
> +{
> +DBGPRINTF("Skipping empty %s record for vcpu %u\n",
> +  rec_type_to_str(rec->type), vhdr->vcpu_id);
> +goto out;
> +}

This check for a zero-sized blob should be immediately prior to the

   blob = malloc(blobsz);

So all the other length validation is not skipped.  In particular, some
record types may wish to make zero-length blobs invalid.

David

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v3 1/4] tools: remove systemd xenstore socket definitions

2016-07-25 Thread Juergen Gross

On 25/07/16 13:05, Wei Liu wrote:
> On Mon, Jul 25, 2016 at 06:33:35AM +0200, Juergen Gross wrote:
>> On 22/07/16 20:51, Wei Liu wrote:
>>> On Fri, Jul 22, 2016 at 08:49:17PM +0200, Juergen Gross wrote:
 On 22/07/16 18:31, Wei Liu wrote:
> Only skim-read this patch, will do proper review later.
>
> On Fri, Jul 22, 2016 at 05:09:28PM +0200, Juergen Gross wrote:
> [...]
>>  CAMLprim value ocaml_launched_by_systemd(value ignore)
>>  {
>> -CAMLparam1(ignore);
>> -CAMLlocal1(ret);
>> +   CAMLparam1(ignore);
>> +   CAMLlocal1(ret);
>>  
>> -ret = Val_false;
>> +   ret = Val_false;
>>  
>> -if (sd_listen_fds(0) > 0)
>> -ret = Val_true;
>> +   if (sd_booted() > 0)
>> +   ret = Val_true;
>
> I think this may be problematic.
>
> sd_booted returns true if system is booted with systemd, but it has no
> bearing whether this particular process is launched by systemd.
>
> IIRC using sd_booted would cause oxenstored thinks it is launched by
> systemd even if the user launches it by hand in a shell. That caused
> it's initialisation to fail.  81d758afca7c3c1e3ccbd78154b33d64fd7757fb
> was written to address that issue.
>
> So, what would happen if you start oxenstored by hand with your patch
> apply? Maybe we can just remove this launched_by_systemd check all
> together -- i.e. we always call sd_notify?
>>
>> Sure we could. I'll remove the checks in both xenstored variants if
>> nobody objects.
>>

 So you are concerned sd_notify() will be called too often, but you are
 suggesting to call it always? I don't understand your concerns then.

>>>
>>> No, my concern is that you won't be able to start oxenstored from
>>> command line manually if you boot with systemd. At least that was the
>>> bug that caused me to write that patch.
>>
>> I believe the main problem was xenstored not calling daemonize() in that
>> case, right? This problem is being remove with my patch as daemonize()
>> will be called always. The systemd service file is modified to reflect
>> this change in behavior.
>>
> 
> I'm afraid I can't remember all the details. But as long as you can
> launch [o/c]xenstored by hand I think we're fine.

Verified for both xenstored variants.


Juergen


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 3/3] xen-blkfront: dynamic configuration of per-vbd resources

2016-07-25 Thread Bob Liu


On 07/25/2016 08:11 PM, Roger Pau Monné wrote:
> On Mon, Jul 25, 2016 at 07:08:36PM +0800, Bob Liu wrote:
>>
>> On 07/25/2016 06:53 PM, Roger Pau Monné wrote:
>> ..[snip..]
>>  * We get the device lock and blk_mq_freeze_queue() in 
>> dynamic_reconfig_device(),
>>and then have to release in blkif_recover() asynchronously which 
>> makes the code more difficult to readable.
>
> I'm not sure I'm following here, do you mean that you will pick the lock 
> in 
> dynamic_reconfig_device and release it in blkif_recover? Why wouldn't you 

 Yes.

> release the lock in dynamic_reconfig_device after doing whatever is 
> needed?
>

 Both 'dynamic configuration' and migration:xenbus_dev_resume() use { 
 blkfront_resume(); blkif_recover() } and depends on the change of 
 xbdev->state.
 If they happen simultaneously, the State machine of xbdev->state is going 
 to be a mess and very difficult to track.

 The lock(actually a mutex) is like a big lock to make sure no race would 
 happen at all.
>>>
>>> So the only thing that you should do is set the frontend state to closed 
>>> and 
>>> wait for the backend to also switch to state closed, and then switch the
>>> frontend state to init again in order to trigger a reconnection.
>>>
>>
>> But if migration:xenbus_dev_resume() is triggered at the same time, any 
>> state be set might be changed.
>> =
>> E.g
>> Dynamic_reconfig_device:
>> Migration:xenbus_dev_resume()
>>
>> Set the frontend state to closed   
>>  
>>  
>>  frontend state will be 
>> reset to XenbusStateInitialising by xenbus_dev_resume()
>>
>> Wait for the backend to also switch to state closed
> 
> This is not really a race, the backend will not switch to state closed, and 
> instead will appear at state init again, which is what we wanted anyway in 
> order to reconnect, so I don't see an issue with this flow.
> 
>> =
>> Similar situation may happen at any other place regarding set the state.
> 
> Right, but I don't see how your proposed patch also avoids this issues. I 
> see that you pick the device lock in dynamic_reconfig_device, but I don't 
> see xenbus_dev_resume picking the lock at all, and in any case I don't think 

The lock is picked from the power management framework.

Anyway, I'm convinced and will follow your suggestion.
Thank you!

> you should prevent the VM from migrating.
> 
> Depending on the toolstack it might decide to kill the VM because it has not 
> been able to migrate it, in which case the result is not better.
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] Kernel panic on Xen virtualisation in Debian

2016-07-25 Thread Sander Eikelenboom


Monday, July 25, 2016, 1:27:20 PM, you wrote:

> On 25.07.2016 12:51, Wei Liu wrote:

>>> [...]
>>> > Your report and the debian report suggested that Dom0 kernel is less
>>> > likely to be the culprit because you've tried different Dom0 kernels.
>>> yes we did. but nothing newer than 3.16 so far, we could try that, 
>>> too.
>> There is a 4.5 backport kernel for Jessie. Or you can compile a Dom0
>> kernel with your config file. If you go down that route, I suggest you
>> either use the newest long term support Linux kernel (4.4 according to
>> kernel.org) or the latest stable kernel.

> I already tried kernel 4.5 from backports, but it was still crashing:

> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=804079#39


Hi Ingo,

Couldn't this just be a more generic Linux kernel issue 
with the combination of ipv6, bridging and ndisc ?

Wouldn't be the first time that a config more common on systems running Xen,
would more prominently expose a kernel bug in the networking stack. 

Have you brought this up on the LKML / netdev mailinglists already ?

--
Sander


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH linux v2 0/9] xen: pvhvm: support bootup on secondary vCPUs

2016-07-25 Thread David Vrabel

On 30/06/16 16:56, Vitaly Kuznetsov wrote:
> It may happen that Xen's and Linux's ideas of vCPU id diverge. In
> particular, when we crash on a secondary vCPU we may want to do kdump
> and unlike plain kexec where we do migrate_to_reboot_cpu() we try booting
> on the vCPU which crashed. This doesn't work very well for PVHVM guests as
> we have a number of hypercalls where we pass vCPU id as a parameter. These
> hypercalls either fail or do something unexpected. To solve the issue we
> need to have a mapping between Linux's and Xen's vCPU ids.
> 
> This series solves the issue for x86 PVHVM guests. PV guests don't (and
> probably won't) support kdump so I always assume Xen's vCPU id == Linux's
> vCPU id. ARM guests will probably need to get proper mapping once we start
> supporting kexec/kdump there.

Applied to for-linus-4.8, thanks.

David

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 2/4] tools/libxc: Tolerate zero-length records in migration v2 streams

2016-07-25 Thread Andrew Cooper

On 25/07/16 13:21, David Vrabel wrote:
> On 21/07/16 18:17, Andrew Cooper wrote:
>> Under some circumstances, the migration v2 save code would generate valid
>> records with zero content, when the intended behaviour was to omit the record
>   
> As explained, this is not the intended behaviour.

And as previously stated, I still disagree.

>   I would appreciate it
> if you did not misrepresent me here.

WTF?

I am the author of this code, and I very definitely intended for the
record to be omitted.  I am not representing you in at all, let alone
mis-representation.

There is clearly a difference of opinion, and there is no guidance from
the spec, but accusations like this are not warranted.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

[Xen-devel] [ovmf test] 99608: regressions - FAIL

2016-07-25 Thread osstest service owner

flight 99608 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/99608/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64   5 xen-build fail REGR. vs. 94748
 build-i386-xsm5 xen-build fail REGR. vs. 94748
 build-i3865 xen-build fail REGR. vs. 94748
 build-amd64-xsm   5 xen-build fail REGR. vs. 94748

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a

version targeted for testing:
 ovmf 227a1ac2d04504eee4c624f534ea3fcbb8140029
baseline version:
 ovmf dc99315b8732b6e3032d01319d3f534d440b43d0

Last test of basis94748  2016-05-24 22:43:25 Z   61 days
Failing since 94750  2016-05-25 03:43:08 Z   61 days  122 attempts
Testing same since99608  2016-07-25 10:57:02 Z0 days1 attempts


People who touched revisions under test:
  Anandakrishnan Loganathan 
  Ard Biesheuvel 
  Bi, Dandan 
  Bret Barkelew 
  Bruce Cran 
  Bruce Cran 
  Chao Zhang 
  Cinnamon Shia 
  Cohen, Eugene 
  Dandan Bi 
  Darbin Reyes 
  david wei 
  Eric Dong 
  Eugene Cohen 
  Evan Lloyd 
  Evgeny Yakovlev 
  Feng Tian 
  Fu Siyuan 
  Fu, Siyuan 
  Gary Li 
  Gary Lin 
  Giri P Mudusuru 
  Graeme Gregory 
  Hao Wu 
  Hegde Nagaraj P 
  Hegde, Nagaraj P 
  hegdenag 
  Heyi Guo 
  Jan D?bro? 
  Jan Dabros 
  Jeff Fan 
  Jeremy Linton 
  Jiaxin Wu 
  Jiewen Yao 
  Joe Zhou 
  Jordan Justen 
  Katie Dellaquila 
  Laszlo Ersek 
  Liming Gao 
  Lu, ShifeiX A 
  lushifex 
  Marcin Wojtas 
  Mark Rutland 
  Marvin H?user 
  Marvin Haeuser 
  Maurice Ma 
  Michael Chang 
  Michael Zimmermann 
  Mudusuru, Giri P 
  Ni, Ruiyu 
  Qin Long 
  Qiu Shumin 
  Ruiyu Ni 
  Ruiyu Niã
  Ryan Harkin 
  Sami Mujawar 
  Satya Yarlagadda 
  Shannon Zhao 
  Shi, Steven 
  Sriram Subramanian 
  Star Zeng 
  Steven Shi 
  Subramanian, Sriram (EG Servers Platform SW) 
  Sunny Wang 
  Tapan Shah 
  Thomas Palmer 
  Yarlagadda, Satya P 
  Yonghong Zhu 
  Zhang Lubo 
  Zhang, Chao B 
  Zhang, Lubo 

jobs:
 build-amd64-xsm  fail
 build-i386-xsm   fail
 build-amd64  fail
 build-i386   fail
 build-amd64-libvirt  blocked 
 build-i386-libvirt   blocked 
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 blocked 
 test-amd64-i386-xl-qemuu-ovmf-amd64  blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 11446 lines long.)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 2/4] tools/libxc: Tolerate zero-length records in migration v2 streams

2016-07-25 Thread David Vrabel

On 25/07/16 13:46, Andrew Cooper wrote:
> On 25/07/16 13:21, David Vrabel wrote:
>> On 21/07/16 18:17, Andrew Cooper wrote:
>>> Under some circumstances, the migration v2 save code would generate valid
>>> records with zero content, when the intended behaviour was to omit the 
>>> record
>>   
>> As explained, this is not the intended behaviour.
> 
> And as previously stated, I still disagree.
> 
>>   I would appreciate it
>> if you did not misrepresent me here.
> 
> WTF?
>
> I am the author of this code, and I very definitely intended for the
> record to be omitted.  I am not representing you in at all, let alone
> mis-representation.

As the original and principle author of the specification I definitely
intended for empty records to be included.  In the context above (given
the lack of clarity in the text) it is not unreasonable for someone to
conclude that its talking about the specification author's intentions.

Thus without clarifying whose intentions you're talking about, you are
mis-representing my profession opinion.

So, yes. I am serious about you changing the commit message. I do not
want your design decisions (good or bad) attributed to me.

David

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] Kernel panic on Xen virtualisation in Debian

2016-07-25 Thread Wei Liu

On Mon, Jul 25, 2016 at 01:41:41PM +0200, Ingo Jürgensmann wrote:
> On 25.07.2016 12:23, Wei Liu wrote:
> 
> First, thank you for replying! Very much appreciated! :)
> 
> >I did skim your emails. But the oops was happening in memcpy+0x6 which
> >indicated it came back to the origin question why would it got an
> >exception there.
> >
> >Just by staring at the code doesn't get me anywhere. Without a concrete
> >reproduction of the issue, I'm afraid I can't provide more input here.
> 
> Well, from my point of view, it happens quite often when accessing the
> server via SSH. For example today it crashed when I wanted to add something
> and after I clicked into putty and typed the first char. In another putty,
> where I have my netconsole log open, I instantly saw the oops.
> 
> But what exactly causing these kinds of reboots, I'm clueless as you too.
> Only that I do experience far more frequent crashes when accessing the
> server from workplace via putty on Windows than via SSH on OSX or Debian
> Linux.
> 
> >There are several moving parts:
> >0. Hardware
> >1. Xen hypervisor
> >2. Dom0 kernel
> >Your report and the debian report suggested that Dom0 kernel is less
> >likely to be the culprit because you've tried different Dom0 kernels.
> 
> As just written in the other mail, I already tried kernel 4.5 from
> backports. Still crashing.
> 
> >As for Xen, not sure if you would be up for trying a debug build from
> >source tree. That would help provide information on whether this is a
> >bug in Xen or not.
> 
> Will try to build from Debian source, but how to enable debug build?
> 

I was thinking about building directly from xen.git.

http://wiki.xenproject.org/wiki/Compiling_Xen_From_Source

Probably try the Xen 4.7 release.

Wei.

> -- 
> Ciao...  //http://blog.windfluechter.net
>   Ingo \X/ XMPP: i...@jabber.windfluechter.net
> 
> 
> gpg pubkey:  http://www.juergensmann.de/ij_public_key.asc

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH linux v2 0/9] xen: pvhvm: support bootup on secondary vCPUs

2016-07-25 Thread Julien Grall


Hi David,

On 25/07/16 13:38, David Vrabel wrote:

On 30/06/16 16:56, Vitaly Kuznetsov wrote:

It may happen that Xen's and Linux's ideas of vCPU id diverge. In
particular, when we crash on a secondary vCPU we may want to do kdump
and unlike plain kexec where we do migrate_to_reboot_cpu() we try booting
on the vCPU which crashed. This doesn't work very well for PVHVM guests as
we have a number of hypercalls where we pass vCPU id as a parameter. These
hypercalls either fail or do something unexpected. To solve the issue we
need to have a mapping between Linux's and Xen's vCPU ids.

This series solves the issue for x86 PVHVM guests. PV guests don't (and
probably won't) support kdump so I always assume Xen's vCPU id == Linux's
vCPU id. ARM guests will probably need to get proper mapping once we start
supporting kexec/kdump there.


Applied to for-linus-4.8, thanks.


It would have been nice to send a ping before applying. This patch 
series is containing Xen ARM code which has not been acked by Stefano, 
nor had feedback from ARM side.


For instance given that all the hypercalls are representing a "vcpu id" 
using "uint32_t" it is a bit weird to use "int" to define xen_vcpu_id 
(see patch #3).


Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 1/2] systemd: use standard dependencies for xendriverdomain.service

2016-07-25 Thread Wei Liu

On Mon, Jul 25, 2016 at 10:17:44AM +0100, Wei Liu wrote:
> On Sun, Jul 24, 2016 at 09:26:57PM +0200, Marek Marczykowski-Górecki wrote:
> > Having DefaultDependencies=no means it can be started before / is
> > remounted read-write, which will result in various failures (to start
> > with opening the log).
> > Since "libxl: trigger attach events for devices attached before xl devd
> > startup" it is no longer important to start it as early as possible,
> > because it will process devices created before its startup.
> > 
> > Cc: Ian Jackson 
> > Cc: Wei Liu 
> > Signed-off-by: Marek Marczykowski-Górecki 
> 
> The reasoning makes sense.
> 
> Acked-by: Wei Liu 
> 
> Also CC Rusty (author of this file)
> 

Pushed.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] tools/libxc: Properly increment ApicIdCoreSize field on AMD

2016-07-25 Thread Wei Liu

On Mon, Jul 25, 2016 at 12:33:37PM +0100, Ian Jackson wrote:
> Wei Liu writes ("Re: [PATCH] tools/libxc: Properly increment ApicIdCoreSize 
> field on AMD"):
> > On Fri, Jul 22, 2016 at 01:45:05PM -0400, Boris Ostrovsky wrote:
> > > On 07/22/2016 01:38 PM, Wei Liu wrote:
> > > But this actually fixes a regression that was triggered by recent
> > > hvmloader change on one particular processor that we have in the test 
> > > farm.
> > 
> > Ian, this is a backport candidate and needs to go back as far as
> > possible. The bogus calculation was introduced in 2008 (!). 
> 
> Thanks, but:
> 
> Not queued for backport because the patch is not in staging.
> 

Pushed to staging.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC v3 00/13] linux: generalize sections, ranges and linker tables

2016-07-25 Thread Masami Hiramatsu

Hi Luis,

On Fri, 22 Jul 2016 14:24:34 -0700
"Luis R. Rodriguez"  wrote:

> This RFC v3 builds off the last RFC v2 series [0] for adding linker tables.
> The largest amount of work here was to take Russell King's feedback on
> using linker table for kprobes text not being appropriate -- and providing
> another lightweight API for simple section ranges: read-only stitched pieces
> of executable code. This required also generalizing common building blocks
> for both linker tables and section ranges, these building blocks are defined
> now in include/linux/sections.h and asm-generic/section.h. The other last 
> thing
> decided was to not support sub-sections. In the hunt for this I could think of
> anything that really required this, and if it was needed it did not seem
> impossible to port over to avoid its use. Please let me know if there are 
> valid
> uses cases for sub-sections.
> 
> Other significant effort here was to provide a set of common assembly helpers
> which could be used across architectures, this starts off some of this work
> for generic helpers which carve out and define custom Linux sections.
> 
> Lastly, this now also goes with two ports which required module support when
> using linker tables: jump labels, and dynamic debug support. A few
> extensions have been made to the original series in order to provide
> support for that.
> 
> Since kprobes actually had both a linker table and a section range the
> patch that dealt with kprobes is now split off in two patches, one
> that deals with its linker table and another for its section ranges.
> 
> More elaborate uses for linker tables are possible, I'll hold off on any
> of this type of work until at least the basic building blocks are fleshed
> out. To review how this work came about, and more elaborate uses being
> evaluated check out the userspace linker-tables mockup solution [1].
> Hopefully most of the possible bikeshedding was already dealt with through
> that tree. Thanks to hpa for tons of feedback.

Great! so table and ranges completely replace the old-style(add-hoc)
_kprobe and NOKPROBE_SYMBOL() implementation, good job! :)

> 
> Should you need it, the code here is also available on my linux-next
> 20160722-linker-table-v3-try2 branch on kernel.org [2]. Lastly, ranges and
> table development go under copyleft-next, Rusty recently asked for code
> to go in prior to the license tag being added denoting this license as
> GPL-compatible [3] -- I had noted in the patch submission which annotated
> copyleft-next's compatibility to GPLv2 that copyleft-next is the license
> of choice for ongoing kernel development on my end [4]. If this is
> objectionable I'm happy to change it to GPLv2 however I'd like a reason
> provided as I've gone through all possible channels to ensure this is kosher,
> including vetting by 3 attorneys now, 2 at SUSE.

I'm not a lawyer, so I don't know really it is compatible with GPLv2,
and if it is, I'm not sure the reason why we need another license.
AFAICS the license terms, most of parts looks reasonable. I just concern
clause 8, after fifteen years, is that still GPLv2 compatible?
(I'd like see FAQ about this license...)

Thank you!

> 
> This all goes tested by 0-day... however since I found an issue with linker
> tables and blackfin this series remains as RFC -- I'll try to debug the
> issue with Steven Miao. The issue with blackfin is using a config that
> enables CONFIG_FW_LOADER=y and CONFIG_DYNAMIC_DEBUG=y [5] we end up with
> the following at final link time:
> 
>   LD  init/built-in.o
> lib/built-in.o: In function `dynamic_debug_init':
> lib/dynamic_debug.c:(.init.text+0x156): undefined reference to 
> `__verbose__end'
> lib/dynamic_debug.c:(.init.text+0x15a): undefined reference to 
> `__verbose__end'
> lib/dynamic_debug.c:(.init.text+0x160): undefined reference to `__verbose'
> lib/dynamic_debug.c:(.init.text+0x164): undefined reference to `__verbose'
> lib/dynamic_debug.c:(.init.text+0x214): undefined reference to 
> `__verbose__end'
> lib/dynamic_debug.c:(.init.text+0x218): undefined reference to 
> `__verbose__end'
> lib/dynamic_debug.c:(.init.text+0x252): undefined reference to 
> `__verbose__end'
> lib/dynamic_debug.c:(.init.text+0x258): undefined reference to `__verbose'
> lib/dynamic_debug.c:(.init.text+0x25c): undefined reference to 
> `__verbose__end'
> lib/dynamic_debug.c:(.init.text+0x260): undefined reference to `__verbose'
> drivers/built-in.o: In function `release_firmware':
> (.text+0x22dc2): undefined reference to `builtin_fw'
> drivers/built-in.o: In function `release_firmware':
> (.text+0x22dc6): undefined reference to `builtin_fw__end'
> drivers/built-in.o: In function `release_firmware':
> (.text+0x22dca): undefined reference to `builtin_fw'
> drivers/built-in.o: In function `release_firmware':
> (.text+0x22dce): undefined reference to `builtin_fw__end'
> drivers/built-in.o: In function `fw_get_builtin_firmware':
> drivers/base/firmware_class.c:(.text+0x23008): undefined ref

Re: [Xen-devel] [PATCH linux v2 0/9] xen: pvhvm: support bootup on secondary vCPUs

2016-07-25 Thread Vitaly Kuznetsov

Julien Grall  writes:

> Hi David,
>
> On 25/07/16 13:38, David Vrabel wrote:
>> On 30/06/16 16:56, Vitaly Kuznetsov wrote:
>>> It may happen that Xen's and Linux's ideas of vCPU id diverge. In
>>> particular, when we crash on a secondary vCPU we may want to do kdump
>>> and unlike plain kexec where we do migrate_to_reboot_cpu() we try booting
>>> on the vCPU which crashed. This doesn't work very well for PVHVM guests as
>>> we have a number of hypercalls where we pass vCPU id as a parameter. These
>>> hypercalls either fail or do something unexpected. To solve the issue we
>>> need to have a mapping between Linux's and Xen's vCPU ids.
>>>
>>> This series solves the issue for x86 PVHVM guests. PV guests don't (and
>>> probably won't) support kdump so I always assume Xen's vCPU id == Linux's
>>> vCPU id. ARM guests will probably need to get proper mapping once we start
>>> supporting kexec/kdump there.
>>
>> Applied to for-linus-4.8, thanks.
>
> It would have been nice to send a ping before applying. This patch
> series is containing Xen ARM code which has not been acked by Stefano,
> nor had feedback from ARM side.
>
> For instance given that all the hypercalls are representing a "vcpu
> id" using "uint32_t" it is a bit weird to use "int" to define
> xen_vcpu_id (see patch #3).

CPU id is usually 'int' in linux and now we pass it to all
hypercalls as it is. It is a bit more convenient in the mapping I
introduce as we can set it to a negative value to indicate there is no
mapping available. I can definitely change that and use something like
U32_MAX-1 to instead but I'm not sure it is worth it...

-- 
  Vitaly

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [Qemu-devel] [PATCH 01/19] xen: Create a new file xen_pvdev.c

2016-07-25 Thread Anthony PERARD

On Sun, Jul 10, 2016 at 02:47:32PM +0300, Emil Condrea wrote:
> The purpose of the new file is to store generic functions shared by frontend
> and backends such as xenstore operations, xendevs.
> 
> Signed-off-by: Quan Xu 
> Signed-off-by: Emil Condrea 
> ---
>  hw/xen/Makefile.objs |   2 +-
>  hw/xen/xen_backend.c | 125 +---
>  hw/xen/xen_pvdev.c   | 149 
> +++
>  include/hw/xen/xen_backend.h |  63 +-
>  include/hw/xen/xen_pvdev.h   |  71 +
>  5 files changed, 223 insertions(+), 187 deletions(-)
>  create mode 100644 hw/xen/xen_pvdev.c
>  create mode 100644 include/hw/xen/xen_pvdev.h
> 
> diff --git a/include/hw/xen/xen_pvdev.h b/include/hw/xen/xen_pvdev.h
> new file mode 100644
> index 000..f60bfae
> --- /dev/null
> +++ b/include/hw/xen/xen_pvdev.h
> @@ -0,0 +1,71 @@
> +#ifndef QEMU_HW_XEN_PVDEV_H
> +#define QEMU_HW_XEN_PVDEV_H 1

No need for the 1. Just "#define X" is enough.

> +
> +#include "hw/xen/xen_common.h"
> +#include 

I don't think this include is needed.

-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v3 4/4] tools: make xenstore domain easy configurable

2016-07-25 Thread Wei Liu

On Fri, Jul 22, 2016 at 05:09:31PM +0200, Juergen Gross wrote:
[...]
> diff --git a/tools/hotplug/Linux/launch-xenstore.in 
> b/tools/hotplug/Linux/launch-xenstore.in
> index 2bd9f64..fdfa33a 100644
> --- a/tools/hotplug/Linux/launch-xenstore.in
> +++ b/tools/hotplug/Linux/launch-xenstore.in
> @@ -48,18 +48,40 @@ test_xenstore && exit 0
>  
>  test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . 
> @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons
>  
> -test -z "$XENSTORED_ROOTDIR" && XENSTORED_ROOTDIR="@XEN_LIB_STORED@"
> -rm -f "$XENSTORED_ROOTDIR"/tdb* 2>/dev/null
> -test -z "$XENSTORED_TRACE" || XENSTORED_ARGS=" -T 
> @XEN_LOG_DIR@/xenstored-trace.log"
> +[ "$XENSTORETYPE" = "" ] && XENSTORETYPE=daemon
> +
> +/bin/mkdir -p @XEN_RUN_DIR@
> +
> +[ "$XENSTORETYPE" = "daemon" ] && {
> + [ -z "$XENSTORED_ROOTDIR" ] && XENSTORED_ROOTDIR="@XEN_LIB_STORED@"
> + /bin/rm -f $XENSTORED_ROOTDIR/tdb* 2>/dev/null
> + [ -z "$XENSTORED_TRACE" ] || XENSTORED_ARGS="$XENSTORED_ARGS -T 
> @XEN_LOG_DIR@/xenstored-trace.log"
> + [ -z "$XENSTORED" ] && XENSTORED=@XENSTORED@
> + [ -x "$XENSTORED" ] || {
> + echo "No xenstored found"
> + exit 1
> + }
>  
> -if [ -n "$XENSTORED" ] ; then
>   echo -n Starting $XENSTORED...
>   $XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS
> -else
> - echo "No xenstored found"
> - exit 1
> -fi
>  
> -timeout_xenstore $XENSTORED || exit 1
> + systemd-notify --booted 2>/dev/null || timeout_xenstore $XENSTORED || 
> exit 1
>  
> -exit 0
> + exit 0
> +}
> +
> +[ "$XENSTORETYPE" = "domain" ] && {
> + [ -z "$XENSTORE_DOMAIN_KERNEL" ] && 
> XENSTORE_DOMAIN_KERNEL=@LIBEXEC@/boot/xenstore-stubdom.gz
> + XENSTORE_DOMAIN_ARGS="$XENSTORE_DOMAIN_ARGS --kernel 
> $XENSTORE_DOMAIN_KERNEL"
> + [ -z "$XENSTORE_DOMAIN_SIZE" ] && XENSTORE_DOMAIN_SIZE=8
> + XENSTORE_DOMAIN_ARGS="$XENSTORE_DOMAIN_ARGS --memory 
> $XENSTORE_DOMAIN_SIZE"
> +
> + echo -n Starting $XENSTORE_DOMAIN_KERNEL...
> + ${LIBEXEC_BIN}/init-xenstore-domain $XENSTORE_DOMAIN_ARGS || exit 1
> + systemd-notify --ready 2>/dev/null

Please test if there is systemd-notify before trying to invoke it and
then you can properly log the failure of the invocation.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v3 4/4] tools: make xenstore domain easy configurable

2016-07-25 Thread Juergen Gross

On 25/07/16 15:43, Wei Liu wrote:
> On Fri, Jul 22, 2016 at 05:09:31PM +0200, Juergen Gross wrote:
> [...]
>> diff --git a/tools/hotplug/Linux/launch-xenstore.in 
>> b/tools/hotplug/Linux/launch-xenstore.in
>> index 2bd9f64..fdfa33a 100644
>> --- a/tools/hotplug/Linux/launch-xenstore.in
>> +++ b/tools/hotplug/Linux/launch-xenstore.in
>> @@ -48,18 +48,40 @@ test_xenstore && exit 0
>>  
>>  test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . 
>> @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons
>>  
>> -test -z "$XENSTORED_ROOTDIR" && XENSTORED_ROOTDIR="@XEN_LIB_STORED@"
>> -rm -f "$XENSTORED_ROOTDIR"/tdb* 2>/dev/null
>> -test -z "$XENSTORED_TRACE" || XENSTORED_ARGS=" -T 
>> @XEN_LOG_DIR@/xenstored-trace.log"
>> +[ "$XENSTORETYPE" = "" ] && XENSTORETYPE=daemon
>> +
>> +/bin/mkdir -p @XEN_RUN_DIR@
>> +
>> +[ "$XENSTORETYPE" = "daemon" ] && {
>> +[ -z "$XENSTORED_ROOTDIR" ] && XENSTORED_ROOTDIR="@XEN_LIB_STORED@"
>> +/bin/rm -f $XENSTORED_ROOTDIR/tdb* 2>/dev/null
>> +[ -z "$XENSTORED_TRACE" ] || XENSTORED_ARGS="$XENSTORED_ARGS -T 
>> @XEN_LOG_DIR@/xenstored-trace.log"
>> +[ -z "$XENSTORED" ] && XENSTORED=@XENSTORED@
>> +[ -x "$XENSTORED" ] || {
>> +echo "No xenstored found"
>> +exit 1
>> +}
>>  
>> -if [ -n "$XENSTORED" ] ; then
>>  echo -n Starting $XENSTORED...
>>  $XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS
>> -else
>> -echo "No xenstored found"
>> -exit 1
>> -fi
>>  
>> -timeout_xenstore $XENSTORED || exit 1
>> +systemd-notify --booted 2>/dev/null || timeout_xenstore $XENSTORED || 
>> exit 1
>>  
>> -exit 0
>> +exit 0
>> +}
>> +
>> +[ "$XENSTORETYPE" = "domain" ] && {
>> +[ -z "$XENSTORE_DOMAIN_KERNEL" ] && 
>> XENSTORE_DOMAIN_KERNEL=@LIBEXEC@/boot/xenstore-stubdom.gz
>> +XENSTORE_DOMAIN_ARGS="$XENSTORE_DOMAIN_ARGS --kernel 
>> $XENSTORE_DOMAIN_KERNEL"
>> +[ -z "$XENSTORE_DOMAIN_SIZE" ] && XENSTORE_DOMAIN_SIZE=8
>> +XENSTORE_DOMAIN_ARGS="$XENSTORE_DOMAIN_ARGS --memory 
>> $XENSTORE_DOMAIN_SIZE"
>> +
>> +echo -n Starting $XENSTORE_DOMAIN_KERNEL...
>> +${LIBEXEC_BIN}/init-xenstore-domain $XENSTORE_DOMAIN_ARGS || exit 1
>> +systemd-notify --ready 2>/dev/null
> 
> Please test if there is systemd-notify before trying to invoke it and
> then you can properly log the failure of the invocation.

I thought about that. What would be the purpose doing so? Following
cases are possible:

- system has no systemd installed: systemd-notify will fail, but calling
  it was not necessary -> no harm done
- system is with systemd, but not booted under control of it, and
  systemd-notify is not found: same as above -> no harm done
- system is with systemd, but not booted under control of it, and
  systemd-notify is found: calling systemd-notify isn't really needed,
  but it won't harm
- system is booted under control of systemd, systemd-notify is not
  found: I could log it, but I can't know that I'm under control of
  systemd (standard way to tell from a script is calling
  "systemd-notify --booted" which is kind of chicken and egg problem
  here) so I can't know whether not finding systemd-notify is an error
  or not
- system is booted under control of systemd, systemd-notify is found:
  everything is nice, systemd receives the notification it is waiting
  for


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 07/19] xen: Rename xen_be_unbind_evtchn

2016-07-25 Thread Anthony PERARD

On Sun, Jul 10, 2016 at 02:47:38PM +0300, Emil Condrea wrote:
> Prepare xen_be_unbind_evtchn to be shared with frontends:
>  * xen_be_unbind_evtchn -> xen_pv_unbind_evtchn
> 
> Signed-off-by: Emil Condrea 
> ---
>  hw/block/xen_disk.c| 2 +-
>  hw/char/xen_console.c  | 2 +-
>  hw/display/xenfb.c | 2 +-
>  hw/net/xen_nic.c   | 2 +-
>  hw/xen/xen_pvdev.c | 2 +-
>  include/hw/xen/xen_pvdev.h | 2 +-
>  6 files changed, 6 insertions(+), 6 deletions(-)

There is one change missing in hw/usb/xen-usb.c.

-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v3 2/4] tools: split out xenstored starting form xencommons

2016-07-25 Thread Wei Liu

On Fri, Jul 22, 2016 at 05:09:29PM +0200, Juergen Gross wrote:
> In order to prepare starting a xenstore domain split out the starting
> of the xenstore daemon from the xencommons script into a dedicated
> launch-xenstore script.
> 
> Correct one error: don't remove old tdb files in background, as this
> could lead to very subtle races.
> 
> A rerun of autogen.sh is required.
> 
> Signed-off-by: Juergen Gross 

Acked-by: Wei Liu 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH linux v2 0/9] xen: pvhvm: support bootup on secondary vCPUs

2016-07-25 Thread David Vrabel

On 25/07/16 14:17, Julien Grall wrote:
> Hi David,
> 
> On 25/07/16 13:38, David Vrabel wrote:
>> On 30/06/16 16:56, Vitaly Kuznetsov wrote:
>>> It may happen that Xen's and Linux's ideas of vCPU id diverge. In
>>> particular, when we crash on a secondary vCPU we may want to do kdump
>>> and unlike plain kexec where we do migrate_to_reboot_cpu() we try
>>> booting
>>> on the vCPU which crashed. This doesn't work very well for PVHVM
>>> guests as
>>> we have a number of hypercalls where we pass vCPU id as a parameter.
>>> These
>>> hypercalls either fail or do something unexpected. To solve the issue we
>>> need to have a mapping between Linux's and Xen's vCPU ids.
>>>
>>> This series solves the issue for x86 PVHVM guests. PV guests don't (and
>>> probably won't) support kdump so I always assume Xen's vCPU id ==
>>> Linux's
>>> vCPU id. ARM guests will probably need to get proper mapping once we
>>> start
>>> supporting kexec/kdump there.
>>
>> Applied to for-linus-4.8, thanks.
> 
> It would have been nice to send a ping before applying. This patch
> series is containing Xen ARM code which has not been acked by Stefano,
> nor had feedback from ARM side.

The ARM parts are trivial and such a small part of this series, I saw no
need to wait.

David

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [RFC v3 00/13] linux: generalize sections, ranges and linker tables

2016-07-25 Thread Richard Fontana

On 07/25/2016 09:32 AM, Masami Hiramatsu wrote:

> I'm not a lawyer, so I don't know really it is compatible with GPLv2,
> and if it is, I'm not sure the reason why we need another license.
> AFAICS the license terms, most of parts looks reasonable. I just concern
> clause 8, after fifteen years, is that still GPLv2 compatible?
> (I'd like see FAQ about this license...)

Yes, after fifteen years the restrictions in the license that make it
copyleft rather than permissive disappear, and the license is
transformed into a simple, essentially unconditional, permissive open
source license.

Richard

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 02/19] xen: Create a new file xen_frontend.c

2016-07-25 Thread Anthony PERARD

On Sun, Jul 10, 2016 at 02:47:33PM +0300, Emil Condrea wrote:
> Its purpose is to store frontend related functions.
> 
> Signed-off-by: Quan Xu 
> Signed-off-by: Emil Condrea 
> ---
>  hw/block/xen_disk.c   |  1 +
>  hw/display/xenfb.c|  1 +
>  hw/net/xen_nic.c  |  1 +
>  hw/xen/Makefile.objs  |  2 +-
>  hw/xen/xen_backend.c  | 47 +
>  hw/xen/xen_frontend.c | 70 
> +++
>  include/hw/xen/xen_backend.h  |  3 --
>  include/hw/xen/xen_frontend.h | 10 +++
>  8 files changed, 85 insertions(+), 50 deletions(-)
>  create mode 100644 hw/xen/xen_frontend.c
>  create mode 100644 include/hw/xen/xen_frontend.h
> 
> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
> index 90aca73..28fbf24 100644
> --- a/hw/block/xen_disk.c
> +++ b/hw/block/xen_disk.c
> @@ -25,6 +25,7 @@
>  
>  #include "hw/hw.h"
>  #include "hw/xen/xen_backend.h"
> +#include "hw/xen/xen_frontend.h"
>  #include "xen_blkif.h"
>  #include "sysemu/blockdev.h"
>  #include "sysemu/block-backend.h"
> diff --git a/hw/display/xenfb.c b/hw/display/xenfb.c
> index 46b7d5e..5751113 100644
> --- a/hw/display/xenfb.c
> +++ b/hw/display/xenfb.c
> @@ -30,6 +30,7 @@
>  #include "ui/console.h"
>  #include "sysemu/char.h"
>  #include "hw/xen/xen_backend.h"
> +#include "hw/xen/xen_frontend.h"
>  
>  #include 
>  #include 
> diff --git a/hw/net/xen_nic.c b/hw/net/xen_nic.c
> index 0b4ddae..bdfa789 100644
> --- a/hw/net/xen_nic.c
> +++ b/hw/net/xen_nic.c
> @@ -29,6 +29,7 @@
>  #include "net/checksum.h"
>  #include "net/util.h"
>  #include "hw/xen/xen_backend.h"
> +#include "hw/xen/xen_frontend.h"
>  
>  #include 
>  

There is also "hw/usb/xen-usb.c" that needs to be updated to include
"hw/xen/xen_frontend.h".

> diff --git a/include/hw/xen/xen_frontend.h b/include/hw/xen/xen_frontend.h
> new file mode 100644
> index 000..46485b9
> --- /dev/null
> +++ b/include/hw/xen/xen_frontend.h
> @@ -0,0 +1,10 @@
> +#ifndef QEMU_HW_XEN_FRONTEND_H
> +#define QEMU_HW_XEN_FRONTEND_H 1

No need for the 1 here.


-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH 08/19] xen: Rename xen_be_send_notify

2016-07-25 Thread Anthony PERARD

On Sun, Jul 10, 2016 at 02:47:39PM +0300, Emil Condrea wrote:
> Prepare xen_be_send_notify to be shared with frontends:
>  * xen_be_send_notify -> xen_pv_send_notify
> 
> Signed-off-by: Emil Condrea 
> ---
>  hw/block/xen_disk.c| 4 ++--
>  hw/char/xen_console.c  | 4 ++--
>  hw/display/xenfb.c | 8 
>  hw/net/xen_nic.c   | 4 ++--
>  hw/xen/xen_pvdev.c | 2 +-
>  include/hw/xen/xen_pvdev.h | 2 +-
>  6 files changed, 12 insertions(+), 12 deletions(-)

hw/usb/xen-usb.c needs to be updated as well.

-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v3 0/4] tools: make xenstore domain/daemon configurable

2016-07-25 Thread Wei Liu

I went back to the long thread in v1. I don't think I can figure out
if all the comments are addressed.

Ian asked for something to be rectified, but I'm not sure if this series
satisfies him. I will let him comment on that.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH linux v2 0/9] xen: pvhvm: support bootup on secondary vCPUs

2016-07-25 Thread Julien Grall


Hello,

On 25/07/16 14:39, Vitaly Kuznetsov wrote:

Julien Grall  writes:


Hi David,

On 25/07/16 13:38, David Vrabel wrote:

On 30/06/16 16:56, Vitaly Kuznetsov wrote:

It may happen that Xen's and Linux's ideas of vCPU id diverge. In
particular, when we crash on a secondary vCPU we may want to do kdump
and unlike plain kexec where we do migrate_to_reboot_cpu() we try booting
on the vCPU which crashed. This doesn't work very well for PVHVM guests as
we have a number of hypercalls where we pass vCPU id as a parameter. These
hypercalls either fail or do something unexpected. To solve the issue we
need to have a mapping between Linux's and Xen's vCPU ids.

This series solves the issue for x86 PVHVM guests. PV guests don't (and
probably won't) support kdump so I always assume Xen's vCPU id == Linux's
vCPU id. ARM guests will probably need to get proper mapping once we start
supporting kexec/kdump there.


Applied to for-linus-4.8, thanks.


It would have been nice to send a ping before applying. This patch
series is containing Xen ARM code which has not been acked by Stefano,
nor had feedback from ARM side.

For instance given that all the hypercalls are representing a "vcpu
id" using "uint32_t" it is a bit weird to use "int" to define
xen_vcpu_id (see patch #3).


CPU id is usually 'int' in linux and now we pass it to all
hypercalls as it is.


Well, we need to differentiate between the internal representation of 
the CPU which is based on the boot order and the logical CPU ID. For 
instance on ARM, the logical CPU ID may not be contiguous nor 0 for the 
first CPU.


From my understanding, the macros in patch #3 will be used at the last 
minute when prepare the hypercall data. IHMO this is very similar to a 
logical ID and defined as uint32_t by the hypercall ABI.


Although, I agree that currently we use the internal CPU id on ARM which 
is very unfortunate because this value is based on the order of the 
nodes in the device tree.


One way to abolish it on ARM would be to use the MPIDR (or at least a 
part) for the VCPU ID.



It is a bit more convenient in the mapping I
introduce as we can set it to a negative value to indicate there is no
mapping available. I can definitely change that and use something like
U32_MAX-1 to instead but I'm not sure it is worth it...


I looked at the definition of cpu_acpi_id on x86 which return 
x86_cpu_to_acpiid that has been defined to an uint32_t.


So you are assuming that it will never be possible to have an ID > 
0x8000.


Also, this may not be true on ARM depending how we define the VCPU 
mapping. We could decide to use the MPIDR which is in this case may be 
considered as "negative".


Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v3 4/4] tools: make xenstore domain easy configurable

2016-07-25 Thread Wei Liu

On Mon, Jul 25, 2016 at 03:56:17PM +0200, Juergen Gross wrote:
> On 25/07/16 15:43, Wei Liu wrote:
> > On Fri, Jul 22, 2016 at 05:09:31PM +0200, Juergen Gross wrote:
> > [...]
> >> diff --git a/tools/hotplug/Linux/launch-xenstore.in 
> >> b/tools/hotplug/Linux/launch-xenstore.in
> >> index 2bd9f64..fdfa33a 100644
> >> --- a/tools/hotplug/Linux/launch-xenstore.in
> >> +++ b/tools/hotplug/Linux/launch-xenstore.in
> >> @@ -48,18 +48,40 @@ test_xenstore && exit 0
> >>  
> >>  test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . 
> >> @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons
> >>  
> >> -test -z "$XENSTORED_ROOTDIR" && XENSTORED_ROOTDIR="@XEN_LIB_STORED@"
> >> -rm -f "$XENSTORED_ROOTDIR"/tdb* 2>/dev/null
> >> -test -z "$XENSTORED_TRACE" || XENSTORED_ARGS=" -T 
> >> @XEN_LOG_DIR@/xenstored-trace.log"
> >> +[ "$XENSTORETYPE" = "" ] && XENSTORETYPE=daemon
> >> +
> >> +/bin/mkdir -p @XEN_RUN_DIR@
> >> +
> >> +[ "$XENSTORETYPE" = "daemon" ] && {
> >> +  [ -z "$XENSTORED_ROOTDIR" ] && XENSTORED_ROOTDIR="@XEN_LIB_STORED@"
> >> +  /bin/rm -f $XENSTORED_ROOTDIR/tdb* 2>/dev/null
> >> +  [ -z "$XENSTORED_TRACE" ] || XENSTORED_ARGS="$XENSTORED_ARGS -T 
> >> @XEN_LOG_DIR@/xenstored-trace.log"
> >> +  [ -z "$XENSTORED" ] && XENSTORED=@XENSTORED@
> >> +  [ -x "$XENSTORED" ] || {
> >> +  echo "No xenstored found"
> >> +  exit 1
> >> +  }
> >>  
> >> -if [ -n "$XENSTORED" ] ; then
> >>echo -n Starting $XENSTORED...
> >>$XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS
> >> -else
> >> -  echo "No xenstored found"
> >> -  exit 1
> >> -fi
> >>  
> >> -timeout_xenstore $XENSTORED || exit 1
> >> +  systemd-notify --booted 2>/dev/null || timeout_xenstore $XENSTORED || 
> >> exit 1
> >>  
> >> -exit 0
> >> +  exit 0
> >> +}
> >> +
> >> +[ "$XENSTORETYPE" = "domain" ] && {
> >> +  [ -z "$XENSTORE_DOMAIN_KERNEL" ] && 
> >> XENSTORE_DOMAIN_KERNEL=@LIBEXEC@/boot/xenstore-stubdom.gz
> >> +  XENSTORE_DOMAIN_ARGS="$XENSTORE_DOMAIN_ARGS --kernel 
> >> $XENSTORE_DOMAIN_KERNEL"
> >> +  [ -z "$XENSTORE_DOMAIN_SIZE" ] && XENSTORE_DOMAIN_SIZE=8
> >> +  XENSTORE_DOMAIN_ARGS="$XENSTORE_DOMAIN_ARGS --memory 
> >> $XENSTORE_DOMAIN_SIZE"
> >> +
> >> +  echo -n Starting $XENSTORE_DOMAIN_KERNEL...
> >> +  ${LIBEXEC_BIN}/init-xenstore-domain $XENSTORE_DOMAIN_ARGS || exit 1
> >> +  systemd-notify --ready 2>/dev/null
> > 
> > Please test if there is systemd-notify before trying to invoke it and
> > then you can properly log the failure of the invocation.
> 
> I thought about that. What would be the purpose doing so? Following
> cases are possible:
> 

This

> - system is booted under control of systemd, systemd-notify is found:
>   everything is nice, systemd receives the notification it is waiting
>   for
> 

Here you assume systemd-notify always succeed. It can fail due to some
reasons. That's what its manpage suggests.

We need to handle this.

Wei.

> 
> Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH v3 4/4] tools: make xenstore domain easy configurable

2016-07-25 Thread Juergen Gross

On 25/07/16 16:01, Wei Liu wrote:
> On Mon, Jul 25, 2016 at 03:56:17PM +0200, Juergen Gross wrote:
>> On 25/07/16 15:43, Wei Liu wrote:
>>> On Fri, Jul 22, 2016 at 05:09:31PM +0200, Juergen Gross wrote:
>>> [...]
 diff --git a/tools/hotplug/Linux/launch-xenstore.in 
 b/tools/hotplug/Linux/launch-xenstore.in
 index 2bd9f64..fdfa33a 100644
 --- a/tools/hotplug/Linux/launch-xenstore.in
 +++ b/tools/hotplug/Linux/launch-xenstore.in
 @@ -48,18 +48,40 @@ test_xenstore && exit 0
  
  test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . 
 @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons
  
 -test -z "$XENSTORED_ROOTDIR" && XENSTORED_ROOTDIR="@XEN_LIB_STORED@"
 -rm -f "$XENSTORED_ROOTDIR"/tdb* 2>/dev/null
 -test -z "$XENSTORED_TRACE" || XENSTORED_ARGS=" -T 
 @XEN_LOG_DIR@/xenstored-trace.log"
 +[ "$XENSTORETYPE" = "" ] && XENSTORETYPE=daemon
 +
 +/bin/mkdir -p @XEN_RUN_DIR@
 +
 +[ "$XENSTORETYPE" = "daemon" ] && {
 +  [ -z "$XENSTORED_ROOTDIR" ] && XENSTORED_ROOTDIR="@XEN_LIB_STORED@"
 +  /bin/rm -f $XENSTORED_ROOTDIR/tdb* 2>/dev/null
 +  [ -z "$XENSTORED_TRACE" ] || XENSTORED_ARGS="$XENSTORED_ARGS -T 
 @XEN_LOG_DIR@/xenstored-trace.log"
 +  [ -z "$XENSTORED" ] && XENSTORED=@XENSTORED@
 +  [ -x "$XENSTORED" ] || {
 +  echo "No xenstored found"
 +  exit 1
 +  }
  
 -if [ -n "$XENSTORED" ] ; then
echo -n Starting $XENSTORED...
$XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS
 -else
 -  echo "No xenstored found"
 -  exit 1
 -fi
  
 -timeout_xenstore $XENSTORED || exit 1
 +  systemd-notify --booted 2>/dev/null || timeout_xenstore $XENSTORED || 
 exit 1
  
 -exit 0
 +  exit 0
 +}
 +
 +[ "$XENSTORETYPE" = "domain" ] && {
 +  [ -z "$XENSTORE_DOMAIN_KERNEL" ] && 
 XENSTORE_DOMAIN_KERNEL=@LIBEXEC@/boot/xenstore-stubdom.gz
 +  XENSTORE_DOMAIN_ARGS="$XENSTORE_DOMAIN_ARGS --kernel 
 $XENSTORE_DOMAIN_KERNEL"
 +  [ -z "$XENSTORE_DOMAIN_SIZE" ] && XENSTORE_DOMAIN_SIZE=8
 +  XENSTORE_DOMAIN_ARGS="$XENSTORE_DOMAIN_ARGS --memory 
 $XENSTORE_DOMAIN_SIZE"
 +
 +  echo -n Starting $XENSTORE_DOMAIN_KERNEL...
 +  ${LIBEXEC_BIN}/init-xenstore-domain $XENSTORE_DOMAIN_ARGS || exit 1
 +  systemd-notify --ready 2>/dev/null
>>>
>>> Please test if there is systemd-notify before trying to invoke it and
>>> then you can properly log the failure of the invocation.
>>
>> I thought about that. What would be the purpose doing so? Following
>> cases are possible:
>>
> 
> This
> 
>> - system is booted under control of systemd, systemd-notify is found:
>>   everything is nice, systemd receives the notification it is waiting
>>   for
>>
> 
> Here you assume systemd-notify always succeed. It can fail due to some
> reasons. That's what its manpage suggests.
> 
> We need to handle this.

May I repeat the sd_notify() man page here?

"In order to support both, init systems that implement this scheme and
those which do not, it is generally recommended to ignore the return
value of this call."

Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

1 2 >

1 - 100 of 164 matches

Mail list logo