[Xen-devel] [qemu-mainline test] 78509: regressions - FAIL

2016-01-20 Thread osstest service owner
flight 78509 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/78509/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-cubietruck 15 guest-start/debian.repeat fail REGR. vs. 
78421

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds 11 guest-start   fail REGR. vs. 78421

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-armhf-armhf-libvirt 14 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass
 test-armhf-armhf-libvirt-raw 13 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-checkfail   never pass

version targeted for testing:
 qemuu3db34bf64ab4f8797565dd8750003156c32b301d
baseline version:
 qemuu19b6d84316892c8086e0115d6f09cb01abb86cfc

Last test of basis78421  2016-01-18 09:47:22 Z1 days
Testing same since78509  2016-01-19 09:21:37 Z0 days1 attempts


People who touched revisions under test:
  Andreas Färber 
  Christophe Fergeau 
  Daniel P. Berrange 
  Gerd Hoffmann 
  Juan Quintela 
  Mark Cave-Ayland 
  Paolo Bonzini 
  Peter Maydell 
  Wolfgang Bumiller 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-x

Re: [Xen-devel] [PATCH] build: specify minimum versions of make and binutils

2016-01-20 Thread Jan Beulich
>>> On 19.01.16 at 20:24,  wrote:
> On 1/19/16 2:48 AM, Jan Beulich wrote:
> On 18.01.16 at 18:21,  wrote:
>>> On 1/18/16 11:03 AM, Jan Beulich wrote:
>>> On 18.01.16 at 17:53,  wrote:
> To help people avoid having to figure out what versions of make and
> binutils need to be supported document them explicitly. The version of
> binutils that had to be supported was mentioned in
> http://lists.xenproject.org/archives/html/xen-devel/2016-01/msg00609.html 
> as 2.17. Knowing that Jan got these versions from SLES10 I looked up the
> version of GNU make from the same vintage (mid-2006) and landed on 3.81.

 I'm afraid that same SLE10 has been using binutils 2.16.9.
 and make 3.80. While (still building Xen there once in a while) I'd 
 probably
 not be in big trouble if we decided we don't want to support that old an
 environment anymore, I don't think we can just go and document higher
 versions than we so far allowed. We'd first need to settle on where to
 draw the line nowadays (which then likely would mean a gcc minimal
 version bum too).
>>>
>>> Not a problem. I was just trying to take the situation from a guessing
>>> game to be explicitly called out. I was documenting what my logic was
>>> behind the version numbers I selected. I wasn't able to compare dates
>>> with binutils because their repo goes from 2003 to 2011 [1]. So I went
>>> back to SLES10's release date [2] and the GCC 4.1.0 release date [3] to
>>> compare it with GNU make [4].
>>>
>>> Honestly I'd be happy if we just drew a line in the sand so that its
>>> clear what I need to test against when I submit patches. I don't really
>>> care where the line is.
>> 
>> Then how about 2.16.1 and 3.80 respectively as the initial line?
> 
> Sounds great to me. Would you like me to resubmit or do you want to make
> that change. I'm ok if you throw away my patch and author it yourself.
> Whatever is easiest for you (or whoever commits it).

I'd prefer if you re-submitted.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling

2016-01-20 Thread Jan Beulich
>>> On 20.01.16 at 08:49,  wrote:
>> From: Jan Beulich [mailto:jbeul...@suse.com]
>> Sent: Monday, January 18, 2016 11:14 PM
>> >>> On 03.12.15 at 09:35,  wrote:
>> > --- a/xen/arch/x86/hvm/vmx/vmx.c
>> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> > @@ -83,7 +83,131 @@ static int vmx_msr_write_intercept(unsigned int
>> msr, uint64_t msr_content);
>> > +ASSERT(v->arch.hvm_vmx.pi_block_cpu == NR_CPUS);
>> > +
>> > +/*
>> > + * The vCPU is blocking, we need to add it to one of the per pCPU 
>> > lists.
>> > + * We save v->processor to v->arch.hvm_vmx.pi_block_cpu and use it
>> for
>> > + * the per-CPU list, we also save it to posted-interrupt descriptor 
>> > and
>> > + * make it as the destination of the wake-up notification event.
>> > + */
>> > +v->arch.hvm_vmx.pi_block_cpu = v->processor;
>> > +
>> > +spin_lock_irqsave(&per_cpu(pi_blocked_vcpu_lock,
>> > +  v->arch.hvm_vmx.pi_block_cpu), flags);
>> > +list_add_tail(&v->arch.hvm_vmx.pi_blocked_vcpu_list,
>> > +  &per_cpu(pi_blocked_vcpu, 
>> > v->arch.hvm_vmx.pi_block_cpu));
>> > +spin_unlock_irqrestore(&per_cpu(pi_blocked_vcpu_lock,
>> > +   v->arch.hvm_vmx.pi_block_cpu), flags);
>> > +
>> > +ASSERT(!pi_test_sn(pi_desc));
>> > +
>> > +/*
>> > + * We don't need to set the 'NDST' field, since it should point to
>> > + * the same pCPU as v->processor.
>> > + */
>> > +
>> > +write_atomic(&pi_desc->nv, pi_wakeup_vector);
>> 
>> Stray blank line between comment and corresponding code.
>> 
>> Also the ASSERT() is rather more disconnected from the write
>> than seems desirable: Wouldn't it be better to cmpxchg() the
>> whole 32-bit value, validating that SN is clear in the result?
> 
> Not quite understand this. The control field is 64 bits, do you
> mean cmpxchg() the whole 64-bit value like follows:
> 
> +do {
> +old.control = new.control = pi_desc->control;
> +new.nv = pi_wakeup_vector;
> +} while ( cmpxchg(&pi_desc->control, old.control, new.control)
> +  != old.control );
> 
> This a somewhat like the implementation in the earlier versions,
> why do you want to change it back?

Did you read what I've said? I'm worried about the disconnect of
assertion and update: You really care about SN being clear
_after_ the update aiui. And since NDST doesn't change, a 32-bit
CMPXCHG would seem to suffice.

>> > +pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu;
>> > +if ( pi_block_cpu == NR_CPUS )
>> > +return;
>> 
>> Are this condition and the one in the immediately preceding if()
>> connected in any way?
> 
> I am not sure I understand this correctly. Which one is
> " the one in the immediately preceding if()"?

If you hadn't ripped out too much of the context, it would be a
matter of pointing you back up. Now I have to re-quote the full
code:

+if ( pi_desc->nv != posted_intr_vector )
+write_atomic(&pi_desc->nv, posted_intr_vector);
+
+/* the vCPU is not on any blocking list. */
+pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu;
+if ( pi_block_cpu == NR_CPUS )
+return;

These are the two if()-s the question is about.

>> I.e. could the latter perhaps become an
>> ASSERT() by connecting the two blocks suitably? I'm simply
>> trying to limit the number of different cases to consider mentally
>> when looking at this code ...
> 
> If we get a true from above check, it means the vcpu was removed by
> pi_wakeup_interrupt(), then we just need to return. If we get a false,
> we will acquire the spinlock as the following code does, there are two
> scenarios:
> - pi_wakeup_interrupt() acquires the spinlock before us, then when
> we get the lock, 'v->arch.hvm_vmx.pi_block_cpu' has already been
> set to NR_CPUS by pi_wakeup_interrupt(), we need do nothing.
> - We get the spinlock before pi_wakeup_interrupt(), this time we need
> to remove the vCPU from the list and set 'v->arch.hvm_vmx.pi_block_cpu'
> to NR_CPUS.

This is all only about the second condition, but the question really is
whether one being true or false may imply the result of other. In
which case this would better be ASSERT()ed than handled by two
conditionals.

>> > @@ -106,10 +230,17 @@ static int vmx_vcpu_initialise(struct vcpu *v)
>> >
>> >  spin_lock_init(&v->arch.hvm_vmx.vmcs_lock);
>> >
>> > +INIT_LIST_HEAD(&v->arch.hvm_vmx.pi_blocked_vcpu_list);
>> > +
>> > +v->arch.hvm_vmx.pi_block_cpu = NR_CPUS;
>> > +
>> >  v->arch.schedule_tail= vmx_do_resume;
>> >  v->arch.ctxt_switch_from = vmx_ctxt_switch_from;
>> >  v->arch.ctxt_switch_to   = vmx_ctxt_switch_to;
>> >
>> > +if ( iommu_intpost )
>> > +v->arch.vcpu_block = vmx_vcpu_block;
>> 
>> Considering the locking question above - couldn't this be done only
>> when a device gets added, and the pointer zapped upon removal
>> of the last device, at once saving the conditional inside the hook?
> 
> That is another so

Re: [Xen-devel] [PATCH v2 08/16] xen/hvm/params: Add a new delivery type for event-channel in HVM_PARAM_CALLBACK_IRQ

2016-01-20 Thread Andrew Cooper
On 20/01/2016 06:33, Shannon Zhao wrote:
>
> On 2016/1/18 20:52, Andrew Cooper wrote:
>> On 18/01/16 12:46, Stefano Stabellini wrote:
 On Mon, 18 Jan 2016, Andrew Cooper wrote:
>> On 18/01/16 12:38, Stefano Stabellini wrote:
 On Fri, 15 Jan 2016, Shannon Zhao wrote:
>> From: Shannon Zhao 
>>
>> Add a new delivery type:
>> val[63:56] == 3: val[15:8] is flag: val[7:0] is a PPI.
>> To the flag, bit 0 stands the interrupt mode is edge(1) or level(0) 
>> and
>> bit 1 stands the interrupt polarity is active low(1) or high(0).
>>
>> Signed-off-by: Shannon Zhao 
>> ---
>>  include/xen/interface/hvm/params.h | 5 +
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/include/xen/interface/hvm/params.h 
>> b/include/xen/interface/hvm/params.h
>> index a6c7991..550688a 100644
>> --- a/include/xen/interface/hvm/params.h
>> +++ b/include/xen/interface/hvm/params.h
>> @@ -34,6 +34,11 @@
>>   *  Domain = val[47:32], Bus  = val[31:16],
>>   *  DevFn  = val[15: 8], IntX = val[ 1: 0]
>>   * val[63:56] == 2: val[7:0] is a vector number.
>> + * val[63:56] == 3: val[15:8] is flag of event-channel interrupt:
>> + *  bit 0: interrupt is edge(1) or level(0) 
>> triggered
>> + *  bit 1: interrupt is active low(1) or high(0)
>> + *  val[7:0] is PPI number used by event-channel.
>> + *  This is only used by ARM/ARM64.
>>   * If val == 0 then CPU0 event-channel notifications are not 
>> delivered.
>>   */
>>  #define HVM_PARAM_CALLBACK_IRQ 0
 Andrew, I think that this patch is correct. Looking back at your
 previous comment 
 (http://marc.info/?l=devicetree&m=144804014214262&w=2),
 is it possible that you were confused by enum callback_via_type, which
 is internal to Xen and offset'ed by 1 compared to the described values
 in xen/include/public/hvm/params.h?

 If not, and indeed somebody introduced one more field but failed to
 document it, then I suggest she sends a patch to fix the issue as soon
 as possible.
>> I was indeed confused - the ABI is utterly mad.
 All right. In that case, Shannon, you can add my

 Reviewed-by: Stefano Stabellini 


>> However, this change does need rebasing over c/s ca5c54b, which was the
>> result of the original discussion.
 c/s ca5c54b is for Xen, while this is a Linux patch (Linux has its own
 set of Xen headers).
>> All ABI changes need to happen in the Xen public headers first.  This
>> patch cannot be accepted yet.
> So you mean it should port c/s ca5c54b to Linux first? Then add this
> change based on that?

FIrst, you must get this content and submit a patch against Xen.  The
Xen repository is the authoritative version of the ABI.

Once that patch is accepted, you will need to port ca5c54b and the new
patch as part of this series.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu

2016-01-20 Thread Jan Beulich
>>> On 20.01.16 at 06:31,  wrote:
> The primary reason of current solution is to reuse existing NVDIMM
> driver in Linux kernel.

Re-using code in the Dom0 kernel has benefits and drawbacks, and
in any event needs to depend on proper layering to remain in place.
A benefit is less code duplication between Xen and Linux; along the
same lines a drawback is code duplication between various Dom0
OS variants.

> One responsibility of this driver is to discover NVDIMM devices and
> their parameters (e.g. which portion of an NVDIMM device can be mapped
> into the system address space and which address it is mapped to) by
> parsing ACPI NFIT tables. Looking at the NFIT spec in Sec 5.2.25 of
> ACPI Specification v6 and the actual code in Linux kernel
> (drivers/acpi/nfit.*), it's not a trivial task.

To answer one of Kevin's questions: The NFIT table doesn't appear
to require the ACPI interpreter. They seem more like SRAT and SLIT.
Also you failed to answer Kevin's question regarding E820 entries: I
think NVDIMM (or at least parts thereof) get represented in E820 (or
the EFI memory map), and if that's the case this would be a very
strong hint towards management needing to be in the hypervisor.

> Secondly, the driver implements a convenient block device interface to
> let software access areas where NVDIMM devices are mapped. The
> existing vNVDIMM implementation in QEMU uses this interface.
> 
> As Linux NVDIMM driver has already done above, why do we bother to
> reimplement them in Xen?

See above; a possibility is that we may need a split model (block
layer parts on Dom0, "normal memory" parts in the hypervisor.
Iirc the split is being determined by firmware, and hence set in
stone by the time OS (or hypervisor) boot starts.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 08/16] xen/hvm/params: Add a new delivery type for event-channel in HVM_PARAM_CALLBACK_IRQ

2016-01-20 Thread Shannon Zhao


On 2016/1/20 16:39, Andrew Cooper wrote:
> On 20/01/2016 06:33, Shannon Zhao wrote:
>> >
>> > On 2016/1/18 20:52, Andrew Cooper wrote:
>>> >> On 18/01/16 12:46, Stefano Stabellini wrote:
>  On Mon, 18 Jan 2016, Andrew Cooper wrote:
>>> >> On 18/01/16 12:38, Stefano Stabellini wrote:
>  On Fri, 15 Jan 2016, Shannon Zhao wrote:
>>> >> From: Shannon Zhao 
>>> >>
>>> >> Add a new delivery type:
>>> >> val[63:56] == 3: val[15:8] is flag: val[7:0] is a PPI.
>>> >> To the flag, bit 0 stands the interrupt mode is edge(1) 
>>> >> or level(0) and
>>> >> bit 1 stands the interrupt polarity is active low(1) or 
>>> >> high(0).
>>> >>
>>> >> Signed-off-by: Shannon Zhao 
>>> >> ---
>>> >>  include/xen/interface/hvm/params.h | 5 +
>>> >>  1 file changed, 5 insertions(+)
>>> >>
>>> >> diff --git a/include/xen/interface/hvm/params.h 
>>> >> b/include/xen/interface/hvm/params.h
>>> >> index a6c7991..550688a 100644
>>> >> --- a/include/xen/interface/hvm/params.h
>>> >> +++ b/include/xen/interface/hvm/params.h
>>> >> @@ -34,6 +34,11 @@
>>> >>   *  Domain = val[47:32], Bus  = 
>>> >> val[31:16],
>>> >>   *  DevFn  = val[15: 8], IntX = val[ 1: 
>>> >> 0]
>>> >>   * val[63:56] == 2: val[7:0] is a vector number.
>>> >> + * val[63:56] == 3: val[15:8] is flag of event-channel 
>>> >> interrupt:
>>> >> + *  bit 0: interrupt is edge(1) or 
>>> >> level(0) triggered
>>> >> + *  bit 1: interrupt is active 
>>> >> low(1) or high(0)
>>> >> + *  val[7:0] is PPI number used by 
>>> >> event-channel.
>>> >> + *  This is only used by ARM/ARM64.
>>> >>   * If val == 0 then CPU0 event-channel notifications 
>>> >> are not delivered.
>>> >>   */
>>> >>  #define HVM_PARAM_CALLBACK_IRQ 0
>  Andrew, I think that this patch is correct. Looking back at 
>  your
>  previous comment 
>  (http://marc.info/?l=devicetree&m=144804014214262&w=2),
>  is it possible that you were confused by enum 
>  callback_via_type, which
>  is internal to Xen and offset'ed by 1 compared to the 
>  described values
>  in xen/include/public/hvm/params.h?
> 
>  If not, and indeed somebody introduced one more field but 
>  failed to
>  document it, then I suggest she sends a patch to fix the 
>  issue as soon
>  as possible.
>>> >> I was indeed confused - the ABI is utterly mad.
>  All right. In that case, Shannon, you can add my
> 
>  Reviewed-by: Stefano Stabellini 
> 
> 
>>> >> However, this change does need rebasing over c/s ca5c54b, which 
>>> >> was the
>>> >> result of the original discussion.
>  c/s ca5c54b is for Xen, while this is a Linux patch (Linux has its 
>  own
>  set of Xen headers).
>>> >> All ABI changes need to happen in the Xen public headers first.  This
>>> >> patch cannot be accepted yet.
>> > So you mean it should port c/s ca5c54b to Linux first? Then add this
>> > change based on that?
> FIrst, you must get this content and submit a patch against Xen.  The
> Xen repository is the authoritative version of the ABI.
> 
I've sent a patch to Xen, but not applied yet.

> Once that patch is accepted, you will need to port ca5c54b and the new
> patch as part of this series.
Ok, thanks.

-- 
Shannon


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [RFC V2] xen: interface: introduce pvclk interface

2016-01-20 Thread Peng Fan
Introduce pvclk interface which is useful when doing device passthrough
on ARM platform.

To ARM platform, saying embedded SoCs, clock management hardware IP
is controlled by Dom0, DomU does not have clocksource. So we need a
paravirtualized clock source to let DomU can handle device that are
passed through from Dom0.

Signed-off-by: Peng Fan 
Cc: George Dunlap 
Cc: Konrad Rzeszutek Wilk 
Cc: David Vrabel 
Cc: Julien Grall 
Cc: Stefano Stabellini 
Cc: Ian Campbell 
---

V2:
The V1 thread:
http://lists.xen.org/archives/html/xen-devel/2016-01/msg01860.html
The V1 thread binds the interface part to the implementation for linux kernel.
This V2 only contains the pvclk interface part.

In this patch:
Backend,
/local/domain//backend/vclk//nr-clks
/local/domain//backend/vclk///name
Y is frontend domid, Z is clk id, name is the clk name.

Frontend,
/local/domain//device/vclk/clk-ring-ref
/local/domain//device/vclk/event-channel
/local/domain//device/vclk//name
Y is the clk id.

My original idea is to pass clk name which is parsed from dts.
And not expose clk name to xenstore. DomU pass the name to Dom0,
Dom0 use the name to find correct clk and operate on the clk.

Not sure whether I am right or not, please advise.
I have no knowledge of ACPI, just following Ian's
comment on V1, I add the clk id and expose the info to xenstore,
but discard the devid/bus. If devid/bus is needed, I think we
can encoded it into the clk id part in future.
Exposing the info to xenstore means we need to add entry in xl
configuration file like this: vclks =  ["id:clkname", "id:clkname"] -->
vclks = ["1:uart2-root-clk", "3:gpu-root-clk"]. And the libxl pvclk
should parse the vclks entry and fill the contents to xenstore backend nodes.
The frontend xenstore node will be created, by driver probe function 
when DomU booting or by libxl pvclk before DomU boots.
To my use case, Dom0 and DomU both use device tree, I need to build
the mapping table between id and name, since I use name to lookup
the clk in backend, like this:
"clk = __clk_loopkup(clkname); clk_prepare_enable(clk)". Maybe ACPI
is another different case.

 xen/include/public/io/clkif.h | 129 ++
 1 file changed, 129 insertions(+)
 create mode 100644 xen/include/public/io/clkif.h

diff --git a/xen/include/public/io/clkif.h b/xen/include/public/io/clkif.h
new file mode 100644
index 000..f6f9f20
--- /dev/null
+++ b/xen/include/public/io/clkif.h
@@ -0,0 +1,129 @@
+/*
+ * clkif.h
+ *
+ * CLK I/O interface for Xen guest OSes.
+ *
+ * Copyright (C) 2016, Peng Fan 
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef __XEN_PUBLIC_IO_CLKIF_H__
+#define __XEN_PUBLIC_IO_CLKIF_H__
+
+#include "ring.h"
+#include "../grant_table.h"
+
+/*
+ * The two halves of an Xen pvclk driver utilize nodes within the
+ * XenStore to communicate and negotiate operating parameters.
+ *
+ * Backend:
+ * /local/domain//backend/vclk//nr-clks
+ * /local/domain//backend/vclk///name
+ *
+ * X - The backend domid
+ * Y - The frontend domid
+ * Z - The clk id
+ * name - The clk name
+ *
+ * Backend example:
+ * /local/domain/0/backend/vclk/1/nr-clks   --> value 2
+ * /local/domain/0/backend/vclk/1/0/name --> uart2-root-clk
+ * /local/domain/0/backend/vclk/1/1/name --> gpu-root-clk
+ *
+ * /local/domain/0/backend/vclk/2/nr-clks   --> value 1
+ * /local/domain/0/backend/vclk/2/0/name --> lcdif-per-clk
+ *
+ * Frontend:
+ * /local/domain//device/vclk/clk-ring-ref
+ * /local/domain//device/vclk/event-channel
+ * /local/domain//device/vclk//name
+ *
+ * Frontend example:
+ * /local/domain/1/device/vclk/0/name --> uart2-root-clk
+ * /local/domain/1/device/vclk/1/name --> gpu-root-clk
+ *
+ * /local/domain/2/device/vclk/0/name -->lcdif-per-clk
+ */
+
+/*
+ * Define the CMDs for communication between frontend and backend
+ *
+ * If the Guest OSes want to ask the privileged OS to operate on
+ * a specific clk, the Guest OSes should pass the CMD to th

Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu

2016-01-20 Thread Andrew Cooper
On 20/01/2016 08:46, Jan Beulich wrote:
 On 20.01.16 at 06:31,  wrote:
>> The primary reason of current solution is to reuse existing NVDIMM
>> driver in Linux kernel.
> Re-using code in the Dom0 kernel has benefits and drawbacks, and
> in any event needs to depend on proper layering to remain in place.
> A benefit is less code duplication between Xen and Linux; along the
> same lines a drawback is code duplication between various Dom0
> OS variants.
>
>> One responsibility of this driver is to discover NVDIMM devices and
>> their parameters (e.g. which portion of an NVDIMM device can be mapped
>> into the system address space and which address it is mapped to) by
>> parsing ACPI NFIT tables. Looking at the NFIT spec in Sec 5.2.25 of
>> ACPI Specification v6 and the actual code in Linux kernel
>> (drivers/acpi/nfit.*), it's not a trivial task.
> To answer one of Kevin's questions: The NFIT table doesn't appear
> to require the ACPI interpreter. They seem more like SRAT and SLIT.
> Also you failed to answer Kevin's question regarding E820 entries: I
> think NVDIMM (or at least parts thereof) get represented in E820 (or
> the EFI memory map), and if that's the case this would be a very
> strong hint towards management needing to be in the hypervisor.

Conceptually, an NVDIMM is just like a fast SSD which is linearly mapped
into memory.  I am still on the dom0 side of this fence.

The real question is whether it is possible to take an NVDIMM, split it
in half, give each half to two different guests (with appropriate NFIT
tables) and that be sufficient for the guests to just work.

Either way, it needs to be a toolstack policy decision as to how to
split the resource.

~Andrew

>
>> Secondly, the driver implements a convenient block device interface to
>> let software access areas where NVDIMM devices are mapped. The
>> existing vNVDIMM implementation in QEMU uses this interface.
>>
>> As Linux NVDIMM driver has already done above, why do we bother to
>> reimplement them in Xen?
> See above; a possibility is that we may need a split model (block
> layer parts on Dom0, "normal memory" parts in the hypervisor.
> Iirc the split is being determined by firmware, and hence set in
> stone by the time OS (or hypervisor) boot starts.
>
> Jan
>


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface

2016-01-20 Thread Juergen Gross
On 20/01/16 09:31, Peng Fan wrote:
> Introduce pvclk interface which is useful when doing device passthrough
> on ARM platform.

...

> +/*
> + * Frontend request
> + *
> + * cmd: command for operation on clk, should be XEN_CLK_[xx],
> + *   excluding XEN_CLK_END. id is the
> + * id: clk id
> + * rate: clock rate. Used for set rate.

Which unit? Hz?

> + */
> +struct xen_clkif_request {
> + uint32_t cmd;
> + uint32_t id;
> + uint64_t rate;
> +};
> +typedef struct xen_clkif_request xen_clkif_request_t;
> +
> +/*
> + * Backend response
> + *
> + * cmd: command for operation on clk, same with the cmd in request.
> + * id: clk id, same with the id in request.
> + * success: indicate failure or success for the cmd.

Values?

> + * rate: clock rate. Used for get rate.
> + *
> + * cmd and id are filled by backend and passed to frontend to
> + * let frontend check whether the response is for the current
> + * request or not.

I'd rather let the frontend add a request Id to the request which
will be echoed here instead cmd and clock Id.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Bug 1591 - xentop truncates VM names

2016-01-20 Thread Michael Großer

Hi!

 

Can anybody tell me please where I have to post bugs like this to

find the right contact person who is qualified to fix or even to discuss it?

 

Is xen-devel@lists.xen.org the right list for this?

 

Thanks in advance,

Michael

 

 

Gesendet: Freitag, 15. Januar 2016 um 10:17 Uhr
Von: "Michael Großer" 
An: xen-devel@lists.xen.org
Betreff: [Xen-devel] Bug 1591 - xentop truncated VM names

Hi!

Some days ago, I discovered the 'xentop' tool (somebody mentioned it here:
http://discussions.citrix.com/topic/374218-what-can-we-do-to-decrease-the-time-needed-for-snapshot-reverts/
)

We work with XenServer 6.5 and we use long names for our virtual
machines to distinguish between them.

A machine name can have about 40 characters, but xentop displays only the first
10 characters in the NAME column. The result is, that I cannot identify a
particular VM using xentop.

I read the man page and googled around to find a solution. I found
the page

http://www.bl-nk.net/2014/07/xentop-and-long-server-names/

and created my own quirky solution as follows:

# cd /usr/sbin
# cp xentop xentop_wide
# sed -i 's/10\.10/80\.80/g' xentop_wide

This works for me today, but my common sense tells me that this solution could
cease working after we update to a newer version or that I could demage
some other instructions that accidentally have similar machine codes.

I discovered this 6 years old bug:

http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1591

The composer of this bug makes recommendations that I wouldn't recommend, so
that seams to be the reason why the bug is nearly 6 years old without being
solved.

I recommend a very simple and reconcilable problem-solving approach:

- The man page should contain / describe another option, let me call it
"-w / --column-name-width", which requires a number.

- When starting xentop with "-w 80" or "--column-name-width=80", then
it should display (in every imaginable work mode) a name column with
a width of 80 characters.

- The default width should stay of cause at 10 characters, but with such an
option, I as a user could have a choice.

Please tell me, how is the chance that such an bug fix enters a next possible
release of XenServer, which release number could this be and when will it
roughly appear?

Thanks in advance,
Michael Großer

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 05/11] igd: TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE: call parent realize

2016-01-20 Thread Gerd Hoffmann
  Hi,

> > > > +i440fx_realize = k->realize;
> > > >  k->realize = igd_pt_i440fx_realize;
> > 
> > ... because we are overriding it right here.
> 
> Many device classes have a parent_realize field so they can keep
> a pointer to the original realize function. It's better than a
> static variable.

How does the attached patch (incremental fix, not tested yet) look like?

cheers,
  Gerd

From 3d110e297b5182107e055db3ab69092affdef5bb Mon Sep 17 00:00:00 2001
From: Gerd Hoffmann 
Date: Wed, 20 Jan 2016 10:08:19 +0100
Subject: [PATCH] [fixup] TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE realize_parent

---
 hw/pci-host/igd.c | 24 +++-
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/hw/pci-host/igd.c b/hw/pci-host/igd.c
index 160828f..2d51745 100644
--- a/hw/pci-host/igd.c
+++ b/hw/pci-host/igd.c
@@ -49,12 +49,24 @@ out_free:
 g_free(path);
 }
 
-static void (*i440fx_realize)(PCIDevice *pci_dev, Error **errp);
+#define IGD_PT_I440FX_CLASS(class)  \
+OBJECT_CLASS_CHECK(IGDPtI440fxClass, (class),   \
+   TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE)
+#define IGD_PT_I440FX_GET_CLASS(obj)\
+OBJECT_GET_CLASS(IGDPtI440fxClass, (obj),   \
+ TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE)
+
+typedef struct IGDPtI440fxClass {
+PCIDeviceClass parent_class;
+void (*parent_realize)(PCIDevice *dev, Error **errp);
+} IGDPtI440fxClass;
+
 static void igd_pt_i440fx_realize(PCIDevice *pci_dev, Error **errp)
 {
+IGDPtI440fxClass *k = IGD_PT_I440FX_GET_CLASS(pci_dev);
 Error *err = NULL;
 
-i440fx_realize(pci_dev, &err);
+k->parent_realize(pci_dev, &err);
 if (err != NULL) {
 error_propagate(errp, err);
 return;
@@ -72,11 +84,12 @@ static void igd_pt_i440fx_realize(PCIDevice *pci_dev, Error **errp)
 
 static void igd_passthrough_i440fx_class_init(ObjectClass *klass, void *data)
 {
+IGDPtI440fxClass *k = IGD_PT_I440FX_CLASS(klass);
 DeviceClass *dc = DEVICE_CLASS(klass);
-PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+PCIDeviceClass *pc = PCI_DEVICE_CLASS(klass);
 
-i440fx_realize = k->realize;
-k->realize = igd_pt_i440fx_realize;
+k->parent_realize = pc->realize;
+pc->realize = igd_pt_i440fx_realize;
 dc->desc = "IGD Passthrough Host bridge (i440fx)";
 }
 
@@ -84,6 +97,7 @@ static const TypeInfo igd_passthrough_i440fx_info = {
 .name  = TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE,
 .parent= TYPE_I440FX_PCI_DEVICE,
 .class_init= igd_passthrough_i440fx_class_init,
+.class_size= sizeof(IGDPtI440fxClass),
 };
 
 static void (*q35_realize)(PCIDevice *pci_dev, Error **errp);
-- 
1.8.3.1

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] libxc: Expose the MPX cpuid flag to guest

2016-01-20 Thread Li, Liang Z
> Cc: wei.l...@citrix.com; ian.campb...@citrix.com;
> stefano.stabell...@eu.citrix.com; ian.jack...@eu.citrix.com; xen-
> de...@lists.xen.org; jbeul...@suse.com
> Subject: Re: [Xen-devel] [PATCH] libxc: Expose the MPX cpuid flag to guest
> 
> On Mon, Jan 11, 2016 at 04:52:10PM +0800, Liang Li wrote:
> > If hardware support memory protect externsion, expose this feature
> 
> extension

Thanks!

Liang


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface

2016-01-20 Thread Peng Fan
Hi Juergen,

On Wed, Jan 20, 2016 at 10:05:15AM +0100, Juergen Gross wrote:
>On 20/01/16 09:31, Peng Fan wrote:
>> Introduce pvclk interface which is useful when doing device passthrough
>> on ARM platform.
>
>...
>
>> +/*
>> + * Frontend request
>> + *
>> + * cmd: command for operation on clk, should be XEN_CLK_[xx],
>> + *  excluding XEN_CLK_END. id is the
>> + * id: clk id
>> + * rate: clock rate. Used for set rate.
>
>Which unit? Hz?

Yeah. Hz. I'll add comments in V3.

>
>> + */
>> +struct xen_clkif_request {
>> +uint32_t cmd;
>> +uint32_t id;
>> +uint64_t rate;
>> +};
>> +typedef struct xen_clkif_request xen_clkif_request_t;
>> +
>> +/*
>> + * Backend response
>> + *
>> + * cmd: command for operation on clk, same with the cmd in request.
>> + * id: clk id, same with the id in request.
>> + * success: indicate failure or success for the cmd.
>
>Values?

I'd like to let the frontend/backend driver to determine the value.
In my implementation for linux, if the backend API supports return value,
I'll fill the return value to success entry. If the backend API returns
void, I'll just fill 0 to success entry.

>
>> + * rate: clock rate. Used for get rate.
>> + *
>> + * cmd and id are filled by backend and passed to frontend to
>> + * let frontend check whether the response is for the current
>> + * request or not.
>
>I'd rather let the frontend add a request Id to the request which
>will be echoed here instead cmd and clock Id.

If using request id, I think we can encode cmd and id into request id?
To my dts case, clk id is simple. But I am not sure about ACPI part. Maybe
I can not simply encode them into request id.
Wait for more comments on this:)

Thanks,
Peng.
>
>
>Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Error booting Xen

2016-01-20 Thread Jan Beulich
>>> On 19.01.16 at 20:55,  wrote:
> I tried booting staging branch but results were identical. Following
> line repeats endlessly.
> (XEN) traps.c:3290: GPF (): 82d0801c1cce -> 82d080252e5c
> 
> $ 'addr2line -e xen-syms 82d0801c1cce' returns
> 'xen/xen/arch/x86/xstate.c:387' which again points to
> xsave. Also, adding 'xsave=0' makes it boot just fine.

Ah, I think I see the issue: We're zeroing the entire save area in
the fixup code, which will make XRSTORS fault unconditionally.
Shuai, would you please look into possible ways of fixing this?
Just setting the compressed flag may not be enough, and falling
back to plain XRSTOR doesn't seem to be an option either - I'm in
particular worried that the current model of zeroing everything
is bogus with the growing number of different components which
get loaded here.

But of course another question then is why the XRSTORS faults
in the first place. I guess we'll need a debugging patch to dump
the full state to understand that.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface

2016-01-20 Thread Jan Beulich
>>> On 20.01.16 at 09:31,  wrote:
> +/*
> + * Backend response
> + *
> + * cmd: command for operation on clk, same with the cmd in request.
> + * id: clk id, same with the id in request.
> + * success: indicate failure or success for the cmd.
> + * rate: clock rate. Used for get rate.
> + *
> + * cmd and id are filled by backend and passed to frontend to
> + * let frontend check whether the response is for the current
> + * request or not.
> + */
> +struct xen_clkif_response {
> + uint32_t cmd;
> + uint32_t id;
> + uint32_t success;
> + uint64_t rate;
> +};

This isn't 32-/64-bit clean. Question is whether echoing cmd is really
needed.

Also naming a field "success" is pretty odd - is this mean to be a
boolean? Better name it e.g. status, allowing for different (error)
indicators.

And what I'm missing throughout the file is some description of how
clock events (interrupts?) are actually meant to make it into the
guest.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu

2016-01-20 Thread Haozhong Zhang
On 01/20/16 08:58, Andrew Cooper wrote:
> On 20/01/2016 08:46, Jan Beulich wrote:
>  On 20.01.16 at 06:31,  wrote:
> >> The primary reason of current solution is to reuse existing NVDIMM
> >> driver in Linux kernel.
> > Re-using code in the Dom0 kernel has benefits and drawbacks, and
> > in any event needs to depend on proper layering to remain in place.
> > A benefit is less code duplication between Xen and Linux; along the
> > same lines a drawback is code duplication between various Dom0
> > OS variants.
> >
> >> One responsibility of this driver is to discover NVDIMM devices and
> >> their parameters (e.g. which portion of an NVDIMM device can be mapped
> >> into the system address space and which address it is mapped to) by
> >> parsing ACPI NFIT tables. Looking at the NFIT spec in Sec 5.2.25 of
> >> ACPI Specification v6 and the actual code in Linux kernel
> >> (drivers/acpi/nfit.*), it's not a trivial task.
> > To answer one of Kevin's questions: The NFIT table doesn't appear
> > to require the ACPI interpreter. They seem more like SRAT and SLIT.
> > Also you failed to answer Kevin's question regarding E820 entries: I
> > think NVDIMM (or at least parts thereof) get represented in E820 (or
> > the EFI memory map), and if that's the case this would be a very
> > strong hint towards management needing to be in the hypervisor.
>

CCing QEMU vNVDIMM maintainer: Xiao Guangrong

> Conceptually, an NVDIMM is just like a fast SSD which is linearly mapped
> into memory.  I am still on the dom0 side of this fence.
> 
> The real question is whether it is possible to take an NVDIMM, split it
> in half, give each half to two different guests (with appropriate NFIT
> tables) and that be sufficient for the guests to just work.
>

Yes, one NVDIMM device can be split into multiple parts and assigned
to different guests, and QEMU is responsible to maintain virtual NFIT
tables for each part.

> Either way, it needs to be a toolstack policy decision as to how to
> split the resource.
>

But the split does not need to be done at Xen side IMO. It can be done
by dom0 kernel and QEMU as long as they tells Xen hypervisor the
address space range of each part.

Haozhong

> ~Andrew
> 
> >
> >> Secondly, the driver implements a convenient block device interface to
> >> let software access areas where NVDIMM devices are mapped. The
> >> existing vNVDIMM implementation in QEMU uses this interface.
> >>
> >> As Linux NVDIMM driver has already done above, why do we bother to
> >> reimplement them in Xen?
> > See above; a possibility is that we may need a split model (block
> > layer parts on Dom0, "normal memory" parts in the hypervisor.
> > Iirc the split is being determined by firmware, and hence set in
> > stone by the time OS (or hypervisor) boot starts.
> >
> > Jan
> >
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface

2016-01-20 Thread Jan Beulich
>>> On 20.01.16 at 10:25,  wrote:
> On Wed, Jan 20, 2016 at 10:05:15AM +0100, Juergen Gross wrote:
>>On 20/01/16 09:31, Peng Fan wrote:
>>> + */
>>> +struct xen_clkif_request {
>>> +   uint32_t cmd;
>>> +   uint32_t id;
>>> +   uint64_t rate;
>>> +};
>>> +typedef struct xen_clkif_request xen_clkif_request_t;
>>> +
>>> +/*
>>> + * Backend response
>>> + *
>>> + * cmd: command for operation on clk, same with the cmd in request.
>>> + * id: clk id, same with the id in request.
>>> + * success: indicate failure or success for the cmd.
>>
>>Values?
> 
> I'd like to let the frontend/backend driver to determine the value.

No, that would imply matched pairs of frontends and backends. Yet
the purpose of spelling out an interface is to allow interoperability.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/3] tools: introduce parameter max_ranges.

2016-01-20 Thread Ian Campbell
On Wed, 2016-01-20 at 03:58 +, Tian, Kevin wrote:
> > From: Yu, Zhang [mailto:yu.c.zh...@linux.intel.com]
> > Sent: Wednesday, January 20, 2016 11:33 AM
> > > As a feature this write-protection has nothing to be GPU
> > > virtualization specific.
> > > In the future the same mediated pass-through idea used in XenGT may
> > > be
> > > used on other I/O devices which need to shadow some structure w/
> > > requirement
> > > to write-protect guest memory. So it's not good to tie this to either
> > > XenGT
> > > or GTT.
> > > 
> > Thank you, Kevin.
> > Well, if this parameter is not supposed to be xengt specific, we do not
> > need to connect it with any xengt flag such as ."vgt=1" or "GVT-g=1".
> > Hence the user will have to configure the max_wp_ram_ranges himself,
> > right?
> > 
> 
> Not always. The option can be configured manually by the user, or 
> automatically set in the code when "vgt=1" is recognized.

Is the latter approach not always sufficient? IOW, if it can be done
automatically, why would the user need to tweak it?

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/3] tools: introduce parameter max_ranges.

2016-01-20 Thread Paul Durrant
> -Original Message-
> From: Ian Campbell [mailto:ian.campb...@citrix.com]
> Sent: 20 January 2016 10:16
> To: Kevin Tian; Yu, Zhang; Wei Liu; Paul Durrant
> Cc: Keir (Xen.org); jbeul...@suse.com; Andrew Cooper; xen-
> de...@lists.xen.org; Lv, Zhiyuan; Stefano Stabellini
> Subject: Re: [Xen-devel] [PATCH 3/3] tools: introduce parameter
> max_ranges.
> 
> On Wed, 2016-01-20 at 03:58 +, Tian, Kevin wrote:
> > > From: Yu, Zhang [mailto:yu.c.zh...@linux.intel.com]
> > > Sent: Wednesday, January 20, 2016 11:33 AM
> > > > As a feature this write-protection has nothing to be GPU
> > > > virtualization specific.
> > > > In the future the same mediated pass-through idea used in XenGT may
> > > > be
> > > > used on other I/O devices which need to shadow some structure w/
> > > > requirement
> > > > to write-protect guest memory. So it's not good to tie this to either
> > > > XenGT
> > > > or GTT.
> > > >
> > > Thank you, Kevin.
> > > Well, if this parameter is not supposed to be xengt specific, we do not
> > > need to connect it with any xengt flag such as ."vgt=1" or "GVT-g=1".
> > > Hence the user will have to configure the max_wp_ram_ranges himself,
> > > right?
> > >
> >
> > Not always. The option can be configured manually by the user, or
> > automatically set in the code when "vgt=1" is recognized.
> 
> Is the latter approach not always sufficient? IOW, if it can be done
> automatically, why would the user need to tweak it?
>

I think latter is sufficient for now. We always have the option of adding a 
specific wp_ram_ranges parameter in future if there is a need.

  Paul
 
> Ian.
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/3] tools: introduce parameter max_ranges.

2016-01-20 Thread Wei Liu
On Wed, Jan 20, 2016 at 01:02:39PM +0800, Yu, Zhang wrote:
> 
> 
> On 1/20/2016 11:58 AM, Tian, Kevin wrote:
> >>From: Yu, Zhang [mailto:yu.c.zh...@linux.intel.com]
> >>Sent: Wednesday, January 20, 2016 11:33 AM
> >>>As a feature this write-protection has nothing to be GPU virtualization 
> >>>specific.
> >>>In the future the same mediated pass-through idea used in XenGT may be
> >>>used on other I/O devices which need to shadow some structure w/ 
> >>>requirement
> >>>to write-protect guest memory. So it's not good to tie this to either XenGT
> >>>or GTT.
> >>>
> >>Thank you, Kevin.
> >>Well, if this parameter is not supposed to be xengt specific, we do not
> >>need to connect it with any xengt flag such as ."vgt=1" or "GVT-g=1".
> >>Hence the user will have to configure the max_wp_ram_ranges himself,
> >>right?
> >>
> >
> >Not always. The option can be configured manually by the user, or
> >automatically set in the code when "vgt=1" is recognized.
> 
> OK. That sounds more reasonable. :)
> To give a summary, I'll do the following changes in next version:
> 
> 1> rename this new parameter to "max_wp_ram_ranges", then use this
> parameter as the wp-ram rangeset limit, for the I/O rangeset, keep
> MAX_NR_IO_RANGES as its limit;
> 2> clear the documentation part;
> 3> define a LIBXL_HAVE_XXX in libxl.h to indicate a new field in the
> build info;
> 4> We do not introduce the xengt flag by now, and will add code to
> automatically set the "max_wp_ram_ranges" after this flag is accepted
> in the future.
> 
> Does anyone have more suggestions? :)
> 

Ian posted an enquiry earlier:

"Could we use something like one of those to cause the t/stack to just
DTRT without the user having to micromanage the amount of pages which
are allowed to have this property?"

Is that possible?

Wei.


> B.R.
> Yu
> >
> >Thanks
> >Kevin
> >

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen: Improvements to clean and distclean targets

2016-01-20 Thread George Dunlap
On Tue, Jan 19, 2016 at 9:38 AM, Ian Campbell  wrote:
> On Tue, 2016-01-19 at 01:43 -0700, Jan Beulich wrote:
>> > > > On 18.01.16 at 19:19,  wrote:
>> > On 18/01/16 16:57, Jan Beulich wrote:
>> > > > > > On 18.01.16 at 17:45,  wrote:
>> > > > On 18/01/16 16:41, Jan Beulich wrote:
>> > > > > > > > On 18.01.16 at 17:27,  wrote:
>> > > > > > * Move '*~' and 'core' into the find rule.
>> > > > > I don't understand this part: Where in the build process do such
>> > > > > get
>> > > > > generated? I'm tempted to instead recommend to just drop those
>> > > > > from the rm invocation...
>> > > > No idea about 'core' files, but *~ are emacs backup files.
>> > > But emacs should clean up after itself; this shouldn't be the job
>> > > of our clean rule.
>> >
>> > Why? the point is to have a one-revision old version of the file to
>> > hand.
>>
>> I guess there may be different strategies here: My editor also
>> creates such named files, but deletes them as the program gets
>> shut down. I.e. the one-revision old backup exists as long as the
>> program is running. I can see benefits from the alternative
>> model, but still it shouldn't be our scripts to clean up such backups.
>> After all - what if another program used another name patter for
>> its backups? Would we go clean those up then too?
>
> IMHO these files should be in .gitignore (so they don't clutter "git
> status", AFAICT this is already done correctly) but it's not really
> necessary for "make clean" (or distclean) to get rid of them, that's up to
> either the editor or the user. IOW I'd be happy removing the existing
> rules.

As an emacs user, I agree with this.  The purpose of "make clean" IMO
is be to make sure that the *build* operates cleanly (i.e., doesn't
end up using any output generated from a previous build), not to get
rid of extraneous random files that don't affect the build.  "git
clean" is the proper tool for cleaning out the tree for git commands.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 3/3] VT-d: Fix vt-d Device-TLB flush timeout issue.

2016-01-20 Thread Xu, Quan
Jan, thanks for your comments.

> On January 15, 2016 at 9:10,  wrote:
> >>> On 23.12.15 at 09:25,  wrote:
> > --- a/xen/drivers/passthrough/vtd/qinval.c
> > +++ b/xen/drivers/passthrough/vtd/qinval.c
> > @@ -190,9 +190,19 @@ static int queue_invalidate_wait(struct iommu
> > *iommu,  static int invalidate_sync(struct iommu *iommu)  {
> >  struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);
> > +int rc;
> >
> >  if ( qi_ctrl->qinval_maddr )
> > -return queue_invalidate_wait(iommu, 0, 1, 1);
> > +{
> > +rc = queue_invalidate_wait(iommu, 0, 1, 1);
> > +if ( rc )
> > +{
> > +if ( rc == -ETIMEDOUT )
> > +panic("Queue invalidate wait descriptor was
> > + timeout.\n");
> 


> Unless I'm overlooking something this doesn't replace and existing panic(), 
> and I
> think I had indicated quite clearly that this series shouldn't add any new 
> ones,
> unless the alternative of crashing the owning domain is completely unfeasible.
> 


I will remove it in next v5.

[...]

> > @@ -229,6 +239,63 @@ int qinval_device_iotlb(struct iommu *iommu,
> >  return 0;
> >  }
> >
> > +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did,
> > + u16 seg, u8 bus, u8 devfn)
> {
> > +struct domain *d;
> > +struct pci_dev *pdev;
> > +
> > +d = rcu_lock_domain_by_id(iommu->domid_map[did]);
> > +ASSERT(d);
> 
> Don't you need to acquire some lock in order to safely assert this?

Referred to the other use case of 'rcu_lock_domain_by_id()', Xen checks whether
The domain is 'NULL'. 
Could I also replace the 'ASSERT(d)' with
  + If ( d == NULL )
  +   return;
?

> Also note that unused slots hold zero, i.e. there's at least a theoretical 
> risk of
> problems here when you don't also look at
> iommu->domid_bitmap.
> 
I am not clear how to fix this point. Do you have good idea?
Add a lock to 'iommu->domid_bitmap'?


> > +for_each_pdev(d, pdev)
> > +{
> > +if ( (pdev->seg == seg) &&
> > + (pdev->bus == bus) &&
> > + (pdev->devfn == devfn) )
> > +{
> > +if ( pdev->domain )
> 
> If the device is on the domain's list, can this reasonably be false?

I can't find the false case.

> I.e. don't you rather mean ASSERT() here?

I'd better replace 'if' with ASSERT() for this case.

> 
> > +{
> > +list_del(&pdev->domain_list);
> 
> This should happen under pcidevs_lock - you need to either acquire it or
> ASSERT() it being held.
> 

I should check whether it is under pcidevs_lock -- with 
spin_is_locked(&pcidevs_lock)
If it is not under pcidevs_lock, I should acquire it.
I think ASSERT() is not a good idea. Hypervisor acquires this lock and then 
remove the resource.



> > +}
> > +
> > +if ( !is_hardware_domain(d) )
> > +domain_crash(d);
> 
> I wonder whether the device hiding shouldn't also be avoided if the device is
> owned by hwdom.
> 

I think it should be avoided if the device is owned by hwdom.
Otherwise, it is quite similar to panic..


> > @@ -349,9 +416,18 @@ static int flush_iotlb_qi(
> >  queue_invalidate_iotlb(iommu,
> > type >>
> DMA_TLB_FLUSH_GRANU_OFFSET, dr,
> > dw, did, size_order, 0, addr);
> > +
> > +/*
> > + * Synchronize with hardware for invalidation request descriptors
> > + * submitted before Device-TLB invalidate descriptor.
> > + */
> > +rc = invalidate_sync(iommu);
> > +if ( rc )
> > + return rc;
> > +
> >  if ( flush_dev_iotlb )
> >  ret = dev_invalidate_iotlb(iommu, did, addr, size_order, type);
> > -rc = invalidate_sync(iommu);
> > +
> >  if ( !ret )
> >  ret = rc;
> >  }
> 
> This change is because of ...?
> 
As Kevin's comments,
I add 'dev_invalidate_iotlb_sync()' to dev_invalidate_iotlb(). 
Then i can know which device is flush timeout.



> > --- a/xen/drivers/passthrough/vtd/x86/ats.c
> > +++ b/xen/drivers/passthrough/vtd/x86/ats.c
> > @@ -162,6 +162,19 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16
> did,
> >  return -EOPNOTSUPP;
> >  }
> >
> > +/*
> > + * Synchronize with hardware for Device-TLB invalidate
> > + * descriptor.
> > + */
> > +ret = dev_invalidate_iotlb_sync(iommu, did, pdev->seg,
> > +pdev->bus, pdev->devfn);
> > +if ( ret )
> > +{
> > +dprintk(XENLOG_WARNING VTDPREFIX,
> > +"Device-TLB flush timeout.\n");
> 
> Are you aware that dprintk() compiles to nothing in non-debug builds?


To be honest, I was not aware.

> Plus logging the error code and device coordinates might turn out helpful in
> such cases. But first of all - if there was a timeout, we'd make it here only 
> for
> Dom0. Perhaps the printing should 

Re: [Xen-devel] Bug 1591 - xentop truncates VM names

2016-01-20 Thread Ian Campbell
On Wed, 2016-01-20 at 09:56 +0100, 
> Hi!
>  
> Can anybody tell me please where I have to post bugs like this to
> find the right contact person who is qualified to fix or even to discuss
> it?
>  
> Is xen-devel@lists.xen.org the right list for this?

For XenServer related bugs:

If you are a Citrix customer in which case you will have a Support contact
which you should use.

Otherwise the right place is somewhere on www.xenserver.org (forums, lists
etc), since that is a separate downstream project from xen.org
(xenproject).

For upstream/XenProject related buts:
xen-devel@lists.xen.org is the right place, the process/expectations are
described more fully at http://wiki.xen.org/wiki/Reporting_Bugs_against_Xen
 .

At least in recent xen.org releases xentop takes a -f/--full-name option
which does not truncate the name:

root@st40:~# xentop -f -b

  NAME  STATE   CPU(sec) CPU(%) MEM(k) 
MEM(%)  MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS   VBD_OO   VBD_RD 
  VBD_WR  VBD_RSECT  VBD_WSECT SSID
aa
 -r 560.0 5242886.3 525312   6.3 11 
  53120 1991   43  507388240

  Domain-0 -r3080.0 522484  
  6.2 524288   6.3 8000000  
  0  0  00


So if this isn't present in XenServer then you want to be talking to them
or your Support person.

Ian.
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH] xen: allocate gntdev_copy_batch dynamically

2016-01-20 Thread Arnd Bergmann
struct gntdev_copy_batch is arguably too large to fit on the kernel stack,
and we get a warning about the stack usage in gntdev_ioctl_grant_copy:

drivers/xen/gntdev.c:949:1: error: the frame size of 1240 bytes is larger than 
1024 bytes

This changes the code to us a dynamic allocation instead.

Signed-off-by: Arnd Bergmann 
Fixes: a4cdb556cae0 ("xen/gntdev: add ioctl for grant copy")
---
This is a regression against v4.4, found with ARM randconfig testing.

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index dc495383ad73..cc753b3a7154 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -915,15 +915,16 @@ static int gntdev_grant_copy_seg(struct gntdev_copy_batch 
*batch,
 static long gntdev_ioctl_grant_copy(struct gntdev_priv *priv, void __user *u)
 {
struct ioctl_gntdev_grant_copy copy;
-   struct gntdev_copy_batch batch;
+   struct gntdev_copy_batch *batch;
unsigned int i;
int ret = 0;
 
if (copy_from_user(©, u, sizeof(copy)))
return -EFAULT;
 
-   batch.nr_ops = 0;
-   batch.nr_pages = 0;
+   batch = kzalloc(sizeof(*batch), GFP_KERNEL);
+   if (!batch)
+   return -ENOMEM;
 
for (i = 0; i < copy.count; i++) {
struct gntdev_grant_copy_segment seg;
@@ -933,18 +934,20 @@ static long gntdev_ioctl_grant_copy(struct gntdev_priv 
*priv, void __user *u)
goto out;
}
 
-   ret = gntdev_grant_copy_seg(&batch, &seg, 
©.segments[i].status);
+   ret = gntdev_grant_copy_seg(batch, &seg, 
©.segments[i].status);
if (ret < 0)
goto out;
 
cond_resched();
}
-   if (batch.nr_ops)
-   ret = gntdev_copy(&batch);
+   if (batch->nr_ops)
+   ret = gntdev_copy(batch);
+   kfree(batch);
return ret;
 
   out:
-   gntdev_put_pages(&batch);
+   gntdev_put_pages(batch);
+   kfree(batch);
return ret;
 }
 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface

2016-01-20 Thread Juergen Gross
On 20/01/16 10:25, Peng Fan wrote:
> Hi Juergen,
> 
> On Wed, Jan 20, 2016 at 10:05:15AM +0100, Juergen Gross wrote:
>> On 20/01/16 09:31, Peng Fan wrote:
>>> Introduce pvclk interface which is useful when doing device passthrough
>>> on ARM platform.
>>
>> ...
>>
>>> +/*
>>> + * Frontend request
>>> + *
>>> + * cmd: command for operation on clk, should be XEN_CLK_[xx],
>>> + * excluding XEN_CLK_END. id is the
>>> + * id: clk id
>>> + * rate: clock rate. Used for set rate.
>>
>> Which unit? Hz?
> 
> Yeah. Hz. I'll add comments in V3.
> 
>>
>>> + */
>>> +struct xen_clkif_request {
>>> +   uint32_t cmd;
>>> +   uint32_t id;
>>> +   uint64_t rate;
>>> +};
>>> +typedef struct xen_clkif_request xen_clkif_request_t;
>>> +
>>> +/*
>>> + * Backend response
>>> + *
>>> + * cmd: command for operation on clk, same with the cmd in request.
>>> + * id: clk id, same with the id in request.
>>> + * success: indicate failure or success for the cmd.
>>
>> Values?
> 
> I'd like to let the frontend/backend driver to determine the value.
> In my implementation for linux, if the backend API supports return value,
> I'll fill the return value to success entry. If the backend API returns
> void, I'll just fill 0 to success entry.

So please specify "0" to mean success and add some sensible error
values. There might be multiple frontend- and/or backend-variants which
all must agree using the same interface.

>>> + * rate: clock rate. Used for get rate.
>>> + *
>>> + * cmd and id are filled by backend and passed to frontend to
>>> + * let frontend check whether the response is for the current
>>> + * request or not.
>>
>> I'd rather let the frontend add a request Id to the request which
>> will be echoed here instead cmd and clock Id.
> 
> If using request id, I think we can encode cmd and id into request id?

This should just be an opaque value for the backend. The frontend is
free how to create this value, it should be unique for every outstanding
request of a domU, however. It could be built from cmd and id in case
there can't be multiple requests with the same cmd/id combination
active in a domU.


juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu

2016-01-20 Thread Xiao Guangrong


Hi,

On 01/20/2016 06:15 PM, Haozhong Zhang wrote:


CCing QEMU vNVDIMM maintainer: Xiao Guangrong


Conceptually, an NVDIMM is just like a fast SSD which is linearly mapped
into memory.  I am still on the dom0 side of this fence.

The real question is whether it is possible to take an NVDIMM, split it
in half, give each half to two different guests (with appropriate NFIT
tables) and that be sufficient for the guests to just work.



Yes, one NVDIMM device can be split into multiple parts and assigned
to different guests, and QEMU is responsible to maintain virtual NFIT
tables for each part.


Either way, it needs to be a toolstack policy decision as to how to
split the resource.


Currently, we are using NVDIMM as a block device and a DAX-based filesystem
is created upon it in Linux so that file-related accesses directly reach
the NVDIMM device.

In KVM, If the NVDIMM device need to be shared by different VMs, we can
create multiple files on the DAX-based filesystem and assign the file to
each VMs. In the future, we can enable namespace (partition-like) for PMEM
memory and assign the namespace to each VMs (current Linux driver uses the
whole PMEM as a single namespace).

I think it is not a easy work to let Xen hypervisor recognize NVDIMM device
and manager NVDIMM resource.

Thanks!


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH LIBVIRT] libxl: Support cmdline= in xl config files

2016-01-20 Thread Ian Campbell
On Tue, 2016-01-19 at 21:46 -0700, Jim Fehlig wrote:
> On 01/19/2016 05:03 AM, Ian Campbell wrote:
> > I went to ping this but noticed that I had sent it to "jimfehlig" (i.e.
> > no
> > domain), so no wonder there was no reply!
> > 
> > To: line fixed here, let me know if you would prefer a resend.
> 
> That would be much appreciated, thanks!
> 
> > 
> > Ian.
> > 
> > On Wed, 2015-12-16 at 12:09 +, Ian Campbell wrote:
> > > ... and consolidate the cmdline/extra/root parsing to facilitate
> > > doing
> > > so.
> > > 
> > > The logic is the same as xl's parse_cmdline from the current xen.git
> > > master
> > > branch (e6f0e099d2c17de47fd86e817b1998db903cab61), except I was
> > > unable
> > > to figure out how/where to route the warning about ignoring
> > > root+extra if cmdline was specified.
> 
> I think VIR_WARN() would be appropriate.

Ok, will do, thanks.

> > > 
> > > Signed-off-by: Ian Campbell 
> > > ---
> > >  src/xenconfig/xen_xl.c | 62 ++
> > > 
> > > 
> > >  1 file changed, 37 insertions(+), 25 deletions(-)
> > > 
> > > diff --git a/src/xenconfig/xen_xl.c b/src/xenconfig/xen_xl.c
> > > index 91cdff6..ba8b938 100644
> > > --- a/src/xenconfig/xen_xl.c
> > > +++ b/src/xenconfig/xen_xl.c
> > > @@ -58,11 +58,45 @@ extern int xlu_disk_parse(XLU_Config *cfg,
> > >    libxl_device_disk *disk);
> > >  #endif
> > >  
> > > +static int xenParseCmdline(virConfPtr conf, char **r_cmdline)
> > > +{
> > > +char *cmdline = NULL;
> > > +const char *root = NULL, *extra = NULL, *buf = NULL;
> 
> In theory, these three don't need to be initialized since
> xenConfigGetString
> will do that. But in practice, I worry that Coverity might complain :-/.

It looks like some of the callers of xenConfigGetString initialise the
value to NULL, while others don't. I can't see any public libvirt scan
results to look if some of the ones which don't have been picked up or not.

I've just noticed also that the code I am moving/removing didn't initialise
to NULL, so I think I'll remove these initialisers.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [xen-unstable test] 78525: regressions - FAIL

2016-01-20 Thread osstest service owner
flight 78525 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/78525/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemut-win7-amd64  9 windows-installfail REGR. vs. 77892
 test-amd64-i386-xl-qemuu-win7-amd64  9 windows-installfail REGR. vs. 77892

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds15 guest-start/debian.repeat fail blocked in 77892
 test-amd64-i386-rumpuserxen-i386 10 guest-startfail like 77892
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install fail 
like 77892
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 77892
 test-amd64-amd64-libvirt-vhd  9 debian-di-installfail   like 77892

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-armhf-armhf-libvirt 14 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass
 test-armhf-armhf-libvirt-raw 13 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd   9 debian-di-installfail   never pass

version targeted for testing:
 xen  0c2f1283486f953604a282188a06e2db143f475d
baseline version:
 xen  f7347a282420a5edc74afb31e7c42c2765f24de5

Last test of basis77892  2016-01-12 11:30:40 Z7 days
Failing since 77945  2016-01-13 04:01:14 Z7 days   11 attempts
Testing same since78525  2016-01-19 13:38:51 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Ashwin Chaugule 
  Boris Ostrovsky 
  Brendan Gregg 
  Daniel De Graaf 
  Doug Goldstein 
  Edgar E. Iglesias 
  Feng Tang 
  Grant Likely 
  Hanjun Guo 
  Haozhong Zhang 
  Ian Campbell 
  Ian Jackson 
  Jan Beulich 
  Jonathan Creekmore 
  Juergen Gross 
  Kevin Tian 
  Konrad Rzeszutek Wilk 
  Len Brown 
  Parth Dixit 
  Paul Durrant 
  Rafael J. Wysocki 
  Razvan Cojocaru 
  Roger Pau Monne 
  Roger Pau Monné 
  Shannon Zhao 
  Tamas K Lengyel 
  Tomasz Nowicki 
  Wei Liu 
  Wei Liu  for tools bits

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  

[Xen-devel] OSSTEST: Re-blessing cubietruck-{picasso, gleizes, metzinger} for production use

2016-01-20 Thread Ian Campbell
I've been running a couple of adhoc production tests per day on these since
before Xmas and they haven't lost sight of their disks again.

TLDR; I think we should throw them back in the pool.

With the recent timeout fixes they are working as well as the production
cubietruck-braque.

There are two flakey tests test-armhf-armhf-xl-rtds and test-armhf-armhf-
libvirt-raw, but those appear to be much better than before the timeout
changes and not specific to these three boards since the fourth one looks
to behave much the same.

At first glance it looks like some later test steps might just need a bit
more time on CT too.

FWIW comparison of production vs commission runs can be seen at

http://osstest.test-lab.xenproject.org/~ianc/commission/history/test-armhf-armhf-xl-rtds/xen-unstable.html
vs
http://logs.test-lab.xenproject.org/osstest/results/history/test-armhf-armhf-xl-rtds/xen-unstable.html

and

http://osstest.test-lab.xenproject.org/~ianc/commission/history/test-armhf-armhf-libvirt-raw/xen-unstable.html
vs
http://logs.test-lab.xenproject.org/osstest/results/history/test-armhf-armhf-libvirt-raw/xen-unstable.html

(nb: osstest e23886aa8fe7 is when the timeout fixes went live)

Unfortunately the second one has always run on arndale-*, which doesn't
give any data on cubietruck.


Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [linux-mingo-tip-master test] 78581: regressions - FAIL

2016-01-20 Thread osstest service owner
flight 78581 linux-mingo-tip-master real [real]
http://logs.test-lab.xenproject.org/osstest/logs/78581/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-rumpuserxen-i386 10 guest-start   fail REGR. vs. 60684

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop   fail blocked in 60684

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel 14 guest-saverestorefail  never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 13 xen-boot/l1   fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1 fail never pass

version targeted for testing:
 linux3d1c05c9c775c565b6f40179e42a5cf8b6043d4e
baseline version:
 linux69f75ebe3b1d1e636c4ce0a0ee248edacc69cbe0

Last test of basis60684  2015-08-13 04:21:46 Z  160 days
Failing since 60712  2015-08-15 18:33:48 Z  157 days  109 attempts
Testing same since78581  2016-01-20 04:20:47 Z0 days1 attempts

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm pass
 test-amd64-amd64-libvirt-xsm pass
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-xl-xsm  pass
 test-amd64-i386-xl-xsm   pass
 test-amd64-amd64-qemuu-nested-amdfail
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64pass
 test-amd64-i386-xl-qemut-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass
 test-amd64-amd64-rumpuserxen-amd64   pass
 test-amd64-amd64-xl-qemut-win7-amd64 fail
 test-amd64-i386-xl-qemut-win7-amd64  pass
 test-amd64-amd64-xl-qemuu-win7-amd64 fail
 test-amd64-i386-xl-qemuu-win7-amd64  fail
 test-amd64-amd64-xl-credit2  pass
 test-amd64-i386-freebsd10-i386   pass
 test-amd64-i386-rumpuserxen-i386 fail
 tes

Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu

2016-01-20 Thread Haozhong Zhang
On 01/20/16 01:46, Jan Beulich wrote:
> >>> On 20.01.16 at 06:31,  wrote:
> > The primary reason of current solution is to reuse existing NVDIMM
> > driver in Linux kernel.
>

CC'ing QEMU vNVDIMM maintainer: Xiao Guangrong

> Re-using code in the Dom0 kernel has benefits and drawbacks, and
> in any event needs to depend on proper layering to remain in place.
> A benefit is less code duplication between Xen and Linux; along the
> same lines a drawback is code duplication between various Dom0
> OS variants.
>

Not clear about other Dom0 OS. But for Linux, it already has a NVDIMM
driver since 4.2.

> > One responsibility of this driver is to discover NVDIMM devices and
> > their parameters (e.g. which portion of an NVDIMM device can be mapped
> > into the system address space and which address it is mapped to) by
> > parsing ACPI NFIT tables. Looking at the NFIT spec in Sec 5.2.25 of
> > ACPI Specification v6 and the actual code in Linux kernel
> > (drivers/acpi/nfit.*), it's not a trivial task.
> 
> To answer one of Kevin's questions: The NFIT table doesn't appear
> to require the ACPI interpreter. They seem more like SRAT and SLIT.

Sorry, I made a mistake in another reply. NFIT does not contain
anything requiring ACPI interpreter. But there are some _DSM methods
for NVDIMM in SSDT, which needs ACPI interpreter.

> Also you failed to answer Kevin's question regarding E820 entries: I
> think NVDIMM (or at least parts thereof) get represented in E820 (or
> the EFI memory map), and if that's the case this would be a very
> strong hint towards management needing to be in the hypervisor.
>

Legacy NVDIMM devices may use E820 entries or other ad-hoc ways to
announce their locations, but newer ones that follow ACPI v6 spec do
not need E820 any more and only need ACPI NFIT (i.e. firmware may not
build E820 entries for them).

The current linux kernel can handle both legacy and new NVDIMM devices
and provide the same block device interface for them.

> > Secondly, the driver implements a convenient block device interface to
> > let software access areas where NVDIMM devices are mapped. The
> > existing vNVDIMM implementation in QEMU uses this interface.
> > 
> > As Linux NVDIMM driver has already done above, why do we bother to
> > reimplement them in Xen?
> 
> See above; a possibility is that we may need a split model (block
> layer parts on Dom0, "normal memory" parts in the hypervisor.
> Iirc the split is being determined by firmware, and hence set in
> stone by the time OS (or hypervisor) boot starts.
>

For the "normal memory" parts, do you mean parts that map the host
NVDIMM device's address space range to the guest? I'm going to
implement that part in hypervisor and expose it as a hypercall so that
it can be used by QEMU.

Haozhong

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling

2016-01-20 Thread Dario Faggioli
On Wed, 2016-01-20 at 01:35 -0700, Jan Beulich wrote:
> > > > On 20.01.16 at 08:49,  wrote:
> > > 
> > We need to call arch_vcpu_block() before
> > local_events_need_delivery(),
> > since VT-d hardware can issue notification event when we are in
> > arch_vcpu_block(), it that happens, 'ON' bit is set, then we will
> > check
> > it in local_events_need_delivery(). If we do arch_vcpu_block() in
> > the
> > else part, we cannot handle this. This is one reason I can recall,
> > I am
> > not sure whether there are other concerns since it has been really
> > a long time. The current implementation is fully discussed with
> > scheduler
> > maintainers.
> 
> Okay, this all makes sense. 
>
Yes, I proposed / asked about that here:
http://bugs.xenproject.org/xen/mid/%3c1445948204.2937.130.ca...@citrix.com%3E

As Feng says, the whole idea is doing one last check before actually
blocking, to see whether we can avoid doing so, if an event has arrived
in the meantime.

It can be seen as an optimization (whether a really effective one, I
can't tell). In fact, an event can well come just immediately after
local_events_need_delivery() has returned false!

We somehow do the same thing already in vcpu_block(), and consistency
with that was what convinced me that we can live with Feng's code as it
is in this patch.

> It's just that I don't see how the ON
> bit would get checked by local_events_need_delivery(). But that
> may in turn be because, with the long turnaround, I've lost some
> of the details of the rest of this series.
> 
I think what Feng means is "if that happens [== that an event arrives],
'ON' bit is set, then we will check it [== whether an event has
arrived] in local_events_need_delivery()."

If *no* event has arrived, no need to do anything particular, we can
just block. If an event *has* arrived in time for
local_events_need_delivery() to see it, we avoid blocking. We do need
to rollback what arch_vcpu_block() did, and that is done while leaving
the hypervisor, as soon as we realize we haven't blocked.

If we put the hook in 'else', we'd always go down the full blocking
path, even if an event became pending while arranging for that, and
we'll have to be woken again, as soon as the event is handled. How
effective an optimization this is depends on how probable and frequent
it is that we get interrupts during the execution of the hook, which,
as said, it's not easy to either just tell or measure.

Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/3] tools: introduce parameter max_ranges.

2016-01-20 Thread Yu, Zhang



On 1/20/2016 6:18 PM, Paul Durrant wrote:

-Original Message-
From: Ian Campbell [mailto:ian.campb...@citrix.com]
Sent: 20 January 2016 10:16
To: Kevin Tian; Yu, Zhang; Wei Liu; Paul Durrant
Cc: Keir (Xen.org); jbeul...@suse.com; Andrew Cooper; xen-
de...@lists.xen.org; Lv, Zhiyuan; Stefano Stabellini
Subject: Re: [Xen-devel] [PATCH 3/3] tools: introduce parameter
max_ranges.

On Wed, 2016-01-20 at 03:58 +, Tian, Kevin wrote:

From: Yu, Zhang [mailto:yu.c.zh...@linux.intel.com]
Sent: Wednesday, January 20, 2016 11:33 AM

As a feature this write-protection has nothing to be GPU
virtualization specific.
In the future the same mediated pass-through idea used in XenGT may
be
used on other I/O devices which need to shadow some structure w/
requirement
to write-protect guest memory. So it's not good to tie this to either
XenGT
or GTT.


Thank you, Kevin.
Well, if this parameter is not supposed to be xengt specific, we do not
need to connect it with any xengt flag such as ."vgt=1" or "GVT-g=1".
Hence the user will have to configure the max_wp_ram_ranges himself,
right?



Not always. The option can be configured manually by the user, or
automatically set in the code when "vgt=1" is recognized.


Is the latter approach not always sufficient? IOW, if it can be done
automatically, why would the user need to tweak it?



I think latter is sufficient for now. We always have the option of adding a 
specific wp_ram_ranges parameter in future if there is a need


Thank you all for your reply.
Well, I believe the latter option is only sufficient for most
usage models on BDW due to rangeset's ability to merge continuous
pages into one range, but there might be some extreme cases, e.g.
too many graphic related applications in one VM, which create a
great deal of per-process graphic translation tables. And also,
future cpu platforms might provide even more PPGGTs. So, I suggest
we use this max_wp_ram_ranges, and give the control to the system
administrator. Besides, like Kevin said, XenGT's mediated pass-thru
idea can also be adopted to other devices, and this parameter may
also help.
Also, we have plans to upstream the tool-stack changes later this
year. If this max_wp_ram_ranges is not convenient, we can introduce
a method to automatically set its default value.

B.R.
Yu


   Paul


Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling

2016-01-20 Thread Wu, Feng


> -Original Message-
> From: Dario Faggioli [mailto:dario.faggi...@citrix.com]
> Sent: Wednesday, January 20, 2016 7:13 PM
> To: Jan Beulich ; Wu, Feng 
> Cc: Andrew Cooper ; George Dunlap
> ; Tian, Kevin ; xen-
> de...@lists.xen.org; Keir Fraser 
> Subject: Re: [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling
> 
> On Wed, 2016-01-20 at 01:35 -0700, Jan Beulich wrote:
> > > > > On 20.01.16 at 08:49,  wrote:
> > > >
> 
> If we put the hook in 'else', we'd always go down the full blocking
> path, even if an event became pending while arranging for that, and
> we'll have to be woken again, as soon as the event is handled. How
> effective an optimization this is depends on how probable and frequent
> it is that we get interrupts during the execution of the hook, which,
> as said, it's not easy to either just tell or measure.

Thanks a lot for your explanation, Dario.

Thanks,
Feng
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling

2016-01-20 Thread Wu, Feng


> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Wednesday, January 20, 2016 4:35 PM
> To: Wu, Feng 
> Cc: Andrew Cooper ; Dario Faggioli
> ; George Dunlap ;
> Tian, Kevin ; xen-devel@lists.xen.org; Keir Fraser
> 
> Subject: RE: [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling
> 
> >>> On 20.01.16 at 08:49,  wrote:
> >> From: Jan Beulich [mailto:jbeul...@suse.com]
> >> Sent: Monday, January 18, 2016 11:14 PM
> >> >>> On 03.12.15 at 09:35,  wrote:
> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> >> > @@ -83,7 +83,131 @@ static int vmx_msr_write_intercept(unsigned int
> >> msr, uint64_t msr_content);
> >> > +ASSERT(v->arch.hvm_vmx.pi_block_cpu == NR_CPUS);
> >> > +
> >> > +/*
> >> > + * The vCPU is blocking, we need to add it to one of the per pCPU
> lists.
> >> > + * We save v->processor to v->arch.hvm_vmx.pi_block_cpu and use
> it
> >> for
> >> > + * the per-CPU list, we also save it to posted-interrupt descriptor
> and
> >> > + * make it as the destination of the wake-up notification event.
> >> > + */
> >> > +v->arch.hvm_vmx.pi_block_cpu = v->processor;
> >> > +
> >> > +spin_lock_irqsave(&per_cpu(pi_blocked_vcpu_lock,
> >> > +  v->arch.hvm_vmx.pi_block_cpu), flags);
> >> > +list_add_tail(&v->arch.hvm_vmx.pi_blocked_vcpu_list,
> >> > +  &per_cpu(pi_blocked_vcpu, v-
> >arch.hvm_vmx.pi_block_cpu));
> >> > +spin_unlock_irqrestore(&per_cpu(pi_blocked_vcpu_lock,
> >> > +   v->arch.hvm_vmx.pi_block_cpu), flags);
> >> > +
> >> > +ASSERT(!pi_test_sn(pi_desc));
> >> > +
> >> > +/*
> >> > + * We don't need to set the 'NDST' field, since it should point to
> >> > + * the same pCPU as v->processor.
> >> > + */
> >> > +
> >> > +write_atomic(&pi_desc->nv, pi_wakeup_vector);
> >>
> >> Stray blank line between comment and corresponding code.
> >>
> >> Also the ASSERT() is rather more disconnected from the write
> >> than seems desirable: Wouldn't it be better to cmpxchg() the
> >> whole 32-bit value, validating that SN is clear in the result?
> >
> > Not quite understand this. The control field is 64 bits, do you
> > mean cmpxchg() the whole 64-bit value like follows:
> >
> > +do {
> > +old.control = new.control = pi_desc->control;
> > +new.nv = pi_wakeup_vector;
> > +} while ( cmpxchg(&pi_desc->control, old.control, new.control)
> > +  != old.control );
> >
> > This a somewhat like the implementation in the earlier versions,
> > why do you want to change it back?
> 
> Did you read what I've said? I'm worried about the disconnect of
> assertion and update: You really care about SN being clear
> _after_ the update aiui. 

No, the ASSERT has no connection with the update. the SN bit should
be clear before updating pi_desc->nv, adding ASSERT here is just kind
of protection code.

> And since NDST doesn't change, a 32-bit
> CMPXCHG would seem to suffice.
> 
> >> > +pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu;
> >> > +if ( pi_block_cpu == NR_CPUS )
> >> > +return;
> >>
> >> Are this condition and the one in the immediately preceding if()
> >> connected in any way?
> >
> > I am not sure I understand this correctly. Which one is
> > " the one in the immediately preceding if()"?
> 
> If you hadn't ripped out too much of the context, it would be a
> matter of pointing you back up. Now I have to re-quote the full
> code:
> 
> +if ( pi_desc->nv != posted_intr_vector )
> +write_atomic(&pi_desc->nv, posted_intr_vector);
> +
> +/* the vCPU is not on any blocking list. */
> +pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu;
> +if ( pi_block_cpu == NR_CPUS )
> +return;
> 
> These are the two if()-s the question is about.
> 
> >> I.e. could the latter perhaps become an
> >> ASSERT() by connecting the two blocks suitably? I'm simply
> >> trying to limit the number of different cases to consider mentally
> >> when looking at this code ...
> >
> > If we get a true from above check, it means the vcpu was removed by
> > pi_wakeup_interrupt(), then we just need to return. If we get a false,
> > we will acquire the spinlock as the following code does, there are two
> > scenarios:
> > - pi_wakeup_interrupt() acquires the spinlock before us, then when
> > we get the lock, 'v->arch.hvm_vmx.pi_block_cpu' has already been
> > set to NR_CPUS by pi_wakeup_interrupt(), we need do nothing.
> > - We get the spinlock before pi_wakeup_interrupt(), this time we need
> > to remove the vCPU from the list and set 'v->arch.hvm_vmx.pi_block_cpu'
> > to NR_CPUS.
> 
> This is all only about the second condition, but the question really is
> whether one being true or false may imply the result of other. In
> which case this would better be ASSERT()ed than handled by two
> conditionals.

Thanks for the explanation. I don't think there are any connections
Betwee

Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface

2016-01-20 Thread Peng Fan
Hi Jan,

On Wed, Jan 20, 2016 at 03:14:20AM -0700, Jan Beulich wrote:
 On 20.01.16 at 09:31,  wrote:
>> +/*
>> + * Backend response
>> + *
>> + * cmd: command for operation on clk, same with the cmd in request.
>> + * id: clk id, same with the id in request.
>> + * success: indicate failure or success for the cmd.
>> + * rate: clock rate. Used for get rate.
>> + *
>> + * cmd and id are filled by backend and passed to frontend to
>> + * let frontend check whether the response is for the current
>> + * request or not.
>> + */
>> +struct xen_clkif_response {
>> +uint32_t cmd;
>> +uint32_t id;
>> +uint32_t success;
>> +uint64_t rate;
>> +};
>
>This isn't 32-/64-bit clean. Question is whether echoing cmd is really
>needed.

As Juergen suggested, use a request id. I'll fix this in V3.
32-/64-bit unclean, I can not get you here (:

>
>Also naming a field "success" is pretty odd - is this mean to be a
>boolean? Better name it e.g. status, allowing for different (error)
>indicators.

As you suggested, how about `int status`? And in this clkif.h,
define different status value, such as `#define XEN_CLK_PREPARE_OK 0,
#define XEN_CLK_PREPARE_FAILURE -1`. The frontend and backend should
be aware of the status value.

>
>And what I'm missing throughout the file is some description of how
>clock events (interrupts?) are actually meant to make it into the
>guest.

I have a simple implementation v1 patch for linux, 
http://lists.xen.org/archives/html/xen-devel/2016-01/msg01860.html.
You can see how it works.

Thanks,
Peng.

>
>Jan
>

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface

2016-01-20 Thread Peng Fan
Hi Juergen,

On Wed, Jan 20, 2016 at 11:40:55AM +0100, Juergen Gross wrote:
>On 20/01/16 10:25, Peng Fan wrote:
>> Hi Juergen,
>> 
>> On Wed, Jan 20, 2016 at 10:05:15AM +0100, Juergen Gross wrote:
>>> On 20/01/16 09:31, Peng Fan wrote:
 Introduce pvclk interface which is useful when doing device passthrough
 on ARM platform.
>>>
>>> ...
>>>
 +/*
 + * Frontend request
 + *
 + * cmd: command for operation on clk, should be XEN_CLK_[xx],
 + *excluding XEN_CLK_END. id is the
 + * id: clk id
 + * rate: clock rate. Used for set rate.
>>>
>>> Which unit? Hz?
>> 
>> Yeah. Hz. I'll add comments in V3.
>> 
>>>
 + */
 +struct xen_clkif_request {
 +  uint32_t cmd;
 +  uint32_t id;
 +  uint64_t rate;
 +};
 +typedef struct xen_clkif_request xen_clkif_request_t;
 +
 +/*
 + * Backend response
 + *
 + * cmd: command for operation on clk, same with the cmd in request.
 + * id: clk id, same with the id in request.
 + * success: indicate failure or success for the cmd.
>>>
>>> Values?
>> 
>> I'd like to let the frontend/backend driver to determine the value.
>> In my implementation for linux, if the backend API supports return value,
>> I'll fill the return value to success entry. If the backend API returns
>> void, I'll just fill 0 to success entry.
>
>So please specify "0" to mean success and add some sensible error
>values. There might be multiple frontend- and/or backend-variants which
>all must agree using the same interface.

Will change this to `int status`, as also observed by Jan.
I'll also define macros such as "#define XEN_CLK_ENABLE_OK 0", "#define 
XEN_CLK_ENABLE_FAILURE -1"

>
 + * rate: clock rate. Used for get rate.
 + *
 + * cmd and id are filled by backend and passed to frontend to
 + * let frontend check whether the response is for the current
 + * request or not.
>>>
>>> I'd rather let the frontend add a request Id to the request which
>>> will be echoed here instead cmd and clock Id.
>> 
>> If using request id, I think we can encode cmd and id into request id?
>
>This should just be an opaque value for the backend. The frontend is
>free how to create this value, it should be unique for every outstanding
>request of a domU, however. It could be built from cmd and id in case
>there can't be multiple requests with the same cmd/id combination
>active in a domU.

Thanks for teaching me on this. So the request id should be globally unique
for backend. So "random value(generated in frontend) + frontend domid" is ok 
for this?

Thanks,
Peng.
>
>
>juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] OSSTEST: Re-blessing cubietruck-{picasso, gleizes, metzinger} for production use

2016-01-20 Thread Ian Jackson
Ian Campbell writes ("OSSTEST: Re-blessing 
cubietruck-{picasso,gleizes,metzinger} for production use"):
> I've been running a couple of adhoc production tests per day on these since
> before Xmas and they haven't lost sight of their disks again.
> 
> TLDR; I think we should throw them back in the pool.

Great.

> With the recent timeout fixes they are working as well as the production
> cubietruck-braque.
> 
> There are two flakey tests test-armhf-armhf-xl-rtds and test-armhf-armhf-
> libvirt-raw, but those appear to be much better than before the timeout
> changes and not specific to these three boards since the fourth one looks
> to behave much the same.
> 
> At first glance it looks like some later test steps might just need a bit
> more time on CT too.

Maybe we should have target_adjust_timeout honour a host property to
multiply timeouts by some factor.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 4/4] x86/PV: enable the emulated PIT

2016-01-20 Thread Ian Campbell
On Mon, 2016-01-18 at 18:58 +0100, Roger Pau Monné wrote:
> You certainly are more familiar with the migration code than me, but
> wouldn't it be enough to add a new field to libxl_domain_build_info
> (uint32_t emulation_flags), and teach
> libxl_domain_build_info_gen_json/libxl__domain_build_info_parse_json
>  how to properly parse it?

Is libxl not already aware whether a domain is to be dmlite vs pv or hvm
based on the domain config?

As such can't it figure out the hardcoded set to use already without
actually needing to expose this in libxl API (until we actually have a
desire to expose such nobs to users)? Can't this work on resume just like
it does on create?

> This however raises the question about how to signal that the field is
> not initialised, because 0 is a valid value (maybe ~0)?

In libxl's API we tend to try and avoid opaque hex numbers, so the natural
result would be a struct full of libxl_defbool options, which IIRC already
get properly propagated and have handling for defaulting already in place.

Ian.
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 1/5] libelf: rewrite symtab/strtab loading for Dom0

2016-01-20 Thread Roger Pau Monne
Current implementation of elf_load_bsdsyms is broken when loading inside of
a HVM guest, because it assumes elf_memcpy_safe is able to write into guest
memory space, which it is not.

Take the oportunity to do some cleanup and properly document how
elf_{parse/load}_bsdsyms works. The new implementation uses elf_load_image
when dealing with data that needs to be copied to the guest memory space.
Also reduce the number of section headers copied to the minimum necessary.

Signed-off-by: Roger Pau Monné 
---
Cc: Ian Campbell 
Cc: Ian Jackson 
Cc: Jan Beulich 
Cc: Keir Fraser 
Cc: Tim Deegan 
---
 xen/common/libelf/libelf-loader.c | 213 --
 1 file changed, 158 insertions(+), 55 deletions(-)

diff --git a/xen/common/libelf/libelf-loader.c 
b/xen/common/libelf/libelf-loader.c
index 6f42bea..9552d4c 100644
--- a/xen/common/libelf/libelf-loader.c
+++ b/xen/common/libelf/libelf-loader.c
@@ -153,7 +153,7 @@ void elf_parse_bsdsyms(struct elf_binary *elf, uint64_t 
pstart)
 {
 uint64_t sz;
 ELF_HANDLE_DECL(elf_shdr) shdr;
-unsigned i, type;
+unsigned int i;
 
 if ( !ELF_HANDLE_VALID(elf->sym_tab) )
 return;
@@ -164,20 +164,33 @@ void elf_parse_bsdsyms(struct elf_binary *elf, uint64_t 
pstart)
 sz = sizeof(uint32_t);
 
 /* Space for the elf and elf section headers */
-sz += (elf_uval(elf, elf->ehdr, e_ehsize) +
-   elf_shdr_count(elf) * elf_uval(elf, elf->ehdr, e_shentsize));
+sz += elf_uval(elf, elf->ehdr, e_ehsize) +
+  3 * elf_uval(elf, elf->ehdr, e_shentsize);
 sz = elf_round_up(elf, sz);
 
-/* Space for the symbol and string tables. */
+/* Space for the symbol and string table. */
 for ( i = 0; i < elf_shdr_count(elf); i++ )
 {
 shdr = elf_shdr_by_index(elf, i);
 if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) )
 /* input has an insane section header count field */
 break;
-type = elf_uval(elf, shdr, sh_type);
-if ( (type == SHT_STRTAB) || (type == SHT_SYMTAB) )
-sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size));
+
+if ( elf_uval(elf, shdr, sh_type) != SHT_SYMTAB )
+continue;
+
+sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size));
+shdr = elf_shdr_by_index(elf, elf_uval(elf, shdr, sh_link));
+
+if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) )
+/* input has an insane section header count field */
+break;
+
+if ( elf_uval(elf, shdr, sh_type) != SHT_STRTAB )
+/* Invalid symtab -> strtab link */
+break;
+
+sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size));
 }
 
 elf->bsd_symtab_pstart = pstart;
@@ -186,13 +199,31 @@ void elf_parse_bsdsyms(struct elf_binary *elf, uint64_t 
pstart)
 
 static void elf_load_bsdsyms(struct elf_binary *elf)
 {
-ELF_HANDLE_DECL(elf_ehdr) sym_ehdr;
-unsigned long sz;
-elf_ptrval maxva;
-elf_ptrval symbase;
-elf_ptrval symtab_addr;
-ELF_HANDLE_DECL(elf_shdr) shdr;
-unsigned i, type;
+/*
+ * Header that is placed at the end of the kernel and allows
+ * the OS to find where the symtab and strtab have been loaded.
+ * It mimics a valid ELF file header, although it only contains
+ * a symtab and a strtab section.
+ *
+ * NB: according to the ELF spec there's only ONE symtab per ELF
+ * file, and accordingly we will only load the corresponding
+ * strtab, so we only need three section headers in our fake ELF
+ * header (first section header is always a dummy).
+ */
+struct __packed {
+elf_ehdr header;
+elf_shdr section[3];
+} symbol_header;
+
+ELF_HANDLE_DECL(elf_ehdr) header_handle;
+unsigned long shdr_size;
+uint32_t symsize;
+ELF_HANDLE_DECL(elf_shdr) section_handle;
+ELF_HANDLE_DECL(elf_shdr) image_handle;
+unsigned int i, link;
+elf_ptrval header_base;
+elf_ptrval symtab_base;
+elf_ptrval strtab_base;
 
 if ( !elf->bsd_symtab_pstart )
 return;
@@ -200,64 +231,136 @@ static void elf_load_bsdsyms(struct elf_binary *elf)
 #define elf_hdr_elm(_elf, _hdr, _elm, _val) \
 do {\
 if ( elf_64bit(_elf) )  \
-elf_store_field(_elf, _hdr, e64._elm, _val);  \
+(_hdr).e64._elm = _val; \
 else\
-elf_store_field(_elf, _hdr, e32._elm, _val);  \
+(_hdr).e32._elm = _val; \
 } while ( 0 )
 
-symbase = elf_get_ptr(elf, elf->bsd_symtab_pstart);
-symtab_addr = maxva = symbase + sizeof(uint32_t);
+#define SYMTAB_INDEX1
+#define STRTAB_INDEX2
 
-/* Set up Elf header. */
-sym_ehdr = ELF_MAKE_HANDLE(elf_ehdr, symtab_addr);
-sz = elf_uval(elf, elf->ehdr, e_ehsize);
-elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(sym_ehdr), 
ELF_HANDLE_PTRVAL(elf->ehdr), s

[Xen-devel] [PATCH v3 2/5] libxl: introduce LIBXL_VGA_INTERFACE_TYPE_UNDEF

2016-01-20 Thread Roger Pau Monne
And use it as the default value for the VGA kind. This allows libxl to set
it to the default value later on when the domain type is known. For HVM
guests the default value is LIBXL_VGA_INTERFACE_TYPE_CIRRUS while for
HVMlite the default value is LIBXL_VGA_INTERFACE_TYPE_NONE.

Signed-off-by: Roger Pau Monné 
---
Cc: Ian Jackson 
Cc: Ian Campbell 
Cc: Wei Liu 
---
 tools/libxl/libxl_create.c  | 8 ++--
 tools/libxl/libxl_dm.c  | 6 ++
 tools/libxl/libxl_types.idl | 3 ++-
 3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index e491d83..61a4001 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -206,8 +206,12 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
 if (b_info->u.hvm.mmio_hole_memkb == LIBXL_MEMKB_DEFAULT)
 b_info->u.hvm.mmio_hole_memkb = 0;
 
-if (!b_info->u.hvm.vga.kind)
-b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_CIRRUS;
+if (b_info->u.hvm.vga.kind == LIBXL_VGA_INTERFACE_TYPE_UNDEF) {
+if (b_info->device_model_version == 
LIBXL_DEVICE_MODEL_VERSION_NONE)
+b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_NONE;
+else
+b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_CIRRUS;
+}
 
 if (!b_info->u.hvm.hdtype)
 b_info->u.hvm.hdtype = LIBXL_HDTYPE_IDE;
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index a088d71..9aa0cc8 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -531,6 +531,9 @@ static int libxl__build_device_model_args_old(libxl__gc *gc,
 break;
 case LIBXL_VGA_INTERFACE_TYPE_QXL:
 break;
+default:
+LOG(ERROR, "Invalid emulated video card specified");
+return ERROR_INVAL;
 }
 
 if (b_info->u.hvm.boot) {
@@ -970,6 +973,9 @@ static int libxl__build_device_model_args_new(libxl__gc *gc,
 GCSPRINTF("qxl-vga,vram_size_mb=%"PRIu64",ram_size_mb=%"PRIu64,
 (b_info->video_memkb/2/1024), (b_info->video_memkb/2/1024) ) );
 break;
+default:
+LOG(ERROR, "Invalid emulated video card specified");
+return ERROR_INVAL;
 }
 
 if (b_info->u.hvm.boot) {
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 9ad7eba..92c95e5 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -204,11 +204,12 @@ libxl_shutdown_reason = Enumeration("shutdown_reason", [
 ], init_val = "LIBXL_SHUTDOWN_REASON_UNKNOWN")
 
 libxl_vga_interface_type = Enumeration("vga_interface_type", [
+(0, "UNDEF"),
 (1, "CIRRUS"),
 (2, "STD"),
 (3, "NONE"),
 (4, "QXL"),
-], init_val = "LIBXL_VGA_INTERFACE_TYPE_CIRRUS")
+], init_val = "LIBXL_VGA_INTERFACE_TYPE_UNDEF")
 
 libxl_vendor_device = Enumeration("vendor_device", [
 (0, "NONE"),
-- 
1.9.5 (Apple Git-50.3)


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 3/5] libxl: initialise the build info before calling prepare_config

2016-01-20 Thread Roger Pau Monne
libxl__arch_domain_prepare_config has access to the libxl_domain_build_info
struct, so make sure it's properly initialised.

Signed-off-by: Roger Pau Monné 
---
Cc: Ian Jackson 
Cc: Ian Campbell 
Cc: Wei Liu 
---
NB: libxl__arch_domain_prepare_config is called from libxl__domain_make.
---
 tools/libxl/libxl_create.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 61a4001..ba4c9e8 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -892,6 +892,12 @@ static void initiate_domain_create(libxl__egc *egc,
 goto error_out;
 }
 
+ret = libxl__domain_build_info_setdefault(gc, &d_config->b_info);
+if (ret) {
+LOG(ERROR, "Unable to set domain build info defaults");
+goto error_out;
+}
+
 ret = libxl__domain_make(gc, d_config, &domid, &state->config);
 if (ret) {
 LOG(ERROR, "cannot make domain: %d", ret);
@@ -903,12 +909,6 @@ static void initiate_domain_create(libxl__egc *egc,
 dcs->guest_domid = domid;
 dcs->dmss.dm.guest_domid = 0; /* means we haven't spawned */
 
-ret = libxl__domain_build_info_setdefault(gc, &d_config->b_info);
-if (ret) {
-LOG(ERROR, "Unable to set domain build info defaults");
-goto error_out;
-}
-
 if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_HVM &&
 (libxl_defbool_val(d_config->b_info.u.hvm.nested_hvm) &&
  libxl_defbool_val(d_config->b_info.u.hvm.altp2m))) {
-- 
1.9.5 (Apple Git-50.3)


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 5/5] libxl: add options to enable/disable emulated devices

2016-01-20 Thread Roger Pau Monne
Allow enabling or disabling emulated devices from the libxl domain
configuration file. For HVM guests with a device model all the emulated
devices are enabled. For HVM guests without a device model no devices are
enabled by default, although they can be enabled using the options provided.
The arbiter of whether a combination is posible or not is always Xen, libxl
doesn't do any kind of check.

This set of options is also propagated inside of the libxl migration record
as part of the contents of the libxl_domain_build_info struct.

Signed-off-by: Roger Pau Monné 
---
Cc: Ian Jackson 
Cc: Ian Campbell 
Cc: Wei Liu 
---
 docs/man/xl.cfg.pod.5   | 39 +++
 tools/libxl/libxl.h | 11 +++
 tools/libxl/libxl_create.c  | 21 -
 tools/libxl/libxl_types.idl |  6 ++
 tools/libxl/libxl_x86.c | 33 -
 tools/libxl/xl_cmdimpl.c|  7 +++
 6 files changed, 111 insertions(+), 6 deletions(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 8899f75..46d4529 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -1762,6 +1762,45 @@ See F for more 
information.
 
 =back
 
+=head3 HVM without a device model options
+
+This options can be used to change the set of emulated devices provided
+to guests without a device model. Note that Xen might not support all
+possible combinations. By default HVM guests without a device model
+don't have any of them enabled.
+
+=over 4
+
+=item B
+
+Enables or disables the Local APIC.
+
+=item B
+
+Enables or disables the IO APIC.
+
+=item B
+
+Enables or disables the RTC.
+
+=item B
+
+Enables or disables the ACPI power management timer and control interfaces.
+
+=item B
+
+Enables or disables the PIC.
+
+=item B
+
+Enables or disables the PIT.
+
+=item B
+
+Enables or disables the HPET.
+
+=back
+
 =head2 Device-Model Options
 
 The following options control the selection of the device-model.  This
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 7114491..fc4ff1d 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -876,6 +876,17 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, 
libxl_mac *src);
  */
 #define LIBXL_HAVE_DEVICE_MODEL_VERSION_NONE 1
 
+/*
+ * LIBXL_HAVE_EMULATED_DEVS_OPTIONS
+ *
+ * In the case that LIBXL_HAVE_EMULATED_DEVS_OPTIONS is set libxl
+ * allows enabling or disabling emulated devices for HVM guests
+ * without a device model. The following fields are added to the
+ * hvm structure inside of libxl_domain_build_info: lapic, ioapic,
+ * rtc, power_management, pic, pit.
+ */
+#define LIBXL_HAVE_EMULATED_DEVS_OPTIONS 1
+
 typedef char **libxl_string_list;
 void libxl_string_list_dispose(libxl_string_list *sl);
 int libxl_string_list_length(const libxl_string_list *sl);
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index ba4c9e8..8a76a6b 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -295,13 +295,32 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
 libxl_defbool_setdefault(&b_info->u.hvm.acpi_s4,true);
 libxl_defbool_setdefault(&b_info->u.hvm.nx, true);
 libxl_defbool_setdefault(&b_info->u.hvm.viridian,   false);
-libxl_defbool_setdefault(&b_info->u.hvm.hpet,   true);
 libxl_defbool_setdefault(&b_info->u.hvm.vpt_align,  true);
 libxl_defbool_setdefault(&b_info->u.hvm.nested_hvm, false);
 libxl_defbool_setdefault(&b_info->u.hvm.altp2m, false);
 libxl_defbool_setdefault(&b_info->u.hvm.usb,false);
 libxl_defbool_setdefault(&b_info->u.hvm.xen_platform_pci,   true);
 
+if (b_info->type == LIBXL_DOMAIN_TYPE_HVM &&
+b_info->device_model_version == LIBXL_DEVICE_MODEL_VERSION_NONE)
+{
+libxl_defbool_setdefault(&b_info->u.hvm.hpet,   false);
+libxl_defbool_setdefault(&b_info->u.hvm.lapic,  false);
+libxl_defbool_setdefault(&b_info->u.hvm.ioapic, false);
+libxl_defbool_setdefault(&b_info->u.hvm.rtc,false);
+libxl_defbool_setdefault(&b_info->u.hvm.power_management,   false);
+libxl_defbool_setdefault(&b_info->u.hvm.pic,false);
+libxl_defbool_setdefault(&b_info->u.hvm.pit,false);
+} else {
+libxl_defbool_setdefault(&b_info->u.hvm.hpet,   true);
+libxl_defbool_setdefault(&b_info->u.hvm.lapic,  true);
+libxl_defbool_setdefault(&b_info->u.hvm.ioapic, true);
+libxl_defbool_setdefault(&b_info->u.hvm.rtc,true);
+libxl_defbool_setdefault(&b_info->u.hvm.power_management,   true);
+libxl_defbool_setdefault(&b_info->u.hvm.pic,true);
+libxl_defbool_setdefault

[Xen-devel] [PATCH v3 0/5] HVMlite: DomU fixes and a Dom0 preparatory patch

2016-01-20 Thread Roger Pau Monne
Hello, 

The series is sorted in the following way:
 - Patch 1/5 is a preparatory patch for Dom0 HVMlite support.
 - Patch 4/5 is a fix from a fallout introduced by the HVMlite series, which 
   inadvertently disabled the emulated PIT that was added to PV(H) guests.
 - Patches 2, 3 and 5 expand the fields of the libxl_domain_build_info 
   structure in order to store whether emulated devices inside of Xen are 
   enabled or not.
 - NOTE: patch 4/5 and 5/5 should be committed together. I haven't squashed 
   them together in order to ease the review, but I can do so in the next 
   iteration if requested.

The full series can also be found in my git tree:

http://xenbits.xen.org/gitweb/?p=people/royger/xen.git;a=shortlog;h=refs/heads/hvmlite_fixes_v3

I hope this will help review of patch 1/5, which is a significant rework, 
and will probably be better reviewed by looking at the resulting code.

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 4/5] x86/PV: allow PV guests to have an emulated PIT

2016-01-20 Thread Roger Pau Monne
This fixes the fallout from the HVMlite series, that removed the emulated
PIT from PV(H) guests. Also, this patch forces the hardware domain to
always have an emulated PIT, regardless of whether the toolstack specified
one or not.

Signed-off-by: Roger Pau Monné 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
Changes since v2:
 - Force the emulated PIT to always be enabled for the hardware domain.
 - Change indentation of the valid set of emulation bitmaps check.
---
 xen/arch/x86/domain.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index e70c125..78df5ae 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -542,8 +542,11 @@ int arch_domain_create(struct domain *d, unsigned int 
domcr_flags,
d->domain_id, config->emulation_flags);
 return -EINVAL;
 }
-if ( config->emulation_flags != 0 &&
- (!is_hvm_domain(d) || config->emulation_flags != XEN_X86_EMU_ALL) 
)
+if ( is_hardware_domain(d) )
+config->emulation_flags |= XEN_X86_EMU_PIT;
+if ( !is_hvm_domain(d) ? (config->emulation_flags != XEN_X86_EMU_PIT) :
+ (config->emulation_flags != XEN_X86_EMU_ALL &&
+  config->emulation_flags != 0))
 {
 printk(XENLOG_G_ERR "d%d: Xen does not allow %s domain creation "
"with the current selection of emulators: %#x\n",
-- 
1.9.5 (Apple Git-50.3)


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] OSSTEST: Re-blessing cubietruck-{picasso, gleizes, metzinger} for production use

2016-01-20 Thread Ian Campbell
On Wed, 2016-01-20 at 11:52 +, Ian Jackson wrote:
> Ian Campbell writes ("OSSTEST: Re-blessing cubietruck-
> {picasso,gleizes,metzinger} for production use"):
> > I've been running a couple of adhoc production tests per day on these
> > since
> > before Xmas and they haven't lost sight of their disks again.
> > 
> > TLDR; I think we should throw them back in the pool.
> 
> Great.

I'll take this as a "yes, go ahead" ;-)

> > With the recent timeout fixes they are working as well as the
> > production
> > cubietruck-braque.
> > 
> > There are two flakey tests test-armhf-armhf-xl-rtds and test-armhf-
> > armhf-
> > libvirt-raw, but those appear to be much better than before the timeout
> > changes and not specific to these three boards since the fourth one
> > looks
> > to behave much the same.
> > 
> > At first glance it looks like some later test steps might just need a
> > bit
> > more time on CT too.
> 
> Maybe we should have target_adjust_timeout honour a host property to
> multiply timeouts by some factor.

That's not a bad idea, assuming the remaining issues really are timeouts of
this sort.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [ovmf test] 78583: regressions - FAIL

2016-01-20 Thread osstest service owner
flight 78583 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/78583/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemuu-ovmf-amd64 17 guest-start/debianhvm.repeat fail REGR. 
vs. 65543
 test-amd64-amd64-xl-qemuu-ovmf-amd64 17 guest-start/debianhvm.repeat fail 
REGR. vs. 65543

version targeted for testing:
 ovmf 5af229d81f15da8cb97e817ccdf9a10177d6b4c4
baseline version:
 ovmf 5ac96e3a28dd26eabee421919f67fa7c443a47f1

Last test of basis65543  2015-12-08 08:45:15 Z   43 days
Failing since 65593  2015-12-08 23:44:51 Z   42 days   43 attempts
Testing same since78583  2016-01-20 04:26:02 Z0 days1 attempts


People who touched revisions under test:
  "Samer El-Haj-Mahmoud" 
  "Yao, Jiewen" 
  Andrew Fish 
  Ard Biesheuvel 
  Cecil Sheng 
  Chao Zhang 
  Dandan Bi 
  Daocheng Bu 
  Daryl McDaniel 
  Eric Dong 
  Eric Dong 
  Eugene Cohen 
  Feng Tian 
  Fu Siyuan 
  Hao Wu 
  Hess Chen 
  Heyi Guo 
  Jaben Carsey 
  Jeff Fan 
  Jiaxin Wu 
  Jim Dailey 
  Jordan Justen 
  Larry Hauch 
  Laszlo Ersek 
  Leekha Shaveta 
  Liming Gao 
  Mark Rutland 
  Michael Kinney 
  Michael Thomas 
  Paulo Alcantara 
  Qin Long 
  Qiu Shumin 
  Ruiyu Ni 
  Samer El-Haj-Mahmoud 
  Samer El-Haj-Mahmoud 
  Star Zeng 
  Tapan Shah 
  Yao Jiewen 
  Yao, Jiewen 
  Ye Ting 
  Yonghong Zhu 
  Zhang Lubo 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 fail
 test-amd64-i386-xl-qemuu-ovmf-amd64  fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 5795 lines long.)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH LIBVIRT v2] libxl: Support cmdline= in xl config files

2016-01-20 Thread Ian Campbell
... and consolidate the cmdline/extra/root parsing to facilitate doing
so.

The logic is the same as xl's parse_cmdline from the current xen.git master
branch (e6f0e099d2c17de47fd86e817b1998db903cab61).

In order to introduce a use of VIR_WARN for logging I had to add
virerror.h and VIR_LOG_INIT.

Signed-off-by: Ian Campbell 
---
NB: I was unsure if I was supposed to change VIR_FROM_NONE into
VIR_FROM_XEN, so I didn't (but will on advice).

v2: Use VIR_INFO (adding necessary infra)
Don't initialise things to NULL when there is no need.
---
 src/xenconfig/xen_xl.c | 66 +++---
 1 file changed, 41 insertions(+), 25 deletions(-)

diff --git a/src/xenconfig/xen_xl.c b/src/xenconfig/xen_xl.c
index 91cdff6..3d820cc 100644
--- a/src/xenconfig/xen_xl.c
+++ b/src/xenconfig/xen_xl.c
@@ -27,6 +27,7 @@
 
 #include "virconf.h"
 #include "virerror.h"
+#include "virlog.h"
 #include "domain_conf.h"
 #include "viralloc.h"
 #include "virstring.h"
@@ -35,6 +36,8 @@
 
 #define VIR_FROM_THIS VIR_FROM_NONE
 
+VIR_LOG_INIT("xen.xen_xl");
+
 /*
  * Xen provides a libxl utility library, with several useful functions,
  * specifically xlu_disk_parse for parsing xl disk config strings.
@@ -58,11 +61,46 @@ extern int xlu_disk_parse(XLU_Config *cfg,
   libxl_device_disk *disk);
 #endif
 
+static int xenParseCmdline(virConfPtr conf, char **r_cmdline)
+{
+char *cmdline;
+const char *root, *extra, *buf;
+
+if (xenConfigGetString(conf, "cmdline", &buf, NULL) < 0)
+return -1;
+
+if (xenConfigGetString(conf, "root", &root, NULL) < 0)
+return -1;
+
+if (xenConfigGetString(conf, "extra", &extra, NULL) < 0)
+return -1;
+
+if (buf) {
+if (VIR_STRDUP(cmdline, buf) < 0)
+return -1;
+if (root || extra)
+VIR_WARN("ignoring root= and extra= in favour of cmdline=");
+} else {
+if (root && extra) {
+if (virAsprintf(&cmdline, "root=%s %s", root, extra) < 0)
+return -1;
+} else if (root) {
+if (virAsprintf(&cmdline, "root=%s", root) < 0)
+return -1;
+} else if (extra) {
+if (VIR_STRDUP(cmdline, extra) < 0)
+return -1;
+}
+}
+
+*r_cmdline = cmdline;
+return 0;
+}
+
 static int
 xenParseXLOS(virConfPtr conf, virDomainDefPtr def, virCapsPtr caps)
 {
 size_t i;
-const char *extra, *root;
 
 if (def->os.type == VIR_DOMAIN_OSTYPE_HVM) {
 const char *boot;
@@ -84,19 +122,8 @@ xenParseXLOS(virConfPtr conf, virDomainDefPtr def, 
virCapsPtr caps)
 if (xenConfigCopyStringOpt(conf, "ramdisk", &def->os.initrd) < 0)
 return -1;
 
-if (xenConfigGetString(conf, "extra", &extra, NULL) < 0)
-return -1;
-
-if (xenConfigGetString(conf, "root", &root, NULL) < 0)
+if (xenParseCmdline(conf, &def->os.cmdline) < 0)
 return -1;
-
-if (root) {
-if (virAsprintf(&def->os.cmdline, "root=%s %s", root, extra) < 0)
-return -1;
-} else {
-if (VIR_STRDUP(def->os.cmdline, extra) < 0)
-return -1;
-}
 #endif
 
 if (xenConfigGetString(conf, "boot", &boot, "c") < 0)
@@ -132,19 +159,8 @@ xenParseXLOS(virConfPtr conf, virDomainDefPtr def, 
virCapsPtr caps)
 if (xenConfigCopyStringOpt(conf, "ramdisk", &def->os.initrd) < 0)
 return -1;
 
-if (xenConfigGetString(conf, "extra", &extra, NULL) < 0)
-return -1;
-
-if (xenConfigGetString(conf, "root", &root, NULL) < 0)
+if (xenParseCmdline(conf, &def->os.cmdline) < 0)
 return -1;
-
-if (root) {
-if (virAsprintf(&def->os.cmdline, "root=%s %s", root, extra) < 0)
-return -1;
-} else {
-if (VIR_STRDUP(def->os.cmdline, extra) < 0)
-return -1;
-}
 }
 
 return 0;
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface

2016-01-20 Thread Stefano Stabellini
On Wed, 20 Jan 2016, Peng Fan wrote:
> To my use case, Dom0 and DomU both use device tree, I need to build
> the mapping table between id and name, since I use name to lookup
> the clk in backend, like this:
> "clk = __clk_loopkup(clkname); clk_prepare_enable(clk)". Maybe ACPI
> is another different case.

Theoretically on systems using ACPI there is no need to fiddle with
clocks, see

Documentation/arm64/arm-acpi.txt


"In DT, clocks need to be specified
and the drivers need to take them into account.  In ACPI, the assumption
is that UEFI will leave the device in a reasonable default state, including
any clock settings.  If for some reason the driver needs to change a clock
value, this can be done in an ACPI method; all the driver needs to do is
invoke the method and not concern itself with what the method needs to do
to change the clock.  Changing the hardware can then take place over time
by changing what the ACPI method does, and not the driver.

In DT, the parameters needed by the driver to set up clocks as in the example
above are known as "bindings"; in ACPI, these are known as "Device Properties"
and provided to a driver via the _DSD object."


However currently we don't have the ability to run ACPI in DomU guests
on ARM. Even if we had, there is no way to call native ACPI methods from
any guests other than Dom0, even on x86. We just have to hope that
clocks don't need to be reset on ACPI systems.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] Xen Security Advisory 167 (CVE-2016-1570) - PV superpage functionality missing sanity checks

2016-01-20 Thread Xen . org security team
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Xen Security Advisory CVE-2016-1570 / XSA-167
  version 4

PV superpage functionality missing sanity checks

UPDATES IN VERSION 4


Public release.

ISSUE DESCRIPTION
=

The PV superpage functionality lacks certain validity checks on data
being passed to the hypervisor by guests.  This is the case for the
page identifier (MFN) passed to MMUEXT_MARK_SUPER and
MMUEXT_UNMARK_SUPER sub-ops of the HYPERVISOR_mmuext_op hypercall as
well as for various forms of page table updates.

IMPACT
==

Use of the feature, which is disabled by default, may have unknown
effects, ranging from information leaks through Denial of Service to
privilege escalation.

VULNERABLE SYSTEMS
==

Only systems which enable the PV superpage feature are affected.  That
is, only systems with an `allowsuperpage' setting on the hypervisor
command line.  Note that in Xen 4.0.x and 3.4.x the option is named
`allowhugepage'.

Xen versions 3.4.0, 3.4.1, and from 4.1 onwards are affected.

Only x86 systems are affected.

Only PV guests can exploit the vulnerability.

MITIGATION
==

Running only HVM guests will avoid this issue.

Not enabling PV superpage support (by omitting the `allowsuperpage' or
`allowhugepage' hypervisor command line options) will avoid exposing
the issue.

CREDITS
===

This issue was discovered by Qinghao Tang of 360 Marvel Team.

RESOLUTION
==

Applying the appropriate attached patch resolves this issue.

xsa167.patch   xen-unstable
xsa167-4.6.patch   Xen 4.6.x, 4.5.x
xsa167-4.4.patch   Xen 4.4.x, 4.3.x

$ sha256sum xsa167*
a71f709eef59425cb2113fa48d3b44048c6bf41063200fee1c847f6e0ed45a09  xsa167.patch
194c1ce89292f4cbb9980baa703095bcbeb5849abf46d193e07a98a0d8301f78  
xsa167-4.4.patch
2bd786cccfd13c6732d6db8afc9e18058465efcb1bc93f894c359e3a820d5403  
xsa167-4.6.patch
$

DEPLOYMENT DURING EMBARGO
=

Deployment of the patches and/or mitigations described above (or
others which are substantially similar) is permitted during the
embargo, even on public-facing systems with untrusted guest users and
administrators.


However deployment of the SUPERPAGE DISABLEMENT MITIGATION is NOT
permitted (except where all the affected systems and VMs are
administered and used only by organisations which are members of the
Xen Project Security Issues Predisclosure List).  Specifically,
deployment on public cloud systems is NOT permitted.

This is because disabling PV superpage support is visible to guests, so
such deployment could lead to the rediscovery of the vulnerability.

Deployment of the mitigation is permitted only AFTER the embargo ends.


Also: Distribution of updated software is prohibited (except to other
members of the predisclosure list).

Predisclosure list members who wish to deploy significantly different
patches and/or mitigations, please contact the Xen Project Security
Team.


(Note: this during-embargo deployment notice is retained in
post-embargo publicly released Xen Project advisories, even though it
is then no longer applicable.  This is to enable the community to have
oversight of the Xen Project Security Team's decisionmaking.)

For more information about permissible uses of embargoed information,
consult the Xen Project community's agreed Security Policy:
  http://www.xenproject.org/security-policy.html
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.12 (GNU/Linux)

iQEcBAEBAgAGBQJWn3jEAAoJEIP+FMlX6CvZTOsH/2ReFJ0Yhp5da69XKvFEJR/s
0yEFxjvqiSyBPsWjyiaAdOp/1A2sltEeDDnMy7xEoXHmon0p6IV0IR4L+fMCLjl2
1ZI4tKpkn3zUE+IOjfu/GJ53f87XWSq/u9Ri7yZQdxFpgd3AXcLegGm8i4L/58iY
vdwAAuczACztEN/NbWFedlGUEd5PKqKwb4wOg1uhLIMwzvjxgtejVAyZD83HgP6i
LeWMO7EfeU8ND38Otiw9lNlKD/Ia7vpRG+BXuADLx18hbR1TU9AJ0RO1zb9JnAAj
snYdgB6s1wzRD4/HOc+s1uaIttPPODs0IhZunylI7UVhdWKp5Qkszw/QUcmufnk=
=5acB
-END PGP SIGNATURE-


xsa167.patch
Description: Binary data


xsa167-4.4.patch
Description: Binary data


xsa167-4.6.patch
Description: Binary data
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] Xen Security Advisory 168 (CVE-2016-1571) - VMX: intercept issue with INVLPG on non-canonical address

2016-01-20 Thread Xen . org security team
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Xen Security Advisory CVE-2016-1571 / XSA-168
  version 3

   VMX: intercept issue with INVLPG on non-canonical address

UPDATES IN VERSION 3


Public release.

ISSUE DESCRIPTION
=

While INVLPG does not cause a General Protection Fault when used on a
non-canonical address, INVVPID in its "individual address" variant,
which is used to back the intercepted INVLPG in certain cases, fails in
such cases. Failure of INVVPID results in a hypervisor bug check.

IMPACT
==

A malicious guest can crash the host, leading to a Denial of Service.

VULNERABLE SYSTEMS
==

Xen versions from 3.3 onwards are affected.

Only systems using Intel or Cyrix CPUs are affected. ARM and AMD
systems are unaffected.

Only HVM guests using shadow mode paging can expose this
vulnerability.  PV guests, and HVM guests using Hardware Assisted
Paging (also known as EPT on affected hardware), are unaffected.

Note that while unsupported, guests with enabled nested virtualization
are vulnerable even when using EPT.

CHECKING FOR VULNERABLE CONFIGURATION
=

To discover whether your HVM guests are using HAP, or shadow page
tables: request debug key `q' (from the Xen console, or with
`xl debug-keys q').  This will print (to the console, and visible in
`xl dmesg'), debug information for every domain, containing something
like this:

  (XEN) General information for domain 2:
  (XEN) refcnt=1 dying=2 pause_count=2
  (XEN) nr_pages=2 xenheap_pages=0 shared_pages=0 paged_pages=0 
dirty_cpus={} max_pages=262400
  (XEN) handle=ef58ef1a-784d-4e59-8079-42bdee87f219 vm_assist=
  (XEN) paging assistance: hap refcounts translate external
   ^^^
The presence of `hap' here indicates that the host is not
vulnerable to this domain.  For an HVM domain the presence of `shadow'
indicates that the domain can exploit the vulnerability.

Note that `General information' will also be printed for PV domains.
For most PV domains there will be no `paging assistance' reported.
But PV guests currently being migrated will report
  (XEN) paging assistance: shadow log_dirty

Overall: a domain can exploit the vulnerability if this debug output
contains a `paging assistance' line which reports `translate' and
which does not report `hap'.

MITIGATION
==

Running only PV guests will avoid this vulnerability.

Running HVM guests on only AMD hardware will also avoid this
vulnerability.

Running HVM guests with Hardware Assisted Paging (HAP) enabled will
also avoid this vulnerability.  This is the default mode on hardware
supporting HAP, but can be overridden by hypervisor command line
option and guest configuration setting.  Such overrides ("hap=0" in
either case, with variants like "no-hap" being possible in the
hypervisor command line case) would need to be removed to avoid this
vulnerability.

CREDITS
===

This issue was discovered by Jan Beulich of SUSE.

RESOLUTION
==

Applying the attached patch resolves this issue.

xsa168.patch  xen-unstable, Xen 4.6.x, Xen 4.5.x, Xen 4.4.x, Xen 4.3.x

$ sha256sum xsa168*
c95198a66485d6e538d113ce2b84630d77c15f597113c38fadd6bf1e24e4c8ec  xsa168.patch
$

DEPLOYMENT DURING EMBARGO
=

Deployment of the patches and/or mitigations described above (or
others which are substantially similar) is permitted during the
embargo, even on public-facing systems with untrusted guest users and
administrators.

But: Distribution of updated software is prohibited (except to other
members of the predisclosure list).

Predisclosure list members who wish to deploy significantly different
patches and/or mitigations, please contact the Xen Project Security
Team.

(Note: this during-embargo deployment notice is retained in
post-embargo publicly released Xen Project advisories, even though it
is then no longer applicable.  This is to enable the community to have
oversight of the Xen Project Security Team's decisionmaking.)

For more information about permissible uses of embargoed information,
consult the Xen Project community's agreed Security Policy:
  http://www.xenproject.org/security-policy.html
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.12 (GNU/Linux)

iQEcBAEBAgAGBQJWn3dEAAoJEIP+FMlX6CvZLaAH/A1FzwQebCOF0MCEMcM9V/zK
At3L0XG5oBiVZVpbXAfYULeKaLtTGLBXqhBJjzej0FypCvEYX6BLBITLsw7kMqoW
JSYHNHlg4pLH2Wnf6i3fVC7EIHx5XNuDa8Zeyt73wEFJhVpp43PcMwMzBolTUBmP
+f5WDkLYflYXv+0XiHfbBLA2fl+K+A5OdDhKgjPZJouGvdfiZxX7EChR0asmmD1i
AbSZYTLGhdlSU+fvw+w2XUYSeINS1FEhsZxMbWMVuz7jmPBmOn6u8NLrBdZatYoE
Z2Fly81pWD7KDwusVscoLBdmBmI1Wr3u975j5EkQLbsCTsqo5ayP3BpfsieijIg=
=UJX5
-END PGP SIGNATURE-


xsa168.patch
Description: Binary data
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface

2016-01-20 Thread Juergen Gross
On 20/01/16 12:48, Peng Fan wrote:
> Hi Juergen,
> 
> On Wed, Jan 20, 2016 at 11:40:55AM +0100, Juergen Gross wrote:
>> On 20/01/16 10:25, Peng Fan wrote:
>>> Hi Juergen,
>>>
>>> On Wed, Jan 20, 2016 at 10:05:15AM +0100, Juergen Gross wrote:
 On 20/01/16 09:31, Peng Fan wrote:
> Introduce pvclk interface which is useful when doing device passthrough
> on ARM platform.

 ...

> +/*
> + * Frontend request
> + *
> + * cmd: command for operation on clk, should be XEN_CLK_[xx],
> + *   excluding XEN_CLK_END. id is the
> + * id: clk id
> + * rate: clock rate. Used for set rate.

 Which unit? Hz?
>>>
>>> Yeah. Hz. I'll add comments in V3.
>>>

> + */
> +struct xen_clkif_request {
> + uint32_t cmd;
> + uint32_t id;
> + uint64_t rate;
> +};
> +typedef struct xen_clkif_request xen_clkif_request_t;
> +
> +/*
> + * Backend response
> + *
> + * cmd: command for operation on clk, same with the cmd in request.
> + * id: clk id, same with the id in request.
> + * success: indicate failure or success for the cmd.

 Values?
>>>
>>> I'd like to let the frontend/backend driver to determine the value.
>>> In my implementation for linux, if the backend API supports return value,
>>> I'll fill the return value to success entry. If the backend API returns
>>> void, I'll just fill 0 to success entry.
>>
>> So please specify "0" to mean success and add some sensible error
>> values. There might be multiple frontend- and/or backend-variants which
>> all must agree using the same interface.
> 
> Will change this to `int status`, as also observed by Jan.
> I'll also define macros such as "#define XEN_CLK_ENABLE_OK 0", "#define 
> XEN_CLK_ENABLE_FAILURE -1"
> 
>>
> + * rate: clock rate. Used for get rate.
> + *
> + * cmd and id are filled by backend and passed to frontend to
> + * let frontend check whether the response is for the current
> + * request or not.

 I'd rather let the frontend add a request Id to the request which
 will be echoed here instead cmd and clock Id.
>>>
>>> If using request id, I think we can encode cmd and id into request id?
>>
>> This should just be an opaque value for the backend. The frontend is
>> free how to create this value, it should be unique for every outstanding
>> request of a domU, however. It could be built from cmd and id in case
>> there can't be multiple requests with the same cmd/id combination
>> active in a domU.
> 
> Thanks for teaching me on this. So the request id should be globally unique
> for backend. So "random value(generated in frontend) + frontend domid" is ok 
> for this?

Being unique per frontend is enough, as each frontend will only ever see
responses for it's own requests. Make it as simple as possible. I guess
there will be a maximum of active requests possible, e.g. the number of
request slots in the request ring. You could use the index into the ring
as Id (pvSCSI frontend is doing it this way).

Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] netfront/netback multiqueue exhausting grants

2016-01-20 Thread Ian Campbell
There have been a few reports recently[0] which relate to a failure of
netfront to allocate sufficient grant refs for all the queues:

[0.533589] xen_netfront: can't alloc rx grant refs
[0.533612] net eth0: only created 31 queues

Which can be worked around by increasing the number of grants on the
hypervisor command line or by limiting the number of queues permitted by
either back or front using a module param (which was broken but is now
fixed on both sides, but I'm not sure it has been backported everywhere
such that it is a reliable thing to always tell users as a workaround).

Is there any plan to do anything about the default/out of the box
experience? Either limiting the number of queues or making both ends cope
more gracefully with failure to create some queues (or both) might be
sufficient?

I think the crash after the above in the first link at [0] is fixed? I
think that was the purpose of ca88ea1247df "xen-netfront: update num_queues
to real created" which was in 4.3.

Ian.

[0] http://lists.xen.org/archives/html/xen-users/2016-01/msg00100.html
    http://lists.xen.org/archives/html/xen-users/2016-01/msg00072.html
    some before hte xmas break too IIRC

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface

2016-01-20 Thread Ian Campbell
On Wed, 2016-01-20 at 12:06 +, Stefano Stabellini wrote:
> On Wed, 20 Jan 2016, Peng Fan wrote:
> > To my use case, Dom0 and DomU both use device tree, I need to build
> > the mapping table between id and name, since I use name to lookup
> > the clk in backend, like this:
> > "clk = __clk_loopkup(clkname); clk_prepare_enable(clk)". Maybe ACPI
> > is another different case.
> 
> Theoretically on systems using ACPI there is no need to fiddle with
> clocks,

I mentioned ACPI in my replies to v1 more as a placeholder for "other
firmware descriptions than DT", in order that the pv protocol we end up
with does not end up being DT specific, which would be a mistake
irrespective of what may or may not be required for non-DT firmware
descriptions.

Ian.

>  see
> 
> Documentation/arm64/arm-acpi.txt
> 
> 
> "In DT, clocks need to be specified
> and the drivers need to take them into account.  In ACPI, the assumption
> is that UEFI will leave the device in a reasonable default state,
> including
> any clock settings.  If for some reason the driver needs to change a
> clock
> value, this can be done in an ACPI method; all the driver needs to do is
> invoke the method and not concern itself with what the method needs to do
> to change the clock.  Changing the hardware can then take place over time
> by changing what the ACPI method does, and not the driver.
> 
> In DT, the parameters needed by the driver to set up clocks as in the
> example
> above are known as "bindings"; in ACPI, these are known as "Device
> Properties"
> and provided to a driver via the _DSD object."
> 
> 
> However currently we don't have the ability to run ACPI in DomU guests
> on ARM. Even if we had, there is no way to call native ACPI methods from
> any guests other than Dom0, even on x86. We just have to hope that
> clocks don't need to be reset on ACPI systems.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] OSSTEST: Re-blessing cubietruck-{picasso, gleizes, metzinger} for production use

2016-01-20 Thread Ian Campbell
On Wed, 2016-01-20 at 11:58 +, Ian Campbell wrote:
> On Wed, 2016-01-20 at 11:52 +, Ian Jackson wrote:
> > Ian Campbell writes ("OSSTEST: Re-blessing cubietruck-
> > {picasso,gleizes,metzinger} for production use"):
> > > I've been running a couple of adhoc production tests per day on these
> > > since
> > > before Xmas and they haven't lost sight of their disks again.
> > > 
> > > TLDR; I think we should throw them back in the pool.
> > 
> > Great.
> 
> I'll take this as a "yes, go ahead" ;-)

and now I have actually gone and done it...


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 2/5] libxl: introduce LIBXL_VGA_INTERFACE_TYPE_UNDEF

2016-01-20 Thread Ian Campbell
On Wed, 2016-01-20 at 12:57 +0100, Roger Pau Monne wrote:
> And use it as the default value for the VGA kind. This allows libxl to
> set
> it to the default value later on when the domain type is known. For HVM
> guests the default value is LIBXL_VGA_INTERFACE_TYPE_CIRRUS while for
> HVMlite the default value is LIBXL_VGA_INTERFACE_TYPE_NONE.

Most enums in lixl_type.idl call this value "UNKNOWN" rather than "UNDEF".
We should be consistent.

You will also need to add the usual #define LIBXL_HAVE to libxl.h.

Everything else looks ok to me.

> 
> Signed-off-by: Roger Pau Monné 
> ---
> Cc: Ian Jackson 
> Cc: Ian Campbell 
> Cc: Wei Liu 
> ---
>  tools/libxl/libxl_create.c  | 8 ++--
>  tools/libxl/libxl_dm.c  | 6 ++
>  tools/libxl/libxl_types.idl | 3 ++-
>  3 files changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> index e491d83..61a4001 100644
> --- a/tools/libxl/libxl_create.c
> +++ b/tools/libxl/libxl_create.c
> @@ -206,8 +206,12 @@ int libxl__domain_build_info_setdefault(libxl__gc
> *gc,
>  if (b_info->u.hvm.mmio_hole_memkb == LIBXL_MEMKB_DEFAULT)
>  b_info->u.hvm.mmio_hole_memkb = 0;
>  
> -if (!b_info->u.hvm.vga.kind)
> -b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_CIRRUS;
> +if (b_info->u.hvm.vga.kind == LIBXL_VGA_INTERFACE_TYPE_UNDEF) {
> +if (b_info->device_model_version ==
> LIBXL_DEVICE_MODEL_VERSION_NONE)
> +b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_NONE;
> +else
> +b_info->u.hvm.vga.kind =
> LIBXL_VGA_INTERFACE_TYPE_CIRRUS;
> +}
>  
>  if (!b_info->u.hvm.hdtype)
>  b_info->u.hvm.hdtype = LIBXL_HDTYPE_IDE;
> diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
> index a088d71..9aa0cc8 100644
> --- a/tools/libxl/libxl_dm.c
> +++ b/tools/libxl/libxl_dm.c
> @@ -531,6 +531,9 @@ static int
> libxl__build_device_model_args_old(libxl__gc *gc,
>  break;
>  case LIBXL_VGA_INTERFACE_TYPE_QXL:
>  break;
> +default:
> +LOG(ERROR, "Invalid emulated video card specified");
> +return ERROR_INVAL;
>  }
>  
>  if (b_info->u.hvm.boot) {
> @@ -970,6 +973,9 @@ static int
> libxl__build_device_model_args_new(libxl__gc *gc,
>  GCSPRINTF("qxl-
> vga,vram_size_mb=%"PRIu64",ram_size_mb=%"PRIu64,
>  (b_info->video_memkb/2/1024), (b_info-
> >video_memkb/2/1024) ) );
>  break;
> +default:
> +LOG(ERROR, "Invalid emulated video card specified");
> +return ERROR_INVAL;
>  }
>  
>  if (b_info->u.hvm.boot) {
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index 9ad7eba..92c95e5 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -204,11 +204,12 @@ libxl_shutdown_reason =
> Enumeration("shutdown_reason", [
>  ], init_val = "LIBXL_SHUTDOWN_REASON_UNKNOWN")
>  
>  libxl_vga_interface_type = Enumeration("vga_interface_type", [
> +(0, "UNDEF"),
>  (1, "CIRRUS"),
>  (2, "STD"),
>  (3, "NONE"),
>  (4, "QXL"),
> -], init_val = "LIBXL_VGA_INTERFACE_TYPE_CIRRUS")
> +], init_val = "LIBXL_VGA_INTERFACE_TYPE_UNDEF")
>  
>  libxl_vendor_device = Enumeration("vendor_device", [
>  (0, "NONE"),
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 0/4] add support for vNVDIMM

2016-01-20 Thread Stefano Stabellini
On Wed, 20 Jan 2016, Tian, Kevin wrote:
> > From: Zhang, Haozhong
> > Sent: Tuesday, December 29, 2015 7:32 PM
> > 
> > This patch series is the Xen part patch to provide virtual NVDIMM to
> > guest. The corresponding QEMU patch series is sent separately with the
> > title "[PATCH 0/2] add vNVDIMM support for Xen".
> > 
> > * Background
> > 
> >  NVDIMM (Non-Volatile Dual In-line Memory Module) is going to be
> >  supported on Intel's platform. NVDIMM devices are discovered via ACPI
> >  and configured by _DSM method of NVDIMM device in ACPI. Some
> >  documents can be found at
> >  [1] ACPI 6: http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
> >  [2] NVDIMM Namespace: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
> >  [3] DSM Interface Example:
> > http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
> >  [4] Driver Writer's Guide:
> > http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf
> > 
> >  The upstream QEMU (commits 5c42eef ~ 70d1fb9) has added support to
> >  provide virtual NVDIMM in PMEM mode, in which NVDIMM devices are
> >  mapped into CPU's address space and are accessed via normal memory
> >  read/write and three special instructions (clflushopt/clwb/pcommit).
> > 
> >  This patch series and the corresponding QEMU patch series enable Xen
> >  to provide vNVDIMM devices to HVM domains.
> > 
> > * Design
> > 
> >  Supporting vNVDIMM in PMEM mode has three requirements.
> > 
> 
> Although this design is about vNVDIMM, some background of how pNVDIMM
> is managed in Xen would be helpful to understand the whole design since
> in PMEM mode you need map pNVDIMM into GFN addr space so there's
> a matter of how pNVDIMM is allocated.

Yes, some background would be very helpful. Given that there are so many
moving parts on this (Xen, the Dom0 kernel, QEMU, hvmloader, libxl)
I suggest that we start with a design document for this feature.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 3/5] libxl: initialise the build info before calling prepare_config

2016-01-20 Thread Ian Campbell
On Wed, 2016-01-20 at 12:57 +0100, Roger Pau Monne wrote:
> libxl__arch_domain_prepare_config has access to the
> libxl_domain_build_info
> struct, so make sure it's properly initialised.
> 
> Signed-off-by: Roger Pau Monné 
> ---
> Cc: Ian Jackson 
> Cc: Ian Campbell 
> Cc: Wei Liu 
> ---
> NB: libxl__arch_domain_prepare_config is called from libxl__domain_make.

I think this is worth moving into the actual commit message. Plus it would
be useful to clarify that while prepare_config has access to b_info it
doesn't touch it right now (but presumably you are about to make it do so).

If it does touch it then that is currently a bug which should be mentioned
in the commit message and tagged for backport etc.

I suspect the reason for the ordering today is that domain_make is intended
to consume create_info, not build_info. However that distinction seems to
me to be an artefact of a much older API structure which doesn't seem to
make much sense now (but we are stuck with it :-()

> ---
>  tools/libxl/libxl_create.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> index 61a4001..ba4c9e8 100644
> --- a/tools/libxl/libxl_create.c
> +++ b/tools/libxl/libxl_create.c
> @@ -892,6 +892,12 @@ static void initiate_domain_create(libxl__egc *egc,
>  goto error_out;
>  }
>  
> +ret = libxl__domain_build_info_setdefault(gc, &d_config->b_info);
> +if (ret) {
> +LOG(ERROR, "Unable to set domain build info defaults");
> +goto error_out;
> +}
> +
>  ret = libxl__domain_make(gc, d_config, &domid, &state->config);
>  if (ret) {
>  LOG(ERROR, "cannot make domain: %d", ret);
> @@ -903,12 +909,6 @@ static void initiate_domain_create(libxl__egc *egc,
>  dcs->guest_domid = domid;
>  dcs->dmss.dm.guest_domid = 0; /* means we haven't spawned */
>  
> -ret = libxl__domain_build_info_setdefault(gc, &d_config->b_info);
> -if (ret) {
> -LOG(ERROR, "Unable to set domain build info defaults");
> -goto error_out;
> -}
> -
>  if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_HVM &&
>  (libxl_defbool_val(d_config->b_info.u.hvm.nested_hvm) &&
>   libxl_defbool_val(d_config->b_info.u.hvm.altp2m))) {

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu

2016-01-20 Thread Jan Beulich
>>> On 20.01.16 at 12:04,  wrote:
> On 01/20/16 01:46, Jan Beulich wrote:
>> >>> On 20.01.16 at 06:31,  wrote:
>> > Secondly, the driver implements a convenient block device interface to
>> > let software access areas where NVDIMM devices are mapped. The
>> > existing vNVDIMM implementation in QEMU uses this interface.
>> > 
>> > As Linux NVDIMM driver has already done above, why do we bother to
>> > reimplement them in Xen?
>> 
>> See above; a possibility is that we may need a split model (block
>> layer parts on Dom0, "normal memory" parts in the hypervisor.
>> Iirc the split is being determined by firmware, and hence set in
>> stone by the time OS (or hypervisor) boot starts.
> 
> For the "normal memory" parts, do you mean parts that map the host
> NVDIMM device's address space range to the guest? I'm going to
> implement that part in hypervisor and expose it as a hypercall so that
> it can be used by QEMU.

To answer this I need to have my understanding of the partitioning
being done by firmware confirmed: If that's the case, then "normal"
means the part that doesn't get exposed as a block device (SSD).
In any event there's no correlation to guest exposure here.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH] public/io/netif.h: change semantics of "request-multicast-control" flag

2016-01-20 Thread Paul Durrant
My patch b2700877 "move and amend multicast control documentation"
clarified use of the multicast control protocol between frontend and
backend. However, it transpires that the restrictions that documentation
placed on the "request-multicast-control" flag make it hard for a
frontend to enable 'all multicast' promiscuous mode, in that to do so
would require the frontend and backend to disconnect and re-connect.

This patch adds a new "feature-dynamic-multicast-control" flag to allow
a backend to advertise that it will watch "request-multicast-control" hence
allowing it to be meaningfully modified by the frontend at any time rather
than only when the frontend and backend are disconnected.

Signed-off-by: Paul Durrant 
Cc: Ian Campbell 
Cc: Ian Jackson 
Cc: Jan Beulich 
Cc: Keir Fraser 
Cc: Tim Deegan 
---
 xen/include/public/io/netif.h | 34 ++
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/xen/include/public/io/netif.h b/xen/include/public/io/netif.h
index fe0a87f..8816e0f 100644
--- a/xen/include/public/io/netif.h
+++ b/xen/include/public/io/netif.h
@@ -136,18 +136,28 @@
  */
 
 /*
- * "feature-multicast-control" advertises the capability to filter ethernet
- * multicast packets in the backend. To enable use of this capability the
- * frontend must set "request-multicast-control" before moving into the
- * connected state.
- *
- * If "request-multicast-control" is set then the backend transmit side should
- * no longer flood multicast packets to the frontend, it should instead drop 
any
- * multicast packet that does not match in a filter list. The list is
- * amended by the frontend by sending dummy transmit requests containing
- * XEN_NETIF_EXTRA_TYPE_MCAST_{ADD,DEL} extra-info fragments as specified 
below.
- * Once enabled by the frontend, the feature cannot be disabled except by
- * closing and re-connecting to the backend.
+ * "feature-multicast-control" and "feature-dynamic-multicast-control"
+ * advertise the capability to filter ethernet multicast packets in the
+ * backend. If the frontend wishes to take advantage of this feature then
+ * it may set "request-multicast-control". If the backend only advertises
+ * "feature-multicast-control" then "request-multicast-control" must be set
+ * before the frontend moves into the connected state. The backend will
+ * sample the value on this state transition and any subsequent change in
+ * value will have no effect. However, if the backend also advertises
+ * "feature-dynamic-multicast-control" then "request-multicast-control"
+ * may be set by the frontend at any time. In this case, the backend will
+ * watch the value and re-sample on watch events.
+ *
+ * If the sampled value of "request-multicast-control" is set then the
+ * backend transmit side should no longer flood multicast packets to the
+ * frontend, it should instead drop any multicast packet that does not
+ * match in a filter list.
+ * The list is amended by the frontend by sending dummy transmit requests
+ * containing XEN_NETIF_EXTRA_TYPE_MCAST_{ADD,DEL} extra-info fragments as
+ * specified below.
+ * Note that the filter list may be amended even if the sampled value of
+ * "request-multicast-control" is not set, however the filter should only
+ * be applied if it is set.
  */
 
 /*
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 3/3] VT-d: Fix vt-d Device-TLB flush timeout issue.

2016-01-20 Thread Jan Beulich
>>> On 20.01.16 at 11:26,  wrote:
>> On January 15, 2016 at 9:10,  wrote:
>> >>> On 23.12.15 at 09:25,  wrote:
>> > @@ -229,6 +239,63 @@ int qinval_device_iotlb(struct iommu *iommu,
>> >  return 0;
>> >  }
>> >
>> > +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did,
>> > + u16 seg, u8 bus, u8 devfn)
>> {
>> > +struct domain *d;
>> > +struct pci_dev *pdev;
>> > +
>> > +d = rcu_lock_domain_by_id(iommu->domid_map[did]);
>> > +ASSERT(d);
>> 
>> Don't you need to acquire some lock in order to safely assert this?
> 
> Referred to the other use case of 'rcu_lock_domain_by_id()', Xen checks 
> whether
> The domain is 'NULL'. 
> Could I also replace the 'ASSERT(d)' with
>   + If ( d == NULL )
>   +   return;
> ?

At a first glance this doesn't look right, but in the end that's
something you need to answer.

>> Also note that unused slots hold zero, i.e. there's at least a theoretical 
> risk of
>> problems here when you don't also look at
>> iommu->domid_bitmap.
>> 
> I am not clear how to fix this point. Do you have good idea?
> Add a lock to 'iommu->domid_bitmap'?

How would a lock help avoiding mistaking an unused slot to mean
Dom0? As already suggested - I think you simply need to consult
the bitmap along with the domain ID array.

>> > +{
>> > +list_del(&pdev->domain_list);
>> 
>> This should happen under pcidevs_lock - you need to either acquire it or
>> ASSERT() it being held.
>> 
> 
> I should check whether it is under pcidevs_lock -- with 
> spin_is_locked(&pcidevs_lock)
> If it is not under pcidevs_lock, I should acquire it.
> I think ASSERT() is not a good idea. Hypervisor acquires this lock and then 
> remove the resource.

I don't understand this last sentence.

>> > @@ -349,9 +416,18 @@ static int flush_iotlb_qi(
>> >  queue_invalidate_iotlb(iommu,
>> > type >>
>> DMA_TLB_FLUSH_GRANU_OFFSET, dr,
>> > dw, did, size_order, 0, addr);
>> > +
>> > +/*
>> > + * Synchronize with hardware for invalidation request descriptors
>> > + * submitted before Device-TLB invalidate descriptor.
>> > + */
>> > +rc = invalidate_sync(iommu);
>> > +if ( rc )
>> > + return rc;
>> > +
>> >  if ( flush_dev_iotlb )
>> >  ret = dev_invalidate_iotlb(iommu, did, addr, size_order, 
>> > type);
>> > -rc = invalidate_sync(iommu);
>> > +
>> >  if ( !ret )
>> >  ret = rc;
>> >  }
>> 
>> This change is because of ...?
>> 
> As Kevin's comments,
> I add 'dev_invalidate_iotlb_sync()' to dev_invalidate_iotlb(). 
> Then i can know which device is flush timeout.

I don't understand: How is your reply related to you moving up the
invocation of invalidate_sync()?

>> Plus logging the error code and device coordinates might turn out helpful in
>> such cases. But first of all - if there was a timeout, we'd make it here 
>> only for
>> Dom0. Perhaps the printing should move into an else to the domain_crash()?
>> And if there was another error, the message would be outright wrong.
>> 
> IMO, the print for Dom0 is enough.
> I think it is not a good idea to move into an else to domain_crash(). 
> Domain_crash is quite a common part.
> Anyway I can improve it in low priority.

At the very least the message should not end up being actively
misleading.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling

2016-01-20 Thread Jan Beulich
>>> On 20.01.16 at 12:20,  wrote:
>> From: Jan Beulich [mailto:jbeul...@suse.com]
>> Sent: Wednesday, January 20, 2016 4:35 PM
>> >>> On 20.01.16 at 08:49,  wrote:
>> >> From: Jan Beulich [mailto:jbeul...@suse.com]
>> >> Sent: Monday, January 18, 2016 11:14 PM
>> >> >>> On 03.12.15 at 09:35,  wrote:
>> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
>> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> >> > @@ -83,7 +83,131 @@ static int vmx_msr_write_intercept(unsigned int
>> >> msr, uint64_t msr_content);
>> >> > +ASSERT(v->arch.hvm_vmx.pi_block_cpu == NR_CPUS);
>> >> > +
>> >> > +/*
>> >> > + * The vCPU is blocking, we need to add it to one of the per pCPU
>> lists.
>> >> > + * We save v->processor to v->arch.hvm_vmx.pi_block_cpu and use
>> it
>> >> for
>> >> > + * the per-CPU list, we also save it to posted-interrupt descriptor
>> and
>> >> > + * make it as the destination of the wake-up notification event.
>> >> > + */
>> >> > +v->arch.hvm_vmx.pi_block_cpu = v->processor;
>> >> > +
>> >> > +spin_lock_irqsave(&per_cpu(pi_blocked_vcpu_lock,
>> >> > +  v->arch.hvm_vmx.pi_block_cpu), flags);
>> >> > +list_add_tail(&v->arch.hvm_vmx.pi_blocked_vcpu_list,
>> >> > +  &per_cpu(pi_blocked_vcpu, v-
>> >arch.hvm_vmx.pi_block_cpu));
>> >> > +spin_unlock_irqrestore(&per_cpu(pi_blocked_vcpu_lock,
>> >> > +   v->arch.hvm_vmx.pi_block_cpu), flags);
>> >> > +
>> >> > +ASSERT(!pi_test_sn(pi_desc));
>> >> > +
>> >> > +/*
>> >> > + * We don't need to set the 'NDST' field, since it should point to
>> >> > + * the same pCPU as v->processor.
>> >> > + */
>> >> > +
>> >> > +write_atomic(&pi_desc->nv, pi_wakeup_vector);
>> >>
>> >> Stray blank line between comment and corresponding code.
>> >>
>> >> Also the ASSERT() is rather more disconnected from the write
>> >> than seems desirable: Wouldn't it be better to cmpxchg() the
>> >> whole 32-bit value, validating that SN is clear in the result?
>> >
>> > Not quite understand this. The control field is 64 bits, do you
>> > mean cmpxchg() the whole 64-bit value like follows:
>> >
>> > +do {
>> > +old.control = new.control = pi_desc->control;
>> > +new.nv = pi_wakeup_vector;
>> > +} while ( cmpxchg(&pi_desc->control, old.control, new.control)
>> > +  != old.control );
>> >
>> > This a somewhat like the implementation in the earlier versions,
>> > why do you want to change it back?
>> 
>> Did you read what I've said? I'm worried about the disconnect of
>> assertion and update: You really care about SN being clear
>> _after_ the update aiui. 
> 
> No, the ASSERT has no connection with the update. the SN bit should
> be clear before updating pi_desc->nv, adding ASSERT here is just kind
> of protection code.

And SN can't get set behind your back between the ASSERT() and
the update? If the answer is "yes", then the code is indeed fine as is.

>> >> > +pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu;
>> >> > +if ( pi_block_cpu == NR_CPUS )
>> >> > +return;
>> >>
>> >> Are this condition and the one in the immediately preceding if()
>> >> connected in any way?
>> >
>> > I am not sure I understand this correctly. Which one is
>> > " the one in the immediately preceding if()"?
>> 
>> If you hadn't ripped out too much of the context, it would be a
>> matter of pointing you back up. Now I have to re-quote the full
>> code:
>> 
>> +if ( pi_desc->nv != posted_intr_vector )
>> +write_atomic(&pi_desc->nv, posted_intr_vector);
>> +
>> +/* the vCPU is not on any blocking list. */
>> +pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu;
>> +if ( pi_block_cpu == NR_CPUS )
>> +return;
>> 
>> These are the two if()-s the question is about.
>> 
>> >> I.e. could the latter perhaps become an
>> >> ASSERT() by connecting the two blocks suitably? I'm simply
>> >> trying to limit the number of different cases to consider mentally
>> >> when looking at this code ...
>> >
>> > If we get a true from above check, it means the vcpu was removed by
>> > pi_wakeup_interrupt(), then we just need to return. If we get a false,
>> > we will acquire the spinlock as the following code does, there are two
>> > scenarios:
>> > - pi_wakeup_interrupt() acquires the spinlock before us, then when
>> > we get the lock, 'v->arch.hvm_vmx.pi_block_cpu' has already been
>> > set to NR_CPUS by pi_wakeup_interrupt(), we need do nothing.
>> > - We get the spinlock before pi_wakeup_interrupt(), this time we need
>> > to remove the vCPU from the list and set 'v->arch.hvm_vmx.pi_block_cpu'
>> > to NR_CPUS.
>> 
>> This is all only about the second condition, but the question really is
>> whether one being true or false may imply the result of other. In
>> which case this would better be ASSERT()ed than handled by two
>> conditionals.
> 
> Thanks for the explanation. I don't think there are any connections
> Between " if ( pi_desc->nv != poste

Re: [Xen-devel] [PATCH v3 5/5] libxl: add options to enable/disable emulated devices

2016-01-20 Thread Ian Campbell
On Wed, 2016-01-20 at 12:57 +0100, Roger Pau Monne wrote:
> Allow enabling or disabling emulated devices from the libxl domain
> configuration file. For HVM guests with a device model all the emulated
> devices are enabled. For HVM guests without a device model no devices are
> enabled by default, although they can be enabled using the options
> provided.
> The arbiter of whether a combination is posible or not is always Xen,
> libxl
> doesn't do any kind of check.
> 
> This set of options is also propagated inside of the libxl migration record
> as part of the contents of the libxl_domain_build_info struct.

... and this is the real motivation for this change, not actually allowing
users to control all this AIUI.

Did you check that the fields updated using libxl_defbool_setdefault are
actually updated in the JSON copy and therefore propagated to the other
side of a migration as specific values and not as "pick a default"? I think
we don't want these changing on migration. I think/hope all this was
automatically handled by the work Wei did in the last release cycle.

> Signed-off-by: Roger Pau Monné 
> ---
> Cc: Ian Jackson 
> Cc: Ian Campbell 
> Cc: Wei Liu 
> ---
>  docs/man/xl.cfg.pod.5   | 39 +++
>  tools/libxl/libxl.h | 11 +++
>  tools/libxl/libxl_create.c  | 21 -
>  tools/libxl/libxl_types.idl |  6 ++
>  tools/libxl/libxl_x86.c | 33 -
>  tools/libxl/xl_cmdimpl.c|  7 +++
>  6 files changed, 111 insertions(+), 6 deletions(-)
> 
> diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
> index 8899f75..46d4529 100644
> --- a/docs/man/xl.cfg.pod.5
> +++ b/docs/man/xl.cfg.pod.5
> @@ -1762,6 +1762,45 @@ See F for
> more information.
>  
>  =back
>  
> +=head3 HVM without a device model options
> +
> +This options can be used to change the set of emulated devices provided

"These..."

> +to guests without a device model. Note that Xen might not support all
> +possible combinations. By default HVM guests without a device model
> +don't have any of them enabled.

... and for those with a device model? The title and text suggest these
options are invalid/ignored in that case, but the code does actually honour
what the user specified in this case.

> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index 92c95e5..8a21cda 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -519,6 +519,12 @@ libxl_domain_build_info = Struct("domain_build_info",[
> ("serial_list",  
> libxl_string_list),
> ("rdm", libxl_rdm_reserve),
> ("rdm_mem_boundary_memkb", MemKB),
> +   ("lapic",libxl_defbool),
> +   ("ioapic",   libxl_defbool),
> +   ("rtc",  libxl_defbool),
> +   ("power_management", libxl_defbool),
> +   ("pic",  libxl_defbool),
> +   ("pit",  libxl_defbool),

I wonder if these should go in a sub-struct. Although you might reasonably
argue that this is already such a dumping ground it doesn't matter...

> ])),
>   ("pv", Struct(None, [("kernel", string),
>    ("slack_memkb", MemKB),
> diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
> index 46cfafb..92f25fd 100644
> --- a/tools/libxl/libxl_x86.c
> +++ b/tools/libxl/libxl_x86.c
> @@ -7,15 +7,38 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
>    libxl_domain_config *d_config,
>    xc_domain_configuration_t
> *xc_config)
>  {
> +struct libxl_domain_build_info *info = &d_config->b_info;
>  
> -if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_HVM &&
> -d_config->b_info.device_model_version !=
> -LIBXL_DEVICE_MODEL_VERSION_NONE) {
> -/* HVM domains with a device model. */

So, I'm not 100% clear on why this check and the corresponding logic to set
xc_config->emulation_flags is not also sufficient for after migration.
Could you explain (and likely eventually add the rationale to the commit
message).

Ian.
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface

2016-01-20 Thread Jan Beulich
>>> On 20.01.16 at 12:40,  wrote:
> Hi Jan,
> 
> On Wed, Jan 20, 2016 at 03:14:20AM -0700, Jan Beulich wrote:
> On 20.01.16 at 09:31,  wrote:
>>> +/*
>>> + * Backend response
>>> + *
>>> + * cmd: command for operation on clk, same with the cmd in request.
>>> + * id: clk id, same with the id in request.
>>> + * success: indicate failure or success for the cmd.
>>> + * rate: clock rate. Used for get rate.
>>> + *
>>> + * cmd and id are filled by backend and passed to frontend to
>>> + * let frontend check whether the response is for the current
>>> + * request or not.
>>> + */
>>> +struct xen_clkif_response {
>>> +   uint32_t cmd;
>>> +   uint32_t id;
>>> +   uint32_t success;
>>> +   uint64_t rate;
>>> +};
>>
>>This isn't 32-/64-bit clean. Question is whether echoing cmd is really
>>needed.
> 
> As Juergen suggested, use a request id. I'll fix this in V3.
> 32-/64-bit unclean, I can not get you here (:

The layout of above structure may end up different for 32- and
64-bit guests, depending on the alignment of uint64_t.

>>And what I'm missing throughout the file is some description of how
>>clock events (interrupts?) are actually meant to make it into the
>>guest.
> 
> I have a simple implementation v1 patch for linux, 
> http://lists.xen.org/archives/html/xen-devel/2016-01/msg01860.html.
> You can see how it works.

No, sorry, that's not the point of the inquiry. It seems to me that
there are more aspects to this interface, not directly related to
the request/reply protocol written down here, which need to be
spelled out event if they don't require any particular definitions
or type declarations.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Backports for Xen 4.6

2016-01-20 Thread Jan Beulich
>>> On 18.01.16 at 16:53,  wrote:
> Possibly also:
> 42940c046902 "x86/shadow: Fix missing newline in dprintk()"

The affected statement compiles to nothing in a release build, which
can be taken as an argument both ways. I lean towards not putting
it in.

> 6851e979874e "VT-d: use proper error codes in iommu_enable_x2apic_IR()"

Since I had pulled this into our own tree already, and since you're
now also viewing this as useful, I guess I will throw it in.

> 0ce647ad6f70 "x86: suppress bogus log message"

This being a purely cosmetic change, may I ask for the reason
you consider this a backport candidate?

> 484c14b7254e "x86/vmx: enable PML by default"

Not in a point release I would say.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 4/5] x86/PV: allow PV guests to have an emulated PIT

2016-01-20 Thread Jan Beulich
>>> On 20.01.16 at 12:57,  wrote:
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -542,8 +542,11 @@ int arch_domain_create(struct domain *d, unsigned int 
> domcr_flags,
> d->domain_id, config->emulation_flags);
>  return -EINVAL;
>  }
> -if ( config->emulation_flags != 0 &&
> - (!is_hvm_domain(d) || config->emulation_flags != 
> XEN_X86_EMU_ALL) )
> +if ( is_hardware_domain(d) )
> +config->emulation_flags |= XEN_X86_EMU_PIT;
> +if ( !is_hvm_domain(d) ? (config->emulation_flags != 
> XEN_X86_EMU_PIT) :
> + (config->emulation_flags != XEN_X86_EMU_ALL 
> &&
> +  config->emulation_flags != 0))
>  {
>  printk(XENLOG_G_ERR "d%d: Xen does not allow %s domain creation "
> "with the current selection of emulators: %#x\n",

While you mentioned in the cover letter that this needs to be
committed together with patch 5, having them as separate
commits would still break bisectabaility afaict. Would it really
be so unreasonable to allow PV(H) DomU-s to not have an
emulated PIT, depending on (a future) tool stack gathered
per-domain setting? Converting the check above accordingly
would - afaict - at once break that dependency on patch 5.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [xen-4.6-testing test] 78528: tolerable FAIL - PUSHED

2016-01-20 Thread osstest service owner
flight 78528 xen-4.6-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/78528/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds 11 guest-start  fail blocked in 77441
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 77441
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 77441

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-armhf-armhf-libvirt 14 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass
 test-armhf-armhf-libvirt-raw 13 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail   never pass

version targeted for testing:
 xen  6150df9f3f99ecbcbd9917002186d1d895b5602e
baseline version:
 xen  1d3cc6e62c4d2fc3dd9251d4921881425c9d27bd

Last test of basis77441  2016-01-07 23:13:46 Z   12 days
Testing same since78528  2016-01-19 14:09:10 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Ian Jackson 
  Jonathan Davies 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-prev pass
 build-i386-prev  pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsmpass
 test-amd64-i386-xl-qemut-debianhvm

Re: [Xen-devel] [PATCH] public/io/netif.h: change semantics of "request-multicast-control" flag

2016-01-20 Thread Ian Campbell
On Wed, 2016-01-20 at 12:50 +, Paul Durrant wrote:
> My patch b2700877 "move and amend multicast control documentation"
> clarified use of the multicast control protocol between frontend and
> backend. However, it transpires that the restrictions that documentation
> placed on the "request-multicast-control" flag make it hard for a
> frontend to enable 'all multicast' promiscuous mode, in that to do so
> would require the frontend and backend to disconnect and re-connect.

Do we therefore think that this document reflected reality, i.e. might this
not be "just" a documentation bug?

(Or maybe we can't tell because the only previous implementation was years
ago in Solaris or something)

> This patch adds a new "feature-dynamic-multicast-control" flag to allow
> a backend to advertise that it will watch "request-multicast-control" hence
> allowing it to be meaningfully modified by the frontend at any time rather
> than only when the frontend and backend are disconnected.

Would allowing XEN_NETIF_EXTRA_TYPE_MCAST_{ADD,DEL} to take a bcast address
be easier on the backend, in that it would just need to be a static feature
rather than watching stuff on the fly?


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] public/io/netif.h: change semantics of "request-multicast-control" flag

2016-01-20 Thread Paul Durrant
> -Original Message-
> From: Ian Campbell [mailto:ian.campb...@citrix.com]
> Sent: 20 January 2016 13:06
> To: Paul Durrant; xen-de...@lists.xenproject.org
> Cc: Ian Jackson; Jan Beulich; Keir (Xen.org); Tim (Xen.org)
> Subject: Re: [PATCH] public/io/netif.h: change semantics of "request-
> multicast-control" flag
> 
> On Wed, 2016-01-20 at 12:50 +, Paul Durrant wrote:
> > My patch b2700877 "move and amend multicast control documentation"
> > clarified use of the multicast control protocol between frontend and
> > backend. However, it transpires that the restrictions that documentation
> > placed on the "request-multicast-control" flag make it hard for a
> > frontend to enable 'all multicast' promiscuous mode, in that to do so
> > would require the frontend and backend to disconnect and re-connect.
> 
> Do we therefore think that this document reflected reality, i.e. might this
> not be "just" a documentation bug?
> 
> (Or maybe we can't tell because the only previous implementation was years
> ago in Solaris or something)

That's my concern. I hope it's just a documentation bug, but I don't know. Also 
I've already done an implementation in Linux netback according to the 
restricted semantics.

> 
> > This patch adds a new "feature-dynamic-multicast-control" flag to allow
> > a backend to advertise that it will watch "request-multicast-control" hence
> > allowing it to be meaningfully modified by the frontend at any time rather
> > than only when the frontend and backend are disconnected.
> 
> Would allowing XEN_NETIF_EXTRA_TYPE_MCAST_{ADD,DEL} to take a bcast
> address
> be easier on the backend, in that it would just need to be a static feature
> rather than watching stuff on the fly?

The documented semantics of the list are 'exact match' so sending a bcast 
address doesn't do much good with a backend that doesn't know to treat is 
specially hence a frontend can't tell whether 'all multicast' mode is going to 
work without the extra feature flag. As for watching "request-multcast-control" 
vs. add/remove of bcast, the complexity of implementation is cheaper for the 
latter but I think the former is 'nicer'.

  Paul
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Backports for Xen 4.6

2016-01-20 Thread Andrew Cooper
On 20/01/16 12:29, Jan Beulich wrote:
 On 18.01.16 at 16:53,  wrote:
>> Possibly also:
>> 42940c046902 "x86/shadow: Fix missing newline in dprintk()"
> The affected statement compiles to nothing in a release build, which
> can be taken as an argument both ways. I lean towards not putting
> it in.
>
>> 6851e979874e "VT-d: use proper error codes in iommu_enable_x2apic_IR()"
> Since I had pulled this into our own tree already, and since you're
> now also viewing this as useful, I guess I will throw it in.
>
>> 0ce647ad6f70 "x86: suppress bogus log message"
> This being a purely cosmetic change, may I ask for the reason
> you consider this a backport candidate?

The shear volume reduction in debug builds.

XenServer ships both a release and a debug hypervisor for first-line
triage of customer issues.  As such, "only affecting a debug build"
isn't a relevant consideration for us, and I expect we are not alone here.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu

2016-01-20 Thread Andrew Cooper
On 20/01/16 10:36, Xiao Guangrong wrote:
>
> Hi,
>
> On 01/20/2016 06:15 PM, Haozhong Zhang wrote:
>
>> CCing QEMU vNVDIMM maintainer: Xiao Guangrong
>>
>>> Conceptually, an NVDIMM is just like a fast SSD which is linearly
>>> mapped
>>> into memory.  I am still on the dom0 side of this fence.
>>>
>>> The real question is whether it is possible to take an NVDIMM, split it
>>> in half, give each half to two different guests (with appropriate NFIT
>>> tables) and that be sufficient for the guests to just work.
>>>
>>
>> Yes, one NVDIMM device can be split into multiple parts and assigned
>> to different guests, and QEMU is responsible to maintain virtual NFIT
>> tables for each part.
>>
>>> Either way, it needs to be a toolstack policy decision as to how to
>>> split the resource.
>
> Currently, we are using NVDIMM as a block device and a DAX-based
> filesystem
> is created upon it in Linux so that file-related accesses directly reach
> the NVDIMM device.
>
> In KVM, If the NVDIMM device need to be shared by different VMs, we can
> create multiple files on the DAX-based filesystem and assign the file to
> each VMs. In the future, we can enable namespace (partition-like) for
> PMEM
> memory and assign the namespace to each VMs (current Linux driver uses
> the
> whole PMEM as a single namespace).
>
> I think it is not a easy work to let Xen hypervisor recognize NVDIMM
> device
> and manager NVDIMM resource.
>
> Thanks!
>

The more I see about this, the more sure I am that we want to keep it as
a block device managed by dom0.

In the case of the DAX-based filesystem, I presume files are not
necessarily contiguous.  I also presume that this is worked around by
permuting the mapping of the virtual NVDIMM such that the it appears as
a contiguous block of addresses to the guest?

Today in Xen, Qemu already has the ability to create mappings in the
guest's address space, e.g. to map PCI device BARs.  I don't see a
conceptual difference here, although the security/permission model
certainly is more complicated.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling

2016-01-20 Thread Dario Faggioli
On Wed, 2016-01-20 at 04:35 -0700, Jan Beulich wrote:
> > > > On 20.01.16 at 12:20,  wrote:
> > > 
> > > Then you didn't understand: The question isn't this path, but the
> > > path where the hook gets called if non-NULL (and hence the
> > > possibility to avoid such needless calls).
> > 
> > I understand you mean the overhead happens when the hooks
> > is called. My point is the hook is not called in a critical path,
> > so I doubt
> > whether it worth doing so to make the logic complex.
> 
> Are you sure scheduling code is not a critical path?
> 
TBH, I like Jan's point... It's always good to make all we can to avoid
calling the hook, if unnecessary.

Does it really complicates things a lot? Feng, can you give it a try?

Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] preparing for 4.6.1

2016-01-20 Thread Jan Beulich
All,

with this first point release due in about two weeks, please indicate
backports you find missing from its staging tree but necessary in the
release.

Thanks, Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling

2016-01-20 Thread Wu, Feng


> -Original Message-
> From: Dario Faggioli [mailto:dario.faggi...@citrix.com]
> Sent: Wednesday, January 20, 2016 9:31 PM
> To: Jan Beulich ; Wu, Feng 
> Cc: Andrew Cooper ; George Dunlap
> ; Tian, Kevin ; xen-
> de...@lists.xen.org; Keir Fraser 
> Subject: Re: [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling
> 
> On Wed, 2016-01-20 at 04:35 -0700, Jan Beulich wrote:
> > > > > On 20.01.16 at 12:20,  wrote:
> > > >
> > > > Then you didn't understand: The question isn't this path, but the
> > > > path where the hook gets called if non-NULL (and hence the
> > > > possibility to avoid such needless calls).
> > >
> > > I understand you mean the overhead happens when the hooks
> > > is called. My point is the hook is not called in a critical path,
> > > so I doubt
> > > whether it worth doing so to make the logic complex.
> >
> > Are you sure scheduling code is not a critical path?
> >
> TBH, I like Jan's point... It's always good to make all we can to avoid
> calling the hook, if unnecessary.
> 
> Does it really complicates things a lot? Feng, can you give it a try?

Sure, I will try it and discuss with your guys if I meet some issues in
future.

Thanks,
Feng

> 
> Regards,
> Dario
> --
> <> (Raistlin Majere)
> -
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling

2016-01-20 Thread Wu, Feng


> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Wednesday, January 20, 2016 7:36 PM
> To: Wu, Feng 
> Cc: Andrew Cooper ; Dario Faggioli
> ; George Dunlap ;
> Tian, Kevin ; xen-devel@lists.xen.org; Keir Fraser
> 
> Subject: RE: [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling
> 
> >>> On 20.01.16 at 12:20,  wrote:
> >> From: Jan Beulich [mailto:jbeul...@suse.com]
> >> Sent: Wednesday, January 20, 2016 4:35 PM
> >> >>> On 20.01.16 at 08:49,  wrote:
> >> >> From: Jan Beulich [mailto:jbeul...@suse.com]
> >> >> Sent: Monday, January 18, 2016 11:14 PM
> >> >> >>> On 03.12.15 at 09:35,  wrote:
> >> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> >> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> >> >> > @@ -83,7 +83,131 @@ static int vmx_msr_write_intercept(unsigned
> int
> >> >> msr, uint64_t msr_content);
> >> >> > +ASSERT(v->arch.hvm_vmx.pi_block_cpu == NR_CPUS);
> >> >> > +
> >> >> > +/*
> >> >> > + * The vCPU is blocking, we need to add it to one of the per pCPU
> >> lists.
> >> >> > + * We save v->processor to v->arch.hvm_vmx.pi_block_cpu and
> use
> >> it
> >> >> for
> >> >> > + * the per-CPU list, we also save it to posted-interrupt 
> >> >> > descriptor
> >> and
> >> >> > + * make it as the destination of the wake-up notification event.
> >> >> > + */
> >> >> > +v->arch.hvm_vmx.pi_block_cpu = v->processor;
> >> >> > +
> >> >> > +spin_lock_irqsave(&per_cpu(pi_blocked_vcpu_lock,
> >> >> > +  v->arch.hvm_vmx.pi_block_cpu), flags);
> >> >> > +list_add_tail(&v->arch.hvm_vmx.pi_blocked_vcpu_list,
> >> >> > +  &per_cpu(pi_blocked_vcpu, v-
> >> >arch.hvm_vmx.pi_block_cpu));
> >> >> > +spin_unlock_irqrestore(&per_cpu(pi_blocked_vcpu_lock,
> >> >> > +   v->arch.hvm_vmx.pi_block_cpu), flags);
> >> >> > +
> >> >> > +ASSERT(!pi_test_sn(pi_desc));
> >> >> > +
> >> >> > +/*
> >> >> > + * We don't need to set the 'NDST' field, since it should point 
> >> >> > to
> >> >> > + * the same pCPU as v->processor.
> >> >> > + */
> >> >> > +
> >> >> > +write_atomic(&pi_desc->nv, pi_wakeup_vector);
> >> >>
> >> >> Stray blank line between comment and corresponding code.
> >> >>
> >> >> Also the ASSERT() is rather more disconnected from the write
> >> >> than seems desirable: Wouldn't it be better to cmpxchg() the
> >> >> whole 32-bit value, validating that SN is clear in the result?
> >> >
> >> > Not quite understand this. The control field is 64 bits, do you
> >> > mean cmpxchg() the whole 64-bit value like follows:
> >> >
> >> > +do {
> >> > +old.control = new.control = pi_desc->control;
> >> > +new.nv = pi_wakeup_vector;
> >> > +} while ( cmpxchg(&pi_desc->control, old.control, new.control)
> >> > +  != old.control );
> >> >
> >> > This a somewhat like the implementation in the earlier versions,
> >> > why do you want to change it back?
> >>
> >> Did you read what I've said? I'm worried about the disconnect of
> >> assertion and update: You really care about SN being clear
> >> _after_ the update aiui.
> >
> > No, the ASSERT has no connection with the update. the SN bit should
> > be clear before updating pi_desc->nv, adding ASSERT here is just kind
> > of protection code.
> 
> And SN can't get set behind your back between the ASSERT() and
> the update?

Yes.

> If the answer is "yes", then the code is indeed fine as is.
> 
> >> >> > +pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu;
> >> >> > +if ( pi_block_cpu == NR_CPUS )
> >> >> > +return;
> >> >>
> >> >> Are this condition and the one in the immediately preceding if()
> >> >> connected in any way?
> >> >
> >> > I am not sure I understand this correctly. Which one is
> >> > " the one in the immediately preceding if()"?
> >>
> >> If you hadn't ripped out too much of the context, it would be a
> >> matter of pointing you back up. Now I have to re-quote the full
> >> code:
> >>
> >> +if ( pi_desc->nv != posted_intr_vector )
> >> +write_atomic(&pi_desc->nv, posted_intr_vector);
> >> +
> >> +/* the vCPU is not on any blocking list. */
> >> +pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu;
> >> +if ( pi_block_cpu == NR_CPUS )
> >> +return;
> >>
> >> These are the two if()-s the question is about.
> >>
> >> >> I.e. could the latter perhaps become an
> >> >> ASSERT() by connecting the two blocks suitably? I'm simply
> >> >> trying to limit the number of different cases to consider mentally
> >> >> when looking at this code ...
> >> >
> >> > If we get a true from above check, it means the vcpu was removed by
> >> > pi_wakeup_interrupt(), then we just need to return. If we get a false,
> >> > we will acquire the spinlock as the following code does, there are two
> >> > scenarios:
> >> > - pi_wakeup_interrupt() acquires the spinlock before us, then when
> >> > we get the lock, 'v->arch.hvm_vmx.pi_block_cpu' has already been
> >>

Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface

2016-01-20 Thread Peng Fan
Hi Ian, Stefano

On Wed, Jan 20, 2016 at 12:27:07PM +, Ian Campbell wrote:
>On Wed, 2016-01-20 at 12:06 +, Stefano Stabellini wrote:
>> On Wed, 20 Jan 2016, Peng Fan wrote:
>> > To my use case, Dom0 and DomU both use device tree, I need to build
>> > the mapping table between id and name, since I use name to lookup
>> > the clk in backend, like this:
>> > "clk = __clk_loopkup(clkname); clk_prepare_enable(clk)". Maybe ACPI
>> > is another different case.
>> 
>> Theoretically on systems using ACPI there is no need to fiddle with
>> clocks,
>
>I mentioned ACPI in my replies to v1 more as a placeholder for "other
>firmware descriptions than DT", in order that the pv protocol we end up
>with does not end up being DT specific, which would be a mistake
>irrespective of what may or may not be required for non-DT firmware
>descriptions.

Thanks for clarifying. Beside this, are you ok with the xenstore node 
description in this file?

Thanks,
Peng.

>
>Ian.
>
>>  see
>> 
>> Documentation/arm64/arm-acpi.txt
>> 
>> 
>> "In DT, clocks need to be specified
>> and the drivers need to take them into account.  In ACPI, the assumption
>> is that UEFI will leave the device in a reasonable default state,
>> including
>> any clock settings.  If for some reason the driver needs to change a
>> clock
>> value, this can be done in an ACPI method; all the driver needs to do is
>> invoke the method and not concern itself with what the method needs to do
>> to change the clock.  Changing the hardware can then take place over time
>> by changing what the ACPI method does, and not the driver.
>> 
>> In DT, the parameters needed by the driver to set up clocks as in the
>> example
>> above are known as "bindings"; in ACPI, these are known as "Device
>> Properties"
>> and provided to a driver via the _DSD object."
>> 
>> 
>> However currently we don't have the ability to run ACPI in DomU guests
>> on ARM. Even if we had, there is no way to call native ACPI methods from
>> any guests other than Dom0, even on x86. We just have to hope that
>> clocks don't need to be reset on ACPI systems.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 0/3] x86: XSA-168 follow-up

2016-01-20 Thread Jan Beulich
1: mmuext: tighten TLB flush address checks
2: PV: relax LDT address check
3: paging: invlpg() hook returns boolean

Signed-off-by: Jan Beulich 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling

2016-01-20 Thread Jan Beulich
>>> On 20.01.16 at 14:48,  wrote:

> 
>> -Original Message-
>> From: Jan Beulich [mailto:jbeul...@suse.com]
>> Sent: Wednesday, January 20, 2016 7:36 PM
>> To: Wu, Feng 
>> Cc: Andrew Cooper ; Dario Faggioli
>> ; George Dunlap ;
>> Tian, Kevin ; xen-devel@lists.xen.org; Keir Fraser
>> 
>> Subject: RE: [PATCH v10 6/7] vmx: VT-d posted-interrupt core logic handling
>> 
>> >>> On 20.01.16 at 12:20,  wrote:
>> >> From: Jan Beulich [mailto:jbeul...@suse.com]
>> >> Sent: Wednesday, January 20, 2016 4:35 PM
>> >> >>> On 20.01.16 at 08:49,  wrote:
>> >> >> From: Jan Beulich [mailto:jbeul...@suse.com]
>> >> >> Sent: Monday, January 18, 2016 11:14 PM
>> >> >> >>> On 03.12.15 at 09:35,  wrote:
>> >> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
>> >> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> >> >> > @@ -83,7 +83,131 @@ static int vmx_msr_write_intercept(unsigned
>> int
>> >> >> msr, uint64_t msr_content);
>> >> >> > +ASSERT(v->arch.hvm_vmx.pi_block_cpu == NR_CPUS);
>> >> >> > +
>> >> >> > +/*
>> >> >> > + * The vCPU is blocking, we need to add it to one of the per 
>> >> >> > pCPU
>> >> lists.
>> >> >> > + * We save v->processor to v->arch.hvm_vmx.pi_block_cpu and
>> use
>> >> it
>> >> >> for
>> >> >> > + * the per-CPU list, we also save it to posted-interrupt 
>> >> >> > descriptor
>> >> and
>> >> >> > + * make it as the destination of the wake-up notification event.
>> >> >> > + */
>> >> >> > +v->arch.hvm_vmx.pi_block_cpu = v->processor;
>> >> >> > +
>> >> >> > +spin_lock_irqsave(&per_cpu(pi_blocked_vcpu_lock,
>> >> >> > +  v->arch.hvm_vmx.pi_block_cpu), flags);
>> >> >> > +list_add_tail(&v->arch.hvm_vmx.pi_blocked_vcpu_list,
>> >> >> > +  &per_cpu(pi_blocked_vcpu, v-
>> >> >arch.hvm_vmx.pi_block_cpu));
>> >> >> > +spin_unlock_irqrestore(&per_cpu(pi_blocked_vcpu_lock,
>> >> >> > +   v->arch.hvm_vmx.pi_block_cpu), flags);
>> >> >> > +
>> >> >> > +ASSERT(!pi_test_sn(pi_desc));
>> >> >> > +
>> >> >> > +/*
>> >> >> > + * We don't need to set the 'NDST' field, since it should point 
>> >> >> > to
>> >> >> > + * the same pCPU as v->processor.
>> >> >> > + */
>> >> >> > +
>> >> >> > +write_atomic(&pi_desc->nv, pi_wakeup_vector);
>> >> >>
>> >> >> Stray blank line between comment and corresponding code.
>> >> >>
>> >> >> Also the ASSERT() is rather more disconnected from the write
>> >> >> than seems desirable: Wouldn't it be better to cmpxchg() the
>> >> >> whole 32-bit value, validating that SN is clear in the result?
>> >> >
>> >> > Not quite understand this. The control field is 64 bits, do you
>> >> > mean cmpxchg() the whole 64-bit value like follows:
>> >> >
>> >> > +do {
>> >> > +old.control = new.control = pi_desc->control;
>> >> > +new.nv = pi_wakeup_vector;
>> >> > +} while ( cmpxchg(&pi_desc->control, old.control, new.control)
>> >> > +  != old.control );
>> >> >
>> >> > This a somewhat like the implementation in the earlier versions,
>> >> > why do you want to change it back?
>> >>
>> >> Did you read what I've said? I'm worried about the disconnect of
>> >> assertion and update: You really care about SN being clear
>> >> _after_ the update aiui.
>> >
>> > No, the ASSERT has no connection with the update. the SN bit should
>> > be clear before updating pi_desc->nv, adding ASSERT here is just kind
>> > of protection code.
>> 
>> And SN can't get set behind your back between the ASSERT() and
>> the update?
> 
> Yes.
> 
>> If the answer is "yes", then the code is indeed fine as is.
>> 
>> >> >> > +pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu;
>> >> >> > +if ( pi_block_cpu == NR_CPUS )
>> >> >> > +return;
>> >> >>
>> >> >> Are this condition and the one in the immediately preceding if()
>> >> >> connected in any way?
>> >> >
>> >> > I am not sure I understand this correctly. Which one is
>> >> > " the one in the immediately preceding if()"?
>> >>
>> >> If you hadn't ripped out too much of the context, it would be a
>> >> matter of pointing you back up. Now I have to re-quote the full
>> >> code:
>> >>
>> >> +if ( pi_desc->nv != posted_intr_vector )
>> >> +write_atomic(&pi_desc->nv, posted_intr_vector);
>> >> +
>> >> +/* the vCPU is not on any blocking list. */
>> >> +pi_block_cpu = v->arch.hvm_vmx.pi_block_cpu;
>> >> +if ( pi_block_cpu == NR_CPUS )
>> >> +return;
>> >>
>> >> These are the two if()-s the question is about.
>> >>
>> >> >> I.e. could the latter perhaps become an
>> >> >> ASSERT() by connecting the two blocks suitably? I'm simply
>> >> >> trying to limit the number of different cases to consider mentally
>> >> >> when looking at this code ...
>> >> >
>> >> > If we get a true from above check, it means the vcpu was removed by
>> >> > pi_wakeup_interrupt(), then we just need to return. If we get a false,
>> >> > we will acquire the spinlock as the following code does, there are 

[Xen-devel] [PATCH 1/3] x86/mmuext: tighten TLB flush address checks

2016-01-20 Thread Jan Beulich
Addresses passed by PV guests should be subjected to __addr_ok(),
avoiding undue TLB flushes; .

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -3268,8 +3268,9 @@ long do_mmuext_op(
 case MMUEXT_INVLPG_LOCAL:
 if ( unlikely(d != pg_owner) )
 rc = -EPERM;
-else if ( !paging_mode_enabled(d) ||
-  paging_invlpg(curr, op.arg1.linear_addr) != 0 )
+else if ( !paging_mode_enabled(d)
+  ? __addr_ok(op.arg1.linear_addr)
+  : paging_invlpg(curr, op.arg1.linear_addr) )
 flush_tlb_one_local(op.arg1.linear_addr);
 break;
 
@@ -3290,7 +3291,7 @@ long do_mmuext_op(
 
 if ( op.cmd == MMUEXT_TLB_FLUSH_MULTI )
 flush_tlb_mask(&pmask);
-else
+else if ( __addr_ok(op.arg1.linear_addr) )
 flush_tlb_one_mask(&pmask, op.arg1.linear_addr);
 break;
 }
@@ -3303,10 +3304,10 @@ long do_mmuext_op(
 break;
 
 case MMUEXT_INVLPG_ALL:
-if ( likely(d == pg_owner) )
-flush_tlb_one_mask(d->domain_dirty_cpumask, 
op.arg1.linear_addr);
-else
+if ( unlikely(d != pg_owner) )
 rc = -EPERM;
+else if ( __addr_ok(op.arg1.linear_addr) )
+flush_tlb_one_mask(d->domain_dirty_cpumask, 
op.arg1.linear_addr);
 break;
 
 case MMUEXT_FLUSH_CACHE:
--- a/xen/include/asm-x86/paging.h
+++ b/xen/include/asm-x86/paging.h
@@ -245,7 +245,9 @@ paging_fault(unsigned long va, struct cp
  * or 0 if it's safe not to do so. */
 static inline int paging_invlpg(struct vcpu *v, unsigned long va)
 {
-return is_canonical_address(va) && paging_get_hostmode(v)->invlpg(v, va);
+return (paging_mode_external(v->domain) ? is_canonical_address(va)
+: __addr_ok(va)) &&
+   paging_get_hostmode(v)->invlpg(v, va);
 }
 
 /* Translate a guest virtual address to the frame number that the



x86/mmuext: tighten TLB flush address checks

Addresses passed by PV guests should be subjected to __addr_ok(),
avoiding undue TLB flushes; .

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -3268,8 +3268,9 @@ long do_mmuext_op(
 case MMUEXT_INVLPG_LOCAL:
 if ( unlikely(d != pg_owner) )
 rc = -EPERM;
-else if ( !paging_mode_enabled(d) ||
-  paging_invlpg(curr, op.arg1.linear_addr) != 0 )
+else if ( !paging_mode_enabled(d)
+  ? __addr_ok(op.arg1.linear_addr)
+  : paging_invlpg(curr, op.arg1.linear_addr) )
 flush_tlb_one_local(op.arg1.linear_addr);
 break;
 
@@ -3290,7 +3291,7 @@ long do_mmuext_op(
 
 if ( op.cmd == MMUEXT_TLB_FLUSH_MULTI )
 flush_tlb_mask(&pmask);
-else
+else if ( __addr_ok(op.arg1.linear_addr) )
 flush_tlb_one_mask(&pmask, op.arg1.linear_addr);
 break;
 }
@@ -3303,10 +3304,10 @@ long do_mmuext_op(
 break;
 
 case MMUEXT_INVLPG_ALL:
-if ( likely(d == pg_owner) )
-flush_tlb_one_mask(d->domain_dirty_cpumask, 
op.arg1.linear_addr);
-else
+if ( unlikely(d != pg_owner) )
 rc = -EPERM;
+else if ( __addr_ok(op.arg1.linear_addr) )
+flush_tlb_one_mask(d->domain_dirty_cpumask, 
op.arg1.linear_addr);
 break;
 
 case MMUEXT_FLUSH_CACHE:
--- a/xen/include/asm-x86/paging.h
+++ b/xen/include/asm-x86/paging.h
@@ -245,7 +245,9 @@ paging_fault(unsigned long va, struct cp
  * or 0 if it's safe not to do so. */
 static inline int paging_invlpg(struct vcpu *v, unsigned long va)
 {
-return is_canonical_address(va) && paging_get_hostmode(v)->invlpg(v, va);
+return (paging_mode_external(v->domain) ? is_canonical_address(va)
+: __addr_ok(va)) &&
+   paging_get_hostmode(v)->invlpg(v, va);
 }
 
 /* Translate a guest virtual address to the frame number that the
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 2/3] x86/PV: relax LDT address check

2016-01-20 Thread Jan Beulich
There's no point placing restrictions on its address when the LDT size
is zero.

Also convert a local variable to a slightly more efficient type.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -3348,8 +3348,8 @@ long do_mmuext_op(
 
 case MMUEXT_SET_LDT:
 {
-unsigned long ptr  = op.arg1.linear_addr;
-unsigned long ents = op.arg2.nr_ents;
+unsigned int ents = op.arg2.nr_ents;
+unsigned long ptr = ents ? op.arg1.linear_addr : 0;
 
 if ( unlikely(d != pg_owner) )
 rc = -EPERM;
@@ -3361,7 +3361,7 @@ long do_mmuext_op(
 else if ( ((ptr & (PAGE_SIZE - 1)) != 0) || !__addr_ok(ptr) ||
   (ents > 8192) )
 {
-MEM_LOG("Bad args to SET_LDT: ptr=%lx, ents=%lx", ptr, ents);
+MEM_LOG("Bad args to SET_LDT: ptr=%lx, ents=%x", ptr, ents);
 rc = -EINVAL;
 }
 else if ( (curr->arch.pv_vcpu.ldt_ents != ents) ||



x86/PV: relax LDT address check

There's no point placing restrictions on its address when the LDT size
is zero.

Also convert a local variable to a slightly more efficient type.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -3348,8 +3348,8 @@ long do_mmuext_op(
 
 case MMUEXT_SET_LDT:
 {
-unsigned long ptr  = op.arg1.linear_addr;
-unsigned long ents = op.arg2.nr_ents;
+unsigned int ents = op.arg2.nr_ents;
+unsigned long ptr = ents ? op.arg1.linear_addr : 0;
 
 if ( unlikely(d != pg_owner) )
 rc = -EPERM;
@@ -3361,7 +3361,7 @@ long do_mmuext_op(
 else if ( ((ptr & (PAGE_SIZE - 1)) != 0) || !__addr_ok(ptr) ||
   (ents > 8192) )
 {
-MEM_LOG("Bad args to SET_LDT: ptr=%lx, ents=%lx", ptr, ents);
+MEM_LOG("Bad args to SET_LDT: ptr=%lx, ents=%x", ptr, ents);
 rc = -EINVAL;
 }
 else if ( (curr->arch.pv_vcpu.ldt_ents != ents) ||
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface

2016-01-20 Thread Peng Fan
Hi Jan,
On Wed, Jan 20, 2016 at 05:01:40AM -0700, Jan Beulich wrote:
 On 20.01.16 at 12:40,  wrote:
>> Hi Jan,
>> 
>> On Wed, Jan 20, 2016 at 03:14:20AM -0700, Jan Beulich wrote:
>> On 20.01.16 at 09:31,  wrote:
 +/*
 + * Backend response
 + *
 + * cmd: command for operation on clk, same with the cmd in request.
 + * id: clk id, same with the id in request.
 + * success: indicate failure or success for the cmd.
 + * rate: clock rate. Used for get rate.
 + *
 + * cmd and id are filled by backend and passed to frontend to
 + * let frontend check whether the response is for the current
 + * request or not.
 + */
 +struct xen_clkif_response {
 +  uint32_t cmd;
 +  uint32_t id;
 +  uint32_t success;
 +  uint64_t rate;
 +};
>>>
>>>This isn't 32-/64-bit clean. Question is whether echoing cmd is really
>>>needed.
>> 
>> As Juergen suggested, use a request id. I'll fix this in V3.
>> 32-/64-bit unclean, I can not get you here (:
>
>The layout of above structure may end up different for 32- and
>64-bit guests, depending on the alignment of uint64_t.

Thanks for teaching me. In V3, the layout will be changed to like this
struct xen_clkif_response {
uint32_t request_id;
int32_t status;
uint64_t rate;
};

And more macro definitions for the status entry:
#define XEN_CLK_OP_SUCCESS 0
#define XEN_CLK_ENABLE_FALIURE -1
#define XEN_CLK_DISABLE_FAILURE -2
#define XEN_CLK_PREPARE_FAILURE -3
#define XEN_CLK_UNPREPARE_FAILURE -4
#define XEN_CLK_SET_RATE_FAILURE -5
#define XEN_CLK_GET_RATE_FALIURE -6

>
>>>And what I'm missing throughout the file is some description of how
>>>clock events (interrupts?) are actually meant to make it into the
>>>guest.
>> 
>> I have a simple implementation v1 patch for linux, 
>> http://lists.xen.org/archives/html/xen-devel/2016-01/msg01860.html.
>> You can see how it works.
>
>No, sorry, that's not the point of the inquiry. It seems to me that
>there are more aspects to this interface, not directly related to
>the request/reply protocol written down here, which need to be
>spelled out event if they don't require any particular definitions
>or type declarations.

I try to follow you about clock events(interrupts?), not sure whether I 
understand correct or not.
clock in this file is the clk for a device. In linux side, it managed by clk 
subsystem, drivers/clk/xx.
This is not the clock events or clock source in drivers/clocksource/xx.
For the pvclk interface, there will be no interrupt for the guest.

I'll add more texts in the file or commit log.

Thanks,
Peng.

>
>Jan
>

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 3/3] x86/paging: invlpg() hook returns boolean

2016-01-20 Thread Jan Beulich
... so make its return type reflect this.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -680,7 +680,7 @@ static int hap_page_fault(struct vcpu *v
  * HAP guests can handle invlpg without needing any action from Xen, so
  * should not be intercepting it.
  */
-static int hap_invlpg(struct vcpu *v, unsigned long va)
+static bool_t hap_invlpg(struct vcpu *v, unsigned long va)
 {
 if (nestedhvm_enabled(v->domain)) {
 /* Emulate INVLPGA:
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3510,11 +3510,12 @@ propagate:
 }
 
 
-static int
-sh_invlpg(struct vcpu *v, unsigned long va)
-/* Called when the guest requests an invlpg.  Returns 1 if the invlpg
+/*
+ * Called when the guest requests an invlpg.  Returns 1 if the invlpg
  * instruction should be issued on the hardware, or 0 if it's safe not
- * to do so. */
+ * to do so.
+ */
+static bool_t sh_invlpg(struct vcpu *v, unsigned long va)
 {
 mfn_t sl1mfn;
 shadow_l2e_t sl2e;
--- a/xen/arch/x86/mm/shadow/none.c
+++ b/xen/arch/x86/mm/shadow/none.c
@@ -32,10 +32,10 @@ static int _page_fault(struct vcpu *v, u
 return 0;
 }
 
-static int _invlpg(struct vcpu *v, unsigned long va)
+static bool_t _invlpg(struct vcpu *v, unsigned long va)
 {
 ASSERT_UNREACHABLE();
-return -EOPNOTSUPP;
+return 1;
 }
 
 static unsigned long _gva_to_gfn(struct vcpu *v, struct p2m_domain *p2m,
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4392,8 +4392,7 @@ static int __do_update_va_mapping(
 switch ( (bmap_ptr = flags & ~UVMF_FLUSHTYPE_MASK) )
 {
 case UVMF_LOCAL:
-if ( !paging_mode_enabled(d) ||
- (paging_invlpg(v, va) != 0) ) 
+if ( !paging_mode_enabled(d) || paging_invlpg(v, va) )
 flush_tlb_one_local(va);
 break;
 case UVMF_ALL:
--- a/xen/include/asm-x86/paging.h
+++ b/xen/include/asm-x86/paging.h
@@ -104,7 +104,7 @@ struct shadow_paging_mode {
 struct paging_mode {
 int   (*page_fault)(struct vcpu *v, unsigned long va,
 struct cpu_user_regs *regs);
-int   (*invlpg)(struct vcpu *v, unsigned long va);
+bool_t(*invlpg)(struct vcpu *v, unsigned long va);
 unsigned long (*gva_to_gfn)(struct vcpu *v,
 struct p2m_domain *p2m,
 unsigned long va,
@@ -243,7 +243,7 @@ paging_fault(unsigned long va, struct cp
 /* Handle invlpg requests on vcpus.
  * Returns 1 if the invlpg instruction should be issued on the hardware,
  * or 0 if it's safe not to do so. */
-static inline int paging_invlpg(struct vcpu *v, unsigned long va)
+static inline bool_t paging_invlpg(struct vcpu *v, unsigned long va)
 {
 return (paging_mode_external(v->domain) ? is_canonical_address(va)
 : __addr_ok(va)) &&



x86/paging: invlpg() hook returns boolean

... so make its return type reflect this.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -680,7 +680,7 @@ static int hap_page_fault(struct vcpu *v
  * HAP guests can handle invlpg without needing any action from Xen, so
  * should not be intercepting it.
  */
-static int hap_invlpg(struct vcpu *v, unsigned long va)
+static bool_t hap_invlpg(struct vcpu *v, unsigned long va)
 {
 if (nestedhvm_enabled(v->domain)) {
 /* Emulate INVLPGA:
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3510,11 +3510,12 @@ propagate:
 }
 
 
-static int
-sh_invlpg(struct vcpu *v, unsigned long va)
-/* Called when the guest requests an invlpg.  Returns 1 if the invlpg
+/*
+ * Called when the guest requests an invlpg.  Returns 1 if the invlpg
  * instruction should be issued on the hardware, or 0 if it's safe not
- * to do so. */
+ * to do so.
+ */
+static bool_t sh_invlpg(struct vcpu *v, unsigned long va)
 {
 mfn_t sl1mfn;
 shadow_l2e_t sl2e;
--- a/xen/arch/x86/mm/shadow/none.c
+++ b/xen/arch/x86/mm/shadow/none.c
@@ -32,10 +32,10 @@ static int _page_fault(struct vcpu *v, u
 return 0;
 }
 
-static int _invlpg(struct vcpu *v, unsigned long va)
+static bool_t _invlpg(struct vcpu *v, unsigned long va)
 {
 ASSERT_UNREACHABLE();
-return -EOPNOTSUPP;
+return 1;
 }
 
 static unsigned long _gva_to_gfn(struct vcpu *v, struct p2m_domain *p2m,
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4392,8 +4392,7 @@ static int __do_update_va_mapping(
 switch ( (bmap_ptr = flags & ~UVMF_FLUSHTYPE_MASK) )
 {
 case UVMF_LOCAL:
-if ( !paging_mode_enabled(d) ||
- (paging_invlpg(v, va) != 0) ) 
+if ( !paging_mode_enabled(d) || paging_invlpg(v, va) )
 flush_tlb_one_local(va);
 break;
 case UVMF_ALL:
--- a/xen/include/asm-x86

Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface

2016-01-20 Thread Peng Fan
Hi Juergen,

On Wed, Jan 20, 2016 at 01:11:39PM +0100, Juergen Gross wrote:
>On 20/01/16 12:48, Peng Fan wrote:
>> Hi Juergen,
>> 
>> On Wed, Jan 20, 2016 at 11:40:55AM +0100, Juergen Gross wrote:
>>> On 20/01/16 10:25, Peng Fan wrote:
 Hi Juergen,

 On Wed, Jan 20, 2016 at 10:05:15AM +0100, Juergen Gross wrote:
> On 20/01/16 09:31, Peng Fan wrote:
>> Introduce pvclk interface which is useful when doing device passthrough
>> on ARM platform.
>
> ...
>
>> +/*
>> + * Frontend request
>> + *
>> + * cmd: command for operation on clk, should be XEN_CLK_[xx],
>> + *  excluding XEN_CLK_END. id is the
>> + * id: clk id
>> + * rate: clock rate. Used for set rate.
>
> Which unit? Hz?

 Yeah. Hz. I'll add comments in V3.

>
>> + */
>> +struct xen_clkif_request {
>> +uint32_t cmd;
>> +uint32_t id;
>> +uint64_t rate;
>> +};
>> +typedef struct xen_clkif_request xen_clkif_request_t;
>> +
>> +/*
>> + * Backend response
>> + *
>> + * cmd: command for operation on clk, same with the cmd in request.
>> + * id: clk id, same with the id in request.
>> + * success: indicate failure or success for the cmd.
>
> Values?

 I'd like to let the frontend/backend driver to determine the value.
 In my implementation for linux, if the backend API supports return value,
 I'll fill the return value to success entry. If the backend API returns
 void, I'll just fill 0 to success entry.
>>>
>>> So please specify "0" to mean success and add some sensible error
>>> values. There might be multiple frontend- and/or backend-variants which
>>> all must agree using the same interface.
>> 
>> Will change this to `int status`, as also observed by Jan.
>> I'll also define macros such as "#define XEN_CLK_ENABLE_OK 0", "#define 
>> XEN_CLK_ENABLE_FAILURE -1"
>> 
>>>
>> + * rate: clock rate. Used for get rate.
>> + *
>> + * cmd and id are filled by backend and passed to frontend to
>> + * let frontend check whether the response is for the current
>> + * request or not.
>
> I'd rather let the frontend add a request Id to the request which
> will be echoed here instead cmd and clock Id.

 If using request id, I think we can encode cmd and id into request id?
>>>
>>> This should just be an opaque value for the backend. The frontend is
>>> free how to create this value, it should be unique for every outstanding
>>> request of a domU, however. It could be built from cmd and id in case
>>> there can't be multiple requests with the same cmd/id combination
>>> active in a domU.
>> 
>> Thanks for teaching me on this. So the request id should be globally unique
>> for backend. So "random value(generated in frontend) + frontend domid" is ok 
>> for this?
>
>Being unique per frontend is enough, as each frontend will only ever see
>responses for it's own requests. Make it as simple as possible. I guess
>there will be a maximum of active requests possible, e.g. the number of
>request slots in the request ring. You could use the index into the ring
>as Id (pvSCSI frontend is doing it this way).

Thanks for this info.
In my case, such as let DomU handle uart2, I only need to let DomU
handle uart2-root-clk. Later I will passthrough gpu to DomU, only
gpu-root-clk needs be handled by DomU.
In linux side, clk operation is exclusive for one device(not definitely sure),
so the number of slots can be max number of devices that supported for device 
passthrough.

I'll take pvSCSI for reference.

Thanks,
Peng.

>
>Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface

2016-01-20 Thread Jan Beulich
>>> On 20.01.16 at 15:05,  wrote:
> Hi Jan,
> On Wed, Jan 20, 2016 at 05:01:40AM -0700, Jan Beulich wrote:
> On 20.01.16 at 12:40,  wrote:
>>> Hi Jan,
>>> 
>>> On Wed, Jan 20, 2016 at 03:14:20AM -0700, Jan Beulich wrote:
>>> On 20.01.16 at 09:31,  wrote:
> +/*
> + * Backend response
> + *
> + * cmd: command for operation on clk, same with the cmd in request.
> + * id: clk id, same with the id in request.
> + * success: indicate failure or success for the cmd.
> + * rate: clock rate. Used for get rate.
> + *
> + * cmd and id are filled by backend and passed to frontend to
> + * let frontend check whether the response is for the current
> + * request or not.
> + */
> +struct xen_clkif_response {
> + uint32_t cmd;
> + uint32_t id;
> + uint32_t success;
> + uint64_t rate;
> +};

This isn't 32-/64-bit clean. Question is whether echoing cmd is really
needed.
>>> 
>>> As Juergen suggested, use a request id. I'll fix this in V3.
>>> 32-/64-bit unclean, I can not get you here (:
>>
>>The layout of above structure may end up different for 32- and
>>64-bit guests, depending on the alignment of uint64_t.
> 
> Thanks for teaching me. In V3, the layout will be changed to like this
> struct xen_clkif_response {
>   uint32_t request_id;
>   int32_t status;
>   uint64_t rate;
> };

Okay. Just as a not - iirc other pv interfaces use 64-bit ID values,
so not doing this here perhaps ought to be justified.

> And more macro definitions for the status entry:
> #define XEN_CLK_OP_SUCCESS 0
> #define XEN_CLK_ENABLE_FALIURE -1
> #define XEN_CLK_DISABLE_FAILURE -2
> #define XEN_CLK_PREPARE_FAILURE -3
> #define XEN_CLK_UNPREPARE_FAILURE -4
> #define XEN_CLK_SET_RATE_FAILURE -5
> #define XEN_CLK_GET_RATE_FALIURE -6

That's bogus - different kinds of errors would be meaningful,
but an error per command is quite pointless imo. (Also please
be aware of typos and parenthesization.)

And what I'm missing throughout the file is some description of how
clock events (interrupts?) are actually meant to make it into the
guest.
>>> 
>>> I have a simple implementation v1 patch for linux, 
>>> http://lists.xen.org/archives/html/xen-devel/2016-01/msg01860.html.
>>> You can see how it works.
>>
>>No, sorry, that's not the point of the inquiry. It seems to me that
>>there are more aspects to this interface, not directly related to
>>the request/reply protocol written down here, which need to be
>>spelled out event if they don't require any particular definitions
>>or type declarations.
> 
> I try to follow you about clock events(interrupts?), not sure whether I 
> understand correct or not.
> clock in this file is the clk for a device. In linux side, it managed by clk 
> subsystem, drivers/clk/xx.
> This is not the clock events or clock source in drivers/clocksource/xx.
> For the pvclk interface, there will be no interrupt for the guest.

Then (also considering the set of commands you propose) what
use is the clock to the guest? It can't get events from it, it can't
read its current value, all it can is get/set its rate, enable/disable,
and prepare/unprepare it. I may be lacking some ARM knowledge
here, but all of this looks pretty odd to me.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/3] x86/mmuext: tighten TLB flush address checks

2016-01-20 Thread Andrew Cooper
On 20/01/16 14:05, Jan Beulich wrote:
> Addresses passed by PV guests should be subjected to __addr_ok(),
> avoiding undue TLB flushes; .
>
> Signed-off-by: Jan Beulich 
>
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -3268,8 +3268,9 @@ long do_mmuext_op(
>  case MMUEXT_INVLPG_LOCAL:
>  if ( unlikely(d != pg_owner) )
>  rc = -EPERM;
> -else if ( !paging_mode_enabled(d) ||
> -  paging_invlpg(curr, op.arg1.linear_addr) != 0 )
> +else if ( !paging_mode_enabled(d)
> +  ? __addr_ok(op.arg1.linear_addr)
> +  : paging_invlpg(curr, op.arg1.linear_addr) )

paging_mode_enabled() changes depending on whether the guest has
logdirty currently enabled.

As you have also patched paging_invlpg() to DTRT with __addr_ok(), I
think this hunk can be dropped.

Everything else, Reviewed-by: Andrew Cooper 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [qemu-mainline baseline-only test] 38665: trouble: broken/fail/pass

2016-01-20 Thread Platform Team regression test user
This run is configured for baseline tests only.

flight 38665 qemu-mainline real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/38665/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-amd64-pvgrub  3 host-install(3)broken REGR. vs. 38647
 test-amd64-i386-xl-qemuu-debianhvm-amd64 3 host-install(3) broken REGR. vs. 
38647
 test-amd64-i386-xl-xsm3 host-install(3) broken REGR. vs. 38647
 test-amd64-amd64-xl-qcow2 3 host-install(3) broken REGR. vs. 38647

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 3 host-install(3) broken 
REGR. vs. 38647
 test-amd64-amd64-libvirt-xsm  3 host-install(3) broken REGR. vs. 38647
 test-amd64-amd64-xl-pvh-amd   3 host-install(3)  broken like 38647
 test-amd64-amd64-qemuu-nested-amd  3 host-install(3) broken like 38647
 test-amd64-amd64-xl-qemuu-win7-amd64  3 host-install(3)  broken like 38647
 test-amd64-amd64-libvirt-pair  4 host-install/dst_host(4)broken like 38647
 test-amd64-amd64-libvirt-pair  3 host-install/src_host(3)broken like 38647
 test-amd64-amd64-pair 3 host-install/src_host(3) broken like 38647
 test-amd64-amd64-pair 4 host-install/dst_host(4) broken like 38647
 test-amd64-amd64-i386-pvgrub 10 guest-start  fail blocked in 38647
 test-amd64-amd64-xl  19 guest-start/debian.repeatfail   like 38647
 test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1 fail like 38647

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt 14 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-midway   13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-midway   12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestorefail never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check 
fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail   never pass

version targeted for testing:
 qemuu19b6d84316892c8086e0115d6f09cb01abb86cfc
baseline version:
 qemuu5a57acb66f19ee52723aa05b8afbbc41c3e9ec99

Last test of basis38647  2016-01-16 22:18:51 Z3 days
Testing same since38665  2016-01-19 09:20:50 Z1 days1 attempts


People who touched revisions under test:
  Cao jin 
  Daniel P. Berrange 
  Fam Zheng 
  Michael S. Tsirkin 
  P J P 
  Paolo Bonzini 
  Peter Maydell 
  Prasad J Pandit 
  Shmulik Ladkani 
  Zhu Lingshan 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386 

[Xen-devel] [xen-unstable-smoke test] 78617: regressions - FAIL

2016-01-20 Thread osstest service owner
flight 78617 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/78617/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64   5 xen-build fail REGR. vs. 78553

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-i386  1 build-check(1) blocked n/a
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  13 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  bf05e88ed7342a91cceba050b6c622accb809842
baseline version:
 xen  1949868d640427dc91bfb23741d78eb1d86841c8

Last test of basis78553  2016-01-19 20:02:25 Z0 days
Testing same since78617  2016-01-20 13:07:12 Z0 days1 attempts


People who touched revisions under test:
  Ian Campbell 
  Jan Beulich 

jobs:
 build-amd64  fail
 build-armhf  pass
 build-amd64-libvirt  blocked 
 test-armhf-armhf-xl  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 blocked 
 test-amd64-amd64-libvirt blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit bf05e88ed7342a91cceba050b6c622accb809842
Author: Jan Beulich 
Date:   Wed Jan 20 13:50:10 2016 +0100

x86/VMX: prevent INVVPID failure due to non-canonical guest address

While INVLPG (and on SVM INVLPGA) don't fault on non-canonical
addresses, INVVPID fails (in the "individual address" case) when passed
such an address.

Since such intercepted INVLPG are effectively no-ops anyway, don't fix
this in vmx_invlpg_intercept(), but instead have paging_invlpg() never
return true in such a case.

This is CVE-2016-1571 / XSA-168.

Signed-off-by: Jan Beulich 
Reviewed-by: Andrew Cooper 
Acked-by: Ian Campbell 

commit 47abf29a9255b2e7b94e56d66b455d0a584b68b8
Author: Jan Beulich 
Date:   Wed Jan 20 13:49:23 2016 +0100

x86/mm: PV superpage handling lacks sanity checks

MMUEXT_{,UN}MARK_SUPER fail to check the input MFN for validity before
dereferencing pointers into the superpage frame table.

Reported-by: Qinghao Tang 

get_superpage() has a similar issue.

This is CVE-2016-1570 / XSA-167.

Signed-off-by: Jan Beulich 
Acked-by: Ian Campbell 
(qemu changes not included)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 0/4] add support for vNVDIMM

2016-01-20 Thread Zhang, Haozhong
On 01/20/16 12:43, Stefano Stabellini wrote:
> On Wed, 20 Jan 2016, Tian, Kevin wrote:
> > > From: Zhang, Haozhong
> > > Sent: Tuesday, December 29, 2015 7:32 PM
> > > 
> > > This patch series is the Xen part patch to provide virtual NVDIMM to
> > > guest. The corresponding QEMU patch series is sent separately with the
> > > title "[PATCH 0/2] add vNVDIMM support for Xen".
> > > 
> > > * Background
> > > 
> > >  NVDIMM (Non-Volatile Dual In-line Memory Module) is going to be
> > >  supported on Intel's platform. NVDIMM devices are discovered via ACPI
> > >  and configured by _DSM method of NVDIMM device in ACPI. Some
> > >  documents can be found at
> > >  [1] ACPI 6: 
> > > http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
> > >  [2] NVDIMM Namespace: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
> > >  [3] DSM Interface Example:
> > > http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
> > >  [4] Driver Writer's Guide:
> > > http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf
> > > 
> > >  The upstream QEMU (commits 5c42eef ~ 70d1fb9) has added support to
> > >  provide virtual NVDIMM in PMEM mode, in which NVDIMM devices are
> > >  mapped into CPU's address space and are accessed via normal memory
> > >  read/write and three special instructions (clflushopt/clwb/pcommit).
> > > 
> > >  This patch series and the corresponding QEMU patch series enable Xen
> > >  to provide vNVDIMM devices to HVM domains.
> > > 
> > > * Design
> > > 
> > >  Supporting vNVDIMM in PMEM mode has three requirements.
> > > 
> > 
> > Although this design is about vNVDIMM, some background of how pNVDIMM
> > is managed in Xen would be helpful to understand the whole design since
> > in PMEM mode you need map pNVDIMM into GFN addr space so there's
> > a matter of how pNVDIMM is allocated.
> 
> Yes, some background would be very helpful. Given that there are so many
> moving parts on this (Xen, the Dom0 kernel, QEMU, hvmloader, libxl)
> I suggest that we start with a design document for this feature.

Let me prepare a design document. Basically, it would include
following contents. Please let me know if you want anything additional
to be included.

* What NVDIMM is and how it is used
* Software interface of NVDIMM
  - ACPI NFIT: what parameters are recorded and their usage
  - ACPI SSDT: what _DSM methods are provided and their functionality
  - New instructions: clflushopt/clwb/pcommit
* How the linux kernel drives NVDIMM
  - ACPI parsing
  - Block device interface
  - Partition NVDIMM devices
* How KVM/QEMU implements vNVDIMM
* What I propose to implement vNVDIMM in Xen
  - Xen hypervisor/toolstack: new instruction enabling and address mapping
  - Dom0 Linux kernel: host NVDIMM driver
  - QEMU: virtual NFIT/SSDT, _DSM handling, and role in address mapping

Haozhong

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/3] x86/PV: relax LDT address check

2016-01-20 Thread Andrew Cooper
On 20/01/16 14:06, Jan Beulich wrote:
> There's no point placing restrictions on its address when the LDT size
> is zero.
>
> Also convert a local variable to a slightly more efficient type.
>
> Signed-off-by: Jan Beulich 

Reviewed-by: Andrew Cooper 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/3] x86/paging: invlpg() hook returns boolean

2016-01-20 Thread Andrew Cooper
On 20/01/16 14:07, Jan Beulich wrote:
> ... so make its return type reflect this.
>
> Signed-off-by: Jan Beulich 

Reviewed-by: Andrew Cooper , although this
needs MM and shadow acks as well (CC'd).

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu

2016-01-20 Thread Stefano Stabellini
On Wed, 20 Jan 2016, Andrew Cooper wrote:
> On 20/01/16 10:36, Xiao Guangrong wrote:
> >
> > Hi,
> >
> > On 01/20/2016 06:15 PM, Haozhong Zhang wrote:
> >
> >> CCing QEMU vNVDIMM maintainer: Xiao Guangrong
> >>
> >>> Conceptually, an NVDIMM is just like a fast SSD which is linearly
> >>> mapped
> >>> into memory.  I am still on the dom0 side of this fence.
> >>>
> >>> The real question is whether it is possible to take an NVDIMM, split it
> >>> in half, give each half to two different guests (with appropriate NFIT
> >>> tables) and that be sufficient for the guests to just work.
> >>>
> >>
> >> Yes, one NVDIMM device can be split into multiple parts and assigned
> >> to different guests, and QEMU is responsible to maintain virtual NFIT
> >> tables for each part.
> >>
> >>> Either way, it needs to be a toolstack policy decision as to how to
> >>> split the resource.
> >
> > Currently, we are using NVDIMM as a block device and a DAX-based
> > filesystem
> > is created upon it in Linux so that file-related accesses directly reach
> > the NVDIMM device.
> >
> > In KVM, If the NVDIMM device need to be shared by different VMs, we can
> > create multiple files on the DAX-based filesystem and assign the file to
> > each VMs. In the future, we can enable namespace (partition-like) for
> > PMEM
> > memory and assign the namespace to each VMs (current Linux driver uses
> > the
> > whole PMEM as a single namespace).
> >
> > I think it is not a easy work to let Xen hypervisor recognize NVDIMM
> > device
> > and manager NVDIMM resource.
> >
> > Thanks!
> >
> 
> The more I see about this, the more sure I am that we want to keep it as
> a block device managed by dom0.
> 
> In the case of the DAX-based filesystem, I presume files are not
> necessarily contiguous.  I also presume that this is worked around by
> permuting the mapping of the virtual NVDIMM such that the it appears as
> a contiguous block of addresses to the guest?
> 
> Today in Xen, Qemu already has the ability to create mappings in the
> guest's address space, e.g. to map PCI device BARs.  I don't see a
> conceptual difference here, although the security/permission model
> certainly is more complicated.

I imagine that mmap'ing  these /dev/pmemXX devices require root
privileges, does it not?

I wouldn't encourage the introduction of anything else that requires
root privileges in QEMU. With QEMU running as non-root by default in
4.7, the feature will not be available unless users explicitly ask to
run QEMU as root (which they shouldn't really).

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v6 3/5] build: Alloc space for sched list in the link file

2016-01-20 Thread Jonathan Creekmore

> On Jan 15, 2016, at 11:01 AM, Jonathan Creekmore 
>  wrote:
> 
> Creates a section to contain scheduler entry pointers that are gathered
> together into an array. This will allow, in a follow-on patch, scheduler
> entries to be automatically gathered together into the array for
> automatic parsing.
> 
> CC: Ian Campbell 
> CC: Stefano Stabellini 
> CC: Keir Fraser 
> CC: Jan Beulich 
> CC: Andrew Cooper 
> Signed-off-by: Jonathan Creekmore 
> Reviewed-by: Andrew Cooper 
> Reviewed-by: Doug Goldstein 
> Reviewed-by: Konrad Rzeszutek Wilk 
> 
> ---
> Changed since v4:
>  * Remove defensive check for schedulers since the credit scheduler
>must always be present
> 
> Changed since v3:
>  * Add defensive check for schedulers in the linker
> 
> Changed since v1:
>  * rename the __start and __end symbols to better match
>the rest of the file
> ---
> xen/arch/arm/xen.lds.S | 4 
> xen/arch/x86/xen.lds.S | 4 
> 2 files changed, 8 insertions(+)
> 
> diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S
> index 0488f37..f501a2f 100644
> --- a/xen/arch/arm/xen.lds.S
> +++ b/xen/arch/arm/xen.lds.S
> @@ -57,6 +57,10 @@ SECTIONS
>. = ALIGN(PAGE_SIZE);
>*(.data.page_aligned)
>*(.data)
> +   . = ALIGN(8);
> +   __start_schedulers_array = .;
> +   *(.data.schedulers)
> +   __end_schedulers_array = .;
>*(.data.rel)
>*(.data.rel.*)
>CONSTRUCTORS
> diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S
> index e18e08f..c1ce027 100644
> --- a/xen/arch/x86/xen.lds.S
> +++ b/xen/arch/x86/xen.lds.S
> @@ -80,6 +80,10 @@ SECTIONS
>__stop___pre_ex_table = .;
> 
>*(.data.read_mostly)
> +   . = ALIGN(8);
> +   __start_schedulers_array = .;
> +   *(.data.schedulers)
> +   __end_schedulers_array = .;
>*(.data.rel.ro)
>*(.data.rel.ro.*)
>   } :text
> -- 
> 2.6.4


I am pretty sure, with Dario’s latest ACK on (2/5), that this patch is the only 
one in the series that has not been 
ACKed yet. Is there anything else that I need to do to get this series in, 
especially since the (1/5) CONFIG_EXPERT
patch has already landed?


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 0/4] add support for vNVDIMM

2016-01-20 Thread Stefano Stabellini
On Wed, 20 Jan 2016, Zhang, Haozhong wrote:
> On 01/20/16 12:43, Stefano Stabellini wrote:
> > On Wed, 20 Jan 2016, Tian, Kevin wrote:
> > > > From: Zhang, Haozhong
> > > > Sent: Tuesday, December 29, 2015 7:32 PM
> > > > 
> > > > This patch series is the Xen part patch to provide virtual NVDIMM to
> > > > guest. The corresponding QEMU patch series is sent separately with the
> > > > title "[PATCH 0/2] add vNVDIMM support for Xen".
> > > > 
> > > > * Background
> > > > 
> > > >  NVDIMM (Non-Volatile Dual In-line Memory Module) is going to be
> > > >  supported on Intel's platform. NVDIMM devices are discovered via ACPI
> > > >  and configured by _DSM method of NVDIMM device in ACPI. Some
> > > >  documents can be found at
> > > >  [1] ACPI 6: 
> > > > http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
> > > >  [2] NVDIMM Namespace: 
> > > > http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
> > > >  [3] DSM Interface Example:
> > > > http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
> > > >  [4] Driver Writer's Guide:
> > > > http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf
> > > > 
> > > >  The upstream QEMU (commits 5c42eef ~ 70d1fb9) has added support to
> > > >  provide virtual NVDIMM in PMEM mode, in which NVDIMM devices are
> > > >  mapped into CPU's address space and are accessed via normal memory
> > > >  read/write and three special instructions (clflushopt/clwb/pcommit).
> > > > 
> > > >  This patch series and the corresponding QEMU patch series enable Xen
> > > >  to provide vNVDIMM devices to HVM domains.
> > > > 
> > > > * Design
> > > > 
> > > >  Supporting vNVDIMM in PMEM mode has three requirements.
> > > > 
> > > 
> > > Although this design is about vNVDIMM, some background of how pNVDIMM
> > > is managed in Xen would be helpful to understand the whole design since
> > > in PMEM mode you need map pNVDIMM into GFN addr space so there's
> > > a matter of how pNVDIMM is allocated.
> > 
> > Yes, some background would be very helpful. Given that there are so many
> > moving parts on this (Xen, the Dom0 kernel, QEMU, hvmloader, libxl)
> > I suggest that we start with a design document for this feature.
> 
> Let me prepare a design document. Basically, it would include
> following contents. Please let me know if you want anything additional
> to be included.

Thank you!


> * What NVDIMM is and how it is used
> * Software interface of NVDIMM
>   - ACPI NFIT: what parameters are recorded and their usage
>   - ACPI SSDT: what _DSM methods are provided and their functionality
>   - New instructions: clflushopt/clwb/pcommit
> * How the linux kernel drives NVDIMM
>   - ACPI parsing
>   - Block device interface
>   - Partition NVDIMM devices
> * How KVM/QEMU implements vNVDIMM

This is a very good start.


> * What I propose to implement vNVDIMM in Xen
>   - Xen hypervisor/toolstack: new instruction enabling and address mapping
>   - Dom0 Linux kernel: host NVDIMM driver
>   - QEMU: virtual NFIT/SSDT, _DSM handling, and role in address mapping

This is OK. It might be also good to list other options that were
discussed, but it is certainly not necessary in first instance.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [distros-debian-squeeze test] 38672: trouble: blocked/broken/pass

2016-01-20 Thread Platform Team regression test user
flight 38672 distros-debian-squeeze real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/38672/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64   3 host-install(3) broken REGR. vs. 38630

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-i386-squeeze-netboot-pygrub  1 build-check(1) blocked n/a
 test-amd64-amd64-amd64-squeeze-netboot-pygrub  1 build-check(1)blocked n/a
 test-amd64-i386-amd64-squeeze-netboot-pygrub  1 build-check(1) blocked n/a
 test-amd64-i386-i386-squeeze-netboot-pygrub  1 build-check(1)  blocked n/a

baseline version:
 flight   38630

jobs:
 build-amd64  broken  
 build-armhf  pass
 build-i386   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-amd64-squeeze-netboot-pygrubblocked 
 test-amd64-i386-amd64-squeeze-netboot-pygrub blocked 
 test-amd64-amd64-i386-squeeze-netboot-pygrub blocked 
 test-amd64-i386-i386-squeeze-netboot-pygrub  blocked 



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Push not applicable.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface

2016-01-20 Thread Peng Fan
Hi Jan,

On Wed, Jan 20, 2016 at 07:16:36AM -0700, Jan Beulich wrote:
 On 20.01.16 at 15:05,  wrote:
>> Hi Jan,
>> On Wed, Jan 20, 2016 at 05:01:40AM -0700, Jan Beulich wrote:
>> On 20.01.16 at 12:40,  wrote:
 Hi Jan,
 
 On Wed, Jan 20, 2016 at 03:14:20AM -0700, Jan Beulich wrote:
 On 20.01.16 at 09:31,  wrote:
>> +/*
>> + * Backend response
>> + *
>> + * cmd: command for operation on clk, same with the cmd in request.
>> + * id: clk id, same with the id in request.
>> + * success: indicate failure or success for the cmd.
>> + * rate: clock rate. Used for get rate.
>> + *
>> + * cmd and id are filled by backend and passed to frontend to
>> + * let frontend check whether the response is for the current
>> + * request or not.
>> + */
>> +struct xen_clkif_response {
>> +uint32_t cmd;
>> +uint32_t id;
>> +uint32_t success;
>> +uint64_t rate;
>> +};
>
>This isn't 32-/64-bit clean. Question is whether echoing cmd is really
>needed.
 
 As Juergen suggested, use a request id. I'll fix this in V3.
 32-/64-bit unclean, I can not get you here (:
>>>
>>>The layout of above structure may end up different for 32- and
>>>64-bit guests, depending on the alignment of uint64_t.
>> 
>> Thanks for teaching me. In V3, the layout will be changed to like this
>> struct xen_clkif_response {
>>  uint32_t request_id;
>>  int32_t status;
>>  uint64_t rate;
>> };
>
>Okay. Just as a not - iirc other pv interfaces use 64-bit ID values,
>so not doing this here perhaps ought to be justified.

oh. I see uint64_t id in blkif.h :). Thanks.

>
>> And more macro definitions for the status entry:
>> #define XEN_CLK_OP_SUCCESS 0
>> #define XEN_CLK_ENABLE_FALIURE -1
>> #define XEN_CLK_DISABLE_FAILURE -2
>> #define XEN_CLK_PREPARE_FAILURE -3
>> #define XEN_CLK_UNPREPARE_FAILURE -4
>> #define XEN_CLK_SET_RATE_FAILURE -5
>> #define XEN_CLK_GET_RATE_FALIURE -6
>
>That's bogus - different kinds of errors would be meaningful,
>but an error per command is quite pointless imo. (Also please
>be aware of typos and parenthesization.)

Will fine tune this in V3.

>
>And what I'm missing throughout the file is some description of how
>clock events (interrupts?) are actually meant to make it into the
>guest.
 
 I have a simple implementation v1 patch for linux, 
 http://lists.xen.org/archives/html/xen-devel/2016-01/msg01860.html.
 You can see how it works.
>>>
>>>No, sorry, that's not the point of the inquiry. It seems to me that
>>>there are more aspects to this interface, not directly related to
>>>the request/reply protocol written down here, which need to be
>>>spelled out event if they don't require any particular definitions
>>>or type declarations.
>> 
>> I try to follow you about clock events(interrupts?), not sure whether I 
>> understand correct or not.
>> clock in this file is the clk for a device. In linux side, it managed by clk 
>> subsystem, drivers/clk/xx.
>> This is not the clock events or clock source in drivers/clocksource/xx.
>> For the pvclk interface, there will be no interrupt for the guest.
>
>Then (also considering the set of commands you propose) what
>use is the clock to the guest? It can't get events from it, it can't
>read its current value, all it can is get/set its rate, enable/disable,
>and prepare/unprepare it. I may be lacking some ARM knowledge
>here, but all of this looks pretty odd to me.

I follow this 
https://events.linuxfoundation.org/sites/events/files/slides/talk_5.pdf to do 
platform device
passthrough on ARM platform. I met the same issue as note in the ppt, at page 
24.

In my test case, the uart driver works well without xen. There is serveral 
function calls in the driver, such as
clk = clk_get("uart2_root"),rate = clk_get_rate(clk), use rate to set baudrate 
for uart.


There is a clk tree in kernel without XEN or dom0 kernel like the following 
(only an example):
osc -
|-->pll1
|-->pll2
 |-->pll2_div
 |-->uart2_gate
 |-->uart2_root  --> clk for uart2

uart2_root is directly handled by Dom0.If I assign uart2 to DomU, DomU does
not have the clk tree as above, so DomU can not handle directly handle 
uart2_root and the uart2 driver in
DomU will run into failure to intialize the uart2 hardware IP.

So I introudce pvclk. Pass the operation for uart2_root in DomU to Dom0 and 
Dom0 directly handle the clock management hardware IP.

Hope this is clear.

Thanks,
Peng.

>
>Jan
>

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/3] x86/paging: invlpg() hook returns boolean

2016-01-20 Thread Tim Deegan
At 07:07 -0700 on 20 Jan (1453273623), Jan Beulich wrote:
> ... so make its return type reflect this.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/3] x86/mmuext: tighten TLB flush address checks

2016-01-20 Thread Jan Beulich
>>> On 20.01.16 at 15:23,  wrote:
> On 20/01/16 14:05, Jan Beulich wrote:
>> Addresses passed by PV guests should be subjected to __addr_ok(),
>> avoiding undue TLB flushes; .
>>
>> Signed-off-by: Jan Beulich 
>>
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -3268,8 +3268,9 @@ long do_mmuext_op(
>>  case MMUEXT_INVLPG_LOCAL:
>>  if ( unlikely(d != pg_owner) )
>>  rc = -EPERM;
>> -else if ( !paging_mode_enabled(d) ||
>> -  paging_invlpg(curr, op.arg1.linear_addr) != 0 )
>> +else if ( !paging_mode_enabled(d)
>> +  ? __addr_ok(op.arg1.linear_addr)
>> +  : paging_invlpg(curr, op.arg1.linear_addr) )
> 
> paging_mode_enabled() changes depending on whether the guest has
> logdirty currently enabled.
> 
> As you have also patched paging_invlpg() to DTRT with __addr_ok(), I
> think this hunk can be dropped.

How that? Without the change there's no address validation at
all when !paging_mode_enabled(d).

> Everything else, Reviewed-by: Andrew Cooper 

Please clarify.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu

2016-01-20 Thread Haozhong Zhang
On 01/20/16 14:29, Stefano Stabellini wrote:
> On Wed, 20 Jan 2016, Andrew Cooper wrote:
> > On 20/01/16 10:36, Xiao Guangrong wrote:
> > >
> > > Hi,
> > >
> > > On 01/20/2016 06:15 PM, Haozhong Zhang wrote:
> > >
> > >> CCing QEMU vNVDIMM maintainer: Xiao Guangrong
> > >>
> > >>> Conceptually, an NVDIMM is just like a fast SSD which is linearly
> > >>> mapped
> > >>> into memory.  I am still on the dom0 side of this fence.
> > >>>
> > >>> The real question is whether it is possible to take an NVDIMM, split it
> > >>> in half, give each half to two different guests (with appropriate NFIT
> > >>> tables) and that be sufficient for the guests to just work.
> > >>>
> > >>
> > >> Yes, one NVDIMM device can be split into multiple parts and assigned
> > >> to different guests, and QEMU is responsible to maintain virtual NFIT
> > >> tables for each part.
> > >>
> > >>> Either way, it needs to be a toolstack policy decision as to how to
> > >>> split the resource.
> > >
> > > Currently, we are using NVDIMM as a block device and a DAX-based
> > > filesystem
> > > is created upon it in Linux so that file-related accesses directly reach
> > > the NVDIMM device.
> > >
> > > In KVM, If the NVDIMM device need to be shared by different VMs, we can
> > > create multiple files on the DAX-based filesystem and assign the file to
> > > each VMs. In the future, we can enable namespace (partition-like) for
> > > PMEM
> > > memory and assign the namespace to each VMs (current Linux driver uses
> > > the
> > > whole PMEM as a single namespace).
> > >
> > > I think it is not a easy work to let Xen hypervisor recognize NVDIMM
> > > device
> > > and manager NVDIMM resource.
> > >
> > > Thanks!
> > >
> > 
> > The more I see about this, the more sure I am that we want to keep it as
> > a block device managed by dom0.
> > 
> > In the case of the DAX-based filesystem, I presume files are not
> > necessarily contiguous.  I also presume that this is worked around by
> > permuting the mapping of the virtual NVDIMM such that the it appears as
> > a contiguous block of addresses to the guest?
> > 
> > Today in Xen, Qemu already has the ability to create mappings in the
> > guest's address space, e.g. to map PCI device BARs.  I don't see a
> > conceptual difference here, although the security/permission model
> > certainly is more complicated.
> 
> I imagine that mmap'ing  these /dev/pmemXX devices require root
> privileges, does it not?
>

Yes, unless we assign non-root access permissions to /dev/pmemXX (but
this is not the default behavior of linux kernel so far).

> I wouldn't encourage the introduction of anything else that requires
> root privileges in QEMU. With QEMU running as non-root by default in
> 4.7, the feature will not be available unless users explicitly ask to
> run QEMU as root (which they shouldn't really).
>

Yes, I'll include those privileged operations in the design document.

Haozhong

> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/3] x86/paging: invlpg() hook returns boolean

2016-01-20 Thread Jan Beulich
>>> On 20.01.16 at 15:30,  wrote:
> On 20/01/16 14:07, Jan Beulich wrote:
>> ... so make its return type reflect this.
>>
>> Signed-off-by: Jan Beulich 
> 
> Reviewed-by: Andrew Cooper , although this
> needs MM and shadow acks as well (CC'd).

Oh, sure - I meant to but then forgot; thanks for noticing!

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 4/4] hvmloader: add support to load extra ACPI tables from qemu

2016-01-20 Thread Andrew Cooper
On 20/01/16 14:29, Stefano Stabellini wrote:
> On Wed, 20 Jan 2016, Andrew Cooper wrote:
>> On 20/01/16 10:36, Xiao Guangrong wrote:
>>> Hi,
>>>
>>> On 01/20/2016 06:15 PM, Haozhong Zhang wrote:
>>>
 CCing QEMU vNVDIMM maintainer: Xiao Guangrong

> Conceptually, an NVDIMM is just like a fast SSD which is linearly
> mapped
> into memory.  I am still on the dom0 side of this fence.
>
> The real question is whether it is possible to take an NVDIMM, split it
> in half, give each half to two different guests (with appropriate NFIT
> tables) and that be sufficient for the guests to just work.
>
 Yes, one NVDIMM device can be split into multiple parts and assigned
 to different guests, and QEMU is responsible to maintain virtual NFIT
 tables for each part.

> Either way, it needs to be a toolstack policy decision as to how to
> split the resource.
>>> Currently, we are using NVDIMM as a block device and a DAX-based
>>> filesystem
>>> is created upon it in Linux so that file-related accesses directly reach
>>> the NVDIMM device.
>>>
>>> In KVM, If the NVDIMM device need to be shared by different VMs, we can
>>> create multiple files on the DAX-based filesystem and assign the file to
>>> each VMs. In the future, we can enable namespace (partition-like) for
>>> PMEM
>>> memory and assign the namespace to each VMs (current Linux driver uses
>>> the
>>> whole PMEM as a single namespace).
>>>
>>> I think it is not a easy work to let Xen hypervisor recognize NVDIMM
>>> device
>>> and manager NVDIMM resource.
>>>
>>> Thanks!
>>>
>> The more I see about this, the more sure I am that we want to keep it as
>> a block device managed by dom0.
>>
>> In the case of the DAX-based filesystem, I presume files are not
>> necessarily contiguous.  I also presume that this is worked around by
>> permuting the mapping of the virtual NVDIMM such that the it appears as
>> a contiguous block of addresses to the guest?
>>
>> Today in Xen, Qemu already has the ability to create mappings in the
>> guest's address space, e.g. to map PCI device BARs.  I don't see a
>> conceptual difference here, although the security/permission model
>> certainly is more complicated.
> I imagine that mmap'ing  these /dev/pmemXX devices require root
> privileges, does it not?

I presume it does, although mmap()ing a file on a DAX filesystem will
work in the standard POSIX way.

Neither of these are sufficient however.  That gets Qemu a mapping of
the NVDIMM, not the guest.  Something, one way or another, has to turn
this into appropriate add-to-phymap hypercalls.

>
> I wouldn't encourage the introduction of anything else that requires
> root privileges in QEMU. With QEMU running as non-root by default in
> 4.7, the feature will not be available unless users explicitly ask to
> run QEMU as root (which they shouldn't really).

This isn't how design works.

First, design a feature in an architecturally correct way, and then
design an security policy to fit.  (note, both before implement happens).

We should not stunt design based on an existing implementation.  In
particular, if design shows that being a root only feature is the only
sane way of doing this, it should be a root only feature.  (I hope this
is not the case, but it shouldn't cloud the judgement of a design).

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


  1   2   3   >