[Xen-devel] [GIT PULL] xen: fix for 4.15-rc7

2018-01-05 Thread Juergen Gross
Linus,

Please git pull the following tag:

 git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git for-linus-4.15-rc7-tag

xen: fix for 4.15-rc7

It contains one minor fix adjusting the kmalloc flags in the new
pvcalls driver added in rc1.

Thanks.

Juergen

 drivers/xen/pvcalls-front.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Wei Yongjun (1):
  xen/pvcalls: use GFP_ATOMIC under spin lock
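
For context, the class of bug being fixed looks roughly like the sketch
below (illustrative only -- not the actual pvcalls-front.c code, and the
names are made up). An allocation made while holding a spinlock must not
sleep, so GFP_KERNEL has to become GFP_ATOMIC:

    #include <linux/errno.h>
    #include <linux/slab.h>
    #include <linux/spinlock.h>

    static DEFINE_SPINLOCK(demo_lock);

    static int demo_alloc_under_lock(size_t len)
    {
        void *buf;

        spin_lock(&demo_lock);
        /* GFP_KERNEL may sleep to reclaim memory, which is illegal with
         * a spinlock held; GFP_ATOMIC never sleeps, at the cost of a
         * higher chance of failure. */
        buf = kmalloc(len, GFP_ATOMIC);
        if (!buf) {
            spin_unlock(&demo_lock);
            return -ENOMEM;
        }
        /* ... use buf while still under the lock ... */
        spin_unlock(&demo_lock);
        kfree(buf);
        return 0;
    }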


[Xen-devel] [PATCH] xen-netfront: enable device after manual module load

2018-01-05 Thread Eduardo Otubo
When loading the module after unloading it, the network interface would
not be enabled, and thus would have no backend counterpart and would be
unusable by the guest.

The guest would face errors like:

  [root@guest ~]# ethtool -i eth0
  Cannot get driver information: No such device

  [root@guest ~]# ifconfig eth0
  eth0: error fetching interface information: Device not found

This patch initializes the state of the netfront device whenever it is
loaded manually; this state signals netback to create its device and
establish the connection between them.

Signed-off-by: Eduardo Otubo 
---
 drivers/net/xen-netfront.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index c5a34671abda..9bd7ddeeb6a5 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -1326,6 +1326,7 @@ static struct net_device *xennet_create_dev(struct xenbus_device *dev)
 
netif_carrier_off(netdev);
 
+   xenbus_switch_state(dev, XenbusStateInitialising);
return netdev;
 
  exit:
-- 
2.14.3



[Xen-devel] Ping: [PATCH RFC v2] x86/domctl: Don't pause the whole domain if only getting vcpu state

2018-01-05 Thread Alexandru Stefan ISAILA
Any thoughts appreciated.

On Fri, 2017-10-06 at 13:02 +0300, Alexandru Isaila wrote:
> This patch adds the hvm_save_one_cpu_ctxt() function.
> It optimizes the HVMSR_PER_VCPU save callbacks by pausing only the vcpu
> whose data is required, rather than the whole domain.
>
> Signed-off-by: Alexandru Isaila 
>
> ---
> Changes since V1:
> - Integrated the vcpu check into all the save callbacks
> ---
>  tools/tests/vhpet/emul.h   |   3 +-
>  tools/tests/vhpet/main.c   |   2 +-
>  xen/arch/x86/cpu/mcheck/vmce.c |  16 ++-
>  xen/arch/x86/domctl.c  |   2 -
>  xen/arch/x86/hvm/hpet.c|   2 +-
>  xen/arch/x86/hvm/hvm.c | 280 ++---
>  xen/arch/x86/hvm/i8254.c   |   2 +-
>  xen/arch/x86/hvm/irq.c |   6 +-
>  xen/arch/x86/hvm/mtrr.c|  32 -
>  xen/arch/x86/hvm/pmtimer.c |   2 +-
>  xen/arch/x86/hvm/rtc.c |   2 +-
>  xen/arch/x86/hvm/save.c|  71 ---
>  xen/arch/x86/hvm/vioapic.c |   2 +-
>  xen/arch/x86/hvm/viridian.c|  17 ++-
>  xen/arch/x86/hvm/vlapic.c  |  23 +++-
>  xen/arch/x86/hvm/vpic.c|   2 +-
>  xen/include/asm-x86/hvm/hvm.h  |   2 +
>  xen/include/asm-x86/hvm/save.h |   5 +-
>  18 files changed, 324 insertions(+), 147 deletions(-)
>
> diff --git a/tools/tests/vhpet/emul.h b/tools/tests/vhpet/emul.h
> index 383acff..99d5bbd 100644
> --- a/tools/tests/vhpet/emul.h
> +++ b/tools/tests/vhpet/emul.h
> @@ -296,7 +296,8 @@ struct hvm_hw_hpet
>  };
>
>  typedef int (*hvm_save_handler)(struct domain *d,
> -hvm_domain_context_t *h);
> +hvm_domain_context_t *h,
> +unsigned int instance);
>  typedef int (*hvm_load_handler)(struct domain *d,
>  hvm_domain_context_t *h);
>
> diff --git a/tools/tests/vhpet/main.c b/tools/tests/vhpet/main.c
> index 6fe65ea..3d8e7f5 100644
> --- a/tools/tests/vhpet/main.c
> +++ b/tools/tests/vhpet/main.c
> @@ -177,7 +177,7 @@ void __init hvm_register_savevm(uint16_t
> typecode,
>
>  int do_save(uint16_t typecode, struct domain *d,
> hvm_domain_context_t *h)
>  {
> -return hvm_sr_handlers[typecode].save(d, h);
> +return hvm_sr_handlers[typecode].save(d, h, d->max_vcpus);
>  }
>
>  int do_load(uint16_t typecode, struct domain *d,
> hvm_domain_context_t *h)
> diff --git a/xen/arch/x86/cpu/mcheck/vmce.c
> b/xen/arch/x86/cpu/mcheck/vmce.c
> index e07cd2f..a1a12a5 100644
> --- a/xen/arch/x86/cpu/mcheck/vmce.c
> +++ b/xen/arch/x86/cpu/mcheck/vmce.c
> @@ -349,12 +349,24 @@ int vmce_wrmsr(uint32_t msr, uint64_t val)
>  return ret;
>  }
>
> -static int vmce_save_vcpu_ctxt(struct domain *d, hvm_domain_context_t *h)
> +static int vmce_save_vcpu_ctxt(struct domain *d, hvm_domain_context_t *h, unsigned int instance)
>  {
>  struct vcpu *v;
>  int err = 0;
>
> -for_each_vcpu ( d, v )
> +if( instance < d->max_vcpus )
> +{
> +struct hvm_vmce_vcpu ctxt;
> +
> +v = d->vcpu[instance];
> +ctxt.caps = v->arch.vmce.mcg_cap;
> +ctxt.mci_ctl2_bank0 = v->arch.vmce.bank[0].mci_ctl2;
> +ctxt.mci_ctl2_bank1 = v->arch.vmce.bank[1].mci_ctl2;
> +ctxt.mcg_ext_ctl = v->arch.vmce.mcg_ext_ctl;
> +
> +err = hvm_save_entry(VMCE_VCPU, v->vcpu_id, h, &ctxt);
> +}
> +else for_each_vcpu ( d, v )
>  {
>  struct hvm_vmce_vcpu ctxt = {
>  .caps = v->arch.vmce.mcg_cap,
> diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
> index 540ba08..d3c4e14 100644
> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -624,12 +624,10 @@ long arch_do_domctl(
>   !is_hvm_domain(d) )
>  break;
>
> -domain_pause(d);
>  ret = hvm_save_one(d, domctl->u.hvmcontext_partial.type,
> domctl->u.hvmcontext_partial.instance,
> domctl->u.hvmcontext_partial.buffer,
> &domctl->u.hvmcontext_partial.bufsz);
> -domain_unpause(d);
>
>  if ( !ret )
>  copyback = true;
> diff --git a/xen/arch/x86/hvm/hpet.c b/xen/arch/x86/hvm/hpet.c
> index 3ea895a..56f4691 100644
> --- a/xen/arch/x86/hvm/hpet.c
> +++ b/xen/arch/x86/hvm/hpet.c
> @@ -509,7 +509,7 @@ static const struct hvm_mmio_ops hpet_mmio_ops =
> {
>  };
>
>
> -static int hpet_save(struct domain *d, hvm_domain_context_t *h)
> +static int hpet_save(struct domain *d, hvm_domain_context_t *h, unsigned int instance)
>  {
>  HPETState *hp = domain_vhpet(d);
>  struct vcpu *v = pt_global_vcpu_target(d);
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 205b4cb..140f2c3 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -728,13 +728,19 @@ void hvm_domain_destroy(struct domain *d)
>  }
>  }
>
> -static int hvm_save_tsc_adjust(struct domain *d, hvm_domain_context_t *h)
> +static int hvm_save_tsc_adjust(struct domain *d, hvm_

Re: [Xen-devel] PCI Device Subtree Change from Traditional to Upstream

2018-01-05 Thread Paul Durrant
> -Original Message-
> From: Kevin Stange [mailto:ke...@steadfast.net]
> Sent: 04 January 2018 21:17
> To: Paul Durrant 
> Cc: George Dunlap ; xen-
> de...@lists.xenproject.org; Anthony Perard 
> Subject: Re: [Xen-devel] PCI Device Subtree Change from Traditional to
> Upstream
> 
> On 01/04/2018 07:26 AM, Paul Durrant wrote:
> >> -Original Message-
> >> From: Xen-devel [mailto:xen-devel-boun...@lists.xenproject.org] On
> Behalf
> >> Of Anthony PERARD
> >> Sent: 04 January 2018 12:52
> >> To: Kevin Stange 
> >> Cc: George Dunlap ; xen-
> >> de...@lists.xenproject.org
> >> Subject: Re: [Xen-devel] PCI Device Subtree Change from Traditional to
> >> Upstream
> >>
> >> On Wed, Jan 03, 2018 at 05:10:54PM -0600, Kevin Stange wrote:
> >>> On 01/03/2018 11:57 AM, Anthony PERARD wrote:
>  On Wed, Dec 20, 2017 at 11:40:03AM -0600, Kevin Stange wrote:
> > Hi,
> >
> > I've been working on transitioning a number of Windows guests under HVM
> > from using QEMU traditional to QEMU upstream as is recommended in the
> > documentation.  When I move these guests, the PCI subtree for Xen
> > devices changes and Windows creates a totally new copy of each device.
> > Windows tracks down the storage without issue, but it treats the new
> > instance of the NIC driver as a new device and clears the network
> > configuration even though the MAC address is unchanged.  Manually
> > booting the guest back on the traditional device model reactivates the
> > original PCI subtree and the old network configuration with it.
> >
> > The only thing that I have been able to find that's substantially
> > different comparing the device trees is that the device instance ID
> > values differ on the parent Xen PCI device:
> >
> > PCI\VEN_5853&DEV_0001&SUBSYS_00015853&REV_01\3&267A616A&3&18
> > PCI\VEN_5853&DEV_0001&SUBSYS_00015853&REV_01\3&267A616A&3&10
> >
> > Besides actually setting the guest to boot using QEMU traditional, is
> > there a way to convince Windows to treat these devices as the same?  A
> > patch-based solution would be acceptable to me if there is one, but I
> > don't understand the code well enough to create my own solution.
> >
> > Kevin,
> >
> > I missed the original email as it went past...
> >
> > Are Xen Project PV drivers installed in the guest? And are you talking about
> a PV NIC device or an emulated device?
> 
> These guests use some of the older Xen PV drivers with a PV NIC, not an
> emulated device.
> 

Ok. I was curious because the latest PV drivers contain a hack (that was 
actually suggested by someone at Microsoft) to make sure that (as far as the 
Windows PnP subsystem is concerned) the Xen platform device never moves once 
the XENBUS driver has been installed. This is done by installing a filter 
driver onto Windows' PCI bus driver that spots the platform device and 
re-writes the trailing 'uniquifier' to be exactly what it was at the time of 
driver installation.
So, if you update your VMs to use newer PV drivers first, then you should be 
immune to the platform device moving on the bus.

Cheers,

  Paul

> --
> Kevin Stange
> Chief Technology Officer
> Steadfast | Managed Infrastructure, Datacenter and Cloud Services
> 800 S Wells, Suite 190 | Chicago, IL 60607
> 312.602.2689 X203 | Fax: 312.602.2688
> ke...@steadfast.net | www.steadfast.net

[Xen-devel] Xen 4.11 Development Update

2018-01-05 Thread Juergen Gross
This email only tracks big items for the xen.git tree. Please reply with items
you would like to see in 4.11 so that people have an idea of what is going on
and can prioritise accordingly.

You're welcome to provide description and use cases of the feature you're
working on.

= Timeline =

We now adopt a fixed cut-off date scheme. We will release twice a
year. The upcoming 4.11 timeline is as follows:

* Last posting date: March 16th, 2018
* Hard code freeze: March 30th, 2018
* RC1: TBD
* Release: June 1st, 2018

Note that we no longer have a freeze exception scheme. All patches
that are to go into 4.11 must be posted no later than the last posting
date. Patches posted after that date will automatically be queued
for the next release.

RCs will be arranged immediately after freeze.

We recently introduced a Jira instance to track all the tasks (not only the
big ones) for the project. See: https://xenproject.atlassian.net/projects/XEN/issues.

Most of the tasks tracked by this e-mail also have a corresponding Jira task,
referred to as XEN-N.

I have started to include the version number of the series associated with each
feature. Can each owner send an update on the version number if the series
was posted upstream?

= Projects =

== Hypervisor == 

*  Per-cpu tasklet
  -  XEN-28
  -  Konrad Rzeszutek Wilk

=== x86 === 

*  Enable Memory Bandwidth Allocation in Xen (v10)
  -  XEN-48
  -  Yi Sun

*  guest resource mapping (v17)
  -  Paul Durrant

*  vNVDIMM support for HVM guest (RFC v4)
  -  XEN-45
  -  Haozhong Zhang

*  SMMUv3 driver (RFC v4)
  -  Sameer Goel

== Grub2 == 

*  Support PVH guest boot (v1)
  -  Juergen Gross


Juergen Gross


Re: [Xen-devel] [PATCH RFC 01/44] passthrough/vtd: Don't DMA to the stack in queue_invalidate_wait()

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 21:21,  wrote:
> DMA-ing to the stack is generally considered bad practice.  In this case, if a
> timeout occurs because of a sluggish device which is processing the request,
> the completion notification will corrupt the stack of a subsequent deeper call
> tree.
> 
> Place the poll_slot in a percpu area and DMA to that instead.
> 
> Note: This change does not address other issues with the current
> implementation, such as once a timeout has been suffered, subsequent
> completions can't be correlated with their requests.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 

> Julien: This wants backporting to all releases, and therefore should be
> considered for 4.10 at this point.

Interesting remark at this point in time ;-)

Jan
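
For readers following along, the shape of the change is roughly as below --
a sketch assuming Xen's DEFINE_PER_CPU/this_cpu helpers, with function and
constant names only approximating the real vtd code:

    /* A per-pCPU DMA target: unlike a stack variable, its lifetime
     * outlives any single call, so a late completion write from a
     * sluggish IOMMU cannot corrupt a subsequent, deeper call tree. */
    static DEFINE_PER_CPU(volatile u32, poll_slot);

    static int queue_invalidate_wait_sketch(void)
    {
        volatile u32 *slot = &this_cpu(poll_slot); /* was on the stack */

        *slot = QINVAL_STAT_INIT;
        /* ... queue a wait descriptor asking the IOMMU to DMA
         * QINVAL_STAT_DONE to the machine address of *slot, then poll ... */
        while ( *slot != QINVAL_STAT_DONE )
            cpu_relax();

        return 0;
    }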



Re: [Xen-devel] [PATCH FAIRLY-RFC 00/44] x86: Prerequisite work for a Xen KAISER solution

2018-01-05 Thread Andrew Cooper
On 05/01/2018 07:48, Juergen Gross wrote:
> On 04/01/18 21:21, Andrew Cooper wrote:
>> This work was developed as an SP3 mitigation, but shelved when it became clear
>> that it wasn't viable to get done in the timeframe.
>>
>> To protect against SP3 attacks, most mappings need to be flushed while in
>> user context.  However, to protect against all cross-VM attacks, it is
>> necessary to ensure that the Xen stacks are not mapped in any other CPU's
>> address space, or an attacker can still recover at least the GPR state of
>> separate VMs.
> The above statement is too strict: it would be sufficient if no stacks of
> other domains are mapped.

Sadly not.  Having stacks shared by domain means one vcpu can still
steal at least GPR state from other vcpus belonging to the same domain.

Whether or not a specific kernel cares, some definitely will.

> I'm just working on a proof of concept using dedicated per-vcpu stacks
> for 64 bit pv domains. Those stacks would be mapped in the per-domain
> region of the address space. I hope to have a RFC version of the patches
> ready next week.
>
> This would allow to remove the per physical cpu mappings in the guest
> visible address space when doing page table isolation.
>
> In order to avoid SP3 attacks to other vcpu's stacks of the same guest
> we could extend the pv ABI to mark a guest's user L4 page table as
> "single use", i.e. not allowed to be active on multiple vcpus at the
> same time (introducing that ABI modification in the Linux kernel would
> be simple, as the Linux kernel currently lacks support for cross-cpu
> stack exploits and when that support is being added by per-cpu L4 user
> page tables we could just chime in). A L4 page table marked as "single
> use" would map the local vcpu stacks only.

For PV guests, it is the Xen stacks which matter, not the vcpu guest
kernel's ones.

64bit PV guest kernels are already mitigated better than KPTI can ever
manage, because there are no entry stacks or entry stubs required to be
mapped into guest userspace at all.

>> To have isolated stacks, Xen needs a per-pcpu isolated region, which requires
>> that two pCPUs never share the same %cr3.  This is trivial for 32bit PV guests
>> and HVM guests due to the existing per-vcpu Monitor Tables, but is problematic
>> for 64bit PV guests, which will run on the same %cr3 when scheduling different
>> threads from the same process.
>>
>> To avoid breaking the PV ABI, Xen needs to shadow the guest L4 pagetables if
>> it wants to maintain the unique %cr3 property it needs.
>>
>> tl;dr The shadowing algorithm in pt-shadow.c is too much of a performance
>> overhead to be viable, and very high risk to productise in an embargo window.
>> If we want to continue down this route, we either need someone to have a
>> clever alternative to the shadowing algorithm I came up with, or change the PV
>> ABI to require VMs not to share L4 pagetables.
>>
>> Either way, these patches are presented to start a discussion of the issues.
>> The series as a whole is not in a suitable state for committing.
> I think patch 1 should be excluded from that statement, as it is not
> directly related to the series.

There are bits of the series I do intend to take in, largely in this
form.  Another is "x86/pv: Drop support for paging out the LDT" because
it's long since time for that to disappear.

I should also say that the net changes to context switch and
critical-structure handling across this series are a performance and
security benefit, irrespective of the KAISER/KPTI side of things. 
They'd qualify for inclusion on their own merits alone (if it weren't
for the dependent L4 shadowing issues).

If you're interested, I stumbled onto patch one after introducing the
per-pcpu stack mapping, as virt_to_maddr() came out spectacularly
wrong.  Very observant readers might also notice the bit of misc
debugging which caused me to blindly stumble into XSA-243, which was an
interesting diversion from Xen crashing because of my own pagetable
mistakes.

~Andrew


Re: [Xen-devel] Xen 4.11 Development Update

2018-01-05 Thread Jan Beulich
>>> On 05.01.18 at 10:16,  wrote:
> === x86 === 
> 
> *  Enable Memory Bandwidth Allocation in Xen (v10)
>   -  XEN-48
>   -  Yi Sun
> 
> *  guest resource mapping (v17)
>   -  Paul Durrant
> 
> *  vNVDIMM support for HVM guest (RFC v4)
>   -  XEN-45
>   -  Haozhong Zhang
> 
> *  SMMUv3 driver (RFC v4)
>   -  Sameer Goel

I don't think this is x86, but ARM.

I think the PV-shim and per-CPU/L4-shadowing work would now
also belong on this list.

Another x86 item is the emulator additions to support post-AVX
insns and some other, earlier ones we don't have support for
yet. The main parts of that series have now been pending review
for over half a year, I think. I do realize that the recently
published news has had a meaningful impact on the bandwidth
available for review here, but to be honest I'm not very positive
that the situation would be much different if those issues hadn't
been there. Once I get into the position to do the AVX512 work,
I don't even want to think of how long its review may then take.

I don't think it is the right time to propose a (perhaps somewhat
radical/controversial) solution to this, but once things have
calmed down, I think I will have to do so. Otoh those recent
events may mean that not much other development work can be
completed anyway by mid-March.

Jan



Re: [Xen-devel] [PATCH RFC 01/44] passthrough/vtd: Don't DMA to the stack in queue_invalidate_wait()

2018-01-05 Thread Andrew Cooper
On 05/01/2018 09:21, Jan Beulich wrote:
>>>> On 04.01.18 at 21:21,  wrote:
>> DMA-ing to the stack is generally considered bad practice.  In this case, if a
>> timeout occurs because of a sluggish device which is processing the request,
>> the completion notification will corrupt the stack of a subsequent deeper call
>> tree.
>>
>> Place the poll_slot in a percpu area and DMA to that instead.
>>
>> Note: This change does not address other issues with the current
>> implementation, such as once a timeout has been suffered, subsequent
>> completions can't be correlated with their requests.
>>
>> Signed-off-by: Andrew Cooper 
> Reviewed-by: Jan Beulich 
>
>> Julien: This wants backporting to all releases, and therefore should be
>> considered for 4.10 at this point.
> Interesting remark at this point in time ;-)

Oops yes.  This might leak the point at which I shelved the plan.

With this all out in the open now, observant people might notice how
many of my 4.10 patches are relevant to the issues at hand.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] Dynamic Disassembling domU Instructions

2018-01-05 Thread Jan Beulich
>>> On 05.01.18 at 04:17,  wrote:
> I am trying to modify Xen 4.8 to have it print out the opcode as well as
> some registers of an HVM domU as it runs. I tried to modify
> xen/arch/x86/hvm/emulate.c's hvmemul_insn_fetch to output the content in
> hvmemul_ctxt->insn_buf with printk. In hvmemul_insn_fetch, it seems that a
> lot of the requested bytes are cached; does the domU's OS repeatedly call
> the same instruction region over and over again?

No, but certain operations require going through the emulator
twice (e.g. once to formulate a request to qemu, and a second
time to process its response). It would be wrong to read guest
memory a second time in such a case.

You will also notice that after a completed emulation that cache
is being invalidated.

> Lastly, I am using printk to log the opcodes. Ideally I would want the
> opcode to be written to a separate file, but I read that it is not good to
> do any file access in kernel programming. Are there other alternatives or
> util functions that I should consider using?

xentrace would come to mind.

Jan
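
As a rough sketch of the instrumentation being discussed (hypothetical
code; the field names follow Xen's struct hvm_emulate_ctxt, and xentrace
would replace the printk for anything beyond ad-hoc debugging):

    /* Dump the bytes the emulator just fetched; callable from the end
     * of hvmemul_insn_fetch().  Illustrative only. */
    static void dump_insn_buf(const struct hvm_emulate_ctxt *hvmemul_ctxt)
    {
        unsigned int i;

        printk("d%dv%d insn:", current->domain->domain_id,
               current->vcpu_id);
        for ( i = 0; i < hvmemul_ctxt->insn_buf_bytes; i++ )
            printk(" %02x", hvmemul_ctxt->insn_buf[i]);
        printk("\n");
    }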



Re: [Xen-devel] [PATCH FAIRLY-RFC 00/44] x86: Prerequisite work for a Xen KAISER solution

2018-01-05 Thread Juergen Gross
On 05/01/18 10:26, Andrew Cooper wrote:
> On 05/01/2018 07:48, Juergen Gross wrote:
>> On 04/01/18 21:21, Andrew Cooper wrote:
>>> This work was developed as an SP3 mitigation, but shelved when it became clear
>>> that it wasn't viable to get done in the timeframe.
>>>
>>> To protect against SP3 attacks, most mappings need to be flushed while in
>>> user context.  However, to protect against all cross-VM attacks, it is
>>> necessary to ensure that the Xen stacks are not mapped in any other CPU's
>>> address space, or an attacker can still recover at least the GPR state of
>>> separate VMs.
>> The above statement is too strict: it would be sufficient if no stacks of
>> other domains are mapped.
> 
> Sadly not.  Having stacks shared by domain means one vcpu can still
> steal at least GPR state from other vcpus belonging to the same domain.
> 
> Whether or not a specific kernel cares, some definitely will.
> 
>> I'm just working on a proof of concept using dedicated per-vcpu stacks
>> for 64 bit pv domains. Those stacks would be mapped in the per-domain
>> region of the address space. I hope to have a RFC version of the patches
>> ready next week.
>>
>> This would allow to remove the per physical cpu mappings in the guest
>> visible address space when doing page table isolation.
>>
>> In order to avoid SP3 attacks to other vcpu's stacks of the same guest
>> we could extend the pv ABI to mark a guest's user L4 page table as
>> "single use", i.e. not allowed to be active on multiple vcpus at the
>> same time (introducing that ABI modification in the Linux kernel would
>> be simple, as the Linux kernel currently lacks support for cross-cpu
>> stack exploits and when that support is being added by per-cpu L4 user
>> page tables we could just chime in). A L4 page table marked as "single
>> use" would map the local vcpu stacks only.
> 
> For PV guests, it is the Xen stacks which matter, not the vcpu guest
> kernel's ones.

Indeed. That's the reason I want to have per-vcpu Xen stacks.

> 64bit PV guest kernels are already mitigated better than KPTI can ever
> manage, because there are no entry stacks or entry stubs required to be
> mapped into guest userspace at all.

But without Xen being secured via a mechanism similar to KPTI this
is moot, as user mode can exploit the whole host including its own
kernel's memory.


Juergen


Re: [Xen-devel] Xen 4.11 Development Update

2018-01-05 Thread Manish Jaggi
Hello Juergen,

On 5 January 2018 at 14:46, Juergen Gross  wrote:
> This email only tracks big items for the xen.git tree. Please reply with items
> you would like to see in 4.11 so that people have an idea of what is going on
> and can prioritise accordingly.
>
> You're welcome to provide description and use cases of the feature you're
> working on.
>
> = Timeline =
>
> We now adopt a fixed cut-off date scheme. We will release twice a
> year. The upcoming 4.11 timeline is as follows:
>
> * Last posting date: March 16th, 2018
> * Hard code freeze: March 30th, 2018
> * RC1: TBD
> * Release: June 1st, 2018
>
> Note that we no longer have a freeze exception scheme. All patches
> that are to go into 4.11 must be posted no later than the last posting
> date. Patches posted after that date will automatically be queued
> for the next release.
>
> RCs will be arranged immediately after freeze.
>
> We recently introduced a Jira instance to track all the tasks (not only the
> big ones) for the project. See: https://xenproject.atlassian.net/projects/XEN/issues.
>
> Most of the tasks tracked by this e-mail also have a corresponding Jira task,
> referred to as XEN-N.
>
> I have started to include the version number of the series associated with each
> feature. Can each owner send an update on the version number if the series
> was posted upstream?
>
> = Projects =
>
> == Hypervisor ==
>
> *  Per-cpu tasklet
>   -  XEN-28
>   -  Konrad Rzeszutek Wilk
>
> === x86 ===
>
> *  Enable Memory Bandwidth Allocation in Xen (v10)
>   -  XEN-48
>   -  Yi Sun
>
> *  guest resource mapping (v17)
>   -  Paul Durrant
>
> *  vNVDIMM support for HVM guest (RFC v4)
>   -  XEN-45
>   -  Haozhong Zhang
>
> *  SMMUv3 driver (RFC v4)
>   -  Sameer Goel
>
> == Grub2 ==
>
> *  Support PVH guest boot (v1)
>   -  Juergen Gross
>
>
Please add "arm: IORT support for Xen" as a candidate for 4.11.
I have posted an RFC [1].
This patchset corresponds to the XEN-70 / XEN-74 Jira tasks.

[1] https://lists.xenproject.org/archives/html/xen-devel/2018-01/msg7.html

-Manish Jaggi

> Juergen Gross
>


Re: [Xen-devel] [PATCH FAIRLY-RFC 00/44] x86: Prerequisite work for a Xen KAISER solution

2018-01-05 Thread Andrew Cooper
On 05/01/2018 09:39, Juergen Gross wrote:
> On 05/01/18 10:26, Andrew Cooper wrote:
>> On 05/01/2018 07:48, Juergen Gross wrote:
>>> On 04/01/18 21:21, Andrew Cooper wrote:
>>>> This work was developed as an SP3 mitigation, but shelved when it became clear
>>>> that it wasn't viable to get done in the timeframe.
>>>>
>>>> To protect against SP3 attacks, most mappings need to be flushed while in
>>>> user context.  However, to protect against all cross-VM attacks, it is
>>>> necessary to ensure that the Xen stacks are not mapped in any other CPU's
>>>> address space, or an attacker can still recover at least the GPR state of
>>>> separate VMs.
>>> The above statement is too strict: it would be sufficient if no stacks of
>>> other domains are mapped.
>> Sadly not.  Having stacks shared by domain means one vcpu can still
>> steal at least GPR state from other vcpus belonging to the same domain.
>>
>> Whether or not a specific kernel cares, some definitely will.
>>
>>> I'm just working on a proof of concept using dedicated per-vcpu stacks
>>> for 64 bit pv domains. Those stacks would be mapped in the per-domain
>>> region of the address space. I hope to have a RFC version of the patches
>>> ready next week.
>>>
>>> This would allow to remove the per physical cpu mappings in the guest
>>> visible address space when doing page table isolation.
>>>
>>> In order to avoid SP3 attacks to other vcpu's stacks of the same guest
>>> we could extend the pv ABI to mark a guest's user L4 page table as
>>> "single use", i.e. not allowed to be active on multiple vcpus at the
>>> same time (introducing that ABI modification in the Linux kernel would
>>> be simple, as the Linux kernel currently lacks support for cross-cpu
>>> stack exploits and when that support is being added by per-cpu L4 user
>>> page tables we could just chime in). A L4 page table marked as "single
>>> use" would map the local vcpu stacks only.
>> For PV guests, it is the Xen stacks which matter, not the vcpu guest
>> kernel's ones.
> Indeed. That's the reason I want to have per-vcpu Xen stacks.

We will have to be extra careful going along those lines (and to
forewarn you, I don't have a good gut feeling about it).

For one, livepatching safety currently depends on the per-pcpu stacks. 
Also, you will have to entirely rework how the IST stacks work, as they
will have to move to being per-vcpu as well, which means modifying the
TSS and rewriting the syscall stubs on context switch.

At the moment, Xen's per-pcpu stacks have shielded us from some of the
SP2/RSB issues, because of reset_stack_and_jump() used during
scheduling.  The waitqueue infrastructure is the one place where this is
violated at the moment, and is only used in practice during
introspection.  However, for other reasons, I'm looking to delete that
code and pretend that it never existed.

~Andrew


Re: [Xen-devel] Xen 4.11 Development Update

2018-01-05 Thread Juergen Gross
On 05/01/18 10:32, Jan Beulich wrote:
>>>> On 05.01.18 at 10:16,  wrote:
>> === x86 === 
>>
>> *  Enable Memory Bandwidth Allocation in Xen (v10)
>>   -  XEN-48
>>   -  Yi Sun
>>
>> *  guest resource mapping (v17)
>>   -  Paul Durrant
>>
>> *  vNVDIMM support for HVM guest (RFC v4)
>>   -  XEN-45
>>   -  Haozhong Zhang
>>
>> *  SMMUv3 driver (RFC v4)
>>   -  Sameer Goel
> 
> I don't think this is x86, but ARM.

Right, that was just an error in my script for generating the mail.

> I think the PV-shim and per-CPU/L4-shadowing work would now
> also belong on this list.

Yep.

> Another x86 item is the emulator additions to support post-AVX
> insns and some other, earlier ones we don't have support for
> yet. The main parts of that series have now been pending review
> for over half a year, I think. I do realize that the recently
> published news has had a meaningful impact on the bandwidth
> available for review here, but to be honest I'm not very positive
> that the situation would be much different if those issues hadn't
> been there. Once I get into the position to do the AVX512 work,
> I don't even want to think of how long its review may then take.

Can I add you as the responsible person?

> I don't think it is the right time to propose a (perhaps somewhat
> radical/controversial) solution to this, but once things have
> calmed down, I think I will have to do so. Otoh those recent
> events may mean that not much other development work can be
> completed anyway by mid-March.

Why don't you post your proposal now? Even if the discussion will
be a bit slower due to current activities, maybe the additional time
to think about your idea could help.

Thanks for the notes,


Juergen


Re: [Xen-devel] [PATCH v17 06/11] x86/hvm/ioreq: add a new mappable resource type...

2018-01-05 Thread Paul Durrant
> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xenproject.org] On Behalf
> Of Paul Durrant
> Sent: 03 January 2018 16:48
> To: 'Jan Beulich' 
> Cc: StefanoStabellini ; Wei Liu
> ; Andrew Cooper ; Tim
> (Xen.org) ; George Dunlap ;
> JulienGrall ; Ian Jackson ;
> xen-devel@lists.xenproject.org
> Subject: Re: [Xen-devel] [PATCH v17 06/11] x86/hvm/ioreq: add a new
> mappable resource type...
> 
> > -Original Message-
> > From: Xen-devel [mailto:xen-devel-boun...@lists.xenproject.org] On
> Behalf
> > Of Jan Beulich
> > Sent: 03 January 2018 16:41
> > To: Paul Durrant 
> > Cc: StefanoStabellini ; Wei Liu
> > ; Andrew Cooper ;
> Tim
> > (Xen.org) ; George Dunlap ;
> > JulienGrall ; xen-devel@lists.xenproject.org; Ian
> > Jackson 
> > Subject: Re: [Xen-devel] [PATCH v17 06/11] x86/hvm/ioreq: add a new
> > mappable resource type...
> >
> > >>> On 03.01.18 at 17:06,  wrote:
> > >>  -Original Message-
> > >> From: Jan Beulich [mailto:jbeul...@suse.com]
> > >> Sent: 03 January 2018 15:48
> > >> To: Paul Durrant 
> > >> Cc: JulienGrall ; Andrew Cooper
> > >> ; Wei Liu ; George
> > >> Dunlap ; Ian Jackson
> > ;
> > >> Stefano Stabellini ; xen-
> > de...@lists.xenproject.org;
> > >> Konrad Rzeszutek Wilk ; Tim (Xen.org)
> > >> 
> > >> Subject: Re: [PATCH v17 06/11] x86/hvm/ioreq: add a new mappable
> > >> resource type...
> > >>
> > >> >>> On 03.01.18 at 13:19,  wrote:
> > >> > +static void hvm_free_ioreq_mfn(struct hvm_ioreq_server *s, bool
> > buf)
> > >> > +{
> > >> > +struct domain *d = s->domain;
> > >> > +struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
> > >> > +
> > >> > +if ( !iorp->page )
> > >> > +return;
> > >> > +
> > >> > +page_list_add_tail(iorp->page, &d->arch.hvm_domain.ioreq_server.pages);
> > >>
> > >> Afaict s->domain is the guest, not the domain containing the
> > >> emulator. Hence this new model of freeing the pages is safe only
> > >> when the emulator domain is dead by the time the guest is being
> > >> cleaned up.
> > >
> > > From the investigations done w.r.t. the grant table pages I don't think this
> > > is the case. The emulating domain will have references on the pages and this
> > > keeps the target domain in existence, only completing domain destruction when
> > > the references are finally dropped. I've tested this by leaving an emulator
> > > running whilst I 'xl destroy' the domain; the domain remains as a zombie
> > > until the emulator terminates.
> >
> > Oh, right, I forgot about that aspect.
> >
> > >> What is additionally confusing me is the page ownership: Wasn't
> > >> the (original) intention to make the pages owned by the emulator
> > >> domain rather than the guest? I seem to recall you referring to
> > >> restrictions in do_mmu_update(), but a domain should always be
> > >> able to map pages it owns, shouldn't it?
> > >
> > > I'm sure we had this discussion before. I am trying to make resource mapping
> > > as uniform as possible so, like the grant table pages, the ioreq server pages
> > > are assigned to the target domain. Otherwise the domain trying to map
> > > resources has to know which actual domain they are assigned to, rather than
> > > the domain they relate to... which is pretty ugly.
> >
> > Didn't I suggest a slight change to the interface to actually make
> > this not as ugly?
> 
> Yes, you did but I didn't really want to go that way unless I absolutely had to.
> If you'd really prefer things that way then I'll re-work the hypercall to allow
> the domain owning the resource pages to be passed back. Maybe it will
> ultimately end up neater.
> 
> >
> > >> Furthermore you continue to use Xen heap pages rather than
> > >> domain heap ones.
> > >
> > > Yes, this seems reasonable since Xen will always need mappings of the pages
> > > and the aforementioned reference counting only works for Xen heap pages, AIUI.
> >
> > share_xen_page_with_guest() makes any page a Xen heap one.
> 
> Oh, that's somewhat counter-intuitive.
> 
> > See vmx_alloc_vlapic_mapping() for an example.
> >
> 
> Ok, thanks. If change back to having the pages owned by the tools domain
> then I guess this will all be avoided anyway.

I've run into a problem with this, but it may be easily soluble...

If I pass back the domid of the resource page owner and that owner is the tools 
domain, then when the tools domain attempts the mmu_update hypercall it fails 
because it has passed its own domid to mmu_update. The failure is caused by a 
check in get_pg_owner() which errors out if the passed-in domid == 
curr->domain_id but, strangely, not if domid == DOMID_SELF. Any idea why this 
check is there? To me it looks like it should be safe to specify 
curr->domain_id and have get_pg_owner() simply behave as if DOMID_SELF was 
passed.

The alternative would be to have the acquire_resource hypercall do the check 
and pass back DOMID_SELF if the ioreq server dm domain happens to match 
currd->domain_id, but that seems a bit icky.
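
For reference, the behaviour being described is roughly the following (a
paraphrase of the get_pg_owner() logic, not verbatim Xen code):

    /* Sketch: an explicit self domid is rejected while DOMID_SELF is
     * accepted -- the asymmetry being asked about above. */
    static struct domain *get_pg_owner_sketch(domid_t domid)
    {
        struct domain *curr = current->domain;

        if ( domid == DOMID_SELF )
            return rcu_lock_current_domain();      /* accepted */

        if ( unlikely(domid == curr->domain_id) )  /* rejected */
            return NULL;

        return rcu_lock_domain_by_id(domid);       /* foreign lookup */
    }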

Re: [Xen-devel] Xen 4.11 Development Update

2018-01-05 Thread Jan Beulich
>>> On 05.01.18 at 10:57,  wrote:
> On 05/01/18 10:32, Jan Beulich wrote:
>> Another x86 item is the emulator additions to support post-AVX
>> insns and some other, earlier ones we don't have support for
>> yet. The main parts of that series have now been pending review
>> for over half a year, I think. I do realize that the recently
>> published news has had a meaningful impact on the bandwidth
>> available for review here, but to be honest I'm not very positive
>> that the situation would be much different if those issues hadn't
>> been there. Once I get into the position to do the AVX512 work,
>> I don't even want to think of how long its review may then take.
> 
> Can I add you for being responsible?

Of course.

Jan



Re: [Xen-devel] Xen 4.11 Development Update

2018-01-05 Thread Paul Durrant
> -Original Message-
> From: Juergen Gross [mailto:jgr...@suse.com]
> Sent: 05 January 2018 09:17
> To: xen-devel@lists.xenproject.org
> Cc: jgr...@suse.com
> Subject: Xen 4.11 Development Update
> 
> This email only tracks big items for the xen.git tree. Please reply with items
> you would like to see in 4.11 so that people have an idea of what is going on
> and can prioritise accordingly.
> 
> You're welcome to provide description and use cases of the feature you're
> working on.
> 
> = Timeline =
> 
> We now adopt a fixed cut-off date scheme. We will release twice a
> year. The upcoming 4.11 timeline is as follows:
> 
> * Last posting date: March 16th, 2018
> * Hard code freeze: March 30th, 2018
> * RC1: TBD
> * Release: June 1st, 2018
> 
> Note that we no longer have a freeze exception scheme. All patches
> that are to go into 4.11 must be posted no later than the last posting
> date. Patches posted after that date will automatically be queued
> for the next release.
> 
> RCs will be arranged immediately after freeze.
> 
> We recently introduced a Jira instance to track all the tasks (not only the
> big ones) for the project. See: https://xenproject.atlassian.net/projects/XEN/issues.
> 
> Most of the tasks tracked by this e-mail also have a corresponding Jira task,
> referred to as XEN-N.
> 
> I have started to include the version number of the series associated with each
> feature. Can each owner send an update on the version number if the series
> was posted upstream?
> 
> = Projects =
> 
> == Hypervisor ==
> 
> *  Per-cpu tasklet
>   -  XEN-28
>   -  Konrad Rzeszutek Wilk
> 
> === x86 ===
> 
> *  Enable Memory Bandwidth Allocation in Xen (v10)
>   -  XEN-48
>   -  Yi Sun
> 
> *  guest resource mapping (v17)
>   -  Paul Durrant
> 

Could you also add PV-IOMMU here? I do have some preliminary patches and have 
successfully tested a dom0 with a 1:1 GFN:BFN mapping set up using the new 
hypercalls, so I expect to post something in time for 4.11.

Thanks,

  Paul

> *  vNVDIMM support for HVM guest (RFC v4)
>   -  XEN-45
>   -  Haozhong Zhang
> 
> *  SMMUv3 driver (RFC v4)
>   -  Sameer Goel
> 
> == Grub2 ==
> 
> *  Support PVH guest boot (v1)
>   -  Juergen Gross
> 
> 


Re: [Xen-devel] [PATCH v17 06/11] x86/hvm/ioreq: add a new mappable resource type...

2018-01-05 Thread Jan Beulich
>>> On 05.01.18 at 11:16,  wrote:
>> From: Xen-devel [mailto:xen-devel-boun...@lists.xenproject.org] On Behalf
>> Of Paul Durrant
>> Sent: 03 January 2018 16:48
>> Ok, thanks. If change back to having the pages owned by the tools domain
>> then I guess this will all be avoided anyway.
> 
> I've run into a problem with this, but it may be easily soluble...
> 
> If I pass back the domid of the resource page owner and that owner is the 
> tools domain, then when the tools domain attempts the mmu_update hypercall it 
> fails because it has passed its own domid to mmu_update. The failure is 
> caused by a check in get_pg_owner() which errors out if the passed-in domid 
> == curr->domain_id but, strangely, not if domid == DOMID_SELF. Any idea why 
> this check is there? To me it looks like it should be safe to specify 
> curr->domain_id and have get_pg_owner() simply behave as if DOMID_SELF was 
> passed.

A little while ago there was some discussion on this general topic (sadly
I don't recall the context), and iirc it was in particular Andrew (or
George?) who thought it should be the other way around: If
DOMID_SELF can be used, the actual domain ID should not be
accepted (which iirc is currently the case in some places, but not
in others). But ...

> The alternative would be to have the acquire_resource hypercall do the check 
> and pass back DOMID_SELF if the ioreq server dm domain happens to match 
> currd->domain_id, but that seems a bit icky.

... this wasn't the plan anyway. Instead we had talked of the
hypercall returning just a boolean indicator, to distinguish
self-owned pages from target-domain-owned ones. The
caller is supposed to know the domain ID of the target domain,
after all.

Jan



Re: [Xen-devel] [PATCH v17 06/11] x86/hvm/ioreq: add a new mappable resource type...

2018-01-05 Thread Paul Durrant
> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 05 January 2018 10:29
> To: Paul Durrant 
> Cc: JulienGrall ; Andrew Cooper
> ; George Dunlap
> ; Ian Jackson ; Wei Liu
> ; StefanoStabellini ; xen-
> de...@lists.xenproject.org; Tim (Xen.org) 
> Subject: RE: [Xen-devel] [PATCH v17 06/11] x86/hvm/ioreq: add a new
> mappable resource type...
> 
> >>> On 05.01.18 at 11:16,  wrote:
> >> From: Xen-devel [mailto:xen-devel-boun...@lists.xenproject.org] On
> Behalf
> >> Of Paul Durrant
> >> Sent: 03 January 2018 16:48
> >> Ok, thanks. If change back to having the pages owned by the tools domain
> >> then I guess this will all be avoided anyway.
> >
> > I've run into a problem with this, but it may be easily soluble...
> >
> > If I pass back the domid of the resource page owner and that owner is the
> > tools domain, then when the tools domain attempts the mmu_update
> hypercall it
> > fails because it has passed its own domid to mmu_update. The failure is
> > caused by a check in get_pg_owner() which errors out if the passed-in
> domid
> > == curr->domain_id but, strangely, not if domid == DOMID_SELF. Any idea
> why
> > this check is there? To me it looks like it should be safe to specify
> > curr->domain_id and have get_pg_owner() simply behave as if
> DOMID_SELF was
> > passed.
> 
> A little while there was some discussion on this general topic (sadly
> I don't recall the context), and iirc it was in particular Andrew (or
> George?) who thought it should be the other way around: If
> DOMID_SELF can be used, the actual domain ID should not be
> accepted (which iirc is currently the case in some places, but not
> in others). But ...
> 
> > The alternative would be to have the acquire_resource hypercall do the check
> > and pass back DOMID_SELF if the ioreq server dm domain happens to match
> > currd->domain_id, but that seems a bit icky.
> 
> ... this wasn't the plan anyway. Instead we had talked of the
> hypercall returning just a boolean indicator, to distinguish
> self-owned pages from target-domain-owned ones. The
> caller is supposed to know the domain ID of the target domain,
> after all.

Ah, ok. If it's only necessary to distinguish between self-owned and 
target-owned then that will be fine. My current test series just makes the 
domid an IN/OUT parameter and re-writes it if necessary. I'll switch to using a 
flag to avoid the issue as you suggest.

  Paul

> 
> Jan



Re: [Xen-devel] [PATCH RFC v1 10/74] x86/time: Print a more helpful error when a platform timer can't be found

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 14:05,  wrote:
> From: Andrew Cooper 
> 
> Signed-off-by: Andrew Cooper 
> Reviewed-by: Wei Liu 

Acked-by: Jan Beulich 




Re: [Xen-devel] [PATCH RFC v1 11/74] x86/link: Introduce and use SECTION_ALIGN

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 14:05,  wrote:
> From: Andrew Cooper 
> 
> ... to reduce the quantity of #ifdef EFI.
> 
> Signed-off-by: Andrew Cooper 
> Reviewed-by: Wei Liu 

Acked-by: Jan Beulich 




Re: [Xen-devel] [PATCH RFC v1 12/74] xen/acpi: mark the PM timer FADT field as optional

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 14:05,  wrote:
> From: Roger Pau Monne 
> 
> According to the ACPI 6.1 specification this field is optional, so
> mark it as such.
> 
> Signed-off-by: Roger Pau Monné 

This would probably better be a direct port of Linux commit
1d82980c99 (obviously just the tbfadt.c parts of it); perhaps
the other comment in acpi_tb_validate_fadt() would also be
worth updating.

Jan


Re: [Xen-devel] [PATCH RFC v1 13/74] xen/domctl: Return arch_config via getdomaininfo

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 14:05,  wrote:
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -116,6 +116,7 @@ struct xen_domctl_getdomaininfo {
>  uint32_t ssidref;
>  xen_domain_handle_t handle;
>  uint32_t cpupool;
> +struct xen_arch_domainconfig arch_config;
>  };

Such an addition requires the interface version to be bumped.
As I assume we will want to backport this to 4.10, we should
make sure this (and perhaps others in this series, but none
outside) is the only domctl interface change for this version,
i.e. for any others until 4.11 goes out we'd need to remember
to bump it a second time then.

With that
Reviewed-by: Jan Beulich 

Jan



Re: [Xen-devel] [PATCH RFC v1 16/74] x86/fixmap: Modify fix_to_virt() to return a void pointer

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 14:05,  wrote:
> --- a/xen/drivers/acpi/apei/apei-io.c
> +++ b/xen/drivers/acpi/apei/apei-io.c
> @@ -92,7 +92,7 @@ static void __iomem *__init apei_range_map(paddr_t paddr, unsigned long size)
>   apei_range_nr++;
>   }
>  
> - return (void __iomem *)fix_to_virt(FIX_APEI_RANGE_BASE + start_nr);
> + return fix_to_virt(FIX_APEI_RANGE_BASE + start_nr);
>  }

Granted we probably don't use "__iomem" consistently, and we may
hence well want to consider dropping it altogether. But without that
being called out in the description, I don't think it should be dropped
here and further down.

Another option would be to introduce something like fix_to_io_virt(),
with that annotation included in the cast.
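
Presumably something along these lines (a sketch of the suggestion only;
fix_to_io_virt() is a hypothetical name taken from the paragraph above):

    /* Keep the __iomem annotation inside the helper so call sites such
     * as apei_range_map() need no cast:
     *     return fix_to_io_virt(FIX_APEI_RANGE_BASE + start_nr);
     */
    #define fix_to_io_virt(x) ((void __iomem *)fix_to_virt(x))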

> --- a/xen/include/asm-x86/apicdef.h
> +++ b/xen/include/asm-x86/apicdef.h
> @@ -119,7 +119,7 @@
>  /* Only available in x2APIC mode */
>  #define  APIC_SELF_IPI   0x3F0
>  
> -#define APIC_BASE (fix_to_virt(FIX_APIC_BASE))
> +#define APIC_BASE (__fix_to_virt(FIX_APIC_BASE))

Please take the opportunity to get rid of the unnecessary
parentheses.

Jan



Re: [Xen-devel] [PATCH RFC v1 17/74] ---- x86/Kconfig: Options for Xen and PVH support

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 14:05,  wrote:

Please drop the stray "----" from the subject.

> From: Andrew Cooper 
> 
> Signed-off-by: Andrew Cooper 

No description (rationale) at all? But perhaps that's to be attributed
to the RFC nature of the series.

> --- a/xen/arch/x86/Kconfig
> +++ b/xen/arch/x86/Kconfig
> @@ -117,6 +117,23 @@ config TBOOT
> Technology (TXT)
>  
> If unsure, say Y.
> +
> +config XEN_GUEST
> + def_bool n
> + prompt "Xen Guest"
> + ---help---
> +   Support for Xen detecting when it is running under Xen.
> +
> +   If unsure, say N.
> +
> +config PVH_GUEST
> + def_bool n
> + prompt "PVH Guest"
> + depends on XEN_GUEST
> + ---help---
> +   Support booting using the PVH ABI.
> +
> +   If unsure, say N.

The names of the options are ambiguous, yet I can't really think of
nice alternatives. Maybe XEN_AS_GUEST and XEN_AS_PVH_GUEST
or GUEST_OF_XEN and PVH_GUEST_OF_XEN? Same goes for the
prompts and PVH_GUEST's help text.

Jan



Re: [Xen-devel] [PATCH RFC v1 18/74] x86/link: Relocate program headers

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 14:05,  wrote:
> From: Andrew Cooper 
> 
> When the xen binary is loaded by libelf (in the future) we rely on the
> elf loader to load the binary accordingly.

It would really help if it was said here what effect this has on the
program headers - I can only guess that it'll make p_vaddr different
from p_paddr. I'm also rather uncertain about the entry point
change wrt various (and especially older) boot loaders.

Jan



[Xen-devel] [distros-debian-jessie test] 73932: trouble: blocked/broken

2018-01-05 Thread Platform Team regression test user
flight 73932 distros-debian-jessie real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/73932/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf-pvops                    broken
 build-i386                           broken
 build-amd64-pvops                    broken
 build-armhf                          broken
 build-amd64                          broken
 build-i386-pvops                     broken
 build-armhf-pvops     5 capture-logs   broken REGR. vs. 73569
 build-armhf           5 capture-logs   broken REGR. vs. 73569
 build-armhf-pvops     3 syslog-server  running
 build-armhf           3 syslog-server  running

Tests which did not succeed, but are not blocking:
 test-amd64-i386-i386-jessie-netboot-pvgrub  1 build-check(1)   blocked n/a
 test-amd64-i386-amd64-jessie-netboot-pygrub  1 build-check(1)  blocked n/a
 test-amd64-amd64-i386-jessie-netboot-pygrub  1 build-check(1)  blocked n/a
 test-armhf-armhf-armhf-jessie-netboot-pygrub  1 build-check(1) blocked n/a
 test-amd64-amd64-amd64-jessie-netboot-pvgrub  1 build-check(1) blocked n/a
 build-armhf-pvops 4 host-install(4)  broken like 73569
 build-armhf   4 host-install(4)  broken like 73569
 build-i386-pvops  4 host-install(4)  broken like 73569
 build-i3864 host-install(4)  broken like 73569
 build-amd64   4 host-install(4)  broken like 73569
 build-amd64-pvops 4 host-install(4)  broken like 73569

baseline version:
 flight   73569

jobs:
 build-amd64  broken  
 build-armhf  broken  
 build-i386   broken  
 build-amd64-pvopsbroken  
 build-armhf-pvopsbroken  
 build-i386-pvops broken  
 test-amd64-amd64-amd64-jessie-netboot-pvgrub blocked 
 test-amd64-i386-i386-jessie-netboot-pvgrub   blocked 
 test-amd64-i386-amd64-jessie-netboot-pygrub  blocked 
 test-armhf-armhf-armhf-jessie-netboot-pygrub blocked 
 test-amd64-amd64-i386-jessie-netboot-pygrub  blocked 



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Push not applicable.



Re: [Xen-devel] [PATCH RFC v1 19/74] x86: introduce ELFNOTE macro

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 14:05,  wrote:
> It is needed later for introducing PVH entry point.

Perhaps worth moving the addition there, rather than introducing
dead code here?

> --- a/xen/include/asm-x86/asm_defns.h
> +++ b/xen/include/asm-x86/asm_defns.h
> @@ -409,4 +409,16 @@ static always_inline void stac(void)
>  #define REX64_PREFIX "rex64/"
>  #endif
>  
> +#define ELFNOTE(name, type, desc)   \
> +.pushsection .note.name   ; \

Please also specify section attributes and type.

> +.align 4  ; \

I think we should try to avoid the ambiguous .align, and instead
use .balign or .p2align in new code.

> +.long 2f - 1f   /* namesz */  ; \
> +.long 4f - 3f   /* descsz */  ; \
> +.long type  /* type   */  ; \
> +1:.asciz #name  /* name   */  ; \
> +2:.align 4; \
> +3:desc  /* desc   */  ; \
> +4:.align 4; \

I'd prefer if you used .L-prefixed labels in new macros, to eliminate
the risk of references around the macro use sites becoming broken.
And if you really meant to stick with numeric labels, please add two
padding blanks after each of them, to align the directives.

Considering this is meant to be used by assembly code only, perhaps
it would be better to make this an assembler macro rather than a C
one (eliminating the need for all the "; \")?

Jan
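
Folding those suggestions together, the macro might end up looking
something like this (a sketch only -- with section attributes on
.pushsection, .p2align instead of .align, aligned directives, and a
balancing .popsection; whether numeric or .L-prefixed labels survive
is left open above):

    #define ELFNOTE(name, type, desc)                     \
        .pushsection .note.name, "a", @note             ; \
        .p2align 2                                      ; \
        .long 2f - 1f        /* namesz */               ; \
        .long 4f - 3f        /* descsz */               ; \
        .long type           /* type   */               ; \
    1:  .asciz #name         /* name   */               ; \
    2:  .p2align 2                                      ; \
    3:  desc                 /* desc   */               ; \
    4:  .p2align 2                                      ; \
        .popsection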



[Xen-devel] Xen Project Spectre/Meltdown FAQ

2018-01-05 Thread Lars Kurth
Hi all, this is a repost of 
https://blog.xenproject.org/2018/01/04/xen-project-spectremeltdown-faq/ for 
xen-users/xen-devel. If you have questions, please reply to this thread and we 
will try and improve the FAQ based on questions.
Regards
Lars


Google’s Project Zero announced several information leak vulnerabilities 
affecting all modern superscalar processors. Details can be found on their 
blog, and in the Xen Project Advisory 254 [1]. To help our users understand the 
impact and our next steps forward, we put together the following FAQ.

Note that we will update the FAQ as new information surfaces.

= Is Xen impacted by Meltdown and Spectre? =

There are two angles to consider for this question:

* Can an untrusted guest attack the hypervisor using Meltdown or Spectre?
* Can a guest user-space program attack a guest kernel using Meltdown or 
Spectre?

Systems running Xen, like all operating systems and hypervisors, are 
potentially affected by Spectre (referred to as SP1 and SP2 in Advisory 254 
[1]). For Arm Processors information, you can find which processors are 
impacted here [2].  In general, both the hypervisor and a guest kernel are 
vulnerable to attack via SP1 and SP2.

Only Intel processors are impacted by Meltdown (referred to as SP3 in Advisory 
254 [1]). On Intel processors, only 64-bit PV mode guests can attack Xen. 
Guests running in 32-bit PV mode, HVM mode, and PVH mode cannot attack the 
hypervisor using SP3. However, in 32-bit PV mode, HVM mode, and PVH mode, guest 
userspaces can attack guest kernels using SP3; so updating guest kernels is 
advisable.

Interestingly, guest kernels running in 64-bit PV mode are not vulnerable to 
attack using SP3, because 64-bit PV guests already run in a KPTI-like mode.

= Is there any risk of privilege escalation? =

Meltdown and Spectre are, by themselves, only information leaks. There is no 
suggestion that speculative execution can be used to modify memory or cause a 
system to do anything it might not have done already.

= Where can I find more information? =

We will update this blog post and Advisory 254 [1] as new information becomes 
available. Updates will also be published on xen-announce@.

We will also maintain a technical FAQ on our wiki [3] for answers to more 
detailed technical questions that emerge on xen-devel@ and other communication 
channels.

= Are there any patches for the vulnerability? =

We have prototype patches for a mitigation for Meltdown on Intel CPUs and a 
mitigation for SP2/CVE-2017-5715, which are functional but have not undergone 
rigorous review and have not been backported to all supported Xen Project 
releases.

As information related to Meltdown and Spectre is now public, development will 
continue in public on xen-devel@ and patches will be posted and attached to 
Advisory 254 [1] as they become available in the next few days.

= Can SP1/SP2 be fixed at all? What plans are there to mitigate them? =

SP2 can be mitigated in two ways, both of which essentially prevent speculative 
execution of indirect branches. The first is to flush the branch prediction 
logic on entry into the hypervisor. This requires microcode updates, which 
Intel and AMD are in the process of preparing, as well as patches to the 
hypervisor which are also in process and should be available soon.

The second is to do indirect jumps in a way which is not subject to speculative 
execution. This requires the hypervisor to be recompiled with a compiler that 
contains special new features. These new compiler features are also in the 
process of being prepared for both gcc and clang, and should be available soon.
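
The construct these compiler features emit is commonly known as a "retpoline".
A hand-written sketch of the idea for x86-64 (GNU assembler syntax via a
file-scope C asm statement) is shown below; it is illustrative only, and the
thunk name is made up:

    /*
     * Illustrative retpoline-style thunk, not Xen's actual code.  Callers
     * load the branch target into %rax and "jmp demo_indirect_thunk_rax"
     * instead of executing "jmp *%rax" directly.  Speculation of the ret
     * is steered into the pause/lfence trap, while the architectural path
     * rewrites the return address and jumps to the real target.
     */
    asm (
        ".text\n"
        "demo_indirect_thunk_rax:\n\t"
        "call 1f\n"            /* push the address of the trap below    */
        "2:\n\t"
        "pause\n\t"            /* speculative execution spins here      */
        "lfence\n\t"
        "jmp 2b\n"
        "1:\n\t"
        "mov %rax, (%rsp)\n\t" /* overwrite return address with target  */
        "ret\n"                /* architecturally jumps to %rax         */
    );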

SP1 is much more difficult to mitigate. We have some ideas we’re exploring, but 
they’re still at the design stage at this point.
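
For background, the reason SP1 is so hard to mitigate is that the vulnerable
code shape is an ordinary bounds-checked array access. A made-up sketch (none
of these names come from Xen):

    /*
     * Illustrative SP1 (bounds check bypass) gadget.  If "idx" is
     * attacker-controlled and the branch mispredicts, the dependent load
     * touches a cache line selected by secret data, which can then be
     * recovered through a cache timing side channel.
     */
    if ( idx < array1_size )
        value = array2[array1[idx] * 4096];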

= Does Xen have any equivalent to Linux’s KPTI series? =

Linux’s KPTI series is designed to address SP3 only.  For Xen guests, only 
64-bit PV guests are affected by SP3. A KPTI-like approach was explored 
initially, but required significant ABI changes.  Instead we’ve decided to go 
with an alternate approach, which is less disruptive and less complex to 
implement. The chosen approach runs PV guests in a PVH container, which ensures 
that PV guests continue to behave as before, while providing the isolation that 
protects the hypervisor from SP3. This works well for Xen 4.8 to Xen 4.10, 
which is currently our priority.

For Xen 4.6 and 4.7, we are evaluating several options, but we have not yet 
finalized the best solution.

= Device model stub domains run in PV mode, so is it still safer to run 
device models in a stub domain than in domain 0? =

The short answer is, yes, it is still safer to run stub domains than otherwise.

If an attacker can gain control of the device model running in a stub domain, 
it can indeed attempt to use these processor vulnerabilities to read 
information from Xen.

However, if an attacker can gain control of a device model running in domain 0 
without deprivilegi

Re: [Xen-devel] [PATCH RFC v1 20/74] x86: produce a binary that can be booted as PVH

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 14:05,  wrote:
> Signed-off-by: Wei Liu 
> Signed-off-by: Andrew Cooper 

Again I assume a description is still intended to be written.

> --- a/xen/arch/x86/Makefile
> +++ b/xen/arch/x86/Makefile
> @@ -75,6 +75,8 @@ efi-y := $(shell if [ ! -r 
> $(BASEDIR)/include/xen/compile.h -o \
>-O $(BASEDIR)/include/xen/compile.h ]; then \
>   echo '$(TARGET).efi'; fi)
>  
> +shim-$(CONFIG_PVH_GUEST) := $(TARGET)-shim
> +
>  ifneq ($(build_id_linker),)
>  notes_phdrs = --notes
>  else
> @@ -93,7 +95,7 @@ endif
>  syms-warn-dup-y := --warn-dup
>  syms-warn-dup-$(CONFIG_SUPPRESS_DUPLICATE_SYMBOL_WARNINGS) :=
>  
> -$(TARGET): $(TARGET)-syms $(efi-y) boot/mkelf32
> +$(TARGET): $(TARGET)-syms $(efi-y) boot/mkelf32 $(shim-y)
>   ./boot/mkelf32 $(notes_phdrs) $(TARGET)-syms $(TARGET) 
> $(XEN_IMG_OFFSET) \
>  `$(NM) $(TARGET)-syms | sed -ne 's/^\([^ ]*\) . 
> __2M_rwdata_end$$/0x\1/p'`

Hmm, so you mean to build shim and "normal" Xen at the same time,
with all the same objects? That's rather unexpected following the
earlier exchange Andrew and I had. I would expect the shim to not
require quite a few bits and pieces, and hence wanting to be built
independently.

> @@ -144,6 +146,11 @@ $(TARGET)-syms: prelink.o xen.lds 
> $(BASEDIR)/common/symbols-dummy.o
>   >$(@D)/$(@F).map
>   rm -f $(@D)/.$(@F).[0-9]*
>  
> +# Use elf32-x86-64 if toolchain support exists, elf32-i386 otherwise.
> +$(TARGET)-shim: FORMAT = $(firstword $(filter elf32-x86-64,$(shell 
> $(OBJCOPY) --help)) elf32-i386)

What are the implications of using one vs the other? If elf32-i386
works, why not use it all the time?

> @@ -374,6 +375,15 @@ cs32_switch:
>  /* Jump to earlier loaded address. */
>  jmp *%edi
>  
> +
> +#ifdef CONFIG_PVH_GUEST

No double blank lines please.

> +ELFNOTE(Xen, XEN_ELFNOTE_PHYS32_ENTRY, .long sym_offs(__pvh_start))
> +
> +__pvh_start:
> +ud2a
> +
> +#endif /* CONFIG_PVH_GUEST */
> +
>  __start:

Does the new code strictly need to live here? Can't it be kept both
out of the binary sequence currently resulting here and
out of this source file altogether (by introducing a new pvh.S or
shim.S)?

> --- a/xen/arch/x86/xen.lds.S
> +++ b/xen/arch/x86/xen.lds.S
> @@ -34,7 +34,7 @@ OUTPUT_ARCH(i386:x86-64)
>  PHDRS
>  {
>text PT_LOAD ;
> -#if defined(BUILD_ID) && !defined(EFI)
> +#if (defined(BUILD_ID) && !defined(EFI)) || defined (CONFIG_PVH_GUEST)

Did you mean

#if (defined(BUILD_ID) || defined(CONFIG_PVH_GUEST)) && !defined(EFI)

? Of course this would be moot if main and shim binary were to
be built independently.

Also - stray blank.

> @@ -128,6 +128,12 @@ SECTIONS
> __param_end = .;
>} :text
>  
> +#if defined(CONFIG_PVH_GUEST) && !defined(EFI)

The EFI part here then also wouldn't be necessary, afaict.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [xen-unstable test] 117617: regressions - trouble: broken/fail/pass

2018-01-05 Thread osstest service owner
flight 117617 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/117617/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm   broken
 test-amd64-amd64-xl-qcow2broken
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsmbroken
 test-amd64-amd64-xl-qemuu-debianhvm-amd64   broken
 test-amd64-i386-xl-raw   broken
 test-amd64-amd64-xl-qcow2 4 host-install(4)broken REGR. vs. 117311
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 4 host-install(4) broken REGR. 
vs. 117311
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 4 host-install(4) broken REGR. 
vs. 117311
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 4 host-install(4) broken REGR. vs. 
117311
 test-amd64-i386-xl-raw4 host-install(4)broken REGR. vs. 117311
 test-amd64-amd64-qemuu-nested-amd 12 host-ping-check-native/l1 fail REGR. vs. 
117311
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 
117311

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 117311
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 117311
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 117311
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 117311
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 117311
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 117311
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 117311
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 117311
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 117311
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 wi

Re: [Xen-devel] Xen Project Spectre/Meltdown FAQ

2018-01-05 Thread Juergen Gross
On 05/01/18 12:35, Lars Kurth wrote:
> Hi all, this is a repost of 
> https://blog.xenproject.org/2018/01/04/xen-project-spectremeltdown-faq/ for 
> xen-users/xen-devel. If you have questions, please reply to this thread and 
> we will try and improve the FAQ based on questions.
> Regards
> Lars
> 
> 
> Google’s Project Zero announced several information leak vulnerabilities 
> affecting all modern superscalar processors. Details can be found on their 
> blog, and in the Xen Project Advisory 254 [1]. To help our users understand 
> the impact and our next steps forward, we put together the following FAQ.
> 
> Note that we will update the FAQ as new information surfaces.
> 
> = Is Xen impacted by Meltdown and Spectre? =
> 
> There are two angles to consider for this question:
> 
> * Can an untrusted guest attack the hypervisor using Meltdown or Spectre?
> * Can a guest user-space program attack a guest kernel using Meltdown or 
> Spectre?
> 
> Systems running Xen, like all operating systems and hypervisors, are 
> potentially affected by Spectre (referred to as SP1 and SP2 in Advisory 254 
> [1]). For Arm processors, you can find information on which processors are 
> impacted here [2].  In general, both the hypervisor and a guest kernel are 
> vulnerable to attack via SP1 and SP2.
> 
> Only Intel processors are impacted by Meltdown (referred to as SP3 in 
> Advisory 254 [1]). On Intel processors, only 64-bit PV mode guests can attack 
> Xen. Guests running in 32-bit PV mode, HVM mode, and PVH mode cannot attack 
> the hypervisor using SP3. However, in 32-bit PV mode, HVM mode, and PVH mode, 
> guest userspaces can attack guest kernels using SP3; so updating guest 
> kernels is advisable.
> 
> Interestingly, guest kernels running in 64-bit PV mode are not vulnerable to 
> attack using SP3, because 64-bit PV guests already run in a KPTI-like mode.

And this is wrong. Guest kernels running in 64-bit PV mode can't be
attacked directly by their users, but they can be attacked indirectly, via
a user program reading the host's memory, of which the guest's kernel
memory is a part.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] Xen Project Spectre/Meltdown FAQ

2018-01-05 Thread George Dunlap
On 01/05/2018 11:35 AM, Lars Kurth wrote:
> Hi all, this is a repost of 
> https://blog.xenproject.org/2018/01/04/xen-project-spectremeltdown-faq/ for 
> xen-users/xen-devel. If you have questions, please reply to this thread and 
> we will try and improve the FAQ based on questions.

I also started a "Practical response" FAQ here:

https://wiki.xenproject.org/wiki/Respond_to_Meltdown_and_Spectre

Please give feedback and add practical information as needed.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC v1 21/74] x86/entry: Early PVH boot code

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 14:05,  wrote:
> --- a/xen/arch/x86/boot/head.S
> +++ b/xen/arch/x86/boot/head.S
> @@ -380,7 +380,39 @@ cs32_switch:
>  ELFNOTE(Xen, XEN_ELFNOTE_PHYS32_ENTRY, .long sym_offs(__pvh_start))
>  
>  __pvh_start:
> -ud2a
> +cld
> +cli
> +
> +/*
> + * We need one push/pop to determine load address.  Use the same
> + * absolute address as the native path, for lack of a better

... stack address ...

> @@ -544,12 +576,18 @@ trampoline_setup:
>  /* Get bottom-most low-memory stack address. */
>  add $TRAMPOLINE_SPACE,%ecx
>  
> +#ifdef CONFIG_PVH_GUEST
> +cmpb    $1, sym_fs(pvh_boot)
> +je  1f

I'd much prefer

cmpb    $0, sym_fs(pvh_boot)
jne 1f

in cases like this one.

But then I sort of dislike the addition of such random in-memory
flags. Considering ...

> +#endif
> +
>  /* Save the Multiboot info struct (after relocation) for later use. 
> */
>  push%ecx/* Bottom-most low-memory stack address. 
> */
>  push%ebx/* Multiboot information address. */
>  push%eax/* Multiboot magic. */

... the values used here, couldn't the flag be replaced by setting
one or both of %eax and %ebx to zero before jumping to
trampoline_setup? Or wait, further down I see that this flag is
also being used in C code. Perhaps fine then as is. Otoh, keying
this off of one of the register values would allow the #ifdef to
be dropped.

> --- /dev/null
> +++ b/xen/arch/x86/guest/pvh-boot.c
> @@ -0,0 +1,119 @@
> +/**
> + * arch/x86/guest/pvh-boot.c
> + *
> + * PVH boot time support
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; If not, see .
> + *
> + * Copyright (c) 2017 Citrix Systems Ltd.
> + */
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +
> +#include 
> +
> +/* Initialised in head.S, before .bss is zeroed. */
> +bool pvh_boot __initdata;
> +uint32_t pvh_start_info_pa __initdata;

Would you mind using the more common placement of __initdata,
like you do ...

> +static multiboot_info_t __initdata pvh_mbi;
> +static module_t __initdata pvh_mbi_mods[32];
> +static char *__initdata pvh_loader = "PVH Directboot";

... here?

For the last item

static const char __initconst pvh_loader[] = "PVH Directboot";

please. For mods[] - isn't 32 overly much?

> +static void __init convert_pvh_info(void)
> +{
> +struct hvm_start_info *pvh_info = __va(pvh_start_info_pa);
> +struct hvm_modlist_entry *entry;

const (twice)

> +module_t *mod;
> +unsigned int i;
> +
> +ASSERT(pvh_info->magic == XEN_HVM_START_MAGIC_VALUE);
> +
> +/*
> + * Turn hvm_start_info into mbi. Luckily all modules are placed under 4GB
> + * boundary on x86.

ISTR having that discussion relatively recently in another context:
All the header states is "NB: Xen on x86 will always try to place all
the data below the 4GiB boundary." Note the "try to". Hence I
think ...

> + */
> +pvh_mbi.flags = MBI_CMDLINE | MBI_MODULES | MBI_LOADERNAME;
> +
> +ASSERT(!(pvh_info->cmdline_paddr >> 32));

... this, if we don't want to handle the case, should be BUG_ON() or
panic() (same further down).
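
For instance, the suggested shape might be something like this sketch
(hypothetical wording, not the committed code):

    if ( pvh_info->cmdline_paddr >> 32 )
        panic("PVH: command line address above 4GiB");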

> +pvh_mbi.cmdline = pvh_info->cmdline_paddr;
> +pvh_mbi.boot_loader_name = __pa(pvh_loader);
> +
> +ASSERT(pvh_info->nr_modules < 32);

ARRAY_SIZE(pvh_mbi_mods) and perhaps again BUG_ON() or
panic().

> +pvh_mbi.mods_count = pvh_info->nr_modules;
> +pvh_mbi.mods_addr = __pa(pvh_mbi_mods);
> +
> +mod = pvh_mbi_mods;
> +entry = __va(pvh_info->modlist_paddr);

How come __va() already works at this point in time? And what about
this address being beyond 4Gb?

> +for ( i = 0; i < pvh_info->nr_modules; i++ )
> +{
> +ASSERT(!(entry[i].paddr >> 32));

To relax this condition (in particular to allow huge initrd), how
about ...

> +mod[i].mod_start = entry[i].paddr;
> +mod[i].mod_end   = entry[i].paddr + entry[i].size;

... using the EFI approach here and store the PFN in mod_start
and the size in mod_end?

> +mod[i].string= entry[i].cmdline_paddr;

No 4Gb check here?

> +void __init pvh_print_info(void)
> +{
> +struct hvm_start_info *pvh_info = __va(pvh_start_info_pa);
> +struct hvm

Re: [Xen-devel] [PATCH RFC v1 23/74] x86/entry: Probe for Xen early during boot

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 14:05,  wrote:
> --- /dev/null
> +++ b/xen/arch/x86/guest/xen.c
> @@ -0,0 +1,75 @@
> +/**
> + * arch/x86/guest/xen.c
> + *
> + * Support for detecting and running under Xen.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; If not, see .
> + *
> + * Copyright (c) 2017 Citrix Systems Ltd.
> + */
> +#include 
> +#include 
> +
> +#include 
> +#include 
> +
> +#include 
> +
> +bool xen_guest;

__read_mostly?

> +static uint32_t xen_cpuid_base;

Depending on future use, __initdata or __read_mostly?

> --- a/xen/include/asm-x86/guest.h
> +++ b/xen/include/asm-x86/guest.h
> @@ -20,6 +20,7 @@
>  #define __X86_GUEST_H__
>  
>  #include 
> +#include 
>  
>  #endif /* __X86_GUEST_H__ */

I'm increasingly curious to understand what this header's purpose
is meant to be. It looks as if you mean source files to only ever
include this one, but why? Rather than exposing everything at
once, we should try (unrelated to this series) to limit what each
CU gets to see, speeding up builds (not the least incremental ones
by reducing the dependency trees).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v1] x86/mm: Supresses vm_events caused by page-walks

2018-01-05 Thread Razvan Cojocaru
On 10/30/2017 07:38 PM, Tamas K Lengyel wrote:
> On Mon, Oct 30, 2017 at 11:19 AM, Razvan Cojocaru
>  wrote:
>> On 10/30/2017 07:07 PM, Tamas K Lengyel wrote:
>>> On Mon, Oct 30, 2017 at 11:01 AM, Razvan Cojocaru
>>>  wrote:
 On 10/30/2017 06:39 PM, Tamas K Lengyel wrote:
> On Mon, Oct 30, 2017 at 10:24 AM, Razvan Cojocaru
>  wrote:
>> On 30.10.2017 18:01, Tamas K Lengyel wrote:
>>> On Mon, Oct 30, 2017 at 4:32 AM, Alexandru Isaila
>>>  wrote:
 This patch is adding a way to enable/disable nested pagefault
 events. It introduces the xc_monitor_nested_pagefault function
 and adds the nested_pagefault_disabled in the monitor structure.
 This is needed by the introspection so it will only get gla
 faults and not get spammed with other faults.
 In p2m_set_ad_bits the v->arch.sse_pg_dirty.eip and
 v->arch.sse_pg_dirty.gla are used to mark that this is the
 second time a fault occurs and the dirty bit is set.
>>>
>>> Could you describe under what conditions do you get these other faults?
>>
>> Hey Tamas, the whole story is at page 8 of this document:
>>
>> https://www.researchgate.net/publication/281835515_Proposed_Processor_Extensions_for_Significant_Speedup_of_Hypervisor_Memory_Introspection
>
> Hi Razvan,
> thanks but I'm not sure that doc addresses my question. You
> effectively filter out npfec_kind_in_gpt and npfec_kind_unknown in
> this patch. The first, npfec_kind_in_gpt should only happen if you
> have restricted access to the gpt with ept and the processor couldn't
> walk the table. But if you don't want to get events of these types
> then why not simply not restrict access the gpt to begin with? And as
> for npfec_kind_unknown, I don't think that gets generated under any
> situation. Hence my question: what is your setup that makes this
> patch necessary?

 On the npfec_kind_unknown case, indeed, we were wondering when that
 might possibly occur when discussing this patch - it's probably reserved
 for the future?

 On why our introspection engine decides to restrict access to those
 specific pages, I am not intimate with its inner workings, and not sure
 how much could be disclosed here in any case. Is it not a worthwhile
 (and otherwise harmless) tool to be able to switch A/D bits-triggered
 EPT faults anyway, for introspection purposes?
>>>
>>> It changes the default behavior of mem_access events so I just wanted
>>> to get some background on when that is really required. Technically
>>> there is no reason why we couldn't do that filtering in Xen. I think
>>> it might be better to flip the filter the other way though so the
>>> default behavior remains as is (ie. change the option to enable
>>> filtering instead of enabling monitoring).
>>
>> Wait, it shouldn't change the default behaviour at all. If nobody calls
>> that function, all the EPT event kinds should be sent out - the new
>> monitor flag is a "disable" flag for non-GLA event (the so-called
>> "nested page fault" events).
> 
> Oh yea you are right, I completely overlooked that it is named
> "nested_pagefault_disabled" =) Maybe a comment in the domctl header
> would be warranted to note that this is enabled by default when
> mem_access is used.

Other than adding the above mentioned comment, does anyone require other
changes we should make in V2?


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC v1 24/74] x86/guest: Hypercall support

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 14:05,  wrote:
> --- /dev/null
> +++ b/xen/arch/x86/guest/hypercall_page.S
> @@ -0,0 +1,79 @@
> +#include 
> +#include 
> +#include 
> +
> +.section ".text.page_aligned", "ax", @progbits
> +.p2align PAGE_SHIFT
> +
> +GLOBAL(hypercall_page)
> + /* Poisoned with `ret` for safety before hypercalls are set up. */
> +.fill PAGE_SIZE, 1, 0xc3

How is RET a useful poison value? Why not 0xcc?

> +.type hypercall_page, STT_OBJECT

I'd rather omit the type altogether - it's not really an object (nor a
function), the more that you produce individual entry symbols
below anyway.

> +.size hypercall_page, PAGE_SIZE
> +
> +/*
> + * Identify a specific hypercall in the hypercall page
> + * @param name Hypercall name.
> + */
> +#define DECLARE_HYPERCALL(name)  
>\
> +.globl HYPERCALL_ ## name;   
>\
> +.set   HYPERCALL_ ## name, hypercall_page + __HYPERVISOR_ ## name * 
> 32; \
> +.type  HYPERCALL_ ## name, STT_FUNC; 
>\
> +.size  HYPERCALL_ ## name, 32

This is certainly fine for now, but going forward wants to be
machine generated directly from the header, so that it won't
need touching when new hypercalls are being added. Until
then I wonder whether you really need all the entries you
enumerate below - some (like iret) are plain invalid for PVH.

> --- /dev/null
> +++ b/xen/include/asm-x86/guest/hypercall.h
> @@ -0,0 +1,92 @@
> +/**
> + * asm-x86/guest/hypercall.h
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms and conditions of the GNU General Public
> + * License, version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public
> + * License along with this program; If not, see 
> .
> + *
> + * Copyright (c) 2017 Citrix Systems Ltd.
> + */
> +
> +#ifndef __X86_XEN_HYPERCALL_H__
> +#define __X86_XEN_HYPERCALL_H__
> +
> +#ifdef CONFIG_XEN_GUEST
> +
> +/*
> + * Hypercall primitives for 64bit
> + *
> + * Inputs: %rdi, %rsi, %rdx, %r10, %r8, %r9 (arguments 1-6)
> + */
> +
> +#define _hypercall64_1(type, hcall, a1) \
> +({  \
> +long res, tmp;  \

Especially for tmp I think it would be quite a bit safer if it
had a trailing underscore attached, so that an occasional use
of

_hypercall64_1(..., tmp);

would work as intended.
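
A sketch of the macro with that suggestion applied (illustrative only; the
constraints mirror the quoted wrapper and are not necessarily the final code):

    #define _hypercall64_1(type, hcall, a1)                             \
    ({                                                                  \
        long res_, tmp_ = (long)(a1);                                   \
        asm volatile ( "call hypercall_page + %c[offset]"               \
                       : "=a" (res_), "+D" (tmp_)                       \
                       : [offset] "i" ((hcall) * 32)                    \
                       : "memory" );                                    \
        (type)res_;                                                     \
    })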

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC v1 25/74] x86/shutdown: Support for using SCHEDOP_{shutdown, reboot}

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 14:05,  wrote:
> From: Andrew Cooper 
> 
> Signed-off-by: Andrew Cooper 
> Signed-off-by: Wei Liu 

Reviewed-by: Jan Beulich 
with two remarks:

> --- a/xen/include/asm-x86/guest/hypercall.h
> +++ b/xen/include/asm-x86/guest/hypercall.h
> @@ -19,6 +19,11 @@
>  #ifndef __X86_XEN_HYPERCALL_H__
>  #define __X86_XEN_HYPERCALL_H__
>  
> +#include 
> +
> +#include 
> +#include 
> +
>  #ifdef CONFIG_XEN_GUEST

Why do you #include ahead of the #ifdef?

> @@ -78,6 +83,30 @@
>  (type)res;  \
>  })
>  
> +/*
> + * Primitive Hypercall wrappers
> + */
> +static inline long xen_hypercall_sched_op(unsigned int cmd, void *arg)
> +{
> +return _hypercall64_2(long, __HYPERVISOR_sched_op, cmd, arg);
> +}
> +
> +/*
> + * Higher level hypercall helpers
> + */
> +static inline long xen_hypercall_shutdown(unsigned int reason)
> +{
> +return xen_hypercall_sched_op(SCHEDOP_shutdown, &reason);

It would seem more correct if you went through struct
sched_shutdown here.
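
That is, roughly the following sketch, with struct sched_shutdown and its
reason field coming from the public sched.h interface:

    static inline long xen_hypercall_shutdown(unsigned int reason)
    {
        struct sched_shutdown s = { .reason = reason };

        return xen_hypercall_sched_op(SCHEDOP_shutdown, &s);
    }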

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC v1 26/74] x86/pvh: Retrieve memory map from Xen

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 14:05,  wrote:
> Signed-off-by: Wei Liu 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC v1 27/74] xen/console: Introduce console=xen

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 14:05,  wrote:
> From: Andrew Cooper 
> 
> This specifies whether to use Xen specific console output. There are
> two variants: one is the hypervisor console, the other is the magic
> debug port 0xe9.

With just x86 in mind this is all fine, but for ARM (and for other
reasons even for x86) this surely wants some #ifdef-s added.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC v1 24/74] x86/guest: Hypercall support

2018-01-05 Thread Andrew Cooper
On 05/01/18 13:53, Jan Beulich wrote:
 On 04.01.18 at 14:05,  wrote:
>> --- /dev/null
>> +++ b/xen/arch/x86/guest/hypercall_page.S
>> @@ -0,0 +1,79 @@
>> +#include 
>> +#include 
>> +#include 
>> +
>> +.section ".text.page_aligned", "ax", @progbits
>> +.p2align PAGE_SHIFT
>> +
>> +GLOBAL(hypercall_page)
>> + /* Poisoned with `ret` for safety before hypercalls are set up. */
>> +.fill PAGE_SIZE, 1, 0xc3
> How is RET a useful poison value? Why not 0xcc?

This was all imported basically verbatim from XTF (which also answers
some of your questions further down).

ret over 0xcc prevents problems when crashing early.  Turning the
preferred schedop_shutdown() into a nop stops you taking a cascade fault,
and instead lets you try a different shutdown mechanism.

Also, before my recent patch to fix int3 behaviour, Xen would happily
execute its way (slowly) through debug traps without printing anything
useful.

>
>> +.type hypercall_page, STT_OBJECT
> I'd rather omit the type altogether - it's not really an object (nor a
> function), the more that you produce individual entry symbols
> below anyway.
>
>> +.size hypercall_page, PAGE_SIZE
>> +
>> +/*
>> + * Identify a specific hypercall in the hypercall page
>> + * @param name Hypercall name.
>> + */
>> +#define DECLARE_HYPERCALL(name) 
>> \
>> +.globl HYPERCALL_ ## name;  
>> \
>> +.set   HYPERCALL_ ## name, hypercall_page + __HYPERVISOR_ ## name * 
>> 32; \
>> +.type  HYPERCALL_ ## name, STT_FUNC;
>> \
>> +.size  HYPERCALL_ ## name, 32
> This is certainly fine for now, but going forward wants to be
> machine generated directly from the header, so that it won't
> need touching when new hypercalls are being added. Until
> then I wonder whether you really need all the entries you
> enumerate below - some (like iret) are plain invalid for PVH.
>
>> --- /dev/null
>> +++ b/xen/include/asm-x86/guest/hypercall.h
>> @@ -0,0 +1,92 @@
>> +/**
>> + * asm-x86/guest/hypercall.h
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms and conditions of the GNU General Public
>> + * License, version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> + * General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public
>> + * License along with this program; If not, see 
>> .
>> + *
>> + * Copyright (c) 2017 Citrix Systems Ltd.
>> + */
>> +
>> +#ifndef __X86_XEN_HYPERCALL_H__
>> +#define __X86_XEN_HYPERCALL_H__
>> +
>> +#ifdef CONFIG_XEN_GUEST
>> +
>> +/*
>> + * Hypercall primitives for 64bit
>> + *
>> + * Inputs: %rdi, %rsi, %rdx, %r10, %r8, %r9 (arguments 1-6)
>> + */
>> +
>> +#define _hypercall64_1(type, hcall, a1) \
>> +({  \
>> +long res, tmp;  \
> Especially for tmp I think it would be quite a bit safer if it
> had a trailing underscore attached, so that an occasional use
> of
>
> _hypercall64_1(..., tmp);
>
> would work as intended.

Hmm.  I'd not even considered that issue.  I'll add it to my todo list.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH FAIRLY-RFC 00/44] x86: Prerequisite work for a Xen KAISER solution

2018-01-05 Thread George Dunlap
On Fri, Jan 5, 2018 at 9:39 AM, Juergen Gross  wrote:
> On 05/01/18 10:26, Andrew Cooper wrote:
>> On 05/01/2018 07:48, Juergen Gross wrote:
>>> On 04/01/18 21:21, Andrew Cooper wrote:
 This work was developed as an SP3 mitigation, but shelved when it became 
 clear
 that it wasn't viable to get done in the timeframe.

 To protect against SP3 attacks, most mappings need to be flushed while in
 user context.  However, to protect against all cross-VM attacks, it is
 necessary to ensure that the Xen stacks are not mapped in any other cpus
 address space, or an attacker can still recover at least the GPR state of
 separate VMs.
>>> Above statement is too strict: it would be sufficient if no stacks of
>>> other domains are mapped.
>>
>> Sadly not.  Having stacks shared by domain means one vcpu can still
>> steal at least GPR state from other vcpus belonging to the same domain.
>>
>> Whether or not a specific kernel cares, some definitely will.
>>
>>> I'm just working on a proof of concept using dedicated per-vcpu stacks
>>> for 64 bit pv domains. Those stacks would be mapped in the per-domain
>>> region of the address space. I hope to have a RFC version of the patches
>>> ready next week.
>>>
>>> This would allow to remove the per physical cpu mappings in the guest
>>> visible address space when doing page table isolation.
>>>
>>> In order to avoid SP3 attacks to other vcpu's stacks of the same guest
>>> we could extend the pv ABI to mark a guest's user L4 page table as
>>> "single use", i.e. not allowed to be active on multiple vcpus at the
>>> same time (introducing that ABI modification in the Linux kernel would
>>> be simple, as the Linux kernel currently lacks support for cross-cpu
>>> stack exploits and when that support is being added by per-cpu L4 user
>>> page tables we could just chime in). A L4 page table marked as "single
>>> use" would map the local vcpu stacks only.
>>
>> For PV guests, it is the Xen stacks which matter, not the vcpu guest
>> kernel's ones.
>
> Indeed. That's the reason I want to have per-vcpu Xen stacks.
>
>> 64bit PV guest kernels are already mitigated better than KPTI can ever
>> manage, because there are no entry stacks or entry stubs required to be
>> mapped into guest userspace at all.
>
> But without Xen being secured via a mechanism similar to KPTI this
> is moot, as user mode can exploit the whole host including its own
> kernel's memory.

Here's a question:  What if we didn't try to prevent the guest from
reading hypervisor memory at all, but instead just tried to make sure
that there was nothing of interest there?

If sensitive information pertaining to a given vcpu were only mapped on
the processor currently running that vcpu, then it would mitigate not
only SP3, but also SP2 and SP1.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC v1 28/74] x86: initialise shared_info page

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 14:05,  wrote:
> --- a/xen/arch/x86/guest/xen.c
> +++ b/xen/arch/x86/guest/xen.c
> @@ -72,6 +72,30 @@ void __init probe_hypervisor(void)
>  xen_guest = true;
>  }
>  
> +static void map_shared_info(struct e820map *e820)
> +{
> +paddr_t frame = 0xff00; /* TODO: Hardcoded beside magic frames. */

What are the plans here?

> +struct xen_add_to_physmap xatp = {
> +.domid = DOMID_SELF,
> +.idx = 0,
> +.space = XENMAPSPACE_shared_info,
> +.gpfn = frame >> PAGE_SHIFT,
> +};
> +
> +if ( !e820_add_range(e820, frame, frame + PAGE_SIZE, E820_RESERVED) )
> +panic("Failed to reserve shared_info range");
> +
> +if ( xen_hypercall_memory_op(XENMEM_add_to_physmap, &xatp) )
> +panic("Failed to map shared_info page");

Also report the error code?
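
For example, a minimal sketch (assuming Xen's printf-style panic()):

    int rc = xen_hypercall_memory_op(XENMEM_add_to_physmap, &xatp);

    if ( rc )
        panic("Failed to map shared_info page: %d", rc);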

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 5/6] xen: Add only xen-sysdev to dynamic sysbus device list

2018-01-05 Thread Anthony PERARD
On Sat, Nov 25, 2017 at 01:16:09PM -0200, Eduardo Habkost wrote:
> There's no need to make the machine allow every possible sysbus
> device.  We can now just add xen-sysdev to the allowed list.
> 
> Cc: Stefano Stabellini 
> Cc: Anthony Perard 
> Cc: xen-devel@lists.xenproject.org
> Cc: Juergen Gross 
> Signed-off-by: Eduardo Habkost 

I've tested the patch series with every hotplug thing I could think of,
and it worked fine.

Acked-by: Anthony PERARD 

> ---
> Changes series v1 -> v2:
> * New patch added to series
> ---
>  hw/xen/xen_backend.c | 7 +--
>  1 file changed, 1 insertion(+), 6 deletions(-)
> 
> diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c
> index 82380ea9ee..7445b506ac 100644
> --- a/hw/xen/xen_backend.c
> +++ b/hw/xen/xen_backend.c
> @@ -564,12 +564,7 @@ static void xen_set_dynamic_sysbus(void)
>  ObjectClass *oc = object_get_class(machine);
>  MachineClass *mc = MACHINE_CLASS(oc);
>  
> -/*
> - * Emulate old mc->has_dynamic_sysbus=true assignment
> - *
> - *TODO: add only Xen devices to the list
> - */
> -machine_class_allow_dynamic_sysbus_dev(mc, TYPE_SYS_BUS_DEVICE);
> +machine_class_allow_dynamic_sysbus_dev(mc, TYPE_XENSYSDEV);
>  }
>  
>  int xen_be_register(const char *type, struct XenDevOps *ops)

-- 
Anthony PERARD

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH FAIRLY-RFC 00/44] x86: Prerequisite work for a Xen KAISER solution

2018-01-05 Thread Juergen Gross
On 05/01/18 15:11, George Dunlap wrote:
> On Fri, Jan 5, 2018 at 9:39 AM, Juergen Gross  wrote:
>> On 05/01/18 10:26, Andrew Cooper wrote:
>>> On 05/01/2018 07:48, Juergen Gross wrote:
 On 04/01/18 21:21, Andrew Cooper wrote:
> This work was developed as an SP3 mitigation, but shelved when it became 
> clear
> that it wasn't viable to get done in the timeframe.
>
> To protect against SP3 attacks, most mappings need to be flushed while in
> user context.  However, to protect against all cross-VM attacks, it is
> necessary to ensure that the Xen stacks are not mapped in any other cpus
> address space, or an attacker can still recover at least the GPR state of
> separate VMs.
 Above statement is too strict: it would be sufficient if no stacks of
 other domains are mapped.
>>>
>>> Sadly not.  Having stacks shared by domain means one vcpu can still
>>> steal at least GPR state from other vcpus belonging to the same domain.
>>>
>>> Whether or not a specific kernel cares, some definitely will.
>>>
 I'm just working on a proof of concept using dedicated per-vcpu stacks
 for 64 bit pv domains. Those stacks would be mapped in the per-domain
 region of the address space. I hope to have a RFC version of the patches
 ready next week.

 This would allow to remove the per physical cpu mappings in the guest
 visible address space when doing page table isolation.

 In order to avoid SP3 attacks to other vcpu's stacks of the same guest
 we could extend the pv ABI to mark a guest's user L4 page table as
 "single use", i.e. not allowed to be active on multiple vcpus at the
 same time (introducing that ABI modification in the Linux kernel would
 be simple, as the Linux kernel currently lacks support for cross-cpu
 stack exploits and when that support is being added by per-cpu L4 user
 page tables we could just chime in). A L4 page table marked as "single
 use" would map the local vcpu stacks only.
>>>
>>> For PV guests, it is the Xen stacks which matter, not the vcpu guest
>>> kernel's ones.
>>
>> Indeed. That's the reason I want to have per-vcpu Xen stacks.
>>
>>> 64bit PV guest kernels are already mitigated better than KPTI can ever
>>> manage, because there are no entry stacks or entry stubs required to be
>>> mapped into guest userspace at all.
>>
>> But without Xen being secured via a mechanism similar to KPTI this
>> is moot, as user mode can exploit the whole host including its own
>> kernel's memory.
> 
> Here's a question:  What if we didn't try to prevent the guest from
> reading hypervisor memory at all, but instead just tried to make sure
> that there was nothing of interest there?
> 
> If sensitive information pertaining to a given vcpu were only mapped on
> the processor currently running that vcpu, then it would mitigate not
> only SP3, but also SP2 and SP1.

You are aware this includes the mappings when running in the hypervisor?
So i.e. the mapping of physical memory of the host...


Juergen


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC v1 29/74] x86: xen pv clock time source

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 14:05,  wrote:
> It is a variant of TSC clock source.
> 
> Signed-off-by: Wei Liu 
> Signed-off-by: Andrew Cooper 
> Signed-off-by: Roger Pau Monné 

Mostly fine, with the TODO addressed, u64 etc replaced by uint64_t
etc, ...

> +static always_inline
> +u64 __read_cycle(const struct vcpu_time_info *info, u64 tsc)

... the double underscores dropped here, and ...

> +static u64 last_value;

... this moved into the only function it's needed in.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC v1 28/74] x86: initialise shared_info page

2018-01-05 Thread Andrew Cooper
On 05/01/18 14:11, Jan Beulich wrote:
 On 04.01.18 at 14:05,  wrote:
>> --- a/xen/arch/x86/guest/xen.c
>> +++ b/xen/arch/x86/guest/xen.c
>> @@ -72,6 +72,30 @@ void __init probe_hypervisor(void)
>>  xen_guest = true;
>>  }
>>  
>> +static void map_shared_info(struct e820map *e820)
>> +{
>> +paddr_t frame = 0xff00; /* TODO: Hardcoded beside magic frames. */
> What are the plans here?

Nothing immediately.  This is compatible with all versions of libxc in
existence, but we need to start a thread discussing HVM guest physical
address space.

We've also just found a passive performance hole, where enabling any
kind of PCI Passthrough causes Windows and Linux's grant table mappings
to turn uncached because they are allocated inside what the OS thinks is
an MMIO BAR.

I'll start a thread when I'm a little less busy.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH FAIRLY-RFC 00/44] x86: Prerequisite work for a Xen KAISER solution

2018-01-05 Thread Jan Beulich
>>> On 05.01.18 at 15:11,  wrote:
> Here's a question:  What if we didn't try to prevent the guest from
> reading hypervisor memory at all, but instead just tried to make sure
> that there was nothing of interest there?
> 
> If sensitive information pertaining to a given vcpu were only mapped on
> the processor currently running that vcpu, then it would mitigate not
> only SP3, but also SP2 and SP1.

Unless there were hypervisor secrets pertaining to this guest.
Also, while the idea behind your question is certainly nice, fully
separating memories related to individual guests would come
at quite significant a price: No direct access to a random
domain's control structures would be possible anymore, which
I'd foresee to be a problem in particular when wanting to
forward interrupts / event channel operations to the right
destination. But as I've said elsewhere recently: With all the
workarounds now being put in place, perhaps we don't care
about performance all that much anymore anyway...

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH FAIRLY-RFC 00/44] x86: Prerequisite work for a Xen KAISER solution

2018-01-05 Thread Jan Beulich
>>> On 05.01.18 at 15:21,  wrote:
> We already have map_domain_page(), as a result of 32-bit mode and
> >5TiB mode, so getting the domain pages out of the HV should be pretty
> easy.

E.g. by doing away with the directmap altogether.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC v1 28/74] x86: initialise shared_info page

2018-01-05 Thread Roger Pau Monné
On Fri, Jan 05, 2018 at 02:20:16PM +, Andrew Cooper wrote:
> On 05/01/18 14:11, Jan Beulich wrote:
>  On 04.01.18 at 14:05,  wrote:
> >> --- a/xen/arch/x86/guest/xen.c
> >> +++ b/xen/arch/x86/guest/xen.c
> >> @@ -72,6 +72,30 @@ void __init probe_hypervisor(void)
> >>  xen_guest = true;
> >>  }
> >>  
> >> +static void map_shared_info(struct e820map *e820)
> >> +{
> >> +paddr_t frame = 0xff00; /* TODO: Hardcoded beside magic frames. */
> > What are the plans here?
> 
> Nothing immediately.  This is compatible with all versions of libxc in
> existence, but we need to start a thread discussing HVM guest physical
> address space.

Patches 43/44/45 remove this hardcoding.

Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH FAIRLY-RFC 00/44] x86: Prerequisite work for a Xen KAISER solution

2018-01-05 Thread George Dunlap
On Fri, Jan 5, 2018 at 2:17 PM, Juergen Gross  wrote:
> On 05/01/18 15:11, George Dunlap wrote:
>> On Fri, Jan 5, 2018 at 9:39 AM, Juergen Gross  wrote:
>>> On 05/01/18 10:26, Andrew Cooper wrote:
 On 05/01/2018 07:48, Juergen Gross wrote:
> On 04/01/18 21:21, Andrew Cooper wrote:
>> This work was developed as an SP3 mitigation, but shelved when it became 
>> clear
>> that it wasn't viable to get done in the timeframe.
>>
>> To protect against SP3 attacks, most mappings need to be flushed while 
>> in
>> user context.  However, to protect against all cross-VM attacks, it is
>> necessary to ensure that the Xen stacks are not mapped in any other cpus
>> address space, or an attacker can still recover at least the GPR state of
>> separate VMs.
> Above statement is too strict: it would be sufficient if no stacks of
> other domains are mapped.

 Sadly not.  Having stacks shared by domain means one vcpu can still
 steal at least GPR state from other vcpus belonging to the same domain.

 Whether or not a specific kernel cares, some definitely will.

> I'm just working on a proof of concept using dedicated per-vcpu stacks
> for 64 bit pv domains. Those stacks would be mapped in the per-domain
> region of the address space. I hope to have a RFC version of the patches
> ready next week.
>
> This would allow to remove the per physical cpu mappings in the guest
> visible address space when doing page table isolation.
>
> In order to avoid SP3 attacks to other vcpu's stacks of the same guest
> we could extend the pv ABI to mark a guest's user L4 page table as
> "single use", i.e. not allowed to be active on multiple vcpus at the
> same time (introducing that ABI modification in the Linux kernel would
> be simple, as the Linux kernel currently lacks support for cross-cpu
> stack exploits and when that support is being added by per-cpu L4 user
> page tables we could just chime in). A L4 page table marked as "single
> use" would map the local vcpu stacks only.

 For PV guests, it is the Xen stacks which matter, not the vcpu guest
 kernel's ones.
>>>
>>> Indeed. That's the reason I want to have per-vcpu Xen stacks.
>>>
 64bit PV guest kernels are already mitigated better than KPTI can ever
 manage, because there are no entry stacks or entry stubs required to be
 mapped into guest userspace at all.
>>>
>>> But without Xen being secured via a mechanism similar to KPTI this
>>> is moot, as user mode can exploit the whole host including its own
>>> kernel's memory.
>>
>> Here's a question:  What if we didn't try to prevent the guest from
>> reading hypervisor memory at all, but instead just tried to make sure
>> that there was nothing of interest there?
>>
>> If sensitive information pertaining to a given vcpu were only mapped on
>> the processor currently running that vcpu, then it would mitigate not
>> only SP3, but also SP2 and SP1.
>
> You are aware this includes the mappings when running in the hypervisor?
> So i.e. the mapping of physical memory of the host...

Yes, of course.  You'd have to map domain memory on-demand, and make
sure it was unmapped before switching to a different domain.  (And in
the case of 64-bit PV guests, before switching back to guest space.)
And you'd have to try to identify as much 'sensitive' information as
possible and move it out of the xen-wide domain heap, into per-domain
structures.

We already have map_domain_page(), as a result of 32-bit mode and
>5TiB mode, so getting the domain pages out of the HV should be pretty
easy.
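
For reference, the transient-mapping pattern referred to here looks roughly
like the sketch below; mfn and buf are placeholders, and map_domain_page()
takes an mfn_t in this era of the tree:

    /* Sketch: map a domain page on demand instead of via the directmap. */
    void *p = map_domain_page(mfn);   /* per-cpu, transient mapping */

    memcpy(buf, p, PAGE_SIZE);        /* use the transient mapping ... */
    unmap_domain_page(p);             /* ... and tear it down again */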

 -George

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH FAIRLY-RFC 00/44] x86: Prerequisite work for a Xen KAISER solution

2018-01-05 Thread Andrew Cooper
On 05/01/18 14:27, Jan Beulich wrote:
 On 05.01.18 at 15:11,  wrote:
>> Here's a question:  What if we didn't try to prevent the guest from
>> reading hypervisor memory at all, but instead just tried to make sure
>> that there was nothing of interest there?
>>
>> If sensitive information pertaining to a given vcpu were only mapped on
>> the processor currently running that vcpu, then it would mitigate not
>> only SP3, but also SP2 and SP1.
> Unless there were hypervisor secrets pertaining to this guest.
> Also, while the idea behind your question is certainly nice, fully
> separating memories related to individual guests would come
> at quite significant a price: No direct access to a random
> domain's control structures would be possible anymore, which
> I'd foresee to be a problem in particular when wanting to
> forward interrupts / event channel operations to the right
> destination. But as I've said elsewhere recently: With all the
> workarounds now being put in place, perhaps we don't care
> about performance all that much anymore anyway...

Even if we did manage to isolate the mappings to only domain-pertinent
information (which is hard, because interrupts need to still work), you
still don't protect against a piece of userspace using SP2 to attack a
co-scheduled piece of userspace in the domain.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC v1 30/74] x86: APIC timer calibration when running as a guest

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 14:05,  wrote:
> The timer calibration depends on the number of ticks. Introduce a
> variant to wait for a tick when running as a guest.

The change itself is fine, i.e.
Reviewed-by: Jan Beulich 
but the description (to me, but it may be just me) doesn't really
match it. How about

The timer calibration currently depends on PIT. Introduce a variant
to wait for a tick's worth of time to elapse when running as a PVH
guest.

Jan

> Signed-off-by: Wei Liu 
> ---
>  xen/arch/x86/apic.c | 38 ++
>  1 file changed, 30 insertions(+), 8 deletions(-)
> 
> diff --git a/xen/arch/x86/apic.c b/xen/arch/x86/apic.c
> index ed59440c45..5039173827 100644
> --- a/xen/arch/x86/apic.c
> +++ b/xen/arch/x86/apic.c
> @@ -36,6 +36,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  
>  static bool __read_mostly tdt_enabled;
>  static bool __initdata tdt_enable = true;
> @@ -1091,6 +1093,20 @@ static void setup_APIC_timer(void)
>  local_irq_restore(flags);
>  }
>  
> +static void wait_tick_pvh(void)
> +{
> +u64 lapse_ns = 1000000000ULL / HZ;
> +s_time_t start, curr_time;
> +
> +start = NOW();
> +
> +/* Won't wrap around */
> +do {
> +cpu_relax();
> +curr_time = NOW();
> +} while ( curr_time - start < lapse_ns );
> +}
> +
>  /*
>   * In this function we calibrate APIC bus clocks to the external
>   * timer. Unfortunately we cannot use jiffies and the timer irq
> @@ -1123,12 +1139,15 @@ static int __init calibrate_APIC_clock(void)
>   */
>  __setup_APIC_LVTT(1000000000);
>  
> -/*
> - * The timer chip counts down to zero. Let's wait
> - * for a wraparound to start exact measurement:
> - * (the current tick might have been already half done)
> - */
> -wait_8254_wraparound();
> +if ( !xen_guest )
> +/*
> + * The timer chip counts down to zero. Let's wait
> + * for a wraparound to start exact measurement:
> + * (the current tick might have been already half done)
> + */
> +wait_8254_wraparound();
> +else
> +wait_tick_pvh();
>  
>  /*
>   * We wrapped around just now. Let's start:
> @@ -1137,10 +1156,13 @@ static int __init calibrate_APIC_clock(void)
>  tt1 = apic_read(APIC_TMCCT);
>  
>  /*
> - * Let's wait LOOPS wraprounds:
> + * Let's wait LOOPS ticks:
>   */
>  for (i = 0; i < LOOPS; i++)
> -wait_8254_wraparound();
> +if ( !xen_guest )
> +wait_8254_wraparound();
> +else
> +wait_tick_pvh();
>  
>  tt2 = apic_read(APIC_TMCCT);
>  t2 = rdtsc_ordered();
> -- 
> 2.11.0
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org 
> https://lists.xenproject.org/mailman/listinfo/xen-devel 




___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] Xen Project Spectre/Meltdown FAQ

2018-01-05 Thread Julien Grall
(apologies for the formatting)

Hi Lars,

Thank you for putting together an FAQ.

Few comments below around Arm.


On 5 Jan 2018 13:37, "Lars Kurth"  wrote:

Hi all, this is a repost of https://blog.xenproject.org/
2018/01/04/xen-project-spectremeltdown-faq/ for xen-users/xen-devel. If you
have questions, please reply to this thread and we will try and improve the
FAQ based on questions.
Regards
Lars


Google’s Project Zero announced several information leak vulnerabilities
affecting all modern superscalar processors. Details can be found on their
blog, and in the Xen Project Advisory 254 [1]. To help our users understand
the impact and our next steps forward, we put together the following FAQ.

Note that we will update the FAQ as new information surfaces.

= Is Xen impacted by Meltdown and Spectre? =

There are two angles to consider for this question:

* Can an untrusted guest attack the hypervisor using Meltdown or Spectre?
* Can a guest user-space program attack a guest kernel using Meltdown or
Spectre?

Systems running Xen, like all operating systems and hypervisors, are
potentially affected by Spectre (referred to as SP1 and SP2 in Advisory 254
[1]). For Arm processors, you can find information on which processors are
impacted here [2].  In general, both the hypervisor and a guest kernel are
vulnerable to attack via SP1 and SP2.


The website lists processors designed by Arm (i.e. the Cortex family). It does
not include processors made by Arm licensees. I will let the various
licensees speak for themselves here.

Regarding Arm-designed processors, most of them are not vulnerable to any
variant. Those affected will mostly be vulnerable to attack via SP1 and SP2.

But this does not rule out attack via SP3 on Arm. From the website, one
Cortex processor is affected.

While this will not affect Xen (the hypervisor uses a different set of
page tables), guest kernels will be vulnerable to it.


Only Intel processors are impacted by Meltdown (referred to as SP3 in
Advisory 254 [1]). On Intel processors, only 64-bit PV mode guests can
attack Xen. Guests running in 32-bit PV mode, HVM mode, and PVH mode cannot
attack the hypervisor using SP3. However, in 32-bit PV mode, HVM mode, and
PVH mode, guest userspaces can attack guest kernels using SP3; so updating
guest kernels is advisable.


Interestingly, guest kernels running in 64-bit PV mode are not vulnerable
to attack using SP3, because 64-bit PV guests already run in a KPTI-like
mode.

= Is there any risk of privilege escalation? =

Meltdown and Spectre are, by themselves, only information leaks. There is
no suggestion that speculative execution can be used to modify memory or
cause a system to do anything it might not have done already.

= Where can I find more information? =

We will update this blog post and Advisory 254 [1] as new information
becomes available. Updates will also be published on xen-announce@.

We will also maintain a technical FAQ on our wiki [3] for answers to more
detailed technical questions that emerge on xen-devel@ and other
communication channels.

= Are there any patches for the vulnerability? =

We have prototype patches for a mitigation for Meltdown on Intel CPUs and a
mitigation for SP2/CVE-2017-5715, which are functional but have not
undergone rigorous review and have not been backported to all supported Xen
Project releases.

As information related to Meltdown and Spectre is now public, development
will continue in public on xen-devel@ and patches will be posted and
attached to Advisory 254 [1] as they become available in the next few days.

= Can SP1/SP2 be fixed at all? What plans are there to mitigate them? =

SP2 can be mitigated in two ways, both of which essentially prevent
speculative execution of indirect branches. The first is to flush the
branch prediction logic on entry into the hypervisor. This requires
microcode updates, which Intel and AMD are in the process of preparing, as
well as patches to the hypervisor which are also in process and should be
available soon.

The second is to do indirect jumps in a way which is not subject to
speculative execution. This requires the hypervisor to be recompiled with a
compiler that contains special new features. These new compiler features
are also in the process of being prepared for both gcc and clang, and
should be available soon.

SP1 is much more difficult to mitigate. We have some ideas we’re exploring,
but they’re still at the design stage at this point.

= Does Xen have any equivalent to Linux’s KPTI series? =

Linux’s KPTI series is designed to address SP3 only.  For Xen guests, only
64-bit PV guests are affected by SP3. A KPTI-like approach was explored
initially, but required significant ABI changes.  Instead we’ve decided to
go with an alternate approach, which is less disruptive and less complex to
implement. The chosen approach runs PV guests in a PVH container, which
ensures that PV guests continue to behave as before, while providing the
isolation that protects the hypervisor from SP3.

Re: [Xen-devel] [PATCH RFC v1 28/74] x86: initialise shared_info page

2018-01-05 Thread Andrew Cooper
On 05/01/18 14:28, Roger Pau Monné wrote:
> On Fri, Jan 05, 2018 at 02:20:16PM +, Andrew Cooper wrote:
>> On 05/01/18 14:11, Jan Beulich wrote:
>> On 04.01.18 at 14:05,  wrote:
 --- a/xen/arch/x86/guest/xen.c
 +++ b/xen/arch/x86/guest/xen.c
 @@ -72,6 +72,30 @@ void __init probe_hypervisor(void)
  xen_guest = true;
  }
  
 +static void map_shared_info(struct e820map *e820)
 +{
 +paddr_t frame = 0xff00; /* TODO: Hardcoded beside magic frames. */
>>> What are the plans here?
>> Nothing immediately.  This is compatible with all versions of libxc in
>> existance, but we need to start a thread discussing HVM guest physical
>> address space.
> Patches 43/44/45 remove this hardcoding.

Oh sorry - I'm even more out of date than I thought I was.

I'll get back to my other work.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC v1 31/74] x86: read wallclock from Xen running in pvh mode

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 14:05,  wrote:
> Signed-off-by: Wei Liu 

Reviewed-by: Jan Beulich 
with a suggestion on code structure:

> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -969,6 +969,36 @@ static unsigned long get_cmos_time(void)
>  return mktime(rtc.year, rtc.mon, rtc.day, rtc.hour, rtc.min, rtc.sec);
>  }
>  
> +static unsigned long noinline get_xen_wallclock_time(void)
> +{
> +#ifdef CONFIG_XEN_GUEST
> +struct shared_info *sh_info = XEN_shared_info;
> +uint32_t wc_version;
> +uint64_t wc_sec;
> +
> +do {
> +wc_version = sh_info->wc_version & ~1;
> +smp_rmb();
> +
> +wc_sec  = sh_info->wc_sec;
> +smp_rmb();
> +} while ( wc_version != sh_info->wc_version );
> +
> +return wc_sec + read_xen_timer() / 10;

Why not move all of this ...

> +#else
> +ASSERT_UNREACHABLE();
> +return 0;
> +#endif
> +}
> +
> +static unsigned long get_wallclock_time(void)
> +{

... here:

#ifdef CONFIG_XEN_GUEST
if ( xen_guest )
{
...
return wc_sec + read_xen_timer() / 10;
}
#endif

   return get_cmos_time();
}

avoiding one of these not very nice ASSERT_UNREACHABLE()?
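
Spelled out, the suggested structure might look like this (a sketch only;
the divisor assumes read_xen_timer() returns nanoseconds):

    static unsigned long get_wallclock_time(void)
    {
    #ifdef CONFIG_XEN_GUEST
        if ( xen_guest )
        {
            const struct shared_info *sh_info = XEN_shared_info;
            uint32_t wc_version;
            uint64_t wc_sec;

            /* Re-read while an update is in flight (odd version number). */
            do {
                wc_version = sh_info->wc_version & ~1;
                smp_rmb();
                wc_sec = sh_info->wc_sec;
                smp_rmb();
            } while ( wc_version != sh_info->wc_version );

            return wc_sec + read_xen_timer() / 1000000000UL;
        }
    #endif

        return get_cmos_time();
    }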

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC v1 32/74] x86: don't swallow the first command line item in pvh mode

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 14:05,  wrote:
> @@ -632,11 +633,10 @@ static char * __init cmdline_cook(char *p, const char 
> *loader_name)
>  while ( *p == ' ' )
>  p++;
>  
> -/* GRUB2 does not include image name as first item on command line. */
> -if ( loader_is_grub2(loader_name) )
> +if ( !loader_is_grub1(loader_name) )
>  return p;

Behavior here changes for xen.efi booted without grub2 afaict.
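
To spell the point out (a sketch; the GRUB1 row is assumed from the
surrounding code):

    /* loader          old: loader_is_grub2()?   new: !loader_is_grub1()?
     * GRUB1           fall through (strip)      fall through (strip)
     * GRUB2           return p (keep)           return p (keep)
     * xen.efi (EFI)   fall through (strip)      return p (keep)  <-- change
     */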

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] Xen Project Spectre/Meltdown FAQ

2018-01-05 Thread Lars Kurth
Julien,

> On 5 Jan 2018, at 14:40, Julien Grall  wrote:
> 
> (apologies for the formatting)
> 
> Hi Lars,
> 
> Thank you for putting together an FAQ.
> 
> Few comments below around Arm.
> 
> Systems running Xen, like all operating systems and hypervisors, are 
> potentially affected by Spectre (referred to as SP1 and SP2 in Advisory 254 
> [1]). For Arm Processors information, you can find which processors are 
> impacted here [2].  In general, both the hypervisor and a guest kernel are 
> vulnerable to attack via SP1 and SP2.
> 
> The website list processors designed by Arm (i.e Cortex family). It does not 
> include processors made by Arm licensees. I will leave the various licensees 
> speak for themselves here.
> 
> Regarding Arm-designed processors, most of them are not vulnerable to any 
> variant. Those affected will mostly be vulnerable to attack via SP1 and SP2.
> 
> But this does not rule out attack via SP3 on Arm. From the website, one 
> Cortex processor is affected.
> 
> While this will not affect Xen (the hypervisor uses a different set of 
> page-tables), guest kernels will be vulnerable to it.

I would be quite happy to have a specific question covering ARM/ARM eco-system 
where you can explain all this. Feel free to formulate a question + answer and 
I will add it.

Lars
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC v1 33/74] x86/guest: enable event channels upcalls

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 14:05,  wrote:
> @@ -30,6 +31,7 @@
>  bool xen_guest;
>  
>  static uint32_t xen_cpuid_base;
> +static uint8_t evtchn_upcall_vector;

There being a single global vector, why do you use
HVMOP_set_evtchn_upcall_vector instead of setting
HVM_PARAM_CALLBACK_IRQ? Aiui this would also make ...

> @@ -91,9 +93,81 @@ static void map_shared_info(struct e820map *e820)
>  set_fixmap(FIX_XEN_SHARED_INFO, frame);
>  }
>  
> +static void xen_evtchn_upcall(struct cpu_user_regs *regs)
> +{
> +unsigned int cpu = smp_processor_id();
> +struct vcpu_info *vcpu_info = &XEN_shared_info->vcpu_info[cpu];
> +
> +vcpu_info->evtchn_upcall_pending = 0;
> +xchg(&vcpu_info->evtchn_pending_sel, 0);
> +
> +ack_APIC_irq();

... this call unnecessary.

Also wouldn't it be better to decouple uses of vcpu_info from
XEN_shared_info right away, for the later extension to more
vCPU-s to be less intrusive?

Also - why xchg() rather than write_atomic() (again further down)?

> +static void ap_setup_event_channels(bool clear)
> +{
> +unsigned int i, cpu = smp_processor_id();
> +struct vcpu_info *vcpu_info = &XEN_shared_info->vcpu_info[cpu];
> +int rc;
> +
> +ASSERT(evtchn_upcall_vector);
> +ASSERT(cpu < ARRAY_SIZE(XEN_shared_info->vcpu_info));

Strictly speaking this assertion comes too late. But yes, we have
quite a few such examples elsewhere, so I don't really mind.

> +if ( !clear )
> +{
> +/*
> + * This is necessary to ensure that a CPU will be interrupted in case
> + * of an event channel notification.
> + */
> +ASSERT(vcpu_info->evtchn_upcall_pending == 0);
> +ASSERT(vcpu_info->evtchn_pending_sel == 0);
> +}
> +
> +rc = xen_hypercall_set_evtchn_upcall_vector(cpu, evtchn_upcall_vector);
> +if ( rc )
> +panic("Unable to set evtchn upcall vector: %d", rc);
> +
> +if ( clear )
> +{
> +/*
> + * Clear any pending upcall bits. This makes us effectively ignore 
> any
> + * previous upcalls which might be suboptimal.
> + */
> +vcpu_info->evtchn_upcall_pending = 0;
> +xchg(&vcpu_info->evtchn_pending_sel, 0);
> +
> +/*
> + * evtchn_pending can be cleared only on the boot CPU because it's
> + * located in a shared structure.
> + */
> +for ( i = 0; i < 8; i++ )

ARRAY_SIZE() (also further down)

I also don't really understand the comment - all CPUs can access
shared info. But then again I don't really understand all this
clearing anyway, including the respective ASSERT()s further up.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC v1 33/74] x86/guest: enable event channels upcalls

2018-01-05 Thread Andrew Cooper
On 05/01/18 15:07, Jan Beulich wrote:
 On 04.01.18 at 14:05,  wrote:
>> @@ -30,6 +31,7 @@
>>  bool xen_guest;
>>  
>>  static uint32_t xen_cpuid_base;
>> +static uint8_t evtchn_upcall_vector;
> There being a single global vector, why do you use
> HVMOP_set_evtchn_upcall_vector instead of setting
> HVM_PARAM_CALLBACK_IRQ?

Because another discovery is that HVM_PARAM_CALLBACK_IRQ is subtly
broken.  It is incompatible with L0 Xen choosing to use hardware APIC
assistance, due to its deliberate (ab)use of the IRR state model.

OTOH, there are patches (perhaps later, perhaps not posted yet) which do
try to make use of CALLBACK_IRQ for compatibility on older L0 hypervisors.
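
For comparison, the two mechanisms look roughly like this (a sketch; the
set_param wrapper is assumed to exist alongside the
xen_hypercall_hvm_get_param() used elsewhere in the series, and the
TYPE_VECTOR encoding is from xen/include/public/hvm/params.h):

    /* Global callback via the HVM param: Xen injects the vector by
     * setting the vLAPIC IRR bit directly, which is what clashes with
     * hardware APIC assistance: */
    xen_hypercall_hvm_set_param(HVM_PARAM_CALLBACK_IRQ,
                                (2ULL << 56) | vector);

    /* Per-vCPU vector: delivered as a genuine interrupt, hence the
     * ack_APIC_irq() in the handler above: */
    xen_hypercall_set_evtchn_upcall_vector(cpu, vector);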

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC v1 34/74] x86/guest: add PV console code

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 14:05,  wrote:
> --- /dev/null
> +++ b/xen/drivers/char/xen_pv_console.c
> @@ -0,0 +1,198 @@
> +/**
> + * drivers/char/xen_pv_console.c
> + *
> + * A frontend driver for Xen's PV console.
> + * Can be used when Xen is running on top of Xen in pv-in-pvh mode.
> + * (Linux's name for this is hvc console)
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
> + *
> + * Copyright (c) 2017 Citrix Systems Ltd.
> + */
> +
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +#include 
> +
> +#include 
> +
> +static struct xencons_interface *cons_ring;
> +static evtchn_port_t cons_evtchn;
> +static serial_rx_fn cons_rx_handler;
> +static DEFINE_SPINLOCK(tx_lock);
> +
> +void __init pv_console_init(void)
> +{
> +long r;
> +uint64_t raw_pfn = 0, raw_evtchn = 0;
> +
> +if ( !xen_guest )
> +{
> +printk("PV console init failed: xen_guest mode is not active!\n");
> +return;
> +}
> +
> +r = xen_hypercall_hvm_get_param(HVM_PARAM_CONSOLE_PFN, &raw_pfn);
> +if ( r < 0 )
> +goto error;
> +
> +r = xen_hypercall_hvm_get_param(HVM_PARAM_CONSOLE_EVTCHN, &raw_evtchn);
> +if ( r < 0 )
> +goto error;
> +
> +set_fixmap(FIX_PV_CONSOLE, raw_pfn << PAGE_SHIFT);
> +cons_ring = (struct xencons_interface *)fix_to_virt(FIX_PV_CONSOLE);

Pointless cast with the earlier return type change.

> +cons_evtchn = raw_evtchn;
> +
> +printk("Initialised PV console at 0x%p with pfn %#lx and evtchn %#x\n",

Does %#p not work?

> +void __init pv_console_set_rx_handler(serial_rx_fn fn)
> +{
> +cons_rx_handler = fn;
> +}

Especially this and ...

> +size_t pv_console_rx(struct cpu_user_regs *regs)
> +{
> +char c;
> +XENCONS_RING_IDX cons, prod;
> +size_t recv = 0;
> +
> +if ( !cons_ring )
> +return 0;
> +
> +/* TODO: move this somewhere */
> +if ( !test_bit(cons_evtchn, XEN_shared_info->evtchn_pending) )
> +return 0;

... the need for this and ...

> +prod = ACCESS_ONCE(cons_ring->in_prod);
> +cons = cons_ring->in_cons;
> +/* Get pointers before reading the ring */
> +smp_rmb();
> +
> +ASSERT((prod - cons) <= sizeof(cons_ring->in));
> +
> +while ( cons != prod )
> +{
> +c = cons_ring->in[MASK_XENCONS_IDX(cons++, cons_ring->in)];
> +if ( cons_rx_handler )
> +cons_rx_handler(c, regs);
> +recv++;
> +}
> +
> +/* No need for a mem barrier because every character was already 
> consumed */
> +barrier();
> +ACCESS_ONCE(cons_ring->in_cons) = cons;
> +notify_daemon();
> +
> +clear_bit(cons_evtchn, XEN_shared_info->evtchn_pending);

... this at this layer are very hard to judge about with all the code
here being dead for the moment. Can't this driver be modeled like
any other of the UART drivers, surfacing the accessors through
struct uart_driver (and making the ad-hoc call sites in the next
patch [mostly] unnecessary)?
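
For illustration, the shape such a driver might take (field names from
struct uart_driver in xen/include/xen/serial.h; the pv_console_* helpers
here are hypothetical):

    static void pv_uart_putc(struct serial_port *port, char c)
    {
        pv_console_puts(&c, 1);      /* assumed wrapper over the out ring */
    }

    static int pv_uart_getc(struct serial_port *port, char *c)
    {
        return pv_console_getc(c);   /* assumed: drain one char, if any */
    }

    static struct uart_driver __read_mostly xen_pv_console_driver = {
        .putc = pv_uart_putc,
        .getc = pv_uart_getc,
        /* .init_preirq would map the ring and fetch the event channel. */
    };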

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC v1 36/74] --- x86/shim: Kconfig and command line options

2018-01-05 Thread Jan Beulich
>>> On 04.01.18 at 14:05,  wrote:
> --- a/xen/arch/x86/Kconfig
> +++ b/xen/arch/x86/Kconfig
> @@ -133,6 +133,28 @@ config PVH_GUEST
>   ---help---
> Support booting using the PVH ABI.
>  
> +   If unsure, say N.
> +
> +config PV_SHIM
> + def_bool n
> + prompt "PV Shim"
> + depends on PV && XEN_GUEST
> + ---help---
> +   Build Xen with a mode which acts as a shim to allow PV guest to run
> +   in an HVM/PVH container. This mode can only be enabled with command
> +   line option.
> +
> +   If unsure, say N.
> +
> +config PV_SHIM_EXCLUSIVE
> + def_bool n
> + prompt "PV Shim Exclusive"
> + depends on PV_SHIM

My expectation so far was that this would be the only mode we
target, hence I think at the very least the default wants to be y
here.

> --- /dev/null
> +++ b/xen/arch/x86/pv/shim.c
> @@ -0,0 +1,39 @@
> +/**
> + * arch/x86/pv/shim.c
> + *
> + * Functionality for PV Shim mode
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
> + *
> + * Copyright (c) 2017 Citrix Systems Ltd.
> + */
> +#include 
> +#include 
> +
> +#include 
> +
> +#ifndef CONFIG_PV_SHIM_EXCLUSIVE
> +bool pv_shim;

__read_mostly (if not __initdata)?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] x86/svm: Add checks for nested HW features

2018-01-05 Thread Brian Woods
On Fri, Dec 22, 2017 at 03:15:48PM +, Andrew Cooper wrote:
> Unfortunately, nestedhvm_enabled() is guaranteed to be false at the
> point that construct_vmcb() is called (due the order in which
> information appears while constructing the VM), which means we will
> never enable these optimisations.
> 
> Combined with the observation of EFER in the pipeline, the logic to
> enable/disable these optimisations needs to be in
> svm_update_guest_efer(), and need to trigger when EFER.SVME changes.
> 
> ~Andrew

Sorry for the late reply.  I tried working on this before vacation but it
turned out to take a bit longer than that... I have a set of
patches that _should_ work, but there are other issues.  Turns out there
are interrupt issues with nested SVM HVM and I'm trying to hunt those
down and fix them so I can properly test the patches I've done.  Oddly
enough you can at least get a system booted on 17h family systems even
if it's fragile, but 15h just fails to even boot.  Not sure how it even
worked when I tested previous patches on the 15h system.

-- 
Brian Woods

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [linux-linus test] 117623: regressions - trouble: broken/fail/pass

2018-01-05 Thread osstest service owner
flight 117623 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/117623/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm   broken
 test-amd64-i386-qemuu-rhel6hvm-intel broken
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm   broken
 test-amd64-i386-xl-qemuu-debianhvm-amd64broken
 test-amd64-amd64-amd64-pvgrub broken
 test-amd64-i386-qemuu-rhel6hvm-intel 4 host-install(4) broken REGR. vs. 115643
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 4 host-install(4) broken 
REGR. vs. 115643
 test-amd64-i386-xl-qemuu-debianhvm-amd64 4 host-install(4) broken REGR. vs. 
115643
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 4 host-install(4) broken REGR. 
vs. 115643
 test-amd64-amd64-amd64-pvgrub  4 host-install(4)   broken REGR. vs. 115643
 test-amd64-amd64-examine  5 host-install   broken REGR. vs. 115643
 test-amd64-amd64-xl-qemuu-ws16-amd64 16 guest-localmigrate/x10 fail REGR. vs. 
115643

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 115643
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 115643
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 115643
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 115643
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 115643
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 115643
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 115643
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 115643
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass
 test-amd64-amd64-xl-qemuu-win10-i

Re: [Xen-devel] Xen Project Spectre/Meltdown FAQ

2018-01-05 Thread Hans van Kranenburg
On 01/05/2018 12:35 PM, Lars Kurth wrote:
> Hi all, this is a repost of
> https://blog.xenproject.org/2018/01/04/xen-project-spectremeltdown-faq/
> for xen-users/xen-devel. If you have questions, please reply to this
> thread and we will try and improve the FAQ based on questions. 
> Regards Lars

Thanks for the writeup.

The main reason for readers to get confused is the number of different
combinations of situations that are possible, each of which has its own
set of vulnerabilities and its own (possibly different) set of ways to be
used as an environment for executing an attack.

So let's help them by being more explicit.

> Google’s Project Zero announced several information leak
> vulnerabilities affecting all modern superscalar processors. Details
> can be found on their blog, and in the Xen Project Advisory 254 [1].
> To help our users understand the impact and our next steps forward,
> we put together the following FAQ.
> 
> Note that we will update the FAQ as new information surfaces.
> 
> = Is Xen impacted by Meltdown and Spectre? =
> 
> There are two angles to consider for this question:
> 
> * Can an untrusted guest attack the hypervisor using Meltdown or
> Spectre?
> * Can a guest user-space program attack a guest kernel using
> Meltdown or Spectre?

> Systems running Xen, like all operating systems and hypervisors, are
> potentially affected by Spectre (referred to as SP1 and SP2 in
> Advisory 254 [1]). For Arm Processors information, you can find which
> processors are impacted here [2].  In general, both the hypervisor
> and a guest kernel are vulnerable to attack via SP1 and SP2.
> 
> Only Intel processors are impacted by Meltdown (referred to as SP3 in
> Advisory 254 [1]).

> On Intel processors, only 64-bit PV mode guests can attack Xen.

"On Intel processors an attack at Xen using SP3 can only be done by
64-bit PV mode guests."

Even if it looks super-redundant, I think keeping explicit information
in every sentence is preferable, so that statements cannot be
misinterpreted or accidentally taken out of context.

> Guests running in 32-bit PV mode, HVM mode, and PVH
> mode cannot attack the hypervisor using SP3. However, in 32-bit PV
> mode, HVM mode, and PVH mode, guest userspaces can attack guest
> kernels using SP3; so updating guest kernels is advisable.

> Interestingly, guest kernels running in 64-bit PV mode are not
> vulnerable to attack using SP3, because 64-bit PV guests already run
> in a KPTI-like mode.

Like Juergen already mentioned, additionally: "However, keep in mind
> that a successful attack on the hypervisor can still be used to recover
information about the same guest from physical memory."

> = Is there any risk of privilege escalation? =
> 
> Meltdown and Spectre are, by themselves, only information leaks.
> There is no suggestion that speculative execution can be used to
> modify memory or cause a system to do anything it might not have done
> already.
> 
> = Where can I find more information? =
> 
> We will update this blog post and Advisory 254 [1] as new information
> becomes available. Updates will also be published on xen-announce@.
> 
> We will also maintain a technical FAQ on our wiki [3] for answers to
> more detailed technical questions that emerge on xen-devel@ and other
> communication channels.
> 
> = Are there any patches for the vulnerability? =
> 
> We have prototype patches for a mitigation for Meltdown on Intel CPUs
> and a Mitigation for SP2/CVE-2017-5715, which are functional but have
> not undergone rigorous review and have not been backported to all
> supported Xen Project releases.
> 
> As information related to Meltdown and Spectre is now public,
> development will continue in public on xen-devel@ and patches will be
> posted and attached to Advisory 254 [1] as they become available in
> the next few days.
> 
> = Can SP1/SP2 be fixed at all? What plans are there to mitigate them?
> =
> 
> SP2 can be mitigated in two ways, both of which essentially prevent
> speculative execution of indirect branches. The first is to flush the
> branch prediction logic on entry into the hypervisor. This requires
> microcode updates, which Intel and AMD are in the process of
> preparing, as well as patches to the hypervisor which are also in
> process and should be available soon.
> 
> The second is to do indirect jumps in a way which is not subject to
> speculative execution. This requires the hypervisor to be recompiled
> with a compiler that contains special new features. These new
> compiler features are also in the process of being prepared for both
> gcc and clang, and should be available soon.
> 
> SP1 is much more difficult to mitigate. We have some ideas we’re
> exploring, but they’re still at the design stage at this point.
> 
> = Does Xen have any equivalent to Linux’s KPTI series? =
> 
> Linux’s KPTI series is designed to address SP3 only.

This one...

> For Xen guests, only 64-bit PV guests are affected by SP3.

...should be more explicit. The words "affected" and "impacted" do not
tell the reader if it's about being an attacker, or about being the
victim and what is attacked or attacking.

"For Xen guests, only 64-bit PV guests are able to execute a SP3 attack
against the hypervisor."

[Xen-devel] [PATCH] x86/efi: fix build with linkers that support both coff-x86-64 and pe-x86-64

2018-01-05 Thread Roger Pau Monne
When using a linker that supports both formats the following error
will be triggered:

efi/buildid.o: file not recognized: File format is ambiguous
efi/buildid.o: matching formats: coff-x86-64 pe-x86-64

Solve this by explicitly specifying pe-x86-64 as the format of buildid.o.
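
(For reference, ld's -b/--format switch applies to the input object files
that follow it on the command line, which is why it is inserted right
before $(note_file) below; a simplified invocation for illustration:)

    # only buildid.o is read as PE; the preceding objects are auto-detected
    ld -T efi.lds -N prelink-efi.o -b pe-x86-64 buildid.o -o xen.efi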

Signed-off-by: Roger Pau Monné 
---
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
 xen/arch/x86/Makefile | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index f708323722..fbff9ac3dc 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -188,20 +188,20 @@ endif
 $(TARGET).efi: prelink-efi.o $(note_file) efi.lds efi/relocs-dummy.o $(BASEDIR)/common/symbols-dummy.o efi/mkreloc
 	$(foreach base, $(VIRT_BASE) $(ALT_BASE), \
 	          $(guard) $(LD) $(call EFI_LDFLAGS,$(base)) -T efi.lds -N $< efi/relocs-dummy.o \
-	                $(BASEDIR)/common/symbols-dummy.o $(note_file) -o $(@D)/.$(@F).$(base).0 &&) :
+	                $(BASEDIR)/common/symbols-dummy.o -b pe-x86-64 $(note_file) -o $(@D)/.$(@F).$(base).0 &&) :
 	$(guard) efi/mkreloc $(foreach base,$(VIRT_BASE) $(ALT_BASE),$(@D)/.$(@F).$(base).0) >$(@D)/.$(@F).0r.S
 	$(guard) $(NM) -pa --format=sysv $(@D)/.$(@F).$(VIRT_BASE).0 \
 		| $(guard) $(BASEDIR)/tools/symbols $(all_symbols) --sysv --sort >$(@D)/.$(@F).0s.S
 	$(guard) $(MAKE) -f $(BASEDIR)/Rules.mk $(@D)/.$(@F).0r.o $(@D)/.$(@F).0s.o
 	$(foreach base, $(VIRT_BASE) $(ALT_BASE), \
 	          $(guard) $(LD) $(call EFI_LDFLAGS,$(base)) -T efi.lds -N $< \
-	                $(@D)/.$(@F).0r.o $(@D)/.$(@F).0s.o $(note_file) -o $(@D)/.$(@F).$(base).1 &&) :
+	                $(@D)/.$(@F).0r.o $(@D)/.$(@F).0s.o -b pe-x86-64 $(note_file) -o $(@D)/.$(@F).$(base).1 &&) :
 	$(guard) efi/mkreloc $(foreach base,$(VIRT_BASE) $(ALT_BASE),$(@D)/.$(@F).$(base).1) >$(@D)/.$(@F).1r.S
 	$(guard) $(NM) -pa --format=sysv $(@D)/.$(@F).$(VIRT_BASE).1 \
 		| $(guard) $(BASEDIR)/tools/symbols $(all_symbols) --sysv --sort >$(@D)/.$(@F).1s.S
 	$(guard) $(MAKE) -f $(BASEDIR)/Rules.mk $(@D)/.$(@F).1r.o $(@D)/.$(@F).1s.o
 	$(guard) $(LD) $(call EFI_LDFLAGS,$(VIRT_BASE)) -T efi.lds -N $< \
-		$(@D)/.$(@F).1r.o $(@D)/.$(@F).1s.o $(note_file) -o $@
+		$(@D)/.$(@F).1r.o $(@D)/.$(@F).1s.o -b pe-x86-64 $(note_file) -o $@
 	if $(guard) false; then rm -f $@; echo 'EFI support disabled'; \
 	else $(NM) -pa --format=sysv $(@D)/$(@F) \
 		| $(BASEDIR)/tools/symbols --xensyms --sysv --sort >$(@D)/$(@F).map; fi
-- 
2.15.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] PCI Device Subtree Change from Traditional to Upstream

2018-01-05 Thread Kevin Stange
On 01/05/2018 03:03 AM, Paul Durrant wrote:
>> -Original Message-
>> From: Kevin Stange [mailto:ke...@steadfast.net]
>> Sent: 04 January 2018 21:17
>> To: Paul Durrant 
>> Cc: George Dunlap ; xen-
>> de...@lists.xenproject.org; Anthony Perard 
>> Subject: Re: [Xen-devel] PCI Device Subtree Change from Traditional to
>> Upstream
>>
>> On 01/04/2018 07:26 AM, Paul Durrant wrote:
 -Original Message-
 From: Xen-devel [mailto:xen-devel-boun...@lists.xenproject.org] On
>> Behalf
 Of Anthony PERARD
 Sent: 04 January 2018 12:52
 To: Kevin Stange 
 Cc: George Dunlap ; xen-
 de...@lists.xenproject.org
 Subject: Re: [Xen-devel] PCI Device Subtree Change from Traditional to
 Upstream

 On Wed, Jan 03, 2018 at 05:10:54PM -0600, Kevin Stange wrote:
> On 01/03/2018 11:57 AM, Anthony PERARD wrote:
>> On Wed, Dec 20, 2017 at 11:40:03AM -0600, Kevin Stange wrote:
>>> Hi,
>>>
>>> I've been working on transitioning a number of Windows guests
>> under
 HVM
>>> from using QEMU traditional to QEMU upstream as is recommended
>> in
 the
>>> documentation.  When I move these guests, the PCI subtree for Xen
>>> devices changes and Windows creates a totally new copy of each
 device.
>>> Windows tracks down the storage without issue, but it treats the new
>>> instance of the NIC driver as a new device and clears the network
>>> configuration even though the MAC address is unchanged.  Manually
>>> booting the guest back on the traditional device model reactivates the
>>> original PCI subtree and the old network configuration with it.
>>>
>>> The only thing that I have been able to find that's substantially
>>> different comparing the device trees is that the device instance ID
>>> values differ on the parent Xen PCI device:
>>>
>>>

>> PCI\VEN_5853&DEV_0001&SUBSYS_00015853&REV_01\3&267A616A&3&18
>>>
>>>

>> PCI\VEN_5853&DEV_0001&SUBSYS_00015853&REV_01\3&267A616A&3&10
>>>
>>> Besides actually setting the guest to boot using QEMU traditional, is
>>> there a way to convince Windows to treat these devices as the same?
>> A
>>> patch-based solution would be acceptable to me if there is one, but I
>>> don't understand the code well enough to create my own solution.
>>>
>>> Kevin,
>>>
>>> I missed the original email as it went past...
>>>
>>> Are Xen Project PV drivers installed in the guest? And are you talking about
>>> a PV NIC device or an emulated device?
>>
>> These guests use some of the older Xen PV drivers with a PV NIC, not an
>> emulated device.
>>
> 
> Ok. I was curious because the latest PV drivers contain a hack (that was 
> actually suggested by someone at Microsoft) to make sure that (as far as the 
> Windows PnP subsystem is concerned) the Xen platform device never moves once 
> the XENBUS driver has been installed. This is done by installing a filter 
> driver onto Windows' PCI bus driver that spots the platform device and 
> re-writes the trailing 'uniquifier' to be exactly what it was at the time of 
> driver installation.
> So, if you update your VMs to use newer PV drivers first, then you should be 
> immune to the platform device moving on the bus.

This is interesting and good to learn, but I had a lot of trouble in the
past trying to convert existing guests to use the modern PV drivers, due
to difficulties completely removing the old ones and getting Windows to
adopt the new ones.  The resulting mess is more work than dealing with
the current problem, which is why it would be nice to be able to just
massage the Windows guests to the desired configuration from outside.

-- 
Kevin Stange
Chief Technology Officer
Steadfast | Managed Infrastructure, Datacenter and Cloud Services
800 S Wells, Suite 190 | Chicago, IL 60607
312.602.2689 X203 | Fax: 312.602.2688
ke...@steadfast.net | www.steadfast.net

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] PCI Device Subtree Change from Traditional to Upstream

2018-01-05 Thread Kevin Stange
On 01/04/2018 03:16 PM, Kevin Stange wrote:
> On 01/04/2018 06:52 AM, Anthony PERARD wrote:
>> On Wed, Jan 03, 2018 at 05:10:54PM -0600, Kevin Stange wrote:
>>> On 01/03/2018 11:57 AM, Anthony PERARD wrote:
 On Wed, Dec 20, 2017 at 11:40:03AM -0600, Kevin Stange wrote:
> Hi,
>
> I've been working on transitioning a number of Windows guests under HVM
> from using QEMU traditional to QEMU upstream as is recommended in the
> documentation.  When I move these guests, the PCI subtree for Xen
> devices changes and Windows creates a totally new copy of each device.
> Windows tracks down the storage without issue, but it treats the new
> instance of the NIC driver as a new device and clears the network
> configuration even though the MAC address is unchanged.  Manually
> booting the guest back on the traditional device model reactivates the
> original PCI subtree and the old network configuration with it.
>
> The only thing that I have been able to find that's substantially
> different comparing the device trees is that the device instance ID
> values differ on the parent Xen PCI device:
>
> PCI\VEN_5853&DEV_0001&SUBSYS_00015853&REV_01\3&267A616A&3&18
>
> PCI\VEN_5853&DEV_0001&SUBSYS_00015853&REV_01\3&267A616A&3&10
>
> Besides actually setting the guest to boot using QEMU traditional, is
> there a way to convince Windows to treat these devices as the same?  A
> patch-based solution would be acceptable to me if there is one, but I
> don't understand the code well enough to create my own solution.

 Hi Kevin,

 I've got a patch to QEMU that seems to do the trick:

 From: Anthony PERARD 
 Subject: [PATCH] xen-platform: Hardcode PCI slot to 3

 Signed-off-by: Anthony PERARD 
 ---
  hw/i386/pc_piix.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
 index 5e47528993..93e3a9a916 100644
 --- a/hw/i386/pc_piix.c
 +++ b/hw/i386/pc_piix.c
 @@ -405,7 +405,7 @@ static void pc_xen_hvm_init(MachineState *machine)
  
  bus = pci_find_primary_bus();
  if (bus != NULL) {
 -pci_create_simple(bus, -1, "xen-platform");
 +pci_create_simple(bus, PCI_DEVFN(3, 0), "xen-platform");
  }
  }
  #endif


 The same thing could be done by libxl, by providing specific command
 line options to qemu. (I think that could even be done via a different
 config file for the guest.)
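
(For reference, QEMU packs slot and function into a devfn as

    /* from QEMU's include/hw/pci/pci.h */
    #define PCI_DEVFN(slot, fn)   ((((slot) & 0x1f) << 3) | ((fn) & 0x07))

so PCI_DEVFN(3, 0) == 0x18, matching the "&18" trailing uniquifier in the
working device instance ID quoted above, while upstream's default
placement yields "&10", i.e. slot 2.)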
>>>
>>> This patch doesn't seem to work for me.  It seems like the device model
>>> process is exiting immediately, but I haven't been able to find any
>>> information as to what is going wrong.  I tested with Xen 4.6.6 and the
>>> QEMU packaged with that release.  Should I try it on a different version
>>> of Xen and QEMU?
>>
>> What this patch does is ask QEMU to insert the PCI card
>> "xen-platform" into the 3rd PCI slot. My guess is that it failed because
>> there is already a PCI device there.
>>
>> You could check qemu's logs, it's in
>> /var/log/xen/qemu-dm-${guest_name}.log
> 
> The log file in question only says:
> 
> qemu: terminating on signal 1 from pid 8865
> 
>> Let's try something else: instead of patching QEMU, we can patch libxl;
>> that might work better. Can you try this patch? (I've only test
>> compiled.) I've written the patch for Xen 4.6, since that's the version
>> you are using.
> 
> This isn't doing the trick either, with the same misbehavior. The log
> file is the same in both cases.

I'm getting confusing behavior here. I tried to boot a guest using a
build with the second patch and it behaves the way the first one did, with
the qemu-system-i386 process exiting and preventing the guest from ever
booting.  However, when I downgraded the packages to the completely
unpatched version in preparation to reboot again, qemu-system-i386
started properly, using the command line arguments that libxl had
specified, as soon as the older copy of the runtime was installed, and
the system came up with the correct PCI subtree.

This leads me to believe something about my build is screwed up somehow
such that my qemu-system-i386 is broken.  I'm quite sure I'm not
applying any extra patches to it that weren't otherwise in the CentOS
virt packages.

-- 
Kevin Stange
Chief Technology Officer
Steadfast | Managed Infrastructure, Datacenter and Cloud Services
800 S Wells, Suite 190 | Chicago, IL 60607
312.602.2689 X203 | Fax: 312.602.2688
ke...@steadfast.net | www.steadfast.net

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] Xen Security Advisory 254 - Information leak via side effects of speculative execution

2018-01-05 Thread Doug Goldstein
I'm just adding some comments below that might help clarify things for
interested parties. These comments are driven purely by the questions
that I've had to field from others.

- Since this advisory talks about 3 CVEs and then breaks the issue into
3 items (SP1, SP2 and SP3), it would be helpful to map each item directly
to its CVE.
- There has been some confusion around "mitigation" versus "resolution",
where people misunderstand the terms, so there might be some value in
providing updates that offer more clarity.

> 
> SP1, "Bounds-check bypass": Poison the branch predictor, such that
> operating system or hypervisor code is speculatively executed past
> boundary and security checks.  This would allow an attacker to, for
> instance, cause speculative code in the normal hypercall / emulation
> path to execute with wild array indexes.

please add CVE-2017-5753

> 
> SP2, "Branch Target Injection": Poison the branch predictor.
> Well-abstracted code often involves calling function pointers via
> indirect branches; reading these function pointers may involve a
> (slow) memory access, so the CPU attempts to guess where indirect
> branches will lead.  Poisoning this enables an attacker to
> speculatively branch to any code that exists in the hypervisor.

please add CVE-2017-5715


> 
> SP3, "Rogue Data Load": On some processors, certain pagetable
> permission checks only happen when the instruction is retired;
> effectively meaning that speculative execution is not subject to
> pagetable permission checks.  On such processors, an attacker can
> speculatively execute arbitrary code in userspace with, effectively,
> the highest privilege level.

please add CVE-2017-5754 and/or reference this is meltdown.

> 
> MITIGATION
> ==
> 
> There is no mitigation for SP1 and SP2.
> 
> SP3 can be mitigated by running guests in HVM or PVH mode.
> 
> For guests with legacy PV kernels which cannot be run in HVM mode, we
> have developed a "shim" hypervisor that allows PV guests to run in PVH
> mode.  Unfortunately, due to the accelerated schedule, this is not yet
> ready to release.  We expect to have it ready for 4.10, as well as PVH
> backports to 4.9 and 4.8, available over the next few days.
> 
> RESOLUTION
> ==
> 
> There is no available resolution for SP1 or SP3.

I believe there has been some confusion among some people about the
terms here. Some people understand "mitigation" as "what can I do now to
avoid this" and "resolution" as "what updates can I apply". As a result
they misunderstand what the net result is. One clarification could be
that the PVH shim is the resolution for the SP3 issue. However, it's not
a fix for PV itself but instead changes the very nature of how PV guests
are started up.


-- 
Doug Goldstein



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] Xen Project Spectre/Meltdown FAQ

2018-01-05 Thread Lars Kurth

> On 5 Jan 2018, at 15:55, Hans van Kranenburg  wrote:
> 
> On 01/05/2018 12:35 PM, Lars Kurth wrote:
>> Hi all, this is a repost of
>> https://blog.xenproject.org/2018/01/04/xen-project-spectremeltdown-faq/ 
>> 
>> for xen-users/xen-devel. If you have questions, please reply to this
>> thread and we will try and improve the FAQ based on questions. 
>> Regards Lars
> 
> Thanks for the writeup.
> 
> The main reason for readers to get confused is the number of different
> combinations of situations that are possible, each of which has its own
> set of vulnerabilities and its own (possibly different) set of ways to
> be used as an environment for executing an attack.
> 
> So let's help them by being more explicit.

That sounds reasonable

>> On Intel processors, only 64-bit PV mode guests can attack Xen.
> 
> "On Intel processors an attack at Xen using SP3 can only be done by
> 64-bit PV mode guests."
> 
> Even if it looks super-redundant, I think keeping explicit information
> in every sentence is preferable, so that statements cannot be
> misinterpreted or accidentally taken out of context.

Alright: I think I prefer "On Intel processors, only 64-bit PV mode guests can 
attack Xen using SP3."

> 
>> Guests running in 32-bit PV mode, HVM mode, and PVH
>> mode cannot attack the hypervisor using SP3. However, in 32-bit PV
>> mode, HVM mode, and PVH mode, guest userspaces can attack guest
>> kernels using SP3; so updating guest kernels is advisable.
> 
>> Interestingly, guest kernels running in 64-bit PV mode are not
>> vulnerable to attack using SP3, because 64-bit PV guests already run
>> in a KPTI-like mode.
> 
> Like Juergen already mentioned, additionally: "However, keep in mind
> that a successful attack on the hypervisor can still be used to recover
> information about the same guest from physical memory."

Good suggestion.

>> 
>> = Does Xen have any equivalent to Linux’s KPTI series? =
>> 
>> Linux’s KPTI series is designed to address SP3 only.
> 
> This one...
> 
>> For Xen guests, only 64-bit PV guests are affected by SP3.
> 
> ...should be more explicit. The words "affected" and "impacted" do not
> tell the reader if it's about being an attacker, or about being the
> victim and what is attacked or attacking.
> 
> "For Xen guests, only 64-bit PV guests are able to execute a SP3 attack
> against the hypervisor."

Sounds fine

I will update the blog post sometimes tomorrow or Monday. There were a few 
further comments, which may be worth rolling into a change

Lars

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC v1 36/74] --- x86/shim: Kconfig and command line options

2018-01-05 Thread Andrew Cooper
On 05/01/18 15:26, Jan Beulich wrote:
 On 04.01.18 at 14:05,  wrote:
>> --- a/xen/arch/x86/Kconfig
>> +++ b/xen/arch/x86/Kconfig
>> @@ -133,6 +133,28 @@ config PVH_GUEST
>>  ---help---
>>Support booting using the PVH ABI.
>>  
>> +  If unsure, say N.
>> +
>> +config PV_SHIM
>> +def_bool n
>> +prompt "PV Shim"
>> +depends on PV && XEN_GUEST
>> +---help---
>> +  Build Xen with a mode which acts as a shim to allow PV guest to run
>> +  in an HVM/PVH container. This mode can only be enabled with command
>> +  line option.
>> +
>> +  If unsure, say N.
>> +
>> +config PV_SHIM_EXCLUSIVE
>> +def_bool n
>> +prompt "PV Shim Exclusive"
>> +depends on PV_SHIM
> My expectation so far was that this would be the only mode we
> target, hence I think at the very least the default wants to be y
> here.

Until proper out-of-tree Xen builds work, building the shim binary at
all is a PITA.

These defaults give a developer a single binary which is capable of
running natively or as the shim, which has made development far more
productive.  It's certainly the way I'd expect to do primary future
development of the shim.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] Xen Project Spectre/Meltdown FAQ

2018-01-05 Thread Rich Persaud
> On Jan 5, 2018, at 06:35, Lars Kurth  wrote:
> 
> Hi all, this is a repost of 
> https://blog.xenproject.org/2018/01/04/xen-project-spectremeltdown-faq/ for 
> xen-users/xen-devel. If you have questions, please reply to this thread and 
> we will try and improve the FAQ based on questions.

Very helpful, thanks.

> Regards
> Lars
> 
> 
> Google’s Project Zero announced several information leak vulnerabilities 
> affecting all modern superscalar processors. Details can be found on their 
> blog, and in the Xen Project Advisory 254 [1]. To help our users understand 
> the impact and our next steps forward, we put together the following FAQ.
> 
> Note that we will update the FAQ as new information surfaces.
> 
> = Is Xen impacted by Meltdown and Spectre? =
> 
> There are two angles to consider for this question:
> 
> * Can an untrusted guest attack the hypervisor using Meltdown or Spectre?
> * Can a guest user-space program attack a guest kernel using Meltdown or 
> Spectre?
> 
> Systems running Xen, like all operating systems and hypervisors, are 
> potentially affected by Spectre (referred to as SP1 and SP2 in Advisory 254 
> [1]). For Arm Processors information, you can find which processors are 
> impacted here [2].  In general, both the hypervisor and a guest kernel are 
> vulnerable to attack via SP1 and SP2.
> 
> Only Intel processors are impacted by Meltdown (referred to as SP3 in 
> Advisory 254 [1]). On Intel processors, only 64-bit PV mode guests can attack 
> Xen. Guests running in 32-bit PV mode, HVM mode, and PVH mode cannot attack 
> the hypervisor using SP3. However, in 32-bit PV mode, HVM mode, and PVH mode, 
> guest userspaces can attack guest kernels using SP3; so updating guest 
> kernels is advisable.
> 
> Interestingly, guest kernels running in 64-bit PV mode are not vulnerable to 
> attack using SP3, because 64-bit PV guests already run in a KPTI-like mode.
> 
> = Is there any risk of privilege escalation? =
> 
> Meltdown and Spectre are, by themselves, only information leaks. There is no 
> suggestion that speculative execution can be used to modify memory or cause a 
> system to do anything it might not have done already.
> 
> = Where can I find more information? =
> 
> We will update this blog post and Advisory 254 [1] as new information becomes 
> available. Updates will also be published on xen-announce@.
> 
> We will also maintain a technical FAQ on our wiki [3] for answers to more 
> detailed technical questions that emerge on xen-devel@ and other 
> communication channels.
> 
> = Are there any patches for the vulnerability? =
> 
> We have prototype patches for a mitigation for Meltdown on Intel CPUs and a 
> Mitigation for SP2/CVE-2017-5715, which are functional but have not undergone 
> rigorous review and have not been backported to all supported Xen Project 
> releases.
> 
> As information related to Meltdown and Spectre is now public, development 
> will continue in public on xen-devel@ and patches will be posted and attached 
> to Advisory 254 [1] as they become available in the next few days.
> 
> = Can SP1/SP2 be fixed at all? What plans are there to mitigate them? =
> 
> SP2 can be mitigated in two ways, both of which essentially prevent 
> speculative execution of indirect branches. The first is to flush the branch 
> prediction logic on entry into the hypervisor. This requires microcode 
> updates, which Intel and AMD are in the process of preparing, as well as 
> patches to the hypervisor which are also in process and should be available 
> soon.

The Linux kernel's IBRS patchset has a doc link which compares retpoline, IBRS 
Dynamic ("opt-in") and IBRS Always On ("opt-in if more paranoid").  

https://lkml.org/lkml/2018/1/4/615

https://docs.google.com/document/d/e/2PACX-1vSMrwkaoSUBAFc6Fjd19F18c1O9pudkfAY-7lGYGOTN8mc9ul-J6pWadcAaBJZcVA7W_3jlLKRtKRbd/pub

Would be nice to have that comparison for other CPU vendors.  Some information 
is aggregated at https://github.com/marcan/speculation-bugs:

"This repo is an attempt to collect information on the class of information 
disclosure vulnerabilities caused by CPU speculative execution that were 
disclosed on January 3rd, 2018.  Existing nomenclature is inconsistent and 
there is no agreed-upon name for the entire class of bugs, but the names 
Spectre and Meltdown have been used for subclasses of attacks.  This is a 
combination of publicly available information and educated guesses/speculation 
based on the nature of the attacks. Pull requests with corrections or 
discussion are welcome."

> The second is to do indirect jumps in a way which is not subject to 
> speculative execution. This requires the hypervisor to be recompiled with a 
> compiler that contains special new features. These new compiler features are 
> also in the process of being prepared for both gcc and clang, and should be 
> available soon.
> 
> SP1 is much more difficult to mitigate. We have some ideas we’re exploring, 
> but they’re still at the design stage at this point.

[Xen-devel] Xen Security Advisory 254 (CVE-2017-5753, CVE-2017-5715, CVE-2017-5754) - Information leak via side effects of speculative execution

2018-01-05 Thread Xen . org security team
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

 Xen Security Advisory CVE-2017-5753,CVE-2017-5715,CVE-2017-5754 / XSA-254
 version 3

Information leak via side effects of speculative execution

UPDATES IN VERSION 3


Add information about ARM vulnerability.

Correct description of SP2 difficulty.

Mention that resolutions for SP1 and SP3 may be available in the
future.

Move description of the PV-in-PVH shim from Mitigation to Resolution.
(When available and deployed, it will eliminate the SP3
vulnerability.)

Add colloquial names and CVEs to the relevant paragraphs in Issue
Description.

Add a URL.

Say explicitly in Vulnerable Systems that HVM guests cannot exploit
SP3.

Clarify that SP1 and SP2 can be exploited against other victims
besides operating systems and hypervisors.

Grammar fixes.

Remove erroneous detail about when Xen direct maps the whole of
physical memory.

State in Description that Xen ARM guests run in a separate address
space.

ISSUE DESCRIPTION
=

Processors give the illusion of a sequence of instructions executed
one-by-one.  However, in order to most efficiently use cpu resources,
modern superscalar processors actually begin executing many
instructions in parallel.  In cases where instructions depend on the
result of previous instructions or checks which have not yet
completed, execution happens based on guesses about what the outcome
will be.  If the guess is correct, execution has been sped up.  If the
guess is incorrect, partially-executed instructions are cancelled and
architectural state changes (to registers, memory, and so on)
reverted; but the whole process is no slower than if no guess had been
made at all.  This is sometimes called "speculative execution".

Unfortunately, although architectural state is rolled back, there are
other side effects, such as changes to TLB or cache state, which are
not rolled back.  These side effects can subsequently be detected by
an attacker to determine information about what happened during the
speculative execution phase.  If an attacker can cause speculative
execution to access sensitive memory areas, they may be able to infer
what that sensitive memory contained.

Furthermore, these guesses can often be 'poisoned', such that attacker
can cause logic to reliably 'guess' the way the attacker chooses.
This advisory discusses three ways to cause speculative execution to
access sensitive memory areas (named here according to the
discoverer's naming scheme):

"Bounds-check bypass" (aka SP1, "Variant 1", Spectre CVE-2017-5753):
Poison the branch predictor, such that victim code is speculatively
executed past boundary and security checks.  This would allow an
attacker to, for instance, cause speculative code in the normal
hypercall / emulation path to execute with wild array indexes.
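
(A textbook illustration of such a gadget, taken from the public Spectre
write-up rather than from actual Xen code; 'idx' is attacker-controlled:)

    if ( idx < array1_size )                  /* predicted 'taken'; executed
                                               * speculatively past the check */
        value = array2[array1[idx] * 4096];   /* leaves a cache footprint
                                               * indexed by the secret byte
                                               * array1[idx] */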

"Branch Target Injection" (aka SP2, "Variant 2", Spectre CVE-2017-5715):
Poison the branch predictor.  Well-abstracted code often involves
calling function pointers via indirect branches; reading these
function pointers may involve a (slow) memory access, so the CPU
attempts to guess where indirect branches will lead.  Poisoning this
enables an attacker to speculatively branch to any code that is
executable by the victim (eg, anywhere in the hypervisor).

"Rogue Data Load" (aka SP3, "Variant 3", Meltdown, CVE-2017-5754):
On some processors, certain pagetable permission checks only happen
when the instruction is retired; effectively meaning that speculative
execution is not subject to pagetable permission checks.  On such
processors, an attacker can speculatively execute arbitrary code in
userspace with, effectively, the highest privilege level.
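
(Again a textbook sketch, not working exploit code: the permission fault
is raised only when the first load retires, by which time a dependent
load, executed speculatively, has already pulled a secret-indexed line
into the cache for the attacker to time afterwards:)

    uint8_t secret = *(volatile uint8_t *)kernel_address; /* faults late   */
    (void)probe_array[secret * 4096];                     /* cache channel */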

More information is available here:
  https://meltdownattack.com/
  https://spectreattack.com/
  
https://googleprojectzero.blogspot.co.uk/2018/01/reading-privileged-memory-with-side.html

Additional Xen-specific background:

Xen hypervisors on most systems map all of physical RAM, so code
speculatively executed in a hypervisor context can read all of system
RAM.

When running PV guests, the guest and the hypervisor share the address
space; guest kernels run in a lower privilege level, and Xen runs in
the highest privilege level.  (x86 HVM and PVH guests, and ARM guests,
run in a separate address space to the hypervisor.)  However, only
64-bit PV guests can generate addresses large enough to point to
hypervisor memory.

IMPACT
==

Xen guests may be able to infer the contents of arbitrary host memory,
including memory assigned to other guests.

An attacker's choice of code to speculatively execute (and thus the
ease of extracting useful information) goes up with the numbers.  For
SP1, an attacker is limited to windows of code after bound checks of
user-supplied indexes.  For SP2, the attacker will in many cases
be limited to executing arbitrary pre-existing code inside of Xen.
For SP3 (and other cases for SP2), an attacker can write arbitrary code
to be speculatively executed.

[Xen-devel] [xen-4.8-testing test] 117628: trouble: broken/fail/pass

2018-01-05 Thread osstest service owner
flight 117628 xen-4.8-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/117628/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-xtf-amd64-amd64-5   broken
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm  broken
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 4 host-install(4) broken 
REGR. vs. 117144
 test-amd64-amd64-livepatch   broken  in 117586
 test-amd64-i386-xl-xsm   broken  in 117586
 test-amd64-amd64-xl-qemuu-win7-amd64  broken in 117586
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm broken in 117586
 test-amd64-i386-rumprun-i386 broken  in 117586

Tests which are failing intermittently (not blocking):
 test-amd64-i386-xl-xsm   4 host-install(4) broken in 117586 pass in 117628
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 4 host-install(4) broken in 
117586 pass in 117628
 test-amd64-i386-rumprun-i386 4 host-install(4) broken in 117586 pass in 117628
 test-amd64-amd64-livepatch   4 host-install(4) broken in 117586 pass in 117628
 test-amd64-amd64-xl-qemuu-win7-amd64 4 host-install(4) broken in 117586 pass 
in 117628
 test-xtf-amd64-amd64-54 host-install(4)  broken pass in 117586
 test-xtf-amd64-amd64-2 49 xtf/test-hvm64-lbr-tsx-vmentry fail in 117586 pass 
in 117628
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 16 guest-localmigrate/x10 fail 
in 117586 pass in 117628
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 16 guest-localmigrate/x10 
fail pass in 117586

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-rtds   16 guest-start/debian.repeat fail blocked in 117144
 test-xtf-amd64-amd64-1  49 xtf/test-hvm64-lbr-tsx-vmentry fail like 117144
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 117144
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 117144
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 117144
 test-armhf-armhf-xl-credit2  16 guest-start/debian.repeatfail  like 117144
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 117144
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 117144
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 117144
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 117144
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail like 117144
 build-amd64-prev  7 xen-build/dist-test  fail   never pass
 build-i386-prev   7 xen-build/dist-test  fail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-a

Re: [Xen-devel] Xen Project Spectre/Meltdown FAQ

2018-01-05 Thread Andrew Cooper
On 05/01/18 18:16, Rich Persaud wrote:
>> On Jan 5, 2018, at 06:35, Lars Kurth wrote:
>> SP2 can be mitigated in two ways, both of which essentially prevent
>> speculative execution of indirect branches. The first is to flush the
>> branch prediction logic on entry into the hypervisor. This requires
>> microcode updates, which Intel and AMD are in the process of
>> preparing, as well as patches to the hypervisor which are also in
>> process and should be available soon.
>
> The Linux kernel's IBRS patchset has a doc link which compares
> retpoline, IBRS Dynamic ("opt-in") and IBRS Always On ("opt-in if more
> paranoid").  
>
> https://lkml.org/lkml/2018/1/4/615
>
> https://docs.google.com/document/d/e/2PACX-1vSMrwkaoSUBAFc6Fjd19F18c1O9pudkfAY-7lGYGOTN8mc9ul-J6pWadcAaBJZcVA7W_3jlLKRtKRbd/pub
>
> Would be nice to have that comparison for other CPU vendors.  Some
> information is aggregated at https://github.com/marcan/speculation-bugs:
>
> "This repo is an attempt to collect information on the class of
> information disclosure vulnerabilities caused by CPU speculative
> execution that were disclosed on January 3rd, 2018.  Existing
> nomenclature is inconsistent and there is no agreed-upon name for the
> entire class of bugs, but the names Spectre and Meltdown have been
> used for subclasses of attacks.  This is a combination of publicly
> available information and educated guesses/ speculation based on the
> nature of the attacks. Pull requests with corrections or discussion
> are welcome."

Prefer the Google names.  Spectre in particular is a mix of two quite
different attack techniques.
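
(For concreteness: the "flush the branch prediction logic on entry"
mitigation quoted above amounts to issuing an IBPB on the way into the
hypervisor.  A minimal sketch, assuming the MSR interface from Intel's
published side-channel guidance plus the updated microcode mentioned
above -- illustrative only, not code from any posted patch:

    /* Indirect Branch Prediction Barrier via IA32_PRED_CMD.
     * The MSR only exists once the new microcode is loaded. */
    #define MSR_PRED_CMD   0x00000049
    #define PRED_CMD_IBPB  (1ULL << 0)

    static inline void ibpb_barrier(void)
    {
        wrmsrl(MSR_PRED_CMD, PRED_CMD_IBPB);
    }

Entry paths would issue this before executing any indirect branch that
could consume guest-trained predictor state.)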

>
>> Linux’s KPTI series is designed to address SP3 only.  For Xen guests,
>> only 64-bit PV guests are affected by SP3. A KPTI-like approach was
>> explored initially, but required significant ABI changes.  Instead
>> we’ve decided to go with an alternate approach, which is less
>> disruptive and less complex to implement. The chosen approach runs PV
>> guests in a PVH container, which ensures that PV guests continue to
>> behave as before, while providing the isolation that protects the
>> hypervisor from SP3. This works well for Xen 4.8 to Xen 4.10, which
>> is currently our priority.
>
> Since PVH does not yet support PCI passthrough, are there other
> recommended SP3 mitigations for 64-bit PV driver domains?

Lock them down?  Device driver domains, even if not fully trusted, are
going to be part of the system and therefore at least semi-TCB.

If an attacker can't run code in your driver domain (and be aware that
things like server-side processing, JIT of SQL, etc. count as "running
code"), they aren't in a position to mount an SP3 attack.

>  Would CPU pinning of an untrusted guest driver domain reduce its
> ability to attack the host?

All of SP1/2/3 can in principle be used to attack Xen, at which point
you have to presume that the entire system is readable.

CPU pinning can be used to prevent certain guests from sharing branch
prediction resources, and thus prevent them from directly attacking each
other using SP1 or SP2.

However, you can't isolate Xen away from the guest, so pinning is no
mitigation against attacks targeting Xen.
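
(To be concrete about what pinning does buy: a sketch of an xl domain
config giving a guest dedicated pCPUs -- the values are illustrative,
not from this thread:

    # Pin the untrusted domain's vcpus to pCPUs 4-5 so it shares no
    # branch-predictor state with other guests.  Per the above, this
    # does nothing to protect Xen itself.
    vcpus = 2
    cpus = "4-5"
)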

> Since 32-bit PV guests are not affected by SP3, will they continue to
> run without a PVH container, so that PCI passthrough continues to
> function?

32-bit PV guests are unable to use SP3 to attack Xen, but 32-bit PV guest
userspace can still use SP3 to attack the guest kernel.

~Andrew
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] PCI Device Subtree Change from Traditional to Upstream

2018-01-05 Thread Kevin Stange
On 01/05/2018 11:10 AM, Kevin Stange wrote:
> On 01/04/2018 03:16 PM, Kevin Stange wrote:
>> On 01/04/2018 06:52 AM, Anthony PERARD wrote:
>>> On Wed, Jan 03, 2018 at 05:10:54PM -0600, Kevin Stange wrote:
 On 01/03/2018 11:57 AM, Anthony PERARD wrote:
> On Wed, Dec 20, 2017 at 11:40:03AM -0600, Kevin Stange wrote:
>> Hi,
>>
>> I've been working on transitioning a number of Windows guests under HVM
>> from using QEMU traditional to QEMU upstream as is recommended in the
>> documentation.  When I move these guests, the PCI subtree for Xen
>> devices changes and Windows creates a totally new copy of each device.
>> Windows tracks down the storage without issue, but it treats the new
>> instance of the NIC driver as a new device and clears the network
>> configuration even though the MAC address is unchanged.  Manually
>> booting the guest back on the traditional device model reactivates the
>> original PCI subtree and the old network configuration with it.
>>
>> The only thing that I have been able to find that's substantially
>> different comparing the device trees is that the device instance ID
>> values differ on the parent Xen PCI device:
>>
>> PCI\VEN_5853&DEV_0001&SUBSYS_00015853&REV_01\3&267A616A&3&18
>>
>> PCI\VEN_5853&DEV_0001&SUBSYS_00015853&REV_01\3&267A616A&3&10
>>
>> Besides actually setting the guest to boot using QEMU traditional, is
>> there a way to convince Windows to treat these devices as the same?  A
>> patch-based solution would be acceptable to me if there is one, but I
>> don't understand the code well enough to create my own solution.
>
> Hi Kevin,
>
> I've got a patch to QEMU that seems to do the trick:
>
> From: Anthony PERARD 
> Subject: [PATCH] xen-platform: Hardcode PCI slot to 3
>
> Signed-off-by: Anthony PERARD 
> ---
>  hw/i386/pc_piix.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index 5e47528993..93e3a9a916 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -405,7 +405,7 @@ static void pc_xen_hvm_init(MachineState *machine)
>  
>  bus = pci_find_primary_bus();
>  if (bus != NULL) {
> -pci_create_simple(bus, -1, "xen-platform");
> +pci_create_simple(bus, PCI_DEVFN(3, 0), "xen-platform");
>  }
>  }
>  #endif
>
>
> The same thing could be done by libxl, by providing specific command
> line options to qemu. (I think that could even be done via a different
> config file for the guest.)

 This patch doesn't seem to work for me.  It seems like the device model
 process is exiting immediately, but I haven't been able to find any
 information as to what is going wrong.  I tested with Xen 4.6.6 and the
 QEMU packaged with that release.  Should I try it on a different version
 of Xen and QEMU?
>>>
>>> What this patch does is ask QEMU to insert the PCI card
>>> "xen-platform" into the 3rd PCI slot. My guess is that it failed because
>>> there is already a PCI device there.
>>>
>>> You could check qemu's logs, it's in
>>> /var/log/xen/qemu-dm-${guest_name}.log
>>
>> The log file in question only says:
>>
>> qemu: terminating on signal 1 from pid 8865
>>
>>> Let's try something else: instead of patching QEMU, we can patch libxl;
>>> that might work better. Can you try this patch? (I've only
>>> test-compiled it.) I've written the patch for Xen 4.6, since that's the
>>> version you are using.
>>
>> This isn't doing the trick either, with the same misbehavior. The log
>> file is the same in both cases.
> 
> I'm getting confusing behavior here. I tried to boot a guest using a
> build with the second patch and it behaves the way the first one did, with
> the qemu-system-i386 process exiting and preventing the guest from ever
> booting.  However, when I downgraded the packages to the completely
> unpatched version in preparation for another reboot, once the older copy
> of the runtime was installed, qemu-system-i386 started properly using
> the command line arguments that libxl had specified and the system came
> up with the correct PCI subtree.
> 
> This leads me to believe something about my build is screwed up somehow
> such that my qemu-system-i386 is broken.  I'm quite sure I'm not
> applying any extra patches to it that weren't otherwise in the CentOS
> virt packages.

George pointed out that I had failed to pull in the seabios package
from CentOS virt.  The version from RHEL is broken, and that was my
issue.  Sorry for generating extra noise as a result.

I can confirm that patch 2 (and probably patch 1, really) work around
the issue for me.  Thank you for the help!

It would be nice if there were a way to set defaults or override options
for domains from a configuration file read by libxl, qemu, or
libvirt, but I see no code or d
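
For the record, the "command line options to qemu" route mentioned
earlier might look roughly like the following in an xl config.  This is
hypothetical and untested -- the option spellings are my assumptions,
not something confirmed in this thread:

    # Suppress the automatically created platform device, then
    # re-create it pinned to PCI slot 3.
    xen_platform_pci = 0
    device_model_args = [ "-device", "xen-platform,addr=0x3" ]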

[Xen-devel] [libvirt test] 117631: regressions - trouble: broken/fail/pass

2018-01-05 Thread osstest service owner
flight 117631 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/117631/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt-pair broken
 test-amd64-amd64-libvirt-pair 4 host-install/src_host(4) broken REGR. vs. 
117589
 test-amd64-amd64-libvirt-pair 5 host-install/dst_host(5) broken REGR. vs. 
117589
 test-armhf-armhf-libvirt-xsm 16 guest-start/debian.repeat fail REGR. vs. 117589

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 117589
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 117589
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 117589
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt 14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-arm64-arm64-libvirt-qcow2 12 migrate-support-checkfail never pass
 test-arm64-arm64-libvirt-qcow2 13 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass

version targeted for testing:
 libvirt  9a22251bbe6a4ff8dab90da53a1c0df82d8d29fc
baseline version:
 libvirt  8bcceaa9244fbf89dd1173756dd835c8c0d5af1c

Last test of basis   117589  2018-01-03 04:22:30 Z2 days
Testing same since   117631  2018-01-04 10:04:20 Z1 days1 attempts


People who touched revisions under test:
  Chen Hanxiao 
  Christian Ehrhardt 
  Cédric Bosdonnat 
  Eduardo Habkost 
  Eric Blake 
  Erik Skultety 
  Julio Faracco 
  Michal Privoznik 
  Peter Krempa 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-arm64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-libvirt-xsm pass
 test-arm64-arm64-libvirt-xsm pass
 test-armhf-armhf-libvirt-xsm fail
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-libvirt pass
 test-arm64-arm64-libvirt pass
 test-armhf-armhf-libvirt pass
 test-amd64-i386-libvirt  pass
 test-amd64-amd64-libvirt-pairbroken  
 test-amd64-i386-libvirt-pair pass
 test-arm64-arm64-libvirt-qcow2   pass
 test-armhf-armhf-libvirt-raw pass
 test-amd64-amd64-libvirt-vhd pass



sg-report-flight on osstest.test-lab.xenproject.org

[Xen-devel] [xen-unstable-smoke test] 117663: tolerable all pass - PUSHED

2018-01-05 Thread osstest service owner
flight 117663 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/117663/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  2d1c82261d966735e82e5971eddb63ba3c565a37
baseline version:
 xen  7b5b8ca7dffde866d851f0b87b994e0b13e5b867

Last test of basis   117634  2018-01-04 14:01:11 Z1 days
Testing same since   117663  2018-01-05 21:01:48 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Jan Beulich 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-i386 pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   7b5b8ca7df..2d1c82261d  2d1c82261d966735e82e5971eddb63ba3c565a37 -> smoke

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] Dynamic Disassembling domU Instructions

2018-01-05 Thread Man Chon Kuok
Hi Jan,

Thank you for the insight. That explains a lot of the repeated cached
instruction fetches.
This might belong in another thread: I was exploring xentrace and ran it
with "sudo xentrace output", but it returns "ERROR: Failed to map
cpu buffer! (13 = Permission denied)", even when running as root. I googled
around but nothing insightful showed up; any suggestions would be appreciated.

Best,


On Thu, Jan 4, 2018 at 11:38 PM, Jan Beulich  wrote:

> >>> On 05.01.18 at 04:17,  wrote:
> > I am trying to modify Xen 4.8 to have it print out the opcode as well as
> > some registers of an HVM domU as it runs. I tried to modify
> > xen/arch/x86/hvm/emulate.c's hvmemul_insn_fetch to output the content in
> > hvmemul_ctxt->insn_buf with printk. In hvmemul_insn_fetch, it seems that a
> > lot of the requested bytes are cached; does the domU's OS repeatedly call
> > the same instruction region over and over again?
>
> No, but certain operations require going through the emulator
> twice (e.g. once to formulate a request to qemu, and a second
> time to process its response). It would be wrong to read guest
> memory a second time in such a case.
>
> You will also notice that after a completed emulation that cache
> is being invalidated.
>
> > Lastly, I am using printk to log the opcodes. Ideally I would want the
> > opcode to be written to a separate file, but I read that it is not good to
> > do any file access in kernel programming. Are there other alternatives or
> > util functions that I should consider using?
>
> xentrace would come to mind.
>
> Jan
>
>
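
(For the printk route asked about above, a minimal hex-dump sketch.
The field names follow the thread's mention of hvmemul_ctxt->insn_buf,
but verify them against your Xen 4.8 tree before relying on this:

    /* Dump the bytes fetched for the current emulation. */
    static void dump_insn_buf(const struct hvm_emulate_ctxt *ctxt)
    {
        unsigned int i;

        printk("d%dv%d insn @ %lx:", current->domain->domain_id,
               current->vcpu_id, ctxt->insn_buf_eip);
        for ( i = 0; i < ctxt->insn_buf_bytes; i++ )
            printk(" %02x", ctxt->insn_buf[i]);
        printk("\n");
    }
)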
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [qemu-mainline test] 117633: regressions - trouble: broken/fail/pass

2018-01-05 Thread osstest service owner
flight 117633 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/117633/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt-xsm broken
 test-amd64-amd64-libvirt-vhd broken
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm  broken in 
117590
 test-amd64-amd64-xl-xsm  broken  in 117590
 test-amd64-amd64-xl  broken  in 117590
 test-amd64-i386-xl-qemuu-debianhvm-amd64  broken in 117590
 test-armhf-armhf-xl-credit2 16 guest-start/debian.repeat fail REGR. vs. 117335

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 4 host-install(4) broken in 
117590 pass in 117633
 test-amd64-amd64-xl-xsm  4 host-install(4) broken in 117590 pass in 117633
 test-amd64-i386-xl-qemuu-debianhvm-amd64 4 host-install(4) broken in 117590 
pass in 117633
 test-amd64-amd64-xl  4 host-install(4) broken in 117590 pass in 117633
 test-amd64-amd64-libvirt-xsm  4 host-install(4)  broken pass in 117590
 test-amd64-amd64-libvirt-vhd  4 host-install(4)  broken pass in 117590
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeat fail in 117590 pass in 
117633

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt-xsm 13 migrate-support-check fail in 117590 never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-check fail in 117590 never pass
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 117335
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail  like 117335
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 117335
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 117335
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 117335
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 117335
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail n

Re: [Xen-devel] [PATCH] x86/efi: fix build with linkers that support both coff-x86-64 and pe-x86-64

2018-01-05 Thread Doug Goldstein
On 1/5/18 10:43 AM, Roger Pau Monne wrote:
> When using a linker that supports both formats the following error
> will be triggered:
> 
> efi/buildid.o: file not recognized: File format is ambiguous
> efi/buildid.o: matching formats: coff-x86-64 pe-x86-64
> 
> Solve this by specifying the buildid.o format to pe-x86-64.
> 
> Signed-off-by: Roger Pau Monné 
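
In GNU ld terms the change boils down to something like the following
sketch (the mechanism only -- variable names are placeholders; see the
actual patch for the real incantation):

    # Tell ld the format of the ambiguous object explicitly, then
    # switch back to ELF for the remaining inputs.
    $(LD) $(LDFLAGS) -b pe-x86-64 efi/buildid.o -b elf64-x86-64 ...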

Yes. Please let's do this.

Reviewed-by: Doug Goldstein 
Tested-by: Doug Goldstein 

-- 
Doug Goldstein



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [linux-4.9 test] 117637: tolerable FAIL - PUSHED

2018-01-05 Thread osstest service owner
flight 117637 linux-4.9 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/117637/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 117255
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 117255
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 117255
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 117255
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 117255
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-win10-i386 10 windows-installfail never pass
 test-amd64-amd64-xl-qemut-win10-i386 10 windows-installfail never pass
 test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
 test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass

version targeted for testing:
 linux    07bcb2489b96b2bd8b030822b4495e4a18c7b5da
baseline version:
 linux    ee52d08d2e09539154f397c8a412c68189c4d6a0

Last test of basis   117255  2017-12-17 21:46:37 Z   19 days
Failing since117374  2017-12-20 09:38:50 Z   16 days4 attempts
Testing same since   117595  2018-01-03 09:07:55 Z2 days2 attempts


369 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm

Re: [Xen-devel] [PATCH 1/3] xen: remove tests for pvh mode in pure pv paths

2018-01-05 Thread HW42
Juergen Gross:
> Remove the last tests for XENFEAT_auto_translated_physmap in pure
> PV-domain specific paths. PVH V1 is gone and the feature will always
> be "false" in PV guests.
[...]
> diff --git a/arch/x86/xen/p2m.c b/arch/x86/xen/p2m.c
> index 276da636dd39..6083ba462f35 100644
> --- a/arch/x86/xen/p2m.c
> +++ b/arch/x86/xen/p2m.c
[...]
> @@ -711,9 +694,6 @@ int set_foreign_p2m_mapping(struct gnttab_map_grant_ref 
> *map_ops,
>   int i, ret = 0;
>   pte_t *pte;
>  
> - if (xen_feature(XENFEAT_auto_translated_physmap))
> - return 0;
> -
>   if (kmap_ops) {
>   ret = HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref,
>   kmap_ops, count);
> @@ -756,9 +736,6 @@ int clear_foreign_p2m_mapping(struct 
> gnttab_unmap_grant_ref *unmap_ops,
>  {
>   int i, ret = 0;
>  
> - if (xen_feature(XENFEAT_auto_translated_physmap))
> - return 0;
> -
>   for (i = 0; i < count; i++) {
>   unsigned long mfn = __pfn_to_mfn(page_to_pfn(pages[i]));
>   unsigned long pfn = page_to_pfn(pages[i]);

This removes the check for autotranslation in {set,clear}_foreign_p2m_mapping.
But those are also called by the grant-table code on PVH/HVM guests. So
since 4.14 I see crashes similar to the one below (ignore the kernel
version; it's from the middle of a bisect). A sketch of reinstating the
removed checks follows the log:

[   33.778854] page must be ballooned
[   33.778860] [ cut here ]
[   33.778887] WARNING: CPU: 1 PID: 1581 at arch/x86/xen/p2m.c:720 
set_foreign_p2m_mapping+0x13b/0x370
[   33.778903] Modules linked in: xen_gntdev xen_gntalloc xen_blkback xenfs 
xen_privcmd xen_evtchn dm_snapshot dm_bufio xen_blkfront
[   33.778931] CPU: 1 PID: 1581 Comm: qubesdb-daemon Not tainted 4.13.0-lt-37 #1
[   33.778946] task: 8800f4251b80 task.stack: c9818000
[   33.778960] RIP: 0010:set_foreign_p2m_mapping+0x13b/0x370
[   33.778970] RSP: 0018:c981bc90 EFLAGS: 00010286
[   33.778981] RAX: 0016 RBX: 0001 RCX: 81e4a898
[   33.778994] RDX: 0001 RSI: 0092 RDI: 0247
[   33.779016] RBP: c981bce0 R08: 0143 R09: 820d1660
[   33.779026] R10: 002a R11:  R12: 8800f0c2c320
[   33.779037] R13: 8800f4b6a3c8 R14: 8000 R15: 
[   33.779047] FS:  7fbfd5739f80() GS:8800f9d0() 
knlGS:
[   33.779056] CS:  0010 DS:  ES:  CR0: 80050033
[   33.779064] CR2: 7ff25daca0c0 CR3: f2faa005 CR4: 001606e0
[   33.779074] Call Trace:
[   33.779082]  ? x86_configure_nx+0x50/0x50
[   33.779091]  gnttab_map_refs+0xc2/0x160
[   33.779097]  ? decrease_reservation+0x256/0x2e0
[   33.779105]  gntdev_mmap+0x358/0x5c0 [xen_gntdev]
[   33.779113]  mmap_region+0x392/0x5e0
[   33.779119]  do_mmap+0x2ae/0x480
[   33.779125]  vm_mmap_pgoff+0xa1/0xe0
[   33.779132]  SyS_mmap_pgoff+0x1ba/0x260
[   33.787439] systemd-journald[1548]: Received request to flush runtime 
journal from PID 1
[   33.931963]  SyS_mmap+0x16/0x20
[   33.931966]  do_syscall_64+0x53/0xf0
[   33.931980]  entry_SYSCALL64_slow_path+0x25/0x25
[   33.931981] RIP: 0033:0x7fbfd50ebdda
[   33.931982] RSP: 002b:7fff2bae9238 EFLAGS: 0246 ORIG_RAX: 
0009
[   33.931984] RAX: ffda RBX: 0003 RCX: 7fbfd50ebdda
[   33.931984] RDX: 0003 RSI: 1000 RDI: 
[   33.931985] RBP: 0007 R08: 0007 R09: 
[   33.931986] R10: 0001 R11: 0246 R12: 
[   33.931986] R13: 1000 R14: 0001 R15: 
[   33.931987] Code: 83 b4 00 00 00 48 8b 05 9c 5c f2 00 48 83 3c d0 ff 0f 84 
50 01 00 00 48 c7 c7 d7 20 bd 81 48 89 55 c8 48 89 75 d0 e8 f1 70 09 00 <0f> ff 
48 8b 75 d0 48 8b 55 c8 4c 09 f6 48 89 d7 e8 70 fe ff ff 
[   33.932007] ---[ end trace 858dec3c813fa284 ]---
[   33.932011] [ cut here ]
[   33.932011] kernel BUG at arch/x86/xen/p2m.c:651!
[   33.932014] invalid opcode:  [#1] SMP
[   33.932014] Modules linked in: xen_gntdev xen_gntalloc xen_blkback xenfs 
xen_privcmd xen_evtchn dm_snapshot dm_bufio xen_blkfront
[   33.932022] CPU: 1 PID: 1581 Comm: qubesdb-daemon Tainted: GW   
4.13.0-lt-37 #1
[   33.932601] task: 8800f4251b80 task.stack: c9818000
[   33.932605] RIP: 0010:__set_phys_to_machine+0x36/0x130
[   33.932606] RSP: 0018:c981bc68 EFLAGS: 00010287
[   33.932609] RAX: 0016 RBX: 000f3cf3 RCX: 81e4a898
[   33.932609] RDX: 000f3cf3 RSI: 8012daef RDI: 000f3cf3
[   33.932610] RBP: c981bc80 R08: 0143 R09: 820d1660
[   33.932611] R10: 002a R11:  R12: 8012daef
[   33.932611] R13: 8800f4b6a3c8 R14: 8000 R15: 
[   33.932
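
(A sketch of reinstating the removed guard, taken directly from the
hunks quoted above; whether this, or changing the callers, is the right
fix is for the maintainers to decide:

    /* In both set_foreign_p2m_mapping() and clear_foreign_p2m_mapping():
     * nothing to do for autotranslated (PVH/HVM) guests. */
    if (xen_feature(XENFEAT_auto_translated_physmap))
        return 0;
)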