Re: [Xen-devel] [PATCH v4 for 4.5] x86/VPMU: Clear last_vcpu when destroying VPMU

2015-01-07 Thread Tian, Kevin
> From: Konrad Rzeszutek Wilk [mailto:konrad.w...@oracle.com]
> Sent: Tuesday, January 06, 2015 12:33 AM
> 
> On Thu, Dec 18, 2014 at 01:06:40PM -0500, Boris Ostrovsky wrote:
> > We need to make sure that last_vcpu is not pointing to VCPU whose
> > VPMU is being destroyed. Otherwise we may try to dereference it in
> > the future, when VCPU is gone.
> >
> > We have to do this via IPI since otherwise there is a (somewheat
> > theoretical) chance that between test and subsequent clearing
> > of last_vcpu the remote processor (i.e. vpmu->last_pcpu) might do
> > both vpmu_load() and then vpmu_save() for another VCPU. The former
> > will clear last_vcpu and the latter will set it to something else.
> >
> > Performing this operation via IPI will guarantee that nothing can
> > happen on the remote processor between testing and clearing of
> > last_vcpu.
> >
> > We should also check for VPMU_CONTEXT_ALLOCATED in vpmu_destroy() to
> > avoid unnecessary percpu tests and arch-specific destroy ops. Thus
> > checks in AMD and Intel routines are no longer needed.
> >
> > Signed-off-by: Boris Ostrovsky 
> > ---
> >  xen/arch/x86/hvm/svm/vpmu.c   |3 ---
> >  xen/arch/x86/hvm/vmx/vpmu_core2.c |2 --
> >  xen/arch/x86/hvm/vpmu.c   |   20 
> >  3 files changed, 20 insertions(+), 5 deletions(-)
> 
> CC-ing the rest of the maintainers (Intel ones, since Boris is
> on the AMD side).
> 
> I am OK with this patch going in Xen 4.5 as for one thing to actually
> use vpmu you have to pass 'vpmu=1' on the Xen command line.
> 
> Aka, Release-Acked-by: Konrad Rzeszutek Wilk 

Acked-by: Kevin Tian 

> 
> >
> > Changes in v4:
> >  * Back to v2's IPI implementation
> >
> > Changes in v3:
> >  * Use cmpxchg instead of IPI
> >  * Use correct routine names in commit message
> >  * Remove duplicate test for VPMU_CONTEXT_ALLOCATED in arch-specific
> >destroy ops
> >
> > Changes in v2:
> >  * Test last_vcpu locally before IPI
> >  * Don't handle local pcpu as a special case --- on_selected_cpus
> >will take care of that
> >  * Dont't cast variables unnecessarily
> >
> >
> > diff --git a/xen/arch/x86/hvm/svm/vpmu.c b/xen/arch/x86/hvm/svm/vpmu.c
> > index 8e07a98..4c448bb 100644
> > --- a/xen/arch/x86/hvm/svm/vpmu.c
> > +++ b/xen/arch/x86/hvm/svm/vpmu.c
> > @@ -403,9 +403,6 @@ static void amd_vpmu_destroy(struct vcpu *v)
> >  {
> >  struct vpmu_struct *vpmu = vcpu_vpmu(v);
> >
> > -if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) )
> > -return;
> > -
> >  if ( ((struct amd_vpmu_context *)vpmu->context)->msr_bitmap_set )
> >  amd_vpmu_unset_msr_bitmap(v);
> >
> > diff --git a/xen/arch/x86/hvm/vmx/vpmu_core2.c
> b/xen/arch/x86/hvm/vmx/vpmu_core2.c
> > index 68b6272..590c2a9 100644
> > --- a/xen/arch/x86/hvm/vmx/vpmu_core2.c
> > +++ b/xen/arch/x86/hvm/vmx/vpmu_core2.c
> > @@ -818,8 +818,6 @@ static void core2_vpmu_destroy(struct vcpu *v)
> >  struct vpmu_struct *vpmu = vcpu_vpmu(v);
> >  struct core2_vpmu_context *core2_vpmu_cxt = vpmu->context;
> >
> > -if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) )
> > -return;
> >  xfree(core2_vpmu_cxt->pmu_enable);
> >  xfree(vpmu->context);
> >  if ( cpu_has_vmx_msr_bitmap )
> > diff --git a/xen/arch/x86/hvm/vpmu.c b/xen/arch/x86/hvm/vpmu.c
> > index 1df74c2..37f0d9f 100644
> > --- a/xen/arch/x86/hvm/vpmu.c
> > +++ b/xen/arch/x86/hvm/vpmu.c
> > @@ -247,10 +247,30 @@ void vpmu_initialise(struct vcpu *v)
> >  }
> >  }
> >
> > +static void vpmu_clear_last(void *arg)
> > +{
> > +if ( this_cpu(last_vcpu) == arg )
> > +this_cpu(last_vcpu) = NULL;
> > +}
> > +
> >  void vpmu_destroy(struct vcpu *v)
> >  {
> >  struct vpmu_struct *vpmu = vcpu_vpmu(v);
> >
> > +if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) )
> > +return;
> > +
> > +/*
> > + * Need to clear last_vcpu in case it points to v.
> > + * We can check here non-atomically whether it is 'v' since
> > + * last_vcpu can never become 'v' again at this point.
> > + * We will test it again in vpmu_clear_last() with interrupts
> > + * disabled to make sure we don't clear someone else.
> > + */
> > +if ( per_cpu(last_vcpu, vpmu->last_pcpu) == v )
> > +on_selected_cpus(cpumask_of(vpmu->last_pcpu),
> > + vpmu_clear_last, v, 1);
> > +
> >  if ( vpmu->arch_vpmu_ops &&
> vpmu->arch_vpmu_ops->arch_vpmu_destroy )
> >  vpmu->arch_vpmu_ops->arch_vpmu_destroy(v);
> >  }
> > --
> > 1.7.1
> >

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/4] x86: expose CMT L3 event mask to user space

2015-01-07 Thread Jan Beulich
>>> On 23.12.14 at 09:54,  wrote:
> --- a/xen/arch/x86/sysctl.c
> +++ b/xen/arch/x86/sysctl.c
> @@ -157,6 +157,11 @@ long arch_do_sysctl(
>  sysctl->u.psr_cmt_op.u.data = (ret ? 0 : info.size);
>  break;
>  }
> +case XEN_SYSCTL_PSR_CMT_get_l3_event_mask:
> +{
> +sysctl->u.psr_cmt_op.u.data = psr_cmt->l3.features;
> +break;
> +}

Stray figure braces. Other than that
Acked-by: Jan Beulich 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-07 Thread Paolo Bonzini


On 07/01/2015 08:18, Andy Lutomirski wrote:
>>> >> Thus far, I've been told unambiguously that a guest can't observe pvti
>>> >> while it's being written, and I think you're now telling me that this
>>> >> isn't true and that a guest *can* observe pvti while it's being
>>> >> written while the low bit of the version field is not set.  If so,
>>> >> this is rather strongly incompatible with the spec in the KVM docs.
>> >
>> > Where am I saying that?
> I thought the conclusion from what you and Marcelo pointed out about
> the code was that, once the first vCPU updated its pvti, it could
> start running guest code while the other vCPUs are still updating
> pvti, so its guest code can observe the other vCPUs mid-update.

Ah, in that sense you're right.  However, each VCPU cannot observe _its
own_ pvti entry while it's being written (no matter what's in the low
bit of the version field).

Paolo

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/4] pci: Do not ignore device's PXM information

2015-01-07 Thread Jan Beulich
>>> On 06.01.15 at 12:55,  wrote:
> On 06/01/15 02:18, Boris Ostrovsky wrote:
> 
>> diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
>> index 5f295f3..a5eb81c 100644
>> --- a/xen/include/xen/pci.h
>> +++ b/xen/include/xen/pci.h
>> @@ -56,6 +56,8 @@ struct pci_dev {
>>  
>>  u8 phantom_stride;
>>  
>> +int node; /* NUMA node */
>> +
>>  enum pdev_type {
>>  DEV_TYPE_PCI_UNKNOWN,
>>  DEV_TYPE_PCIe_ENDPOINT,
>> @@ -102,7 +104,8 @@ void setup_hwdom_pci_devices(struct domain *,
>>  int pci_release_devices(struct domain *d);
>>  int pci_add_segment(u16 seg);
>>  const unsigned long *pci_get_ro_map(u16 seg);
>> -int pci_add_device(u16 seg, u8 bus, u8 devfn, const struct pci_dev_info *);
>> +int pci_add_device(u16 seg, u8 bus, u8 devfn,
>> +   const struct pci_dev_info *, int);
> 
> Please use parameter names in definitions.

For the added parameter - yes. For the pre-existing pointer one I don't
see a strong need to do so (and there was no name there before).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 4/4] libxl: Add interface for querying hypervisor about PCI topology

2015-01-07 Thread Dario Faggioli
On Mon, 2015-01-05 at 21:18 -0500, Boris Ostrovsky wrote:
> .. and use this new interface to display it along with CPU topology
> and NUMA information when 'xl info -n' command is issued
> 
Also, can we see how an `xl info -n' looks, on an IONUMA system?

> @@ -195,6 +195,24 @@ int xc_cputopoinfo(xc_interface *xch,
>  return 0;
>  }
>  
> +int xc_pcitopoinfo(xc_interface *xch,
> +   xc_pcitopoinfo_t *put_info)
> +{
> +int ret;
> +DECLARE_SYSCTL;
> +
> +sysctl.cmd = XEN_SYSCTL_pcitopoinfo;
> +
> +memcpy(&sysctl.u.pcitopoinfo, put_info, sizeof(*put_info));
> +
> +if ( (ret = do_sysctl(xch, &sysctl)) != 0 )
> +return ret;
> +
> +memcpy(put_info, &sysctl.u.pcitopoinfo, sizeof(*put_info));
> +
> +return 0;
> +}

> @@ -5121,6 +5121,64 @@ libxl_cputopology *libxl_get_cpu_topology(libxl_ctx 
> *ctx, int *nb_cpu_out)
>  return ret;
>  }
>  
> +libxl_pcitopology *libxl_get_pci_topology(libxl_ctx *ctx, int *num_devs)
> +{
> +GC_INIT(ctx);
> +xc_pcitopoinfo_t tinfo;
> +DECLARE_HYPERCALL_BUFFER(xen_sysctl_pcitopo_t, pcitopo);
> +libxl_pcitopology *ret = NULL;
> +int i, rc;
> +
I see from where this comes from. However, at least from new functions,
I think we should avoid using things like DECLARE_HYPERCALL_BUFFER etc.,
in libxl. They belong in libxc, IMO.

This is basically what Andrew was doing here (although that was on
xc_{topology,numa}info:

http://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=shortlog;h=refs/heads/hwloc-support-experimental
http://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=blobdiff;f=tools/libxc/xc_misc.c;h=4f672cead5a4d2b1fc23bcddb6fb49e4d6ec72b5;hp=330345459176961cacc7fda506e843831fe7156a;hb=3585994405b6a73c137309dd4be91f48c71e4903;hpb=9a80d5056766535ac624774b96495f8b97b1d28b

Basically, what I'm suggesting is to have xc_pcitopoinfo() to do most of
the work, with libxl_get_pci_topology being just a wrapper to it.

In fact, if you grep/cscope for things like DECLARE_HYPERCALL_BUFFER in
tools/libxl, the *only* results are these:

libxl.c libxl_get_cpu_topology  5076 
DECLARE_HYPERCALL_BUFFER(xc_cpu_to_core_t, coremap);
libxl.c libxl_get_cpu_topology  5077 
DECLARE_HYPERCALL_BUFFER(xc_cpu_to_socket_t, socketmap);
libxl.c libxl_get_cpu_topology  5078 
DECLARE_HYPERCALL_BUFFER(xc_cpu_to_node_t, nodemap);
libxl.c libxl_get_numainfo  5142 
DECLARE_HYPERCALL_BUFFER(xc_node_to_memsize_t, memsize);
libxl.c libxl_get_numainfo  5143 
DECLARE_HYPERCALL_BUFFER(xc_node_to_memfree_t, memfree);
libxl.c libxl_get_numainfo  5144 
DECLARE_HYPERCALL_BUFFER(uint32_t, node_dists);

And I think we should work toward removing these too, rather than adding
new ones! :-)

Regards,
Dario

-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/4] pci: Do not ignore device's PXM information

2015-01-07 Thread Jan Beulich
>>> On 06.01.15 at 03:18,  wrote:
> @@ -618,7 +620,22 @@ ret_t do_physdev_op(int cmd, 
> XEN_GUEST_HANDLE_PARAM(void) arg)
>  }
>  else
>  pdev_info.is_virtfn = 0;
> -ret = pci_add_device(add.seg, add.bus, add.devfn, &pdev_info);
> +
> +if ( add.flags & XEN_PCI_DEV_PXM )
> +{
> +uint32_t pxm;
> +int optarr_off = offsetof(struct physdev_pci_device_add, optarr) 
> /

unsigned int or size_t.

> --- a/xen/include/xen/pci.h
> +++ b/xen/include/xen/pci.h
> @@ -56,6 +56,8 @@ struct pci_dev {
>  
>  u8 phantom_stride;
>  
> +int node; /* NUMA node */

I thought I asked about this on v1 already: Does this really need to be
an int, when commonly node numbers are stored in u8/unsigned char?
Shrinking the field size would prevent the structure size from growing...

Of course an additional question would be whether the node wouldn't
better go into struct arch_pci_dev - that depends on whether we
expect ARM to be using NUMA...

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] README, xen/Makefile: Update to Xen 4.5.0

2015-01-07 Thread Ian Campbell
On Tue, 2015-01-06 at 19:46 +0100, Sander Eikelenboom wrote:
> > diff --git a/README b/README
> > index 412607a..641bb23 100644
> > --- a/README
> > +++ b/README
> > @@ -1,9 +1,9 @@
> >  #
> > -__  ___  ___ _ 
> >  
> > -\ \/ /___ _ __   | || |  | ___|   _   _ _ __  ___| |_ __ _| |__ | | 
> > ___ 
> > - \  // _ \ '_ \  | || |_ |___ \ _| | | | '_ \/ __| __/ _` | '_ \| |/ _ 
> > \
> > - /  \  __/ | | | |__   _| ___) |_| |_| | | | \__ \ || (_| | |_) | |  
> > __/
> > -/_/\_\___|_| |_||_|(_)/   \__,_|_| 
> > |_|___/\__\__,_|_.__/|_|\___|
> > +__  ___  _   ___
> > +\ \/ /___ _ __   | || |  | ___| / _ \
> > + \  // _ \ '_ \  | || |_ |___ \| | | |
> > + /  \  __/ | | | |__   _| ___) | |_| |
> > +/_/\_\___|_| |_||_|(_)(_)___/
> >  
> >  #
> >  
> > @@ -19,14 +19,33 @@ is freely-distributable Open Source software, released 
> > under the GNU
> >  GPL. Since its initial public release, Xen has grown a large
> >  development community, spearheaded by xen.org (http://www.xen.org).
> 
> Shouldn't this be "Xen-project" and "http://www.xenproject.org";, since the 
> transition 
> to the Linux foundation ? 

For the URLs, I guess, but it's not urgent for 4.5 IMHO, the old URLs
aren't going anywhere.

For the prose, NAK, the hypervisor is called "Xen". Lets not make the
technical documentation use the long-winded and stilted "Xen Project
Hypervisor" or whatever when a simple "Xen" works fine.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 0/9] toolstack-based approach to pvhvm guest kexec

2015-01-07 Thread Olaf Hering
On Mon, Jan 05, Vitaly Kuznetsov wrote:

> Wei Liu  writes:
> 
> > Olaf mentioned his concern about handling ballooned pages in
> > <20141211153029.ga1...@aepfle.de>. Is that point moot now?
> 
> Well, the limitation is real and some guest-side handling will be
> required in case we want to support kexec with ballooning. But as David
> validly mentioned "It's the responsibility of the guest to ensure it
> either doesn't kexec when it is ballooned or that the kexec kernel can
> handle this". Not sure if we can (and need to) do anything hypevisor- or
> toolstack-side.

One approach would be to mark all pages as some sort of
populate-on-demand first. Then copy the existing assigned pages from
domA to domB and update the page type. The remaining pages are likely
ballooned. Once the guest tries to access them this should give the
hypervisor and/or toolstack a chance to assign a real RAM page to them.

I mean, if a host-assisted approach for kexec is implemented then this
approach must also cover ballooning.


Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/4] sysctl: Make XEN_SYSCTL_topologyinfo sysctl a little more efficient

2015-01-07 Thread Jan Beulich
>>> On 06.01.15 at 14:41,  wrote:
> On 06/01/15 02:18, Boris Ostrovsky wrote:
>> Instead of copying data for each field in xen_sysctl_topologyinfo separately
>> put cpu/socket/node into a single structure and do a single copy for each
>> processor.
>>
>> There is also no need to copy whole op to user at the end, max_cpu_index is
>> sufficient
>>
>> Rename xen_sysctl_topologyinfo and XEN_SYSCTL_topologyinfo to reflect the 
>> fact
>> that these are used for CPU topology. Subsequent patch will add support for
>> PCI topology sysctl.
>>
>> Signed-off-by: Boris Ostrovsky 
> 
> If we are going to change the hypercall, then can we see about making it
> a stable interface (i.e. not a sysctl/domctl)?  There are non-toolstack
> components which might want/need access to this information.  (i.e. I am
> still looking for a reasonable way to get this information from Xen in
> hwloc)

In which case leaving the sysctl alone and just adding a new non-sysctl
interface should be considered.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 3/4] sysctl: Add sysctl interface for querying PCI topology

2015-01-07 Thread Jan Beulich
>>> On 06.01.15 at 03:18,  wrote:
> --- a/xen/common/sysctl.c
> +++ b/xen/common/sysctl.c
> @@ -365,6 +365,66 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) 
> u_sysctl)
>  }
>  break;
>  

#ifdef HAS_PCI

> +case XEN_SYSCTL_pcitopoinfo:
> +{
> +xen_sysctl_pcitopoinfo_t *ti = &op->u.pcitopoinfo;
> +
> +if ( guest_handle_is_null(ti->pcitopo) ||
> + (ti->first_dev >= ti->num_devs) )
> +{
> +ret = -EINVAL;
> +break;
> +}
> +
> +for ( ; ti->first_dev < ti->num_devs; ti->first_dev++ )
> +{
> +xen_sysctl_pcitopo_t pcitopo;
> +struct pci_dev *pdev;
> +
> +if ( copy_from_guest_offset(&pcitopo, ti->pcitopo,
> +ti->first_dev, 1) )
> +{
> +ret = -EFAULT;
> +break;
> +}
> +
> +spin_lock(&pcidevs_lock);
> +pdev = pci_get_pdev(pcitopo.pcidev.seg, pcitopo.pcidev.bus,
> +pcitopo.pcidev.devfn);
> +if ( !pdev || (pdev->node == NUMA_NO_NODE) )
> +pcitopo.node = INVALID_TOPOLOGY_ID;
> +else
> +pcitopo.node = pdev->node;

Are hypervisor-internal node numbers really meaningful to the caller?

> +spin_unlock(&pcidevs_lock);
> +
> +if ( copy_to_guest_offset(ti->pcitopo, ti->first_dev,

__copy_ty_guest_offset()

> +  &pcitopo, 1) )
> +{
> +ret = -EFAULT;
> +break;
> +}
> +
> +if ( hypercall_preempt_check() )
> +break;

You didn't increment ->first_dev yet, i.e. you'd start with the same
index again when continuing later, and in the end you may not make
any forward progress.

> @@ -463,7 +464,7 @@ typedef struct xen_sysctl_lockprof_op 
> xen_sysctl_lockprof_op_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_lockprof_op_t);
>  
>  /* XEN_SYSCTL_cputopoinfo */
> -#define INVALID_TOPOLOGY_ID  (~0U)
> +#define INVALID_TOPOLOGY_ID  (~0U) /* Also used by pcitopo */

Better extend the preceding comment.

> @@ -492,6 +493,36 @@ struct xen_sysctl_cputopoinfo {
>  typedef struct xen_sysctl_cputopoinfo xen_sysctl_cputopoinfo_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cputopoinfo_t);
>  
> +/* XEN_SYSCTL_pcitopoinfo */
> +struct xen_sysctl_pcitopo {
> +struct physdev_pci_device pcidev;
> +uint32_t node;
> +};
> +typedef struct xen_sysctl_pcitopo xen_sysctl_pcitopo_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_pcitopo_t);
> +
> +struct xen_sysctl_pcitopoinfo {
> +/* IN: Size of pcitopo array */
> +uint32_t num_devs;
> +
> +/*
> + * IN/OUT: First element of pcitopo array that needs to be processed by
> + * hypervisor.
> + * This is used primarily by hypercall continuations and callers will
> + * typically set it to zero
> + */
> +uint32_t first_dev;
> +
> +/*
> + * If not NULL, filled with node identifier for each pcidev

The "If not NULL" would be meaningful only if NULL had a special
meaning.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/7] tools/hotplug: remove SELinux options from var-lib-xenstored.mount

2015-01-07 Thread Olaf Hering
On Tue, Jan 06, Ian Campbell wrote:

> On Fri, 2014-12-19 at 12:25 +0100, Olaf Hering wrote:

...

> Acked-by: Ian Campbell 
> 
> (on commit s/Appearently/Apparently/; s/non-existant/non-existent/ in
> the commit log)

I made typos also in other commit messages. Should I resend the entire
series, or will this be done during commit?

Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/7] tools/hotplug: remove SELinux options from var-lib-xenstored.mount

2015-01-07 Thread Ian Campbell
On Wed, 2015-01-07 at 10:23 +0100, Olaf Hering wrote:
> On Tue, Jan 06, Ian Campbell wrote:
> 
> > On Fri, 2014-12-19 at 12:25 +0100, Olaf Hering wrote:
> 
> ...
> 
> > Acked-by: Ian Campbell 
> > 
> > (on commit s/Appearently/Apparently/; s/non-existant/non-existent/ in
> > the commit log)
> 
> I made typos also in other commit messages. Should I resend the entire
> series, or will this be done during commit?

Looks like Konrad already committed, I don't know if he fixed the typos
(I suppose it doesn't matter now either way).

Ian.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Nominations for Xen 4.5 stable tree maintainer.

2015-01-07 Thread Jan Beulich
>>> On 06.01.15 at 17:15,  wrote:
> Hello,
> 
> Per http://wiki.xenproject.org/wiki/Xen_Project_Maintenance_Releases: 
> "Each stable branch has a maintainer who is nominated/volunteers according 
> to the Maintainer Election
>  process described in the project governance document 
> [http://www.xenproject.org/governance.html].
>  This will resulting in the MAINTAINERS file in the relevant branch being 
> patched to include the maintainer."
> 
> For the past year or so Jan Beulich has been the stable tree maintainer.

I think it is most efficient for one person to maintain all stable trees,
as e.g. the selection of what needs backporting is often pretty
common between the trees. So if someone wants to urgently take
over from me, I'd be handing over the 4.4 tree at once. But (just
like Andrew stated fro XenServer) since I'm doing the backports
for our internal purposes anyway, I would be happy to continue
pushing the results to the respective upstream trees.

> Since Xen 4.5 has branched that opens up a new stable tree and we can also
> stop maintaining Xen 4.3 stable tree.

Not just yet - there's certainly going to be a wrap-up 4.3.4.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] xl only waits 33 seconds for ballooning to complete

2015-01-07 Thread Ian Campbell
On Tue, 2015-01-06 at 14:17 -0700, Mike Latimer wrote:
> Hi,
> 
> In a previous post (1), I mentioned issues seen while ballooning a large 
> amount of memory. In the current code, the ballooning process only has 33 
> seconds to complete, or the xl operation (i.e. domain create) will fail. When 
> a lot of ballooning is required, or the host is very slow to balloon memory, 
> this delay is not sufficient.
> 
> The code involved is tools/libxl/xl_cmdimpl.c:freemem. This function retries 
> 3 
> times, and each retry includes a 10 second delay in 
> libxl_wait_for_free_memory 
> and a 1 second delay in libxl_wait_for_memory_target.
> 
> Is there a better approach, which would account for ballooning operations 
> that 
> take a much longer time to complete?
> 
> The easiest option is to simply increase the retry count, but that would 
> again 
> leave us with a fixed window of time for an operation to complete. It seems 
> like something that monitors the balloon process, and continues to wait if it 
> is progressing, might be a better approach.

That's exactly what I was about to suggest as I read the penultimate
paragraph, i.e. keep waiting so long as some reasonable delta occurs on
each iteration.

Ian.

> 
> Any ideas?
> 
> Thanks,
> Mike
> 
> 1. http://lists.xen.org/archives/html/xen-devel/2014-12/msg01443.html
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 7/7] tools/hotplug: add wrapper to start xenstored

2015-01-07 Thread Olaf Hering
On Tue, Jan 06, Ian Campbell wrote:

> On Fri, 2014-12-19 at 12:25 +0100, Olaf Hering wrote:
> > The shell wrapper in xenstored.service does not handle XENSTORE_TRACE.
> > 
> > Create a separate wrapper script which is used in the sysv runlevel
> > script and in systemd xenstored.service. It preserves existing
> > behaviour by handling the XENSTORE_TRACE boolean. It also implements
> > the handling of XENSTORED_ARGS=. This variable has to be added to
> > sysconfig/xencommons.
> 
> Why don't we just drop XENSTORED_* in favour of XENSTORED_ARGS (with an
> example in the sysconfig file of enabling tracing if you like)?

After having two weeks to think about this I came to the same
conclusion. I think whatever the outcome is, the boolean should be
removed. The sysconfig file should get a XENSTORED_ARGS="" along with a
help text which mentions "-T /path" and "xenstored --help" to get other
options because there is no man page.

> Going to a wrapper script just to make some fairly uncommon debugging
> option marginally more convenient seems like overkill to me, plus
> XENSTORED_ARGS would allow for passing other useful options to
> xenstored.

If I recall correctly the point of the current 'sh -c "exec ..."' stunt
was to expand the XENSTORE variable from the sysconfig file. But this
approach leads to failures with SELinux because the socket passing does
not work this way. Up to now I have not seen a success report for
selinux+systemd+xenstored. Maybe its already somewhere in the other
unread mails.

In my cover letter I provided some possible ways to handle
selinux+systemd+xenstored. Ideally the approach "Exec=/usr/bin/env
$XENSTORED --no-fork $XENSTORED_ARGS" works because it means its
possible to select a binary via the sysconfig file. But it also means
the XENSTORE_TRACE boolean has to be removed in favour of the plain
XENSTORED_ARGS= approach mentioned above. Finally this would avoid the
need for a wrapper script.

Hopefully someone with access to a SELinux enabled system will report
which approach actually works.

> > The wrapper uses exec unconditionally. This works because the
> > systemd service file passes --no-fork, which has the desired effect
> > that the binary launched by systemd becomes the final daemon
> > process. The sysv script does not pass --no-fork, which causes
> > xenstored to fork internally to return to the caller of the wrapper
> > script.
> > 
> > The place of the wrapper is currently LIBEXEC_BIN, it has to be
> > decided what the final location is supposed to be. IanJ wants it in
> > "/etc".
> 
> If we go this route then I agree with Ian J. (/etc/xen/scripts, I
> suppose).

I have not heard back which location has to be used. If /etc/xen/scripts
is the place, so be it. I thought this is just for hotplug scripts.


Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [Xen-users] HVM PCI Passthrough: Code 12: Undersized PCI MMIO region?

2015-01-07 Thread Jan Beulich
>>> On 06.01.15 at 14:00,  wrote:
> On Tue, Dec 02, 2014 at 10:08:35PM -0500, Stephen Oberholtzer wrote:
>> (1) Is my guess correct?  Or at least close?

Depends on whether you assign the devices at boot time of the guest,
or via hotplug. In the latter case, I suppose only the MMIO hole size
setting as pointed out by Don will help you, while in the former case I'd
expect things to work (and if not, using a debug hypervisor with high
enough log level would make hvmloader send information to the
hypervisor log that ought to help diagnosing the issue).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 7/7] tools/hotplug: add wrapper to start xenstored

2015-01-07 Thread Olaf Hering
On Tue, Jan 06, Ian Jackson wrote:

> Olaf Hering writes ("[PATCH 7/7] tools/hotplug: add wrapper to start 
> xenstored"):
> > The shell wrapper in xenstored.service does not handle XENSTORE_TRACE.
> ...
> > +XENSTORED_LIBEXEC = xenstored.sh
> 
> Should be in /etc as previously discussed.  Previously I wrote:
> 
>   Bottom line: as relevant maintainer, I'm afraid I'm going to insist
>   that this script be in /etc.
> 
> I'm disappointed.  It is not acceptable to resubmit a change ignoring
> such unequivocal feedback.

Plain "/etc" wont work, I think. /etc/xen/scripts perhaps? But see my
other reply to IanC, maybe there is a way to avoid the wrapper.

And after having some time to think about this: If one has a need to
adjust something, then this could be done in the xencommons script right
away. In other words, the modification can be done there instead of
calling the wrapper.

> Nacked-by: Ian Jackson 
> 
> > +hotplug/Linux/xenstored.sh
> 
> Although many of the existing hotplug scripts have this notion of
> calling things "foo.sh" because they happen to be written in shell, I
> think this is bad practice.
> 
> I would prefer xenstored-wrap or some such.  (My co-maintainers may
> disagree...)  But this is a bit of a bikeshed issue.

I agree. Initally I had xenstored-launcher in mind.

> > echo -n Starting $XENSTORED...
> > -   $XENSTORED --pid-file /var/run/xenstored.pid $XENSTORED_ARGS
> > +   XENSTORED=$XENSTORED \
> > +   XENSTORED_TRACE=$XENSTORED_TRACE \
> > +   XENSTORED_ARGS=$XENSTORED_ARGS \
> > +   ${LIBEXEC_BIN}/xenstored.sh --pid-file 
> > /var/run/xenstored.pid
> 
> It might be easier to "." xenstore-wrap.  Failing that using `export'
> will avoid this rather odd and repetitive style.

I think thats a good idea. Something like this may work, doing the "."
and the "exec" in the subshell:

 (
 set -- --pid-file /var/run/xenstored.pid
 . xenstored.sh 
 )


> > diff --git a/tools/hotplug/Linux/xenstored.sh.in 
> > b/tools/hotplug/Linux/xenstored.sh.in
> > new file mode 100644
> > index 000..dc806ee
> > --- /dev/null
> > +++ b/tools/hotplug/Linux/xenstored.sh.in
> > @@ -0,0 +1,6 @@
> > +#!/bin/sh
> > +if test -n "$XENSTORED_TRACE"
> > +then
> > +   XENSTORED_ARGS=" -T /var/log/xen/xenstored-trace.log"
> > +fi
> > +exec $XENSTORED $@ $XENSTORED_ARGS
> 
> This should probably have "" around "$@" just in case.

Ok. 


I will wait for results from SELinux testing before respinning this
patch.

Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Crash on efi_reset_machine on Lenovo ThinkCentre m93 (Haswell)

2015-01-07 Thread Jan Beulich
>>> On 05.01.15 at 16:04,  wrote:

Odd -

> (XEN) [ Xen-4.4.1  x86_64  debug=n  Not tainted ]
> (XEN) CPU:0
> (XEN) RIP:e008:[] d5fd8412

the address here ...

> (XEN) [ Xen-4.5.0-rc-lK  x86_64  debug=y  Tainted:C ]
> (XEN) CPU:0
> (XEN) RIP:e008:[] d5fd83d0

... is different from the one here, yet one would think the firmware
always does the same thing on a given runtime services call. Except
if the page here wasn't marked for runtime use in the memory map
(which you didn't provide).

> I hadn't dug deep enough in this to figure out how it works on Linux but
> was wondering if anybody else had seen this?

Iirc on Linux rebooting via runtime services is only possible when
explicitly asking for it on the command line. In any event this very
much looks like a firmware issue (and knowing what's at the
addresses in question would be interesting), and the only
workaround I can think of would be to use "no-efi-rs".

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Xen 4.5 Development Update (GA slip by a week)

2015-01-07 Thread Ian Campbell
On Tue, 2015-01-06 at 12:23 -0800, Suriyan Ramasami wrote:
> On Tue, Jan 6, 2015 at 10:54 AM, Suriyan Ramasami  wrote:
> > On Tue, Jan 6, 2015 at 10:51 AM, Lars Kurth  
> > wrote:
> >>
> >> On 6 Jan 2015, at 18:08, Suriyan Ramasami  wrote:
> >>
> >>>
> >>> I shall try my hand at updating the information again. If this needs
> >>> to be done (yesterday), then as a temporary solution we could just
> >>> delete this information for now, and I shall work on it soon.
> >>
> >> Ideally it needs to be done by next Wed. If you have the content, you 
> >> could send it to me and I can fix the page
> >
> > Thanks Lars for the offer. I just started editing the page, and looks
> > simple enough to get some content out there. I should have updated the
> > relevant information in a few hours from now. Please do help me in
> > fixing the page once its done (if it needs fixing)
> >
> > Thanks!
> > - Suriyan
> 
> Hello Lars/Ian,
>I have updated the wiki somewhat to an OK level of information.

Thanks, I've removed the todo banner since there is some content now
(I've not reviewed it in detail, since I don't know about the h/w, but
it all looks very plausible).

Earlier you said:
> I did have specific content for this wiki page, as the Arndale XEN
> information is Linaro centric and hence not applicable to the OdroidXU
> - especially the boot loader part and the linux dom0 part. The rest of
> the information is quite similar.

I think that some of the Linaro specific stuff on the Arndale page is no
longer needed, e.g. I run mainline u-boot and Linux just fine on one
these days.

So I think it is probably worth trying to refactor into a generic Exynos
part and board specific details, but that's not urgent for the 4.5
release.

Anyway, not asking you to tackle all that. I've added it to
http://wiki.xen.org/wiki/Xen_Document_Days/TODO, along with my previous
suggestion to get rid of the duplicated lists on the main ARM page. I
might try and have a go at that on the next doc day.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 0/7 v3] tools/hotplug: systemd changes for 4.5

2015-01-07 Thread Olaf Hering
On Mon, Jan 05, Konrad Rzeszutek Wilk wrote:

> +Release Issues
> +==
> +
> +While we did the utmost to get a release out, there are certain
> +fixes which were not complete on time. As such please reference this
> +section if you are running into trouble.
> +
> +* systemd not working with Fedora Core 20, 21 or later (systemctl
> +  reports xenstore failing to start).
> +
> +  Systemd support is now part of Xen source code. While utmost work has
> +  been done to make the systemd files compatible across all the
> +  distributions, there might issues when using systemd files from
> +  Xen sources. The work-around is to define an mount entry in
> +  /etc/fstab as follow:
> +
> +  tmpfs   /var/lib/xenstored  tmpfs
> +  mode=755,context="system_u:object_r:xenstored_var_lib_t:s0" 0 0
> +
> +

Shouldnt this go into a new SELinux section in the INSTALL file?

Its my understanding that the reported SELinux failure is not only
related to the context= mount option, but also to the socket passing
from systemd.


Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] qemu upstream arm compile error failed

2015-01-07 Thread Mao Mingy
I tried to use the qemu upstream on xen Dom0 (arm arch) to get the PVFB, 
PVKBD and PVMOUSE work.


However, I got compiler error at the begin. Here are the detail steps:
1. download the source code from " git://git.qemu-project.org/qemu.git ".
2. " cd qemu "
3. configure using command "./configure 
--cross-prefix=arm-linux-gnueabihf- --target-list=arm-softmmu --enable-fdt"

4. " make "

However,  I got the following compile error

  GEN tests/test-qmp-commands.h
  GEN   tests/test-qapi-event.h
  CCtests/qemu-iotests/socket_scm_helper.o
  LINK  tests/qemu-iotests/socket_scm_helper
/usr/lib/gcc-cross/arm-linux-gnueabihf/4.8/../../../../arm-linux-gnueabihf/bin/ld: 
cannot find -lgthread-2.0
/usr/lib/gcc-cross/arm-linux-gnueabihf/4.8/../../../../arm-linux-gnueabihf/bin/ld: 
cannot find -lglib-2.0

collect2: error: ld returned 1 exit status
make: *** [tests/qemu-iotests/socket_scm_helper] Error 1

Looks like the build process need some libs missed in my 
gcc-cross/arm-linux-gnueabihf/4.8


What is the reason and solution?

Thanks & Regards,
Mao
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] collect Dirty bitmap from Xen source code

2015-01-07 Thread Minalkumar Patel
I am working on Xen live Migration. 

I want to print dirty_bitmap which is given in 
XEN_GUEST_HANDLE_64(uint8) dirty_bitmap in shadow.h

and Dirty bit tracking is given in:

int shadow_track_dirty_vram(struct domain *d,
  unsigned long first_pfn,
  unsigned long nr,
  XEN_GUEST_HANDLE_64(uint8) dirty_bitmap);

Anyone can help me to print dirty bitmap as i want to collect dirty bitmap from 
log files or I want to print dirty bitmap.___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] qemu upstream arm compile error failed

2015-01-07 Thread Ian Campbell
On Wed, 2015-01-07 at 18:02 +0800, Mao Mingy wrote:
> I tried to use the qemu upstream on xen Dom0 (arm arch) to get the
> PVFB, PVKBD and PVMOUSE work.

With Xen 4.5 qemu is automatically built for ARM, no need to do it
separately AFAIK.

Also note that you need to build qemu for i386, even on ARM (since the
PV backends are currently entwined with x86 for historical reasons. This
doesn't matter since qemu does no CPU emulation under Xen anyway.

Lastly:
> /usr/lib/gcc-cross/arm-linux-gnueabihf/4.8/../../../../arm-linux-gnueabihf/bin/ld:
>  cannot find -lgthread-2.0
> /usr/lib/gcc-cross/arm-linux-gnueabihf/4.8/../../../../arm-linux-gnueabihf/bin/ld:
>  cannot find -lglib-2.0

I think it is pretty obvious from these error messages that you are
missing some build dependencies, this isn't a Xen specific issue AFAICT.
The solution is to make ARM versions of those libraries available in
your cross environment.

Ian.




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4] x86/viridian: Add Partition Reference Time enlightenment

2015-01-07 Thread Jan Beulich
>>> On 14.10.14 at 12:45,  wrote:
> The presence of the partition reference time enlightenment persuades newer
> versions of Windows to prefer the TSC as their primary time source. Hence,
> if rdtsc is not being emulated and is invariant then many vmexits (for
> alternative time sources such as the HPET or reference counter MSR) can
> be avoided.
> 
> The implementation is not yet complete as no attempt is made to prevent
> emulation of rdtsc if the enlightenment is active and guest and host
> TSC frequencies differ. To do that requires invasive changes in the core
> x86 time code and hence a lot more testing.
> 
> This patch avoids the issue by disabling the enlightenment if rdtsc is
> being emulated, causing Windows to choose another time source. This is
> safe, but may cause a big variation in performance of guests migrated
> between hosts of differing TSC frequency. Thus the enlightenment is not
> enabled in the default set, but may be enabled to improve guest performance
> where such migrations are not a concern.
> 
> See section 15.4 of the Microsoft Hypervisor Top Level Functional
> Specification v4.0a for details.
> 
> Signed-off-by: Paul Durrant 

This doesn't apply anymore and hence needs to be re-spun against
current staging.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4] x86/viridian: Add Partition Reference Time enlightenment

2015-01-07 Thread Paul Durrant
> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 07 January 2015 10:08
> To: Paul Durrant
> Cc: Christoph Egger; Ian Campbell; Ian Jackson; Stefano Stabellini; xen-
> de...@lists.xenproject.org; Keir (Xen.org)
> Subject: Re: [PATCH v4] x86/viridian: Add Partition Reference Time
> enlightenment
> 
> >>> On 14.10.14 at 12:45,  wrote:
> > The presence of the partition reference time enlightenment persuades
> newer
> > versions of Windows to prefer the TSC as their primary time source. Hence,
> > if rdtsc is not being emulated and is invariant then many vmexits (for
> > alternative time sources such as the HPET or reference counter MSR) can
> > be avoided.
> >
> > The implementation is not yet complete as no attempt is made to prevent
> > emulation of rdtsc if the enlightenment is active and guest and host
> > TSC frequencies differ. To do that requires invasive changes in the core
> > x86 time code and hence a lot more testing.
> >
> > This patch avoids the issue by disabling the enlightenment if rdtsc is
> > being emulated, causing Windows to choose another time source. This is
> > safe, but may cause a big variation in performance of guests migrated
> > between hosts of differing TSC frequency. Thus the enlightenment is not
> > enabled in the default set, but may be enabled to improve guest
> performance
> > where such migrations are not a concern.
> >
> > See section 15.4 of the Microsoft Hypervisor Top Level Functional
> > Specification v4.0a for details.
> >
> > Signed-off-by: Paul Durrant 
> 
> This doesn't apply anymore and hence needs to be re-spun against
> current staging.

Ok. I'll rebase and repost.

  Paul

> 
> Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] VT-d: don't crash when PTE bits 52 and up are non-zero

2015-01-07 Thread Jan Beulich
>>> On 23.12.14 at 07:52,  wrote:
>>  From: Jan Beulich [mailto:jbeul...@suse.com]
>> Sent: Friday, December 19, 2014 7:26 PM
>> 
>> This can (and will) be legitimately the case when sharing page tables
>> with EPT (more of a problem before p2m_access_rwx became zero, but
>> still possible even now when other than that is the default for a
>> guest), leading to an unconditional crash (in print_vtd_entries())
>> when a DMA remapping fault occurs.
> 
> could you elaborate the scenarios when bits 52+ are non-zero?
> 
>> 
>> Signed-off-by: Jan Beulich 
> 
> but the changes looks reasonable to me.
> 
> Signed-off-by: Kevin Tian 

I translated this to a Reviewed-by, as S-o-b doesn't seem to make
sense here.

Konrad - please indicate whether this can also go into 4.5.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v5] x86/viridian: Add Partition Reference Time enlightenment

2015-01-07 Thread Paul Durrant
The presence of the partition reference time enlightenment persuades newer
versions of Windows to prefer the TSC as their primary time source. Hence,
if rdtsc is not being emulated and is invariant then many vmexits (for
alternative time sources such as the HPET or reference counter MSR) can
be avoided.

The implementation is not yet complete as no attempt is made to prevent
emulation of rdtsc if the enlightenment is active and guest and host
TSC frequencies differ. To do that requires invasive changes in the core
x86 time code and hence a lot more testing.

This patch avoids the issue by disabling the enlightenment if rdtsc is
being emulated, causing Windows to choose another time source. This is
safe, but may cause a big variation in performance of guests migrated
between hosts of differing TSC frequency. Thus the enlightenment is not
enabled in the default set, but may be enabled to improve guest performance
where such migrations are not a concern.

See section 15.4 of the Microsoft Hypervisor Top Level Functional
Specification v4.0a for details.

Signed-off-by: Paul Durrant 
Cc: Keir Fraser 
Cc: Jan Beulich 
Acked-by: Ian Campbell 
Cc: Ian Jackson 
Cc: Stefano Stabellini 
Reviewed-by: Christoph Egger 
---
v5:
 - Re-based to staging

 docs/man/xl.cfg.pod.5  |6 ++
 tools/libxl/libxl_dom.c|3 +
 tools/libxl/libxl_types.idl|1 +
 xen/arch/x86/hvm/viridian.c|  111 
 xen/include/asm-x86/hvm/viridian.h |   25 +++
 xen/include/public/arch-x86/hvm/save.h |   11 
 xen/include/public/hvm/params.h|7 +-
 7 files changed, 163 insertions(+), 1 deletion(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 622ea53..e2f91fc 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -1231,6 +1231,12 @@ This group incorporates Partition Time Reference Counter 
MSR. This
 enlightenment can improve performance of Windows 8 and Windows
 Server 2012 onwards.
 
+=item B
+
+This set incorporates the Partition Reference TSC MSR. This
+enlightenment can improve performance of Windows 7 and Windows
+Server 2008 R2 onwards.
+
 =item B
 
 This is a special value that enables the default set of groups, which
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 1d33a18..48d661a 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -263,6 +263,9 @@ static int hvm_set_viridian_features(libxl__gc *gc, 
uint32_t domid,
 if (libxl_bitmap_test(&enlightenments, 
LIBXL_VIRIDIAN_ENLIGHTENMENT_TIME_REF_COUNT))
 mask |= HVMPV_time_ref_count;
 
+if (libxl_bitmap_test(&enlightenments, 
LIBXL_VIRIDIAN_ENLIGHTENMENT_REFERENCE_TSC))
+mask |= HVMPV_reference_tsc;
+
 if (mask != 0 &&
 xc_hvm_param_set(CTX->xch,
  domid,
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index f7fc695..1214d2e 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -192,6 +192,7 @@ libxl_viridian_enlightenment = 
Enumeration("viridian_enlightenment", [
 (0, "base"),
 (1, "freq"),
 (2, "time_ref_count"),
+(3, "reference_tsc"),
 ])
 
 #
diff --git a/xen/arch/x86/hvm/viridian.c b/xen/arch/x86/hvm/viridian.c
index 3197b6b..cb689f6 100644
--- a/xen/arch/x86/hvm/viridian.c
+++ b/xen/arch/x86/hvm/viridian.c
@@ -21,6 +21,7 @@
 #define VIRIDIAN_MSR_HYPERCALL  0x4001
 #define VIRIDIAN_MSR_VP_INDEX   0x4002
 #define VIRIDIAN_MSR_TIME_REF_COUNT 0x4020
+#define VIRIDIAN_MSR_REFERENCE_TSC  0x4021
 #define VIRIDIAN_MSR_TSC_FREQUENCY  0x4022
 #define VIRIDIAN_MSR_APIC_FREQUENCY 0x4023
 #define VIRIDIAN_MSR_EOI0x4070
@@ -40,6 +41,7 @@
 #define CPUID3A_MSR_APIC_ACCESS(1 << 4)
 #define CPUID3A_MSR_HYPERCALL  (1 << 5)
 #define CPUID3A_MSR_VP_INDEX   (1 << 6)
+#define CPUID3A_MSR_REFERENCE_TSC  (1 << 9)
 #define CPUID3A_MSR_FREQ   (1 << 11)
 
 /* Viridian CPUID 404, Implementation Recommendations. */
@@ -95,6 +97,8 @@ int cpuid_viridian_leaves(unsigned int leaf, unsigned int 
*eax,
 *eax |= CPUID3A_MSR_FREQ;
 if ( viridian_feature_mask(d) & HVMPV_time_ref_count )
 *eax |= CPUID3A_MSR_TIME_REF_COUNT;
+if ( viridian_feature_mask(d) & HVMPV_reference_tsc )
+*eax |= CPUID3A_MSR_REFERENCE_TSC;
 break;
 case 4:
 /* Recommended hypercall usage. */
@@ -155,6 +159,17 @@ static void dump_apic_assist(const struct vcpu *v)
v, aa->fields.enabled, (unsigned long)aa->fields.pfn);
 }
 
+static void dump_reference_tsc(const struct domain *d)
+{
+const union viridian_reference_tsc *rt;
+
+rt = &d->arch.hvm_domain.viridian.reference_tsc;
+
+printk(XENLOG_G_INFO "d%d: VIRIDIAN REFERENCE_TSC: enabled: %x pfn: %lx\n",
+   d->domain_id,
+   rt->fields.e

Re: [Xen-devel] [PATCH v5 0/9] toolstack-based approach to pvhvm guest kexec

2015-01-07 Thread David Vrabel
On 07/01/15 09:10, Olaf Hering wrote:
> On Mon, Jan 05, Vitaly Kuznetsov wrote:
> 
>> Wei Liu  writes:
>>
>>> Olaf mentioned his concern about handling ballooned pages in
>>> <20141211153029.ga1...@aepfle.de>. Is that point moot now?
>>
>> Well, the limitation is real and some guest-side handling will be
>> required in case we want to support kexec with ballooning. But as David
>> validly mentioned "It's the responsibility of the guest to ensure it
>> either doesn't kexec when it is ballooned or that the kexec kernel can
>> handle this". Not sure if we can (and need to) do anything hypevisor- or
>> toolstack-side.
> 
> One approach would be to mark all pages as some sort of
> populate-on-demand first. Then copy the existing assigned pages from
> domA to domB and update the page type. The remaining pages are likely
> ballooned. Once the guest tries to access them this should give the
> hypervisor and/or toolstack a chance to assign a real RAM page to them.
> 
> I mean, if a host-assisted approach for kexec is implemented then this
> approach must also cover ballooning.

It is not possible for the hypervisor or toolstack to do what you want
because there may not be enough free memory to repopulate the new domain.

The guest can handle this by:

1. Not ballooning (this is common in cloud environments).
2. Reducing the balloon prior to kexec.
3. Running the kexec'd image in a reserved chunk of memory (the crash
kernel case).
4. Providing balloon information to the kexec'd image.

None of these require any additional hypervisor or toolstack support and
1-3 are trivial for a guest to implement.

David

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 0/9] toolstack-based approach to pvhvm guest kexec

2015-01-07 Thread Vitaly Kuznetsov
Olaf Hering  writes:

> On Mon, Jan 05, Vitaly Kuznetsov wrote:
>
>> Wei Liu  writes:
>> 
>> > Olaf mentioned his concern about handling ballooned pages in
>> > <20141211153029.ga1...@aepfle.de>. Is that point moot now?
>> 
>> Well, the limitation is real and some guest-side handling will be
>> required in case we want to support kexec with ballooning. But as David
>> validly mentioned "It's the responsibility of the guest to ensure it
>> either doesn't kexec when it is ballooned or that the kexec kernel can
>> handle this". Not sure if we can (and need to) do anything hypevisor- or
>> toolstack-side.
>
> One approach would be to mark all pages as some sort of
> populate-on-demand first. Then copy the existing assigned pages from
> domA to domB and update the page type. The remaining pages are likely
> ballooned. Once the guest tries to access them this should give the
> hypervisor and/or toolstack a chance to assign a real RAM page to
> them.

The thing is .. we don't have these pages when kexec is being performed,
they are already ballooned out and the hypervisor doesn't have the
knowledge of which GFNs should be re-populated. I think it is possible
to keep track of all pages the guest balloons out for this purpose, but 
..

>
> I mean, if a host-assisted approach for kexec is implemented then this
> approach must also cover ballooning.

I don't see why solving the issue hypervisor-side is a must. When the
guest performs kdump we don't care about the ballooning as we have a
separate memory area which is supposed to have no ballooned out
pages. When we do kexec nothing stops us from asking balloon driver to
bring everything back, it is fine to perform non-trivial work before
kexec (e.g. we shutdown all the devices).

But, as I said, I'll try playing with ballooning to make these thoughts
not purely theoretical.

-- 
  Vitaly

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 0/9] toolstack-based approach to pvhvm guest kexec

2015-01-07 Thread Jan Beulich
>>> On 07.01.15 at 11:41,  wrote:
> On 07/01/15 09:10, Olaf Hering wrote:
>> On Mon, Jan 05, Vitaly Kuznetsov wrote:
>> 
>>> Wei Liu  writes:
>>>
 Olaf mentioned his concern about handling ballooned pages in
 <20141211153029.ga1...@aepfle.de>. Is that point moot now?
>>>
>>> Well, the limitation is real and some guest-side handling will be
>>> required in case we want to support kexec with ballooning. But as David
>>> validly mentioned "It's the responsibility of the guest to ensure it
>>> either doesn't kexec when it is ballooned or that the kexec kernel can
>>> handle this". Not sure if we can (and need to) do anything hypevisor- or
>>> toolstack-side.
>> 
>> One approach would be to mark all pages as some sort of
>> populate-on-demand first. Then copy the existing assigned pages from
>> domA to domB and update the page type. The remaining pages are likely
>> ballooned. Once the guest tries to access them this should give the
>> hypervisor and/or toolstack a chance to assign a real RAM page to them.
>> 
>> I mean, if a host-assisted approach for kexec is implemented then this
>> approach must also cover ballooning.
> 
> It is not possible for the hypervisor or toolstack to do what you want
> because there may not be enough free memory to repopulate the new domain.
> 
> The guest can handle this by:
> 
> 1. Not ballooning (this is common in cloud environments).
> 2. Reducing the balloon prior to kexec.

Which may fail because again there may not be enough memory to
claim back from the hypervisor.

Jan

> 3. Running the kexec'd image in a reserved chunk of memory (the crash
> kernel case).
> 4. Providing balloon information to the kexec'd image.
> 
> None of these require any additional hypervisor or toolstack support and
> 1-3 are trivial for a guest to implement.
> 
> David




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 0/9] toolstack-based approach to pvhvm guest kexec

2015-01-07 Thread Olaf Hering
On Wed, Jan 07, David Vrabel wrote:

> 2. Reducing the balloon prior to kexec.

We carry a patch for kexec(1) which does balloon up before doing the
actual kexec call. I propose to get such change into the upstream kexec
tools if that is indeed the way to go. The benefit is that the guest
waits until every ballooned page is populated. If the host is short on
memory then the guest will hang instead of crash after kexec.

https://build.opensuse.org/package/view_file/Kernel:kdump/kexec-tools/kexec-tools-xen-balloon-up.patch

Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 0/9] toolstack-based approach to pvhvm guest kexec

2015-01-07 Thread Olaf Hering
On Wed, Jan 07, Vitaly Kuznetsov wrote:

> The thing is .. we don't have these pages when kexec is being performed,
> they are already ballooned out and the hypervisor doesn't have the
> knowledge of which GFNs should be re-populated. I think it is possible
> to keep track of all pages the guest balloons out for this purpose, but 
> ..

Have you tried to make the new guest a PoD guest? That way it may work
out of the box already.

Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 0/5] enable Memory Bandwidth Monitoring (MBM) for VMs

2015-01-07 Thread Chao Peng
Changes from v1:
* Move event type check from xc to xl;
* Add retry capability for MBM sampling;
* Fix Coding style/docs;

Intel Memory Bandwidth Monitoring(MBM) is a new hardware feature
which builds on the CMT infrastructure to allow monitoring of system
memory bandwidth. Event codes are provided to monitor both "total"
and "local" bandwidth, meaning bandwidth over QPI and other external
links can be monitored.

For XEN, MBM is used to monitor memory bandwidth for VMs. Due to its
dependency on CMT, the software also makes use of most of CMT codes.
Actually, besides introducing two additional events and some cpuid
feature bits, there are no extra changes compared to cache occupancy
monitoring in CMT. Due to this, CMT should be enabled first to use
this feature.

For interface changes, the patch serial only introduces a new command
"XEN_SYSCTL_PSR_CMT_get_l3_event_mask" which exposes MBM feature
capability to user space and introduces two additional options for
"xl psr-cmt-show":
total_mem_bandwidth: Show total memory bandwidth
local_mem_bandwidth: Show local memory bandwidth

The usage flow keeps the same with CMT.

Chao Peng (5):
  x86: expose CMT L3 event mask to user space
  tools: add routine to get CMT L3 event mask
  tools: correct coding style for psr
  tools: code refactoring for MBM
  tools: add total/local memory bandwith monitoring

 docs/man/xl.pod.1 |9 +++
 tools/libxc/include/xenctrl.h |   11 +--
 tools/libxc/xc_psr.c  |   42 ++--
 tools/libxl/libxl.h   |   20 --
 tools/libxl/libxl_psr.c   |  147 +++--
 tools/libxl/libxl_types.idl   |2 +
 tools/libxl/xl_cmdimpl.c  |   69 +--
 tools/libxl/xl_cmdtable.c |4 +-
 xen/arch/x86/sysctl.c |3 +
 xen/include/public/sysctl.h   |1 +
 10 files changed, 254 insertions(+), 54 deletions(-)

-- 
1.7.9.5


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 3/5] tools: correct coding style for psr

2015-01-07 Thread Chao Peng
- space: remove space after '(' or before ')' in 'if' condition;
- indention: align function definition/call arguments;

Signed-off-by: Chao Peng 
---
 tools/libxc/include/xenctrl.h |8 
 tools/libxc/xc_psr.c  |   10 +-
 tools/libxl/libxl.h   |   11 +++
 tools/libxl/libxl_psr.c   |   11 +++
 tools/libxl/xl_cmdimpl.c  |   11 ++-
 5 files changed, 29 insertions(+), 22 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 96b357c..c6e9e3e 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2693,15 +2693,15 @@ typedef enum xc_psr_cmt_type xc_psr_cmt_type;
 int xc_psr_cmt_attach(xc_interface *xch, uint32_t domid);
 int xc_psr_cmt_detach(xc_interface *xch, uint32_t domid);
 int xc_psr_cmt_get_domain_rmid(xc_interface *xch, uint32_t domid,
-uint32_t *rmid);
+   uint32_t *rmid);
 int xc_psr_cmt_get_total_rmid(xc_interface *xch, uint32_t *total_rmid);
 int xc_psr_cmt_get_l3_upscaling_factor(xc_interface *xch,
-uint32_t *upscaling_factor);
+   uint32_t *upscaling_factor);
 int xc_psr_cmt_get_l3_event_mask(xc_interface *xch, uint32_t *event_mask);
 int xc_psr_cmt_get_l3_cache_size(xc_interface *xch, uint32_t cpu,
 uint32_t *l3_cache_size);
-int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid,
-uint32_t cpu, uint32_t psr_cmt_type, uint64_t *monitor_data);
+int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid, uint32_t cpu,
+uint32_t psr_cmt_type, uint64_t *monitor_data);
 int xc_psr_cmt_enabled(xc_interface *xch);
 #endif
 
diff --git a/tools/libxc/xc_psr.c b/tools/libxc/xc_psr.c
index e76a0f9..e3ecc41 100644
--- a/tools/libxc/xc_psr.c
+++ b/tools/libxc/xc_psr.c
@@ -47,7 +47,7 @@ int xc_psr_cmt_detach(xc_interface *xch, uint32_t domid)
 }
 
 int xc_psr_cmt_get_domain_rmid(xc_interface *xch, uint32_t domid,
-uint32_t *rmid)
+   uint32_t *rmid)
 {
 int rc;
 DECLARE_DOMCTL;
@@ -88,7 +88,7 @@ int xc_psr_cmt_get_total_rmid(xc_interface *xch, uint32_t 
*total_rmid)
 }
 
 int xc_psr_cmt_get_l3_upscaling_factor(xc_interface *xch,
-uint32_t *upscaling_factor)
+   uint32_t *upscaling_factor)
 {
 static int val = 0;
 int rc;
@@ -137,7 +137,7 @@ int xc_psr_cmt_get_l3_event_mask(xc_interface *xch, 
uint32_t *event_mask)
 }
 
 int xc_psr_cmt_get_l3_cache_size(xc_interface *xch, uint32_t cpu,
-  uint32_t *l3_cache_size)
+ uint32_t *l3_cache_size)
 {
 static int val = 0;
 int rc;
@@ -162,8 +162,8 @@ int xc_psr_cmt_get_l3_cache_size(xc_interface *xch, 
uint32_t cpu,
 return rc;
 }
 
-int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid,
-uint32_t cpu, xc_psr_cmt_type type, uint64_t *monitor_data)
+int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid, uint32_t cpu,
+xc_psr_cmt_type type, uint64_t *monitor_data)
 {
 xc_resource_op_t op;
 xc_resource_entry_t entries[2];
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 42ace76..d84ff7f 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -1455,10 +1455,13 @@ int libxl_psr_cmt_domain_attached(libxl_ctx *ctx, 
uint32_t domid);
 int libxl_psr_cmt_enabled(libxl_ctx *ctx);
 int libxl_psr_cmt_type_supported(libxl_ctx *ctx, libxl_psr_cmt_type type);
 int libxl_psr_cmt_get_total_rmid(libxl_ctx *ctx, uint32_t *total_rmid);
-int libxl_psr_cmt_get_l3_cache_size(libxl_ctx *ctx, uint32_t socketid,
-uint32_t *l3_cache_size);
-int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx, uint32_t domid,
-uint32_t socketid, uint32_t *l3_cache_occupancy);
+int libxl_psr_cmt_get_l3_cache_size(libxl_ctx *ctx,
+uint32_t socketid,
+uint32_t *l3_cache_size);
+int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx,
+  uint32_t domid,
+  uint32_t socketid,
+  uint32_t *l3_cache_occupancy);
 #endif
 
 /* misc */
diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c
index 3018a0d..0f2c7e0 100644
--- a/tools/libxl/libxl_psr.c
+++ b/tools/libxl/libxl_psr.c
@@ -153,8 +153,9 @@ int libxl_psr_cmt_get_total_rmid(libxl_ctx *ctx, uint32_t 
*total_rmid)
 return rc;
 }
 
-int libxl_psr_cmt_get_l3_cache_size(libxl_ctx *ctx, uint32_t socketid,
- uint32_t *l3_cache_size)
+int libxl_psr_cmt_get_l3_cache_size(libxl_ctx *ctx,
+uint32_t socketid,
+uint32_t *l3_cache_size)
 {
 GC_INIT(ctx);
 
@@ -178,8 +179,10 @@ out:
 return rc;
 }
 
-int libxl_psr_cmt_get_cache_occupancy(libxl_c

[Xen-devel] [PATCH v2 5/5] tools: add total/local memory bandwith monitoring

2015-01-07 Thread Chao Peng
Add Memory Bandwidth Monitoring(MBM) for VMs. Two types of monitoring
are supported: total and local memory bandwidth monitoring. To use it,
CMT should be enabled in hypervisor.

Signed-off-by: Chao Peng 
---
 docs/man/xl.pod.1 |9 +
 tools/libxc/include/xenctrl.h |2 ++
 tools/libxc/xc_psr.c  |8 +
 tools/libxl/libxl.h   |8 +
 tools/libxl/libxl_psr.c   |   75 +
 tools/libxl/libxl_types.idl   |2 ++
 tools/libxl/xl_cmdimpl.c  |   18 ++
 tools/libxl/xl_cmdtable.c |4 ++-
 8 files changed, 125 insertions(+), 1 deletion(-)

diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
index 6b89ba8..0370625 100644
--- a/docs/man/xl.pod.1
+++ b/docs/man/xl.pod.1
@@ -1461,6 +1461,13 @@ is domain level. To monitor a specific domain, just 
attach the domain id with
 the monitoring service. When the domain doesn't need to be monitored any more,
 detach the domain id from the monitoring service.
 
+Intel Broadwell and later server platforms also offer total/local memory
+bandwidth monitoring. Xen supports per-domain monitoring for these two
+additional monitoring types. Both memory bandwidth monitoring and L3 cache
+occupancy monitoring share the same set of underground monitoring service. Once
+a domain is attached to the monitoring service, monitoring data can be showed
+for any of these monitoring types.
+
 =over 4
 
 =item B [I]
@@ -1476,6 +1483,8 @@ detach: Detach the platform shared resource monitoring 
service from a domain.
 Show monitoring data for a certain domain or all domains. Current supported
 monitor types are:
  - "cache-occupancy": showing the L3 cache occupancy.
+ - "total-mem-bandwidth": showing the total memory bandwidth.
+ - "local-mem-bandwidth": showing the local memory bandwidth.
 
 =back
 
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index c6e9e3e..06366b5 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2688,6 +2688,8 @@ int xc_resource_op(xc_interface *xch, uint32_t nr_ops, 
xc_resource_op_t *ops);
 #if defined(__i386__) || defined(__x86_64__)
 enum xc_psr_cmt_type {
 XC_PSR_CMT_L3_OCCUPANCY,
+XC_PSR_CMT_TOTAL_MEM_BANDWIDTH,
+XC_PSR_CMT_LOCAL_MEM_BANDWIDTH,
 };
 typedef enum xc_psr_cmt_type xc_psr_cmt_type;
 int xc_psr_cmt_attach(xc_interface *xch, uint32_t domid);
diff --git a/tools/libxc/xc_psr.c b/tools/libxc/xc_psr.c
index e3ecc41..99cb754 100644
--- a/tools/libxc/xc_psr.c
+++ b/tools/libxc/xc_psr.c
@@ -23,6 +23,8 @@
 #define IA32_CMT_CTR_ERROR_MASK (0x3ull << 62)
 
 #define EVTID_L3_OCCUPANCY 0x1
+#define EVTID_TOTAL_MEM_BANDWIDTH  0x2
+#define EVTID_LOCAL_MEM_BANDWIDTH  0x3
 
 int xc_psr_cmt_attach(xc_interface *xch, uint32_t domid)
 {
@@ -175,6 +177,12 @@ int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid, 
uint32_t cpu,
 case XC_PSR_CMT_L3_OCCUPANCY:
 evtid = EVTID_L3_OCCUPANCY;
 break;
+case XC_PSR_CMT_TOTAL_MEM_BANDWIDTH:
+evtid = EVTID_TOTAL_MEM_BANDWIDTH;
+break;
+case XC_PSR_CMT_LOCAL_MEM_BANDWIDTH:
+evtid = EVTID_LOCAL_MEM_BANDWIDTH;
+break;
 default:
 return -1;
 }
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index d84ff7f..5ab9d0c 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -1462,6 +1462,14 @@ int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx,
   uint32_t domid,
   uint32_t socketid,
   uint32_t *l3_cache_occupancy);
+int libxl_psr_cmt_get_total_mem_bandwidth(libxl_ctx *ctx,
+  uint32_t domid,
+  uint32_t socketid,
+  uint32_t *bandwidth);
+int libxl_psr_cmt_get_local_mem_bandwidth(libxl_ctx *ctx,
+  uint32_t domid,
+  uint32_t socketid,
+  uint32_t *bandwidth);
 #endif
 
 /* misc */
diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c
index 42e74d2..a0cda89 100644
--- a/tools/libxl/libxl_psr.c
+++ b/tools/libxl/libxl_psr.c
@@ -18,6 +18,7 @@
 
 
 #define IA32_QM_CTR_ERROR_MASK (0x3ul << 62)
+#define MBM_SAMPLE_RETRY_MAX 4
 
 static void libxl__psr_cmt_log_err_msg(libxl__gc *gc, int err)
 {
@@ -240,6 +241,80 @@ out:
 return rc;
 }
 
+static int libxl__psr_cmt_get_mem_bandwidth(libxl__gc *gc,
+uint32_t domid,
+xc_psr_cmt_type type,
+uint32_t socketid,
+uint32_t *bandwidth)
+{
+uint64_t sample1, sample2;
+uint32_t upscaling_factor;
+int retry_attempts = 0;
+int rc;
+
+retry:
+rc = libxl__psr_cmt_get_l

[Xen-devel] [PATCH v2 4/5] tools: code refactoring for MBM

2015-01-07 Thread Chao Peng
Make some internal routines common so that total/local memory bandwidth
monitoring in the next patch can make use of them.

Signed-off-by: Chao Peng 
---
 tools/libxl/libxl_psr.c  |   51 ++-
 tools/libxl/xl_cmdimpl.c |   54 +++---
 2 files changed, 63 insertions(+), 42 deletions(-)

diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c
index 0f2c7e0..42e74d2 100644
--- a/tools/libxl/libxl_psr.c
+++ b/tools/libxl/libxl_psr.c
@@ -179,42 +179,53 @@ out:
 return rc;
 }
 
-int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx,
-  uint32_t domid,
-  uint32_t socketid,
-  uint32_t *l3_cache_occupancy)
+static int libxl__psr_cmt_get_l3_monitoring_data(libxl__gc *gc,
+ uint32_t domid,
+ xc_psr_cmt_type type,
+ uint32_t socketid,
+ uint64_t *data)
 {
-GC_INIT(ctx);
-
 unsigned int rmid;
-uint32_t upscaling_factor;
-uint64_t monitor_data;
 int cpu, rc;
-xc_psr_cmt_type type;
 
-rc = xc_psr_cmt_get_domain_rmid(ctx->xch, domid, &rmid);
+rc = xc_psr_cmt_get_domain_rmid(CTX->xch, domid, &rmid);
 if (rc < 0 || rmid == 0) {
 LOGE(ERROR, "fail to get the domain rmid, "
 "or domain is not attached with platform QoS monitoring service");
-rc = ERROR_FAIL;
-goto out;
+return ERROR_FAIL;
 }
 
 cpu = libxl__pick_socket_cpu(gc, socketid);
 if (cpu < 0) {
 LOGE(ERROR, "failed to get socket cpu");
-rc = ERROR_FAIL;
-goto out;
+return ERROR_FAIL;
 }
 
-type = XC_PSR_CMT_L3_OCCUPANCY;
-rc = xc_psr_cmt_get_data(ctx->xch, rmid, cpu, type, &monitor_data);
+rc = xc_psr_cmt_get_data(CTX->xch, rmid, cpu, type, data);
 if (rc < 0) {
 LOGE(ERROR, "failed to get monitoring data");
-rc = ERROR_FAIL;
-goto out;
+return ERROR_FAIL;
 }
 
+return rc;
+}
+
+int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx,
+  uint32_t domid,
+  uint32_t socketid,
+  uint32_t *l3_cache_occupancy)
+{
+GC_INIT(ctx);
+uint64_t data;
+uint32_t upscaling_factor;
+int rc;
+
+rc= libxl__psr_cmt_get_l3_monitoring_data(gc, domid,
+  XC_PSR_CMT_L3_OCCUPANCY,
+  socketid, &data);
+if (rc < 0)
+goto out;
+
 rc = xc_psr_cmt_get_l3_upscaling_factor(ctx->xch, &upscaling_factor);
 if (rc < 0) {
 LOGE(ERROR, "failed to get L3 upscaling factor");
@@ -222,8 +233,8 @@ int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx,
 goto out;
 }
 
-*l3_cache_occupancy = upscaling_factor * monitor_data / 1024;
-rc = 0;
+*l3_cache_occupancy = upscaling_factor * data / 1024;
+
 out:
 GC_FREE;
 return rc;
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 24f3c8d..09ca73e 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -7846,12 +7846,13 @@ out:
 }
 
 #ifdef LIBXL_HAVE_PSR_CMT
-static void psr_cmt_print_domain_cache_occupancy(libxl_dominfo *dominfo,
- uint32_t nr_sockets)
+static void psr_cmt_print_domain_l3_info(libxl_dominfo *dominfo,
+ libxl_psr_cmt_type type,
+ uint32_t nr_sockets)
 {
 char *domain_name;
 uint32_t socketid;
-uint32_t l3_cache_occupancy;
+uint32_t data;
 
 if (!libxl_psr_cmt_domain_attached(ctx, dominfo->domid))
 return;
@@ -7861,15 +7862,21 @@ static void 
psr_cmt_print_domain_cache_occupancy(libxl_dominfo *dominfo,
 free(domain_name);
 
 for (socketid = 0; socketid < nr_sockets; socketid++) {
-if (!libxl_psr_cmt_get_cache_occupancy(ctx, dominfo->domid, socketid,
-   &l3_cache_occupancy))
-printf("%13u KB", l3_cache_occupancy);
+switch (type) {
+case LIBXL_PSR_CMT_TYPE_CACHE_OCCUPANCY:
+if (!libxl_psr_cmt_get_cache_occupancy(ctx, dominfo->domid,
+   socketid, &data))
+printf("%13u KB", data);
+break;
+default:
+return;
+}
 }
 
 printf("\n");
 }
 
-static int psr_cmt_show_cache_occupancy(uint32_t domid)
+static int psr_cmt_show_l3_info(libxl_psr_cmt_type type, uint32_t domid)
 {
 uint32_t i, socketid, nr_sockets, total_rmid;
 uint32_t l3_cache_size;
@@ -7905,19 +7912,22 @@ static int psr_cmt_show_cache_occupancy(uint32_t domid

[Xen-devel] [PATCH v2 2/5] tools: add routine to get CMT L3 event mask

2015-01-07 Thread Chao Peng
This is the tools side wrapper for XEN_SYSCTL_PSR_CMT_get_l3_event_mask
of XEN_SYSCTL_psr_cmt_op.

Signed-off-by: Chao Peng 
---
 tools/libxc/include/xenctrl.h |1 +
 tools/libxc/xc_psr.c  |   24 
 tools/libxl/libxl.h   |1 +
 tools/libxl/libxl_psr.c   |   18 ++
 4 files changed, 44 insertions(+)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 0ad8b8d..96b357c 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2697,6 +2697,7 @@ int xc_psr_cmt_get_domain_rmid(xc_interface *xch, 
uint32_t domid,
 int xc_psr_cmt_get_total_rmid(xc_interface *xch, uint32_t *total_rmid);
 int xc_psr_cmt_get_l3_upscaling_factor(xc_interface *xch,
 uint32_t *upscaling_factor);
+int xc_psr_cmt_get_l3_event_mask(xc_interface *xch, uint32_t *event_mask);
 int xc_psr_cmt_get_l3_cache_size(xc_interface *xch, uint32_t cpu,
 uint32_t *l3_cache_size);
 int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid,
diff --git a/tools/libxc/xc_psr.c b/tools/libxc/xc_psr.c
index 872e6dc..e76a0f9 100644
--- a/tools/libxc/xc_psr.c
+++ b/tools/libxc/xc_psr.c
@@ -112,6 +112,30 @@ int xc_psr_cmt_get_l3_upscaling_factor(xc_interface *xch,
 return rc;
 }
 
+int xc_psr_cmt_get_l3_event_mask(xc_interface *xch, uint32_t *event_mask)
+{
+static int val = 0;
+int rc;
+DECLARE_SYSCTL;
+
+if ( val )
+{
+*event_mask = val;
+return 0;
+}
+
+sysctl.cmd = XEN_SYSCTL_psr_cmt_op;
+sysctl.u.psr_cmt_op.cmd =
+XEN_SYSCTL_PSR_CMT_get_l3_event_mask;
+sysctl.u.psr_cmt_op.flags = 0;
+
+rc = xc_sysctl(xch, &sysctl);
+if ( !rc )
+val = *event_mask = sysctl.u.psr_cmt_op.u.data;
+
+return rc;
+}
+
 int xc_psr_cmt_get_l3_cache_size(xc_interface *xch, uint32_t cpu,
   uint32_t *l3_cache_size)
 {
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 0a123f1..42ace76 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -1453,6 +1453,7 @@ int libxl_psr_cmt_attach(libxl_ctx *ctx, uint32_t domid);
 int libxl_psr_cmt_detach(libxl_ctx *ctx, uint32_t domid);
 int libxl_psr_cmt_domain_attached(libxl_ctx *ctx, uint32_t domid);
 int libxl_psr_cmt_enabled(libxl_ctx *ctx);
+int libxl_psr_cmt_type_supported(libxl_ctx *ctx, libxl_psr_cmt_type type);
 int libxl_psr_cmt_get_total_rmid(libxl_ctx *ctx, uint32_t *total_rmid);
 int libxl_psr_cmt_get_l3_cache_size(libxl_ctx *ctx, uint32_t socketid,
 uint32_t *l3_cache_size);
diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c
index 0437465..3018a0d 100644
--- a/tools/libxl/libxl_psr.c
+++ b/tools/libxl/libxl_psr.c
@@ -120,6 +120,24 @@ int libxl_psr_cmt_enabled(libxl_ctx *ctx)
 return xc_psr_cmt_enabled(ctx->xch);
 }
 
+int libxl_psr_cmt_type_supported(libxl_ctx *ctx, libxl_psr_cmt_type type)
+{
+GC_INIT(ctx);
+uint32_t event_mask;
+int ret;
+
+ret = xc_psr_cmt_get_l3_event_mask(ctx->xch, &event_mask);
+if (ret < 0) {
+libxl__psr_cmt_log_err_msg(gc, errno);
+ret = 0;
+} else {
+ret = event_mask & (1 << (type - 1));
+}
+
+GC_FREE;
+return ret;
+}
+
 int libxl_psr_cmt_get_total_rmid(libxl_ctx *ctx, uint32_t *total_rmid)
 {
 GC_INIT(ctx);
-- 
1.7.9.5


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Xen 4.5 Development Update (GA slip by a week)

2015-01-07 Thread Olaf Hering
On Tue, Jan 06, Konrad Rzeszutek Wilk wrote:

> There is only one outstanding patch and that is "#7 tools/hotplug: add 
> wrapper to 
> start xenstored". Olaf is back tomorrow so it might make it .. or not.

See my other replies. I think once we know how to deal with SELinux and
systemd this change may be not needed anymore.

Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v2 1/5] x86: expose CMT L3 event mask to user space

2015-01-07 Thread Chao Peng
L3 event mask indicates the event types supported in host, including
cache occupancy event as well as local/total memory bandwidth events
for Memory Bandwidth Monitoring(MBM). Expose it so all these events
can be monitored in user space.

Signed-off-by: Chao Peng 
Reviewed-by: Andrew Cooper 
Acked-by: Jan Beulich 
---
 xen/arch/x86/sysctl.c   |3 +++
 xen/include/public/sysctl.h |1 +
 2 files changed, 4 insertions(+)

diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index 57ad992..611a291 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -157,6 +157,9 @@ long arch_do_sysctl(
 sysctl->u.psr_cmt_op.u.data = (ret ? 0 : info.size);
 break;
 }
+case XEN_SYSCTL_PSR_CMT_get_l3_event_mask:
+sysctl->u.psr_cmt_op.u.data = psr_cmt->l3.features;
+break;
 default:
 sysctl->u.psr_cmt_op.u.data = 0;
 ret = -ENOSYS;
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index b3713b3..8552dc6 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -641,6 +641,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_sysctl_coverage_op_t);
 /* The L3 cache size is returned in KB unit */
 #define XEN_SYSCTL_PSR_CMT_get_l3_cache_size 2
 #define XEN_SYSCTL_PSR_CMT_enabled   3
+#define XEN_SYSCTL_PSR_CMT_get_l3_event_mask 4
 struct xen_sysctl_psr_cmt_op {
 uint32_t cmd;   /* IN: XEN_SYSCTL_PSR_CMT_* */
 uint32_t flags; /* padding variable, may be extended for future use */
-- 
1.7.9.5


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 3/5] tools: correct coding style for psr

2015-01-07 Thread Wei Liu
On Wed, Jan 07, 2015 at 07:12:03PM +0800, Chao Peng wrote:
> - space: remove space after '(' or before ')' in 'if' condition;
> - indention: align function definition/call arguments;
> 
> Signed-off-by: Chao Peng 

Acked-by: Wei Liu 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Nominations for Xen 4.5 stable tree maintainer.

2015-01-07 Thread Ian Campbell
On Tue, 2015-01-06 at 11:15 -0500, Konrad Rzeszutek Wilk wrote:
> Hello,
> 
> Per http://wiki.xenproject.org/wiki/Xen_Project_Maintenance_Releases:
> "Each stable branch has a maintainer who is nominated/volunteers according to 
> the Maintainer Election
>  process described in the project governance document 
> [http://www.xenproject.org/governance.html].
>  This will resulting in the MAINTAINERS file in the relevant branch being 
> patched to include the maintainer."
> 
> For the past year or so Jan Beulich has been the stable tree maintainer.
> 
> Since Xen 4.5 has branched that opens up a new stable tree and we can also
> stop maintaining Xen 4.3 stable tree.
> 
> The nominations are open - please volunteer yourself. In case nobody
> volunteers I can also take the role.
> 
> I ask folks to finish voting/nominating by Jan 14th so that when Xen 4.5 comes
> out we have an viable stable tree maintainer.

I'm not sure how voting is supposed to proceed with multiple nominations
(and with the deadline for nominations apparently being the same as for
voting), but given that Jan has thrown his hat in the ring and has been
doing a fine job with the previous trees, +1 to Jan continuing as the
stable tree maintainer.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/5] tools: add routine to get CMT L3 event mask

2015-01-07 Thread Andrew Cooper
On 07/01/15 11:12, Chao Peng wrote:
> This is the tools side wrapper for XEN_SYSCTL_PSR_CMT_get_l3_event_mask
> of XEN_SYSCTL_psr_cmt_op.
>
> Signed-off-by: Chao Peng 
> ---
>  tools/libxc/include/xenctrl.h |1 +
>  tools/libxc/xc_psr.c  |   24 
>  tools/libxl/libxl.h   |1 +
>  tools/libxl/libxl_psr.c   |   18 ++
>  4 files changed, 44 insertions(+)
>
> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> index 0ad8b8d..96b357c 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -2697,6 +2697,7 @@ int xc_psr_cmt_get_domain_rmid(xc_interface *xch, 
> uint32_t domid,
>  int xc_psr_cmt_get_total_rmid(xc_interface *xch, uint32_t *total_rmid);
>  int xc_psr_cmt_get_l3_upscaling_factor(xc_interface *xch,
>  uint32_t *upscaling_factor);
> +int xc_psr_cmt_get_l3_event_mask(xc_interface *xch, uint32_t *event_mask);
>  int xc_psr_cmt_get_l3_cache_size(xc_interface *xch, uint32_t cpu,
>  uint32_t *l3_cache_size);
>  int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid,
> diff --git a/tools/libxc/xc_psr.c b/tools/libxc/xc_psr.c
> index 872e6dc..e76a0f9 100644
> --- a/tools/libxc/xc_psr.c
> +++ b/tools/libxc/xc_psr.c
> @@ -112,6 +112,30 @@ int xc_psr_cmt_get_l3_upscaling_factor(xc_interface *xch,
>  return rc;
>  }
>  
> +int xc_psr_cmt_get_l3_event_mask(xc_interface *xch, uint32_t *event_mask)
> +{
> +static int val = 0;

This should be uint32_t rather than int.

I am somewhat concerned about multithreaded use of libxc, but this is
not the first issue in libxc, and probably shouldn't be held against
this patch.  As the result of the hypercall is going to be the same, the
worse that a race could achieve is a wasted hypercall.

> +int rc;
> +DECLARE_SYSCTL;
> +
> +if ( val )
> +{
> +*event_mask = val;
> +return 0;
> +}
> +
> +sysctl.cmd = XEN_SYSCTL_psr_cmt_op;
> +sysctl.u.psr_cmt_op.cmd =
> +XEN_SYSCTL_PSR_CMT_get_l3_event_mask;
> +sysctl.u.psr_cmt_op.flags = 0;
> +
> +rc = xc_sysctl(xch, &sysctl);
> +if ( !rc )
> +val = *event_mask = sysctl.u.psr_cmt_op.u.data;
> +
> +return rc;
> +}
> +
>  int xc_psr_cmt_get_l3_cache_size(xc_interface *xch, uint32_t cpu,
>uint32_t *l3_cache_size)
>  {
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index 0a123f1..42ace76 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -1453,6 +1453,7 @@ int libxl_psr_cmt_attach(libxl_ctx *ctx, uint32_t 
> domid);
>  int libxl_psr_cmt_detach(libxl_ctx *ctx, uint32_t domid);
>  int libxl_psr_cmt_domain_attached(libxl_ctx *ctx, uint32_t domid);
>  int libxl_psr_cmt_enabled(libxl_ctx *ctx);
> +int libxl_psr_cmt_type_supported(libxl_ctx *ctx, libxl_psr_cmt_type type);
>  int libxl_psr_cmt_get_total_rmid(libxl_ctx *ctx, uint32_t *total_rmid);
>  int libxl_psr_cmt_get_l3_cache_size(libxl_ctx *ctx, uint32_t socketid,
>  uint32_t *l3_cache_size);
> diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c
> index 0437465..3018a0d 100644
> --- a/tools/libxl/libxl_psr.c
> +++ b/tools/libxl/libxl_psr.c
> @@ -120,6 +120,24 @@ int libxl_psr_cmt_enabled(libxl_ctx *ctx)
>  return xc_psr_cmt_enabled(ctx->xch);
>  }
>  
> +int libxl_psr_cmt_type_supported(libxl_ctx *ctx, libxl_psr_cmt_type type)
> +{
> +GC_INIT(ctx);
> +uint32_t event_mask;
> +int ret;

The libxl CODING_SYTLE states that this "ret" should be "rc"

~Andrew

> +
> +ret = xc_psr_cmt_get_l3_event_mask(ctx->xch, &event_mask);
> +if (ret < 0) {
> +libxl__psr_cmt_log_err_msg(gc, errno);
> +ret = 0;
> +} else {
> +ret = event_mask & (1 << (type - 1));
> +}
> +
> +GC_FREE;
> +return ret;
> +}
> +
>  int libxl_psr_cmt_get_total_rmid(libxl_ctx *ctx, uint32_t *total_rmid)
>  {
>  GC_INIT(ctx);



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 06/12] xen: mark grant mapped pages as foreign

2015-01-07 Thread Ian Campbell
On Tue, 2015-01-06 at 18:57 +, David Vrabel wrote:
> From: Jenny Herbert 
> 
> Use the "foreign" page flag to mark pages that have a grant map.  Use
> page->private to store information of the grant (the granting domain
> and the grant reference).
> 
> Signed-off-by: Jenny Herbert 
> Signed-off-by: David Vrabel 
> ---
>  arch/x86/xen/p2m.c|   50 
> ++---
>  include/xen/grant_table.h |   13 
>  2 files changed, 56 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/xen/p2m.c b/arch/x86/xen/p2m.c
> index 0d70814..22624a3 100644
> --- a/arch/x86/xen/p2m.c
> +++ b/arch/x86/xen/p2m.c
> @@ -648,6 +648,43 @@ bool set_phys_to_machine(unsigned long pfn, unsigned 
> long mfn)
>   return true;
>  }
>  
> +static int
> +init_page_grant_ref(struct page *p, domid_t domid, grant_ref_t grantref)

I'd be inclined to add "map" to the names somewhere, otherwise people
might thing they need to call this when allocating a grant in the f.e.
or other things.

> +{
> +#ifdef CONFIG_X86_64

Rather than suggesting to add CONFIG_ARM_64 here I'll suggest
BITS_PER_LONG >= 64.

> + uint64_t gref;
> + uint64_t* gref_p = &gref;
> +#else
> + uint64_t* gref_p = kmalloc(sizeof(uint64_t), GFP_KERNEL);

Might this allocation be happening during e.g. swapping? I suppose it is
backend only, and swapping to a loopback vbd would be pretty mad. If you
can figure a reasonable use case for that you might want some extra GFP
flags?

Might this be hot enough to warrant using a specific kmem_cache?

> + if (!gref)
> + return -ENOMEM;
> + uint64_t* gref = gref_p;
> +#endif
> +
> + *gref_p = ((uint64_t) grantref << 32) | domid;
> + p->private = gref;

There is a set_page_private() macro, which doesn't seem to do much but I
suppose you should use it (and page_private() for accessing, if you
don't already).

> @@ -182,4 +183,16 @@ int gnttab_unmap_refs(struct gnttab_unmap_grant_ref 
> *unmap_ops,
>  void gnttab_batch_map(struct gnttab_map_grant_ref *batch, unsigned count);
>  void gnttab_batch_copy(struct gnttab_copy *batch, unsigned count);
>  
> +static inline void
> +get_page_grant_ref(struct page *p, domid_t* domid, grant_ref_t* grantref)
> +{

BUG_ON(!PageBlah(p))?

> +#ifdef CONFIG_X86_64
> + uint64_t gref = p->private;
> +#else
> + uint64_t gref = *p->private;
> +#endif
> + *domid = gref & 0x;
> + *grantref = gref >> 32;
> +}
> +
>  #endif /* __ASM_GNTTAB_H__ */



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [libvirt] [PATCH RFC] libxl: fix paths in capability string

2015-01-07 Thread Wei Liu
On Tue, Jan 06, 2015 at 10:28:13AM -0700, Eric Blake wrote:
> On 01/06/2015 09:12 AM, Wei Liu wrote:
> > Currently libxl driver hardcodes some paths in its capability string,
> > which might not be the correct paths.
> > 
> > This patch introduces --with-libxl-prefix, so that user can specify the
> > prefix used to build Xen tools. The default value is /usr/local which is
> > the default --prefix for Xen tools.
> 
> Why can't you just make '--with-libxl=/path/to/alt' work?  That is,
> --with-libxl=yes uses a default (which matches the prefix where LIBVIRT
> will be installed [1]), --with-libxl=no turns it off, and specifying a
> path says to use that path.  Then you don't need to add a new option.
> 

But --with-libxl is used to specify build time path which can be
different from runtime path if I build libvirt against a non-installed
version of xen.

Consider I have a non-installed build of libxl in
/local/scratch/xen.git/dist/install/usr/local but not /usr/local. I
don't want that "/local/scratch..." end up in capability string.

Wei.

> [1] The assumption generally is that whoever is building libvirt has
> also built libxl to be installed in the same location - which is a nicer
> default than hard-coding a /usr/local default that libxl uses in
> isolation. Of course, libvirt also defaults to /usr/local in isolation,
> so if you don't specify anything at all, then ./configure will see that
> libvirt is going into /usr/local and will therefore default to looking
> for libxl also in /usr/local; but for a distro packager, it is nicer to
> have the mere specification of --prefix switch all other relative
> defaults to play nicely with everything else in the distro.
> 
> -- 
> Eric Blake   eblake redhat com+1-919-301-3266
> Libvirt virtualization library http://libvirt.org
> 



> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [libvirt] [PATCH RFC] libxl: fix paths in capability string

2015-01-07 Thread Wei Liu
On Tue, Jan 06, 2015 at 05:36:57PM +, Daniel P. Berrange wrote:
> On Tue, Jan 06, 2015 at 10:28:13AM -0700, Eric Blake wrote:
> > On 01/06/2015 09:12 AM, Wei Liu wrote:
> > > Currently libxl driver hardcodes some paths in its capability string,
> > > which might not be the correct paths.
> > > 
> > > This patch introduces --with-libxl-prefix, so that user can specify the
> > > prefix used to build Xen tools. The default value is /usr/local which is
> > > the default --prefix for Xen tools.
> > 
> > Why can't you just make '--with-libxl=/path/to/alt' work?  That is,
> > --with-libxl=yes uses a default (which matches the prefix where LIBVIRT
> > will be installed [1]), --with-libxl=no turns it off, and specifying a
> > path says to use that path.  Then you don't need to add a new option.
> 
> Yep, that would be preferrable.
> 
> Alternatively the next best would be for xen to include a pkg-config
> configuration file, with a custom variable that reports the paths
> required
> 

This might be a useful thing in its own right. Thanks for the
suggestion.

Wei.

> Regards,
> Daniel
> -- 
> |: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org  -o- http://virt-manager.org :|
> |: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC] libxl: fix paths in capability string

2015-01-07 Thread Wei Liu
Jim, another idea: if those strings are likely to be wrong and in fact
not used, can we just not print them?

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 07/12] xen-netback: use foreign page information from the pages themselves

2015-01-07 Thread Ian Campbell
On Tue, 2015-01-06 at 18:57 +, David Vrabel wrote:
> From: Jenny Herbert 
> 
> Use the foreign page flag in netback to get the domid and grant ref
> needed for the grant copy.  This signficiantly simplifies the netback
> code and makes netback work with foreign pages from other backends
> (e.g., blkback).
> 
> This allows blkback to use iSCSI disks provided by domUs running on
> the same host.
> 
> Signed-off-by: Jenny Herbert 
> Signed-off-by: David Vrabel 
> ---
>  drivers/net/xen-netback/netback.c |   97 
> +++--
>  1 file changed, 6 insertions(+), 91 deletions(-)

Gotta love that diffstat...

Acked-by: Ian Campbell 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 0/9] toolstack-based approach to pvhvm guest kexec

2015-01-07 Thread Vitaly Kuznetsov
"Jan Beulich"  writes:

 On 07.01.15 at 11:41,  wrote:
>> On 07/01/15 09:10, Olaf Hering wrote:
>>> On Mon, Jan 05, Vitaly Kuznetsov wrote:
>>> 
 Wei Liu  writes:

> Olaf mentioned his concern about handling ballooned pages in
> <20141211153029.ga1...@aepfle.de>. Is that point moot now?

 Well, the limitation is real and some guest-side handling will be
 required in case we want to support kexec with ballooning. But as David
 validly mentioned "It's the responsibility of the guest to ensure it
 either doesn't kexec when it is ballooned or that the kexec kernel can
 handle this". Not sure if we can (and need to) do anything hypevisor- or
 toolstack-side.
>>> 
>>> One approach would be to mark all pages as some sort of
>>> populate-on-demand first. Then copy the existing assigned pages from
>>> domA to domB and update the page type. The remaining pages are likely
>>> ballooned. Once the guest tries to access them this should give the
>>> hypervisor and/or toolstack a chance to assign a real RAM page to them.
>>> 
>>> I mean, if a host-assisted approach for kexec is implemented then this
>>> approach must also cover ballooning.
>> 
>> It is not possible for the hypervisor or toolstack to do what you want
>> because there may not be enough free memory to repopulate the new domain.
>> 
>> The guest can handle this by:
>> 
>> 1. Not ballooning (this is common in cloud environments).
>> 2. Reducing the balloon prior to kexec.
>
> Which may fail because again there may not be enough memory to
> claim back from the hypervisor.
>

Yes, but it may be better to cancel kexec at this point.

-- 
  Vitaly

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Nominations for Xen 4.5 stable tree maintainer.

2015-01-07 Thread Lars Kurth


On 07/01/2015 11:26, "Ian Campbell"  wrote:

>On Tue, 2015-01-06 at 11:15 -0500, Konrad Rzeszutek Wilk wrote:
>> Hello,
>> 
>> Per http://wiki.xenproject.org/wiki/Xen_Project_Maintenance_Releases:
>> "Each stable branch has a maintainer who is nominated/volunteers
>>according to the Maintainer Election
>>  process described in the project governance document
>>[http://www.xenproject.org/governance.html].
>>  This will resulting in the MAINTAINERS file in the relevant branch
>>being patched to include the maintainer."
>> 
>> For the past year or so Jan Beulich has been the stable tree maintainer.
>> 
>> Since Xen 4.5 has branched that opens up a new stable tree and we can
>>also
>> stop maintaining Xen 4.3 stable tree.
>> 
>> The nominations are open - please volunteer yourself. In case nobody
>> volunteers I can also take the role.
>> 
>> I ask folks to finish voting/nominating by Jan 14th so that when Xen
>>4.5 comes
>> out we have an viable stable tree maintainer.
>
>I'm not sure how voting is supposed to proceed with multiple nominations
>(and with the deadline for nominations apparently being the same as for
>voting), 

Actually, it is questionable whether there are multiple nominations.
Andrew said "If Jan wants a break, I would be happy to volunteer."

I am also not convinced that we need an election, unless the existing
maintainer wants to steps down. We never had one in the past. And we don't
have an explicit nomination for Release Managers unless the existing RM
steps down.

I can't find the mailing list discussion which led to
http://wiki.xenproject.org/wiki/Xen_Project_Maintenance_Releases (the link
in the change history seems to be wrong). Maybe we should just change the
document to clarify that an election is only needed if the previous
maintainer steps down, which is what I think the intention really was.

Regards
Lars

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 08/12] xen/grant-table: add a mechanism to safely unmap pages that are in use

2015-01-07 Thread Ian Campbell
On Tue, 2015-01-06 at 18:57 +, David Vrabel wrote:
> From: Jenny Herbert 
> 
> Introduce gnttab_unmap_refs_async() that can be used to safely unmap
> pages that may be in use (ref count > 1).  If the pages are in use the
> unmap is deferred and retried later.  This polling is not very clever
> but it should be good enough if the cases where the delay is necessary
> are rare.
> 
> This is needed to allow block backends using grant mapping to safely
> use network storage (block or filesystem based such as iSCSI or NFS).
> 
> The network storage driver may complete a block request whilst there
> is a queued network packet retry (because the ack from the remote end
> races with deciding to queue the retry).  The pages for the retried
> packet would be grant unmapped and the network driver (or hardware)
> would access the unmapped page.

I thought this had been solved a little while ago by mapping a scratch
page on unmap even for kernel space grant mappings, but both the design
doc and here imply not (i.e. the scratch is for user grant mappings
only), so I must be misremembering.

Regardless, this approach seems likely to be far better...



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 08/12] xen/grant-table: add a mechanism to safely unmap pages that are in use

2015-01-07 Thread Ian Campbell
On Wed, 2015-01-07 at 12:00 +, Ian Campbell wrote:
> On Tue, 2015-01-06 at 18:57 +, David Vrabel wrote:
> > From: Jenny Herbert 
> > 
> > Introduce gnttab_unmap_refs_async() that can be used to safely unmap
> > pages that may be in use (ref count > 1).  If the pages are in use the
> > unmap is deferred and retried later.  This polling is not very clever
> > but it should be good enough if the cases where the delay is necessary
> > are rare.
> > 
> > This is needed to allow block backends using grant mapping to safely
> > use network storage (block or filesystem based such as iSCSI or NFS).
> > 
> > The network storage driver may complete a block request whilst there
> > is a queued network packet retry (because the ack from the remote end
> > races with deciding to queue the retry).  The pages for the retried
> > packet would be grant unmapped and the network driver (or hardware)
> > would access the unmapped page.
> 
> I thought this had been solved a little while ago by mapping a scratch
> page on unmap even for kernel space grant mappings, but both the design
> doc and here imply not (i.e. the scratch is for user grant mappings
> only), so I must be misremembering.
> 
> Regardless, this approach seems likely to be far better...

None of the following patches switch netback to use it though, I think
because the use of SKBTX_DEV_ZEROCOPY makes it unnecessary. But perhaps
this lazy unmap scheme would be a better fix than SKB zerocopy for that
case too. The current stuff has the downside of always copying skbs from
guests destined for dom0 itself (rather than just passing through on
there way to a pysical NIC or another VIF), IIRC.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Nominations for Xen 4.5 stable tree maintainer.

2015-01-07 Thread Andrew Cooper
On 07/01/15 11:59, Lars Kurth wrote:
>
> On 07/01/2015 11:26, "Ian Campbell"  wrote:
>
>> On Tue, 2015-01-06 at 11:15 -0500, Konrad Rzeszutek Wilk wrote:
>>> Hello,
>>>
>>> Per http://wiki.xenproject.org/wiki/Xen_Project_Maintenance_Releases:
>>> "Each stable branch has a maintainer who is nominated/volunteers
>>> according to the Maintainer Election
>>>  process described in the project governance document
>>> [http://www.xenproject.org/governance.html].
>>>  This will resulting in the MAINTAINERS file in the relevant branch
>>> being patched to include the maintainer."
>>>
>>> For the past year or so Jan Beulich has been the stable tree maintainer.
>>>
>>> Since Xen 4.5 has branched that opens up a new stable tree and we can
>>> also
>>> stop maintaining Xen 4.3 stable tree.
>>>
>>> The nominations are open - please volunteer yourself. In case nobody
>>> volunteers I can also take the role.
>>>
>>> I ask folks to finish voting/nominating by Jan 14th so that when Xen
>>> 4.5 comes
>>> out we have an viable stable tree maintainer.
>> I'm not sure how voting is supposed to proceed with multiple nominations
>> (and with the deadline for nominations apparently being the same as for
>> voting), 
> Actually, it is questionable whether there are multiple nominations.
> Andrew said "If Jan wants a break, I would be happy to volunteer."

Quite - I am happy for Jan to continue in this role.  I would also agree
that having the same person doing all stable trees would be far more
efficient than any split alternative.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 11/12] xen/gntdev: mark userspace PTEs as special on x86 PV guests

2015-01-07 Thread Ian Campbell
On Tue, 2015-01-06 at 18:57 +, David Vrabel wrote:
> In an x86 PV guest, get_user_pages_fast() on a userspace address range
> containing foreign mappings does not work correctly because the M2P
> lookup of the MFN from a userspace PTE may return the wrong page.
> 
> Force get_user_pages_fast() to fail on such addresses by marking the PTEs
> as special.
> 
> If Xen has XENFEAT_gnttab_map_avail_bits (available since at least
> 4.0),

http://wiki.xenproject.org/wiki/Xen_Kernel_Feature_Matrix says the dom0
pvpops already requires >= 4.0 too, which matches my recollection
(something to do with a new APIC interface which upstream insisted on
during upstreaming, IIRC), but both could be out of date...

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 4/5] tools: code refactoring for MBM

2015-01-07 Thread Wei Liu
On Wed, Jan 07, 2015 at 07:12:04PM +0800, Chao Peng wrote:
[...]
> -int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx,
> -  uint32_t domid,
> -  uint32_t socketid,
> -  uint32_t *l3_cache_occupancy)
> +static int libxl__psr_cmt_get_l3_monitoring_data(libxl__gc *gc,
> + uint32_t domid,
> + xc_psr_cmt_type type,
> + uint32_t socketid,
> + uint64_t *data)
>  {
> -GC_INIT(ctx);
> -
>  unsigned int rmid;
> -uint32_t upscaling_factor;
> -uint64_t monitor_data;
>  int cpu, rc;
> -xc_psr_cmt_type type;
>  
> -rc = xc_psr_cmt_get_domain_rmid(ctx->xch, domid, &rmid);
> +rc = xc_psr_cmt_get_domain_rmid(CTX->xch, domid, &rmid);
>  if (rc < 0 || rmid == 0) {
>  LOGE(ERROR, "fail to get the domain rmid, "
>  "or domain is not attached with platform QoS monitoring 
> service");
> -rc = ERROR_FAIL;
> -goto out;
> +return ERROR_FAIL;

Please retain the "goto out" idiom if possible.

>  }
>  
>  cpu = libxl__pick_socket_cpu(gc, socketid);
>  if (cpu < 0) {
>  LOGE(ERROR, "failed to get socket cpu");
> -rc = ERROR_FAIL;
> -goto out;
> +return ERROR_FAIL;
>  }
>  
> -type = XC_PSR_CMT_L3_OCCUPANCY;
> -rc = xc_psr_cmt_get_data(ctx->xch, rmid, cpu, type, &monitor_data);
> +rc = xc_psr_cmt_get_data(CTX->xch, rmid, cpu, type, data);
>  if (rc < 0) {
>  LOGE(ERROR, "failed to get monitoring data");
> -rc = ERROR_FAIL;
> -goto out;
> +return ERROR_FAIL;
>  }
>  
> +return rc;
> +}
> +
> +int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx,
> +  uint32_t domid,
> +  uint32_t socketid,
> +  uint32_t *l3_cache_occupancy)
> +{
> +GC_INIT(ctx);
> +uint64_t data;
> +uint32_t upscaling_factor;
> +int rc;
> +
> +rc= libxl__psr_cmt_get_l3_monitoring_data(gc, domid,

"rc ="

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Nominations for Xen 4.5 stable tree maintainer.

2015-01-07 Thread Ian Campbell
On Wed, 2015-01-07 at 11:59 +, Lars Kurth wrote:
> 
> 
> On 07/01/2015 11:26, "Ian Campbell"  wrote:
> 
> >On Tue, 2015-01-06 at 11:15 -0500, Konrad Rzeszutek Wilk wrote:
> >> Hello,
> >> 
> >> Per http://wiki.xenproject.org/wiki/Xen_Project_Maintenance_Releases:
> >> "Each stable branch has a maintainer who is nominated/volunteers
> >>according to the Maintainer Election
> >>  process described in the project governance document
> >>[http://www.xenproject.org/governance.html].
> >>  This will resulting in the MAINTAINERS file in the relevant branch
> >>being patched to include the maintainer."
> >> 
> >> For the past year or so Jan Beulich has been the stable tree maintainer.
> >> 
> >> Since Xen 4.5 has branched that opens up a new stable tree and we can
> >>also
> >> stop maintaining Xen 4.3 stable tree.
> >> 
> >> The nominations are open - please volunteer yourself. In case nobody
> >> volunteers I can also take the role.
> >> 
> >> I ask folks to finish voting/nominating by Jan 14th so that when Xen
> >>4.5 comes
> >> out we have an viable stable tree maintainer.
> >
> >I'm not sure how voting is supposed to proceed with multiple nominations
> >(and with the deadline for nominations apparently being the same as for
> >voting), 
> 
> Actually, it is questionable whether there are multiple nominations.
> Andrew said "If Jan wants a break, I would be happy to volunteer."

True, and Konrad said "if nobody else...".

Still, my +1 for Jan stands.

> I am also not convinced that we need an election, unless the existing
> maintainer wants to steps down. We never had one in the past. And we don't
> have an explicit nomination for Release Managers unless the existing RM
> steps down.
> 
> I can't find the mailing list discussion which led to
> http://wiki.xenproject.org/wiki/Xen_Project_Maintenance_Releases (the link
> in the change history seems to be wrong).

http://lists.xen.org/archives/html/xen-devel/2012-11/msg01391.html
perhaps?

>  Maybe we should just change the
> document to clarify that an election is only needed if the previous
> maintainer steps down, which is what I think the intention really was.
> 

Seems reasonable to me, presumably some existing mechanism (i.e. common
sense...) exists if the incumbent goes off the rails or disappears
without resigning etc.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] community develop/contributor call

2015-01-07 Thread Ian Campbell
Hi all,

For a few years now I've been running a monthly phone call between
various organisations which contribute to Xen, the so called "TCT
Call"[0]. However since then the Xen Project has moved to the Linux
foundation and the introduction of the Advisory Board means this call
isn't really useful in its current form any more.

Rather than just cancelling it I've been thinking of repurposing it into
a more development focused call for all contributors (maintainers, patch
submitters etc).

The idea would be to provide a regular slot where technical and
development topics can be discussed on an on demand basis without the
need to setup an ad-hoc call each time something comes up. In other
words to provide a forum for topics such as thrashing out a more complex
design for a new feature, building consensus on how to move forward with
a stalled patch series, discussing difficult freeze exception requests
during release periods and cases where on list discussion has stalled or
those involved feel the need to have a "live" chat about it for some
reason.

My intention would be the call would be targeted at maintainers and
people who are themselves working on the code/design, although anyone
will be welcome to dial in and take part.

The plan would be to post a call for agenda items to xen-devel a week
beforehand with the call being cancelled the day before if no topics
were proposed (I have no problem with the call only happening
occasionally, there's no point in having it if there are no topics, but
if it's useful a couple of times a year it's worth having IMHO). I
will try and reach out to anyone who should be involved with a proposed
topic to try and ensure that the call has the right people on it in
order to make progress. In the event that some of the participants feel
that a topic is still being usefully discussed on list and that
discussion should remain there then I'll mediate to either gain
consensus one way or another or unblock things some other way etc.

I would expect people decide based on the published agenda each month
whether they want/need to attend. I hope that at least the maintainers
would make an effort to attend if there were topics covering their areas
(and as above I will chase where needed).

In the short term I intend to stick with the same time (5PM UK, on the
second Wednesday of the month) and cadence (monthly) as before, but I
might revisit the timing, in particular if we find the proposed topics
make the time problematic for key people who want to be involved (e.g.
folks in the Far East, for whom 5PM UK is pretty antisocial).

Since that schedule puts a call a week today (14th Jan), I'll send out a
call for agenda later today.

Ian.

[0] http://wiki.xen.org/wiki/Xen_Technical_Coordination_Team_%28TCT%29_Meeting


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 5/5] tools: add total/local memory bandwith monitoring

2015-01-07 Thread Wei Liu
On Wed, Jan 07, 2015 at 07:12:05PM +0800, Chao Peng wrote:
[...]
> +static int libxl__psr_cmt_get_mem_bandwidth(libxl__gc *gc,
> +uint32_t domid,
> +xc_psr_cmt_type type,
> +uint32_t socketid,
> +uint32_t *bandwidth)
> +{
> +uint64_t sample1, sample2;
> +uint32_t upscaling_factor;
> +int retry_attempts = 0;
> +int rc;
> +
> +retry:
> +rc = libxl__psr_cmt_get_l3_monitoring_data(gc, domid, type, socketid,
> +   &sample1);
> +if (rc < 0)
> +return ERROR_FAIL;
> +
> +usleep(1);
> +
> +rc = libxl__psr_cmt_get_l3_monitoring_data(gc, domid, type, socketid,
> +   &sample2);
> +if (rc < 0)
> +   return ERROR_FAIL;
> +
> +if (sample2 < sample1) {
> + if (retry_attempts < MBM_SAMPLE_RETRY_MAX) {
> + retry_attempts++;
> + goto retry;
> + } else {
> + LOGE(ERROR, "event counter overflowed");
> + return ERROR_FAIL;
> + }
> +}
> +

Two things:

1. Use for/while for retry loop, don't use goto.
2. Use "goto out" idiom for error handling.

The "goto out" idiom is part of documented libxl error handling section
in tools/libxl/CODING_STYLE. You might find that document useful
reference.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] check-headers error in staging

2015-01-07 Thread Olaf Hering

After upgrade to current staging, my test packages fail to build:

[  289s] + make -j4 -k -C tools/include/xen-foreign
[  289s] make: Entering directory 
`/usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign'
[  289s] python mkheader.py arm32 arm32.h 
/usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/arch-arm.h
 
/usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/xen.h
[  289s] python mkheader.py arm64 arm64.h 
/usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/arch-arm.h
 
/usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/xen.h
[  289s] python mkheader.py x86_32 x86_32.h 
/usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/arch-x86/xen-x86_32.h
 
/usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/
[  289s] python mkheader.py x86_64 x86_64.h 
/usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/arch-x86/xen-x86_64.h
 
/usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/
[  289s] python mkchecker.py checker.c arm32 arm64 x86_32 x86_64
[  289s] gcc -Wall -Werror -Wstrict-prototypes -O2 -fomit-frame-pointer 
-fno-strict-aliasing -Wdeclaration-after-statement -o checker checker.c
[  289s] ./checker > tmp.size
[  289s] diff -u reference.size tmp.size
[  289s] --- reference.size 2015-01-07 10:28:57.0 +
[  289s] +++ tmp.size   2015-01-07 12:28:33.564911299 +
[  289s] @@ -9,6 +9,6 @@
[  289s]  arch_vcpu_info|   0   0  24  16
[  289s]  vcpu_time_info|  32  32  32  32
[  289s]  vcpu_info |  48  48  64  64
[  289s] -arch_shared_info  |   0   0 268 280
[  289s] -shared_info   |1088108825843368
[  289s] +arch_shared_info  |   0   0  24  48
[  289s] +shared_info   |1088108823403136
[  289s]
[  289s] make: *** [check-headers] Error 1


this was the last successful build:
xen_hg_changeset Mon Dec 15 17:40:12 2014 + hg: 30046:cefc36150538

this build fails:
xen_hg_changeset Wed Jan 07 11:28:57 2015 +0100 hg: 30084:88d114af72d8


So it looks like something between git commites dcd8486..dd94cac causes this 
failure.


Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Nominations for Xen 4.5 stable tree maintainer.

2015-01-07 Thread Lars Kurth
>>  Maybe we should just change the
>> document to clarify that an election is only needed if the previous
>> maintainer steps down, which is what I think the intention really was.
>
> Seems reasonable to me, presumably some existing mechanism (i.e. common
> sense...) exists if the incumbent goes off the rails or disappears
> without resigning etc.


Ian, 

thanks for digging out the link. Looking through the thread we said, the
consensus seems to have been to handle stable tree maintainers like
maintainers. "Each stable branch has a maintainer who is
nominated/volunteers according to the Maintainer Election process
described in the project governance document" doesn't imply a formal
election. The governance document states

* Nomination: A maintainer should nominate himself by proposing a patch to
the MAINTAINERS file or mailing a nomination to the project's mailing
list. Alternatively another maintainer may nominate a community member. A
nomination should explain the contributions of proposed maintainer to the
project as well as a scope (set of owned components). Where the case is
not obvious, evidence such as specific patches and other evidence
supporting the nomination should be cited.
* Confirmation: Normally, there is no need for a direct election to
confirm a new maintainer. Discussion should happen on the mailing list
using the principles of consensus decision making. If there is
disagreement or doubt, the project lead or a committer should ask the
community manager to arrange a more formal vote

So I would propose to replace

"Each stable branch has a maintainer who is nominated/volunteers according
to the Maintainer Election process described in the project governance
document. This will resulting in the MAINTAINERS file in the relevant
branch being patched to include the maintainer."

with


"Each stable branch has a maintainer who is nominated/volunteers according
to the Maintainer Election process described in the project governance
document. This means that the stable branch maintainer nominates himself
or is nominated by another maintainer on the mailing list or through a
patch to the MAINTAINERS file on the relevant branch. The principles of
consensus decision making are applied, unless there is disagreement, in
which case a formal election may be needed. The MAINTAINERS file in the
relevant branch will be patched to include the stable branch maintainer."

That would mean that we don't have to go through this discussion again for
4.6.

Lars


On 07/01/2015 12:17, "Ian Campbell"  wrote:

>On Wed, 2015-01-07 at 11:59 +, Lars Kurth wrote:
>> 
>> 
>> On 07/01/2015 11:26, "Ian Campbell"  wrote:
>> 
>> >On Tue, 2015-01-06 at 11:15 -0500, Konrad Rzeszutek Wilk wrote:
>> >> Hello,
>> >> 
>> >> Per http://wiki.xenproject.org/wiki/Xen_Project_Maintenance_Releases:
>> >> "Each stable branch has a maintainer who is nominated/volunteers
>> >>according to the Maintainer Election
>> >>  process described in the project governance document
>> >>[http://www.xenproject.org/governance.html].
>> >>  This will resulting in the MAINTAINERS file in the relevant branch
>> >>being patched to include the maintainer."
>> >> 
>> >> For the past year or so Jan Beulich has been the stable tree
>>maintainer.
>> >> 
>> >> Since Xen 4.5 has branched that opens up a new stable tree and we can
>> >>also
>> >> stop maintaining Xen 4.3 stable tree.
>> >> 
>> >> The nominations are open - please volunteer yourself. In case nobody
>> >> volunteers I can also take the role.
>> >> 
>> >> I ask folks to finish voting/nominating by Jan 14th so that when Xen
>> >>4.5 comes
>> >> out we have an viable stable tree maintainer.
>> >
>> >I'm not sure how voting is supposed to proceed with multiple
>>nominations
>> >(and with the deadline for nominations apparently being the same as for
>> >voting), 
>> 
>> Actually, it is questionable whether there are multiple nominations.
>> Andrew said "If Jan wants a break, I would be happy to volunteer."
>
>True, and Konrad said "if nobody else...".
>
>Still, my +1 for Jan stands.
>
>> I am also not convinced that we need an election, unless the existing
>> maintainer wants to steps down. We never had one in the past. And we
>>don't
>> have an explicit nomination for Release Managers unless the existing RM
>> steps down.
>> 
>> I can't find the mailing list discussion which led to
>> http://wiki.xenproject.org/wiki/Xen_Project_Maintenance_Releases (the
>>link
>> in the change history seems to be wrong).
>
>http://lists.xen.org/archives/html/xen-devel/2012-11/msg01391.html
>perhaps?
>
>>  Maybe we should just change the
>> document to clarify that an election is only needed if the previous
>> maintainer steps down, which is what I think the intention really was.
>> 
>
>Seems reasonable to me, presumably some existing mechanism (i.e. common
>sense...) exists if the incumbent goes off the rails or disappears
>without resigning etc.
>

___
Xen-devel mailing

Re: [Xen-devel] check-headers error in staging

2015-01-07 Thread Andrew Cooper
On 07/01/15 12:37, Olaf Hering wrote:
> After upgrade to current staging, my test packages fail to build:
>
> [  289s] + make -j4 -k -C tools/include/xen-foreign
> [  289s] make: Entering directory 
> `/usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign'
> [  289s] python mkheader.py arm32 arm32.h 
> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/arch-arm.h
>  
> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/xen.h
> [  289s] python mkheader.py arm64 arm64.h 
> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/arch-arm.h
>  
> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/xen.h
> [  289s] python mkheader.py x86_32 x86_32.h 
> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/arch-x86/xen-x86_32.h
>  
> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/
> [  289s] python mkheader.py x86_64 x86_64.h 
> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/arch-x86/xen-x86_64.h
>  
> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/
> [  289s] python mkchecker.py checker.c arm32 arm64 x86_32 x86_64
> [  289s] gcc -Wall -Werror -Wstrict-prototypes -O2 -fomit-frame-pointer 
> -fno-strict-aliasing -Wdeclaration-after-statement -o checker checker.c
> [  289s] ./checker > tmp.size
> [  289s] diff -u reference.size tmp.size
> [  289s] --- reference.size 2015-01-07 10:28:57.0 +
> [  289s] +++ tmp.size   2015-01-07 12:28:33.564911299 +
> [  289s] @@ -9,6 +9,6 @@
> [  289s]  arch_vcpu_info|   0   0  24  16
> [  289s]  vcpu_time_info|  32  32  32  32
> [  289s]  vcpu_info |  48  48  64  64
> [  289s] -arch_shared_info  |   0   0 268 280
> [  289s] -shared_info   |1088108825843368
> [  289s] +arch_shared_info  |   0   0  24  48
> [  289s] +shared_info   |1088108823403136
> [  289s]
> [  289s] make: *** [check-headers] Error 1
>
>
> this was the last successful build:
> xen_hg_changeset Mon Dec 15 17:40:12 2014 + hg: 30046:cefc36150538
>
> this build fails:
> xen_hg_changeset Wed Jan 07 11:28:57 2015 +0100 hg: 30084:88d114af72d8
>
>
> So it looks like something between git commites dcd8486..dd94cac causes this 
> failure.

It will be c/s dac6d3b1a "expand x86 arch_shared_info to support linear
p2m list"

There is a shrink to the size of arch_shared_info, but was argued and
accepted as safe to do.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] check-headers error in staging

2015-01-07 Thread Jan Beulich
>>> On 07.01.15 at 13:45,  wrote:
> On 07/01/15 12:37, Olaf Hering wrote:
>> After upgrade to current staging, my test packages fail to build:
>>
>> [  289s] + make -j4 -k -C tools/include/xen-foreign
>> [  289s] make: Entering directory 
> `/usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign'
>> [  289s] python mkheader.py arm32 arm32.h 
> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
> ./../xen/include/public/arch-arm.h 
> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/..
> /../../xen/include/public/xen.h
>> [  289s] python mkheader.py arm64 arm64.h 
> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
> ./../xen/include/public/arch-arm.h 
> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/..
> /../../xen/include/public/xen.h
>> [  289s] python mkheader.py x86_32 x86_32.h 
> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
> ./../xen/include/public/arch-x86/xen-x86_32.h 
> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
> ./../xen/include/
>> [  289s] python mkheader.py x86_64 x86_64.h 
> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
> ./../xen/include/public/arch-x86/xen-x86_64.h 
> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
> ./../xen/include/
>> [  289s] python mkchecker.py checker.c arm32 arm64 x86_32 x86_64
>> [  289s] gcc -Wall -Werror -Wstrict-prototypes -O2 -fomit-frame-pointer 
> -fno-strict-aliasing -Wdeclaration-after-statement -o checker checker.c
>> [  289s] ./checker > tmp.size
>> [  289s] diff -u reference.size tmp.size
>> [  289s] --- reference.size 2015-01-07 10:28:57.0 +
>> [  289s] +++ tmp.size   2015-01-07 12:28:33.564911299 +
>> [  289s] @@ -9,6 +9,6 @@
>> [  289s]  arch_vcpu_info|   0   0  24  16
>> [  289s]  vcpu_time_info|  32  32  32  32
>> [  289s]  vcpu_info |  48  48  64  64
>> [  289s] -arch_shared_info  |   0   0 268 280
>> [  289s] -shared_info   |1088108825843368
>> [  289s] +arch_shared_info  |   0   0  24  48
>> [  289s] +shared_info   |1088108823403136
>> [  289s]
>> [  289s] make: *** [check-headers] Error 1
>>
>>
>> this was the last successful build:
>> xen_hg_changeset Mon Dec 15 17:40:12 2014 + hg: 30046:cefc36150538
>>
>> this build fails:
>> xen_hg_changeset Wed Jan 07 11:28:57 2015 +0100 hg: 30084:88d114af72d8
>>
>>
>> So it looks like something between git commites dcd8486..dd94cac causes this 
> failure.
> 
> It will be c/s dac6d3b1a "expand x86 arch_shared_info to support linear
> p2m list"
> 
> There is a shrink to the size of arch_shared_info, but was argued and
> accepted as safe to do.

So it looks like I should revert that one then, as it'll cause an
unconditional build failure in the tools part of the build afaict. Plus
it's not clear how to properly express the now variable size to the
checking logic, i.e. resolving the issue may take some time.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 08/12] xen/grant-table: add a mechanism to safely unmap pages that are in use

2015-01-07 Thread David Vrabel
On 07/01/15 12:00, Ian Campbell wrote:
> On Tue, 2015-01-06 at 18:57 +, David Vrabel wrote:
>> From: Jenny Herbert 
>>
>> Introduce gnttab_unmap_refs_async() that can be used to safely unmap
>> pages that may be in use (ref count > 1).  If the pages are in use the
>> unmap is deferred and retried later.  This polling is not very clever
>> but it should be good enough if the cases where the delay is necessary
>> are rare.
>>
>> This is needed to allow block backends using grant mapping to safely
>> use network storage (block or filesystem based such as iSCSI or NFS).
>>
>> The network storage driver may complete a block request whilst there
>> is a queued network packet retry (because the ack from the remote end
>> races with deciding to queue the retry).  The pages for the retried
>> packet would be grant unmapped and the network driver (or hardware)
>> would access the unmapped page.
> 
> I thought this had been solved a little while ago by mapping a scratch
> page on unmap even for kernel space grant mappings, but both the design
> doc and here imply not (i.e. the scratch is for user grant mappings
> only), so I must be misremembering.

It was only for user grant mappings and it did not fix the case where
the page being unmapped was currently dma mapped.  This could have
resulted in the NIC transmitting sensitive data.

e.g.,

1. iscsi queues a retransmit with page P (frame F).
2. NIC driver DMA maps and queues frame F on h/w.
3. iscsi completes the I/O.
4. page P is unmapped.
5. response is sent to guest
6. guest reuses frame F.
7. NIC transmits frame F.

We don't use this safe unmap mechanism for netback because the zero copy
stuff means we don't need it and the polling on the unmap is high
latency and only good enough if the polling is needed very rarely.

David

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 11/12] xen/gntdev: mark userspace PTEs as special on x86 PV guests

2015-01-07 Thread David Vrabel
On 07/01/15 12:11, Ian Campbell wrote:
> On Tue, 2015-01-06 at 18:57 +, David Vrabel wrote:
>> In an x86 PV guest, get_user_pages_fast() on a userspace address range
>> containing foreign mappings does not work correctly because the M2P
>> lookup of the MFN from a userspace PTE may return the wrong page.
>>
>> Force get_user_pages_fast() to fail on such addresses by marking the PTEs
>> as special.
>>
>> If Xen has XENFEAT_gnttab_map_avail_bits (available since at least
>> 4.0),
> 
> http://wiki.xenproject.org/wiki/Xen_Kernel_Feature_Matrix says the dom0
> pvpops already requires >= 4.0 too, which matches my recollection
> (something to do with a new APIC interface which upstream insisted on
> during upstreaming, IIRC), but both could be out of date...

gntdev is usable by driver domains and useful for inter-domain comms so
it isn't limited to dom0 use only and Linux still needs to run on Xen
3.2 (I think that's the oldest still available on AWS).

Because of the m2p override limitation, gntdev is currently unsafe[1] to
use by untrusted userspace apps so there's no (new) security issues here.

However, we could disable gntdev if this feature is messing unless
overridden by a module option.  Opinions on this?

David

[1] mapping a ref twice or a two refs for the same frame can corrupt
kernel state is various exciting ways because of messed up page ref counts.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 08/12] xen/grant-table: add a mechanism to safely unmap pages that are in use

2015-01-07 Thread Ian Campbell
On Wed, 2015-01-07 at 13:07 +, David Vrabel wrote:
> On 07/01/15 12:00, Ian Campbell wrote:
> > On Tue, 2015-01-06 at 18:57 +, David Vrabel wrote:
> >> From: Jenny Herbert 
> >>
> >> Introduce gnttab_unmap_refs_async() that can be used to safely unmap
> >> pages that may be in use (ref count > 1).  If the pages are in use the
> >> unmap is deferred and retried later.  This polling is not very clever
> >> but it should be good enough if the cases where the delay is necessary
> >> are rare.
> >>
> >> This is needed to allow block backends using grant mapping to safely
> >> use network storage (block or filesystem based such as iSCSI or NFS).
> >>
> >> The network storage driver may complete a block request whilst there
> >> is a queued network packet retry (because the ack from the remote end
> >> races with deciding to queue the retry).  The pages for the retried
> >> packet would be grant unmapped and the network driver (or hardware)
> >> would access the unmapped page.
> > 
> > I thought this had been solved a little while ago by mapping a scratch
> > page on unmap even for kernel space grant mappings, but both the design
> > doc and here imply not (i.e. the scratch is for user grant mappings
> > only), so I must be misremembering.
> 
> It was only for user grant mappings and it did not fix the case where
> the page being unmapped was currently dma mapped.  This could have
> resulted in the NIC transmitting sensitive data.
> 
> e.g.,
> 
> 1. iscsi queues a retransmit with page P (frame F).
> 2. NIC driver DMA maps and queues frame F on h/w.
> 3. iscsi completes the I/O.
> 4. page P is unmapped.
> 5. response is sent to guest
> 6. guest reuses frame F.
> 7. NIC transmits frame F.

You don't actually need Xen to expose this sort of thing, any userspace
which reuses a buffer after write() (to e.g. NFS has completed) might
expose the new data on the wire in a retransmit.

It's certainly much easier to expose with Xen, I'll grant (no pun
intended) you that.

> We don't use this safe unmap mechanism for netback because the zero copy
> stuff means we don't need it and the polling on the unmap is high
> latency and only good enough if the polling is needed very rarely.

I'd think it should be rare for netback too unless your network is made
of wet string.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 08/12] xen/grant-table: add a mechanism to safely unmap pages that are in use

2015-01-07 Thread David Vrabel
On 07/01/15 13:24, Ian Campbell wrote:
> On Wed, 2015-01-07 at 13:07 +, David Vrabel wrote:
> 
>> We don't use this safe unmap mechanism for netback because the zero copy
>> stuff means we don't need it and the polling on the unmap is high
>> latency and only good enough if the polling is needed very rarely.
> 
> I'd think it should be rare for netback too unless your network is made
> of wet string.

Hmmm.  Yes, this might be worth looking at but I would prefer to do it
to it as a follow up to this series.

David

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 08/12] xen/grant-table: add a mechanism to safely unmap pages that are in use

2015-01-07 Thread Ian Campbell
On Wed, 2015-01-07 at 13:30 +, David Vrabel wrote:
> On 07/01/15 13:24, Ian Campbell wrote:
> > On Wed, 2015-01-07 at 13:07 +, David Vrabel wrote:
> > 
> >> We don't use this safe unmap mechanism for netback because the zero copy
> >> stuff means we don't need it and the polling on the unmap is high
> >> latency and only good enough if the polling is needed very rarely.
> > 
> > I'd think it should be rare for netback too unless your network is made
> > of wet string.
> 
> Hmmm.  Yes, this might be worth looking at but I would prefer to do it
> to it as a follow up to this series.

Sure.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 11/12] xen/gntdev: mark userspace PTEs as special on x86 PV guests

2015-01-07 Thread Ian Campbell
On Wed, 2015-01-07 at 13:23 +, David Vrabel wrote:
> On 07/01/15 12:11, Ian Campbell wrote:
> > On Tue, 2015-01-06 at 18:57 +, David Vrabel wrote:
> >> In an x86 PV guest, get_user_pages_fast() on a userspace address range
> >> containing foreign mappings does not work correctly because the M2P
> >> lookup of the MFN from a userspace PTE may return the wrong page.
> >>
> >> Force get_user_pages_fast() to fail on such addresses by marking the PTEs
> >> as special.
> >>
> >> If Xen has XENFEAT_gnttab_map_avail_bits (available since at least
> >> 4.0),
> > 
> > http://wiki.xenproject.org/wiki/Xen_Kernel_Feature_Matrix says the dom0
> > pvpops already requires >= 4.0 too, which matches my recollection
> > (something to do with a new APIC interface which upstream insisted on
> > during upstreaming, IIRC), but both could be out of date...
> 
> gntdev is usable by driver domains and useful for inter-domain comms so
> it isn't limited to dom0 use only and Linux still needs to run on Xen
> 3.2 (I think that's the oldest still available on AWS).

Ah yes, driver domains...

> Because of the m2p override limitation, gntdev is currently unsafe[1] to
> use by untrusted userspace apps so there's no (new) security issues here.
> 
> However, we could disable gntdev if this feature is messing unless
> overridden by a module option.  Opinions on this?

If it is exploitable by untrusted apps in the new form (the race between
mmap and the pte update still is, right?), then that might be best, or
only allow root to open it?

> 
> David
> 
> [1] mapping a ref twice or a two refs for the same frame can corrupt
> kernel state is various exciting ways because of messed up page ref counts.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] check-headers error in staging

2015-01-07 Thread Ian Campbell
On Wed, 2015-01-07 at 13:05 +, Jan Beulich wrote:
> >>> On 07.01.15 at 13:45,  wrote:
> > On 07/01/15 12:37, Olaf Hering wrote:
> >> After upgrade to current staging, my test packages fail to build:
> >>
> >> [  289s] + make -j4 -k -C tools/include/xen-foreign
> >> [  289s] make: Entering directory 
> > `/usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign'
> >> [  289s] python mkheader.py arm32 arm32.h 
> > /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
> > ./../xen/include/public/arch-arm.h 
> > /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/..
> > /../../xen/include/public/xen.h
> >> [  289s] python mkheader.py arm64 arm64.h 
> > /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
> > ./../xen/include/public/arch-arm.h 
> > /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/..
> > /../../xen/include/public/xen.h
> >> [  289s] python mkheader.py x86_32 x86_32.h 
> > /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
> > ./../xen/include/public/arch-x86/xen-x86_32.h 
> > /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
> > ./../xen/include/
> >> [  289s] python mkheader.py x86_64 x86_64.h 
> > /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
> > ./../xen/include/public/arch-x86/xen-x86_64.h 
> > /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
> > ./../xen/include/
> >> [  289s] python mkchecker.py checker.c arm32 arm64 x86_32 x86_64
> >> [  289s] gcc -Wall -Werror -Wstrict-prototypes -O2 -fomit-frame-pointer 
> > -fno-strict-aliasing -Wdeclaration-after-statement -o checker checker.c
> >> [  289s] ./checker > tmp.size
> >> [  289s] diff -u reference.size tmp.size
> >> [  289s] --- reference.size 2015-01-07 10:28:57.0 +
> >> [  289s] +++ tmp.size   2015-01-07 12:28:33.564911299 +
> >> [  289s] @@ -9,6 +9,6 @@
> >> [  289s]  arch_vcpu_info|   0   0  24  16
> >> [  289s]  vcpu_time_info|  32  32  32  32
> >> [  289s]  vcpu_info |  48  48  64  64
> >> [  289s] -arch_shared_info  |   0   0 268 280
> >> [  289s] -shared_info   |1088108825843368
> >> [  289s] +arch_shared_info  |   0   0  24  48
> >> [  289s] +shared_info   |1088108823403136
> >> [  289s]
> >> [  289s] make: *** [check-headers] Error 1
> >>
> >>
> >> this was the last successful build:
> >> xen_hg_changeset Mon Dec 15 17:40:12 2014 + hg: 30046:cefc36150538
> >>
> >> this build fails:
> >> xen_hg_changeset Wed Jan 07 11:28:57 2015 +0100 hg: 30084:88d114af72d8
> >>
> >>
> >> So it looks like something between git commites dcd8486..dd94cac causes 
> >> this 
> > failure.
> > 
> > It will be c/s dac6d3b1a "expand x86 arch_shared_info to support linear
> > p2m list"
> > 
> > There is a shrink to the size of arch_shared_info, but was argued and
> > accepted as safe to do.
> 
> So it looks like I should revert that one then, as it'll cause an
> unconditional build failure in the tools part of the build afaict. Plus
> it's not clear how to properly express the now variable size to the
> checking logic, i.e. resolving the issue may take some time.

It's not sufficient just to reinstate the appropriate amount of padding?



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] check-headers error in staging

2015-01-07 Thread Andrew Cooper
On 07/01/15 13:36, Ian Campbell wrote:
> On Wed, 2015-01-07 at 13:05 +, Jan Beulich wrote:
> On 07.01.15 at 13:45,  wrote:
>>> On 07/01/15 12:37, Olaf Hering wrote:
 After upgrade to current staging, my test packages fail to build:

 [  289s] + make -j4 -k -C tools/include/xen-foreign
 [  289s] make: Entering directory 
>>> `/usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign'
 [  289s] python mkheader.py arm32 arm32.h 
>>> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
>>> ./../xen/include/public/arch-arm.h 
>>> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/..
>>> /../../xen/include/public/xen.h
 [  289s] python mkheader.py arm64 arm64.h 
>>> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
>>> ./../xen/include/public/arch-arm.h 
>>> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/..
>>> /../../xen/include/public/xen.h
 [  289s] python mkheader.py x86_32 x86_32.h 
>>> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
>>> ./../xen/include/public/arch-x86/xen-x86_32.h 
>>> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
>>> ./../xen/include/
 [  289s] python mkheader.py x86_64 x86_64.h 
>>> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
>>> ./../xen/include/public/arch-x86/xen-x86_64.h 
>>> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
>>> ./../xen/include/
 [  289s] python mkchecker.py checker.c arm32 arm64 x86_32 x86_64
 [  289s] gcc -Wall -Werror -Wstrict-prototypes -O2 -fomit-frame-pointer 
>>> -fno-strict-aliasing -Wdeclaration-after-statement -o checker checker.c
 [  289s] ./checker > tmp.size
 [  289s] diff -u reference.size tmp.size
 [  289s] --- reference.size 2015-01-07 10:28:57.0 +
 [  289s] +++ tmp.size   2015-01-07 12:28:33.564911299 +
 [  289s] @@ -9,6 +9,6 @@
 [  289s]  arch_vcpu_info|   0   0  24  16
 [  289s]  vcpu_time_info|  32  32  32  32
 [  289s]  vcpu_info |  48  48  64  64
 [  289s] -arch_shared_info  |   0   0 268 280
 [  289s] -shared_info   |1088108825843368
 [  289s] +arch_shared_info  |   0   0  24  48
 [  289s] +shared_info   |1088108823403136
 [  289s]
 [  289s] make: *** [check-headers] Error 1


 this was the last successful build:
 xen_hg_changeset Mon Dec 15 17:40:12 2014 + hg: 30046:cefc36150538

 this build fails:
 xen_hg_changeset Wed Jan 07 11:28:57 2015 +0100 hg: 30084:88d114af72d8


 So it looks like something between git commites dcd8486..dd94cac causes 
 this 
>>> failure.
>>>
>>> It will be c/s dac6d3b1a "expand x86 arch_shared_info to support linear
>>> p2m list"
>>>
>>> There is a shrink to the size of arch_shared_info, but was argued and
>>> accepted as safe to do.
>> So it looks like I should revert that one then, as it'll cause an
>> unconditional build failure in the tools part of the build afaict. Plus
>> it's not clear how to properly express the now variable size to the
>> checking logic, i.e. resolving the issue may take some time.
> It's not sufficient just to reinstate the appropriate amount of padding?
>
>

The padding was previously in terms of uint64_t's, whereas the new
amount of padding is $N - (3 * unsigned long) which is arch dependent
and awkward to cleanly express.  We discussed that, as it was the last
field all zeroes anyway, with the backing page also being zeroed,
dropping the padding in its entirety was safe.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] check-headers error in staging

2015-01-07 Thread Ian Campbell
On Wed, 2015-01-07 at 13:39 +, Andrew Cooper wrote:
> On 07/01/15 13:36, Ian Campbell wrote:
> > On Wed, 2015-01-07 at 13:05 +, Jan Beulich wrote:
> > On 07.01.15 at 13:45,  wrote:
> >>> On 07/01/15 12:37, Olaf Hering wrote:
>  After upgrade to current staging, my test packages fail to build:
> 
>  [  289s] + make -j4 -k -C tools/include/xen-foreign
>  [  289s] make: Entering directory 
> >>> `/usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign'
>  [  289s] python mkheader.py arm32 arm32.h 
> >>> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
> >>> ./../xen/include/public/arch-arm.h 
> >>> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/..
> >>> /../../xen/include/public/xen.h
>  [  289s] python mkheader.py arm64 arm64.h 
> >>> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
> >>> ./../xen/include/public/arch-arm.h 
> >>> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/..
> >>> /../../xen/include/public/xen.h
>  [  289s] python mkheader.py x86_32 x86_32.h 
> >>> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
> >>> ./../xen/include/public/arch-x86/xen-x86_32.h 
> >>> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
> >>> ./../xen/include/
>  [  289s] python mkheader.py x86_64 x86_64.h 
> >>> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
> >>> ./../xen/include/public/arch-x86/xen-x86_64.h 
> >>> /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../.
> >>> ./../xen/include/
>  [  289s] python mkchecker.py checker.c arm32 arm64 x86_32 x86_64
>  [  289s] gcc -Wall -Werror -Wstrict-prototypes -O2 -fomit-frame-pointer 
> >>> -fno-strict-aliasing -Wdeclaration-after-statement -o checker checker.c
>  [  289s] ./checker > tmp.size
>  [  289s] diff -u reference.size tmp.size
>  [  289s] --- reference.size 2015-01-07 10:28:57.0 +
>  [  289s] +++ tmp.size   2015-01-07 12:28:33.564911299 +
>  [  289s] @@ -9,6 +9,6 @@
>  [  289s]  arch_vcpu_info|   0   0  24  16
>  [  289s]  vcpu_time_info|  32  32  32  32
>  [  289s]  vcpu_info |  48  48  64  64
>  [  289s] -arch_shared_info  |   0   0 268 280
>  [  289s] -shared_info   |1088108825843368
>  [  289s] +arch_shared_info  |   0   0  24  48
>  [  289s] +shared_info   |1088108823403136
>  [  289s]
>  [  289s] make: *** [check-headers] Error 1
> 
> 
>  this was the last successful build:
>  xen_hg_changeset Mon Dec 15 17:40:12 2014 + hg: 30046:cefc36150538
> 
>  this build fails:
>  xen_hg_changeset Wed Jan 07 11:28:57 2015 +0100 hg: 30084:88d114af72d8
> 
> 
>  So it looks like something between git commites dcd8486..dd94cac causes 
>  this 
> >>> failure.
> >>>
> >>> It will be c/s dac6d3b1a "expand x86 arch_shared_info to support linear
> >>> p2m list"
> >>>
> >>> There is a shrink to the size of arch_shared_info, but was argued and
> >>> accepted as safe to do.
> >> So it looks like I should revert that one then, as it'll cause an
> >> unconditional build failure in the tools part of the build afaict. Plus
> >> it's not clear how to properly express the now variable size to the
> >> checking logic, i.e. resolving the issue may take some time.
> > It's not sufficient just to reinstate the appropriate amount of padding?
> >
> >
> 
> The padding was previously in terms of uint64_t's, whereas the new
> amount of padding is $N - (3 * unsigned long) which is arch dependent
> and awkward to cleanly express.  We discussed that, as it was the last
> field all zeroes anyway, with the backing page also being zeroed,
> dropping the padding in its entirety was safe.

I think this could be fixed by updating reference.size, it already has
separate columns for x86_32 and x86_64 (I'd originally thought Jan meant
variable as in dynamic, rather than just different between subarches).
In fact they appear to have been different in size before anyway, they
are just differently different now...

Otherwise either ifdef the padding or make the new fields uint64_t,
neither seems all that intolerable.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [linux-3.10 test] 33190: regressions - trouble: blocked/broken/fail/pass

2015-01-07 Thread xen . org
flight 33190 linux-3.10 real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/33190/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-winxpsp3  7 windows-install fail REGR. vs. 26303
 build-armhf-libvirt   5 libvirt-buildfail in 33120 REGR. vs. 26303
 build-i386-libvirt5 libvirt-buildfail in 33120 REGR. vs. 26303
 build-amd64-libvirt   5 libvirt-buildfail in 33120 REGR. vs. 26303

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl   3 host-install(3)   broken pass in 33120

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-rumpuserxen-amd64 11 rumpuserxen-demo-xenstorels/xenstorels 
fail blocked in 26303
 build-armhf-libvirt   3 host-install(3)broken blocked in 26303
 build-i386-libvirt3 host-install(3)broken blocked in 26303
 build-amd64-libvirt   3 host-install(3)broken blocked in 26303
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 10 guest-localmigrate fail blocked 
in 26303
 test-amd64-i386-pair17 guest-migrate/src_host/dst_host fail like 26303
 test-amd64-amd64-xl-winxpsp3  7 windows-install  fail   like 26303

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel  9 guest-start  fail  never pass
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-amd   9 guest-start  fail   never pass
 test-amd64-amd64-xl-pcipt-intel  9 guest-start fail never pass
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3 14 guest-stop   fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop   fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 14 guest-stop  fail never pass
 test-amd64-i386-xl-winxpsp3  14 guest-stop   fail   never pass
 test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop fail never pass
 test-amd64-amd64-xl-win7-amd64 14 guest-stop   fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-stop fail never pass
 test-amd64-i386-xl-qemut-winxpsp3 14 guest-stopfail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-i386-xl-win7-amd64 14 guest-stop   fail  never pass
 test-amd64-i386-xl-qemuu-winxpsp3 14 guest-stopfail never pass
 test-armhf-armhf-xl   5 xen-boot  fail in 33120 never pass

version targeted for testing:
 linuxa472efc75989c7092187fe00f0400e02c495c436
baseline version:
 linuxbe67db109090b17b56eb8eb2190cd70700f107aa


816 people touched revisions under test,
not listing them all


jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  broken  
 build-armhf-libvirt  broken  
 build-i386-libvirt   broken  
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  broken  
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-rhel6hvm-amd pass
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64pass
 test-amd64-i386-xl-qemut-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64fail
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64   

[Xen-devel] [PATCH] x86/xen: Free bootmem in free_p2m_page() during early boot

2015-01-07 Thread Boris Ostrovsky
With recent changes in p2m we now have legitimate cases when
p2m memory needs to be freed during early boot (i.e. before
slab is initialized).

Signed-off-by: Boris Ostrovsky 
---
 arch/x86/xen/p2m.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/xen/p2m.c b/arch/x86/xen/p2m.c
index edbc7a6..df8acb4 100644
--- a/arch/x86/xen/p2m.c
+++ b/arch/x86/xen/p2m.c
@@ -167,10 +167,13 @@ static void * __ref alloc_p2m_page(void)
return (void *)__get_free_page(GFP_KERNEL | __GFP_REPEAT);
 }
 
-/* Only to be called in case of a race for a page just allocated! */
 static void free_p2m_page(void *p)
 {
-   BUG_ON(!slab_is_available());
+   if (unlikely(!slab_is_available())) {
+   free_bootmem((unsigned long)p, PAGE_SIZE);
+   return;
+   }
+
free_page((unsigned long)p);
 }
 
-- 
1.9.3


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 4/4] libxl: Add interface for querying hypervisor about PCI topology

2015-01-07 Thread Boris Ostrovsky

On 01/07/2015 04:04 AM, Dario Faggioli wrote:

On Mon, 2015-01-05 at 21:18 -0500, Boris Ostrovsky wrote:

.. and use this new interface to display it along with CPU topology
and NUMA information when 'xl info -n' command is issued


Also, can we see how an `xl info -n' looks, on an IONUMA system?


@@ -195,6 +195,24 @@ int xc_cputopoinfo(xc_interface *xch,
  return 0;
  }
  
+int xc_pcitopoinfo(xc_interface *xch,

+   xc_pcitopoinfo_t *put_info)
+{
+int ret;
+DECLARE_SYSCTL;
+
+sysctl.cmd = XEN_SYSCTL_pcitopoinfo;
+
+memcpy(&sysctl.u.pcitopoinfo, put_info, sizeof(*put_info));
+
+if ( (ret = do_sysctl(xch, &sysctl)) != 0 )
+return ret;
+
+memcpy(put_info, &sysctl.u.pcitopoinfo, sizeof(*put_info));
+
+return 0;
+}
@@ -5121,6 +5121,64 @@ libxl_cputopology *libxl_get_cpu_topology(libxl_ctx 
*ctx, int *nb_cpu_out)
  return ret;
  }
  
+libxl_pcitopology *libxl_get_pci_topology(libxl_ctx *ctx, int *num_devs)

+{
+GC_INIT(ctx);
+xc_pcitopoinfo_t tinfo;
+DECLARE_HYPERCALL_BUFFER(xen_sysctl_pcitopo_t, pcitopo);
+libxl_pcitopology *ret = NULL;
+int i, rc;
+

I see from where this comes from. However, at least from new functions,
I think we should avoid using things like DECLARE_HYPERCALL_BUFFER etc.,
in libxl. They belong in libxc, IMO.

This is basically what Andrew was doing here (although that was on
xc_{topology,numa}info:

http://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=shortlog;h=refs/heads/hwloc-support-experimental
http://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=blobdiff;f=tools/libxc/xc_misc.c;h=4f672cead5a4d2b1fc23bcddb6fb49e4d6ec72b5;hp=330345459176961cacc7fda506e843831fe7156a;hb=3585994405b6a73c137309dd4be91f48c71e4903;hpb=9a80d5056766535ac624774b96495f8b97b1d28b

Basically, what I'm suggesting is to have xc_pcitopoinfo() to do most of
the work, with libxl_get_pci_topology being just a wrapper to it.



Maybe then I should update libxl_get_cpu_topology() and 
libxl_get_numainfo() as well for code uniformity.


-boris




In fact, if you grep/cscope for things like DECLARE_HYPERCALL_BUFFER in
tools/libxl, the *only* results are these:

libxl.c libxl_get_cpu_topology  5076 
DECLARE_HYPERCALL_BUFFER(xc_cpu_to_core_t, coremap);
libxl.c libxl_get_cpu_topology  5077 
DECLARE_HYPERCALL_BUFFER(xc_cpu_to_socket_t, socketmap);
libxl.c libxl_get_cpu_topology  5078 
DECLARE_HYPERCALL_BUFFER(xc_cpu_to_node_t, nodemap);
libxl.c libxl_get_numainfo  5142 
DECLARE_HYPERCALL_BUFFER(xc_node_to_memsize_t, memsize);
libxl.c libxl_get_numainfo  5143 
DECLARE_HYPERCALL_BUFFER(xc_node_to_memfree_t, memfree);
libxl.c libxl_get_numainfo  5144 
DECLARE_HYPERCALL_BUFFER(uint32_t, node_dists);

And I think we should work toward removing these too, rather than adding
new ones! :-)

Regards,
Dario




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/xen: Free bootmem in free_p2m_page() during early boot

2015-01-07 Thread Juergen Gross

On 01/07/2015 03:08 PM, Boris Ostrovsky wrote:

With recent changes in p2m we now have legitimate cases when
p2m memory needs to be freed during early boot (i.e. before
slab is initialized).

Signed-off-by: Boris Ostrovsky 


Reviewed-by: Juergen Gross 


---
  arch/x86/xen/p2m.c | 7 +--
  1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/xen/p2m.c b/arch/x86/xen/p2m.c
index edbc7a6..df8acb4 100644
--- a/arch/x86/xen/p2m.c
+++ b/arch/x86/xen/p2m.c
@@ -167,10 +167,13 @@ static void * __ref alloc_p2m_page(void)
return (void *)__get_free_page(GFP_KERNEL | __GFP_REPEAT);
  }

-/* Only to be called in case of a race for a page just allocated! */
  static void free_p2m_page(void *p)
  {
-   BUG_ON(!slab_is_available());
+   if (unlikely(!slab_is_available())) {
+   free_bootmem((unsigned long)p, PAGE_SIZE);
+   return;
+   }
+
free_page((unsigned long)p);
  }





___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/4] pci: Do not ignore device's PXM information

2015-01-07 Thread Boris Ostrovsky

On 01/07/2015 04:06 AM, Jan Beulich wrote:

On 06.01.15 at 03:18,  wrote:

@@ -618,7 +620,22 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) 
arg)
  }
  else
  pdev_info.is_virtfn = 0;
-ret = pci_add_device(add.seg, add.bus, add.devfn, &pdev_info);
+
+if ( add.flags & XEN_PCI_DEV_PXM )
+{
+uint32_t pxm;
+int optarr_off = offsetof(struct physdev_pci_device_add, optarr) /

unsigned int or size_t.


--- a/xen/include/xen/pci.h
+++ b/xen/include/xen/pci.h
@@ -56,6 +56,8 @@ struct pci_dev {
  
  u8 phantom_stride;
  
+int node; /* NUMA node */

I thought I asked about this on v1 already: Does this really need to be
an int, when commonly node numbers are stored in u8/unsigned char?
Shrinking the field size would prevent the structure size from growing...



I kept this field as an int to be able to store NUMA_NO_NODE which I 
thought to be (int)-1.


But now I see that NUMA_NO_NODE is, in fact, 0xff but is promoted to 
(int)-1 by pxm_to_node(). Given that there is a number of tests for 
NUMA_NO_NODE and not for (int)-1, should we then make pxm_to_node() 
return u8 as well?




Of course an additional question would be whether the node wouldn't
better go into struct arch_pci_dev - that depends on whether we
expect ARM to be using NUMA...



Since we have CPU topology in common code I thought this would be 
arch-independent as well.


-boris

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/4] sysctl: Make XEN_SYSCTL_topologyinfo sysctl a little more efficient

2015-01-07 Thread Boris Ostrovsky

On 01/07/2015 04:12 AM, Jan Beulich wrote:

On 06.01.15 at 14:41,  wrote:

On 06/01/15 02:18, Boris Ostrovsky wrote:

Instead of copying data for each field in xen_sysctl_topologyinfo separately
put cpu/socket/node into a single structure and do a single copy for each
processor.

There is also no need to copy whole op to user at the end, max_cpu_index is
sufficient

Rename xen_sysctl_topologyinfo and XEN_SYSCTL_topologyinfo to reflect the fact
that these are used for CPU topology. Subsequent patch will add support for
PCI topology sysctl.

Signed-off-by: Boris Ostrovsky 

If we are going to change the hypercall, then can we see about making it
a stable interface (i.e. not a sysctl/domctl)?  There are non-toolstack
components which might want/need access to this information.  (i.e. I am
still looking for a reasonable way to get this information from Xen in
hwloc)

In which case leaving the sysctl alone and just adding a new non-sysctl
interface should be considered.


I'd expect IO NUMA information to be used together with CPU topology 
information so I am not sure how useful this would be. Unless we create 
a similar interface for that (CPU/memory) as well.


-boris

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-07 Thread Marcelo Tosatti
On Tue, Jan 06, 2015 at 11:18:21PM -0800, Andy Lutomirski wrote:
> On Tue, Jan 6, 2015 at 9:38 PM, Paolo Bonzini  wrote:
> >
> >
> > On 06/01/2015 17:56, Andy Lutomirski wrote:
> >> Still no good.  We can migrate a bunch of times so we see the same CPU
> >> all three times
> >
> > There are no three times.  The CPU you see here:
> >
> >>>
> >>>
> >>> // ... compute nanoseconds from pvti and tsc ...
> >>> rmb();
> >>> }   while(v != pvti->version);
> >
> > is the same you read here:
> >
> >>> cpu = get_cpu();
> >
> > The algorithm is:
> 
> I still don't see why this is safe, and I think that the issue is that
> you left out part of the loop.
> 
> >
> > 1) get a consistent (cpu, version, tsc)
> >
> >1.a) get cpu
> 
> Suppose we observe cpu 0.
> 
> >1.b) get pvti[cpu]->version, ignoring low bit
> 
> Missing step, presumably here: read pvti[cpu]->tsc_timestamp, scale,
> etc.  This could all execute on vCPU 1.  We could read values that are
> inconsistent with each other.
> 
> >1.c) get (tsc, cpu)
> 
> Now we could end up back on vCPU 0.
> 
> >1.d) if cpu from 1.a and 1.c do not match, loop
> >1.e) if pvti[cpu] was being updated, we'll loop later
> >
> > 2) compute nanoseconds from pvti[cpu] and tsc
> >
> > 3) if pvti[cpu] changed under our feet during (2), i.e. version doesn't
> > match, retry.
> >
> > As long as the CPU is consistent between get_cpu() and rdtscp(), there
> > is no problem with migration, because pvti is always accessed for that
> > one CPU.  If there were any problem, it would be caught by the version
> > check.  Writing it down with two nested do...whiles makes it clearer IMHO.
> 
> Why exactly would it be caught by the version check?
> 
> My ugly patch tries to make the argument that, at any point at which
> we observe ourselves to be on a given vCPU, that vCPU won't be
> updating pvti.  That means that, if version doesn't change between two
> consecutive observations that we're on that vCPU, then we're okay.
> This IMO sucks.  It's fragile, it's hard to make a coherent argument
> about correctness, and it requires at least two getcpu-like operations
> to read the time.  Those operations are *slow*.  One is much better
> than two, and zero is much better than one.
> 
> >
> >> and *still* don't get a consistent read, unless we
> >> play nasty games with lots of version checks (I have a patch for that,
> >> but I don't like it very much).  The patch is here:
> >>
> >> https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vdso_paranoia&id=a69754dc5ff33f5187162b5338854ad23dd7be8d
> >>
> >> but I don't like it.
> >>
> >> Thus far, I've been told unambiguously that a guest can't observe pvti
> >> while it's being written, and I think you're now telling me that this
> >> isn't true and that a guest *can* observe pvti while it's being
> >> written while the low bit of the version field is not set.  If so,
> >> this is rather strongly incompatible with the spec in the KVM docs.
> >
> > Where am I saying that?
> 
> I thought the conclusion from what you and Marcelo pointed out about
> the code was that, once the first vCPU updated its pvti, it could
> start running guest code while the other vCPUs are still updating
> pvti, so its guest code can observe the other vCPUs mid-update.
> 
> >> Also, if you do this, can you also make setting and clearing
> >> STABLE_BIT properly atomic across all vCPUs?  Or at least do something
> >> like setting it last and clearing it first on vPCU 0?
> >
> > That would be nice if you want to make the pvclock area fit in a single
> > page.  However, it would have to be a separate flag bit, or a separate
> > CPUID feature.
> 
> It would be nice to have.  Although I think that fixing the host to
> increment version pre-update and post-update may actually be good
> enough.  Is there any case in which it would fail in practice if we
> made that fix and always looked at pvti 0?

TSC_STABLE_BIT -> ~TSC_STABLE_BIT transition steps would finish but 
still allow VCPU-1 to use stale values from VCPU-0.

To fix, do one of the following:

1) Check validity of local TSC_STABLE_BIT in addition (slow).
2) Perform update of VCPU-0 pvclock area before allowing
any other VCPU to VM-entry.



> 
> --Andy
> 
> >
> > Paolo
> 
> 
> 
> -- 
> Andy Lutomirski
> AMA Capital Management, LLC

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 4/4] libxl: Add interface for querying hypervisor about PCI topology

2015-01-07 Thread Dario Faggioli
On Wed, 2015-01-07 at 09:15 -0500, Boris Ostrovsky wrote:
> On 01/07/2015 04:04 AM, Dario Faggioli wrote:
> > This is basically what Andrew was doing here (although that was on
> > xc_{topology,numa}info:
> >
> > http://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=shortlog;h=refs/heads/hwloc-support-experimental
> > http://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=blobdiff;f=tools/libxc/xc_misc.c;h=4f672cead5a4d2b1fc23bcddb6fb49e4d6ec72b5;hp=330345459176961cacc7fda506e843831fe7156a;hb=3585994405b6a73c137309dd4be91f48c71e4903;hpb=9a80d5056766535ac624774b96495f8b97b1d28b
> >
> > Basically, what I'm suggesting is to have xc_pcitopoinfo() to do most of
> > the work, with libxl_get_pci_topology being just a wrapper to it.
> 
> 
> Maybe then I should update libxl_get_cpu_topology() and 
> libxl_get_numainfo() as well for code uniformity.
> 
Yes, that would be ideal.

I wanted to mention it, but didn't, because I felt it was not fair to
"ask" you that.. but yes, since you're reworking them anyway, that
really would IMO be wonderful. :-)

Regards,
Dario

-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCHv1 0/3] x86/xen: fixes and improvements to linear p2m

2015-01-07 Thread David Vrabel
The linear p2m changes in 3.19-rc1 is broken with some dom0
configurations.  While trying to fix it I also noticed that
get_phys_to_machine() could be optimized for the common case.

David


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 3/3] x86/xen: optimize get_phys_to_machine()

2015-01-07 Thread David Vrabel
The page table walk is only needed to distinguish between identity and
missing, both of which have INVALID_P2M_ENTRY.

Signed-off-by: David Vrabel 
---
 arch/x86/xen/p2m.c |   30 ++
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/arch/x86/xen/p2m.c b/arch/x86/xen/p2m.c
index edbc7a6..a848201 100644
--- a/arch/x86/xen/p2m.c
+++ b/arch/x86/xen/p2m.c
@@ -405,8 +405,7 @@ void __init xen_vmalloc_p2m_tree(void)
 
 unsigned long get_phys_to_machine(unsigned long pfn)
 {
-   pte_t *ptep;
-   unsigned int level;
+   unsigned long mfn;
 
if (unlikely(pfn >= xen_p2m_size)) {
if (pfn < xen_max_p2m_pfn)
@@ -414,19 +413,26 @@ unsigned long get_phys_to_machine(unsigned long pfn)
 
return IDENTITY_FRAME(pfn);
}
+   
+   mfn = xen_p2m_addr[pfn];
 
-   ptep = lookup_address((unsigned long)(xen_p2m_addr + pfn), &level);
-   BUG_ON(!ptep || level != PG_LEVEL_4K);
+   if (unlikely(mfn == INVALID_P2M_ENTRY)) {
+   pte_t *ptep;
+   unsigned int level;
 
-   /*
-* The INVALID_P2M_ENTRY is filled in both p2m_*identity
-* and in p2m_*missing, so returning the INVALID_P2M_ENTRY
-* would be wrong.
-*/
-   if (pte_pfn(*ptep) == PFN_DOWN(__pa(p2m_identity)))
-   return IDENTITY_FRAME(pfn);
+   ptep = lookup_address((unsigned long)(xen_p2m_addr + pfn), 
&level);
+   BUG_ON(!ptep || level != PG_LEVEL_4K);
+
+   /*
+* The INVALID_P2M_ENTRY is filled in both p2m_*identity
+* and in p2m_*missing, so returning the INVALID_P2M_ENTRY
+* would be wrong.
+*/
+   if (pte_pfn(*ptep) == PFN_DOWN(__pa(p2m_identity)))
+   return IDENTITY_FRAME(pfn);
+   }
 
-   return xen_p2m_addr[pfn];
+   return mfn;
 }
 EXPORT_SYMBOL_GPL(get_phys_to_machine);
 
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/4] pci: Do not ignore device's PXM information

2015-01-07 Thread Andrew Cooper
On 07/01/15 14:42, Boris Ostrovsky wrote:
> On 01/07/2015 04:06 AM, Jan Beulich wrote:
> On 06.01.15 at 03:18,  wrote:
>>> @@ -618,7 +620,22 @@ ret_t do_physdev_op(int cmd,
>>> XEN_GUEST_HANDLE_PARAM(void) arg)
>>>   }
>>>   else
>>>   pdev_info.is_virtfn = 0;
>>> -ret = pci_add_device(add.seg, add.bus, add.devfn, &pdev_info);
>>> +
>>> +if ( add.flags & XEN_PCI_DEV_PXM )
>>> +{
>>> +uint32_t pxm;
>>> +int optarr_off = offsetof(struct
>>> physdev_pci_device_add, optarr) /
>> unsigned int or size_t.
>>
>>> --- a/xen/include/xen/pci.h
>>> +++ b/xen/include/xen/pci.h
>>> @@ -56,6 +56,8 @@ struct pci_dev {
>>> u8 phantom_stride;
>>>   +int node; /* NUMA node */
>> I thought I asked about this on v1 already: Does this really need to be
>> an int, when commonly node numbers are stored in u8/unsigned char?
>> Shrinking the field size would prevent the structure size from
>> growing...
>
>
> I kept this field as an int to be able to store NUMA_NO_NODE which I
> thought to be (int)-1.
>
> But now I see that NUMA_NO_NODE is, in fact, 0xff but is promoted to
> (int)-1 by pxm_to_node(). Given that there is a number of tests for
> NUMA_NO_NODE and not for (int)-1, should we then make pxm_to_node()
> return u8 as well?

I noticed this as well, and found it quite counter intuitive.

I would suggest fixing NUMA_NO_NODE to -1 and removing some of the
type-punning.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 2/3] x86/xen: add extra memory for remapped frames during setup

2015-01-07 Thread David Vrabel
If the non-RAM regions in the e820 memory map are larger than the size
of the initial balloon, a BUG was triggered as the frames are remaped
beyond the limit of the linear p2m.  The frames are remapped into the
initial balloon area (xen_extra_mem) but not enough of this is
available.

Ensure enough extra memory regions are added for these remapped
frames.

Signed-off-by: David Vrabel 
---
 arch/x86/xen/setup.c |   13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 664dffc..feb6d86 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -366,7 +366,7 @@ static void __init xen_do_set_identity_and_remap_chunk(
 static unsigned long __init xen_set_identity_and_remap_chunk(
 const struct e820entry *list, size_t map_size, unsigned long start_pfn,
unsigned long end_pfn, unsigned long nr_pages, unsigned long remap_pfn,
-   unsigned long *released)
+   unsigned long *released, unsigned long *remapped)
 {
unsigned long pfn;
unsigned long i = 0;
@@ -404,6 +404,7 @@ static unsigned long __init 
xen_set_identity_and_remap_chunk(
/* Update variables to reflect new mappings. */
i += size;
remap_pfn += size;
+   *remapped += size;
}
 
/*
@@ -420,12 +421,13 @@ static unsigned long __init 
xen_set_identity_and_remap_chunk(
 
 static void __init xen_set_identity_and_remap(
const struct e820entry *list, size_t map_size, unsigned long nr_pages,
-   unsigned long *released)
+   unsigned long *released, unsigned long *remapped)
 {
phys_addr_t start = 0;
unsigned long last_pfn = nr_pages;
const struct e820entry *entry;
unsigned long num_released = 0;
+   unsigned long num_remapped = 0;
int i;
 
/*
@@ -452,12 +454,13 @@ static void __init xen_set_identity_and_remap(
last_pfn = xen_set_identity_and_remap_chunk(
list, map_size, start_pfn,
end_pfn, nr_pages, last_pfn,
-   &num_released);
+   &num_released, &num_remapped);
start = end;
}
}
 
*released = num_released;
+   *remapped = num_remapped;
 
pr_info("Released %ld page(s)\n", num_released);
 }
@@ -577,6 +580,7 @@ char * __init xen_memory_setup(void)
struct xen_memory_map memmap;
unsigned long max_pages;
unsigned long extra_pages = 0;
+   unsigned long remapped_pages;
int i;
int op;
 
@@ -626,9 +630,10 @@ char * __init xen_memory_setup(void)
 * underlying RAM.
 */
xen_set_identity_and_remap(map, memmap.nr_entries, max_pfn,
-  &xen_released_pages);
+  &xen_released_pages, &remapped_pages);
 
extra_pages += xen_released_pages;
+   extra_pages += remapped_pages;
 
/*
 * Clamp the amount of extra memory to a EXTRA_MEM_RATIO
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 1/3] x86/xen: don't count how many PFNs are identity mapped

2015-01-07 Thread David Vrabel
This accounting is just used to print a diagnostic message that isn't
very useful.

Signed-off-by: David Vrabel 
---
 arch/x86/xen/setup.c |   27 +--
 1 file changed, 9 insertions(+), 18 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index dfd77de..664dffc 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -229,15 +229,14 @@ static int __init xen_free_mfn(unsigned long mfn)
  * as a fallback if the remapping fails.
  */
 static void __init xen_set_identity_and_release_chunk(unsigned long start_pfn,
-   unsigned long end_pfn, unsigned long nr_pages, unsigned long *identity,
-   unsigned long *released)
+   unsigned long end_pfn, unsigned long nr_pages, unsigned long *released)
 {
-   unsigned long len = 0;
unsigned long pfn, end;
int ret;
 
WARN_ON(start_pfn > end_pfn);
 
+   /* Release pages first. */
end = min(end_pfn, nr_pages);
for (pfn = start_pfn; pfn < end; pfn++) {
unsigned long mfn = pfn_to_mfn(pfn);
@@ -250,16 +249,14 @@ static void __init 
xen_set_identity_and_release_chunk(unsigned long start_pfn,
WARN(ret != 1, "Failed to release pfn %lx err=%d\n", pfn, ret);
 
if (ret == 1) {
+   (*released)++;
if (!__set_phys_to_machine(pfn, INVALID_P2M_ENTRY))
break;
-   len++;
} else
break;
}
 
-   /* Need to release pages first */
-   *released += len;
-   *identity += set_phys_range_identity(start_pfn, end_pfn);
+   set_phys_range_identity(start_pfn, end_pfn);
 }
 
 /*
@@ -318,7 +315,6 @@ static void __init xen_do_set_identity_and_remap_chunk(
unsigned long ident_pfn_iter, remap_pfn_iter;
unsigned long ident_end_pfn = start_pfn + size;
unsigned long left = size;
-   unsigned long ident_cnt = 0;
unsigned int i, chunk;
 
WARN_ON(size == 0);
@@ -347,8 +343,7 @@ static void __init xen_do_set_identity_and_remap_chunk(
xen_remap_mfn = mfn;
 
/* Set identity map */
-   ident_cnt += set_phys_range_identity(ident_pfn_iter,
-   ident_pfn_iter + chunk);
+   set_phys_range_identity(ident_pfn_iter, ident_pfn_iter + chunk);
 
left -= chunk;
}
@@ -371,7 +366,7 @@ static void __init xen_do_set_identity_and_remap_chunk(
 static unsigned long __init xen_set_identity_and_remap_chunk(
 const struct e820entry *list, size_t map_size, unsigned long start_pfn,
unsigned long end_pfn, unsigned long nr_pages, unsigned long remap_pfn,
-   unsigned long *identity, unsigned long *released)
+   unsigned long *released)
 {
unsigned long pfn;
unsigned long i = 0;
@@ -386,8 +381,7 @@ static unsigned long __init 
xen_set_identity_and_remap_chunk(
/* Do not remap pages beyond the current allocation */
if (cur_pfn >= nr_pages) {
/* Identity map remaining pages */
-   *identity += set_phys_range_identity(cur_pfn,
-   cur_pfn + size);
+   set_phys_range_identity(cur_pfn, cur_pfn + size);
break;
}
if (cur_pfn + size > nr_pages)
@@ -398,7 +392,7 @@ static unsigned long __init 
xen_set_identity_and_remap_chunk(
if (!remap_range_size) {
pr_warning("Unable to find available pfn range, not 
remapping identity pages\n");
xen_set_identity_and_release_chunk(cur_pfn,
-   cur_pfn + left, nr_pages, identity, released);
+   cur_pfn + left, nr_pages, released);
break;
}
/* Adjust size to fit in current e820 RAM region */
@@ -410,7 +404,6 @@ static unsigned long __init 
xen_set_identity_and_remap_chunk(
/* Update variables to reflect new mappings. */
i += size;
remap_pfn += size;
-   *identity += size;
}
 
/*
@@ -430,7 +423,6 @@ static void __init xen_set_identity_and_remap(
unsigned long *released)
 {
phys_addr_t start = 0;
-   unsigned long identity = 0;
unsigned long last_pfn = nr_pages;
const struct e820entry *entry;
unsigned long num_released = 0;
@@ -460,14 +452,13 @@ static void __init xen_set_identity_and_remap(
last_pfn = xen_set_identity_and_remap_chunk(
list, map_size, start_pfn,
end_pfn, nr_pages, last_pfn,
-   &identity, &num_released);
+   &num_released);
   

Re: [Xen-devel] [Bugfix] x86/apic: Fix xen IRQ allocation failure caused by commit b81975eade8c

2015-01-07 Thread Konrad Rzeszutek Wilk
On Wed, Jan 07, 2015 at 02:13:49PM +0800, Jiang Liu wrote:
> Commit b81975eade8c ("x86, irq: Clean up irqdomain transition code")
> breaks xen IRQ allocation because xen_smp_prepare_cpus() doesn't invoke
> setup_IO_APIC(), so no irqdomains created for IOAPICs and
> mp_map_pin_to_irq() fails at the very beginning.
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -2369,31 +2369,29 @@ static void ioapic_destroy_irqdomain(int idx)
>   ioapics[idx].pin_info = NULL;
>  }
>  
> -void __init setup_IO_APIC(void)
> +void __init setup_IO_APIC(bool xen_smp)
>  {
>   int ioapic;
>  
> - /*
> -  * calling enable_IO_APIC() is moved to setup_local_APIC for BP
> -  */
> - io_apic_irqs = nr_legacy_irqs() ? ~PIC_IRQS : ~0UL;
> + if (!xen_smp) {
> + apic_printk(APIC_VERBOSE, "ENABLING IO-APIC IRQs\n");
> + io_apic_irqs = nr_legacy_irqs() ? ~PIC_IRQS : ~0UL;
> +
> + /* Set up IO-APIC IRQ routing. */
> + x86_init.mpparse.setup_ioapic_ids();
> + sync_Arb_IDs();
> + }


Is there a specific reason that this cannot run in all cases?

What I am asking is why are we doing a special case here? The description
at the top implied that we were just missing an call to
setup_IO_APIC.. 

>  
> - apic_printk(APIC_VERBOSE, "ENABLING IO-APIC IRQs\n");
>   for_each_ioapic(ioapic)
>   BUG_ON(mp_irqdomain_create(ioapic));
> -
> - /*
> - * Set up IO-APIC IRQ routing.
> - */
> - x86_init.mpparse.setup_ioapic_ids();
> -
> - sync_Arb_IDs();
>   setup_IO_APIC_irqs();
> - init_IO_APIC_traps();
> - if (nr_legacy_irqs())
> - check_timer();
> -
>   ioapic_initialized = 1;
> +
> + if (!xen_smp) {
> + init_IO_APIC_traps();
> + if (nr_legacy_irqs())
> + check_timer();
> + }
>  }
>  
>  /*
> diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
> index 4c071aeb8417..7eb0283901fa 100644
> --- a/arch/x86/xen/smp.c
> +++ b/arch/x86/xen/smp.c
> @@ -326,7 +326,10 @@ static void __init xen_smp_prepare_cpus(unsigned int 
> max_cpus)
>  
>   xen_raw_printk(m);
>   panic(m);
> + } else {
> + setup_IO_APIC(true);
>   }
> +
>   xen_init_lock_cpu(0);
>  
>   smp_store_boot_cpu_info();
> -- 
> 1.7.10.4
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/7] tools/hotplug: remove SELinux options from var-lib-xenstored.mount

2015-01-07 Thread Konrad Rzeszutek Wilk
On Wed, Jan 07, 2015 at 09:31:50AM +, Ian Campbell wrote:
> On Wed, 2015-01-07 at 10:23 +0100, Olaf Hering wrote:
> > On Tue, Jan 06, Ian Campbell wrote:
> > 
> > > On Fri, 2014-12-19 at 12:25 +0100, Olaf Hering wrote:
> > 
> > ...
> > 
> > > Acked-by: Ian Campbell 
> > > 
> > > (on commit s/Appearently/Apparently/; s/non-existant/non-existent/ in
> > > the commit log)
> > 
> > I made typos also in other commit messages. Should I resend the entire
> > series, or will this be done during commit?
> 
> Looks like Konrad already committed, I don't know if he fixed the typos
> (I suppose it doesn't matter now either way).

I did the changes you pointed our here.
> 
> Ian.
> 
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 3/4] sysctl: Add sysctl interface for querying PCI topology

2015-01-07 Thread Boris Ostrovsky

On 01/07/2015 04:21 AM, Jan Beulich wrote:

On 06.01.15 at 03:18,  wrote:

--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -365,6 +365,66 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) 
u_sysctl)
  }
  break;
  

#ifdef HAS_PCI


+case XEN_SYSCTL_pcitopoinfo:
+{
+xen_sysctl_pcitopoinfo_t *ti = &op->u.pcitopoinfo;
+
+if ( guest_handle_is_null(ti->pcitopo) ||
+ (ti->first_dev >= ti->num_devs) )
+{
+ret = -EINVAL;
+break;
+}
+
+for ( ; ti->first_dev < ti->num_devs; ti->first_dev++ )
+{
+xen_sysctl_pcitopo_t pcitopo;
+struct pci_dev *pdev;
+
+if ( copy_from_guest_offset(&pcitopo, ti->pcitopo,
+ti->first_dev, 1) )
+{
+ret = -EFAULT;
+break;
+}
+
+spin_lock(&pcidevs_lock);
+pdev = pci_get_pdev(pcitopo.pcidev.seg, pcitopo.pcidev.bus,
+pcitopo.pcidev.devfn);
+if ( !pdev || (pdev->node == NUMA_NO_NODE) )
+pcitopo.node = INVALID_TOPOLOGY_ID;
+else
+pcitopo.node = pdev->node;

Are hypervisor-internal node numbers really meaningful to the caller?



This is the same information (pxm -> node mapping ) that we provide in 
XEN_SYSCTL_topologyinfo (renamed in this series to
XEN_SYSCTL_cputopoinfo). Given that I expect the two topologies to be 
used together I think the answer is yes.




@@ -463,7 +464,7 @@ typedef struct xen_sysctl_lockprof_op 
xen_sysctl_lockprof_op_t;
  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_lockprof_op_t);
  
  /* XEN_SYSCTL_cputopoinfo */

-#define INVALID_TOPOLOGY_ID  (~0U)
+#define INVALID_TOPOLOGY_ID  (~0U) /* Also used by pcitopo */

Better extend the preceding comment.


I mentioned it to Wei yesterday that the file is structured (or at least 
appears to me to be structured) in such a way that these top comments 
mark sections of definitions for each sysctl. And so I thought that I'd 
be breaking this convention if I were to extend the comment.


-boris

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 7/7] tools/hotplug: add wrapper to start xenstored

2015-01-07 Thread Konrad Rzeszutek Wilk
On Wed, Jan 07, 2015 at 10:49:38AM +0100, Olaf Hering wrote:
> On Tue, Jan 06, Ian Jackson wrote:
> 
> > Olaf Hering writes ("[PATCH 7/7] tools/hotplug: add wrapper to start 
> > xenstored"):
> > > The shell wrapper in xenstored.service does not handle XENSTORE_TRACE.
> > ...
> > > +XENSTORED_LIBEXEC = xenstored.sh
> > 
> > Should be in /etc as previously discussed.  Previously I wrote:
> > 
> >   Bottom line: as relevant maintainer, I'm afraid I'm going to insist
> >   that this script be in /etc.
> > 
> > I'm disappointed.  It is not acceptable to resubmit a change ignoring
> > such unequivocal feedback.
> 
> Plain "/etc" wont work, I think. /etc/xen/scripts perhaps? But see my
> other reply to IanC, maybe there is a way to avoid the wrapper.
> 
> And after having some time to think about this: If one has a need to
> adjust something, then this could be done in the xencommons script right
> away. In other words, the modification can be done there instead of
> calling the wrapper.
> 
> > Nacked-by: Ian Jackson 
> > 
> > > +hotplug/Linux/xenstored.sh
> > 
> > Although many of the existing hotplug scripts have this notion of
> > calling things "foo.sh" because they happen to be written in shell, I
> > think this is bad practice.
> > 
> > I would prefer xenstored-wrap or some such.  (My co-maintainers may
> > disagree...)  But this is a bit of a bikeshed issue.
> 
> I agree. Initally I had xenstored-launcher in mind.
> 
> > >   echo -n Starting $XENSTORED...
> > > - $XENSTORED --pid-file /var/run/xenstored.pid $XENSTORED_ARGS
> > > + XENSTORED=$XENSTORED \
> > > + XENSTORED_TRACE=$XENSTORED_TRACE \
> > > + XENSTORED_ARGS=$XENSTORED_ARGS \
> > > + ${LIBEXEC_BIN}/xenstored.sh --pid-file 
> > > /var/run/xenstored.pid
> > 
> > It might be easier to "." xenstore-wrap.  Failing that using `export'
> > will avoid this rather odd and repetitive style.
> 
> I think thats a good idea. Something like this may work, doing the "."
> and the "exec" in the subshell:
> 
>  (
>  set -- --pid-file /var/run/xenstored.pid
>  . xenstored.sh 
>  )
> 
> 
> > > diff --git a/tools/hotplug/Linux/xenstored.sh.in 
> > > b/tools/hotplug/Linux/xenstored.sh.in
> > > new file mode 100644
> > > index 000..dc806ee
> > > --- /dev/null
> > > +++ b/tools/hotplug/Linux/xenstored.sh.in
> > > @@ -0,0 +1,6 @@
> > > +#!/bin/sh
> > > +if test -n "$XENSTORED_TRACE"
> > > +then
> > > + XENSTORED_ARGS=" -T /var/log/xen/xenstored-trace.log"
> > > +fi
> > > +exec $XENSTORED $@ $XENSTORED_ARGS
> > 
> > This should probably have "" around "$@" just in case.
> 
> Ok. 
> 
> 
> I will wait for results from SELinux testing before respinning this
> patch.

It did work for me (I did an 'Tested-by') in my email.

Please do keep in mind that today is the last for commits for Xen 4.5.
No pressure :-)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 0/7 v3] tools/hotplug: systemd changes for 4.5

2015-01-07 Thread Konrad Rzeszutek Wilk
On Wed, Jan 07, 2015 at 10:53:06AM +0100, Olaf Hering wrote:
> On Mon, Jan 05, Konrad Rzeszutek Wilk wrote:
> 
> > +Release Issues
> > +==
> > +
> > +While we did the utmost to get a release out, there are certain
> > +fixes which were not complete on time. As such please reference this
> > +section if you are running into trouble.
> > +
> > +* systemd not working with Fedora Core 20, 21 or later (systemctl
> > +  reports xenstore failing to start).
> > +
> > +  Systemd support is now part of Xen source code. While utmost work has
> > +  been done to make the systemd files compatible across all the
> > +  distributions, there might issues when using systemd files from
> > +  Xen sources. The work-around is to define an mount entry in
> > +  /etc/fstab as follow:
> > +
> > +  tmpfs   /var/lib/xenstored  tmpfs
> > +  mode=755,context="system_u:object_r:xenstored_var_lib_t:s0" 0 0
> > +
> > +
> 
> Shouldnt this go into a new SELinux section in the INSTALL file?

It is going in the web-page for 'Release Issues' and such.
> 
> Its my understanding that the reported SELinux failure is not only
> related to the context= mount option, but also to the socket passing
> from systemd.


I couldn't spot any errors in SELinux for this. Perhaps I had misconfigured?
> 
> 
> Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] VT-d: don't crash when PTE bits 52 and up are non-zero

2015-01-07 Thread Konrad Rzeszutek Wilk
On Wed, Jan 07, 2015 at 10:15:39AM +, Jan Beulich wrote:
> >>> On 23.12.14 at 07:52,  wrote:
> >>  From: Jan Beulich [mailto:jbeul...@suse.com]
> >> Sent: Friday, December 19, 2014 7:26 PM
> >> 
> >> This can (and will) be legitimately the case when sharing page tables
> >> with EPT (more of a problem before p2m_access_rwx became zero, but
> >> still possible even now when other than that is the default for a
> >> guest), leading to an unconditional crash (in print_vtd_entries())
> >> when a DMA remapping fault occurs.
> > 
> > could you elaborate the scenarios when bits 52+ are non-zero?
> > 
> >> 
> >> Signed-off-by: Jan Beulich 
> > 
> > but the changes looks reasonable to me.
> > 
> > Signed-off-by: Kevin Tian 
> 
> I translated this to a Reviewed-by, as S-o-b doesn't seem to make
> sense here.
> 
> Konrad - please indicate whether this can also go into 4.5.

Yes. Release-Acked-by: Konrad Rzeszutek Wilk 
> 
> Jan
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 0/7 v3] tools/hotplug: systemd changes for 4.5

2015-01-07 Thread Olaf Hering
On Wed, Jan 07, Konrad Rzeszutek Wilk wrote:

> On Wed, Jan 07, 2015 at 10:53:06AM +0100, Olaf Hering wrote:
> > Its my understanding that the reported SELinux failure is not only
> > related to the context= mount option, but also to the socket passing
> > from systemd.
> 
> I couldn't spot any errors in SELinux for this. Perhaps I had misconfigured?

Last year you said xenstored did not start, even with patch #1 applied.
I dont know if you added the required fstab changes. So if current
staging works fine with SELinux enabled we could go with this change for
the service file, instead of the wrapper:

ExecStart=/usr/bin/env $XENSTORED --no-fork $XENSTORED_ARGS


Does that work for you? If yes, lets get rid of the XENSTORED_TRACE=
boolean and use a new XENSTORED_ARGS= variable instead. That would make
patch #7 alot simpler.

Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/4] pci: Do not ignore device's PXM information

2015-01-07 Thread Jan Beulich
>>> On 07.01.15 at 15:42,  wrote:
> On 01/07/2015 04:06 AM, Jan Beulich wrote:
> On 06.01.15 at 03:18,  wrote:
>>> --- a/xen/include/xen/pci.h
>>> +++ b/xen/include/xen/pci.h
>>> @@ -56,6 +56,8 @@ struct pci_dev {
>>>   
>>>   u8 phantom_stride;
>>>   
>>> +int node; /* NUMA node */
>> I thought I asked about this on v1 already: Does this really need to be
>> an int, when commonly node numbers are stored in u8/unsigned char?
>> Shrinking the field size would prevent the structure size from growing...
> 
> I kept this field as an int to be able to store NUMA_NO_NODE which I 
> thought to be (int)-1.
> 
> But now I see that NUMA_NO_NODE is, in fact, 0xff but is promoted to 
> (int)-1 by pxm_to_node(). Given that there is a number of tests for 
> NUMA_NO_NODE and not for (int)-1, should we then make pxm_to_node() 
> return u8 as well?

I think that would make sense, together with fixing up one of the three
callers in VT-d code (from alloc_pgtable_maddr()); the other two look
correct already.

>> Of course an additional question would be whether the node wouldn't
>> better go into struct arch_pci_dev - that depends on whether we
>> expect ARM to be using NUMA...
> 
> Since we have CPU topology in common code I thought this would be 
> arch-independent as well.

Not sure what you're referring to here: What common piece of data
stores the node of a particular CPU? cpu_to_node[] clearly is x86-
specific.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/4] pci: Do not ignore device's PXM information

2015-01-07 Thread Jan Beulich
>>> On 07.01.15 at 15:47,  wrote:
> On 07/01/15 14:42, Boris Ostrovsky wrote:
>> I kept this field as an int to be able to store NUMA_NO_NODE which I
>> thought to be (int)-1.
>>
>> But now I see that NUMA_NO_NODE is, in fact, 0xff but is promoted to
>> (int)-1 by pxm_to_node(). Given that there is a number of tests for
>> NUMA_NO_NODE and not for (int)-1, should we then make pxm_to_node()
>> return u8 as well?
> 
> I noticed this as well, and found it quite counter intuitive.
> 
> I would suggest fixing NUMA_NO_NODE to -1 and removing some of the
> type-punning.

I have to admit that I see no value in wasting 4 bytes for something
that for the foreseeable future won't exceed 1 byte.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/4] sysctl: Make XEN_SYSCTL_topologyinfo sysctl a little more efficient

2015-01-07 Thread Jan Beulich
>>> On 07.01.15 at 15:45,  wrote:
> On 01/07/2015 04:12 AM, Jan Beulich wrote:
> On 06.01.15 at 14:41,  wrote:
>>> On 06/01/15 02:18, Boris Ostrovsky wrote:
 Instead of copying data for each field in xen_sysctl_topologyinfo 
 separately
 put cpu/socket/node into a single structure and do a single copy for each
 processor.

 There is also no need to copy whole op to user at the end, max_cpu_index is
 sufficient

 Rename xen_sysctl_topologyinfo and XEN_SYSCTL_topologyinfo to reflect the 
> fact
 that these are used for CPU topology. Subsequent patch will add support for
 PCI topology sysctl.

 Signed-off-by: Boris Ostrovsky 
>>> If we are going to change the hypercall, then can we see about making it
>>> a stable interface (i.e. not a sysctl/domctl)?  There are non-toolstack
>>> components which might want/need access to this information.  (i.e. I am
>>> still looking for a reasonable way to get this information from Xen in
>>> hwloc)
>> In which case leaving the sysctl alone and just adding a new non-sysctl
>> interface should be considered.
> 
> I'd expect IO NUMA information to be used together with CPU topology 
> information so I am not sure how useful this would be. Unless we create 
> a similar interface for that (CPU/memory) as well.

Creating a new CPU topology interface while leaving alone the current
sysctl was what I meant to suggest. The I/O topology one would then
become a sibling to the CPU one.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/3] x86/xen: don't count how many PFNs are identity mapped

2015-01-07 Thread Juergen Gross

On 01/07/2015 03:47 PM, David Vrabel wrote:

This accounting is just used to print a diagnostic message that isn't
very useful.

Signed-off-by: David Vrabel 


Reviewed-by: Juergen Gross 


---
  arch/x86/xen/setup.c |   27 +--
  1 file changed, 9 insertions(+), 18 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index dfd77de..664dffc 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -229,15 +229,14 @@ static int __init xen_free_mfn(unsigned long mfn)
   * as a fallback if the remapping fails.
   */
  static void __init xen_set_identity_and_release_chunk(unsigned long start_pfn,
-   unsigned long end_pfn, unsigned long nr_pages, unsigned long *identity,
-   unsigned long *released)
+   unsigned long end_pfn, unsigned long nr_pages, unsigned long *released)
  {
-   unsigned long len = 0;
unsigned long pfn, end;
int ret;

WARN_ON(start_pfn > end_pfn);

+   /* Release pages first. */
end = min(end_pfn, nr_pages);
for (pfn = start_pfn; pfn < end; pfn++) {
unsigned long mfn = pfn_to_mfn(pfn);
@@ -250,16 +249,14 @@ static void __init 
xen_set_identity_and_release_chunk(unsigned long start_pfn,
WARN(ret != 1, "Failed to release pfn %lx err=%d\n", pfn, ret);

if (ret == 1) {
+   (*released)++;
if (!__set_phys_to_machine(pfn, INVALID_P2M_ENTRY))
break;
-   len++;
} else
break;
}

-   /* Need to release pages first */
-   *released += len;
-   *identity += set_phys_range_identity(start_pfn, end_pfn);
+   set_phys_range_identity(start_pfn, end_pfn);
  }

  /*
@@ -318,7 +315,6 @@ static void __init xen_do_set_identity_and_remap_chunk(
unsigned long ident_pfn_iter, remap_pfn_iter;
unsigned long ident_end_pfn = start_pfn + size;
unsigned long left = size;
-   unsigned long ident_cnt = 0;
unsigned int i, chunk;

WARN_ON(size == 0);
@@ -347,8 +343,7 @@ static void __init xen_do_set_identity_and_remap_chunk(
xen_remap_mfn = mfn;

/* Set identity map */
-   ident_cnt += set_phys_range_identity(ident_pfn_iter,
-   ident_pfn_iter + chunk);
+   set_phys_range_identity(ident_pfn_iter, ident_pfn_iter + chunk);

left -= chunk;
}
@@ -371,7 +366,7 @@ static void __init xen_do_set_identity_and_remap_chunk(
  static unsigned long __init xen_set_identity_and_remap_chunk(
  const struct e820entry *list, size_t map_size, unsigned long 
start_pfn,
unsigned long end_pfn, unsigned long nr_pages, unsigned long remap_pfn,
-   unsigned long *identity, unsigned long *released)
+   unsigned long *released)
  {
unsigned long pfn;
unsigned long i = 0;
@@ -386,8 +381,7 @@ static unsigned long __init 
xen_set_identity_and_remap_chunk(
/* Do not remap pages beyond the current allocation */
if (cur_pfn >= nr_pages) {
/* Identity map remaining pages */
-   *identity += set_phys_range_identity(cur_pfn,
-   cur_pfn + size);
+   set_phys_range_identity(cur_pfn, cur_pfn + size);
break;
}
if (cur_pfn + size > nr_pages)
@@ -398,7 +392,7 @@ static unsigned long __init 
xen_set_identity_and_remap_chunk(
if (!remap_range_size) {
pr_warning("Unable to find available pfn range, not 
remapping identity pages\n");
xen_set_identity_and_release_chunk(cur_pfn,
-   cur_pfn + left, nr_pages, identity, released);
+   cur_pfn + left, nr_pages, released);
break;
}
/* Adjust size to fit in current e820 RAM region */
@@ -410,7 +404,6 @@ static unsigned long __init 
xen_set_identity_and_remap_chunk(
/* Update variables to reflect new mappings. */
i += size;
remap_pfn += size;
-   *identity += size;
}

/*
@@ -430,7 +423,6 @@ static void __init xen_set_identity_and_remap(
unsigned long *released)
  {
phys_addr_t start = 0;
-   unsigned long identity = 0;
unsigned long last_pfn = nr_pages;
const struct e820entry *entry;
unsigned long num_released = 0;
@@ -460,14 +452,13 @@ static void __init xen_set_identity_and_remap(
last_pfn = xen_set_identity_and_remap_chunk(
list, map_size, start_pfn,
end_pfn, nr_pages, last_pfn,
-   &identity, &num_

Re: [Xen-devel] [PATCH 2/3] x86/xen: add extra memory for remapped frames during setup

2015-01-07 Thread Juergen Gross

On 01/07/2015 03:47 PM, David Vrabel wrote:

If the non-RAM regions in the e820 memory map are larger than the size
of the initial balloon, a BUG was triggered as the frames are remaped
beyond the limit of the linear p2m.  The frames are remapped into the
initial balloon area (xen_extra_mem) but not enough of this is
available.

Ensure enough extra memory regions are added for these remapped
frames.

Signed-off-by: David Vrabel 


Reviewed-by: Juergen Gross 


---
  arch/x86/xen/setup.c |   13 +
  1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 664dffc..feb6d86 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -366,7 +366,7 @@ static void __init xen_do_set_identity_and_remap_chunk(
  static unsigned long __init xen_set_identity_and_remap_chunk(
  const struct e820entry *list, size_t map_size, unsigned long 
start_pfn,
unsigned long end_pfn, unsigned long nr_pages, unsigned long remap_pfn,
-   unsigned long *released)
+   unsigned long *released, unsigned long *remapped)
  {
unsigned long pfn;
unsigned long i = 0;
@@ -404,6 +404,7 @@ static unsigned long __init 
xen_set_identity_and_remap_chunk(
/* Update variables to reflect new mappings. */
i += size;
remap_pfn += size;
+   *remapped += size;
}

/*
@@ -420,12 +421,13 @@ static unsigned long __init 
xen_set_identity_and_remap_chunk(

  static void __init xen_set_identity_and_remap(
const struct e820entry *list, size_t map_size, unsigned long nr_pages,
-   unsigned long *released)
+   unsigned long *released, unsigned long *remapped)
  {
phys_addr_t start = 0;
unsigned long last_pfn = nr_pages;
const struct e820entry *entry;
unsigned long num_released = 0;
+   unsigned long num_remapped = 0;
int i;

/*
@@ -452,12 +454,13 @@ static void __init xen_set_identity_and_remap(
last_pfn = xen_set_identity_and_remap_chunk(
list, map_size, start_pfn,
end_pfn, nr_pages, last_pfn,
-   &num_released);
+   &num_released, &num_remapped);
start = end;
}
}

*released = num_released;
+   *remapped = num_remapped;

pr_info("Released %ld page(s)\n", num_released);
  }
@@ -577,6 +580,7 @@ char * __init xen_memory_setup(void)
struct xen_memory_map memmap;
unsigned long max_pages;
unsigned long extra_pages = 0;
+   unsigned long remapped_pages;
int i;
int op;

@@ -626,9 +630,10 @@ char * __init xen_memory_setup(void)
 * underlying RAM.
 */
xen_set_identity_and_remap(map, memmap.nr_entries, max_pfn,
-  &xen_released_pages);
+  &xen_released_pages, &remapped_pages);

extra_pages += xen_released_pages;
+   extra_pages += remapped_pages;

/*
 * Clamp the amount of extra memory to a EXTRA_MEM_RATIO




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/xen: Free bootmem in free_p2m_page() during early boot

2015-01-07 Thread David Vrabel
On 07/01/15 14:08, Boris Ostrovsky wrote:
> With recent changes in p2m we now have legitimate cases when
> p2m memory needs to be freed during early boot (i.e. before
> slab is initialized).
> 
> Signed-off-by: Boris Ostrovsky 

Applied to to stable/for-linus-3.19, thanks.

If I understand correctly this didn't fully fix your 32-bit dom0 crashes?

David

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/3] x86/xen: optimize get_phys_to_machine()

2015-01-07 Thread Juergen Gross

On 01/07/2015 03:47 PM, David Vrabel wrote:

The page table walk is only needed to distinguish between identity and
missing, both of which have INVALID_P2M_ENTRY.


As get_phys_to_machine is called by __pfn_to_mfn() only which already
checks for mfn == INVALID_P2M_ENTRY this optimization will have an
effect only in the early boot case with pfn >= xen_p2m_size.

I doubt this is necessary.


Juergen



Signed-off-by: David Vrabel 
---
  arch/x86/xen/p2m.c |   30 ++
  1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/arch/x86/xen/p2m.c b/arch/x86/xen/p2m.c
index edbc7a6..a848201 100644
--- a/arch/x86/xen/p2m.c
+++ b/arch/x86/xen/p2m.c
@@ -405,8 +405,7 @@ void __init xen_vmalloc_p2m_tree(void)

  unsigned long get_phys_to_machine(unsigned long pfn)
  {
-   pte_t *ptep;
-   unsigned int level;
+   unsigned long mfn;

if (unlikely(pfn >= xen_p2m_size)) {
if (pfn < xen_max_p2m_pfn)
@@ -414,19 +413,26 @@ unsigned long get_phys_to_machine(unsigned long pfn)

return IDENTITY_FRAME(pfn);
}
+   
+   mfn = xen_p2m_addr[pfn];

-   ptep = lookup_address((unsigned long)(xen_p2m_addr + pfn), &level);
-   BUG_ON(!ptep || level != PG_LEVEL_4K);
+   if (unlikely(mfn == INVALID_P2M_ENTRY)) {
+   pte_t *ptep;
+   unsigned int level;

-   /*
-* The INVALID_P2M_ENTRY is filled in both p2m_*identity
-* and in p2m_*missing, so returning the INVALID_P2M_ENTRY
-* would be wrong.
-*/
-   if (pte_pfn(*ptep) == PFN_DOWN(__pa(p2m_identity)))
-   return IDENTITY_FRAME(pfn);
+   ptep = lookup_address((unsigned long)(xen_p2m_addr + pfn), 
&level);
+   BUG_ON(!ptep || level != PG_LEVEL_4K);
+
+   /*
+* The INVALID_P2M_ENTRY is filled in both p2m_*identity
+* and in p2m_*missing, so returning the INVALID_P2M_ENTRY
+* would be wrong.
+*/
+   if (pte_pfn(*ptep) == PFN_DOWN(__pa(p2m_identity)))
+   return IDENTITY_FRAME(pfn);
+   }

-   return xen_p2m_addr[pfn];
+   return mfn;
  }
  EXPORT_SYMBOL_GPL(get_phys_to_machine);





___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 3/4] sysctl: Add sysctl interface for querying PCI topology

2015-01-07 Thread Jan Beulich
>>> On 07.01.15 at 15:55,  wrote:
> On 01/07/2015 04:21 AM, Jan Beulich wrote:
> On 06.01.15 at 03:18,  wrote:
>>> +for ( ; ti->first_dev < ti->num_devs; ti->first_dev++ )
>>> +{
>>> +xen_sysctl_pcitopo_t pcitopo;
>>> +struct pci_dev *pdev;
>>> +
>>> +if ( copy_from_guest_offset(&pcitopo, ti->pcitopo,
>>> +ti->first_dev, 1) )
>>> +{
>>> +ret = -EFAULT;
>>> +break;
>>> +}
>>> +
>>> +spin_lock(&pcidevs_lock);
>>> +pdev = pci_get_pdev(pcitopo.pcidev.seg, pcitopo.pcidev.bus,
>>> +pcitopo.pcidev.devfn);
>>> +if ( !pdev || (pdev->node == NUMA_NO_NODE) )
>>> +pcitopo.node = INVALID_TOPOLOGY_ID;
>>> +else
>>> +pcitopo.node = pdev->node;
>> Are hypervisor-internal node numbers really meaningful to the caller?
> 
> 
> This is the same information (pxm -> node mapping ) that we provide in 
> XEN_SYSCTL_topologyinfo (renamed in this series to
> XEN_SYSCTL_cputopoinfo). Given that I expect the two topologies to be 
> used together I think the answer is yes.

Building your argumentation on potentially mis-designed existing
interfaces is bogus. The question is - what use is a Xen internal
node number to a caller of a particular hypercall (other than it
being purely informational, e.g. for printing human readable
output)?

In particular if we were to introduce a new non-sysctl interface,
determining whether the hypervisor internal representation is really
the right one to expose here should be one of the most important
design aspects. I personally think that exposing e.g. the firmware
determined (and hence hopefully stable across reboots) PXM would
be more reasonable.

>>> @@ -463,7 +464,7 @@ typedef struct xen_sysctl_lockprof_op 
> xen_sysctl_lockprof_op_t;
>>>   DEFINE_XEN_GUEST_HANDLE(xen_sysctl_lockprof_op_t);
>>>   
>>>   /* XEN_SYSCTL_cputopoinfo */
>>> -#define INVALID_TOPOLOGY_ID  (~0U)
>>> +#define INVALID_TOPOLOGY_ID  (~0U) /* Also used by pcitopo */
>> Better extend the preceding comment.
> 
> I mentioned it to Wei yesterday that the file is structured (or at least 
> appears to me to be structured) in such a way that these top comments 
> mark sections of definitions for each sysctl. And so I thought that I'd 
> be breaking this convention if I were to extend the comment.

Yeah, I saw that discussion after having replied. Looking more
closely, I think the placement of the INVALID_TOPOLOGY_ID
definition is sub-optimal when you want to use it in a second place.
Move it mode towards the beginning of the header, leave the
current comment as is, and if you feel so ad a new comment ahead
of it explaining which operations are using it.

And then, if we go the non-sysctl route, the definition would likely
need moving into xen.h anyway.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/xen: Free bootmem in free_p2m_page() during early boot

2015-01-07 Thread Boris Ostrovsky

On 01/07/2015 10:10 AM, David Vrabel wrote:

On 07/01/15 14:08, Boris Ostrovsky wrote:

With recent changes in p2m we now have legitimate cases when
p2m memory needs to be freed during early boot (i.e. before
slab is initialized).

Signed-off-by: Boris Ostrovsky 

Applied to to stable/for-linus-3.19, thanks.

If I understand correctly this didn't fully fix your 32-bit dom0 crashes?


No, this only allowed me to be alive for a few more microseconds ;-)

-boris

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 0/2] Two libxl misc patches

2015-01-07 Thread Wei Liu
These changes were requested by Ian Campbell when relevant patches were posted
but they didn't warrant a resend at that time.

Wei Liu (2):
  libxl_internal: lock_carefd -> carefd
  libxl_internal: comment on domain userdata unlock function

 tools/libxl/libxl_internal.c |   20 +---
 tools/libxl/libxl_internal.h |2 +-
 2 files changed, 18 insertions(+), 4 deletions(-)

-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


  1   2   >