Re: [Xen-devel] [PATCH 17/22] xen/arm: p2m: Don't need to restore the state for an idle vCPU.

2016-07-22 Thread Sergej Proskurin


On 07/20/2016 06:10 PM, Julien Grall wrote:
> The function p2m_restore_state could be called with an idle vCPU in
> arguments (when called by construct_dom0). However, we will never return
> to EL0/EL1 in this case, so it is not necessary to restore the p2m
> registers.
> 

I absolutely agree.

Cheers,
~Sergej

> Signed-off-by: Julien Grall 
> ---
>  xen/arch/arm/p2m.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index c52081a..d1b6009 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -127,6 +127,9 @@ void p2m_restore_state(struct vcpu *n)
>  {
>  register_t hcr;
>  
> +if ( is_idle_vcpu(n) )
> +return;
> +
>  hcr = READ_SYSREG(HCR_EL2);
>  WRITE_SYSREG(hcr & ~HCR_VM, HCR_EL2);
>  isb();
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 16/22] xen/arm: p2m: Move the vttbr field from arch_domain to p2m_domain

2016-07-22 Thread Sergej Proskurin
Hi Julien,

On 07/20/2016 06:10 PM, Julien Grall wrote:
> The field vttbr holds the base address of the translation table for
> guest. Its value will depends on how the p2m has been initialized and
> will only be used by the code code.
> 
> So move the field from arch_domain to p2m_domain. This will also ease
> the implementation of altp2m.
> ---
>  xen/arch/arm/p2m.c   | 11 +++
>  xen/arch/arm/traps.c |  2 +-
>  xen/include/asm-arm/domain.h |  1 -
>  xen/include/asm-arm/p2m.h|  3 +++
>  4 files changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index c407e6a..c52081a 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -107,10 +107,14 @@ void dump_p2m_lookup(struct domain *d, paddr_t addr)
>  
>  static void p2m_load_VTTBR(struct domain *d)
>  {
> +struct p2m_domain *p2m = &d->arch.p2m;
> +

This is ok to me. Further altp2m implementation can easily extend this
code base.

Also, as your patch (patch #17) as well eliminates the possibility that
the idle domain reaches this function the following check could be
potentially removed (as far as I know, p2m_load_VTTBR is reached only
through p2m_restore_state).

>  if ( is_idle_domain(d) )
>  return;
> -BUG_ON(!d->arch.vttbr);
> -WRITE_SYSREG64(d->arch.vttbr, VTTBR_EL2);
> +
> +ASSERT(p2m->vttbr);
> +
> +WRITE_SYSREG64(p2m->vttbr, VTTBR_EL2);
>  isb(); /* Ensure update is visible */
>  }
>  
> @@ -1298,8 +1302,7 @@ static int p2m_alloc_table(struct domain *d)
>  
>  p2m->root = page;
>  
> -d->arch.vttbr = page_to_maddr(p2m->root)
> -| ((uint64_t)p2m->vmid&0xff)<<48;
> +p2m->vttbr = page_to_maddr(p2m->root) | ((uint64_t)p2m->vmid & 0xff) << 
> 48;
>  
>  /*
>   * Make sure that all TLBs corresponding to the new VMID are flushed
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 06a8ee5..65c6fb4 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -880,7 +880,7 @@ void vcpu_show_registers(const struct vcpu *v)
>  ctxt.ifsr32_el2 = v->arch.ifsr;
>  #endif
>  
> -ctxt.vttbr_el2 = v->domain->arch.vttbr;
> +ctxt.vttbr_el2 = v->domain->arch.p2m.vttbr;
>  
>  _show_registers(&v->arch.cpu_info->guest_cpu_user_regs, &ctxt, 1, v);
>  }
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index 4e9d8bf..9452fcd 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -48,7 +48,6 @@ struct arch_domain
>  
>  /* Virtual MMU */
>  struct p2m_domain p2m;
> -uint64_t vttbr;
>  
>  struct hvm_domain hvm_domain;
>  gfn_t *grant_table_gfn;
> diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
> index ce28e8a..53c4d78 100644
> --- a/xen/include/asm-arm/p2m.h
> +++ b/xen/include/asm-arm/p2m.h
> @@ -32,6 +32,9 @@ struct p2m_domain {
>  /* Current VMID in use */
>  uint8_t vmid;
>  
> +/* Current Translation Table Base Register for the p2m */
> +uint64_t vttbr;
> +
>  /*
>   * Highest guest frame that's ever been mapped in the p2m
>   * Only takes into account ram and foreign mapping
> 

Cheers,
~Sergej

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/3] xen-blkfront: dynamic configuration of per-vbd resources

2016-07-22 Thread Roger Pau Monné
On Thu, Jul 21, 2016 at 06:08:05PM +0800, Bob Liu wrote:
> 
> On 07/21/2016 04:57 PM, Roger Pau Monné wrote:
> > On Fri, Jul 15, 2016 at 05:31:49PM +0800, Bob Liu wrote:
> >> The current VBD layer reserves buffer space for each attached device based 
> >> on
> >> three statically configured settings which are read at boot time.
> >>  * max_indirect_segs: Maximum amount of segments.
> >>  * max_ring_page_order: Maximum order of pages to be used for the shared 
> >> ring.
> >>  * max_queues: Maximum of queues(rings) to be used.
> >>
> >> But the storage backend, workload, and guest memory result in very 
> >> different
> >> tuning requirements. It's impossible to centrally predict application
> >> characteristics so it's best to leave allow the settings can be dynamiclly
> >> adjusted based on workload inside the Guest.
> >>
> >> Usage:
> >> Show current values:
> >> cat /sys/devices/vbd-xxx/max_indirect_segs
> >> cat /sys/devices/vbd-xxx/max_ring_page_order
> >> cat /sys/devices/vbd-xxx/max_queues
> >>
> >> Write new values:
> >> echo  > /sys/devices/vbd-xxx/max_indirect_segs
> >> echo  > /sys/devices/vbd-xxx/max_ring_page_order
> >> echo  > /sys/devices/vbd-xxx/max_queues
> >>
> >> Signed-off-by: Bob Liu 
> >> --
> >> v2: Add device lock and other comments from Konrad.
> >> ---
> >>  drivers/block/xen-blkfront.c | 285 
> >> ++-
> >>  1 file changed, 283 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
> >> index 10f46a8..9a5ed22 100644
> >> --- a/drivers/block/xen-blkfront.c
> >> +++ b/drivers/block/xen-blkfront.c
> >> @@ -46,6 +46,7 @@
> >>  #include 
> >>  #include 
> >>  #include 
> >> +#include 
> >>  
> >>  #include 
> >>  #include 
> >> @@ -212,6 +213,11 @@ struct blkfront_info
> >>/* Save uncomplete reqs and bios for migration. */
> >>struct list_head requests;
> >>struct bio_list bio_list;
> >> +  /* For dynamic configuration. */
> >> +  unsigned int reconfiguring:1;
> >> +  int new_max_indirect_segments;
> >> +  int new_max_ring_page_order;
> >> +  int new_max_queues;
> >>  };
> >>  
> >>  static unsigned int nr_minors;
> >> @@ -1350,6 +1356,31 @@ static void blkif_free(struct blkfront_info *info, 
> >> int suspend)
> >>for (i = 0; i < info->nr_rings; i++)
> >>blkif_free_ring(&info->rinfo[i]);
> >>  
> >> +  /* Remove old xenstore nodes. */
> >> +  if (info->nr_ring_pages > 1)
> >> +  xenbus_rm(XBT_NIL, info->xbdev->nodename, "ring-page-order");
> >> +
> >> +  if (info->nr_rings == 1) {
> >> +  if (info->nr_ring_pages == 1) {
> >> +  xenbus_rm(XBT_NIL, info->xbdev->nodename, "ring-ref");
> >> +  } else {
> >> +  for (i = 0; i < info->nr_ring_pages; i++) {
> >> +  char ring_ref_name[RINGREF_NAME_LEN];
> >> +
> >> +  snprintf(ring_ref_name, RINGREF_NAME_LEN, 
> >> "ring-ref%u", i);
> >> +  xenbus_rm(XBT_NIL, info->xbdev->nodename, 
> >> ring_ref_name);
> >> +  }
> >> +  }
> >> +  } else {
> >> +  xenbus_rm(XBT_NIL, info->xbdev->nodename, 
> >> "multi-queue-num-queues");
> >> +
> >> +  for (i = 0; i < info->nr_rings; i++) {
> >> +  char queuename[QUEUE_NAME_LEN];
> >> +
> >> +  snprintf(queuename, QUEUE_NAME_LEN, "queue-%u", i);
> >> +  xenbus_rm(XBT_NIL, info->xbdev->nodename, queuename);
> >> +  }
> >> +  }
> >>kfree(info->rinfo);
> >>info->rinfo = NULL;
> >>info->nr_rings = 0;
> >> @@ -1772,6 +1803,10 @@ static int talk_to_blkback(struct xenbus_device 
> >> *dev,
> >>info->nr_ring_pages = 1;
> >>else {
> >>ring_page_order = min(xen_blkif_max_ring_order, max_page_order);
> >> +  if (info->new_max_ring_page_order) {
> > 
> > Instead of calling this "new_max_ring_page_order", could you just call it 
> > max_ring_page_order, iniitalize it to xen_blkif_max_ring_order by default 
> 
> 
> Sure, I can do that.
> 
> 
> > and use it everywhere instead of xen_blkif_max_ring_order?
> 
> 
> But "xen_blkif_max_ring_order" still have to be used here, this is the only 
> place "xen_blkif_max_ring_order" is used(except checking the value of it in 
> xlblk_init()).
> 
> 
> > 
> >> +  BUG_ON(info->new_max_ring_page_order > max_page_order);
> >> +  ring_page_order = info->new_max_ring_page_order;
> >> +  }
> >>info->nr_ring_pages = 1 << ring_page_order;
> >>}
> >>  
> >> @@ -1895,6 +1930,10 @@ static int negotiate_mq(struct blkfront_info *info)
> >>backend_max_queues = 1;
> >>  
> >>info->nr_rings = min(backend_max_queues, xen_blkif_max_queues);
> >> +  if (info->new_max_queues) {
> > 
> > Same here IMHO, this is going to make the code flow slightly easier to 
> > understand.
> > 
> >> +  BUG_ON(info->new_max_queues > backend_max_queues);
> >

Re: [Xen-devel] [PATCH 18/22] xen/arm: p2m: Rework the context switch to another VTTBR in flush_tlb_domain

2016-07-22 Thread Sergej Proskurin
Hi Julien,

On 07/20/2016 06:11 PM, Julien Grall wrote:
> The current implementation of flush_tlb_domain is relying on the domain
> to have a single p2m. With the upcoming feature altp2m, a single domain
> may have different p2m. So we would need to switch to the correct p2m in
> order to flush the TLBs.
> 
> Rather than checking whether the domain is not the current domain, check
> whether the VTTBR is different. The resulting assembly code is much
> smaller: from 38 instructions (+ 2 functions call) to 22 instructions.
> 
> Signed-off-by: Julien Grall 
> ---
>  xen/arch/arm/p2m.c | 18 +++---
>  1 file changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index d1b6009..015c1e8 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -151,24 +151,28 @@ void p2m_restore_state(struct vcpu *n)
>  
>  void flush_tlb_domain(struct domain *d)
>  {
> +struct p2m_domain *p2m = &d->arch.p2m;
>  unsigned long flags = 0;
> +uint64_t ovttbr;
>  
>  /*
> - * Update the VTTBR if necessary with the domain d. In this case,
> - * it's only necessary to flush TLBs on every CPUs with the current VMID
> - * (our domain).
> + * ARM only provides an instruction to flush TLBs for the current
> + * VMID. So switch to the VTTBR of a given P2M if different.
>   */
> -if ( d != current->domain )
> +ovttbr = READ_SYSREG64(VTTBR_EL2);
> +if ( ovttbr != p2m->vttbr )
>  {
>  local_irq_save(flags);
> -p2m_load_VTTBR(d);
> +WRITE_SYSREG64(p2m->vttbr, VTTBR_EL2);
> +isb();
>  }
>  
>  flush_tlb();
>  
> -if ( d != current->domain )
> +if ( ovttbr != READ_SYSREG64(VTTBR_EL2) )
>  {
> -p2m_load_VTTBR(current->domain);
> +WRITE_SYSREG64(ovttbr, VTTBR_EL2);
> +isb();
>  local_irq_restore(flags);
>  }
>  }
> 

Thank you for this implementation change. Since the upcoming altp2m
feature uses a VMID per p2m view, it makes absolutely sense to directly
look for differing VTTBRs.

Cheers,
~Sergej

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 19/22] xen/arm: p2m: Inline p2m_load_VTTBR into p2m_restore_state

2016-07-22 Thread Sergej Proskurin
Hi Julien,

On 07/20/2016 06:11 PM, Julien Grall wrote:
> p2m_restore_state is the last caller of p2m_load_VTTBR and already check
> if the vCPU does not belong to the idle domain.
> 
> Note that it is likely possible to remove some isb in the function
> p2m_restore_state, however this is not the purpose of this patch. So the
> numerous isb have been left.
> 

Right now, I don't see any issues with removing the p2m_load_VTTBR
function in combination with changes applied to flush_tlb_domain in your
patch #18 and #17. However, I am not entirely sure whether it makes
sense to entirely remove the function and replicate the VTTBR loading
functionality across multiple functions. Why don't we just provide a
struct p2m_domain* to p2m_load_VTTBR (potentially with a backpointer to
the associated domain, as it is shown in the arm/altp2m patch) and use
the function inline?

Cheers,
~Sergej


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/3] xen-blkfront: dynamic configuration of per-vbd resources

2016-07-22 Thread Bob Liu

On 07/22/2016 03:45 PM, Roger Pau Monné wrote:
> On Thu, Jul 21, 2016 at 06:08:05PM +0800, Bob Liu wrote:
>>
>> On 07/21/2016 04:57 PM, Roger Pau Monné wrote:
..[snip]..
 +
 +static ssize_t dynamic_reconfig_device(struct blkfront_info *info, 
 ssize_t count)
 +{
 +  unsigned int i;
 +  int err = -EBUSY;
 +
 +  /*
 +   * Make sure no migration in parallel, device lock is actually a
 +   * mutex.
 +   */
 +  if (!device_trylock(&info->xbdev->dev)) {
 +  pr_err("Fail to acquire dev:%s lock, may be in migration.\n",
 +  dev_name(&info->xbdev->dev));
 +  return err;
 +  }
 +
 +  /*
 +   * Prevent new requests and guarantee no uncompleted reqs.
 +   */
 +  blk_mq_freeze_queue(info->rq);
 +  if (part_in_flight(&info->gd->part0))
 +  goto out;
 +
 +  /*
 +   * FrontBackend
 +   * Switch to XenbusStateClosed
 +   *  frontend_changed():
 +   *   case XenbusStateClosed:
 +   *  xen_blkif_disconnect()
 +   *  Switch to 
 XenbusStateClosed
 +   * blkfront_resume():
 +   *  frontend_changed():
 +   *  reconnect
 +   * Wait until XenbusStateConnected
 +   */
 +  info->reconfiguring = true;
 +  xenbus_switch_state(info->xbdev, XenbusStateClosed);
 +
 +  /* Poll every 100ms, 1 minute timeout. */
 +  for (i = 0; i < 600; i++) {
 +  /*
 +   * Wait backend enter XenbusStateClosed, blkback_changed()
 +   * will clear reconfiguring.
 +   */
 +  if (!info->reconfiguring)
 +  goto resume;
 +  schedule_timeout_interruptible(msecs_to_jiffies(100));
 +  }
>>>
>>> Instead of having this wait, could you just set info->reconfiguring = 1, 
>>> set 
>>> the frontend state to XenbusStateClosed and mimic exactly what a resume 
>>> from 
>>> suspension does? blkback_changed would have to set the frontend state to 
>>> InitWait when it detects that the backend has switched to Closed, and call 
>>> blkfront_resume.
>>
>>
>> I think that won't work.
>> In the real "resume" case, the power management system will trigger all 
>> ->resume() path.
>> But there is no place for dynamic configuration.
> 
> Hello,
> 
> I think it should be possible to set info->reconfiguring and wait for the 
> backend to switch to state Closed, at that point we should call blkif_resume 
> (from blkback_changed) and the backend will follow the reconection.
> 

Okay, I get your point. Yes, that's an option.

But this will make 'dynamic configuration' to be async, I'm worry about the 
end-user will get panic.
E.g
A end-user "echo  > /sys/devices/vbd-xxx/max_indirect_segs",
but then the device will be Closed and disappeared, the user have to wait for a 
random time so that the device can resume.

-- 
Regards,
-Bob

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 15/22] xen/arm: Don't call p2m_alloc_table from arch_domain_create

2016-07-22 Thread Sergej Proskurin
Hi Julien,


> -int p2m_alloc_table(struct domain *d)
> +static int p2m_alloc_table(struct domain *d)

While moving parts of the altp2m code out of ./xen/arch/arm/p2m.c, the
function p2m_alloc_table needs to be called from ./xen/arch/arm/altp2m.c
to allocate the individual altp2m views. Hence it should not be static.

However, this requirement is clearly part of an entirely patch, which
will be introduced in the near future and hence can be discussed there.

Cheers,
~Sergej

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 20/22] xen/arm: Don't export flush_tlb_domain

2016-07-22 Thread Sergej Proskurin
Hi Julien,

On 07/20/2016 06:11 PM, Julien Grall wrote:
> The function flush_tlb_domain is not used outside of the file where it
> has been declared.
> 

As for patch #15, the same applies here too:
For altp2m, flush_tlb_domain/p2m_flush_tlb should be made available to
./xen/arch/arm/altp2m.c.

Cheers,
~Sergej

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86/vMSI-X: Fix host crash when shutting down guests with MSI capable devices

2016-07-22 Thread Sander Eikelenboom

Thursday, July 21, 2016, 12:18:37 PM, you wrote:

> c/s 74c6dc2d "x86/vMSI-X: defer intercept handler registration" caused MSI-X
> table infrastructure not to always be initialised, but it missed one path
> which needed an is-initialised check.

> If a devices is passed through to a domain which is MSI capable but not MSI-X
> capable, the call to msixtbl_init() is omitted, but a XEN_DOMCTL_unbind_pt_irq
> hypercall still calls into msixtbl_pt_unregister().  This follows the linked
> list pointer which is still NULL.

> Introduce an is-initalised check to msixtbl_pt_unregister().

> Furthermore, the purpose of the open-coded msixtbl_list.next check is rather
> subtle.  Introduce an msixtbl_initialised() predicate instead, which makes its
> purpose far more obvious.

> Reported-by: Sander Eikelenboom 
> Signed-off-by: Andrew Cooper 
> ---
> CC: Jan Beulich 
> CC: Sander Eikelenboom 

> Sander - would you mind double checking this patch?
> ---

Hi Andrew,

Just got the chance to test and it works for me !

Thanks,

Sander


>  xen/arch/x86/hvm/vmsi.c | 16 +---
>  1 file changed, 13 insertions(+), 3 deletions(-)

> diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
> index e418b98..ef1dfff 100644
> --- a/xen/arch/x86/hvm/vmsi.c
> +++ b/xen/arch/x86/hvm/vmsi.c
> @@ -166,6 +166,16 @@ struct msixtbl_entry
>  
>  static DEFINE_RCU_READ_LOCK(msixtbl_rcu_lock);
>  
> +/*
> + * MSI-X table infrastructure is dynamically initialised when an MSI-X 
> capable
> + * device is passed through to a domain, rather than unconditionally for all
> + * domains.
> + */
> +static bool msixtbl_initialised(const struct domain *d)
> +{
+return !!d->>arch.hvm_domain.msixtbl_list.next;
> +}
> +
>  static struct msixtbl_entry *msixtbl_find_entry(
>  struct vcpu *v, unsigned long addr)
>  {
> @@ -519,7 +529,7 @@ void msixtbl_pt_unregister(struct domain *d, struct pirq 
> *pirq)
>  ASSERT(pcidevs_locked());
>  ASSERT(spin_is_locked(&d->event_lock));
>  
> -if ( !has_vlapic(d) )
> +if ( !msixtbl_initialised(d) )
>  return;
>  
>  irq_desc = pirq_spin_lock_irq_desc(pirq, NULL);
> @@ -552,7 +562,7 @@ void msixtbl_init(struct domain *d)
>  struct hvm_io_handler *handler;
>  
>  if ( !has_hvm_container_domain(d) || !has_vlapic(d) ||
- d->>arch.hvm_domain.msixtbl_list.next )
> + msixtbl_initialised(d) )
>  return;
>  
>  INIT_LIST_HEAD(&d->arch.hvm_domain.msixtbl_list);
> @@ -569,7 +579,7 @@ void msixtbl_pt_cleanup(struct domain *d)
>  {
>  struct msixtbl_entry *entry, *temp;
>  
-if ( !d->>arch.hvm_domain.msixtbl_list.next )
> +if ( !msixtbl_initialised(d) )
>  return;
>  
>  spin_lock(&d->event_lock);


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Kernel panic on Xen virtualisation in Debian

2016-07-22 Thread Ingo Jürgensmann

On 18.07.2016 18:44, Andreas Ziegler wrote:


did the information that Ingo provided (i cited his message to the list
below) maybe help in narrowing down the possible issue?



If you need additional information we can try getting it for you, as
Ingo might be able to reproduce the kernel panic, although not 
reliably.


In the meanwhile, I activated IPv6 again on Tuesday evening and today 
the server crashed again some minutes ago. Here's the output from 
netconsole:


Jul 22 10:27:55 31.172.31.251 [171182.273973] BUG: unable to handle 
kernel

Jul 22 10:27:55 31.172.31.251  at 88025b2d
Jul 22 10:27:55 31.172.31.251 [171182.274039] IP:
Jul 22 10:27:55 31.172.31.251  [] memcpy+0x6/0x110
Jul 22 10:27:55 31.172.31.251 [171182.274080] PGD 1814067
Jul 22 10:27:55 31.172.31.251
Jul 22 10:27:55 31.172.31.251 [171182.274130] Oops:  [#1]
Jul 22 10:27:55 31.172.31.251
Jul 22 10:27:55 31.172.31.251 [171182.274171] Modules linked in:
Jul 22 10:27:55 31.172.31.251  nls_utf8
Jul 22 10:27:55 31.172.31.251  btrfs
Jul 22 10:27:55 31.172.31.251  xor
Jul 22 10:27:55 31.172.31.251  raid6_pq
Jul 22 10:27:55 31.172.31.251  ufs
Jul 22 10:27:55 31.172.31.251  qnx4
Jul 22 10:27:55 31.172.31.251  hfsplus
Jul 22 10:27:55 31.172.31.251  hfs
Jul 22 10:27:55 31.172.31.251  minix
Jul 22 10:27:55 31.172.31.251  ntfs
Jul 22 10:27:55 31.172.31.251  vfat
Jul 22 10:27:55 31.172.31.251  msdos
Jul 22 10:27:55 31.172.31.251  fat
Jul 22 10:27:55 31.172.31.251  jfs
Jul 22 10:27:55 31.172.31.251  authenc
Jul 22 10:27:55 31.172.31.251  xfrm6_mode_tunnel
Jul 22 10:27:55 31.172.31.251  xfrm4_mode_tunnel
Jul 22 10:27:55 31.172.31.251  xt_physdev
Jul 22 10:27:55 31.172.31.251  xen_netback
Jul 22 10:27:55 31.172.31.251  tun
Jul 22 10:27:55 31.172.31.251  xen_blkback
Jul 22 10:27:55 31.172.31.251  xt_multiport
Jul 22 10:27:55 31.172.31.251  twofish_generic
Jul 22 10:27:55 31.172.31.251  twofish_avx_x86_64
Jul 22 10:27:55 31.172.31.251  twofish_x86_64_3way
Jul 22 10:27:55 31.172.31.251  twofish_x86_64
Jul 22 10:27:55 31.172.31.251  twofish_common
Jul 22 10:27:55 31.172.31.251  serpent_avx_x86_64
Jul 22 10:27:55 31.172.31.251  serpent_sse2_x86_64
Jul 22 10:27:55 31.172.31.251  serpent_generic
Jul 22 10:27:55 31.172.31.251  blowfish_generic
Jul 22 10:27:55 31.172.31.251  blowfish_x86_64
Jul 22 10:27:55 31.172.31.251  blowfish_common
Jul 22 10:27:55 31.172.31.251  cast5_avx_x86_64
Jul 22 10:27:55 31.172.31.251  cast5_generic
Jul 22 10:27:55 31.172.31.251  cast_common
Jul 22 10:27:55 31.172.31.251  seqiv
Jul 22 10:27:55 31.172.31.251  ctr
Jul 22 10:27:55 31.172.31.251  ecb
Jul 22 10:27:55 31.172.31.251  des_generic
Jul 22 10:27:55 31.172.31.251  cbc
Jul 22 10:27:55 31.172.31.251  algif_skcipher
Jul 22 10:27:55 31.172.31.251  camellia_generic
Jul 22 10:27:55 31.172.31.251  camellia_aesni_avx_x86_64
Jul 22 10:27:55 31.172.31.251  camellia_x86_64
Jul 22 10:27:55 31.172.31.251  xts
Jul 22 10:27:55 31.172.31.251  xcbc
Jul 22 10:27:55 31.172.31.251  hmac
Jul 22 10:27:55 31.172.31.251  sha512_ssse3
Jul 22 10:27:55 31.172.31.251  sha512_generic
Jul 22 10:27:55 31.172.31.251  sha256_ssse3
Jul 22 10:27:55 31.172.31.251  sha256_generic
Jul 22 10:27:55 31.172.31.251  md4
Jul 22 10:27:55 31.172.31.251  algif_hash
Jul 22 10:27:55 31.172.31.251  af_alg
Jul 22 10:27:55 31.172.31.251  ipt_ULOG
Jul 22 10:27:55 31.172.31.251  xfrm_user
Jul 22 10:27:55 31.172.31.251  xfrm4_tunnel
Jul 22 10:27:55 31.172.31.251  xen_gntdev
Jul 22 10:27:55 31.172.31.251  tunnel4
Jul 22 10:27:55 31.172.31.251  xen_evtchn
Jul 22 10:27:55 31.172.31.251  xenfs
Jul 22 10:27:55 31.172.31.251  xen_privcmd
Jul 22 10:27:55 31.172.31.251  ipcomp
Jul 22 10:27:55 31.172.31.251  xfrm_ipcomp
Jul 22 10:27:55 31.172.31.251  esp4
Jul 22 10:27:55 31.172.31.251  ah4
Jul 22 10:27:55 31.172.31.251  af_key
Jul 22 10:27:55 31.172.31.251  xfrm_algo
Jul 22 10:27:55 31.172.31.251  ipmi_poweroff
Jul 22 10:27:55 31.172.31.251  video
Jul 22 10:27:55 31.172.31.251  thermal
Jul 22 10:27:55 31.172.31.251  fan
Jul 22 10:27:55 31.172.31.251  ac
Jul 22 10:27:55 31.172.31.251  battery
Jul 22 10:27:55 31.172.31.251  ip6t_REJECT
Jul 22 10:27:55 31.172.31.251  ip6table_filter
Jul 22 10:27:55 31.172.31.251  ip6_tables
Jul 22 10:27:55 31.172.31.251  bridge
Jul 22 10:27:55 31.172.31.251  stp
Jul 22 10:27:55 31.172.31.251  llc
Jul 22 10:27:55 31.172.31.251  ipt_REJECT
Jul 22 10:27:55 31.172.31.251  xt_tcpudp
Jul 22 10:27:55 31.172.31.251  iptable_filter
Jul 22 10:27:55 31.172.31.251  ip_tables
Jul 22 10:27:55 31.172.31.251  x_tables
Jul 22 10:27:55 31.172.31.251  ext4
Jul 22 10:27:55 31.172.31.251  crc16
Jul 22 10:27:55 31.172.31.251  mbcache
Jul 22 10:27:55 31.172.31.251  jbd2
Jul 22 10:27:55 31.172.31.251  fuse
Jul 22 10:27:55 31.172.31.251  ipmi_devintf
Jul 22 10:27:55 31.172.31.251  ipmi_watchdog
Jul 22 10:27:55 31.172.31.251  w83627ehf
Jul 22 10:27:55 31.172.31.251  hwmon_vid
Jul 22 10:27:55 31.172.31.251  nf_conntrack_ipv4
Jul 22 10:27:55 31.172.31.251  nf_defrag_ipv4
Jul 22 10:27:55 31.172.31.251  nf_conntrack
Jul 22 10:

Re: [Xen-devel] [PATCH 15/22] xen/arm: Don't call p2m_alloc_table from arch_domain_create

2016-07-22 Thread Julien Grall



On 22/07/16 09:32, Sergej Proskurin wrote:

Hi Julien,


Hello Sergej,


-int p2m_alloc_table(struct domain *d)
+static int p2m_alloc_table(struct domain *d)


While moving parts of the altp2m code out of ./xen/arch/arm/p2m.c, the
function p2m_alloc_table needs to be called from ./xen/arch/arm/altp2m.c
to allocate the individual altp2m views. Hence it should not be static.


No, this function should not be called outside p2m.c as it will not 
fully initialize the p2m. You need to need to provide a function to 
initialize a p2m (such as p2m_init).


Regards.

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 16/22] xen/arm: p2m: Move the vttbr field from arch_domain to p2m_domain

2016-07-22 Thread Julien Grall



On 22/07/16 08:46, Sergej Proskurin wrote:

Hi Julien,


Hello Sergej,


On 07/20/2016 06:10 PM, Julien Grall wrote:

The field vttbr holds the base address of the translation table for
guest. Its value will depends on how the p2m has been initialized and
will only be used by the code code.

So move the field from arch_domain to p2m_domain. This will also ease
the implementation of altp2m.
---
 xen/arch/arm/p2m.c   | 11 +++
 xen/arch/arm/traps.c |  2 +-
 xen/include/asm-arm/domain.h |  1 -
 xen/include/asm-arm/p2m.h|  3 +++
 4 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index c407e6a..c52081a 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -107,10 +107,14 @@ void dump_p2m_lookup(struct domain *d, paddr_t addr)

 static void p2m_load_VTTBR(struct domain *d)
 {
+struct p2m_domain *p2m = &d->arch.p2m;
+


This is ok to me. Further altp2m implementation can easily extend this
code base.

Also, as your patch (patch #17) as well eliminates the possibility that
the idle domain reaches this function the following check could be
potentially removed (as far as I know, p2m_load_VTTBR is reached only
through p2m_restore_state).


I will not remove this check because this is not the goal of this patch 
and each patch should boot Xen without any dependencies on a follow-up 
patch: p2m_load_VTTBR is used flush_tlb_domain until patch #18 and 
p2m_restore_state does not yet have the is_idle_* check (it will be 
added in patch #17).


Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] Locking on vm-event operations (monitor)

2016-07-22 Thread Corneliu ZUZU

Hi,

I've been inspecting vm-event code parts to try and understand when and 
why domain pausing/locking is done. If I understood correctly, domain 
pausing is done solely to force all the vCPUs of that domain to see a 
configuration update and act upon it (e.g. in the case of a 
XEN_DOMCTL_MONITOR_EVENT_WRITE_CTRLREG that enables CR3 monitoring, 
domain pausing/unpausing ensures immediate enabling of CR3 load-exiting 
for all vCPUs), not to synchronize concurrent operations (lock-behavior).


As for locking, I see that for example vm_event_init_domain(), 
vm_event_cleanup_domain() and monitor_domctl() are all protected by the 
domctl-lock, but I don't think that's enough.

Here are a few code-paths that led me to believe that:
* do_hvm_op()->monitor_guest_request() reads d.monitor.guest_request_* 
resources, but it doesn't seem to take the domctl lock, so it seems 
possible for that to happen _while_ those resources are 
initialized/cleaned-up
* monitor_guest_request() also calls 
monitor_traps()->...->vm_event_wait_slot()->...->vm_event_grab_slot() 
which attempts a vm_event_ring_lock(ved), which could also be called 
_while_ that's initialized (vm_event_enable()) or cleaned-up 
(vm_event_disable())
* hvm_monitor_cr() - e.g. on the code-path 
vmx_vmexit_handler(EXIT_REASON_CR_ACCESS)->vmx_cr_access(VMX_CONTROL_REG_ACCESS_TYPE_MOV_TO_CR)->hvm_mov_to_cr()->hvm_set_cr0()->hvm_monitor_crX() 
there doesn't seem to be taken into account the possibility of a 
concurrent monitor_init_domain()/monitor_cleanup_domain()


Am I missing something with these conclusions?
As a resolution for this, I've been thinking of adding a 'subsys_lock' 
field in the vm_event_domain structure, either spinlock or rw-lock, 
which would be initialised/uninitialised when d.vm_event is 
allocated/freed (domain_create()/complete_domain_destroy()).


Let me know what you think,
Corneliu.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 19/22] xen/arm: p2m: Inline p2m_load_VTTBR into p2m_restore_state

2016-07-22 Thread Julien Grall



On 22/07/16 09:07, Sergej Proskurin wrote:

Hi Julien,


Hello Sergej,


On 07/20/2016 06:11 PM, Julien Grall wrote:

p2m_restore_state is the last caller of p2m_load_VTTBR and already check
if the vCPU does not belong to the idle domain.

Note that it is likely possible to remove some isb in the function
p2m_restore_state, however this is not the purpose of this patch. So the
numerous isb have been left.



Right now, I don't see any issues with removing the p2m_load_VTTBR
function in combination with changes applied to flush_tlb_domain in your
patch #18 and #17. However, I am not entirely sure whether it makes
sense to entirely remove the function and replicate the VTTBR loading
functionality across multiple functions. Why don't we just provide a
struct p2m_domain* to p2m_load_VTTBR (potentially with a backpointer to
the associated domain, as it is shown in the arm/altp2m patch) and use
the function inline?


Because ideally this function should take a p2m in parameter and a p2m 
cannot belong to an idle domain. So the function would be:


WRITE_SYSREG64(p2m->vttbr, VTTBR_EL2);
isb();

However, in the case of p2m_restore_state the isb() is not necessary and 
will impact the performance. Yes, I know the function contains a lots of 
pointless isb(), this needs to be fixed at some point.


So overall, this function is not necessary.

Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH XTF 3/3] xtf-runner: regularise runner exit code

2016-07-22 Thread Wei Liu
On Thu, Jul 21, 2016 at 08:04:59PM +0100, Andrew Cooper wrote:
> On 21/07/2016 16:44, Wei Liu wrote:
> > Report the first "ERROR" and "FAILURE" if found, otherwise report "SKIP"
> > if found. Eventually if everything is ok the exit code will be 0.
> >
> > See runner code for numeric exit code space.
> >
> > Signed-off-by: Wei Liu 
> > ---
> >  xtf-runner | 12 +---
> >  1 file changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/xtf-runner b/xtf-runner
> > index 17ce933..ebe5c27 100755
> > --- a/xtf-runner
> > +++ b/xtf-runner
> > @@ -249,17 +249,23 @@ def run_tests(args):
> >  if not len(tests):
> >  raise RunnerError("No tests to run")
> >  
> > -rc = 0
> > +rc = exit_code('SUCCESS')
> 
> This logic would be easier to express if you use the indices of
> all_results as a measure of severity.
> 
> e.g.
> 
> rc = all_results.index('SUCCESS')
> 
> >  results = []
> >  
> >  for test in tests:
> >  
> >  res = run_test(test)
> > -if res != "SUCCESS":
> > -rc = 1
> > +if res in ("ERROR", "FAILURE") and rc == exit_code('SUCCESS'):
> > +rc = exit_code(res)
> 
> res_idx = all_results.index(res)
> if res_idx > rc:
> rc = res_idx

I intended to report the first "error" or "failure" encountered.

This would cause a FAILURE to overwrite previous ERROR result. Is that
what you want? 

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 20/22] xen/arm: Don't export flush_tlb_domain

2016-07-22 Thread Julien Grall



On 22/07/16 09:54, Sergej Proskurin wrote:

Hi Julien,


Hello Sergej,


On 07/20/2016 06:11 PM, Julien Grall wrote:

The function flush_tlb_domain is not used outside of the file where it
has been declared.



As for patch #15, the same applies here too:
For altp2m, flush_tlb_domain/p2m_flush_tlb should be made available to
./xen/arch/arm/altp2m.c.


Based on your previous version, I don't see any reason to flush call 
flush_tlb_domain/p2m_flush_tlb in altp2m.


Please justify why you would need it.

Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/3] xen-blkfront: dynamic configuration of per-vbd resources

2016-07-22 Thread Roger Pau Monné
On Fri, Jul 22, 2016 at 04:17:48PM +0800, Bob Liu wrote:
> 
> On 07/22/2016 03:45 PM, Roger Pau Monné wrote:
> > On Thu, Jul 21, 2016 at 06:08:05PM +0800, Bob Liu wrote:
> >>
> >> On 07/21/2016 04:57 PM, Roger Pau Monné wrote:
> ..[snip]..
>  +
>  +static ssize_t dynamic_reconfig_device(struct blkfront_info *info, 
>  ssize_t count)
>  +{
>  +unsigned int i;
>  +int err = -EBUSY;
>  +
>  +/*
>  + * Make sure no migration in parallel, device lock is actually a
>  + * mutex.
>  + */
>  +if (!device_trylock(&info->xbdev->dev)) {
>  +pr_err("Fail to acquire dev:%s lock, may be in 
>  migration.\n",
>  +dev_name(&info->xbdev->dev));
>  +return err;
>  +}
>  +
>  +/*
>  + * Prevent new requests and guarantee no uncompleted reqs.
>  + */
>  +blk_mq_freeze_queue(info->rq);
>  +if (part_in_flight(&info->gd->part0))
>  +goto out;
>  +
>  +/*
>  + * FrontBackend
>  + * Switch to XenbusStateClosed
>  + *  frontend_changed():
>  + *   case XenbusStateClosed:
>  + *  
>  xen_blkif_disconnect()
>  + *  Switch to 
>  XenbusStateClosed
>  + * blkfront_resume():
>  + *  frontend_changed():
>  + *  reconnect
>  + * Wait until XenbusStateConnected
>  + */
>  +info->reconfiguring = true;
>  +xenbus_switch_state(info->xbdev, XenbusStateClosed);
>  +
>  +/* Poll every 100ms, 1 minute timeout. */
>  +for (i = 0; i < 600; i++) {
>  +/*
>  + * Wait backend enter XenbusStateClosed, 
>  blkback_changed()
>  + * will clear reconfiguring.
>  + */
>  +if (!info->reconfiguring)
>  +goto resume;
>  +schedule_timeout_interruptible(msecs_to_jiffies(100));
>  +}
> >>>
> >>> Instead of having this wait, could you just set info->reconfiguring = 1, 
> >>> set 
> >>> the frontend state to XenbusStateClosed and mimic exactly what a resume 
> >>> from 
> >>> suspension does? blkback_changed would have to set the frontend state to 
> >>> InitWait when it detects that the backend has switched to Closed, and 
> >>> call 
> >>> blkfront_resume.
> >>
> >>
> >> I think that won't work.
> >> In the real "resume" case, the power management system will trigger all 
> >> ->resume() path.
> >> But there is no place for dynamic configuration.
> > 
> > Hello,
> > 
> > I think it should be possible to set info->reconfiguring and wait for the 
> > backend to switch to state Closed, at that point we should call 
> > blkif_resume 
> > (from blkback_changed) and the backend will follow the reconection.
> > 
> 
> Okay, I get your point. Yes, that's an option.
> 
> But this will make 'dynamic configuration' to be async, I'm worry about the 
> end-user will get panic.
> E.g
> A end-user "echo  > /sys/devices/vbd-xxx/max_indirect_segs",
> but then the device will be Closed and disappeared, the user have to wait for 
> a random time so that the device can resume.

That should not happen, AFAICT on migration the device never dissapears. 
alloc_disk and friends should not be called on resume from migration (see 
the switch in blkfront_connect, you should take the BLKIF_STATE_SUSPENDED 
path for the reconfiguration).

Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH XTF 3/3] xtf-runner: regularise runner exit code

2016-07-22 Thread Andrew Cooper

On 22/07/16 10:29, Wei Liu wrote:


On Thu, Jul 21, 2016 at 08:04:59PM +0100, Andrew Cooper wrote:

On 21/07/2016 16:44, Wei Liu wrote:

Report the first "ERROR" and "FAILURE" if found, otherwise report "SKIP"
if found. Eventually if everything is ok the exit code will be 0.

See runner code for numeric exit code space.

Signed-off-by: Wei Liu 
---
  xtf-runner | 12 +---
  1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/xtf-runner b/xtf-runner
index 17ce933..ebe5c27 100755
--- a/xtf-runner
+++ b/xtf-runner
@@ -249,17 +249,23 @@ def run_tests(args):
  if not len(tests):
  raise RunnerError("No tests to run")
  
-rc = 0

+rc = exit_code('SUCCESS')

This logic would be easier to express if you use the indices of
all_results as a measure of severity.

e.g.

rc = all_results.index('SUCCESS')


  results = []
  
  for test in tests:
  
  res = run_test(test)

-if res != "SUCCESS":
-rc = 1
+if res in ("ERROR", "FAILURE") and rc == exit_code('SUCCESS'):
+rc = exit_code(res)

res_idx = all_results.index(res)
if res_idx > rc:
 rc = res_idx

I intended to report the first "error" or "failure" encountered.

This would cause a FAILURE to overwrite previous ERROR result. Is that
what you want?


When running more than one test, the overall result should be the most 
severe.  So yes, a subsequent FAILURE should override an ERROR.


~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [XTF PATCH v2 4/4] xtf-runner: regularise runner exit code

2016-07-22 Thread Wei Liu
The script now returns the most severe result. Document the exit code in
help string.

Signed-off-by: Wei Liu 
---
 xtf-runner | 22 ++
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/xtf-runner b/xtf-runner
index 1c96750..15b98c6 100755
--- a/xtf-runner
+++ b/xtf-runner
@@ -251,23 +251,30 @@ def run_tests(args):
 if not len(tests):
 raise RunnerError("No tests to run")
 
-rc = 0
+rc = all_results.index('SUCCESS')
 results = []
 
 for test in tests:
 
 res = run_test(test)
-if res != "SUCCESS":
-rc = 1
+res_idx = all_results.index(res);
+if res_idx > rc:
+rc = res_idx
 
 results.append(res)
 
+if rc == exit_code('SUCCESS'):
+for res in results:
+if res == 'SKIP':
+rc = exit_code('SKIP')
+break
+
 print "\nCombined test results:"
 
 for test, res in zip(tests, results):
 print "%-40s %s" % (test, res)
 
-return rc
+return exit_code(all_results[rc])
 
 
 def main():
@@ -308,6 +315,13 @@ def main():
   "   List all 'functional' or 'special' tests\n"
   "./xtf-runner --list hvm64\n"
   "   List all 'hvm64' tests\n"
+  "\n"
+  "  Exit code for this script:\n"
+  "0:everythin is ok\n"
+  "1,2:  reserved for python interpreter\n"
+  "3:test(s) are skipped\n"
+  "4:test(s) report error\n"
+  "5:test(s) report failure\n"
   ),
 )
 
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [XTF PATCH v2 3/4] xtf-runner: provide a set of exit codes for different results

2016-07-22 Thread Wei Liu
Signed-off-by: Wei Liu 
---
 xtf-runner | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/xtf-runner b/xtf-runner
index c938f28..1c96750 100755
--- a/xtf-runner
+++ b/xtf-runner
@@ -21,6 +21,16 @@ except ImportError:
 # Note that warning is not a result on its own.
 all_results = ('SUCCESS', 'SKIP', 'ERROR', 'FAILURE')
 
+# Return the exit code for different states.  Avoid using 1 and 2 because
+# python interpreter uses them -- see document for sys.exit.
+def exit_code(state):
+""" Convert a test result to an xtf-runner exit code. """
+return { "SUCCESS": 0,
+ "SKIP":3,
+ "ERROR":   4,
+ "FAILURE": 5,
+}.get(state, 4)
+
 # All test categories and configurations
 all_categories   = ("special", "functional", "xsa", "utility")
 pv_environments  = ("pv64", "pv32pae")
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [XTF PATCH v2 0/4] Runner exit code clean up

2016-07-22 Thread Wei Liu
Wei Liu (4):
  gitignore: ignore vim swp file
  xtf-runner: sync all test results
  xtf-runner: provide a set of exit codes for different results
  xtf-runner: regularise runner exit code

 .gitignore |  1 +
 xtf-runner | 37 -
 2 files changed, 33 insertions(+), 5 deletions(-)

-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [XTF PATCH v2 1/4] gitignore: ignore vim swp file

2016-07-22 Thread Wei Liu
Signed-off-by: Wei Liu 
---
 .gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.gitignore b/.gitignore
index 01243f0..56e1884 100644
--- a/.gitignore
+++ b/.gitignore
@@ -3,6 +3,7 @@
 *.d
 *.pyc
 *.pyo
+*.swp
 /arch/x86/*.lds
 /cscope.*
 /dist/
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [XTF PATCH v2 2/4] xtf-runner: sync all test results

2016-07-22 Thread Wei Liu
Signed-off-by: Wei Liu 
---
 xtf-runner | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/xtf-runner b/xtf-runner
index 50a5e96..c938f28 100755
--- a/xtf-runner
+++ b/xtf-runner
@@ -17,6 +17,9 @@ try:
 except ImportError:
 import simplejson as json
 
+# All results of a test, keep in sync with C code report.h.
+# Note that warning is not a result on its own.
+all_results = ('SUCCESS', 'SKIP', 'ERROR', 'FAILURE')
 
 # All test categories and configurations
 all_categories   = ("special", "functional", "xsa", "utility")
@@ -161,7 +164,7 @@ def run_test(test):
 if not "Test result:" in test_result:
 return "ERROR"
 
-for res in ("SUCCESS", "SKIP", "FAILURE"):
+for res in all_results:
 
 if res in test_result:
 return res
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/3] xen-blkfront: dynamic configuration of per-vbd resources

2016-07-22 Thread Bob Liu

On 07/22/2016 05:34 PM, Roger Pau Monné wrote:
> On Fri, Jul 22, 2016 at 04:17:48PM +0800, Bob Liu wrote:
>>
>> On 07/22/2016 03:45 PM, Roger Pau Monné wrote:
>>> On Thu, Jul 21, 2016 at 06:08:05PM +0800, Bob Liu wrote:

 On 07/21/2016 04:57 PM, Roger Pau Monné wrote:
>> ..[snip]..
>> +
>> +static ssize_t dynamic_reconfig_device(struct blkfront_info *info, 
>> ssize_t count)
>> +{
>> +unsigned int i;
>> +int err = -EBUSY;
>> +
>> +/*
>> + * Make sure no migration in parallel, device lock is actually a
>> + * mutex.
>> + */
>> +if (!device_trylock(&info->xbdev->dev)) {
>> +pr_err("Fail to acquire dev:%s lock, may be in 
>> migration.\n",
>> +dev_name(&info->xbdev->dev));
>> +return err;
>> +}
>> +
>> +/*
>> + * Prevent new requests and guarantee no uncompleted reqs.
>> + */
>> +blk_mq_freeze_queue(info->rq);
>> +if (part_in_flight(&info->gd->part0))
>> +goto out;
>> +
>> +/*
>> + * FrontBackend
>> + * Switch to XenbusStateClosed
>> + *  frontend_changed():
>> + *   case XenbusStateClosed:
>> + *  
>> xen_blkif_disconnect()
>> + *  Switch to 
>> XenbusStateClosed
>> + * blkfront_resume():
>> + *  frontend_changed():
>> + *  reconnect
>> + * Wait until XenbusStateConnected
>> + */
>> +info->reconfiguring = true;
>> +xenbus_switch_state(info->xbdev, XenbusStateClosed);
>> +
>> +/* Poll every 100ms, 1 minute timeout. */
>> +for (i = 0; i < 600; i++) {
>> +/*
>> + * Wait backend enter XenbusStateClosed, 
>> blkback_changed()
>> + * will clear reconfiguring.
>> + */
>> +if (!info->reconfiguring)
>> +goto resume;
>> +schedule_timeout_interruptible(msecs_to_jiffies(100));
>> +}
>
> Instead of having this wait, could you just set info->reconfiguring = 1, 
> set 
> the frontend state to XenbusStateClosed and mimic exactly what a resume 
> from 
> suspension does? blkback_changed would have to set the frontend state to 
> InitWait when it detects that the backend has switched to Closed, and 
> call 
> blkfront_resume.


 I think that won't work.
 In the real "resume" case, the power management system will trigger all 
 ->resume() path.
 But there is no place for dynamic configuration.
>>>
>>> Hello,
>>>
>>> I think it should be possible to set info->reconfiguring and wait for the 
>>> backend to switch to state Closed, at that point we should call 
>>> blkif_resume 
>>> (from blkback_changed) and the backend will follow the reconection.
>>>
>>
>> Okay, I get your point. Yes, that's an option.
>>
>> But this will make 'dynamic configuration' to be async, I'm worry about the 
>> end-user will get panic.
>> E.g
>> A end-user "echo  > /sys/devices/vbd-xxx/max_indirect_segs",
>> but then the device will be Closed and disappeared, the user have to wait 
>> for a random time so that the device can resume.
> 
> That should not happen, AFAICT on migration the device never dissapears. 

Oh, yes.

> alloc_disk and friends should not be called on resume from migration (see 
> the switch in blkfront_connect, you should take the BLKIF_STATE_SUSPENDED 
> path for the reconfiguration).
> 

What about if the end-user starts I/O immediately after writing new value to 
/sys?
But the resume is still in progress.

-- 
Regards,
-Bob

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [XTF PATCH v2 4/4] xtf-runner: regularise runner exit code

2016-07-22 Thread Andrew Cooper

On 22/07/16 10:43, Wei Liu wrote:


The script now returns the most severe result. Document the exit code in
help string.

Signed-off-by: Wei Liu 
---
  xtf-runner | 22 ++
  1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/xtf-runner b/xtf-runner
index 1c96750..15b98c6 100755
--- a/xtf-runner
+++ b/xtf-runner
@@ -251,23 +251,30 @@ def run_tests(args):
  if not len(tests):
  raise RunnerError("No tests to run")
  
-rc = 0

+rc = all_results.index('SUCCESS')
  results = []
  
  for test in tests:
  
  res = run_test(test)

-if res != "SUCCESS":
-rc = 1
+res_idx = all_results.index(res);
+if res_idx > rc:
+rc = res_idx
  
  results.append(res)
  
+if rc == exit_code('SUCCESS'):

+for res in results:
+if res == 'SKIP':
+rc = exit_code('SKIP')
+break


Why is this conditional needed? SKIP has index 1 so will automatically 
displace SUCCESS in the change above.



+
  print "\nCombined test results:"
  
  for test, res in zip(tests, results):

  print "%-40s %s" % (test, res)
  
-return rc

+return exit_code(all_results[rc])
  
  
  def main():

@@ -308,6 +315,13 @@ def main():
"   List all 'functional' or 'special' tests\n"
"./xtf-runner --list hvm64\n"
"   List all 'hvm64' tests\n"
+  "\n"
+  "  Exit code for this script:\n"
+  "0:everythin is ok\n"


everything

~Andrew


+  "1,2:  reserved for python interpreter\n"
+  "3:test(s) are skipped\n"
+  "4:test(s) report error\n"
+  "5:test(s) report failure\n"
),
  )
  



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Locking on vm-event operations (monitor)

2016-07-22 Thread Andrew Cooper

On 22/07/16 10:27, Corneliu ZUZU wrote:


Hi,

I've been inspecting vm-event code parts to try and understand when 
and why domain pausing/locking is done. If I understood correctly, 
domain pausing is done solely to force all the vCPUs of that domain to 
see a configuration update and act upon it (e.g. in the case of a 
XEN_DOMCTL_MONITOR_EVENT_WRITE_CTRLREG that enables CR3 monitoring, 
domain pausing/unpausing ensures immediate enabling of CR3 
load-exiting for all vCPUs), not to synchronize concurrent operations 
(lock-behavior).


Correct.  Without the vcpu/domain pause, a vcpu could be midway through 
a vmexit path with the monitor configuration changing under its feet.  
OTOH, it could be running in guest context, and only receive the update 
one scheduling timeslice later.




As for locking, I see that for example vm_event_init_domain(), 
vm_event_cleanup_domain() and monitor_domctl() are all protected by 
the domctl-lock, but I don't think that's enough.

Here are a few code-paths that led me to believe that:
* do_hvm_op()->monitor_guest_request() reads d.monitor.guest_request_* 
resources, but it doesn't seem to take the domctl lock, so it seems 
possible for that to happen _while_ those resources are 
initialized/cleaned-up
* monitor_guest_request() also calls 
monitor_traps()->...->vm_event_wait_slot()->...->vm_event_grab_slot() 
which attempts a vm_event_ring_lock(ved), which could also be called 
_while_ that's initialized (vm_event_enable()) or cleaned-up 
(vm_event_disable())
* hvm_monitor_cr() - e.g. on the code-path 
vmx_vmexit_handler(EXIT_REASON_CR_ACCESS)->vmx_cr_access(VMX_CONTROL_REG_ACCESS_TYPE_MOV_TO_CR)->hvm_mov_to_cr()->hvm_set_cr0()->hvm_monitor_crX() 
there doesn't seem to be taken into account the possibility of a 
concurrent monitor_init_domain()/monitor_cleanup_domain()


Am I missing something with these conclusions?
As a resolution for this, I've been thinking of adding a 'subsys_lock' 
field in the vm_event_domain structure, either spinlock or rw-lock, 
which would be initialised/uninitialised when d.vm_event is 
allocated/freed (domain_create()/complete_domain_destroy()).


I don't think you are missing anything.  It seems like a monitor lock is 
needed.


FWIW, the domctl lock is only used at the moment because it was easy, 
and worked when everything was a domctl.  Being a global spinlock, it is 
a severe bottlekneck for concurrent domain management, which I need to 
find some time to break apart, so the less reliance on it the better 
from my point of view.


~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [XTF PATCH v2 4/4] xtf-runner: regularise runner exit code

2016-07-22 Thread Wei Liu
On Fri, Jul 22, 2016 at 10:49:18AM +0100, Andrew Cooper wrote:
> On 22/07/16 10:43, Wei Liu wrote:
> 
> >The script now returns the most severe result. Document the exit code in
> >help string.
> >
> >Signed-off-by: Wei Liu 
> >---
> >  xtf-runner | 22 ++
> >  1 file changed, 18 insertions(+), 4 deletions(-)
> >
> >diff --git a/xtf-runner b/xtf-runner
> >index 1c96750..15b98c6 100755
> >--- a/xtf-runner
> >+++ b/xtf-runner
> >@@ -251,23 +251,30 @@ def run_tests(args):
> >  if not len(tests):
> >  raise RunnerError("No tests to run")
> >-rc = 0
> >+rc = all_results.index('SUCCESS')
> >  results = []
> >  for test in tests:
> >  res = run_test(test)
> >-if res != "SUCCESS":
> >-rc = 1
> >+res_idx = all_results.index(res);
> >+if res_idx > rc:
> >+rc = res_idx
> >  results.append(res)
> >+if rc == exit_code('SUCCESS'):
> >+for res in results:
> >+if res == 'SKIP':
> >+rc = exit_code('SKIP')
> >+break
> 
> Why is this conditional needed? SKIP has index 1 so will automatically
> displace SUCCESS in the change above.
> 

Forgot to delete that hunk. I will resend.

> >+
> >  print "\nCombined test results:"
> >  for test, res in zip(tests, results):
> >  print "%-40s %s" % (test, res)
> >-return rc
> >+return exit_code(all_results[rc])
> >  def main():
> >@@ -308,6 +315,13 @@ def main():
> >"   List all 'functional' or 'special' tests\n"
> >"./xtf-runner --list hvm64\n"
> >"   List all 'hvm64' tests\n"
> >+  "\n"
> >+  "  Exit code for this script:\n"
> >+  "0:everythin is ok\n"
> 
> everything

Fixed.

Wei.

> 
> ~Andrew
> 
> >+  "1,2:  reserved for python interpreter\n"
> >+  "3:test(s) are skipped\n"
> >+  "4:test(s) report error\n"
> >+  "5:test(s) report failure\n"
> >),
> >  )
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Locking on vm-event operations (monitor)

2016-07-22 Thread Razvan Cojocaru
On 07/22/2016 12:27 PM, Corneliu ZUZU wrote:
> Hi,
> 
> I've been inspecting vm-event code parts to try and understand when and
> why domain pausing/locking is done. If I understood correctly, domain
> pausing is done solely to force all the vCPUs of that domain to see a
> configuration update and act upon it (e.g. in the case of a
> XEN_DOMCTL_MONITOR_EVENT_WRITE_CTRLREG that enables CR3 monitoring,
> domain pausing/unpausing ensures immediate enabling of CR3 load-exiting
> for all vCPUs), not to synchronize concurrent operations (lock-behavior).
> 
> As for locking, I see that for example vm_event_init_domain(),
> vm_event_cleanup_domain() and monitor_domctl() are all protected by the
> domctl-lock, but I don't think that's enough.
> Here are a few code-paths that led me to believe that:
> * do_hvm_op()->monitor_guest_request() reads d.monitor.guest_request_*
> resources, but it doesn't seem to take the domctl lock, so it seems
> possible for that to happen _while_ those resources are
> initialized/cleaned-up
> * monitor_guest_request() also calls
> monitor_traps()->...->vm_event_wait_slot()->...->vm_event_grab_slot()
> which attempts a vm_event_ring_lock(ved), which could also be called
> _while_ that's initialized (vm_event_enable()) or cleaned-up
> (vm_event_disable())
> * hvm_monitor_cr() - e.g. on the code-path
> vmx_vmexit_handler(EXIT_REASON_CR_ACCESS)->vmx_cr_access(VMX_CONTROL_REG_ACCESS_TYPE_MOV_TO_CR)->hvm_mov_to_cr()->hvm_set_cr0()->hvm_monitor_crX()
> there doesn't seem to be taken into account the possibility of a
> concurrent monitor_init_domain()/monitor_cleanup_domain()
> 
> Am I missing something with these conclusions?

Your conclusions look correct, but I assume that the reason why this has
not been addressed in the past is that introspection applications are
expected to be well-behaved. Specifically, in a codebase where the
choice between uint64_t and long int matters speed-wise, and where
unlikely()s also matter, an extra lock may be an issue.

The typical flow of an introspection application is:

1. Initialize everything.
2. Subscribe to relevant events.
3. Event processing loop.
4. Unsubscribe from events.
5. Do a last-run of event processing (already queued in the ring buffer).
6. Uninitialize everything (no events are possible here because of steps
4-5).

> As a resolution for this, I've been thinking of adding a 'subsys_lock'
> field in the vm_event_domain structure, either spinlock or rw-lock,
> which would be initialised/uninitialised when d.vm_event is
> allocated/freed (domain_create()/complete_domain_destroy()).

I have nothing against this. Having as many assurances as possible that
things will work is definitely a plus in my book - with the comment that
I would prefer a rwlock to an ordinary spinlock, and that "subsys_lock"
sounds obscure to me, although I admit that I can't think of a good name
at the moment.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [XTF PATCH v2 4/4] xtf-runner: regularise runner exit code

2016-07-22 Thread Wei Liu
Updated this patch

---8<---
From ef1f9ddfa797bb4095d48b90efc3c92a0a8fd1b6 Mon Sep 17 00:00:00 2001
From: Wei Liu 
Date: Mon, 13 Jun 2016 15:06:48 +0100
Subject: [XTF PATCH] xtf-runner: regularise runner exit code
Cc: andrew.coop...@citrix.com

The script now returns the most severe result. Document the exit code in
help string.

Signed-off-by: Wei Liu 
---
 xtf-runner | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/xtf-runner b/xtf-runner
index 1c96750..7743316 100755
--- a/xtf-runner
+++ b/xtf-runner
@@ -251,14 +251,15 @@ def run_tests(args):
 if not len(tests):
 raise RunnerError("No tests to run")
 
-rc = 0
+rc = all_results.index('SUCCESS')
 results = []
 
 for test in tests:
 
 res = run_test(test)
-if res != "SUCCESS":
-rc = 1
+res_idx = all_results.index(res);
+if res_idx > rc:
+rc = res_idx
 
 results.append(res)
 
@@ -267,7 +268,7 @@ def run_tests(args):
 for test, res in zip(tests, results):
 print "%-40s %s" % (test, res)
 
-return rc
+return exit_code(all_results[rc])
 
 
 def main():
@@ -308,6 +309,13 @@ def main():
   "   List all 'functional' or 'special' tests\n"
   "./xtf-runner --list hvm64\n"
   "   List all 'hvm64' tests\n"
+  "\n"
+  "  Exit code for this script:\n"
+  "0:everything is ok\n"
+  "1,2:  reserved for python interpreter\n"
+  "3:test(s) are skipped\n"
+  "4:test(s) report error\n"
+  "5:test(s) report failure\n"
   ),
 )
 
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 15/22] xen/arm: Don't call p2m_alloc_table from arch_domain_create

2016-07-22 Thread Sergej Proskurin
Hi Julien,

On 07/22/2016 11:18 AM, Julien Grall wrote:
> 
> 
> On 22/07/16 09:32, Sergej Proskurin wrote:
>> Hi Julien,
> 
> Hello Sergej,
> 
>>> -int p2m_alloc_table(struct domain *d)
>>> +static int p2m_alloc_table(struct domain *d)
>>
>> While moving parts of the altp2m code out of ./xen/arch/arm/p2m.c, the
>> function p2m_alloc_table needs to be called from ./xen/arch/arm/altp2m.c
>> to allocate the individual altp2m views. Hence it should not be static.
> 
> No, this function should not be called outside p2m.c as it will not
> fully initialize the p2m. You need to need to provide a function to
> initialize a p2m (such as p2m_init).
> 

The last time we have discussed reusing existing code, among others, for
individual struct p2m_domain initialization routines. Also, we have
agreed to move altp2m-related parts out of p2m.c into altp2m.c, which
makes it hard not to access parts required for initialization/teardown
(that are equal for both p2m and altp2m).

I agree that functions that, e.g., do not entirely initialize a specific
data structure should not be accessed from elsewhere. But then, we
should not have moved altp2m-related information out of p2m.c as they
simply need the same functionality when it comes to initialization/teardown.

Best regards,
~Sergej

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [XTF PATCH v2 4/4] xtf-runner: regularise runner exit code

2016-07-22 Thread Andrew Cooper

On 22/07/16 10:58, Wei Liu wrote:


Updated this patch


Thanks.

Pylint warned of an unnecessary semicolon from "res_idx = 
all_results.index(res);" which I dropped.


Whole series Reviewed and committed.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Locking on vm-event operations (monitor)

2016-07-22 Thread Corneliu ZUZU

On 7/22/2016 12:55 PM, Razvan Cojocaru wrote:

On 07/22/2016 12:27 PM, Corneliu ZUZU wrote:

Hi,

I've been inspecting vm-event code parts to try and understand when and
why domain pausing/locking is done. If I understood correctly, domain
pausing is done solely to force all the vCPUs of that domain to see a
configuration update and act upon it (e.g. in the case of a
XEN_DOMCTL_MONITOR_EVENT_WRITE_CTRLREG that enables CR3 monitoring,
domain pausing/unpausing ensures immediate enabling of CR3 load-exiting
for all vCPUs), not to synchronize concurrent operations (lock-behavior).

As for locking, I see that for example vm_event_init_domain(),
vm_event_cleanup_domain() and monitor_domctl() are all protected by the
domctl-lock, but I don't think that's enough.
Here are a few code-paths that led me to believe that:
* do_hvm_op()->monitor_guest_request() reads d.monitor.guest_request_*
resources, but it doesn't seem to take the domctl lock, so it seems
possible for that to happen _while_ those resources are
initialized/cleaned-up
* monitor_guest_request() also calls
monitor_traps()->...->vm_event_wait_slot()->...->vm_event_grab_slot()
which attempts a vm_event_ring_lock(ved), which could also be called
_while_ that's initialized (vm_event_enable()) or cleaned-up
(vm_event_disable())
* hvm_monitor_cr() - e.g. on the code-path
vmx_vmexit_handler(EXIT_REASON_CR_ACCESS)->vmx_cr_access(VMX_CONTROL_REG_ACCESS_TYPE_MOV_TO_CR)->hvm_mov_to_cr()->hvm_set_cr0()->hvm_monitor_crX()
there doesn't seem to be taken into account the possibility of a
concurrent monitor_init_domain()/monitor_cleanup_domain()

Am I missing something with these conclusions?

Your conclusions look correct, but I assume that the reason why this has
not been addressed in the past is that introspection applications are
expected to be well-behaved.


I wouldn't think that was the rationale (considering that the risk is 
crashing the hypervisor as a whole; also see below), I'd rather think 
this simply went unnoticed.



Specifically, in a codebase where the
choice between uint64_t and long int matters speed-wise, and where
unlikely()s also matter, an extra lock may be an issue.

The typical flow of an introspection application is:

1. Initialize everything.
2. Subscribe to relevant events.
3. Event processing loop.
4. Unsubscribe from events.
5. Do a last-run of event processing (already queued in the ring buffer).
6. Uninitialize everything (no events are possible here because of steps
4-5).


And even if an introspection application behaves as sanely as you say, 
the current layout of the code still doesn't guarantee bug-free 
behavior. That's because for one, there are in-guest events (that 
trigger access of vm-events resources) that are completely asynchronous 
in relation to 2-6, for example a HVMOP_guest_request_vm_event 
(subsequently calling monitor_guest_request()).



As a resolution for this, I've been thinking of adding a 'subsys_lock'
field in the vm_event_domain structure, either spinlock or rw-lock,
which would be initialised/uninitialised when d.vm_event is
allocated/freed (domain_create()/complete_domain_destroy()).

I have nothing against this. Having as many assurances as possible that
things will work is definitely a plus in my book - with the comment that
I would prefer a rwlock to an ordinary spinlock


Yep, I'd prefer that too.


, and that "subsys_lock"
sounds obscure to me, although I admit that I can't think of a good name
at the moment.


I'm referring to monitor, share & paging as "subsystems" of the whole 
"vm-events" subsystem (in that sense the 3 aforementioned are 
"sub-subsystems").

To me it's acceptable, but I'm open to better names.


Thanks,
Razvan


Thanks,
Corneliu.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 20/22] xen/arm: Don't export flush_tlb_domain

2016-07-22 Thread Sergej Proskurin
Hi Julien,

On 07/22/2016 11:30 AM, Julien Grall wrote:
> 
> 
> On 22/07/16 09:54, Sergej Proskurin wrote:
>> Hi Julien,
> 
> Hello Sergej,
> 
>> On 07/20/2016 06:11 PM, Julien Grall wrote:
>>> The function flush_tlb_domain is not used outside of the file where it
>>> has been declared.
>>>
>>
>> As for patch #15, the same applies here too:
>> For altp2m, flush_tlb_domain/p2m_flush_tlb should be made available to
>> ./xen/arch/arm/altp2m.c.
> 
> Based on your previous version, I don't see any reason to flush call
> flush_tlb_domain/p2m_flush_tlb in altp2m.
> 
> Please justify why you would need it.
> 

The new version considers changes that are made to the hostp2m and
propagates them to all affected altp2m views by either changing
individual altp2m entries or even flushing (but not destroying) the
entire altp2m-tables. This idea has been borrowed from the x86 altp2m
implementation.

To prevent access to old/invalid GPAs, the current implementation
flushes the TLBs associated with the affected altp2m view after such
propagation.

Best regards,
~Sergej

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Kernel panic on Xen virtualisation in Debian

2016-07-22 Thread Ingo Jürgensmann

On 22.07.2016 11:03, Ingo Jürgensmann wrote:


In the meanwhile, I activated IPv6 again on Tuesday evening and today
the server crashed again some minutes ago. Here's the output from
netconsole:


... and the second subsequent crash:

Jul 22 11:15:04 31.172.31.251 [ 2774.896036] BUG: unable to handle 
kernel

Jul 22 11:15:04 31.172.31.251  at 88026ee6c000
Jul 22 11:15:04 31.172.31.251 [ 2774.896100] IP:
Jul 22 11:15:04 31.172.31.251  [] memcpy+0x6/0x110
Jul 22 11:15:04 31.172.31.251 [ 2774.896127] PGD 1814067
Jul 22 11:15:04 31.172.31.251
Jul 22 11:15:04 31.172.31.251 [ 2774.896164] Oops:  [#1]
Jul 22 11:15:04 31.172.31.251
Jul 22 11:15:04 31.172.31.251 [ 2774.896189] Modules linked in:
[omitting kernel modules list]
Jul 22 11:15:04 31.172.31.251 [ 2774.897396] CPU: 0 PID: 0 Comm: 
swapper/0 Not tainted 3.16.0-4-amd64 #1 Debian 3.16.7-ckt25-2+deb8u3
Jul 22 11:15:04 31.172.31.251 [ 2774.897453] Hardware name: Supermicro 
X9SRE/X9SRE-3F/X9SRi/X9SRi-3F/X9SRE/X9SRE-3F/X9SRi/X9SRi-3F, BIOS 3.0a 
01/03/2014
Jul 22 11:15:04 31.172.31.251 [ 2774.897514] task: 8181a460 ti: 
8180 task.ti: 8180
Jul 22 11:15:04 31.172.31.251 [ 2774.897567] RIP: 
e030:[]

Jul 22 11:15:04 31.172.31.251  [] memcpy+0x6/0x110
Jul 22 11:15:04 31.172.31.251 [ 2774.897625] RSP: e02b:880280e03b58  
EFLAGS: 00010286
Jul 22 11:15:04 31.172.31.251 [ 2774.897658] RAX: 880264550070 RBX: 
88026cbe2400 RCX: 02ee
Jul 22 11:15:04 31.172.31.251 [ 2774.897694] RDX: 04a0 RSI: 
88026ee6c000 RDI: 880264550222
Jul 22 11:15:04 31.172.31.251 [ 2774.897729] RBP: 880280e03c38 R08: 
06c0 R09: 880264550062
Jul 22 11:15:04 31.172.31.251 [ 2774.897765] R10: 0100 R11: 
8c9ab7de R12: 880250204200
Jul 22 11:15:04 31.172.31.251 [ 2774.897800] R13: 88026ee6be66 R14: 
88026f2a7040 R15: 04a8
Jul 22 11:15:04 31.172.31.251 [ 2774.897842] FS:  () 
GS:880280e0() knlGS:880280e0
Jul 22 11:15:04 31.172.31.251 [ 2774.897896] CS:  e033 DS:  ES:  
CR0: 80050033
Jul 22 11:15:04 31.172.31.251 [ 2774.897929] CR2: 88026ee6c000 CR3: 
0002645c3000 CR4: 00042660

Jul 22 11:15:04 31.172.31.251 [ 2774.897965] Stack:
Jul 22 11:15:04 31.172.31.251 [ 2774.897991]  814d319f
Jul 22 11:15:04 31.172.31.251  88026a031000
Jul 22 11:15:04 31.172.31.251  880280e03ba8
Jul 22 11:15:04 31.172.31.251  9401294600a7012a
Jul 22 11:15:04 31.172.31.251
Jul 22 11:15:04 31.172.31.251 [ 2774.898063]  0100
Jul 22 11:15:04 31.172.31.251  814a000a
Jul 22 11:15:04 31.172.31.251  6fda7e90
Jul 22 11:15:04 31.172.31.251  80fe
Jul 22 11:15:04 31.172.31.251
Jul 22 11:15:04 31.172.31.251 [ 2774.898133]  1ad902feff7ac40e
Jul 22 11:15:04 31.172.31.251  88026fa5c6c0
Jul 22 11:15:04 31.172.31.251  224afc3e1600
Jul 22 11:15:04 31.172.31.251  880250204200
Jul 22 11:15:04 31.172.31.251
Jul 22 11:15:04 31.172.31.251 [ 2774.898205] Call Trace:
Jul 22 11:15:04 31.172.31.251 [ 2774.898233]  
Jul 22 11:15:04 31.172.31.251
Jul 22 11:15:04 31.172.31.251 [ 2774.898242]
Jul 22 11:15:04 31.172.31.251  [] ? 
ndisc_send_redirect+0x3bf/0x410
Jul 22 11:15:04 31.172.31.251 [ 2774.898313]  [] ? 
reg_vif_xmit+0xca/0xf0
Jul 22 11:15:04 31.172.31.251 [ 2774.898351]  [] ? 
ip6_forward+0x71c/0x850
Jul 22 11:15:04 31.172.31.251 [ 2774.898395]  [] ? 
ip6_route_input+0xa4/0xd0
Jul 22 11:15:04 31.172.31.251 [ 2774.898432]  [] ? 
__netif_receive_skb_core+0x543/0x750
Jul 22 11:15:04 31.172.31.251 [ 2774.898468]  [] ? 
netif_receive_skb_internal+0x1f/0x80
Jul 22 11:15:04 31.172.31.251 [ 2774.898513]  [] ? 
br_handle_frame_finish+0x1c2/0x3c0 [bridge]
Jul 22 11:15:04 31.172.31.251 [ 2774.898571]  [] ? 
br_handle_frame+0x147/0x240 [bridge]
Jul 22 11:15:04 31.172.31.251 [ 2774.898706]  [] ? 
__netif_receive_skb_core+0x1c4/0x750
Jul 22 11:15:04 31.172.31.251 [ 2774.898747]  [] ? 
xen_clocksource_get_cycles+0x1c/0x20
Jul 22 11:15:04 31.172.31.251 [ 2774.898784]  [] ? 
netif_receive_skb_internal+0x1f/0x80
Jul 22 11:15:04 31.172.31.251 [ 2774.898822]  [] ? 
xenvif_tx_action+0x49f/0x920 [xen_netback]
Jul 22 11:15:04 31.172.31.251 [ 2774.898877]  [] ? 
xenvif_poll+0x28/0x70 [xen_netback]
Jul 22 11:15:04 31.172.31.251 [ 2774.898914]  [] ? 
net_rx_action+0x140/0x240
Jul 22 11:15:04 31.172.31.251 [ 2774.898950]  [] ? 
__do_softirq+0xf1/0x290
Jul 22 11:15:04 31.172.31.251 [ 2774.898985]  [] ? 
irq_exit+0x95/0xa0
Jul 22 11:15:04 31.172.31.251 [ 2774.899022]  [] ? 
xen_evtchn_do_upcall+0x35/0x50
Jul 22 11:15:04 31.172.31.251 [ 2774.899059]  [] ? 
xen_do_hypervisor_callback+0x1e/0x30

Jul 22 11:15:04 31.172.31.251 [ 2774.899093]  
Jul 22 11:15:04 31.172.31.251
Jul 22 11:15:04 31.172.31.251 [ 2774.899103]
Jul 22 11:15:04 31.172.31.251  [] ? 
xen_hypercall_sched_op+0xa/0x20
Jul 22 11:15:04 31.172.31.251 [ 2774.899168]  [] ? 
xen_hypercall_sched_op+0xa/0x20
Jul 22 11:15:04 31.172.31.251 [ 2774.

Re: [Xen-devel] [PATCH 15/22] xen/arm: Don't call p2m_alloc_table from arch_domain_create

2016-07-22 Thread Julien Grall



On 22/07/16 11:16, Sergej Proskurin wrote:

Hi Julien,


Hello,


On 07/22/2016 11:18 AM, Julien Grall wrote:



On 22/07/16 09:32, Sergej Proskurin wrote:

Hi Julien,


Hello Sergej,


-int p2m_alloc_table(struct domain *d)
+static int p2m_alloc_table(struct domain *d)


While moving parts of the altp2m code out of ./xen/arch/arm/p2m.c, the
function p2m_alloc_table needs to be called from ./xen/arch/arm/altp2m.c
to allocate the individual altp2m views. Hence it should not be static.


No, this function should not be called outside p2m.c as it will not
fully initialize the p2m. You need to need to provide a function to
initialize a p2m (such as p2m_init).



The last time we have discussed reusing existing code, among others, for
individual struct p2m_domain initialization routines. Also, we have
agreed to move altp2m-related parts out of p2m.c into altp2m.c, which
makes it hard not to access parts required for initialization/teardown
(that are equal for both p2m and altp2m).


I remember this discussion. However, the p2m initialize/teardown should 
exactly be the same for the hostp2m and altp2m (except for the type of 
the p2m). So, a function should be provided to initialize a full p2m to 
avoid code duplication.


Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Locking on vm-event operations (monitor)

2016-07-22 Thread Razvan Cojocaru
On 07/22/2016 01:17 PM, Corneliu ZUZU wrote:
> On 7/22/2016 12:55 PM, Razvan Cojocaru wrote:
>> On 07/22/2016 12:27 PM, Corneliu ZUZU wrote:
>>> Hi,
>>>
>>> I've been inspecting vm-event code parts to try and understand when and
>>> why domain pausing/locking is done. If I understood correctly, domain
>>> pausing is done solely to force all the vCPUs of that domain to see a
>>> configuration update and act upon it (e.g. in the case of a
>>> XEN_DOMCTL_MONITOR_EVENT_WRITE_CTRLREG that enables CR3 monitoring,
>>> domain pausing/unpausing ensures immediate enabling of CR3 load-exiting
>>> for all vCPUs), not to synchronize concurrent operations
>>> (lock-behavior).
>>>
>>> As for locking, I see that for example vm_event_init_domain(),
>>> vm_event_cleanup_domain() and monitor_domctl() are all protected by the
>>> domctl-lock, but I don't think that's enough.
>>> Here are a few code-paths that led me to believe that:
>>> * do_hvm_op()->monitor_guest_request() reads d.monitor.guest_request_*
>>> resources, but it doesn't seem to take the domctl lock, so it seems
>>> possible for that to happen _while_ those resources are
>>> initialized/cleaned-up
>>> * monitor_guest_request() also calls
>>> monitor_traps()->...->vm_event_wait_slot()->...->vm_event_grab_slot()
>>> which attempts a vm_event_ring_lock(ved), which could also be called
>>> _while_ that's initialized (vm_event_enable()) or cleaned-up
>>> (vm_event_disable())
>>> * hvm_monitor_cr() - e.g. on the code-path
>>> vmx_vmexit_handler(EXIT_REASON_CR_ACCESS)->vmx_cr_access(VMX_CONTROL_REG_ACCESS_TYPE_MOV_TO_CR)->hvm_mov_to_cr()->hvm_set_cr0()->hvm_monitor_crX()
>>>
>>> there doesn't seem to be taken into account the possibility of a
>>> concurrent monitor_init_domain()/monitor_cleanup_domain()
>>>
>>> Am I missing something with these conclusions?
>> Your conclusions look correct, but I assume that the reason why this has
>> not been addressed in the past is that introspection applications are
>> expected to be well-behaved.
> 
> I wouldn't think that was the rationale (considering that the risk is
> crashing the hypervisor as a whole; also see below), I'd rather think
> this simply went unnoticed.
> 
>> Specifically, in a codebase where the
>> choice between uint64_t and long int matters speed-wise, and where
>> unlikely()s also matter, an extra lock may be an issue.
>>
>> The typical flow of an introspection application is:
>>
>> 1. Initialize everything.
>> 2. Subscribe to relevant events.
>> 3. Event processing loop.
>> 4. Unsubscribe from events.
>> 5. Do a last-run of event processing (already queued in the ring buffer).
>> 6. Uninitialize everything (no events are possible here because of steps
>> 4-5).
> 
> And even if an introspection application behaves as sanely as you say,
> the current layout of the code still doesn't guarantee bug-free
> behavior. That's because for one, there are in-guest events (that
> trigger access of vm-events resources) that are completely asynchronous
> in relation to 2-6, for example a HVMOP_guest_request_vm_event
> (subsequently calling monitor_guest_request()).

Not true. As part of step 4 d->monitor.guest_request_enabled will become
0, so no sensitive code will run after that:

129 void monitor_guest_request(void)
130 {
131 struct vcpu *curr = current;
132 struct domain *d = curr->domain;
133
134 if ( d->monitor.guest_request_enabled )
135 {
136 vm_event_request_t req = {
137 .reason = VM_EVENT_REASON_GUEST_REQUEST,
138 .vcpu_id = curr->vcpu_id,
139 };
140
141 monitor_traps(curr, d->monitor.guest_request_sync, &req);
142 }
143 }

The guest _may_ try to request an event after that, but the request will
be ignored.

>> , and that "subsys_lock"
>> sounds obscure to me, although I admit that I can't think of a good name
>> at the moment.
> 
> I'm referring to monitor, share & paging as "subsystems" of the whole
> "vm-events" subsystem (in that sense the 3 aforementioned are
> "sub-subsystems").
> To me it's acceptable, but I'm open to better names.

If Tamas agrees with you I'll consider myself outvoted.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Locking on vm-event operations (monitor)

2016-07-22 Thread Corneliu ZUZU

On 7/22/2016 12:51 PM, Andrew Cooper wrote:

On 22/07/16 10:27, Corneliu ZUZU wrote:


Hi,

I've been inspecting vm-event code parts to try and understand when 
and why domain pausing/locking is done. If I understood correctly, 
domain pausing is done solely to force all the vCPUs of that domain 
to see a configuration update and act upon it (e.g. in the case of a 
XEN_DOMCTL_MONITOR_EVENT_WRITE_CTRLREG that enables CR3 monitoring, 
domain pausing/unpausing ensures immediate enabling of CR3 
load-exiting for all vCPUs), not to synchronize concurrent operations 
(lock-behavior).


Correct.  Without the vcpu/domain pause, a vcpu could be midway 
through a vmexit path with the monitor configuration changing under 
its feet.  OTOH, it could be running in guest context, and only 
receive the update one scheduling timeslice later.




As for locking, I see that for example vm_event_init_domain(), 
vm_event_cleanup_domain() and monitor_domctl() are all protected by 
the domctl-lock, but I don't think that's enough.

Here are a few code-paths that led me to believe that:
* do_hvm_op()->monitor_guest_request() reads 
d.monitor.guest_request_* resources, but it doesn't seem to take the 
domctl lock, so it seems possible for that to happen _while_ those 
resources are initialized/cleaned-up
* monitor_guest_request() also calls 
monitor_traps()->...->vm_event_wait_slot()->...->vm_event_grab_slot() 
which attempts a vm_event_ring_lock(ved), which could also be called 
_while_ that's initialized (vm_event_enable()) or cleaned-up 
(vm_event_disable())
* hvm_monitor_cr() - e.g. on the code-path 
vmx_vmexit_handler(EXIT_REASON_CR_ACCESS)->vmx_cr_access(VMX_CONTROL_REG_ACCESS_TYPE_MOV_TO_CR)->hvm_mov_to_cr()->hvm_set_cr0()->hvm_monitor_crX() 
there doesn't seem to be taken into account the possibility of a 
concurrent monitor_init_domain()/monitor_cleanup_domain()


Am I missing something with these conclusions?
As a resolution for this, I've been thinking of adding a 
'subsys_lock' field in the vm_event_domain structure, either spinlock 
or rw-lock, which would be initialised/uninitialised when d.vm_event 
is allocated/freed (domain_create()/complete_domain_destroy()).


I don't think you are missing anything.  It seems like a monitor lock 
is needed.


FWIW, the domctl lock is only used at the moment because it was easy, 
and worked when everything was a domctl.  Being a global spinlock, it 
is a severe bottlekneck for concurrent domain management, which I need 
to find some time to break apart, so the less reliance on it the 
better from my point of view.


~Andrew



I was thinking of not touching the way the domctl lock is acquired in 
these cases at all, but rather leave that as it is and introduce the 
rw-lock separately. Are you suggesting I should also change do_domctl to 
skip domctl_lock_acquire()-ing if op->cmd is XEN_DOMCTL_vm_event_op ? 
That would certainly be a first step towards amending that bottleneck 
you mentioned, but it would break consistency of locking-behavior for 
domctls. I think it would also add significant complexity to what I'll 
be doing so I hope it wouldn't be a problem if we leave that for a 
future patch.


Corneliu.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 20/22] xen/arm: Don't export flush_tlb_domain

2016-07-22 Thread Julien Grall



On 22/07/16 11:25, Sergej Proskurin wrote:

Hi Julien,

On 07/22/2016 11:30 AM, Julien Grall wrote:



On 22/07/16 09:54, Sergej Proskurin wrote:

Hi Julien,


Hello Sergej,


On 07/20/2016 06:11 PM, Julien Grall wrote:

The function flush_tlb_domain is not used outside of the file where it
has been declared.



As for patch #15, the same applies here too:
For altp2m, flush_tlb_domain/p2m_flush_tlb should be made available to
./xen/arch/arm/altp2m.c.


Based on your previous version, I don't see any reason to flush call
flush_tlb_domain/p2m_flush_tlb in altp2m.

Please justify why you would need it.



The new version considers changes that are made to the hostp2m and
propagates them to all affected altp2m views by either changing
individual altp2m entries or even flushing (but not destroying) the
entire altp2m-tables. This idea has been borrowed from the x86 altp2m
implementation.

To prevent access to old/invalid GPAs, the current implementation
flushes the TLBs associated with the affected altp2m view after such
propagation.


There is already a flush in apply_p2m_changes and removing all the 
mapping in a p2m could be implemented in p2m.c. So I still don't see why 
you need the flush outside.


I looked at the x86 version of the propagation and I was not able to 
spot any explicit flush. Maybe you can provide some code to show what 
you mean.


Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 15/22] xen/arm: Don't call p2m_alloc_table from arch_domain_create

2016-07-22 Thread Sergej Proskurin


On 07/22/2016 12:26 PM, Julien Grall wrote:
> 
> 
> On 22/07/16 11:16, Sergej Proskurin wrote:
>> Hi Julien,
> 
> Hello,
> 
>> On 07/22/2016 11:18 AM, Julien Grall wrote:
>>>
>>>
>>> On 22/07/16 09:32, Sergej Proskurin wrote:
 Hi Julien,
>>>
>>> Hello Sergej,
>>>
> -int p2m_alloc_table(struct domain *d)
> +static int p2m_alloc_table(struct domain *d)

 While moving parts of the altp2m code out of ./xen/arch/arm/p2m.c, the
 function p2m_alloc_table needs to be called from
 ./xen/arch/arm/altp2m.c
 to allocate the individual altp2m views. Hence it should not be static.
>>>
>>> No, this function should not be called outside p2m.c as it will not
>>> fully initialize the p2m. You need to need to provide a function to
>>> initialize a p2m (such as p2m_init).
>>>
>>
>> The last time we have discussed reusing existing code, among others, for
>> individual struct p2m_domain initialization routines. Also, we have
>> agreed to move altp2m-related parts out of p2m.c into altp2m.c, which
>> makes it hard not to access parts required for initialization/teardown
>> (that are equal for both p2m and altp2m).
> 
> I remember this discussion. However, the p2m initialize/teardown should
> exactly be the same for the hostp2m and altp2m (except for the type of
> the p2m). So, a function should be provided to initialize a full p2m to
> avoid code duplication.
> 

This is exactly what has been done. Nevertheless, altp2m views are
somewhat more dynamic than hostp2m and hence require frequent
initialization/teardown of individual views. That is, by moving altp2m
parts out of p2m.c we simply need to access this shared code multiple
times at runtime from different altp2m-related functions. This applies
to more functions that just to p2m_alloc_table.

Best regards,
~Sergej

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/1] ratelimit: Implement rate limit for credit2 scheduler Rate limit assures that a vcpu will execute for a minimum amount of time before being put at the back of a queue or

2016-07-22 Thread Dario Faggioli
On Mon, 2016-07-18 at 13:22 +0100, Anshul Makkar wrote:
>
Hey, Anshul.

Thanks, and sorry for the delay in reviewing.

This version is an improvement, but it looks to me that you've missed a
few of the review comments to v1.

> It introduces a minimum amount of latency 
>
"introduces context-switch rate-limiting"

> to enable a VM to batch its work and
> it also ensures that system is not spending most of its time in
> VMEXIT/VMENTRY because of VM that is waking/sleeping at high rate.
> 
> ratelimit can be disabled by setting it to 0.
> 

> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index 8b95a47..68bcdb8 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> 
> @@ -1601,6 +1602,34 @@ csched2_dom_cntl(
>  return rc;
>  }
>  
> +static int csched2_sys_cntl(const struct scheduler *ops,
> +struct xen_sysctl_scheduler_op *sc)
> +{
> +int rc = -EINVAL;
> +xen_sysctl_credit_schedule_t *params = &sc->u.sched_credit;
> +struct csched2_private *prv = CSCHED2_PRIV(ops);
> +unsigned long flags;
> +
> +switch (sc->cmd )
> +{
> +case XEN_SYSCTL_SCHEDOP_putinfo:
> +if ( params->ratelimit_us &&
> +( params->ratelimit_us < CSCHED2_MIN_TIMER ||
> +  params->ratelimit_us >
> MICROSECS(CSCHED2_MAX_TIMER) ))
>
CSCHED2_MIN_TIMER and CSCHED2_MAX_TIMER are defined as follows:

#define CSCHED2_MIN_TIMERMICROSECS(500)
#define CSCHED2_MAX_TIMERMILLISECS(2)

Which basically means they're value is 500*1000=50 and
2*100=200.

ratelimit_us is specified assuming it will be in microseconds. So,
basically, you're forcing the value stored in ratelimit_us to be at
least 50, which means 50 microseconds, which means 500
milliseconds, which is not what we want!

I remember saying already (although, it may have be in pvt, not on this
list) that I think we should just use XEN_SYSCTL_SCHED_RATELIMIT_MAX
and XEN_SYSCTL_SCHED_RATELIMIT_MIN here.

CSCHED2_MIN_TIMER and CSCHED2_MAX_TIMER are internal implementation
details, and I don't like them exposed (although, indirectly) to the
user.

> +return rc;
> +spin_lock_irqsave(&prv->lock, flags);
> +prv->ratelimit_us = params->ratelimit_us;
> +spin_unlock_irqrestore(&prv->lock, flags);
> +break;
> +
This is ok. However, the code base changed in the meanwhile (sorry! :-
P), and this spin_lock_irqsave() needs to become a
write_lock_irqsave().

> @@ -1688,9 +1719,20 @@ csched2_runtime(const struct scheduler *ops,
> int cpu, struct csched2_vcpu *snext
>   * 1) Run until snext's credit will be 0
>   * 2) But if someone is waiting, run until snext's credit is
> equal
>   * to his
> - * 3) But never run longer than MAX_TIMER or shorter than
> MIN_TIMER.
> + * 3) But never run longer than MAX_TIMER or shorter than
> MIN_TIMER or
> + * run for ratelimit time.
>   */
> 
I prefer George's version of this comment change:

"3) But never run longer than MAX_TIMER or shorter than MIN_TIMER
or the ratelimit time."

>  
> +/* Calculate mintime */
> +min_time = CSCHED2_MIN_TIMER;
> +if ( prv->ratelimit_us ) {
>
Coding style. (Parenthesis goes on next line.)

> +s_time_t ratelimit_min = prv->ratelimit_us;
> +ratelimit_min = snext->vcpu->runstate.state_entry_time +
> +MICROSECS(prv->ratelimit_us) - now;
>
Mmm... if you wanted to implement my suggestion from
<1468400021.13039.33.ca...@citrix.com>, you're definitely missing
something:

     s_time_t ratelimit_min = prv->ratelimit_us;
 if ( snext->vcpu->is_running )
 ratelimit_min = snext->vcpu->runstate.state_entry_time +
 MICROSECS(prv->ratelimit_us) - now;

In fact, you're initializing ratelimit_min and then immediately
overriding that... I'm surprised the compiler didn't complain.

> +if ( ratelimit_min > min_time )
> +min_time = ratelimit_min;
> +}
> +

> @@ -1707,32 +1749,33 @@ csched2_runtime(const struct scheduler *ops,
> int cpu, struct csched2_vcpu *snext
>  }
>  }
>  
> -/* The next guy may actually have a higher credit, if we've
> tried to
> - * avoid migrating him from a different cpu.  DTRT.  */
> -if ( rt_credit <= 0 )
> +/*
> + * The next guy ont the runqueue may actually have a higher
> credit,
> + * if we've tried to avoid migrating him from a different cpu.
> + * Setting time=0 will ensure the minimum timeslice is chosen.
>
George's draft patch had an empty line within this comment in here,
separating the two paragraph. Can we keep it? (I know, this is a very
minor thing, but since we're here :-D)

> + * FIXME: See if we can eliminate this conversion if we know
> time
> + * will be outside (MIN,MAX).  Probably requires pre-calculating
> + * credit values of MIN,MAX per vcpu, since each vcpu burns
> credit
> + * at a diffe

Re: [Xen-devel] [PATCH 15/22] xen/arm: Don't call p2m_alloc_table from arch_domain_create

2016-07-22 Thread Julien Grall



On 22/07/16 11:39, Sergej Proskurin wrote:



On 07/22/2016 12:26 PM, Julien Grall wrote:



On 22/07/16 11:16, Sergej Proskurin wrote:

Hi Julien,


Hello,


On 07/22/2016 11:18 AM, Julien Grall wrote:



On 22/07/16 09:32, Sergej Proskurin wrote:

Hi Julien,


Hello Sergej,


-int p2m_alloc_table(struct domain *d)
+static int p2m_alloc_table(struct domain *d)


While moving parts of the altp2m code out of ./xen/arch/arm/p2m.c, the
function p2m_alloc_table needs to be called from
./xen/arch/arm/altp2m.c
to allocate the individual altp2m views. Hence it should not be static.


No, this function should not be called outside p2m.c as it will not
fully initialize the p2m. You need to need to provide a function to
initialize a p2m (such as p2m_init).



The last time we have discussed reusing existing code, among others, for
individual struct p2m_domain initialization routines. Also, we have
agreed to move altp2m-related parts out of p2m.c into altp2m.c, which
makes it hard not to access parts required for initialization/teardown
(that are equal for both p2m and altp2m).


I remember this discussion. However, the p2m initialize/teardown should
exactly be the same for the hostp2m and altp2m (except for the type of
the p2m). So, a function should be provided to initialize a full p2m to
avoid code duplication.



This is exactly what has been done. Nevertheless, altp2m views are
somewhat more dynamic than hostp2m and hence require frequent
initialization/teardown of individual views. That is, by moving altp2m
parts out of p2m.c we simply need to access this shared code multiple
times at runtime from different altp2m-related functions. This applies
to more functions that just to p2m_alloc_table.


I am not convinced that you need to reallocate the root page every time 
rather than clearing them. Anyway, I will need to see the code to 
understand what is done.


Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] OVMF very slow on AMD

2016-07-22 Thread Dario Faggioli
On Mon, 2016-07-18 at 16:09 +0100, Anthony PERARD wrote:
> 
> $ dwdiff procinfo_guest_ovmf_kvm procinfo_guest_ovmf_xen
> processor : 0
> vendor_id : AuthenticAMD
> cpu family: [-6-] {+21+}
> model : [-6-] {+1+}
> model name: [-QEMU Virtual CPU version 2.5+-] {+AMD
> Opteron(tm) Processor 4284+}
> stepping  : [-3-] {+2+}
> microcode : [-0x165-] {+0x600063d+}
>
Is it ok / expected for this (microcode, I mean) to be different?

BTW, I'm super ignorant about all it's being discussed here, so this
can well be completely off!

Regards,
Dario
--
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 20/22] xen/arm: Don't export flush_tlb_domain

2016-07-22 Thread Sergej Proskurin


On 07/22/2016 12:34 PM, Julien Grall wrote:
> 
> 
> On 22/07/16 11:25, Sergej Proskurin wrote:
>> Hi Julien,
>>
>> On 07/22/2016 11:30 AM, Julien Grall wrote:
>>>
>>>
>>> On 22/07/16 09:54, Sergej Proskurin wrote:
 Hi Julien,
>>>
>>> Hello Sergej,
>>>
 On 07/20/2016 06:11 PM, Julien Grall wrote:
> The function flush_tlb_domain is not used outside of the file where it
> has been declared.
>

 As for patch #15, the same applies here too:
 For altp2m, flush_tlb_domain/p2m_flush_tlb should be made available to
 ./xen/arch/arm/altp2m.c.
>>>
>>> Based on your previous version, I don't see any reason to flush call
>>> flush_tlb_domain/p2m_flush_tlb in altp2m.
>>>
>>> Please justify why you would need it.
>>>
>>
>> The new version considers changes that are made to the hostp2m and
>> propagates them to all affected altp2m views by either changing
>> individual altp2m entries or even flushing (but not destroying) the
>> entire altp2m-tables. This idea has been borrowed from the x86 altp2m
>> implementation.
>>
>> To prevent access to old/invalid GPAs, the current implementation
>> flushes the TLBs associated with the affected altp2m view after such
>> propagation.
> 
> There is already a flush in apply_p2m_changes and removing all the
> mapping in a p2m could be implemented in p2m.c. So I still don't see why
> you need the flush outside.
> 

Yes, the flush you are referring to flushes the hostp2m - not the
individual altp2m views.

> I looked at the x86 version of the propagation and I was not able to
> spot any explicit flush. Maybe you can provide some code to show what
> you mean.
>  

Sure thing:

...

static void p2m_reset_altp2m(struct p2m_domain *p2m)
{
p2m_flush_table(p2m);
/* Uninit and reinit ept to force TLB shootdown */
ept_p2m_uninit(p2m);
ept_p2m_init(p2m);
p2m->min_remapped_gfn = INVALID_GFN;
p2m->max_remapped_gfn = 0;
}

...

On x86, the uninit- and re-initialization of the EPTs force the TLBs
associated with the configured VMID of the EPTs to flush.

Regards,
~Sergej


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [xen-unstable test] 97737: regressions - FAIL

2016-07-22 Thread Wei Liu
On Fri, Jul 22, 2016 at 03:27:30AM +, osstest service owner wrote:
> flight 97737 xen-unstable real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/97737/
> 
> Regressions :-(
> 
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  test-amd64-amd64-xl-credit2  19 guest-start/debian.repeat fail REGR. vs. 
> 97664
>  test-armhf-armhf-xl  15 guest-start/debian.repeat fail REGR. vs. 
> 97664
>  test-armhf-armhf-xl-credit2  15 guest-start/debian.repeat fail REGR. vs. 
> 97664

From
http://logs.test-lab.xenproject.org/osstest/logs/97737/test-amd64-amd64-xl-credit2/serial-fiano0.log

Jul 21 07:38:22.383917 (XEN) Assertion 'rqd->avgload >= 0 && rqd->b_avgload >= 
0' failed at sched_credit2.c:734
Jul 21 07:38:23.27 (XEN) [ Xen-4.8-unstable  x86_64  debug=y  Not 
tainted ]
Jul 21 07:38:23.207896 (XEN) CPU:2
Jul 21 07:38:23.207943 (XEN) RIP:e008:[] 
sched_credit2.c#__update_runq_load+0xde/0x11e
Jul 21 07:38:23.215908 (XEN) RFLAGS: 00010082   CONTEXT: hypervisor
Jul 21 07:38:23.215970 (XEN) rax: 502c   rbx: 502c   
rcx: a1df
Jul 21 07:38:23.223906 (XEN) rdx: 0001   rsi: 83027d800460   
rdi: 38f6e47e
Jul 21 07:38:23.231882 (XEN) rbp: 830277eb7c10   rsp: 830277eb7be0   
r8:  fd71
Jul 21 07:38:23.239900 (XEN) r9:  0012   r10: 0014   
r11: 022d
Jul 21 07:38:23.247906 (XEN) r12: 830277d8f900   r13: 0001   
r14: 00e3db91fb86
Jul 21 07:38:23.255909 (XEN) r15: 83027d800460   cr0: 8005003b   
cr4: 001526e0
Jul 21 07:38:23.255974 (XEN) cr3: 7dcfd000   cr2: 7f81409f1b20
Jul 21 07:38:23.263910 (XEN) ds: 002b   es: 002b   fs:    gs:    ss: 
e010   cs: e008
Jul 21 07:38:23.272145 (XEN) Xen code around  
(sched_credit2.c#__update_runq_load+0xde/0x11e):
Jul 21 07:38:23.279948 (XEN)  00 00 00 48 85 c9 79 02 <0f> 0b 83 3d 44 2e 1c 00 
00 74 2e 8b 36 40 88 75
Jul 21 07:38:23.287896 (XEN) Xen stack trace from rsp=830277eb7be0:
Jul 21 07:38:23.287960 (XEN)0297 830277eb7c00 
82d080130969 0026785f
Jul 21 07:38:23.295904 (XEN)830277eb7c60 82d0802e96c0 
830277eb7c50 82d080126b19
Jul 21 07:38:23.303909 (XEN)0297 830277d8f900 
00e3db91fb86 
Jul 21 07:38:23.311948 (XEN)82d0802e96c0 83027d800464 
830277eb7c80 82d08012956e
Jul 21 07:38:23.319895 (XEN)83007dafc000 00e3db91f9b6 
82d080344140 83027d800464
Jul 21 07:38:23.319969 (XEN)830277eb7ce0 82d08012de99 
0297 0046
Jul 21 07:38:23.327929 (XEN)82d080130969 00266c7b 
830277eb7d10 83007dafc000
Jul 21 07:38:23.335903 (XEN) 830277ebd000 
83027d8450b8 82c00021001c
Jul 21 07:38:23.343911 (XEN)830277eb7cf0 82d08012e393 
830277eb7d10 82d08016b7d4
Jul 21 07:38:23.351904 (XEN)0007 002e 
830277eb7d20 82d08016b844
Jul 21 07:38:23.359911 (XEN)830277eb7d90 82d08010ab2f 
82c0002100b8 072e
Jul 21 07:38:23.367896 (XEN)83027d845010 0086 
83007dafc000 0007
Jul 21 07:38:23.367966 (XEN)830277eb7d78 0003 
830277ebd000 83007dafc19c
Jul 21 07:38:23.375919 (XEN)0286 83007dafc000 
830277eb7dd0 82d08010847b
Jul 21 07:38:23.383904 (XEN) 83007dc9d000 
 8302630b3aa8
Jul 21 07:38:23.391904 (XEN)8302630b3000 830277da4a28 
830277eb7e00 82d080105c7a
Jul 21 07:38:23.399910 (XEN)830277eba040  
 830277eb7fff
Jul 21 07:38:23.407908 (XEN)830277eb7e30 82d080122a08 
82d08031aa80 82d08031ab80
Jul 21 07:38:23.407977 (XEN)82d08031aa80 fffd 
830277eb7e60 82d080130116
Jul 21 07:38:23.415913 (XEN)0002 830277da4970 
0004 
Jul 21 07:38:23.423908 (XEN) Xen call trace:
Jul 21 07:38:23.423964 (XEN)[] 
sched_credit2.c#__update_runq_load+0xde/0x11e
Jul 21 07:38:23.431916 (XEN)[] 
sched_credit2.c#update_load+0x53/0x78
Jul 21 07:38:23.439900 (XEN)[] 
sched_credit2.c#csched2_vcpu_wake+0xf7/0x119
Jul 21 07:38:23.447913 (XEN)[] vcpu_wake+0x23f/0x3cb
Jul 21 07:38:23.447979 (XEN)[] vcpu_unblock+0x4b/0x4e
Jul 21 07:38:23.455908 (XEN)[] vcpu_kick+0x17/0x5b
Jul 21 07:38:23.463889 (XEN)[] 
vcpu_mark_events_pending+0x2c/0x2f
Jul 21 07:38:23.463962 (XEN)[] 
event_fifo.c#evtchn_fifo_set_pending+0x36a/0x3de
Jul 21 07:38:23.471914 (XEN)[] send_global_virq+0xe7/0x117
Jul 21 07:38:23.479916 (XEN)[] 
domain.c#complete_domain_destroy+0x179/0x182
Jul 21 07:38:23.487875 (XEN)[] 
rcupdate.c#rcu_process_callbacks+0x141/0x1a2
Jul 21 07:38:23.487915 (XEN)[] 
softirq.c#__d

Re: [Xen-devel] [PATCH 20/22] xen/arm: Don't export flush_tlb_domain

2016-07-22 Thread Julien Grall



On 22/07/16 11:46, Sergej Proskurin wrote:



On 07/22/2016 12:34 PM, Julien Grall wrote:



On 22/07/16 11:25, Sergej Proskurin wrote:

Hi Julien,

On 07/22/2016 11:30 AM, Julien Grall wrote:



On 22/07/16 09:54, Sergej Proskurin wrote:

Hi Julien,


Hello Sergej,


On 07/20/2016 06:11 PM, Julien Grall wrote:

The function flush_tlb_domain is not used outside of the file where it
has been declared.



As for patch #15, the same applies here too:
For altp2m, flush_tlb_domain/p2m_flush_tlb should be made available to
./xen/arch/arm/altp2m.c.


Based on your previous version, I don't see any reason to flush call
flush_tlb_domain/p2m_flush_tlb in altp2m.

Please justify why you would need it.



The new version considers changes that are made to the hostp2m and
propagates them to all affected altp2m views by either changing
individual altp2m entries or even flushing (but not destroying) the
entire altp2m-tables. This idea has been borrowed from the x86 altp2m
implementation.

To prevent access to old/invalid GPAs, the current implementation
flushes the TLBs associated with the affected altp2m view after such
propagation.


There is already a flush in apply_p2m_changes and removing all the
mapping in a p2m could be implemented in p2m.c. So I still don't see why
you need the flush outside.



Yes, the flush you are referring to flushes the hostp2m - not the
individual altp2m views.


apply_p2m_changes is *not* hostp2m specific. It should work on any p2m 
regardless the type.


The ARM P2M interface is not set in stone, so if it does not fit it will 
need to be changed. We should avoid to hack the code in order to add a 
new feature.


It might be time to mention that I am reworking the whole p2m code it 
does not respect the ARM spec (such as break-before-make semantics) and 
I believe it does not fit the altp2m model. It is very difficult to 
implement the former with the current implementation and without have a 
big performance impact.


Rather than having a function which implement all the operations, I am 
planning to have a simple set of functions that can be used to 
re-implement the operations:

- p2m_set_entry: Set an entry in the P2M
- p2m_get_entry: Retrieve the informations of an entry

This is very similar to x86 and make more straight forward to implement 
new operations and co-op with the ARM spec.


I have already a prototype and I am hoping to send it soon.




I looked at the x86 version of the propagation and I was not able to
spot any explicit flush. Maybe you can provide some code to show what
you mean.



Sure thing:

...

static void p2m_reset_altp2m(struct p2m_domain *p2m)
{
p2m_flush_table(p2m);
/* Uninit and reinit ept to force TLB shootdown */
ept_p2m_uninit(p2m);
ept_p2m_init(p2m);
p2m->min_remapped_gfn = INVALID_GFN;
p2m->max_remapped_gfn = 0;
}

...

On x86, the uninit- and re-initialization of the EPTs force the TLBs
associated with the configured VMID of the EPTs to flush.


As mentioned in my previous mail, p2m_reset can be implemented in p2m.c 
as this is not altp2m.c specific.


When I asked to move altp2m specific code from p2m.c to altp2m.c it was 
for avoiding to have p2m.c too big. However, if the function is not 
altp2m specific, there is little reason to move outside.


Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 15/22] xen/arm: Don't call p2m_alloc_table from arch_domain_create

2016-07-22 Thread Sergej Proskurin


On 07/22/2016 12:38 PM, Julien Grall wrote:
> 
> 
> On 22/07/16 11:39, Sergej Proskurin wrote:
>>
>>
>> On 07/22/2016 12:26 PM, Julien Grall wrote:
>>>
>>>
>>> On 22/07/16 11:16, Sergej Proskurin wrote:
 Hi Julien,
>>>
>>> Hello,
>>>
 On 07/22/2016 11:18 AM, Julien Grall wrote:
>
>
> On 22/07/16 09:32, Sergej Proskurin wrote:
>> Hi Julien,
>
> Hello Sergej,
>
>>> -int p2m_alloc_table(struct domain *d)
>>> +static int p2m_alloc_table(struct domain *d)
>>
>> While moving parts of the altp2m code out of ./xen/arch/arm/p2m.c,
>> the
>> function p2m_alloc_table needs to be called from
>> ./xen/arch/arm/altp2m.c
>> to allocate the individual altp2m views. Hence it should not be
>> static.
>
> No, this function should not be called outside p2m.c as it will not
> fully initialize the p2m. You need to need to provide a function to
> initialize a p2m (such as p2m_init).
>

 The last time we have discussed reusing existing code, among others,
 for
 individual struct p2m_domain initialization routines. Also, we have
 agreed to move altp2m-related parts out of p2m.c into altp2m.c, which
 makes it hard not to access parts required for initialization/teardown
 (that are equal for both p2m and altp2m).
>>>
>>> I remember this discussion. However, the p2m initialize/teardown should
>>> exactly be the same for the hostp2m and altp2m (except for the type of
>>> the p2m). So, a function should be provided to initialize a full p2m to
>>> avoid code duplication.
>>>
>>
>> This is exactly what has been done. Nevertheless, altp2m views are
>> somewhat more dynamic than hostp2m and hence require frequent
>> initialization/teardown of individual views. That is, by moving altp2m
>> parts out of p2m.c we simply need to access this shared code multiple
>> times at runtime from different altp2m-related functions. This applies
>> to more functions that just to p2m_alloc_table.
> 
> I am not convinced that you need to reallocate the root page every time
> rather than clearing them. Anyway, I will need to see the code to
> understand what is done.
> 
> Regards,
> 

In the following, you will find an excerpt of both the p2m.c and
altp2m.c concerning the (alt)p2m initialization. Please note that
altp2m_init_helper is called from different initialization routines in
altp2m.c. Also, please consider that this code has not yet been ported
to your recent patches. Please let me know if you need more information.

---
./xen/arch/p2m.c:

...

int p2m_alloc_table(struct p2m_domain *p2m)
{
unsigned int i;
struct page_info *page;
struct vttbr *vttbr = &p2m->vttbr;

page = alloc_domheap_pages(NULL, P2M_ROOT_ORDER, 0);
if ( page == NULL )
return -ENOMEM;

/* Clear all first level pages */
for ( i = 0; i < P2M_ROOT_PAGES; i++ )
clear_and_clean_page(page + i);

p2m->root = page;

/* Initialize the VTTBR associated with the allocated p2m table. */
vttbr->vttbr = 0;
vttbr->vmid = p2m->vmid & 0xff;
vttbr->baddr = page_to_maddr(p2m->root);

return 0;
}

int p2m_init_one(struct domain *d, struct p2m_domain *p2m)
{
int rc = 0;

spin_lock_init(&p2m->lock);
INIT_PAGE_LIST_HEAD(&p2m->pages);

spin_lock(&p2m->lock);

p2m->vmid = INVALID_VMID;
rc = p2m_alloc_vmid(d, p2m);
if ( rc != 0 )
goto err;

p2m->domain = d;
p2m->access_required = false;
p2m->mem_access_enabled = false;
p2m->default_access = p2m_access_rwx;
p2m->root = NULL;

p2m->vttbr.vttbr = INVALID_VTTBR;

p2m->max_mapped_gfn = 0;
p2m->lowest_mapped_gfn = INVALID_GFN;
radix_tree_init(&p2m->mem_access_settings);

err:
spin_unlock(&p2m->lock);

return rc;
}

static int p2m_init_hostp2m(struct domain *d)
{
struct p2m_domain *p2m = p2m_get_hostp2m(d);

d->arch.vttbr = INVALID_VTTBR;
p2m->p2m_class = p2m_host;

return p2m_init_one(d, p2m);
}

int p2m_init(struct domain *d)
{
int rc;

rc = p2m_init_hostp2m(d);
if ( rc )
return rc;

return altp2m_init(d);
}

...

---
./xen/arch/altp2m.c:

...

static int altp2m_init_helper(struct domain *d, unsigned int idx)
{
int rc;
struct p2m_domain *p2m = d->arch.altp2m_p2m[idx];

if ( p2m == NULL )
{
/* Allocate a new altp2m view. */
p2m = xzalloc(struct p2m_domain);
if ( p2m == NULL)
{
rc = -ENOMEM;
goto err;
}
memset(p2m, 0, sizeof(struct p2m_domain));
}

/* Initialize the new altp2m view. */
rc = p2m_init_one(d, p2m);
if ( rc )
goto err;

/* Allocate a root table for the altp2m view. */
rc = p2m_alloc_table(p2m);
if ( rc )
goto err;

p2m->p2m_class = p2m_alternate;
p2m->access_required = 1;
_atomic_set(&p2m->active_vcpus, 0);

d->arch.altp2m_p2m[idx] = p2m;
d->arch.altp2m_vttbr[idx] = p2m->vttbr.vttbr;

/*
 * M

Re: [Xen-devel] Locking on vm-event operations (monitor)

2016-07-22 Thread Andrew Cooper



On 22/07/16 11:31, Corneliu ZUZU wrote:

On 7/22/2016 12:51 PM, Andrew Cooper wrote:

On 22/07/16 10:27, Corneliu ZUZU wrote:


Hi,

I've been inspecting vm-event code parts to try and understand when 
and why domain pausing/locking is done. If I understood correctly, 
domain pausing is done solely to force all the vCPUs of that domain 
to see a configuration update and act upon it (e.g. in the case of a 
XEN_DOMCTL_MONITOR_EVENT_WRITE_CTRLREG that enables CR3 monitoring, 
domain pausing/unpausing ensures immediate enabling of CR3 
load-exiting for all vCPUs), not to synchronize concurrent 
operations (lock-behavior).


Correct.  Without the vcpu/domain pause, a vcpu could be midway 
through a vmexit path with the monitor configuration changing under 
its feet.  OTOH, it could be running in guest context, and only 
receive the update one scheduling timeslice later.




As for locking, I see that for example vm_event_init_domain(), 
vm_event_cleanup_domain() and monitor_domctl() are all protected by 
the domctl-lock, but I don't think that's enough.

Here are a few code-paths that led me to believe that:
* do_hvm_op()->monitor_guest_request() reads 
d.monitor.guest_request_* resources, but it doesn't seem to take the 
domctl lock, so it seems possible for that to happen _while_ those 
resources are initialized/cleaned-up
* monitor_guest_request() also calls 
monitor_traps()->...->vm_event_wait_slot()->...->vm_event_grab_slot() 
which attempts a vm_event_ring_lock(ved), which could also be called 
_while_ that's initialized (vm_event_enable()) or cleaned-up 
(vm_event_disable())
* hvm_monitor_cr() - e.g. on the code-path 
vmx_vmexit_handler(EXIT_REASON_CR_ACCESS)->vmx_cr_access(VMX_CONTROL_REG_ACCESS_TYPE_MOV_TO_CR)->hvm_mov_to_cr()->hvm_set_cr0()->hvm_monitor_crX() 
there doesn't seem to be taken into account the possibility of a 
concurrent monitor_init_domain()/monitor_cleanup_domain()


Am I missing something with these conclusions?
As a resolution for this, I've been thinking of adding a 
'subsys_lock' field in the vm_event_domain structure, either 
spinlock or rw-lock, which would be initialised/uninitialised when 
d.vm_event is allocated/freed 
(domain_create()/complete_domain_destroy()).


I don't think you are missing anything.  It seems like a monitor lock 
is needed.


FWIW, the domctl lock is only used at the moment because it was easy, 
and worked when everything was a domctl.  Being a global spinlock, it 
is a severe bottlekneck for concurrent domain management, which I 
need to find some time to break apart, so the less reliance on it the 
better from my point of view.


~Andrew



I was thinking of not touching the way the domctl lock is acquired in 
these cases at all, but rather leave that as it is and introduce the 
rw-lock separately. Are you suggesting I should also change do_domctl 
to skip domctl_lock_acquire()-ing if op->cmd is XEN_DOMCTL_vm_event_op 
? That would certainly be a first step towards amending that 
bottleneck you mentioned, but it would break consistency of 
locking-behavior for domctls. I think it would also add significant 
complexity to what I'll be doing so I hope it wouldn't be a problem if 
we leave that for a future patch.


Sorry - I didn't intend to suggest altering the domctl lock.  Just that 
adding in a new monitor lock and using it everywhere where appropriate 
would be a good step.


Breaking the domctl lock is a very large can of worms, which you 
definitely won't want to open while attempting to do something else.


~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 20/22] xen/arm: Don't export flush_tlb_domain

2016-07-22 Thread Sergej Proskurin


On 07/22/2016 12:57 PM, Julien Grall wrote:
> 
> 
> On 22/07/16 11:46, Sergej Proskurin wrote:
>>
>>
>> On 07/22/2016 12:34 PM, Julien Grall wrote:
>>>
>>>
>>> On 22/07/16 11:25, Sergej Proskurin wrote:
 Hi Julien,

 On 07/22/2016 11:30 AM, Julien Grall wrote:
>
>
> On 22/07/16 09:54, Sergej Proskurin wrote:
>> Hi Julien,
>
> Hello Sergej,
>
>> On 07/20/2016 06:11 PM, Julien Grall wrote:
>>> The function flush_tlb_domain is not used outside of the file
>>> where it
>>> has been declared.
>>>
>>
>> As for patch #15, the same applies here too:
>> For altp2m, flush_tlb_domain/p2m_flush_tlb should be made
>> available to
>> ./xen/arch/arm/altp2m.c.
>
> Based on your previous version, I don't see any reason to flush call
> flush_tlb_domain/p2m_flush_tlb in altp2m.
>
> Please justify why you would need it.
>

 The new version considers changes that are made to the hostp2m and
 propagates them to all affected altp2m views by either changing
 individual altp2m entries or even flushing (but not destroying) the
 entire altp2m-tables. This idea has been borrowed from the x86 altp2m
 implementation.

 To prevent access to old/invalid GPAs, the current implementation
 flushes the TLBs associated with the affected altp2m view after such
 propagation.
>>>
>>> There is already a flush in apply_p2m_changes and removing all the
>>> mapping in a p2m could be implemented in p2m.c. So I still don't see why
>>> you need the flush outside.
>>>
>>
>> Yes, the flush you are referring to flushes the hostp2m - not the
>> individual altp2m views.
> 
> apply_p2m_changes is *not* hostp2m specific. It should work on any p2m
> regardless the type.
> 

This true. However, p2m_propagate_change is invoked (from
apply_p2m_changes) only if the p2m that was currently modified is the
hostp2m.

> The ARM P2M interface is not set in stone, so if it does not fit it will
> need to be changed. We should avoid to hack the code in order to add a
> new feature.
> 
> It might be time to mention that I am reworking the whole p2m code it
> does not respect the ARM spec (such as break-before-make semantics) and
> I believe it does not fit the altp2m model. It is very difficult to
> implement the former with the current implementation and without have a
> big performance impact.
> 
> Rather than having a function which implement all the operations, I am
> planning to have a simple set of functions that can be used to
> re-implement the operations:
> - p2m_set_entry: Set an entry in the P2M
> - p2m_get_entry: Retrieve the informations of an entry
> 
> This is very similar to x86 and make more straight forward to implement
> new operations and co-op with the ARM spec.
> 
> I have already a prototype and I am hoping to send it soon.
> 
>>
>>> I looked at the x86 version of the propagation and I was not able to
>>> spot any explicit flush. Maybe you can provide some code to show what
>>> you mean.
>>>
>>
>> Sure thing:
>>
>> ...
>>
>> static void p2m_reset_altp2m(struct p2m_domain *p2m)
>> {
>> p2m_flush_table(p2m);
>> /* Uninit and reinit ept to force TLB shootdown */
>> ept_p2m_uninit(p2m);
>> ept_p2m_init(p2m);
>> p2m->min_remapped_gfn = INVALID_GFN;
>> p2m->max_remapped_gfn = 0;
>> }
>>
>> ...
>>
>> On x86, the uninit- and re-initialization of the EPTs force the TLBs
>> associated with the configured VMID of the EPTs to flush.
> 
> As mentioned in my previous mail, p2m_reset can be implemented in p2m.c
> as this is not altp2m.c specific.
> 

Well yes. However, it is not used for the hostp2m, thus it makes it
automatically altp2m specific - but I know what you mean. Yet, I believe
it is cleaner to separate the entire altp2m code and maintain it in
altp2m.c. Nevertheless, I will need to pull back parts of altp2m code
into p2m.c, if we will not share some of the initialization/teardown
functions between both files.

Regards,
~Sergej


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Locking on vm-event operations (monitor)

2016-07-22 Thread Corneliu ZUZU

On 7/22/2016 1:29 PM, Razvan Cojocaru wrote:

On 07/22/2016 01:17 PM, Corneliu ZUZU wrote:

On 7/22/2016 12:55 PM, Razvan Cojocaru wrote:

On 07/22/2016 12:27 PM, Corneliu ZUZU wrote:

Hi,

I've been inspecting vm-event code parts to try and understand when and
why domain pausing/locking is done. If I understood correctly, domain
pausing is done solely to force all the vCPUs of that domain to see a
configuration update and act upon it (e.g. in the case of a
XEN_DOMCTL_MONITOR_EVENT_WRITE_CTRLREG that enables CR3 monitoring,
domain pausing/unpausing ensures immediate enabling of CR3 load-exiting
for all vCPUs), not to synchronize concurrent operations
(lock-behavior).

As for locking, I see that for example vm_event_init_domain(),
vm_event_cleanup_domain() and monitor_domctl() are all protected by the
domctl-lock, but I don't think that's enough.
Here are a few code-paths that led me to believe that:
* do_hvm_op()->monitor_guest_request() reads d.monitor.guest_request_*
resources, but it doesn't seem to take the domctl lock, so it seems
possible for that to happen _while_ those resources are
initialized/cleaned-up
* monitor_guest_request() also calls
monitor_traps()->...->vm_event_wait_slot()->...->vm_event_grab_slot()
which attempts a vm_event_ring_lock(ved), which could also be called
_while_ that's initialized (vm_event_enable()) or cleaned-up
(vm_event_disable())
* hvm_monitor_cr() - e.g. on the code-path
vmx_vmexit_handler(EXIT_REASON_CR_ACCESS)->vmx_cr_access(VMX_CONTROL_REG_ACCESS_TYPE_MOV_TO_CR)->hvm_mov_to_cr()->hvm_set_cr0()->hvm_monitor_crX()

there doesn't seem to be taken into account the possibility of a
concurrent monitor_init_domain()/monitor_cleanup_domain()

Am I missing something with these conclusions?

Your conclusions look correct, but I assume that the reason why this has
not been addressed in the past is that introspection applications are
expected to be well-behaved.

I wouldn't think that was the rationale (considering that the risk is
crashing the hypervisor as a whole; also see below), I'd rather think
this simply went unnoticed.


Specifically, in a codebase where the
choice between uint64_t and long int matters speed-wise, and where
unlikely()s also matter, an extra lock may be an issue.

The typical flow of an introspection application is:

1. Initialize everything.
2. Subscribe to relevant events.
3. Event processing loop.
4. Unsubscribe from events.
5. Do a last-run of event processing (already queued in the ring buffer).
6. Uninitialize everything (no events are possible here because of steps
4-5).

And even if an introspection application behaves as sanely as you say,
the current layout of the code still doesn't guarantee bug-free
behavior. That's because for one, there are in-guest events (that
trigger access of vm-events resources) that are completely asynchronous
in relation to 2-6, for example a HVMOP_guest_request_vm_event
(subsequently calling monitor_guest_request()).

Not true. As part of step 4 d->monitor.guest_request_enabled will become
0, so no sensitive code will run after that:


I wasn't being rigorous with 2-6 above (with that I only meant to say 
'there can't be any vm-events before everything is initialized') and it 
seems true that at least in the monitor_guest_request() case it might 
not be able to produce a hypervisor crash (although there are a number 
of ugly inconsistencies on that code path). But nonetheless a hypervisor 
crash can _still_ happen even with a _sane_ toolstack user.


Look @ hvm_do_resume():

1. If you, as a toolstack user, get at sane step no. 6: "Uninitialize 
everything (no events are possible here because of steps 4-5)."
2. But just before you do that,  a hvm_do_resume() happens on an 
arbitrary vCPU of the domain and gets to this point:


if ( unlikely(v->arch.vm_event) )
{
// ---> let's say you are now at this point, just after the 
above non-NULL check

struct monitor_write_data *w = &v->arch.vm_event->write_data;

3. and before proceeding further, the uninitialization sequence 
_completes_, i.e. this was done in vm_event_cleanup_domain():


for_each_vcpu ( d, v )
{
xfree(v->arch.vm_event);
v->arch.vm_event = NULL;
}

4. and then in hvm_do_resume() this gets to be done:

struct monitor_write_data *w = &v->arch.vm_event->write_data;

[...]

if ( w->do_write.msr )
{
hvm_msr_write_intercept(w->msr, w->value, 0);
w->do_write.msr = 0;
}

then you'll have a beautiful NULL-pointer access hypervisor crash.



129 void monitor_guest_request(void)
130 {
131 struct vcpu *curr = current;
132 struct domain *d = curr->domain;
133
134 if ( d->monitor.guest_request_enabled )
135 {
136 vm_event_request_t req = {
137 .reason = VM_EVENT_REASON_GUEST_REQUEST,
138 .vcpu_id = curr->vcpu_id,
139 };
140
141 monitor_traps(curr, d->monitor.guest_request_sync, &req);
142

Re: [Xen-devel] Locking on vm-event operations (monitor)

2016-07-22 Thread Corneliu ZUZU

On 7/22/2016 2:13 PM, Andrew Cooper wrote:



On 22/07/16 11:31, Corneliu ZUZU wrote:

On 7/22/2016 12:51 PM, Andrew Cooper wrote:

On 22/07/16 10:27, Corneliu ZUZU wrote:


Hi,

I've been inspecting vm-event code parts to try and understand when 
and why domain pausing/locking is done. If I understood correctly, 
domain pausing is done solely to force all the vCPUs of that domain 
to see a configuration update and act upon it (e.g. in the case of 
a XEN_DOMCTL_MONITOR_EVENT_WRITE_CTRLREG that enables CR3 
monitoring, domain pausing/unpausing ensures immediate enabling of 
CR3 load-exiting for all vCPUs), not to synchronize concurrent 
operations (lock-behavior).


Correct.  Without the vcpu/domain pause, a vcpu could be midway 
through a vmexit path with the monitor configuration changing under 
its feet.  OTOH, it could be running in guest context, and only 
receive the update one scheduling timeslice later.




As for locking, I see that for example vm_event_init_domain(), 
vm_event_cleanup_domain() and monitor_domctl() are all protected by 
the domctl-lock, but I don't think that's enough.

Here are a few code-paths that led me to believe that:
* do_hvm_op()->monitor_guest_request() reads 
d.monitor.guest_request_* resources, but it doesn't seem to take 
the domctl lock, so it seems possible for that to happen _while_ 
those resources are initialized/cleaned-up
* monitor_guest_request() also calls 
monitor_traps()->...->vm_event_wait_slot()->...->vm_event_grab_slot() 
which attempts a vm_event_ring_lock(ved), which could also be 
called _while_ that's initialized (vm_event_enable()) or cleaned-up 
(vm_event_disable())
* hvm_monitor_cr() - e.g. on the code-path 
vmx_vmexit_handler(EXIT_REASON_CR_ACCESS)->vmx_cr_access(VMX_CONTROL_REG_ACCESS_TYPE_MOV_TO_CR)->hvm_mov_to_cr()->hvm_set_cr0()->hvm_monitor_crX() 
there doesn't seem to be taken into account the possibility of a 
concurrent monitor_init_domain()/monitor_cleanup_domain()


Am I missing something with these conclusions?
As a resolution for this, I've been thinking of adding a 
'subsys_lock' field in the vm_event_domain structure, either 
spinlock or rw-lock, which would be initialised/uninitialised when 
d.vm_event is allocated/freed 
(domain_create()/complete_domain_destroy()).


I don't think you are missing anything.  It seems like a monitor 
lock is needed.


FWIW, the domctl lock is only used at the moment because it was 
easy, and worked when everything was a domctl.  Being a global 
spinlock, it is a severe bottlekneck for concurrent domain 
management, which I need to find some time to break apart, so the 
less reliance on it the better from my point of view.


~Andrew



I was thinking of not touching the way the domctl lock is acquired in 
these cases at all, but rather leave that as it is and introduce the 
rw-lock separately. Are you suggesting I should also change do_domctl 
to skip domctl_lock_acquire()-ing if op->cmd is 
XEN_DOMCTL_vm_event_op ? That would certainly be a first step towards 
amending that bottleneck you mentioned, but it would break 
consistency of locking-behavior for domctls. I think it would also 
add significant complexity to what I'll be doing so I hope it 
wouldn't be a problem if we leave that for a future patch.


Sorry - I didn't intend to suggest altering the domctl lock.  Just 
that adding in a new monitor lock and using it everywhere where 
appropriate would be a good step.


Breaking the domctl lock is a very large can of worms, which you 
definitely won't want to open while attempting to do something else.


~Andrew



Yep, agreed.

Corneliu.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/3] xen-blkfront: dynamic configuration of per-vbd resources

2016-07-22 Thread Roger Pau Monné
On Fri, Jul 22, 2016 at 05:43:32PM +0800, Bob Liu wrote:
> 
> On 07/22/2016 05:34 PM, Roger Pau Monné wrote:
> > On Fri, Jul 22, 2016 at 04:17:48PM +0800, Bob Liu wrote:
> >>
> >> On 07/22/2016 03:45 PM, Roger Pau Monné wrote:
> >>> On Thu, Jul 21, 2016 at 06:08:05PM +0800, Bob Liu wrote:
> 
>  On 07/21/2016 04:57 PM, Roger Pau Monné wrote:
> >> ..[snip]..
> >> +
> >> +static ssize_t dynamic_reconfig_device(struct blkfront_info *info, 
> >> ssize_t count)
> >> +{
> >> +  unsigned int i;
> >> +  int err = -EBUSY;
> >> +
> >> +  /*
> >> +   * Make sure no migration in parallel, device lock is actually a
> >> +   * mutex.
> >> +   */
> >> +  if (!device_trylock(&info->xbdev->dev)) {
> >> +  pr_err("Fail to acquire dev:%s lock, may be in 
> >> migration.\n",
> >> +  dev_name(&info->xbdev->dev));
> >> +  return err;
> >> +  }
> >> +
> >> +  /*
> >> +   * Prevent new requests and guarantee no uncompleted reqs.
> >> +   */
> >> +  blk_mq_freeze_queue(info->rq);
> >> +  if (part_in_flight(&info->gd->part0))
> >> +  goto out;
> >> +
> >> +  /*
> >> +   * FrontBackend
> >> +   * Switch to XenbusStateClosed
> >> +   *  frontend_changed():
> >> +   *   case XenbusStateClosed:
> >> +   *  
> >> xen_blkif_disconnect()
> >> +   *  Switch to 
> >> XenbusStateClosed
> >> +   * blkfront_resume():
> >> +   *  frontend_changed():
> >> +   *  reconnect
> >> +   * Wait until XenbusStateConnected
> >> +   */
> >> +  info->reconfiguring = true;
> >> +  xenbus_switch_state(info->xbdev, XenbusStateClosed);
> >> +
> >> +  /* Poll every 100ms, 1 minute timeout. */
> >> +  for (i = 0; i < 600; i++) {
> >> +  /*
> >> +   * Wait backend enter XenbusStateClosed, 
> >> blkback_changed()
> >> +   * will clear reconfiguring.
> >> +   */
> >> +  if (!info->reconfiguring)
> >> +  goto resume;
> >> +  schedule_timeout_interruptible(msecs_to_jiffies(100));
> >> +  }
> >
> > Instead of having this wait, could you just set info->reconfiguring = 
> > 1, set 
> > the frontend state to XenbusStateClosed and mimic exactly what a resume 
> > from 
> > suspension does? blkback_changed would have to set the frontend state 
> > to 
> > InitWait when it detects that the backend has switched to Closed, and 
> > call 
> > blkfront_resume.
> 
> 
>  I think that won't work.
>  In the real "resume" case, the power management system will trigger all 
>  ->resume() path.
>  But there is no place for dynamic configuration.
> >>>
> >>> Hello,
> >>>
> >>> I think it should be possible to set info->reconfiguring and wait for the 
> >>> backend to switch to state Closed, at that point we should call 
> >>> blkif_resume 
> >>> (from blkback_changed) and the backend will follow the reconection.
> >>>
> >>
> >> Okay, I get your point. Yes, that's an option.
> >>
> >> But this will make 'dynamic configuration' to be async, I'm worry about 
> >> the end-user will get panic.
> >> E.g
> >> A end-user "echo  > /sys/devices/vbd-xxx/max_indirect_segs",
> >> but then the device will be Closed and disappeared, the user have to wait 
> >> for a random time so that the device can resume.
> > 
> > That should not happen, AFAICT on migration the device never dissapears. 
> 
> Oh, yes.
> 
> > alloc_disk and friends should not be called on resume from migration (see 
> > the switch in blkfront_connect, you should take the BLKIF_STATE_SUSPENDED 
> > path for the reconfiguration).
> > 
> 
> What about if the end-user starts I/O immediately after writing new value to 
> /sys?
> But the resume is still in progress.

blkif_free already stops the queues by calling blk_mq_stop_hw_queues, and 
blkif_queue_request will refuse to put anything on the ring if the state 
is different than connected, which in turn makes blkif_queue_rq call 
blk_mq_stop_hw_queue to stop the queue, so it should be safe.

Roger.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [distros-debian-jessie test] 66673: tolerable FAIL

2016-07-22 Thread Platform Team regression test user
flight 66673 distros-debian-jessie real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/66673/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 build-amd64   5 xen-buildfail blocked in 66571
 build-i3865 xen-buildfail blocked in 66571

Tests which did not succeed, but are not blocking:
 test-amd64-i386-amd64-jessie-netboot-pygrub  1 build-check(1)  blocked n/a
 test-amd64-amd64-i386-jessie-netboot-pygrub  1 build-check(1)  blocked n/a
 test-amd64-amd64-amd64-jessie-netboot-pvgrub  1 build-check(1) blocked n/a
 test-amd64-i386-i386-jessie-netboot-pvgrub  1 build-check(1)   blocked n/a
 test-armhf-armhf-armhf-jessie-netboot-pygrub 11 migrate-support-check fail 
never pass
 test-armhf-armhf-armhf-jessie-netboot-pygrub 12 saverestore-support-check fail 
never pass

baseline version:
 flight   66571

jobs:
 build-amd64  fail
 build-armhf  pass
 build-i386   fail
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-amd64-jessie-netboot-pvgrub blocked 
 test-amd64-i386-i386-jessie-netboot-pvgrub   blocked 
 test-amd64-i386-amd64-jessie-netboot-pygrub  blocked 
 test-armhf-armhf-armhf-jessie-netboot-pygrub pass
 test-amd64-amd64-i386-jessie-netboot-pygrub  blocked 



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Push not applicable.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH] xen: credit2: don't let b_avgload go negative.

2016-07-22 Thread Dario Faggioli
The ASSERT() made effective by b5b5876619bd8ec2e
("xen: credit2: fix two s_time_t handling issues
in load balancing") triggers for b_avgload (spotted
by OSSTest).

b_avgload is where we store the prediction of how
the load of a runqueue will look like in the medium
to long term, because of a vcpu being added to or
removed from there.

On vcpu removal, saturate down b_avgload to zero,
as it makes very few sense to predict that the
load of a runqueue will at some point become negative!

Signed-off-by: Dario Faggioli 
---
Cc: George Dunlap 
Cc: Wei Liu 
---
 xen/common/sched_credit2.c |8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index b92226c..1d79de0 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -1323,14 +1323,16 @@ runq_assign(const struct scheduler *ops, struct vcpu 
*vc)
 static void
 __runq_deassign(struct csched2_vcpu *svc)
 {
+struct csched2_runqueue_data *rqd = svc->rqd;
+
 ASSERT(!__vcpu_on_runq(svc));
 ASSERT(!(svc->flags & CSFLAG_scheduled));
 
 list_del_init(&svc->rqd_elem);
-update_max_weight(svc->rqd, 0, svc->weight);
+update_max_weight(rqd, 0, svc->weight);
 
 /* Expected new load based on removing this vcpu */
-svc->rqd->b_avgload -= svc->avgload;
+rqd->b_avgload = max_t(s_time_t, rqd->b_avgload - svc->avgload, 0);
 
 svc->rqd = NULL;
 }
@@ -1590,7 +1592,7 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu 
*vc)
 if ( rqd == svc->rqd )
 {
 if ( cpumask_intersects(vc->cpu_hard_affinity, &rqd->active) )
-rqd_avgload = rqd->b_avgload - svc->avgload;
+rqd_avgload = max_t(s_time_t, rqd->b_avgload - svc->avgload, 
0);
 }
 else if ( spin_trylock(&rqd->lock) )
 {


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Locking on vm-event operations (monitor)

2016-07-22 Thread Razvan Cojocaru
On 07/22/2016 02:39 PM, Corneliu ZUZU wrote:
> On 7/22/2016 1:29 PM, Razvan Cojocaru wrote:
>> On 07/22/2016 01:17 PM, Corneliu ZUZU wrote:
>>> On 7/22/2016 12:55 PM, Razvan Cojocaru wrote:
 On 07/22/2016 12:27 PM, Corneliu ZUZU wrote:
> Hi,
>
> I've been inspecting vm-event code parts to try and understand when
> and
> why domain pausing/locking is done. If I understood correctly, domain
> pausing is done solely to force all the vCPUs of that domain to see a
> configuration update and act upon it (e.g. in the case of a
> XEN_DOMCTL_MONITOR_EVENT_WRITE_CTRLREG that enables CR3 monitoring,
> domain pausing/unpausing ensures immediate enabling of CR3
> load-exiting
> for all vCPUs), not to synchronize concurrent operations
> (lock-behavior).
>
> As for locking, I see that for example vm_event_init_domain(),
> vm_event_cleanup_domain() and monitor_domctl() are all protected by
> the
> domctl-lock, but I don't think that's enough.
> Here are a few code-paths that led me to believe that:
> * do_hvm_op()->monitor_guest_request() reads d.monitor.guest_request_*
> resources, but it doesn't seem to take the domctl lock, so it seems
> possible for that to happen _while_ those resources are
> initialized/cleaned-up
> * monitor_guest_request() also calls
> monitor_traps()->...->vm_event_wait_slot()->...->vm_event_grab_slot()
> which attempts a vm_event_ring_lock(ved), which could also be called
> _while_ that's initialized (vm_event_enable()) or cleaned-up
> (vm_event_disable())
> * hvm_monitor_cr() - e.g. on the code-path
> vmx_vmexit_handler(EXIT_REASON_CR_ACCESS)->vmx_cr_access(VMX_CONTROL_REG_ACCESS_TYPE_MOV_TO_CR)->hvm_mov_to_cr()->hvm_set_cr0()->hvm_monitor_crX()
>
>
> there doesn't seem to be taken into account the possibility of a
> concurrent monitor_init_domain()/monitor_cleanup_domain()
>
> Am I missing something with these conclusions?
 Your conclusions look correct, but I assume that the reason why this
 has
 not been addressed in the past is that introspection applications are
 expected to be well-behaved.
>>> I wouldn't think that was the rationale (considering that the risk is
>>> crashing the hypervisor as a whole; also see below), I'd rather think
>>> this simply went unnoticed.
>>>
 Specifically, in a codebase where the
 choice between uint64_t and long int matters speed-wise, and where
 unlikely()s also matter, an extra lock may be an issue.

 The typical flow of an introspection application is:

 1. Initialize everything.
 2. Subscribe to relevant events.
 3. Event processing loop.
 4. Unsubscribe from events.
 5. Do a last-run of event processing (already queued in the ring
 buffer).
 6. Uninitialize everything (no events are possible here because of
 steps
 4-5).
>>> And even if an introspection application behaves as sanely as you say,
>>> the current layout of the code still doesn't guarantee bug-free
>>> behavior. That's because for one, there are in-guest events (that
>>> trigger access of vm-events resources) that are completely asynchronous
>>> in relation to 2-6, for example a HVMOP_guest_request_vm_event
>>> (subsequently calling monitor_guest_request()).
>> Not true. As part of step 4 d->monitor.guest_request_enabled will become
>> 0, so no sensitive code will run after that:
> 
> I wasn't being rigorous with 2-6 above (with that I only meant to say
> 'there can't be any vm-events before everything is initialized') and it
> seems true that at least in the monitor_guest_request() case it might
> not be able to produce a hypervisor crash (although there are a number
> of ugly inconsistencies on that code path). But nonetheless a hypervisor
> crash can _still_ happen even with a _sane_ toolstack user.
> 
> Look @ hvm_do_resume():
> 
> 1. If you, as a toolstack user, get at sane step no. 6: "Uninitialize
> everything (no events are possible here because of steps 4-5)."
> 2. But just before you do that,  a hvm_do_resume() happens on an
> arbitrary vCPU of the domain and gets to this point:
> 
> if ( unlikely(v->arch.vm_event) )
> {
> // ---> let's say you are now at this point, just after the
> above non-NULL check
> struct monitor_write_data *w = &v->arch.vm_event->write_data;
> 
> 3. and before proceeding further, the uninitialization sequence
> _completes_, i.e. this was done in vm_event_cleanup_domain():
> 
> for_each_vcpu ( d, v )
> {
> xfree(v->arch.vm_event);
> v->arch.vm_event = NULL;
> }
> 
> 4. and then in hvm_do_resume() this gets to be done:
> 
> struct monitor_write_data *w = &v->arch.vm_event->write_data;
> 
> [...]
> 
> if ( w->do_write.msr )
> {
> hvm_msr_write_intercept(w->msr, w->value, 0);
> w->do_write.msr = 0;
> }
> 
> then you'll have a bea

Re: [Xen-devel] [xen-unstable test] 97737: regressions - FAIL

2016-07-22 Thread Dario Faggioli
On Fri, 2016-07-22 at 11:49 +0100, Wei Liu wrote:
> On Fri, Jul 22, 2016 at 03:27:30AM +, osstest service owner
> wrote:
> > 
> > flight 97737 xen-unstable real [real]
> > http://logs.test-lab.xenproject.org/osstest/logs/97737/
> > 
> > Regressions :-(
> > 
> > Tests which did not succeed and are blocking,
> > including tests which could not be run:
> >  test-amd64-amd64-xl-credit2  19 guest-start/debian.repeat fail
> > REGR. vs. 97664
> >  test-armhf-armhf-xl  15 guest-start/debian.repeat fail
> > REGR. vs. 97664
> >  test-armhf-armhf-xl-credit2  15 guest-start/debian.repeat fail
> > REGR. vs. 97664
>
Thanks from bringing my attention to this. I've seen it happening in
local testing a few times also, and was investigating already.

> From
> http://logs.test-lab.xenproject.org/osstest/logs/97737/test-amd64-amd
> 64-xl-credit2/serial-fiano0.log
> 
> Jul 21 07:38:22.383917 (XEN) Assertion 'rqd->avgload >= 0 && rqd-
> >b_avgload >= 0' failed at sched_credit2.c:734
>
Right. My investigation shows that it is the b_avgload>=0 check that is
actually failing, and looking at the code confirmed that, and I found
out why.

I've just sent a patch that I believe will fix this.

I can't find it in the archives yet, but it should be:
 [PATCH] xen: credit2: don't let b_avgload go negative.
 msg-id: <146918909364.19443.6394900696027710502.st...@solace.fritz.box>

Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Locking on vm-event operations (monitor)

2016-07-22 Thread Corneliu ZUZU

On 7/22/2016 3:10 PM, Razvan Cojocaru wrote:

On 07/22/2016 02:39 PM, Corneliu ZUZU wrote:

On 7/22/2016 1:29 PM, Razvan Cojocaru wrote:

On 07/22/2016 01:17 PM, Corneliu ZUZU wrote:

On 7/22/2016 12:55 PM, Razvan Cojocaru wrote:

On 07/22/2016 12:27 PM, Corneliu ZUZU wrote:

Hi,

I've been inspecting vm-event code parts to try and understand when
and
why domain pausing/locking is done. If I understood correctly, domain
pausing is done solely to force all the vCPUs of that domain to see a
configuration update and act upon it (e.g. in the case of a
XEN_DOMCTL_MONITOR_EVENT_WRITE_CTRLREG that enables CR3 monitoring,
domain pausing/unpausing ensures immediate enabling of CR3
load-exiting
for all vCPUs), not to synchronize concurrent operations
(lock-behavior).

As for locking, I see that for example vm_event_init_domain(),
vm_event_cleanup_domain() and monitor_domctl() are all protected by
the
domctl-lock, but I don't think that's enough.
Here are a few code-paths that led me to believe that:
* do_hvm_op()->monitor_guest_request() reads d.monitor.guest_request_*
resources, but it doesn't seem to take the domctl lock, so it seems
possible for that to happen _while_ those resources are
initialized/cleaned-up
* monitor_guest_request() also calls
monitor_traps()->...->vm_event_wait_slot()->...->vm_event_grab_slot()
which attempts a vm_event_ring_lock(ved), which could also be called
_while_ that's initialized (vm_event_enable()) or cleaned-up
(vm_event_disable())
* hvm_monitor_cr() - e.g. on the code-path
vmx_vmexit_handler(EXIT_REASON_CR_ACCESS)->vmx_cr_access(VMX_CONTROL_REG_ACCESS_TYPE_MOV_TO_CR)->hvm_mov_to_cr()->hvm_set_cr0()->hvm_monitor_crX()


there doesn't seem to be taken into account the possibility of a
concurrent monitor_init_domain()/monitor_cleanup_domain()

Am I missing something with these conclusions?

Your conclusions look correct, but I assume that the reason why this
has
not been addressed in the past is that introspection applications are
expected to be well-behaved.

I wouldn't think that was the rationale (considering that the risk is
crashing the hypervisor as a whole; also see below), I'd rather think
this simply went unnoticed.


Specifically, in a codebase where the
choice between uint64_t and long int matters speed-wise, and where
unlikely()s also matter, an extra lock may be an issue.

The typical flow of an introspection application is:

1. Initialize everything.
2. Subscribe to relevant events.
3. Event processing loop.
4. Unsubscribe from events.
5. Do a last-run of event processing (already queued in the ring
buffer).
6. Uninitialize everything (no events are possible here because of
steps
4-5).

And even if an introspection application behaves as sanely as you say,
the current layout of the code still doesn't guarantee bug-free
behavior. That's because for one, there are in-guest events (that
trigger access of vm-events resources) that are completely asynchronous
in relation to 2-6, for example a HVMOP_guest_request_vm_event
(subsequently calling monitor_guest_request()).

Not true. As part of step 4 d->monitor.guest_request_enabled will become
0, so no sensitive code will run after that:

I wasn't being rigorous with 2-6 above (with that I only meant to say
'there can't be any vm-events before everything is initialized') and it
seems true that at least in the monitor_guest_request() case it might
not be able to produce a hypervisor crash (although there are a number
of ugly inconsistencies on that code path). But nonetheless a hypervisor
crash can _still_ happen even with a _sane_ toolstack user.

Look @ hvm_do_resume():

1. If you, as a toolstack user, get at sane step no. 6: "Uninitialize
everything (no events are possible here because of steps 4-5)."
2. But just before you do that,  a hvm_do_resume() happens on an
arbitrary vCPU of the domain and gets to this point:

 if ( unlikely(v->arch.vm_event) )
 {
 // ---> let's say you are now at this point, just after the
above non-NULL check
 struct monitor_write_data *w = &v->arch.vm_event->write_data;

3. and before proceeding further, the uninitialization sequence
_completes_, i.e. this was done in vm_event_cleanup_domain():

 for_each_vcpu ( d, v )
 {
 xfree(v->arch.vm_event);
 v->arch.vm_event = NULL;
 }

4. and then in hvm_do_resume() this gets to be done:

 struct monitor_write_data *w = &v->arch.vm_event->write_data;

 [...]

 if ( w->do_write.msr )
 {
 hvm_msr_write_intercept(w->msr, w->value, 0);
 w->do_write.msr = 0;
 }

then you'll have a beautiful NULL-pointer access hypervisor crash.

As I've said before, of course I agree with that


Sorry, I thought by your previous statement "I assume that the reason 
why this has not been addressed in the past is that introspection 
applications are expected to be well-behaved" you were inferring that 
with a 'sane' toolstack user, as you defined it, it won't be pos

Re: [Xen-devel] [PATCH] xen: credit2: don't let b_avgload go negative.

2016-07-22 Thread George Dunlap
On 22/07/16 13:04, Dario Faggioli wrote:
> The ASSERT() made effective by b5b5876619bd8ec2e
> ("xen: credit2: fix two s_time_t handling issues
> in load balancing") triggers for b_avgload (spotted
> by OSSTest).
> 
> b_avgload is where we store the prediction of how
> the load of a runqueue will look like in the medium
> to long term, because of a vcpu being added to or
> removed from there.
> 
> On vcpu removal, saturate down b_avgload to zero,
> as it makes very few sense to predict that the
> load of a runqueue will at some point become negative!
> 
> Signed-off-by: Dario Faggioli 

Acked-by: George Dunlap 

Wei, do you want to apply it?

Thanks,
 -George

> ---
> Cc: George Dunlap 
> Cc: Wei Liu 
> ---
>  xen/common/sched_credit2.c |8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index b92226c..1d79de0 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -1323,14 +1323,16 @@ runq_assign(const struct scheduler *ops, struct vcpu 
> *vc)
>  static void
>  __runq_deassign(struct csched2_vcpu *svc)
>  {
> +struct csched2_runqueue_data *rqd = svc->rqd;
> +
>  ASSERT(!__vcpu_on_runq(svc));
>  ASSERT(!(svc->flags & CSFLAG_scheduled));
>  
>  list_del_init(&svc->rqd_elem);
> -update_max_weight(svc->rqd, 0, svc->weight);
> +update_max_weight(rqd, 0, svc->weight);
>  
>  /* Expected new load based on removing this vcpu */
> -svc->rqd->b_avgload -= svc->avgload;
> +rqd->b_avgload = max_t(s_time_t, rqd->b_avgload - svc->avgload, 0);
>  
>  svc->rqd = NULL;
>  }
> @@ -1590,7 +1592,7 @@ csched2_cpu_pick(const struct scheduler *ops, struct 
> vcpu *vc)
>  if ( rqd == svc->rqd )
>  {
>  if ( cpumask_intersects(vc->cpu_hard_affinity, &rqd->active) )
> -rqd_avgload = rqd->b_avgload - svc->avgload;
> +rqd_avgload = max_t(s_time_t, rqd->b_avgload - svc->avgload, 
> 0);
>  }
>  else if ( spin_trylock(&rqd->lock) )
>  {
> 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen: credit2: don't let b_avgload go negative.

2016-07-22 Thread Wei Liu
On Fri, Jul 22, 2016 at 01:34:27PM +0100, George Dunlap wrote:
> On 22/07/16 13:04, Dario Faggioli wrote:
> > The ASSERT() made effective by b5b5876619bd8ec2e
> > ("xen: credit2: fix two s_time_t handling issues
> > in load balancing") triggers for b_avgload (spotted
> > by OSSTest).
> > 
> > b_avgload is where we store the prediction of how
> > the load of a runqueue will look like in the medium
> > to long term, because of a vcpu being added to or
> > removed from there.
> > 
> > On vcpu removal, saturate down b_avgload to zero,
> > as it makes very few sense to predict that the
> > load of a runqueue will at some point become negative!
> > 
> > Signed-off-by: Dario Faggioli 
> 
> Acked-by: George Dunlap 
> 
> Wei, do you want to apply it?
> 

I will apply it with your ack.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Locking on vm-event operations (monitor)

2016-07-22 Thread Razvan Cojocaru
On 07/22/2016 03:32 PM, Corneliu ZUZU wrote:
>>> Look @ hvm_do_resume():
>>>
>>> 1. If you, as a toolstack user, get at sane step no. 6: "Uninitialize
>>> everything (no events are possible here because of steps 4-5)."
>>> 2. But just before you do that,  a hvm_do_resume() happens on an
>>> arbitrary vCPU of the domain and gets to this point:
>>>
>>>  if ( unlikely(v->arch.vm_event) )
>>>  {
>>>  // ---> let's say you are now at this point, just after the
>>> above non-NULL check
>>>  struct monitor_write_data *w = &v->arch.vm_event->write_data;
>>>
>>> 3. and before proceeding further, the uninitialization sequence
>>> _completes_, i.e. this was done in vm_event_cleanup_domain():
>>>
>>>  for_each_vcpu ( d, v )
>>>  {
>>>  xfree(v->arch.vm_event);
>>>  v->arch.vm_event = NULL;
>>>  }
>>>
>>> 4. and then in hvm_do_resume() this gets to be done:
>>>
>>>  struct monitor_write_data *w = &v->arch.vm_event->write_data;
>>>
>>>  [...]
>>>
>>>  if ( w->do_write.msr )
>>>  {
>>>  hvm_msr_write_intercept(w->msr, w->value, 0);
>>>  w->do_write.msr = 0;
>>>  }
>>>
>>> then you'll have a beautiful NULL-pointer access hypervisor crash.
>> As I've said before, of course I agree with that
> 
> Sorry, I thought by your previous statement "I assume that the reason
> why this has not been addressed in the past is that introspection
> applications are expected to be well-behaved" you were inferring that
> with a 'sane' toolstack user, as you defined it, it won't be possible to
> have a hypervisor crash.

I did (please see below).

>>   - isn't that what a
>> previous patch you've submitted addressed?
> 
> I'm not sure what patch you're referring to, I don't remember ever
> addressing these issues before..

I am talking about your patch "x86/monitor: fix: don't compromise a
monitor_write_data with pending CR writes", which, if I'm not mistaken,
addresses the above problem (since we won't xfree(v->arch.vm_event) on
monitor uninit anymore).

With the above problem fixed, the workflow I suggested should be fine.

Again, I would prefer things to be rock solid, so personally I encourage
your patch and hope we get it in.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 15/22] xen/arm: Don't call p2m_alloc_table from arch_domain_create

2016-07-22 Thread Julien Grall



On 22/07/16 12:05, Sergej Proskurin wrote:



On 07/22/2016 12:38 PM, Julien Grall wrote:



On 22/07/16 11:39, Sergej Proskurin wrote:



On 07/22/2016 12:26 PM, Julien Grall wrote:



On 22/07/16 11:16, Sergej Proskurin wrote:

Hi Julien,


Hello,


On 07/22/2016 11:18 AM, Julien Grall wrote:



On 22/07/16 09:32, Sergej Proskurin wrote:

Hi Julien,


Hello Sergej,


-int p2m_alloc_table(struct domain *d)
+static int p2m_alloc_table(struct domain *d)


While moving parts of the altp2m code out of ./xen/arch/arm/p2m.c,
the
function p2m_alloc_table needs to be called from
./xen/arch/arm/altp2m.c
to allocate the individual altp2m views. Hence it should not be
static.


No, this function should not be called outside p2m.c as it will not
fully initialize the p2m. You need to need to provide a function to
initialize a p2m (such as p2m_init).



The last time we have discussed reusing existing code, among others,
for
individual struct p2m_domain initialization routines. Also, we have
agreed to move altp2m-related parts out of p2m.c into altp2m.c, which
makes it hard not to access parts required for initialization/teardown
(that are equal for both p2m and altp2m).


I remember this discussion. However, the p2m initialize/teardown should
exactly be the same for the hostp2m and altp2m (except for the type of
the p2m). So, a function should be provided to initialize a full p2m to
avoid code duplication.



This is exactly what has been done. Nevertheless, altp2m views are
somewhat more dynamic than hostp2m and hence require frequent
initialization/teardown of individual views. That is, by moving altp2m
parts out of p2m.c we simply need to access this shared code multiple
times at runtime from different altp2m-related functions. This applies
to more functions that just to p2m_alloc_table.


I am not convinced that you need to reallocate the root page every time
rather than clearing them. Anyway, I will need to see the code to
understand what is done.

Regards,



In the following, you will find an excerpt of both the p2m.c and
altp2m.c concerning the (alt)p2m initialization. Please note that
altp2m_init_helper is called from different initialization routines in
altp2m.c. Also, please consider that this code has not yet been ported
to your recent patches. Please let me know if you need more information.




static int altp2m_init_helper(struct domain *d, unsigned int idx)
{
int rc;
struct p2m_domain *p2m = d->arch.altp2m_p2m[idx];

if ( p2m == NULL )
{
/* Allocate a new altp2m view. */
p2m = xzalloc(struct p2m_domain);
if ( p2m == NULL)
{
rc = -ENOMEM;
goto err;
}
memset(p2m, 0, sizeof(struct p2m_domain));
}

/* Initialize the new altp2m view. */
rc = p2m_init_one(d, p2m);
if ( rc )
goto err;

/* Allocate a root table for the altp2m view. */
rc = p2m_alloc_table(p2m);


This patch moved p2m_alloc_table call into p2m_init (i.e your 
p2m_init_one). You complained that the function was not exported 
anymore, but you did not look how it was called in this patch.



if ( rc )
goto err;

p2m->p2m_class = p2m_alternate;
p2m->access_required = 1;
_atomic_set(&p2m->active_vcpus, 0);

d->arch.altp2m_p2m[idx] = p2m;
d->arch.altp2m_vttbr[idx] = p2m->vttbr.vttbr;


Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [qemu-mainline baseline-only test] 66672: regressions - FAIL

2016-07-22 Thread Platform Team regression test user
This run is configured for baseline tests only.

flight 66672 qemu-mainline real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/66672/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64   5 xen-build fail REGR. vs. 66623
 build-i386-xsm5 xen-build fail REGR. vs. 66623
 build-amd64-xsm   5 xen-build fail REGR. vs. 66623
 build-i3865 xen-build fail REGR. vs. 66623
 test-armhf-armhf-libvirt-qcow2  9 debian-di-install   fail REGR. vs. 66623

Tests which did not succeed, but are not blocking:
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-intel  1 build-check(1)   blocked  n/a
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-amd   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-rtds  1 build-check(1)   blocked  n/a
 test-amd64-amd64-qemuu-nested-intel  1 build-check(1)  blocked n/a
 test-amd64-amd64-i386-pvgrub  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-xsm1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 test-amd64-amd64-amd64-pvgrub  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-pygrub   1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-raw1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-amd64-pair 1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-winxpsp3  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1)blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1)blocked n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qcow2 1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-qemuu-nested-amd  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-winxpsp3  1 build-check(1)   blocked n/a
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt 14 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail  never pass
 test-armhf-armhf-libvirt-raw 13 guest-saverestorefail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-midway   13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl

Re: [Xen-devel] [PATCH 01/19] xen: Create a new file xen_pvdev.c

2016-07-22 Thread Quan Xu

    sorry for the bad format from web email, and later review (neo training in 
new company)..    patch 6 -- patch 12, rename * patch,  are good to me. 
 Quan  
--From:Emil 
Condrea Time:2016 Jul 19 (Tue) 00:54To:Eric Blake 
Cc:qemu-devel ; Daniel De Graaf 
; xen-devel ; Stefano 
Stabellini ; Quan Xu ; wei.liu2 
; stefanb ; anthony.perard 
Subject:Re: [Xen-devel] [PATCH 01/19] xen: Create a 
new file xen_pvdev.c
Eric, this is the link to the original patch which is well formatted: 
http://marc.info/?l=xen-devel&m=146815138831762&w=2I think that the formatting 
and s-o-b was broken in the reply from Quan.On Jul 18, 2016 17:57, "Eric Blake" 
 wrote:
On 07/17/2016 01:41 AM, Quan Xu wrote:

>

> [Quan:]: comment starts with [Quan:]

>


This line doesn't belong in a commit message; it's fine to put it after

the --- separator though, if it aids mailing list reviewers.


>

> The purpose of the new file is to store generic functions shared by 
> frontendand backends such as xenstore operations, xendevs.

>


s/frontendand/front end and/


Please wrap your commit message lines.  Since 'git log' displays logs

with indentation, wrapping around 72 characters is ideal.


> Signed-off-by: Quan Xu 

> Signed-off-by: Emil Condrea 


These are not valid S-o-b, therefore this patch cannot be applied as-is.



> -int xenstore_read_int(const char *base, const char *node, int *ival)

> -{

> -    char *val;

> -    int rc = -1;

> -

> -    val = xenstore_read_str(base, node);

> [Quan:]:  IMO, it is better to initialize val when declares.  the same 
> comment for the other 'val'

> -    if (val && 1 == sscanf(val, "%d", ival)) {


This is not a valid patch.  Are you replying to a patch that someone

else posted? If so, your quoting style is VERY difficult to read.

Please consider using a leading > before every line that you are quoting

(rather than pasting it verbatim as if you had written it), and include

a blank line both before and after every line that you insert, to call

visual attention to what is your reply vs. what you are quoting.



--

Eric Blake   eblake redhat com    +1-919-301-3266

Libvirt virtualization library http://libvirt.org


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 2/7] xen/arm: cpufeature: Provide an helper to check if a capability is supported

2016-07-22 Thread Konrad Rzeszutek Wilk
On Wed, Jul 20, 2016 at 04:25:55PM +0100, Julien Grall wrote:
> The CPU capabilities will be set depending on the value found in the CPU
> registers. This patch provides a generic to go through a set of capabilities
> and find which one should be enabled.
> 
> The parameter "info" is used to display the kind of capability updated (e.g
> workaround, feature...).
> 
> Signed-off-by: Julien Grall 
> Acked-by: Stefano Stabellini 

Reviewed-by: Konrad Rzeszutek Wilk  with just one minor
suggestion:

> 
> ---
> Changes in v4:
> - Add Stefano's acked-by
> 
> Changes in v3:
>  - Patch added. The code was previously part of "Detect
>  silicon...".
> ---
>  xen/arch/arm/cpufeature.c| 16 
>  xen/include/asm-arm/cpufeature.h |  9 +
>  2 files changed, 25 insertions(+)
> 
> diff --git a/xen/arch/arm/cpufeature.c b/xen/arch/arm/cpufeature.c
> index 7a1b56b..088625b 100644
> --- a/xen/arch/arm/cpufeature.c
> +++ b/xen/arch/arm/cpufeature.c
> @@ -24,6 +24,22 @@
>  
>  DECLARE_BITMAP(cpu_hwcaps, ARM_NCAPS);
>  
> +void update_cpu_capabilities(const struct arm_cpu_capabilities *caps,
> + const char *info)
> +{
> +int i;
> +
> +for ( i = 0; caps[i].matches; i++ )
> +{
> +if ( !caps[i].matches(&caps[i]) )

So what if the 'struct arm_cpu_capabilitues' has '->matches' set to
NULL?

Perhaps:

if ( !caps[i].matches || !caps[i]... ?
> +continue;
> +
> +if ( !cpus_have_cap(caps[i].capability) && caps[i].desc )
> +printk(XENLOG_INFO "%s: %s\n", info, caps[i].desc);
> +cpus_set_cap(caps[i].capability);
> +}
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/include/asm-arm/cpufeature.h 
> b/xen/include/asm-arm/cpufeature.h
> index 2bebad1..be2414c 100644
> --- a/xen/include/asm-arm/cpufeature.h
> +++ b/xen/include/asm-arm/cpufeature.h
> @@ -62,6 +62,15 @@ static inline void cpus_set_cap(unsigned int num)
>  __set_bit(num, cpu_hwcaps);
>  }
>  
> +struct arm_cpu_capabilities {
> +const char *desc;
> +u16 capability;
> +bool_t (*matches)(const struct arm_cpu_capabilities *);
> +};
> +
> +void update_cpu_capabilities(const struct arm_cpu_capabilities *caps,
> + const char *info);
> +
>  #endif /* __ASSEMBLY__ */
>  
>  #endif
> -- 
> 1.9.1
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 1/7] xen/arm: Introduce alternative runtime patching

2016-07-22 Thread Konrad Rzeszutek Wilk
> diff --git a/xen/arch/arm/alternative.c b/xen/arch/arm/alternative.c
> new file mode 100644
> index 000..d00f98e
> --- /dev/null
> +++ b/xen/arch/arm/alternative.c

Hey!

I've some comments, most of them are light. What I am concerned most
is the difference between comments in the header vs the code. 
Looking at Linux code it seems to have the same issue - and I know you don't
want to differ to much from Linux.

Perhaps go with the changes as is (With the comments being off) and the
then have some patches for Linux for the header mismatches and those
can be backported?

..snip..
> +static int __apply_alternatives(const struct alt_region *region)
> +{
> +const struct alt_instr *alt;
> +const u32 *origptr, *replptr;
> +u32 *writeptr, *writemap;
> +mfn_t text_mfn = _mfn(virt_to_mfn(_stext));
> +unsigned int text_order = get_order_from_bytes(_end - _start);
> +
> +printk("Patching kernel code\n");

I see you use prefixed later with 'alternatives:' and also this is without
any prefix (DEBUG, or ERR). Would it make sense to have this one above
be dprintk while the one below with printk (KERN_ERR) ?
> +
> +/*
> + * The text section is read-only. So re-map Xen to be able to patch
> + * the code.
> + */
> +writemap = __vmap(&text_mfn, 1 << text_order, 1, 1, PAGE_HYPERVISOR,
> +  VMAP_DEFAULT);
> +if ( !writemap )
> +{
> +printk("alternatives: Unable to map the text section\n");

Perhaps include the size of the text section? That may help in figuring
out what went wrong or if one could increase the VMAP area?

> +return -ENOMEM;
> +}
> +
> +for ( alt = region->begin; alt < region->end; alt++ )
> +{
> +u32 insn;
> +int i, nr_inst;
> +
> +if ( !cpus_have_cap(alt->cpufeature) )
> +continue;
> +
> +BUG_ON(alt->alt_len != alt->orig_len);

The header file says:
+   u8  alt_len;/* size of new instruction(s), <= orig_len */

So this BUG_ON seems incorrect?
But hten later in the header it says:
"1. Be exactly the same length (in bytes) as the default code
+ *sequence."

So the BUG_ON is correct. Perhaps the 'alt_len' comment should be s/<=/==/ ?

> +
> +origptr = ALT_ORIG_PTR(alt);
> +writeptr = origptr - (u32 *)_start + writemap;

How about just using writeptr += ?

Ah wait. You are trying to preserve the Linux code!. Nevermind then.

> +replptr = ALT_REPL_PTR(alt);
> +
> +nr_inst = alt->alt_len / sizeof(insn);
> +
> +for ( i = 0; i < nr_inst; i++ )
> +{
> +insn = get_alt_insn(alt, origptr + i, replptr + i);
> +*(writeptr + i) = cpu_to_le32(insn);
> +}
> +
> +/* Ensure the new instructions reached the memory and nuke */
> +clean_and_invalidate_dcache_va_range(writeptr,
> + (sizeof (*writeptr) * nr_inst));
> +}
> +
> +/* Nuke the instruction cache */
> +invalidate_icache();
> +
> +vunmap(writemap);
> +
> +return 0;
> +}
> +
> +/*
> + * We might be patching the stop_machine state machine, so implement a
> + * really simple polling protocol here.
> + */
> +static int __apply_alternatives_multi_stop(void *unused)
> +{
> +static int patched = 0;

Shouldn't this be 'atomic_t' ?

> +struct alt_region region = {

Can this 'const' ?
> +.begin = __alt_instructions,
> +.end = __alt_instructions_end,
> +};
> +
> +/* We always have a CPU 0 at this point (__init) */
> +if ( smp_processor_id() )
> +{
> +while ( !read_atomic(&patched) )
> +cpu_relax();
> +isb();
> +}
> +else
> +{
> +int ret;
> +
> +BUG_ON(patched);
> +ret = __apply_alternatives(®ion);
> +/* The patching is not expected to fail during boot. */
> +BUG_ON(ret != 0);
> +
> +/* Barriers provided by the cache flushing */
> +write_atomic(&patched, 1);
> +}
> +
> +return 0;
> +}
> +
> +/*
> + * This function should only be called during boot and before CPU0 jump
> + * into the idle_loop.

So could you add an:
> + */
> +void __init apply_alternatives_all(void)
> +{
> +int ret;
> +
ASSERT(system_state != SYS_STATE_active);
?

> + /* better not try code patching on a live SMP system */
> +ret = stop_machine_run(__apply_alternatives_multi_stop, NULL, NR_CPUS);
> +
> +/* stop_machine_run should never fail at this stage of the boot */
> +BUG_ON(ret);
> +}
> +
> +int apply_alternatives(void *start, size_t length)
> +{
> +struct alt_region region = {

Could this be 'const'?

> +.begin = start,
> +.end = start + length,
> +};
> +
> +return __apply_alternatives(®ion);
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> index

Re: [Xen-devel] [PATCH v5 3/7] xen/arm: Detect silicon revision and set cap bits accordingly

2016-07-22 Thread Konrad Rzeszutek Wilk
On Wed, Jul 20, 2016 at 04:25:56PM +0100, Julien Grall wrote:
> After each CPU has been started, we iterate through a list of CPU
> errata to detect CPUs which need from hypervisor code patches.
> 
> For each bug there is a function which check if that a particular CPU is
s/check/checks/
> affected. This needs to be done on every CPUs to cover heterogenous
s/CPUs/CPU/
> system properly.
s/system/systems/ ? (not sure)
> 
> If a certain erratum has been detected, the capability bit will be set.
> In the case the erratum requires code patching, this will be triggered
> by the call to apply_alternatives.
s/the///

> 
> The code is based on the file arch/arm64/kernel/cpu_errata.c in Linux
> v4.6-rc3.
> 
> Signed-off-by: Julien Grall 
> Acked-by: Stefano Stabellini 
> 
> ---
> Changes in v4:
> - Add missing emacs magic blocks
> - Add Stefano's acked-by
> 
> Changes in v3:
> - Move update_cpu_capabilities in a separate patch
> - Update the commit message to mention that workaround may
> not require code patching.
> 
> Changes in v2:
> - Use XENLOG_INFO for the loglevel of the message
> ---
>  xen/arch/arm/Makefile|  1 +
>  xen/arch/arm/cpuerrata.c | 34 ++
>  xen/arch/arm/setup.c |  3 +++
>  xen/arch/arm/smpboot.c   |  3 +++
>  xen/include/asm-arm/cpuerrata.h  | 14 ++
>  xen/include/asm-arm/cpufeature.h |  6 ++
>  6 files changed, 61 insertions(+)
>  create mode 100644 xen/arch/arm/cpuerrata.c
>  create mode 100644 xen/include/asm-arm/cpuerrata.h
> 
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index 74bd7b8..23aaf52 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -7,6 +7,7 @@ subdir-$(CONFIG_ACPI) += acpi
>  obj-$(CONFIG_ALTERNATIVE) += alternative.o
>  obj-y += bootfdt.o
>  obj-y += cpu.o
> +obj-y += cpuerrata.o
>  obj-y += cpufeature.o
>  obj-y += decode.o
>  obj-y += device.o
> diff --git a/xen/arch/arm/cpuerrata.c b/xen/arch/arm/cpuerrata.c
> new file mode 100644
> index 000..03ae7b4
> --- /dev/null
> +++ b/xen/arch/arm/cpuerrata.c
> @@ -0,0 +1,34 @@
> +#include 
> +#include 
> +#include 
> +
> +#define MIDR_RANGE(model, min, max) \
> +.matches = is_affected_midr_range,  \
> +.midr_model = model,\
> +.midr_range_min = min,  \
> +.midr_range_max = max
> +
> +static bool_t __maybe_unused
> +is_affected_midr_range(const struct arm_cpu_capabilities *entry)
> +{
> +return MIDR_IS_CPU_MODEL_RANGE(boot_cpu_data.midr.bits, 
> entry->midr_model,
> +   entry->midr_range_min,
> +   entry->midr_range_max);
> +}
> +
> +static const struct arm_cpu_capabilities arm_errata[] = {
> +{},
> +};
> +
> +void check_local_cpu_errata(void)
> +{
> +update_cpu_capabilities(arm_errata, "enabled workaround for");
> +}
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> index 97b3214..38eb888 100644
> --- a/xen/arch/arm/setup.c
> +++ b/xen/arch/arm/setup.c
> @@ -43,6 +43,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -171,6 +172,8 @@ static void __init processor_id(void)
>  }
>  
>  processor_setup();
> +
> +check_local_cpu_errata();
>  }
>  
>  void dt_unreserved_regions(paddr_t s, paddr_t e,
> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
> index 3a962f7..d56b91d 100644
> --- a/xen/arch/arm/smpboot.c
> +++ b/xen/arch/arm/smpboot.c
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -316,6 +317,8 @@ void start_secondary(unsigned long boot_phys_offset,
>  local_irq_enable();
>  local_abort_enable();
>  
> +check_local_cpu_errata();
> +
>  printk(XENLOG_DEBUG "CPU %u booted.\n", smp_processor_id());
>  
>  startup_cpu_idle_loop();
> diff --git a/xen/include/asm-arm/cpuerrata.h b/xen/include/asm-arm/cpuerrata.h
> new file mode 100644
> index 000..fe93beb
> --- /dev/null
> +++ b/xen/include/asm-arm/cpuerrata.h
> @@ -0,0 +1,14 @@
> +#ifndef __ARM_CPUERRATA_H

Sorry about having engaged the pedantic review mode, but this caught my
eye.

I thought the style was to prefix it with __ and also postfix it with __:

$ grep "__" *.h
decode.h:#ifndef __ARCH_ARM_DECODE_H_
decode.h:#define __ARCH_ARM_DECODE_H_
decode.h:#endif /* __ARCH_ARM_DECODE_H_ */
kernel.h:#ifndef __ARCH_ARM_KERNEL_H__
kernel.h:#define __ARCH_ARM_KERNEL_H__
kernel.h:#endif /* #ifdef __ARCH_ARM_KERNEL_H__ */
vtimer.h:#ifndef __ARCH_ARM_VTIMER_H__
vtimer.h:#define __ARCH_ARM_VTIMER_H__
vuart.h:#ifndef __ARCH_ARM_VUART_H__
vuart.h:#define __ARCH_ARM_VUART_H__
vuart.h:#endif /* __ARCH_ARM_VUART_H__ */

And the include/asm also has have this in in surplus.

It rea

Re: [Xen-devel] [PATCH 01/19] xen: Create a new file xen_pvdev.c

2016-07-22 Thread Quan Xu
Anthony, thanks for your explaination.IMO, patch 1  and patch 2 need your 
detailed review.. IMO the reset patches are good in general..Emil, if patch 1 / 
patch 2 are reviewed from anthony, could you send out v10? :) i know it's not 
an easy task, thanks in advence!!
Quan

On Mon, 18 Jul 2016 15:50:27 +0100, anthony.perard 
wrote:> On Sun, Jul 17, 2016 at 03:41:26PM +0800, Quan Xu wrote:
> -int xenstore_write_int(const char *base, const char *node, int ival)
> -{
> -char val[12];
> -
> [Quan:]: why 12 ? what about XEN_BUFSIZE? 

That is the number of digit in INT_MAX (10) + 1 for the sign + 1 for '\0'.

> -snprintf(val, sizeof(val), "%d", ival);
> -return xenstore_write_str(base, node, val);
> -}
> -
> -int xenstore_write_int64(const char *base, const char *node, int64_t ival)
> -{
> -char val[21];
> -
> [Quan:]: why 21 ? what about XEN_BUFSIZE?

Same with INT64_MAX (19 digits).

> 
> -snprintf(val, sizeof(val), "%"PRId64, ival);
> -return xenstore_write_str(base, node, val);
> -}
> -
> -int xenstore_read_int(const char *base, const char *node, int *ival)
> -{
> -char *val;
> -int rc = -1;
> -
> -val = xenstore_read_str(base, node);
> [Quan:]:  IMO, it is better to initialize val when declares.

I think I prefer it this way.

> -if (val && 1 == sscanf(val, "%d", ival)) {
> -rc = 0;
> -}
> -g_free(val);
> -return rc;
> -}

-- 
Anthony PERARD___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 4/7] xen/arm: Document the errata implemented in Xen

2016-07-22 Thread Konrad Rzeszutek Wilk
On Wed, Jul 20, 2016 at 04:25:57PM +0100, Julien Grall wrote:
> The new document will help to keep track of each erratum Xen is able to
> handle.
> 
> The text is based on the Linux doc in Documents/arm64/silicon-errata.txt.
> 
> Also list the current errata that Xen is aware of.
> 
> Signed-off-by: Julien Grall 
> Acked-by: Stefano Stabellini 

Reviewed-by: Konrad Rzeszutek Wilk 

> 
> ---
> Changes in v4:
> - Fix grammar in the document : s/by ARM64/on ARM64/
> - Add Stefano's acked-by
> 
> Changes in v3:
> - Fix grammar in the commit message
> - s/Linux/Xen/
> - Mention that runtime patching is only supported by ARM64
> ---
>  docs/misc/arm/silicon-errata.txt | 45 
> 
>  1 file changed, 45 insertions(+)
>  create mode 100644 docs/misc/arm/silicon-errata.txt
> 
> diff --git a/docs/misc/arm/silicon-errata.txt 
> b/docs/misc/arm/silicon-errata.txt
> new file mode 100644
> index 000..5d54694
> --- /dev/null
> +++ b/docs/misc/arm/silicon-errata.txt
> @@ -0,0 +1,45 @@
> +Silicon Errata and Software Workarounds
> +===
> +
> +It is an unfortunate fact of life that hardware is often produced with
> +so-called "errata", which can cause it to deviate from the architecture
> +under specific circumstances.  For hardware produced by ARM, these
> +errata are broadly classified into the following categories:
> +
> +  Category A: A critical error without a viable workaround.
> +  Category B: A significant or critical error with an acceptable
> +  workaround.
> +  Category C: A minor error that is not expected to occur under normal
> +  operation.
> +
> +For more information, consult one of the "Software Developers Errata
> +Notice" documents available on infocenter.arm.com (registration
> +required).
> +
> +As far as Xen is concerned, Category B errata may require some special
> +treatment in the hypervisor. For example, avoiding a particular sequence
> +of code, or configuring the processor in a particular way. A less common
> +situation may require similar actions in order to declassify a Category A
> +erratum into a Category C erratum. These are collectively known as
> +"software workarounds" and are only required in the minority of cases
> +(e.g. those cases that both require a non-secure workaround *and* can
> +be triggered by Xen).
> +
> +For software workarounds that may adversely impact systems unaffected by
> +the erratum in question, a Kconfig entry is added under "ARM errata
> +workarounds via the alternatives framework". These are enabled by default
> +and patched in at runtime when an affected CPU is detected. Note that
> +runtime patching is only supported on ARM64. For less-intrusive workarounds,
> +a Kconfig option is not available and the code is structured (preferably
> +with a comment) in such a way that the erratum will not be hit.
> +
> +This approach can make it slightly onerous to determine exactly which
> +errata are worked around in an arbitrary hypervisor source tree, so this
> +file acts as a registry of software workarounds in the Xen hypervisor and
> +will be updated when new workarounds are committed and backported to
> +stable hypervisors.
> +
> +| Implementor| Component   | Erratum ID  | Kconfig   
>   |
> +++-+-+-+
> +| ARM| Cortex-A15  | #766422 | N/A   
>   |
> +| ARM| Cortex-A57  | #852523 | N/A   
>   |
> -- 
> 1.9.1
> 
> 
> ___
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 01/19] xen: Create a new file xen_pvdev.c

2016-07-22 Thread Emil Condrea
Sure, I will continue to send revisions until it is approved upstream.
On Jul 22, 2016 5:24 PM, "Quan Xu"  wrote:

> Anthony, thanks for your explaination.
> IMO, patch 1  and patch 2 need your detailed review.. IMO the reset
> patches are good in general..
> Emil, if patch 1 / patch 2 are reviewed from anthony, could you send out
> v10? :) i know it's not an easy task, thanks in advence!!
>
> Quan
>
>
> On Mon, 18 Jul 2016 15:50:27 +0100, anthony.perard<
> anthony.per...@citrix.com> wrote:
> > On Sun, Jul 17, 2016 at 03:41:26PM +0800, Quan Xu wrote:
> > -int xenstore_write_int(const char *base, const char *node, int ival)
> > -{
> > -char val[12];
> > -
> > [Quan:]: why 12 ? what about XEN_BUFSIZE?
>
> That is the number of digit in INT_MAX (10) + 1 for the sign + 1 for '\0'.
>
> > -snprintf(val, sizeof(val), "%d", ival);
> > -return xenstore_write_str(base, node, val);
> > -}
> > -
>
> > -int xenstore_write_int64(const char *base, const char *node, int64_t ival)
> > -{
> > -char val[21];
> > -
> > [Quan:]: why 21 ? what about XEN_BUFSIZE?
>
> Same with INT64_MAX (19 digits).
>
> >
> > -snprintf(val, sizeof(val), "%"PRId64, ival);
> > -return xenstore_write_str(base, node, val);
> > -}
> > -
> > -int xenstore_read_int(const char *base, const char *node, int *ival)
> > -{
> > -char *val;
> > -int rc = -1;
> > -
> > -val = xenstore_read_str(base, node);
> > [Quan:]:  IMO, it is better to initialize val when declares.
>
> I think I prefer it this way.
>
> > -if (val && 1 == sscanf(val, "%d", ival)) {
> > -rc = 0;
> > -}
> > -g_free(val);
> > -return rc;
> > -}
>
> --
> Anthony PERARD
>
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 0/7] xen/arm: Introduce alternative runtime patching for ARM64

2016-07-22 Thread Konrad Rzeszutek Wilk
On Wed, Jul 20, 2016 at 04:25:53PM +0100, Julien Grall wrote:
> Hello,
> 
> Some of the processor errata will require to modify code sequence. As those
> modifications may impact the performance, they should only be enabled on
> affected cores. Furthermore, Xen may also want to take advantage of
> new hardware features coming up with v8.1 and v8.2.
> 
> The first part of the series adds the alternative infrastructure, most of
> the code is based on Linux v4.6-rc3. The rest of the series implements
> errata for Cortex-A57 and Cortex-A53.
> 
> A branch with all the patches can be found here:
> 
> git://xenbits.xen.org/people/julieng/xen-unstable.git branch alternative-v5
> 
> For all the changes, see in each patch.

Hey!

I took a look at the patches. Only had some brief comments. I didn't review
the last three patches as I am not that familiar with the instructions.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 5/9] xen/arm: Provide macros to help creating workaround helpers

2016-07-22 Thread Konrad Rzeszutek Wilk
On Wed, Jul 20, 2016 at 8:43 AM, Julien Grall  wrote:
> Hi Konrad,
>
>>> For instance, the line bellow will create a workaround helper for
>>> erratum #424242 which is enabled when the capability
>>> ARM64_WORKAROUND_424242 is set and only available for ARM64:

42, eh?

>>>
>>> CHECK_WORKAROUND_HELPER(424242, ARM64_WORKAROUND_42424242, CONFIG_ARM64)
>>>
>>> Signed-off-by: Julien Grall 
>>
>>
>> It looks good to me. CC'ing Konrad which is more knowledgeable than me
>> about ALTERNATIVE.
>
>
> Do you have any opinions on this patch?

Yes I do:

Reviewed-by: Konrad Rzeszutek Wilk 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] "X86_PV_VCPU_MSRS record truncated" during domain restore

2016-07-22 Thread Massimo Colombi
The patches (that are attached to "[PATCH RFC 0/4] Fix issues with 
zero-length records in migration v2") work!


I patched locally qubes-builder to import your patches and recreate rpm 
files. These patches also work on Xen 4.6.1.


Best regards,

  Massimo

On 07/21/2016 10:53 AM, Andrew Cooper wrote:

On 21/07/2016 09:39, Massimo Colombi wrote:

This is the output of verify-stream-v2:

[user@dom0 scripts]$ sudo xl save fedora-23-dvm /fedora-23-dvm-savefile
Saving to /fedora-23-dvm-savefile new xl format (info 0x3/0x0/1991)
xc: info: Saving domain 4, type x86 PV
xc: Frames: 912384/912384  100%
xc: End of stream: 0/00%

[user@dom0 scripts]$ ./verify-stream-v2 -f xl -i /fedora-23-dvm-savefile
Stream Error:
Traceback (most recent call last):
   File "./verify-stream-v2", line 82, in read_stream
 VerifyLibxl(info, stream_read).verify()
   File "/usr/lib64/python2.7/site-packages/xen/migration/libxl.py",
line 82, in verify
 while self.verify_record() != REC_TYPE_end:
   File "/usr/lib64/python2.7/site-packages/xen/migration/libxl.py",
line 136, in verify_record
 record_verifiers[rtype](self, content[:length])
   File "/usr/lib64/python2.7/site-packages/xen/migration/libxl.py",
line 155, in verify_record_libxc_context
 VerifyLibxc(self.info, self.read).verify()
   File "/usr/lib64/python2.7/site-packages/xen/migration/libxc.py",
line 132, in verify
 while self.verify_record() != REC_TYPE_end:
   File "/usr/lib64/python2.7/site-packages/xen/migration/libxc.py",
line 227, in verify_record
 record_verifiers[rtype](self, content[:length])
   File "/usr/lib64/python2.7/site-packages/xen/migration/libxc.py",
line 429, in 
 VerifyLibxc.verify_record_x86_pv_vcpu_generic(s, x, "msrs"),
   File "/usr/lib64/python2.7/site-packages/xen/migration/libxc.py",
line 323, in verify_record_x86_pv_vcpu_generic
 " bytes long" % (name, minsz))
RecordError: X86_PV_VCPU_msrs record length must be at least 8 bytes long


On 07/21/2016 02:10 AM, Andrew Cooper wrote:

Is it possible to do an `xl save` equivalent on the domain, and run
tools/python/scripts/verify-stream-v2 against the resulting file?  That
should identify whether it is a malformed X86_PV_VCPU_MSRS record but
otherwise intact stream, or whether the stream becomes corrupted
elsewhere?

Thanks for your explanation of the bug.

I should also improve the reported error message.

Do you mind rerunning with an extra -v to dump a list of the records
found in the stream?

~Andrew



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 2/4] tools: split out xenstored starting form xencommons

2016-07-22 Thread Juergen Gross
In order to prepare starting a xenstore domain split out the starting
of the xenstore daemon from the xencommons script into a dedicated
launch-xenstore script.

Correct one error: don't remove old tdb files in background, as this
could lead to very subtle races.

A rerun of autogen.sh is required.

Signed-off-by: Juergen Gross 
---
 .gitignore   |  1 +
 tools/configure  |  3 +-
 tools/configure.ac   |  1 +
 tools/hotplug/Linux/Makefile |  1 +
 tools/hotplug/Linux/init.d/xencommons.in | 36 ++---
 tools/hotplug/Linux/launch-xenstore.in   | 55 
 6 files changed, 63 insertions(+), 34 deletions(-)
 create mode 100644 tools/hotplug/Linux/launch-xenstore.in

diff --git a/.gitignore b/.gitignore
index 9b8dece..d193820 100644
--- a/.gitignore
+++ b/.gitignore
@@ -157,6 +157,7 @@ tools/hotplug/Linux/init.d/xen-watchdog
 tools/hotplug/Linux/init.d/xencommons
 tools/hotplug/Linux/init.d/xendomains
 tools/hotplug/Linux/init.d/xendriverdomain
+tools/hotplug/Linux/launch-xenstore
 tools/hotplug/Linux/systemd/*.conf
 tools/hotplug/Linux/systemd/*.mount
 tools/hotplug/Linux/systemd/*.socket
diff --git a/tools/configure b/tools/configure
index 1c84c6c..51f16c5 100755
--- a/tools/configure
+++ b/tools/configure
@@ -2410,7 +2410,7 @@ ac_compiler_gnu=$ac_cv_c_compiler_gnu
 
 
 
-ac_config_files="$ac_config_files ../config/Tools.mk 
hotplug/FreeBSD/rc.d/xencommons hotplug/FreeBSD/rc.d/xendriverdomain 
hotplug/Linux/init.d/sysconfig.xencommons 
hotplug/Linux/init.d/sysconfig.xendomains hotplug/Linux/init.d/xen-watchdog 
hotplug/Linux/init.d/xencommons hotplug/Linux/init.d/xendomains 
hotplug/Linux/init.d/xendriverdomain hotplug/Linux/vif-setup 
hotplug/Linux/xen-hotplug-common.sh hotplug/Linux/xendomains 
hotplug/NetBSD/rc.d/xencommons hotplug/NetBSD/rc.d/xendriverdomain 
libxl/xenlight.pc.in libxl/xlutil.pc.in ocaml/xenstored/oxenstored.conf"
+ac_config_files="$ac_config_files ../config/Tools.mk 
hotplug/FreeBSD/rc.d/xencommons hotplug/FreeBSD/rc.d/xendriverdomain 
hotplug/Linux/init.d/sysconfig.xencommons 
hotplug/Linux/init.d/sysconfig.xendomains hotplug/Linux/init.d/xen-watchdog 
hotplug/Linux/init.d/xencommons hotplug/Linux/init.d/xendomains 
hotplug/Linux/init.d/xendriverdomain hotplug/Linux/launch-xenstore 
hotplug/Linux/vif-setup hotplug/Linux/xen-hotplug-common.sh 
hotplug/Linux/xendomains hotplug/NetBSD/rc.d/xencommons 
hotplug/NetBSD/rc.d/xendriverdomain libxl/xenlight.pc.in libxl/xlutil.pc.in 
ocaml/xenstored/oxenstored.conf"
 
 ac_config_headers="$ac_config_headers config.h"
 
@@ -10376,6 +10376,7 @@ do
 "hotplug/Linux/init.d/xencommons") CONFIG_FILES="$CONFIG_FILES 
hotplug/Linux/init.d/xencommons" ;;
 "hotplug/Linux/init.d/xendomains") CONFIG_FILES="$CONFIG_FILES 
hotplug/Linux/init.d/xendomains" ;;
 "hotplug/Linux/init.d/xendriverdomain") CONFIG_FILES="$CONFIG_FILES 
hotplug/Linux/init.d/xendriverdomain" ;;
+"hotplug/Linux/launch-xenstore") CONFIG_FILES="$CONFIG_FILES 
hotplug/Linux/launch-xenstore" ;;
 "hotplug/Linux/vif-setup") CONFIG_FILES="$CONFIG_FILES 
hotplug/Linux/vif-setup" ;;
 "hotplug/Linux/xen-hotplug-common.sh") CONFIG_FILES="$CONFIG_FILES 
hotplug/Linux/xen-hotplug-common.sh" ;;
 "hotplug/Linux/xendomains") CONFIG_FILES="$CONFIG_FILES 
hotplug/Linux/xendomains" ;;
diff --git a/tools/configure.ac b/tools/configure.ac
index f991ab3..3a4abb5 100644
--- a/tools/configure.ac
+++ b/tools/configure.ac
@@ -15,6 +15,7 @@ hotplug/Linux/init.d/xen-watchdog
 hotplug/Linux/init.d/xencommons
 hotplug/Linux/init.d/xendomains
 hotplug/Linux/init.d/xendriverdomain
+hotplug/Linux/launch-xenstore
 hotplug/Linux/vif-setup
 hotplug/Linux/xen-hotplug-common.sh
 hotplug/Linux/xendomains
diff --git a/tools/hotplug/Linux/Makefile b/tools/hotplug/Linux/Makefile
index 6d6ccee..29280cb 100644
--- a/tools/hotplug/Linux/Makefile
+++ b/tools/hotplug/Linux/Makefile
@@ -30,6 +30,7 @@ XEN_SCRIPTS += block-drbd-probe
 XEN_SCRIPTS += block-dummy
 XEN_SCRIPTS += $(XEN_SCRIPTS-y)
 XEN_SCRIPTS += colo-proxy-setup
+XEN_SCRIPTS += launch-xenstore
 
 SUBDIRS-$(CONFIG_SYSTEMD) += systemd
 
diff --git a/tools/hotplug/Linux/init.d/xencommons.in 
b/tools/hotplug/Linux/init.d/xencommons.in
index 2d8f30b..a32608c 100644
--- a/tools/hotplug/Linux/init.d/xencommons.in
+++ b/tools/hotplug/Linux/init.d/xencommons.in
@@ -18,7 +18,6 @@
 # Description:   Starts and stops the daemons neeeded for xl/xend
 ### END INIT INFO
 
-XENSTORED=@XENSTORED@
 BACKEND_MODULES="@LINUX_BACKEND_MODULES@"
 
 . @XEN_SCRIPT_DIR@/hotplugpath.sh
@@ -53,8 +52,6 @@ if test -f /proc/xen/capabilities && \
 fi
 
 do_start () {
-local time=0
-   local timeout=30
local mod
 
for mod in $BACKEND_MODULES ; do modprobe "$mod" &>/dev/null ; done
@@ -62,37 +59,10 @@ do_start () {
mkdir -p ${XEN_RUN_DIR}
mkdir -p ${XEN_LOCK_DIR}
 
-   if ! `${bindir}/xenstore-read -s / >/dev/nul

[Xen-devel] [PATCH v3 0/4] tools: make xenstore domain/daemon configurable

2016-07-22 Thread Juergen Gross
Add a configuration option to /etc/sysconfig/xencommons to let the
user configure whether he wants to start xenstore service as a daemon
or as a stubdom.

Changes in V3:
- patch 1: re-add sd_notify() call
- split up former patch 2 into 3 patches as requested by Ian Jackson
- patch 4 (was 2): remove check for running xenstore domain, as this 
  is done in init-xenstore-domain already
- patch 4 (was 2): if booted with systemd send a systemd-notify message 
  in the xenstore domain case
- patch 4 (was 2): if booted with systemd don't wait until xenstore 
  daemon is running, as the daemon will have sent a notify message by 
  its own

Changes in V2:
- move service type modification form patch 2 to patch 1 as implied by
  Ross Lagerwall (at least I guess so)
- add .gitignore entry for launch-xenstore

Juergen Gross (4):
  tools: remove systemd xenstore socket definitions
  tools: split out xenstored starting form xencommons
  tools: use pidfile for test if xenstored is running
  tools: make xenstore domain easy configurable

 .gitignore |   1 +
 tools/configure|   7 +-
 tools/configure.ac |   3 +-
 tools/hotplug/Linux/Makefile   |   1 +
 tools/hotplug/Linux/init.d/sysconfig.xencommons.in |  42 +++-
 tools/hotplug/Linux/init.d/xencommons.in   |  38 +--
 tools/hotplug/Linux/launch-xenstore.in |  87 
 tools/hotplug/Linux/systemd/Makefile   |   5 -
 tools/hotplug/Linux/systemd/xenstored.service.in   |  13 +--
 tools/hotplug/Linux/systemd/xenstored.socket.in|  13 ---
 tools/hotplug/Linux/systemd/xenstored_ro.socket.in |  13 ---
 tools/ocaml/xenstored/systemd.ml   |   1 -
 tools/ocaml/xenstored/systemd.mli  |   5 -
 tools/ocaml/xenstored/systemd_stubs.c  | 113 -
 tools/ocaml/xenstored/utils.ml |  21 ++--
 tools/xenstore/xenstored_core.c|  92 +
 16 files changed, 169 insertions(+), 286 deletions(-)
 create mode 100644 tools/hotplug/Linux/launch-xenstore.in
 delete mode 100644 tools/hotplug/Linux/systemd/xenstored.socket.in
 delete mode 100644 tools/hotplug/Linux/systemd/xenstored_ro.socket.in

-- 
2.6.6


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 3/4] tools: use pidfile for test if xenstored is running

2016-07-22 Thread Juergen Gross
Instead of trying to read xenstore via xenstore-read use the pidfile
of xenstored for the test whether xenstored is running. This prepares
support of xenstore domain, as trying to read xenstore will block
for ever in case xenstore domain is started after trying to read.

Signed-off-by: Juergen Gross 
---
 tools/hotplug/Linux/init.d/xencommons.in |  2 +-
 tools/hotplug/Linux/launch-xenstore.in   | 58 +++-
 2 files changed, 35 insertions(+), 25 deletions(-)

diff --git a/tools/hotplug/Linux/init.d/xencommons.in 
b/tools/hotplug/Linux/init.d/xencommons.in
index a32608c..a6a40d6 100644
--- a/tools/hotplug/Linux/init.d/xencommons.in
+++ b/tools/hotplug/Linux/init.d/xencommons.in
@@ -96,7 +96,7 @@ case "$1" in
do_start
;;
   status)
-${bindir}/xenstore-read -s /
+test -f @XEN_RUN_DIR@/xenstored.pid
;;
   stop)
do_stop
diff --git a/tools/hotplug/Linux/launch-xenstore.in 
b/tools/hotplug/Linux/launch-xenstore.in
index a0cbfd3..2bd9f64 100644
--- a/tools/hotplug/Linux/launch-xenstore.in
+++ b/tools/hotplug/Linux/launch-xenstore.in
@@ -18,38 +18,48 @@
 XENSTORED=@XENSTORED@
 
 . @XEN_SCRIPT_DIR@/hotplugpath.sh
-test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . 
@CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons
 
-time=0
-timeout=30
+test_xenstore () {
+   test -f /var/run/xenstored.pid
+   return $?
+}
 
-if ! `${bindir}/xenstore-read -s / >/dev/null 2>&1`
-then
-   test -z "$XENSTORED_ROOTDIR" && XENSTORED_ROOTDIR="@XEN_LIB_STORED@"
-   rm -f "$XENSTORED_ROOTDIR"/tdb* 2>/dev/null
-   test -z "$XENSTORED_TRACE" || XENSTORED_ARGS=" -T 
@XEN_LOG_DIR@/xenstored-trace.log"
+timeout_xenstore () {
+   local time=0
+   local timeout=30
 
-   if [ -n "$XENSTORED" ] ; then
-   echo -n Starting $XENSTORED...
-   $XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS
-   else
-   echo "No xenstored found"
-   exit 1
-   fi
-
-   # Wait for xenstored to actually come up, timing out after 30 seconds
-   while [ $time -lt $timeout ] && ! `${bindir}/xenstore-read -s / 
>/dev/null 2>&1` ; do
-   echo -n .
-   time=$(($time+1))
-   sleep 1
+   while [ $time -lt $timeout ] && ! test_xenstore ; do
+   echo -n .
+   time=$(($time+1))
+   sleep 1
done
echo
-
+ 
# Exit if we timed out
if ! [ $time -lt $timeout ] ; then
-   echo Could not start xenstored
-   exit 1
+   echo "Could not start $@"
+   return 1
fi
+
+   return 0
+}
+
+test_xenstore && exit 0
+
+test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . 
@CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons
+
+test -z "$XENSTORED_ROOTDIR" && XENSTORED_ROOTDIR="@XEN_LIB_STORED@"
+rm -f "$XENSTORED_ROOTDIR"/tdb* 2>/dev/null
+test -z "$XENSTORED_TRACE" || XENSTORED_ARGS=" -T 
@XEN_LOG_DIR@/xenstored-trace.log"
+
+if [ -n "$XENSTORED" ] ; then
+   echo -n Starting $XENSTORED...
+   $XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS
+else
+   echo "No xenstored found"
+   exit 1
 fi
 
+timeout_xenstore $XENSTORED || exit 1
+
 exit 0
-- 
2.6.6


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 1/4] tools: remove systemd xenstore socket definitions

2016-07-22 Thread Juergen Gross
On a system with systemd the xenstore sockets are created via systemd.
Remove the related configuration files in order to be able to decide
at runtime whether the sockets should be created or not. This will
enable Xen to start xenstore either via a daemon or via a stub domain.

As the xenstore domain start program will exit after it has done its
job prepare the same behaviour to be tolerated by systemd for the
xenstore daemon by specifying the appropriate flags in the service
file.

A rerun of autogen.sh is required.

Signed-off-by: Juergen Gross 
---
V3: re-add sd_notify() call
---
 tools/configure|   4 +-
 tools/configure.ac |   2 -
 tools/hotplug/Linux/systemd/Makefile   |   5 -
 tools/hotplug/Linux/systemd/xenstored.service.in   |   5 +-
 tools/hotplug/Linux/systemd/xenstored.socket.in|  13 ---
 tools/hotplug/Linux/systemd/xenstored_ro.socket.in |  13 ---
 tools/ocaml/xenstored/systemd.ml   |   1 -
 tools/ocaml/xenstored/systemd.mli  |   5 -
 tools/ocaml/xenstored/systemd_stubs.c  | 113 -
 tools/ocaml/xenstored/utils.ml |  21 ++--
 tools/xenstore/xenstored_core.c|  92 +
 11 files changed, 33 insertions(+), 241 deletions(-)
 delete mode 100644 tools/hotplug/Linux/systemd/xenstored.socket.in
 delete mode 100644 tools/hotplug/Linux/systemd/xenstored_ro.socket.in

diff --git a/tools/configure b/tools/configure
index 5b5dcce..1c84c6c 100755
--- a/tools/configure
+++ b/tools/configure
@@ -9670,7 +9670,7 @@ fi
 
 if test "x$systemd" = "xy"; then :
 
-ac_config_files="$ac_config_files hotplug/Linux/systemd/proc-xen.mount 
hotplug/Linux/systemd/var-lib-xenstored.mount 
hotplug/Linux/systemd/xen-init-dom0.service 
hotplug/Linux/systemd/xen-qemu-dom0-disk-backend.service 
hotplug/Linux/systemd/xen-watchdog.service 
hotplug/Linux/systemd/xenconsoled.service 
hotplug/Linux/systemd/xendomains.service 
hotplug/Linux/systemd/xendriverdomain.service 
hotplug/Linux/systemd/xenstored.service hotplug/Linux/systemd/xenstored.socket 
hotplug/Linux/systemd/xenstored_ro.socket"
+ac_config_files="$ac_config_files hotplug/Linux/systemd/proc-xen.mount 
hotplug/Linux/systemd/var-lib-xenstored.mount 
hotplug/Linux/systemd/xen-init-dom0.service 
hotplug/Linux/systemd/xen-qemu-dom0-disk-backend.service 
hotplug/Linux/systemd/xen-watchdog.service 
hotplug/Linux/systemd/xenconsoled.service 
hotplug/Linux/systemd/xendomains.service 
hotplug/Linux/systemd/xendriverdomain.service 
hotplug/Linux/systemd/xenstored.service"
 
 
 fi
@@ -10394,8 +10394,6 @@ do
 "hotplug/Linux/systemd/xendomains.service") CONFIG_FILES="$CONFIG_FILES 
hotplug/Linux/systemd/xendomains.service" ;;
 "hotplug/Linux/systemd/xendriverdomain.service") 
CONFIG_FILES="$CONFIG_FILES hotplug/Linux/systemd/xendriverdomain.service" ;;
 "hotplug/Linux/systemd/xenstored.service") CONFIG_FILES="$CONFIG_FILES 
hotplug/Linux/systemd/xenstored.service" ;;
-"hotplug/Linux/systemd/xenstored.socket") CONFIG_FILES="$CONFIG_FILES 
hotplug/Linux/systemd/xenstored.socket" ;;
-"hotplug/Linux/systemd/xenstored_ro.socket") CONFIG_FILES="$CONFIG_FILES 
hotplug/Linux/systemd/xenstored_ro.socket" ;;
 
   *) as_fn_error $? "invalid argument: \`$ac_config_target'" "$LINENO" 5;;
   esac
diff --git a/tools/configure.ac b/tools/configure.ac
index 87e14a8..f991ab3 100644
--- a/tools/configure.ac
+++ b/tools/configure.ac
@@ -438,8 +438,6 @@ AS_IF([test "x$systemd" = "xy"], [
 hotplug/Linux/systemd/xendomains.service
 hotplug/Linux/systemd/xendriverdomain.service
 hotplug/Linux/systemd/xenstored.service
-hotplug/Linux/systemd/xenstored.socket
-hotplug/Linux/systemd/xenstored_ro.socket
 ])
 ])
 
diff --git a/tools/hotplug/Linux/systemd/Makefile 
b/tools/hotplug/Linux/systemd/Makefile
index 558e459..7d24bbe 100644
--- a/tools/hotplug/Linux/systemd/Makefile
+++ b/tools/hotplug/Linux/systemd/Makefile
@@ -6,9 +6,6 @@ XEN_SYSTEMD_MODULES = xen.conf
 XEN_SYSTEMD_MOUNT =  proc-xen.mount
 XEN_SYSTEMD_MOUNT += var-lib-xenstored.mount
 
-XEN_SYSTEMD_SOCKET  = xenstored.socket
-XEN_SYSTEMD_SOCKET += xenstored_ro.socket
-
 XEN_SYSTEMD_SERVICE  = xenstored.service
 XEN_SYSTEMD_SERVICE += xenconsoled.service
 XEN_SYSTEMD_SERVICE += xen-qemu-dom0-disk-backend.service
@@ -19,7 +16,6 @@ XEN_SYSTEMD_SERVICE += xendriverdomain.service
 
 ALL_XEN_SYSTEMD =  $(XEN_SYSTEMD_MODULES)  \
$(XEN_SYSTEMD_MOUNT)\
-   $(XEN_SYSTEMD_SOCKET)   \
$(XEN_SYSTEMD_SERVICE)
 
 .PHONY: all
@@ -38,7 +34,6 @@ install: $(ALL_XEN_SYSTEMD)
$(INSTALL_DIR) $(DESTDIR)$(XEN_SYSTEMD_DIR)
[ -d $(DESTDIR)$(XEN_SYSTEMD_MODULES_LOAD) ] || \
$(INSTALL_DIR) $(DESTDIR)$(XEN_SYSTEMD_MODULES_LOAD)
-   $(INSTALL_DATA) *.socket $(DESTDIR)$(XEN_SYSTEMD_DIR)
$(INSTALL_DATA) *.service $

[Xen-devel] [PATCH v3 4/4] tools: make xenstore domain easy configurable

2016-07-22 Thread Juergen Gross
Add configuration entries to sysconfig.xencommons for selection of the
xenstore type (domain or daemon) and start the selected xenstore
service via a script called from sysvinit or systemd.

Signed-off-by: Juergen Gross 
---
V3: - remove check for running xenstore domain, as this is done in
  init-xenstore-domain already
- if booted with systemd send a systemd-notify message in the
  xenstore domain case
- if booted with systemd don't wait until xenstore daemon is
  running, as the daemon will have sent a notify message by its
  own
- split up patch as requested by Ian Jackson

V2: - add .gitignore entry for launch-xenstore
---
 tools/hotplug/Linux/init.d/sysconfig.xencommons.in | 42 --
 tools/hotplug/Linux/launch-xenstore.in | 42 --
 tools/hotplug/Linux/systemd/xenstored.service.in   |  8 +
 3 files changed, 72 insertions(+), 20 deletions(-)

diff --git a/tools/hotplug/Linux/init.d/sysconfig.xencommons.in 
b/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
index c27a476..cc8185c 100644
--- a/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
+++ b/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
@@ -6,12 +6,24 @@
 #XENCONSOLED_TRACE=[none|guest|hv|all]
 
 ## Type: string
+## Default: daemon
+#
+# Select type of xentore service.
+#
+# This can be either of:
+#  * daemon
+#  * domain
+#
+# Changing this requires a reboot to take effect.
+#
+#XENSTORETYPE=daemon
+
+## Type: string
 ## Default: xenstored
 #
 # Select xenstore implementation, this can be either
-# of these below. If using systemd it's preferred that you
-# just edit the xenstored.service unit file and change
-# the XENSTORED variable there.
+# of these below.
+# Only evaluated if XENSTORETYPE is "daemon".
 #
 # This can be either of:
 #  * @sbindir@/oxenstored
@@ -26,21 +38,45 @@
 # Additional commandline arguments to start xenstored,
 # like "--trace-file @XEN_LOG_DIR@/xenstored-trace.log"
 # See "@sbindir@/xenstored --help" for possible options.
+# Only evaluated if XENSTORETYPE is "daemon".
 XENSTORED_ARGS=
 
 ## Type: string
 ## Default: Not defined, tracing off
 #
 # Log xenstored messages
+# Only evaluated if XENSTORETYPE is "daemon".
 #XENSTORED_TRACE=[yes|on|1]
 
 ## Type: string
 ## Default: "@XEN_LIB_STORED@"
 #
 # Running xenstored on XENSTORED_ROOTDIR
+# Only evaluated if XENSTORETYPE is "daemon".
 #XENSTORED_ROOTDIR=@XEN_LIB_STORED@
 
 ## Type: string
+## Default: @LIBEXEC@/boot/xenstore-stubdom.gz
+#
+# xenstore domain kernel.
+# Only evaluated if XENSTORETYPE is "domain".
+#XENSTORE_DOMAIN_KERNEL=@LIBEXEC@/boot/xenstore-stubdom.gz
+
+## Type: integer
+## Default: 8
+#
+# xenstore domain memory size in MiB.
+# Only evaluated if XENSTORETYPE is "domain".
+#XENSTORE_DOMAIN_SIZE=8
+
+## Type: string
+## Default: ""
+#
+# Additional arguments for starting the xenstore domain.
+# Only evaluated if XENSTORETYPE is "domain".
+XENSTORE_DOMAIN_ARGS=
+
+## Type: string
 ## Default: Not defined, xenbackendd debug mode off
 #
 # Running xenbackendd in debug mode
diff --git a/tools/hotplug/Linux/launch-xenstore.in 
b/tools/hotplug/Linux/launch-xenstore.in
index 2bd9f64..fdfa33a 100644
--- a/tools/hotplug/Linux/launch-xenstore.in
+++ b/tools/hotplug/Linux/launch-xenstore.in
@@ -48,18 +48,40 @@ test_xenstore && exit 0
 
 test -f @CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons && . 
@CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons
 
-test -z "$XENSTORED_ROOTDIR" && XENSTORED_ROOTDIR="@XEN_LIB_STORED@"
-rm -f "$XENSTORED_ROOTDIR"/tdb* 2>/dev/null
-test -z "$XENSTORED_TRACE" || XENSTORED_ARGS=" -T 
@XEN_LOG_DIR@/xenstored-trace.log"
+[ "$XENSTORETYPE" = "" ] && XENSTORETYPE=daemon
+
+/bin/mkdir -p @XEN_RUN_DIR@
+
+[ "$XENSTORETYPE" = "daemon" ] && {
+   [ -z "$XENSTORED_ROOTDIR" ] && XENSTORED_ROOTDIR="@XEN_LIB_STORED@"
+   /bin/rm -f $XENSTORED_ROOTDIR/tdb* 2>/dev/null
+   [ -z "$XENSTORED_TRACE" ] || XENSTORED_ARGS="$XENSTORED_ARGS -T 
@XEN_LOG_DIR@/xenstored-trace.log"
+   [ -z "$XENSTORED" ] && XENSTORED=@XENSTORED@
+   [ -x "$XENSTORED" ] || {
+   echo "No xenstored found"
+   exit 1
+   }
 
-if [ -n "$XENSTORED" ] ; then
echo -n Starting $XENSTORED...
$XENSTORED --pid-file @XEN_RUN_DIR@/xenstored.pid $XENSTORED_ARGS
-else
-   echo "No xenstored found"
-   exit 1
-fi
 
-timeout_xenstore $XENSTORED || exit 1
+   systemd-notify --booted 2>/dev/null || timeout_xenstore $XENSTORED || 
exit 1
 
-exit 0
+   exit 0
+}
+
+[ "$XENSTORETYPE" = "domain" ] && {
+   [ -z "$XENSTORE_DOMAIN_KERNEL" ] && 
XENSTORE_DOMAIN_KERNEL=@LIBEXEC@/boot/xenstore-stubdom.gz
+   XENSTORE_DOMAIN_ARGS="$XENSTORE_DOMAIN_ARGS --kernel 
$XENSTORE_DOMAIN_KERNEL"
+   [ -z "$XENSTORE_DOMAIN_SIZE" ] && XENSTORE_DOMAIN_SIZE=8
+   XENSTORE_DOMAIN_ARGS="$XENSTORE_DOMAIN_ARGS --memory 
$XENSTORE_DOMAIN_SIZE"
+
+   echo -n Starting $XENSTORE_DOMAIN_KERNEL...
+   ${LIBEXEC_BIN}/init-xenstore-domain $XENSTORE_D

Re: [Xen-devel] [PATCH v5 1/7] xen/arm: Introduce alternative runtime patching

2016-07-22 Thread Julien Grall



On 22/07/16 15:15, Konrad Rzeszutek Wilk wrote:

diff --git a/xen/arch/arm/alternative.c b/xen/arch/arm/alternative.c
new file mode 100644
index 000..d00f98e
--- /dev/null
+++ b/xen/arch/arm/alternative.c


Hey!


Hi Konrad,


I've some comments, most of them are light. What I am concerned most
is the difference between comments in the header vs the code.
Looking at Linux code it seems to have the same issue - and I know you don't
want to differ to much from Linux.


Well, I had to differ from Linux because the code was ARM64 specific 
(i.e renaming function and moving code around to potentially allow ARM32 
support).


So I don't mind to fix them here and...




Perhaps go with the changes as is (With the comments being off) and the
then have some patches for Linux for the header mismatches and those
can be backported?


we can think about Linux changes and resync the code later.



..snip..

+static int __apply_alternatives(const struct alt_region *region)
+{
+const struct alt_instr *alt;
+const u32 *origptr, *replptr;
+u32 *writeptr, *writemap;
+mfn_t text_mfn = _mfn(virt_to_mfn(_stext));
+unsigned int text_order = get_order_from_bytes(_end - _start);
+
+printk("Patching kernel code\n");


I see you use prefixed later with 'alternatives:' and also this is without
any prefix (DEBUG, or ERR). Would it make sense to have this one above
be dprintk while the one below with printk (KERN_ERR) ?


I think this one is useful in non-debug build to know that we are 
patching the code. So if it crash, we know where we are.



+
+/*
+ * The text section is read-only. So re-map Xen to be able to patch
+ * the code.
+ */
+writemap = __vmap(&text_mfn, 1 << text_order, 1, 1, PAGE_HYPERVISOR,
+  VMAP_DEFAULT);
+if ( !writemap )
+{
+printk("alternatives: Unable to map the text section\n");


Perhaps include the size of the text section? That may help in figuring
out what went wrong or if one could increase the VMAP area?


Good point. I will do it in the next version.




+return -ENOMEM;
+}
+
+for ( alt = region->begin; alt < region->end; alt++ )
+{
+u32 insn;
+int i, nr_inst;
+
+if ( !cpus_have_cap(alt->cpufeature) )
+continue;
+
+BUG_ON(alt->alt_len != alt->orig_len);


The header file says:
+   u8  alt_len;/* size of new instruction(s), <= orig_len */

So this BUG_ON seems incorrect?
But hten later in the header it says:
"1. Be exactly the same length (in bytes) as the default code
+ *sequence."

So the BUG_ON is correct. Perhaps the 'alt_len' comment should be s/<=/==/ ?


Good point. I suspect the plan was to support sequence with different 
length but it has been abandoned later on (CC Andre to confirm).





+
+origptr = ALT_ORIG_PTR(alt);
+writeptr = origptr - (u32 *)_start + writemap;


How about just using writeptr += ?


I am not sure about your suggestion here. Regardless the Linux code, the 
origptr will not follow a pattern at each iteration. So we have to 
recompute it everytime.




Ah wait. You are trying to preserve the Linux code!. Nevermind then.


+replptr = ALT_REPL_PTR(alt);
+
+nr_inst = alt->alt_len / sizeof(insn);
+
+for ( i = 0; i < nr_inst; i++ )
+{
+insn = get_alt_insn(alt, origptr + i, replptr + i);
+*(writeptr + i) = cpu_to_le32(insn);
+}
+
+/* Ensure the new instructions reached the memory and nuke */
+clean_and_invalidate_dcache_va_range(writeptr,
+ (sizeof (*writeptr) * nr_inst));
+}
+
+/* Nuke the instruction cache */
+invalidate_icache();
+
+vunmap(writemap);
+
+return 0;
+}
+
+/*
+ * We might be patching the stop_machine state machine, so implement a
+ * really simple polling protocol here.
+ */
+static int __apply_alternatives_multi_stop(void *unused)
+{
+static int patched = 0;


Shouldn't this be 'atomic_t' ?


Does it matter? From my understanding the code will behave the same.


+struct alt_region region = {


Can this 'const' ?


It could, I just followed Linux but as we differ I will move to const.


+.begin = __alt_instructions,
+.end = __alt_instructions_end,
+};
+
+/* We always have a CPU 0 at this point (__init) */
+if ( smp_processor_id() )
+{
+while ( !read_atomic(&patched) )
+cpu_relax();
+isb();
+}
+else
+{
+int ret;
+
+BUG_ON(patched);
+ret = __apply_alternatives(®ion);
+/* The patching is not expected to fail during boot. */
+BUG_ON(ret != 0);
+
+/* Barriers provided by the cache flushing */
+write_atomic(&patched, 1);
+}
+
+return 0;
+}
+
+/*
+ * This function should only be called during boot and before CPU0 jump
+ * into the idle_loop.


So could you add an:

+ */
+void __init apply_alternatives_all(

Re: [Xen-devel] [PATCH v5 2/7] xen/arm: cpufeature: Provide an helper to check if a capability is supported

2016-07-22 Thread Julien Grall

Hi Konrad,

On 22/07/16 15:18, Konrad Rzeszutek Wilk wrote:

On Wed, Jul 20, 2016 at 04:25:55PM +0100, Julien Grall wrote:

diff --git a/xen/arch/arm/cpufeature.c b/xen/arch/arm/cpufeature.c
index 7a1b56b..088625b 100644
--- a/xen/arch/arm/cpufeature.c
+++ b/xen/arch/arm/cpufeature.c
@@ -24,6 +24,22 @@

 DECLARE_BITMAP(cpu_hwcaps, ARM_NCAPS);

+void update_cpu_capabilities(const struct arm_cpu_capabilities *caps,
+ const char *info)
+{
+int i;
+
+for ( i = 0; caps[i].matches; i++ )
+{
+if ( !caps[i].matches(&caps[i]) )


So what if the 'struct arm_cpu_capabilitues' has '->matches' set to
NULL?


It is the exit condition of the loop:

 for ( i = 0; caps[i].matches; i++ )

Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 2/7] xen/arm: cpufeature: Provide an helper to check if a capability is supported

2016-07-22 Thread Konrad Rzeszutek Wilk
On Fri, Jul 22, 2016 at 04:31:13PM +0100, Julien Grall wrote:
> Hi Konrad,
> 
> On 22/07/16 15:18, Konrad Rzeszutek Wilk wrote:
> >On Wed, Jul 20, 2016 at 04:25:55PM +0100, Julien Grall wrote:
> >>diff --git a/xen/arch/arm/cpufeature.c b/xen/arch/arm/cpufeature.c
> >>index 7a1b56b..088625b 100644
> >>--- a/xen/arch/arm/cpufeature.c
> >>+++ b/xen/arch/arm/cpufeature.c
> >>@@ -24,6 +24,22 @@
> >>
> >> DECLARE_BITMAP(cpu_hwcaps, ARM_NCAPS);
> >>
> >>+void update_cpu_capabilities(const struct arm_cpu_capabilities *caps,
> >>+ const char *info)
> >>+{
> >>+int i;
> >>+
> >>+for ( i = 0; caps[i].matches; i++ )
> >>+{
> >>+if ( !caps[i].matches(&caps[i]) )
> >
> >So what if the 'struct arm_cpu_capabilitues' has '->matches' set to
> >NULL?
> 
> It is the exit condition of the loop:
> 
>  for ( i = 0; caps[i].matches; i++ )

Duh!

Ignore my query then please.
> 
> Regards,
> 
> -- 
> Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 1/7] xen/arm: Introduce alternative runtime patching

2016-07-22 Thread Konrad Rzeszutek Wilk
> >>+
> >>+origptr = ALT_ORIG_PTR(alt);
> >>+writeptr = origptr - (u32 *)_start + writemap;
> >
> >How about just using writeptr += ?
> 
> I am not sure about your suggestion here. Regardless the Linux code, the
> origptr will not follow a pattern at each iteration. So we have to recompute
> it everytime.

Just ignore that. I somehow equated writeptr to writemp.

..snip..
> 
> >
> >Ah wait. You are trying to preserve the Linux code!. Nevermind then.
> >
> >>+replptr = ALT_REPL_PTR(alt);
> >>+
> >>+nr_inst = alt->alt_len / sizeof(insn);
> >>+
> >>+for ( i = 0; i < nr_inst; i++ )
> >>+{
> >>+insn = get_alt_insn(alt, origptr + i, replptr + i);
> >>+*(writeptr + i) = cpu_to_le32(insn);
> >>+}
> >>+
> >>+/* Ensure the new instructions reached the memory and nuke */
> >>+clean_and_invalidate_dcache_va_range(writeptr,
> >>+ (sizeof (*writeptr) * 
> >>nr_inst));
> >>+}
> >>+
> >>+/* Nuke the instruction cache */
> >>+invalidate_icache();
> >>+
> >>+vunmap(writemap);
> >>+
> >>+return 0;
> >>+}
> >>+
> >>+/*
> >>+ * We might be patching the stop_machine state machine, so implement a
> >>+ * really simple polling protocol here.
> >>+ */
> >>+static int __apply_alternatives_multi_stop(void *unused)
> >>+{
> >>+static int patched = 0;
> >
> >Shouldn't this be 'atomic_t' ?
> 
> Does it matter? From my understanding the code will behave the same.

Not at all under the hood.
 But I see 'atomic_write' and they all operate on the 'atomic_t' .. hence
the query. 

..snip.
> >This being a new file perhaps add:
> >*
> > * Local variables:
> > * mode: C
> > * c-file-style: "BSD"
> > * c-basic-offset: 4
> > * indent-tabs-mode: nil
> > * End:
> > */
> >?
> 
> It is a Linux file with Linux coding style. I would need to look what should
> be the emacs magic block here.

Right. I just meant that you needed the magic block.
> 
> Regards,
> 
> -- 
> Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 1/7] xen/arm: Introduce alternative runtime patching

2016-07-22 Thread Julien Grall



On 22/07/16 16:38, Konrad Rzeszutek Wilk wrote:

+
+origptr = ALT_ORIG_PTR(alt);
+writeptr = origptr - (u32 *)_start + writemap;


How about just using writeptr += ?


I am not sure about your suggestion here. Regardless the Linux code, the
origptr will not follow a pattern at each iteration. So we have to recompute
it everytime.


Just ignore that. I somehow equated writeptr to writemp.

..snip..




Ah wait. You are trying to preserve the Linux code!. Nevermind then.


+replptr = ALT_REPL_PTR(alt);
+
+nr_inst = alt->alt_len / sizeof(insn);
+
+for ( i = 0; i < nr_inst; i++ )
+{
+insn = get_alt_insn(alt, origptr + i, replptr + i);
+*(writeptr + i) = cpu_to_le32(insn);
+}
+
+/* Ensure the new instructions reached the memory and nuke */
+clean_and_invalidate_dcache_va_range(writeptr,
+ (sizeof (*writeptr) * nr_inst));
+}
+
+/* Nuke the instruction cache */
+invalidate_icache();
+
+vunmap(writemap);
+
+return 0;
+}
+
+/*
+ * We might be patching the stop_machine state machine, so implement a
+ * really simple polling protocol here.
+ */
+static int __apply_alternatives_multi_stop(void *unused)
+{
+static int patched = 0;


Shouldn't this be 'atomic_t' ?


Does it matter? From my understanding the code will behave the same.


Not at all under the hood.
 But I see 'atomic_write' and they all operate on the 'atomic_t' .. hence
the query.


The code is using *_atomic (and not atomic_*) which operate on any type. 
I know the naming is really confusing.


I would like to rename write_atomic and read_atomic to WRITE_ONCE and 
READ_ONCE.




..snip.

This being a new file perhaps add:
*
* Local variables:
* mode: C
* c-file-style: "BSD"
* c-basic-offset: 4
* indent-tabs-mode: nil
* End:
*/
?


It is a Linux file with Linux coding style. I would need to look what should
be the emacs magic block here.


Right. I just meant that you needed the magic block.


I don't think we ever put magic block on Linux file. I will try to find 
out and update the patch.


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Locking on vm-event operations (monitor)

2016-07-22 Thread Corneliu ZUZU

On 7/22/2016 4:00 PM, Razvan Cojocaru wrote:

On 07/22/2016 03:32 PM, Corneliu ZUZU wrote:

Look @ hvm_do_resume():

1. If you, as a toolstack user, get at sane step no. 6: "Uninitialize
everything (no events are possible here because of steps 4-5)."
2. But just before you do that,  a hvm_do_resume() happens on an
arbitrary vCPU of the domain and gets to this point:

  if ( unlikely(v->arch.vm_event) )
  {
  // ---> let's say you are now at this point, just after the
above non-NULL check
  struct monitor_write_data *w = &v->arch.vm_event->write_data;

3. and before proceeding further, the uninitialization sequence
_completes_, i.e. this was done in vm_event_cleanup_domain():

  for_each_vcpu ( d, v )
  {
  xfree(v->arch.vm_event);
  v->arch.vm_event = NULL;
  }

4. and then in hvm_do_resume() this gets to be done:

  struct monitor_write_data *w = &v->arch.vm_event->write_data;

  [...]

  if ( w->do_write.msr )
  {
  hvm_msr_write_intercept(w->msr, w->value, 0);
  w->do_write.msr = 0;
  }

then you'll have a beautiful NULL-pointer access hypervisor crash.

As I've said before, of course I agree with that

Sorry, I thought by your previous statement "I assume that the reason
why this has not been addressed in the past is that introspection
applications are expected to be well-behaved" you were inferring that
with a 'sane' toolstack user, as you defined it, it won't be possible to
have a hypervisor crash.

I did (please see below).


Then, again, you'd be wrong, the hvm_do_resume() was just one example, 
in a similar fashion this can happen in a number of other places. 
Details will be given in the patch-series.





   - isn't that what a
previous patch you've submitted addressed?

I'm not sure what patch you're referring to, I don't remember ever
addressing these issues before..

I am talking about your patch "x86/monitor: fix: don't compromise a
monitor_write_data with pending CR writes", which, if I'm not mistaken,
addresses the above problem (since we won't xfree(v->arch.vm_event) on
monitor uninit anymore).


Heh, I wouldn't call fixing problem A and by -coincidence- with that fix 
also having -totally unrelated- problem B go away "addressing problem 
B". I didn't even mention these locking issues in that patch, so how 
could I have addressed them there?



With the above problem fixed, the workflow I suggested should be fine.


The fact of the matter remains that the current state of staging (which 
is what this thread applies to) still has the hvm_do_resume() issue as 
well as the others.



Again, I would prefer things to be rock solid, so personally I encourage
your patch and hope we get it in.

Thanks,
Razvan


Corneliu.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] Infrastructure Maintenance Window: July 25, 2016 from 8:00 - 10:00 UTC

2016-07-22 Thread Lars Kurth
Hi everyone,

just a quick note that we will perform the following maintenance activities on 
Monady July 25, from 8:00 - 10:00 UTC.

The following work will be performed:
- OS Upgrade of bugs.xenproject.org to Debian Jessie (this will take 1-2 hours)

Best Regards
Lars
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 1/4] tools: remove systemd xenstore socket definitions

2016-07-22 Thread Wei Liu
Only skim-read this patch, will do proper review later.

On Fri, Jul 22, 2016 at 05:09:28PM +0200, Juergen Gross wrote:
[...]
>  CAMLprim value ocaml_launched_by_systemd(value ignore)
>  {
> - CAMLparam1(ignore);
> - CAMLlocal1(ret);
> +   CAMLparam1(ignore);
> +   CAMLlocal1(ret);
>  
> - ret = Val_false;
> +   ret = Val_false;
>  
> - if (sd_listen_fds(0) > 0)
> - ret = Val_true;
> +   if (sd_booted() > 0)
> +   ret = Val_true;

I think this may be problematic.

sd_booted returns true if system is booted with systemd, but it has no
bearing whether this particular process is launched by systemd.

IIRC using sd_booted would cause oxenstored thinks it is launched by
systemd even if the user launches it by hand in a shell. That caused
it's initialisation to fail.  81d758afca7c3c1e3ccbd78154b33d64fd7757fb
was written to address that issue.

So, what would happen if you start oxenstored by hand with your patch
apply? Maybe we can just remove this launched_by_systemd check all
together -- i.e. we always call sd_notify?

>  
> - CAMLreturn(ret);
> +   CAMLreturn(ret);
>  }
>  
>  CAMLprim value ocaml_sd_notify_ready(value ignore)
>  {
> - CAMLparam1(ignore);
> - CAMLlocal1(ret);
> +   CAMLparam1(ignore);
> +   CAMLlocal1(ret);
>  
> - ret = Val_int(0);
> +   ret = Val_int(0);
>  
> - sd_notify(1, "READY=1");
> +   sd_notify(1, "READY=1");
>  
> - CAMLreturn(ret);
> +   CAMLreturn(ret);

It seems that you have introduced quite a few white space changes.
If you really want to change tabs to spaces, please do that in a
separate patch.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH] tools/libxc: Properly increment ApicIdCoreSize field on AMD

2016-07-22 Thread Boris Ostrovsky
Current code incorrectly adds 1 to full register instead of
incrementing the field in bits 15:12.

Signed-off-by: Boris Ostrovsky 
---
 tools/libxc/xc_cpuid_x86.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index 84f4e08..fbbac9e 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -331,7 +331,8 @@ static void amd_xc_cpuid_policy(xc_interface *xch,
  * ECX[15:12] is ApicIdCoreSize: ECX[7:0] is NumberOfCores (minus one).
  * Update to reflect vLAPIC_ID = vCPU_ID * 2.
  */
-regs[2] = ((regs[2] & 0xf000u) + 1) | ((regs[2] & 0xffu) << 1) | 1u;
+regs[2] = ((regs[2] + (1u << 12)) & 0xf000u) |
+  ((regs[2] & 0xffu) << 1) | 1u;
 break;
 
 case 0x800a: {
-- 
1.8.3.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] tools/libxc: Properly increment ApicIdCoreSize field on AMD

2016-07-22 Thread Wei Liu
On Fri, Jul 22, 2016 at 01:14:01PM -0400, Boris Ostrovsky wrote:
> Current code incorrectly adds 1 to full register instead of
> incrementing the field in bits 15:12.
> 
> Signed-off-by: Boris Ostrovsky 

Acked-by: Wei Liu 

I trust your expertise in this field. :-)

> ---
>  tools/libxc/xc_cpuid_x86.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
> index 84f4e08..fbbac9e 100644
> --- a/tools/libxc/xc_cpuid_x86.c
> +++ b/tools/libxc/xc_cpuid_x86.c
> @@ -331,7 +331,8 @@ static void amd_xc_cpuid_policy(xc_interface *xch,
>   * ECX[15:12] is ApicIdCoreSize: ECX[7:0] is NumberOfCores (minus 
> one).
>   * Update to reflect vLAPIC_ID = vCPU_ID * 2.
>   */
> -regs[2] = ((regs[2] & 0xf000u) + 1) | ((regs[2] & 0xffu) << 1) | 1u;
> +regs[2] = ((regs[2] + (1u << 12)) & 0xf000u) |
> +  ((regs[2] & 0xffu) << 1) | 1u;
>  break;
>  
>  case 0x800a: {
> -- 
> 1.8.3.1
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] tools/libxc: Properly increment ApicIdCoreSize field on AMD

2016-07-22 Thread Boris Ostrovsky
On 07/22/2016 01:38 PM, Wei Liu wrote:
> On Fri, Jul 22, 2016 at 01:14:01PM -0400, Boris Ostrovsky wrote:
>> Current code incorrectly adds 1 to full register instead of
>> incrementing the field in bits 15:12.
>>
>> Signed-off-by: Boris Ostrovsky 
> Acked-by: Wei Liu 
>
> I trust your expertise in this field. :-)


Just in this field? The field is only 4 bits! ;-)

But this actually fixes a regression that was triggered by recent
hvmloader change on one particular processor that we have in the test farm.

-boris

>
>> ---
>>  tools/libxc/xc_cpuid_x86.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
>> index 84f4e08..fbbac9e 100644
>> --- a/tools/libxc/xc_cpuid_x86.c
>> +++ b/tools/libxc/xc_cpuid_x86.c
>> @@ -331,7 +331,8 @@ static void amd_xc_cpuid_policy(xc_interface *xch,
>>   * ECX[15:12] is ApicIdCoreSize: ECX[7:0] is NumberOfCores (minus 
>> one).
>>   * Update to reflect vLAPIC_ID = vCPU_ID * 2.
>>   */
>> -regs[2] = ((regs[2] & 0xf000u) + 1) | ((regs[2] & 0xffu) << 1) | 1u;
>> +regs[2] = ((regs[2] + (1u << 12)) & 0xf000u) |
>> +  ((regs[2] & 0xffu) << 1) | 1u;
>>  break;
>>  
>>  case 0x800a: {
>> -- 
>> 1.8.3.1
>>


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] tools/libxc: Properly increment ApicIdCoreSize field on AMD

2016-07-22 Thread Wei Liu
On Fri, Jul 22, 2016 at 01:45:05PM -0400, Boris Ostrovsky wrote:
> On 07/22/2016 01:38 PM, Wei Liu wrote:
> > On Fri, Jul 22, 2016 at 01:14:01PM -0400, Boris Ostrovsky wrote:
> >> Current code incorrectly adds 1 to full register instead of
> >> incrementing the field in bits 15:12.
> >>
> >> Signed-off-by: Boris Ostrovsky 
> > Acked-by: Wei Liu 
> >
> > I trust your expertise in this field. :-)
> 
> 
> Just in this field? The field is only 4 bits! ;-)
> 

No, it's only one bit! :-P

> But this actually fixes a regression that was triggered by recent
> hvmloader change on one particular processor that we have in the test farm.

Ian, this is a backport candidate and needs to go back as far as
possible. The bogus calculation was introduced in 2008 (!). 

Wei.

> 
> -boris
> 
> >
> >> ---
> >>  tools/libxc/xc_cpuid_x86.c | 3 ++-
> >>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
> >> index 84f4e08..fbbac9e 100644
> >> --- a/tools/libxc/xc_cpuid_x86.c
> >> +++ b/tools/libxc/xc_cpuid_x86.c
> >> @@ -331,7 +331,8 @@ static void amd_xc_cpuid_policy(xc_interface *xch,
> >>   * ECX[15:12] is ApicIdCoreSize: ECX[7:0] is NumberOfCores (minus 
> >> one).
> >>   * Update to reflect vLAPIC_ID = vCPU_ID * 2.
> >>   */
> >> -regs[2] = ((regs[2] & 0xf000u) + 1) | ((regs[2] & 0xffu) << 1) | 
> >> 1u;
> >> +regs[2] = ((regs[2] + (1u << 12)) & 0xf000u) |
> >> +  ((regs[2] & 0xffu) << 1) | 1u;
> >>  break;
> >>  
> >>  case 0x800a: {
> >> -- 
> >> 1.8.3.1
> >>
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 1/4] tools: remove systemd xenstore socket definitions

2016-07-22 Thread Juergen Gross
On 22/07/16 18:31, Wei Liu wrote:
> Only skim-read this patch, will do proper review later.
> 
> On Fri, Jul 22, 2016 at 05:09:28PM +0200, Juergen Gross wrote:
> [...]
>>  CAMLprim value ocaml_launched_by_systemd(value ignore)
>>  {
>> -CAMLparam1(ignore);
>> -CAMLlocal1(ret);
>> +   CAMLparam1(ignore);
>> +   CAMLlocal1(ret);
>>  
>> -ret = Val_false;
>> +   ret = Val_false;
>>  
>> -if (sd_listen_fds(0) > 0)
>> -ret = Val_true;
>> +   if (sd_booted() > 0)
>> +   ret = Val_true;
> 
> I think this may be problematic.
> 
> sd_booted returns true if system is booted with systemd, but it has no
> bearing whether this particular process is launched by systemd.
> 
> IIRC using sd_booted would cause oxenstored thinks it is launched by
> systemd even if the user launches it by hand in a shell. That caused
> it's initialisation to fail.  81d758afca7c3c1e3ccbd78154b33d64fd7757fb
> was written to address that issue.
> 
> So, what would happen if you start oxenstored by hand with your patch
> apply? Maybe we can just remove this launched_by_systemd check all
> together -- i.e. we always call sd_notify?

So you are concerned sd_notify() will be called too often, but you are
suggesting to call it always? I don't understand your concerns then.

> 
>>  
>> -CAMLreturn(ret);
>> +   CAMLreturn(ret);
>>  }
>>  
>>  CAMLprim value ocaml_sd_notify_ready(value ignore)
>>  {
>> -CAMLparam1(ignore);
>> -CAMLlocal1(ret);
>> +   CAMLparam1(ignore);
>> +   CAMLlocal1(ret);
>>  
>> -ret = Val_int(0);
>> +   ret = Val_int(0);
>>  
>> -sd_notify(1, "READY=1");
>> +   sd_notify(1, "READY=1");
>>  
>> -CAMLreturn(ret);
>> +   CAMLreturn(ret);
> 
> It seems that you have introduced quite a few white space changes.
> If you really want to change tabs to spaces, please do that in a
> separate patch.

Oops, didn't mean to do this. Sorry.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 1/4] tools: remove systemd xenstore socket definitions

2016-07-22 Thread Wei Liu
On Fri, Jul 22, 2016 at 08:49:17PM +0200, Juergen Gross wrote:
> On 22/07/16 18:31, Wei Liu wrote:
> > Only skim-read this patch, will do proper review later.
> > 
> > On Fri, Jul 22, 2016 at 05:09:28PM +0200, Juergen Gross wrote:
> > [...]
> >>  CAMLprim value ocaml_launched_by_systemd(value ignore)
> >>  {
> >> -  CAMLparam1(ignore);
> >> -  CAMLlocal1(ret);
> >> +   CAMLparam1(ignore);
> >> +   CAMLlocal1(ret);
> >>  
> >> -  ret = Val_false;
> >> +   ret = Val_false;
> >>  
> >> -  if (sd_listen_fds(0) > 0)
> >> -  ret = Val_true;
> >> +   if (sd_booted() > 0)
> >> +   ret = Val_true;
> > 
> > I think this may be problematic.
> > 
> > sd_booted returns true if system is booted with systemd, but it has no
> > bearing whether this particular process is launched by systemd.
> > 
> > IIRC using sd_booted would cause oxenstored thinks it is launched by
> > systemd even if the user launches it by hand in a shell. That caused
> > it's initialisation to fail.  81d758afca7c3c1e3ccbd78154b33d64fd7757fb
> > was written to address that issue.
> > 
> > So, what would happen if you start oxenstored by hand with your patch
> > apply? Maybe we can just remove this launched_by_systemd check all
> > together -- i.e. we always call sd_notify?
> 
> So you are concerned sd_notify() will be called too often, but you are
> suggesting to call it always? I don't understand your concerns then.
> 

No, my concern is that you won't be able to start oxenstored from
command line manually if you boot with systemd. At least that was the
bug that caused me to write that patch.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [RFC v3 09/13] firmware: port built-in section to linker table

2016-07-22 Thread Luis R. Rodriguez
This ports built-in firmware to use linker tables,
this replaces the custom section solution with a
generic solution.

This also demos the use of the .rodata (SECTION_RO)
linker tables.

Tested with 0 built-in firmware, 1 and 2 built-in
firmwares successfully.

v3:
o explicitly include tables.h as we no longer include
  tables.h from sections.h

o use new section_tbl_asmtype() helper on firmware/Makefile
  to enable having to unfold things on our own.

v2: introduced this file in this version of the series

Signed-off-by: Luis R. Rodriguez 
---
 arch/x86/kernel/cpu/microcode/core.c |  8 
 drivers/base/firmware_class.c| 12 ++--
 firmware/Makefile|  4 +++-
 include/asm-generic/vmlinux.lds.h|  7 ---
 4 files changed, 13 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/core.c 
b/arch/x86/kernel/cpu/microcode/core.c
index df04b2d033f6..3e7c08d99601 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -91,15 +92,14 @@ static bool __init check_loader_disabled_bsp(void)
return *res;
 }
 
-extern struct builtin_fw __start_builtin_fw[];
-extern struct builtin_fw __end_builtin_fw[];
+DECLARE_LINKTABLE_RO(struct builtin_fw, builtin_fw);
 
 bool get_builtin_firmware(struct cpio_data *cd, const char *name)
 {
 #ifdef CONFIG_FW_LOADER
-   struct builtin_fw *b_fw;
+   const struct builtin_fw *b_fw;
 
-   for (b_fw = __start_builtin_fw; b_fw != __end_builtin_fw; b_fw++) {
+   LINKTABLE_FOR_EACH(b_fw, builtin_fw) {
if (!strcmp(name, b_fw->name)) {
cd->size = b_fw->size;
cd->data = b_fw->data;
diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index 22d1760a4278..8fbf03c3e4c2 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -43,15 +44,14 @@ MODULE_LICENSE("GPL");
 
 #ifdef CONFIG_FW_LOADER
 
-extern struct builtin_fw __start_builtin_fw[];
-extern struct builtin_fw __end_builtin_fw[];
+DEFINE_LINKTABLE_RO(struct builtin_fw, builtin_fw);
 
 static bool fw_get_builtin_firmware(struct firmware *fw, const char *name,
void *buf, size_t size)
 {
-   struct builtin_fw *b_fw;
+   const struct builtin_fw *b_fw;
 
-   for (b_fw = __start_builtin_fw; b_fw != __end_builtin_fw; b_fw++) {
+   LINKTABLE_FOR_EACH(b_fw, builtin_fw) {
if (strcmp(name, b_fw->name) == 0) {
fw->size = b_fw->size;
fw->data = b_fw->data;
@@ -67,9 +67,9 @@ static bool fw_get_builtin_firmware(struct firmware *fw, 
const char *name,
 
 static bool fw_is_builtin_firmware(const struct firmware *fw)
 {
-   struct builtin_fw *b_fw;
+   const struct builtin_fw *b_fw;
 
-   for (b_fw = __start_builtin_fw; b_fw != __end_builtin_fw; b_fw++)
+   LINKTABLE_FOR_EACH(b_fw, builtin_fw)
if (fw->data == b_fw->data)
return true;
 
diff --git a/firmware/Makefile b/firmware/Makefile
index fa3e81c2a97b..32304a227f09 100644
--- a/firmware/Makefile
+++ b/firmware/Makefile
@@ -155,6 +155,8 @@ quiet_cmd_fwbin = MK_FW   $@
  ASM_ALIGN=$(if $(CONFIG_64BIT),3,2);   \
  PROGBITS=$(if $(CONFIG_ARM),%,@)progbits;  \
  echo "/* Generated by firmware/Makefile */"   > $@;\
+ echo "\#include " >>$@;\
+ echo "\#include "   >>$@;\
  echo ".section .rodata"   >>$@;\
  echo ".p2align $${ASM_ALIGN}" >>$@;\
  echo "_fw_$${FWSTR}_bin:" >>$@;\
@@ -164,7 +166,7 @@ quiet_cmd_fwbin = MK_FW   $@
  echo ".p2align $${ASM_ALIGN}" >>$@;\
  echo "_fw_$${FWSTR}_name:">>$@;\
  echo ".string \"$$FWNAME\""   >>$@;\
- echo ".section .builtin_fw,\"a\",$${PROGBITS}">>$@;\
+ echo "section_tbl_asmtype(SECTION_RODATA, builtin_fw, 
SECTION_ORDER_ANY, a,$${PROGBITS})" >>$@;\
  echo ".p2align $${ASM_ALIGN}" >>$@;\
  echo "$${ASM_WORD} _fw_$${FWSTR}_name">>$@;\
  echo "$${ASM_WORD} _fw_$${FWSTR}_bin" >>$@;\
diff --git a/include/asm-generic/vmlinux.lds.h 
b/include/asm-generic/vmlinux.lds.h
index b3a9cd7bc947..8e31a4454841 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -313,13 +313,6 @@
VMLINUX_SYMBOL(__end_pci_fixups_suspend_late)

[Xen-devel] [RFC v3 01/13] x86: remove LTO_REFERENCE_INITCALL()

2016-07-22 Thread Luis R. Rodriguez
The setup for LTO never made it upstream, and although this has
some users, this is now really old stuff for a gcc 4.7 LTO problem.
We know that at least LTO_REFERENCE_INITCALL() work around can
be removed if LTO is not supported on v4.7 anymore.

As per Andi the DISABLE_LTO and LTO_CFLAGS are still neeeded though.

v3: added to this series, removing LTO_REFERENCE_INITCALL()
justifies that other future similar macros do not need
a respective LTO_REFERENCE_INITCALL() on them therefore
simplifying new code.

Acked-by: Andi Kleen 
Signed-off-by: Luis R. Rodriguez 
---
 include/linux/init.h | 20 +---
 1 file changed, 1 insertion(+), 19 deletions(-)

diff --git a/include/linux/init.h b/include/linux/init.h
index 1e5c131d5c9a..aa662ad80d9c 100644
--- a/include/linux/init.h
+++ b/include/linux/init.h
@@ -151,23 +151,6 @@ extern bool initcall_debug;
 
 #ifndef __ASSEMBLY__
 
-#ifdef CONFIG_LTO
-/* Work around a LTO gcc problem: when there is no reference to a variable
- * in a module it will be moved to the end of the program. This causes
- * reordering of initcalls which the kernel does not like.
- * Add a dummy reference function to avoid this. The function is
- * deleted by the linker.
- */
-#define LTO_REFERENCE_INITCALL(x) \
-   ; /* yes this is needed */  \
-   static __used __exit void *reference_##x(void)  \
-   {   \
-   return &x;  \
-   }
-#else
-#define LTO_REFERENCE_INITCALL(x)
-#endif
-
 /* initcalls are now grouped by functionality into separate 
  * subsections. Ordering inside the subsections is determined
  * by link order. 
@@ -180,8 +163,7 @@ extern bool initcall_debug;
 
 #define __define_initcall(fn, id) \
static initcall_t __initcall_##fn##id __used \
-   __attribute__((__section__(".initcall" #id ".init"))) = fn; \
-   LTO_REFERENCE_INITCALL(__initcall_##fn##id)
+   __attribute__((__section__(".initcall" #id ".init"))) = fn;
 
 /*
  * Early initcalls run before initializing SMP.
-- 
2.8.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [RFC v3 10/13] jump_label: port __jump_table to linker tables

2016-07-22 Thread Luis R. Rodriguez
Move the __jump_table from the a custom section solution
to a generic solution, this avoiding extra vmlinux.lds.h
customizations.

This also demos the use of the .data (SECTION_DATA) linker
table and of the shared asm call push_section_tbl().

Built-in kernel functionality was tested with CONFIG_STATIC_KEYS_SELFTEST.
Moduler  kernel functionality was tested with CONFIG_TEST_STATIC_KEYS.
Both work as expected.

Since __jump_table sections are also supported per
module this also required expanding module-common.lds.S
to capture and fold all .data.tlb.__jump_table.* onto
the the section __jump_table -- in this case for modules
need to keep a reference in place, given the alternative
is to use DEFINE_LINKTABLE(struct jump_entry, __jump_table)
per module -- and later through macro hacks instantiate
the jump entries per module upon init. This is doable but
we'd loose out on the sorting of the table using the
linker, to sort we'd always still need to expand the
module common linker script. An alternative mechanism
is possible which would make these custom module sections
extensions dynamic without requiring manual changes, this
however is best done later through a separate evolution
once linker tables are in place.

A careful reviewer may note that some architectures use
"\n\t" to separate asm code, while others just use a new line.
Upon review last time it was deemed reasonable to for all
architectures to juse use "\n", this is defined as ASM_CMD_SEP,
and if an architecture needs to override they can do so on their
architecture sections.h prior to including asm-generic/sections.h

v3:

o More elaborate tests performed
o first modular support use case, module tested was
  CONFIG_TEST_STATIC_KEYS (lib/test_static_keys.ko), this
  required us to extend module-common.lds.S
o use generic push_section_tbl_any() for all architectures
o Makes use of ASM_CMD_SEP to enable architectures to override later
  if needed
o guard tables.h inclusion and table definition with __KERNEL__

v2: introduced in this series

Signed-off-by: Luis R. Rodriguez 
---
 arch/arm/include/asm/jump_label.h |  6 --
 arch/arm64/include/asm/jump_label.h   |  6 --
 arch/mips/include/asm/jump_label.h|  6 --
 arch/powerpc/include/asm/jump_label.h |  8 +---
 arch/s390/include/asm/jump_label.h|  6 --
 arch/sparc/include/asm/jump_label.h   |  6 --
 arch/x86/include/asm/jump_label.h | 10 ++
 include/asm-generic/vmlinux.lds.h |  4 
 include/linux/jump_label.h| 10 ++
 kernel/jump_label.c   | 17 ++---
 scripts/module-common.lds.S   |  5 +
 tools/objtool/special.c   |  8 +++-
 12 files changed, 59 insertions(+), 33 deletions(-)

diff --git a/arch/arm/include/asm/jump_label.h 
b/arch/arm/include/asm/jump_label.h
index 34f7b6980d21..960135a7b88e 100644
--- a/arch/arm/include/asm/jump_label.h
+++ b/arch/arm/include/asm/jump_label.h
@@ -1,6 +1,8 @@
 #ifndef _ASM_ARM_JUMP_LABEL_H
 #define _ASM_ARM_JUMP_LABEL_H
 
+#include 
+
 #ifndef __ASSEMBLY__
 
 #include 
@@ -12,7 +14,7 @@ static __always_inline bool arch_static_branch(struct 
static_key *key, bool bran
 {
asm_volatile_goto("1:\n\t"
 WASM(nop) "\n\t"
-".pushsection __jump_table,  \"aw\"\n\t"
+push_section_tbl_any(SECTION_DATA, __jump_table, aw)
 ".word 1b, %l[l_yes], %c0\n\t"
 ".popsection\n\t"
 : :  "i" (&((char *)key)[branch]) :  : l_yes);
@@ -26,7 +28,7 @@ static __always_inline bool arch_static_branch_jump(struct 
static_key *key, bool
 {
asm_volatile_goto("1:\n\t"
 WASM(b) " %l[l_yes]\n\t"
-".pushsection __jump_table,  \"aw\"\n\t"
+push_section_tbl_any(SECTION_DATA, __jump_table, aw)
 ".word 1b, %l[l_yes], %c0\n\t"
 ".popsection\n\t"
 : :  "i" (&((char *)key)[branch]) :  : l_yes);
diff --git a/arch/arm64/include/asm/jump_label.h 
b/arch/arm64/include/asm/jump_label.h
index 1b5e0e843c3a..aa52cd2607e3 100644
--- a/arch/arm64/include/asm/jump_label.h
+++ b/arch/arm64/include/asm/jump_label.h
@@ -19,6 +19,8 @@
 #ifndef __ASM_JUMP_LABEL_H
 #define __ASM_JUMP_LABEL_H
 
+#include 
+
 #ifndef __ASSEMBLY__
 
 #include 
@@ -29,7 +31,7 @@
 static __always_inline bool arch_static_branch(struct static_key *key, bool 
branch)
 {
asm goto("1: nop\n\t"
-".pushsection __jump_table,  \"aw\"\n\t"
+push_section_tbl_any(SECTION_DATA, __jump_table, aw)
 ".align 3\n\t"
 ".quad 1b, %l[l_yes], %c0\n\t"
 ".popsection\n\t"
@@ -43,7 +45,7 @@ l_yes:
 static __always_inline bool arch_static_branch_jump(struct static_key *key, 
bool branch)
 {
asm goto("1: b %l[l_yes]\n\t"
-".pushsection __jump_table,  \"aw\"\n\t"
+push_section_tbl_any(SECTION_DATA, __jump_table, aw)
  

[Xen-devel] [RFC v3 08/13] firmware/Makefile: force recompilation if makefile changes

2016-07-22 Thread Luis R. Rodriguez
If you modify the target asm we currently do not force the
recompilation of the firmware files. The target asm is in
the firmware/Makefile, peg this file as a dependency to
require re-compilation of firmware targets when the asm
changes.

v3: introduced in this series

Signed-off-by: Luis R. Rodriguez 
---
 firmware/Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/firmware/Makefile b/firmware/Makefile
index e297e1b52636..fa3e81c2a97b 100644
--- a/firmware/Makefile
+++ b/firmware/Makefile
@@ -176,7 +176,8 @@ quiet_cmd_fwbin = MK_FW   $@
 wordsize_deps := $(wildcard include/config/64bit.h include/config/32bit.h \
include/config/ppc32.h include/config/ppc64.h \
include/config/superh32.h include/config/superh64.h \
-   include/config/x86_32.h include/config/x86_64.h)
+   include/config/x86_32.h include/config/x86_64.h \
+   firmware/Makefile)
 
 $(patsubst %,$(obj)/%.gen.S, $(fw-shipped-y)): %: $(wordsize_deps)
$(call cmd,fwbin,$(patsubst %.gen.S,%,$@))
-- 
2.8.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [RFC v3 07/13] tables.h: add linker table support

2016-07-22 Thread Luis R. Rodriguez
A linker table is a data structure that is stitched together from items
in multiple object files. Linux has historically implicitly used linker
tables for ages, however they were all built in an adhoc manner which
requires linker script modifications, per architecture. This adds a
general linker table solution so that a new linker table can be
implemented by changing C code only. The Linux linker table was
originally based on Michael Brown's iPXE's linker table solution but
has been significantly modified to fit Linux's use in its integration.

The same philosophy is borrowed, extended and further simplified:

Linker tables enable an extremely light weight linker build time
solution for feature ordering and selection, this can help to both
simplify init sequences in a generic fashion and helps avoiding code
bit-rotting when desirable.

Bit rotting avoidance is enabled by *granting the option* to force
compilation of code and only enable linking object code in when
specific features have been enabled. It accomplishes this by using
linker sections, the lightweight feature ordering of code is enabled
by taking advantage of the old binutils ld SORT() on features
specific sections.

Contrary to iPXE's solution, which strives to force compilation
of all features, Linux' solution strives to encourage usage of linker
tables to force compilation of code *only on designated code*. Linker
tables can be used without requiring one to force code compilation
of features that use it though, to take advantage of the simplification
of init sequences.

Linux code that uses linker tables *and* wishes to always require
compilation but selectively linking must use new kbuild target object
that helps accomplishes this, table-y. Linux code that uses linker
tables but does not want to require forcing compilation can use the
good 'ol obj-y targets.

Used commit 67a10ef000cb7 from iPXE upstream [0] as the starting point
for development and evaluation of Linux's integration, further changes
made and evaluation of different linker table options for Linux are
available on the linker-table userspace tree [1].

[0] git://git.ipxe.org/ipxe.git
[1] https://git.kernel.org/cgit/linux/kernel/git/mcgrof/linker-tables.git/

v3:

o addressed initial modular support test cases
o added generic asm macros so linker tables can be used in
  asm code / C asm calls
o section ranges are now split up into their own set of files
o use asm/sections.h instead of linux/sections.h for the linker
  script
o add a sections.h file for each architecture that was missing one,
  this is needed now as we'll be relying on sections.h for custom
  section types in code rather than custom architecture specific
  linker script hacks.
o full rewrite at this point, decided to pick copyleft-next license
  for this work

v2:

o modified completely to match feedback by community, made equivalent
  modifications to userspace solution. This is pretty much a complete
  rewrite of how we present and use linker tables. By using standard
  sections we no longer have to make custom linker script extensions
  for each new linker table solution, you just pick a linker table
  type by section type.
o extend documention considerably, including use of kdoc
o drop ICC hacks per popular request to ignore such issues for now
o use sections.h - this lets us streamline a clean use case of
  well documented sections. To help further with this make use of
  SECTION_TBL() to allow use of these in code and SECTION_TBL_ALL()
  on linker scripts, as well as SECTION_TBL_ALL_STR() on relocs.c
  when needed.

Cc: Michael Brown 
Signed-off-by: Luis R. Rodriguez 
---
 Documentation/DocBook/Makefile   |   2 +-
 Documentation/DocBook/linker-tables.tmpl | 166 +
 Documentation/kbuild/makefiles.txt   |  19 +
 arch/alpha/include/asm/tables.h  |   6 +
 arch/arc/include/asm/tables.h|   6 +
 arch/arm/include/asm/tables.h|   6 +
 arch/arm64/include/asm/tables.h  |   6 +
 arch/avr32/include/asm/tables.h  |   6 +
 arch/blackfin/include/asm/tables.h   |   6 +
 arch/c6x/include/asm/tables.h|   6 +
 arch/cris/include/asm/tables.h   |   6 +
 arch/frv/include/asm/tables.h|   6 +
 arch/h8300/include/asm/tables.h  |   6 +
 arch/hexagon/include/asm/tables.h|   6 +
 arch/ia64/include/asm/tables.h   |   6 +
 arch/m32r/include/asm/tables.h   |   6 +
 arch/m68k/include/asm/tables.h   |   6 +
 arch/metag/include/asm/tables.h  |   6 +
 arch/microblaze/include/asm/tables.h |   6 +
 arch/mips/include/asm/tables.h   |   6 +
 arch/mn10300/include/asm/tables.h|   6 +
 arch/nios2/include/asm/tables.h  |   6 +
 arch/openrisc/include/asm/tables.h   |   6 +
 arch/parisc/include/asm/tables.h |   6 +
 arch/powerpc/include/asm/tables.h|   6 +
 arch/s390/include/asm/tables.h   |   6 +
 arch/score/include/asm/tables.h  |   6 +
 ar

[Xen-devel] [RFC v3 02/13] dell-smo8800: include uaccess.h

2016-07-22 Thread Luis R. Rodriguez
sections.h is currently included and it provides dell-smo8800
with what it needs, an upcoming change will decouple uaccess.h
from sections.h. This driver needs to explicitly require uaccess.h
before this change.

v3: new to this series -- needed due to collateral of the split of
old linker tables from tables.h into 3 files: sections.h, ranges.h
and tables.h

Signed-off-by: Luis R. Rodriguez 
---
 drivers/platform/x86/dell-smo8800.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/platform/x86/dell-smo8800.c 
b/drivers/platform/x86/dell-smo8800.c
index 0aec4fd4c48e..37e646034ef8 100644
--- a/drivers/platform/x86/dell-smo8800.c
+++ b/drivers/platform/x86/dell-smo8800.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct smo8800_device {
u32 irq; /* acpi device irq */
-- 
2.8.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [RFC v3 06/13] ranges.h: add helpers to build and identify Linux section ranges

2016-07-22 Thread Luis R. Rodriguez
Section ranges are on one of the types of custom sections
types used in Linux. This provides a series of helpers for
defining them and using them. Most importantly this also
enables us to avoid modifying the linker script when we
add new section range.

It turns out a lot of custom sections are actually section ranges,
and these are typically spelled out in their architecture specific
asm/sections.h file -- we anable architectures to override what asm
is used for section ranges but start by defult trusting the asm-generic
version all around.

v3: new in this series, uses copyleft-next

Signed-off-by: Luis R. Rodriguez 
---
 arch/alpha/include/asm/ranges.h  |  6 
 arch/arc/include/asm/ranges.h|  6 
 arch/arm/include/asm/ranges.h|  6 
 arch/arm64/include/asm/ranges.h  |  6 
 arch/avr32/include/asm/ranges.h  |  6 
 arch/blackfin/include/asm/ranges.h   |  6 
 arch/c6x/include/asm/ranges.h|  6 
 arch/cris/include/asm/ranges.h   |  6 
 arch/frv/include/asm/ranges.h|  6 
 arch/h8300/include/asm/ranges.h  |  6 
 arch/hexagon/include/asm/ranges.h|  6 
 arch/ia64/include/asm/ranges.h   |  6 
 arch/m32r/include/asm/ranges.h   |  6 
 arch/m68k/include/asm/ranges.h   |  6 
 arch/metag/include/asm/ranges.h  |  6 
 arch/microblaze/include/asm/ranges.h |  6 
 arch/mips/include/asm/ranges.h   |  6 
 arch/mn10300/include/asm/ranges.h|  6 
 arch/nios2/include/asm/ranges.h  |  6 
 arch/openrisc/include/asm/ranges.h   |  6 
 arch/parisc/include/asm/ranges.h |  6 
 arch/powerpc/include/asm/ranges.h|  6 
 arch/s390/include/asm/ranges.h   |  6 
 arch/score/include/asm/ranges.h  |  6 
 arch/sh/include/asm/ranges.h |  6 
 arch/sparc/include/asm/ranges.h  |  6 
 arch/tile/include/asm/ranges.h   |  6 
 arch/um/include/asm/ranges.h |  6 
 arch/unicore32/include/asm/ranges.h  |  6 
 arch/x86/include/asm/ranges.h|  6 
 arch/xtensa/include/asm/ranges.h |  6 
 include/asm-generic/ranges.h | 70 
 include/linux/ranges.h   | 54 
 33 files changed, 310 insertions(+)
 create mode 100644 arch/alpha/include/asm/ranges.h
 create mode 100644 arch/arc/include/asm/ranges.h
 create mode 100644 arch/arm/include/asm/ranges.h
 create mode 100644 arch/arm64/include/asm/ranges.h
 create mode 100644 arch/avr32/include/asm/ranges.h
 create mode 100644 arch/blackfin/include/asm/ranges.h
 create mode 100644 arch/c6x/include/asm/ranges.h
 create mode 100644 arch/cris/include/asm/ranges.h
 create mode 100644 arch/frv/include/asm/ranges.h
 create mode 100644 arch/h8300/include/asm/ranges.h
 create mode 100644 arch/hexagon/include/asm/ranges.h
 create mode 100644 arch/ia64/include/asm/ranges.h
 create mode 100644 arch/m32r/include/asm/ranges.h
 create mode 100644 arch/m68k/include/asm/ranges.h
 create mode 100644 arch/metag/include/asm/ranges.h
 create mode 100644 arch/microblaze/include/asm/ranges.h
 create mode 100644 arch/mips/include/asm/ranges.h
 create mode 100644 arch/mn10300/include/asm/ranges.h
 create mode 100644 arch/nios2/include/asm/ranges.h
 create mode 100644 arch/openrisc/include/asm/ranges.h
 create mode 100644 arch/parisc/include/asm/ranges.h
 create mode 100644 arch/powerpc/include/asm/ranges.h
 create mode 100644 arch/s390/include/asm/ranges.h
 create mode 100644 arch/score/include/asm/ranges.h
 create mode 100644 arch/sh/include/asm/ranges.h
 create mode 100644 arch/sparc/include/asm/ranges.h
 create mode 100644 arch/tile/include/asm/ranges.h
 create mode 100644 arch/um/include/asm/ranges.h
 create mode 100644 arch/unicore32/include/asm/ranges.h
 create mode 100644 arch/x86/include/asm/ranges.h
 create mode 100644 arch/xtensa/include/asm/ranges.h
 create mode 100644 include/asm-generic/ranges.h
 create mode 100644 include/linux/ranges.h

diff --git a/arch/alpha/include/asm/ranges.h b/arch/alpha/include/asm/ranges.h
new file mode 100644
index ..b7bc4107bc75
--- /dev/null
+++ b/arch/alpha/include/asm/ranges.h
@@ -0,0 +1,6 @@
+#ifndef _ASM_ALPHA_RANGES_H
+#define _ASM_ALPHA_RANGES_H
+
+#include 
+
+#endif /* _ASM_ALPHA_RANGES_H */
diff --git a/arch/arc/include/asm/ranges.h b/arch/arc/include/asm/ranges.h
new file mode 100644
index ..a761243bf79f
--- /dev/null
+++ b/arch/arc/include/asm/ranges.h
@@ -0,0 +1,6 @@
+#ifndef _ASM_ARC_RANGES_H
+#define _ASM_ARC_RANGES_H
+
+#include 
+
+#endif /* _ASM_ARC_RANGES_H */
diff --git a/arch/arm/include/asm/ranges.h b/arch/arm/include/asm/ranges.h
new file mode 100644
index ..fb059e205f90
--- /dev/null
+++ b/arch/arm/include/asm/ranges.h
@@ -0,0 +1,6 @@
+#ifndef _ASM_ARM_RANGES_H
+#define _ASM_ARM_RANGES_H
+
+#include 
+
+#endif /* _ASM_ARM_RANGES_H */
diff --git a/arch/arm64/include/asm/ranges.h b/arch/arm64/include/asm/ranges.h
new fil

[Xen-devel] [RFC v3 03/13] scripts/module-common.lds: enable generation

2016-07-22 Thread Luis R. Rodriguez
scripts/module-common.lds is currently pretty static, in the
future this may change and we will want access to kernel macros
to help expands certain areas. To get access to use macros we
need to generate module-common.lds from module-common.lds.S, for
now though only enable the generation. We'll later expand on this
as needed.

Since this file is now generated add it to .gitignore.

v3: new to this series

Signed-off-by: Luis R. Rodriguez 
---
 .gitignore | 2 ++
 Makefile   | 6 +-
 scripts/Makefile.modpost   | 2 +-
 scripts/{module-common.lds => module-common.lds.S} | 0
 4 files changed, 8 insertions(+), 2 deletions(-)
 rename scripts/{module-common.lds => module-common.lds.S} (100%)

diff --git a/.gitignore b/.gitignore
index 2be25f771bd8..14a94d939a6f 100644
--- a/.gitignore
+++ b/.gitignore
@@ -113,3 +113,5 @@ all.config
 
 # Kdevelop4
 *.kdev4
+
+scripts/module-common.lds
diff --git a/Makefile b/Makefile
index caa33e007a8c..c56cca906769 100644
--- a/Makefile
+++ b/Makefile
@@ -408,6 +408,10 @@ KBUILD_AFLAGS_MODULE  := -DMODULE
 KBUILD_CFLAGS_MODULE  := -DMODULE
 KBUILD_LDFLAGS_MODULE := -T $(srctree)/scripts/module-common.lds
 
+$(srctree)/scripts/module-common.lds: $(srctree)/scripts/module-common.lds.S
+   $(Q)$(CC) $(CFLAGS) $(INCLUDES) $(LINUXINCLUDE) -E -P \
+   -D__ASSEMBLY__ -DLINKER_SCRIPT -o $@ $<
+
 # Read KERNELRELEASE from include/config/kernel.release (if it exists)
 KERNELRELEASE = $(shell cat include/config/kernel.release 2> /dev/null)
 KERNELVERSION = $(VERSION)$(if $(PATCHLEVEL),.$(PATCHLEVEL)$(if 
$(SUBLEVEL),.$(SUBLEVEL)))$(EXTRAVERSION)
@@ -1178,7 +1182,7 @@ all: modules
 # using awk while concatenating to the final file.
 
 PHONY += modules
-modules: $(vmlinux-dirs) $(if $(KBUILD_BUILTIN),vmlinux) modules.builtin
+modules: $(srctree)/scripts/module-common.lds $(vmlinux-dirs) $(if 
$(KBUILD_BUILTIN),vmlinux) modules.builtin
$(Q)$(AWK) '!x[$$0]++' $(vmlinux-dirs:%=$(objtree)/%/modules.order) > 
$(objtree)/modules.order
@$(kecho) '  Building modules, stage 2.';
$(Q)$(MAKE) -f $(srctree)/scripts/Makefile.modpost
diff --git a/scripts/Makefile.modpost b/scripts/Makefile.modpost
index 1366a94b6c39..2d8aff7735d6 100644
--- a/scripts/Makefile.modpost
+++ b/scripts/Makefile.modpost
@@ -121,7 +121,7 @@ quiet_cmd_ld_ko_o = LD [M]  $@
  $(KBUILD_LDFLAGS_MODULE) $(LDFLAGS_MODULE) \
  -o $@ $(filter-out FORCE,$^)
 
-$(modules): %.ko :%.o %.mod.o FORCE
+$(modules): %.ko : $(srctree)/scripts/module-common.lds %.o %.mod.o FORCE
$(call if_changed,ld_ko_o)
 
 targets += $(modules)
diff --git a/scripts/module-common.lds b/scripts/module-common.lds.S
similarity index 100%
rename from scripts/module-common.lds
rename to scripts/module-common.lds.S
-- 
2.8.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [RFC v3 13/13] kprobes: port blacklist kprobes to linker table

2016-07-22 Thread Luis R. Rodriguez
kprobe makes use of two sections, the one dealing with the actual
kprobes was recently ported using the standard section range API.
The blacklist functionality of kprobes is still using a custom
section and declaring its custom section using the linker script
as follows:

type  Linux-section custom section name  beginend
table .init.data_kprobe_blacklist__start_kprobe_blacklist 
__stop_kprobe_blacklist

This ports the _kprobe_blacklist custom section to the standard
Linux linker table API allowing us remove all the custom blacklist
kprobe section declarations from the linker script.

This has been tested by trying to register a kprobe on a blacklisted
symbol (these are declared with NOKPROBE_SYMBOL()), and confirms that
this fails to work as expected. This was tested with:

 # insmod samples/kprobes/kprobe_example.ko symbol="get_kprobe"

This fails to load as expected with:

insmod: ERROR: could not insert module samples/kprobes/kprobe_example.ko: 
Invalid parameters

v3: this patch was introduced in this series

Signed-off-by: Luis R. Rodriguez 
---
 include/asm-generic/vmlinux.lds.h | 10 --
 include/linux/kprobes.h   |  5 +++--
 kernel/kprobes.c  | 11 ---
 3 files changed, 7 insertions(+), 19 deletions(-)

diff --git a/include/asm-generic/vmlinux.lds.h 
b/include/asm-generic/vmlinux.lds.h
index 1664050e6560..0e4df8c61c18 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -113,15 +113,6 @@
 #define BRANCH_PROFILE()
 #endif
 
-#ifdef CONFIG_KPROBES
-#define KPROBE_BLACKLIST() . = ALIGN(8); \
-   VMLINUX_SYMBOL(__start_kprobe_blacklist) = .; \
-   *(_kprobe_blacklist)  \
-   VMLINUX_SYMBOL(__stop_kprobe_blacklist) = .;
-#else
-#define KPROBE_BLACKLIST()
-#endif
-
 #ifdef CONFIG_EVENT_TRACING
 #define FTRACE_EVENTS(). = ALIGN(8);   
\
VMLINUX_SYMBOL(__start_ftrace_events) = .;  \
@@ -519,7 +510,6 @@
*(SECTION_INIT_RODATA)  \
FTRACE_EVENTS() \
TRACE_SYSCALLS()\
-   KPROBE_BLACKLIST()  \
MEM_DISCARD(init.rodata)\
CLK_OF_TABLES() \
RESERVEDMEM_OF_TABLES() \
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 3f46b282a3f9..c9bb9caef70c 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -43,9 +43,11 @@
 
 #ifdef CONFIG_KPROBES
 #include 
+#include 
 #include 
 
 DECLARE_SECTION_RANGE(kprobes);
+DECLARE_LINKTABLE(unsigned long, _kprobe_blacklist);
 
 /* kprobe_status settings */
 #define KPROBE_HIT_ACTIVE  0x0001
@@ -490,8 +492,7 @@ static inline int enable_jprobe(struct jprobe *jp)
  * by using this macro.
  */
 #define __NOKPROBE_SYMBOL(fname)   \
-static unsigned long __used\
-   __attribute__((section("_kprobe_blacklist")))   \
+static LINKTABLE_INIT_DATA(_kprobe_blacklist, all) \
_kbl_addr_##fname = (unsigned long)fname;
 #define NOKPROBE_SYMBOL(fname) __NOKPROBE_SYMBOL(fname)
 #else
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 387605682622..4801aa3b4adf 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -2053,14 +2053,13 @@ NOKPROBE_SYMBOL(dump_kprobe);
  * since a kprobe need not necessarily be at the beginning
  * of a function.
  */
-static int __init populate_kprobe_blacklist(unsigned long *start,
-unsigned long *end)
+static int __init populate_kprobe_blacklist(void)
 {
unsigned long *iter;
struct kprobe_blacklist_entry *ent;
unsigned long entry, offset = 0, size = 0;
 
-   for (iter = start; iter < end; iter++) {
+   LINKTABLE_FOR_EACH(iter, _kprobe_blacklist) {
entry = arch_deref_entry_point((void *)*iter);
 
if (!kernel_text_address(entry) ||
@@ -2125,8 +2124,7 @@ static struct notifier_block kprobe_module_nb = {
 };
 
 /* Markers of _kprobe_blacklist section */
-extern unsigned long __start_kprobe_blacklist[];
-extern unsigned long __stop_kprobe_blacklist[];
+DEFINE_LINKTABLE_INIT_DATA(unsigned long, _kprobe_blacklist);
 
 /* Actual kprobes section range */
 DEFINE_SECTION_RANGE(kprobes, SECTION_TEXT);
@@ -2143,8 +2141,7 @@ static int __init init_kprobes(void)
raw_spin_lock_init(&(kretprobe_table_locks[i].lock));
}
 
-   err = populate_kprobe_blacklist(__start_kprobe_blacklist,
-   __stop_kprobe_blacklist);
+   err = populate_kprobe

[Xen-devel] [RFC v3 05/13] sections.h: add sections header to collect all section info

2016-07-22 Thread Luis R. Rodriguez
Linux makes extensive use of custom ELF header sections,
documentation for these are well scatterred. Unify this
documentation in a central place and provide helpers to
build custom Linux sections.

We are now generalizing sections.h in code to enable
avoiding modifying the linker when we want to add new
custom Linux sections but to make this generally useful
we need to ensure all architectures also have their own
asm/sections.h, this is now also used by the custom series
of helpers to an architecture's linker script:

  asm-generic/vmlinux.lds.h.

v3:

o add missing sections.h for architectures that did not
  have it

o move generic sections to asm-generic/sections.h

o add generic asm helpers section_type(), section_type_asmtype(),
  push_section_type() -- these helpers enable easy use for
  for later declaring and using of custom linux sections using
  more standard APIs in both C code, asm code (C asm calls, or
  asm files), enabling future standardized section types to
  be more immediately accessible to asm code, not just C code.
  Note for ASM_CMD_SEP we use by default "\n", architectures needed
  to override can do so on their own sections.h prior to inclusion
  of asm-generic/sections.h

Signed-off-by: Luis R. Rodriguez 
---
 Documentation/DocBook/Makefile|   3 +-
 Documentation/DocBook/sections.tmpl   | 112 +
 arch/alpha/include/asm/sections.h |   6 +
 arch/arm64/include/asm/sections.h |   6 +
 arch/avr32/include/asm/sections.h |   6 +
 arch/cris/include/asm/sections.h  |   6 +
 arch/h8300/include/asm/sections.h |   6 +
 arch/hexagon/include/asm/sections.h   |   6 +
 arch/m32r/include/asm/sections.h  |   6 +
 arch/m68k/include/asm/sections.h  |   6 +
 arch/metag/include/asm/sections.h |   6 +
 arch/mips/include/asm/sections.h  |   6 +
 arch/mn10300/include/asm/sections.h   |   6 +
 arch/nios2/include/asm/sections.h |   6 +
 arch/openrisc/include/asm/sections.h  |   6 +
 arch/score/include/asm/sections.h |   6 +
 arch/unicore32/include/asm/sections.h |   6 +
 arch/xtensa/include/asm/sections.h|   6 +
 include/asm-generic/sections.h| 300 ++
 include/asm-generic/vmlinux.lds.h |  27 +--
 include/linux/sections.h  | 123 ++
 21 files changed, 647 insertions(+), 14 deletions(-)
 create mode 100644 Documentation/DocBook/sections.tmpl
 create mode 100644 arch/alpha/include/asm/sections.h
 create mode 100644 arch/arm64/include/asm/sections.h
 create mode 100644 arch/avr32/include/asm/sections.h
 create mode 100644 arch/cris/include/asm/sections.h
 create mode 100644 arch/h8300/include/asm/sections.h
 create mode 100644 arch/hexagon/include/asm/sections.h
 create mode 100644 arch/m32r/include/asm/sections.h
 create mode 100644 arch/m68k/include/asm/sections.h
 create mode 100644 arch/metag/include/asm/sections.h
 create mode 100644 arch/mips/include/asm/sections.h
 create mode 100644 arch/mn10300/include/asm/sections.h
 create mode 100644 arch/nios2/include/asm/sections.h
 create mode 100644 arch/openrisc/include/asm/sections.h
 create mode 100644 arch/score/include/asm/sections.h
 create mode 100644 arch/unicore32/include/asm/sections.h
 create mode 100644 arch/xtensa/include/asm/sections.h
 create mode 100644 include/linux/sections.h

diff --git a/Documentation/DocBook/Makefile b/Documentation/DocBook/Makefile
index 8e6dd4b14314..1b2ac4731418 100644
--- a/Documentation/DocBook/Makefile
+++ b/Documentation/DocBook/Makefile
@@ -17,7 +17,8 @@ DOCBOOKS := z8530book.xml device-drivers.xml \
80211.xml debugobjects.xml sh.xml regulator.xml \
alsa-driver-api.xml writing-an-alsa-driver.xml \
tracepoint.xml media_api.xml w1.xml \
-   writing_musb_glue_layer.xml crypto-API.xml iio.xml
+   writing_musb_glue_layer.xml crypto-API.xml iio.xml \
+   sections.xml
 
 include Documentation/DocBook/media/Makefile
 
diff --git a/Documentation/DocBook/sections.tmpl 
b/Documentation/DocBook/sections.tmpl
new file mode 100644
index ..2d14eb20e99c
--- /dev/null
+++ b/Documentation/DocBook/sections.tmpl
@@ -0,0 +1,112 @@
+
+http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd"; []>
+
+  
+Linux ELF sections
+
+  Explains Linux ELF sections
+
+
+
+  2016
+  Luis R. Rodriguez
+
+
+
+  
+Luis
+Rodriguez
+
+  mcg...@kernel.org
+
+  
+
+
+
+  
+This documentation is free software; you can redistribute
+it and/or modify it under the terms of the GNU General Public
+License version 2 as published by the Free Software Foundation.
+  
+  
+This documentation is distributed in the hope that it will be
+useful, but WITHOUT ANY WARRANTY; without even the implied
+warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+See the GNU General Public License for more detail

  1   2   >