Re: [Xen-devel] [PATCH v4 04/11] x86/intel_pstate: avoid calling cpufreq_add_cpu() twice

2015-07-27 Thread Wang, Wei W
On 24/07/2015 21:36,  Jan Beulich wrote:
> >>> On 25.06.15 at 13:15,  wrote:
> > cpufreq_add_cpu() is already called in the hypercall code path (the
> > bottom of set_px_pminfo() and inside cpufreq_cpu_init()).
> > So, we remove the redundant calling here.
> 
> While I can see that currently the call is kind of pointless (as it can't do
> anything useful before Dom0 communicated the data obtained from ACPI),
> it's still logically correct to call the callback on the BP prior to
> registering a hook for AP bringup. Otherwise you could (and perhaps should)
> as well defer the CPU notifier registration.
> 
> Otoh now that you're trying to introduce a driver independent of ACPI (and
> hence initialized at boot time) I wonder why you don't make use of what is
> here instead of deleting it.
> 

OK, I will roll this change back and leave cpufreq_presmp_init() as it is.

Best,
Wei
> 
> > --- a/xen/drivers/cpufreq/cpufreq.c
> > +++ b/xen/drivers/cpufreq/cpufreq.c
> > @@ -632,8 +632,6 @@ static struct notifier_block cpu_nfb = {
> >
> >  static int __init cpufreq_presmp_init(void)
> >  {
> > -    void *cpu = (void *)(long)smp_processor_id();
> > -    cpu_callback(&cpu_nfb, CPU_ONLINE, cpu);
> >      register_cpu_notifier(&cpu_nfb);
> >      return 0;
> >  }
> 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 05/11] x86/intel_pstate: relocate the driver register function

2015-07-27 Thread Wang, Wei W
On 24/07/2015 21:36,  Jan Beulich wrote:
> >>> On 25.06.15 at 13:16,  wrote:
> > Register the CPU hotplug notifier when the driver is registered, and
> > move the driver register function to the cpufreq.c.
> 
> At the very least this ought to be merged with the previous patch.
> 
> > --- a/xen/drivers/cpufreq/cpufreq.c
> > +++ b/xen/drivers/cpufreq/cpufreq.c
> > @@ -630,10 +630,18 @@ static struct notifier_block cpu_nfb = {
> >      .notifier_call = cpu_callback
> >  };
> >
> > -static int __init cpufreq_presmp_init(void)
> > +int cpufreq_register_driver(struct cpufreq_driver *driver_data)
> >  {
> > +    if (!driver_data || !driver_data->init ||
> > +        !driver_data->verify || !driver_data->exit ||
> > +        (!driver_data->target == !driver_data->setpolicy))
> > +        return -EINVAL;
> > +
> > +    if (cpufreq_driver)
> > +        return -EBUSY;
> > +
> > +    cpufreq_driver = driver_data;
> > +
> >      register_cpu_notifier(&cpu_nfb);
> >      return 0;
> >  }
> > -presmp_initcall(cpufreq_presmp_init);
> 
> But then the code is left inconsistent: When will the notifier be called for
> all the CPUs that are already up?

Since I will keep cpufreq_presmp_init(), register_cpu_notifier(&cpu_nfb)
will still be called from that presmp function.

Best,
Wei
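
To illustrate the question above about CPUs that are already up, here is a
minimal, self-contained sketch. It is not Xen code: every name in it
(register_hook, bring_cpu_online, register_hook_and_replay, MAX_CPUS) is
hypothetical, and it keeps a single hook instead of a notifier chain. It only
shows that registration by itself never runs the hook for CPUs that came
online earlier, so the caller has to replay the event for them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CPUS 4  /* hypothetical limit, for the sketch only */

typedef void (*cpu_online_fn)(unsigned int cpu);

static bool cpu_is_online[MAX_CPUS];
static cpu_online_fn online_hook;        /* one hook keeps the sketch short */
static unsigned int hook_calls[MAX_CPUS];

/* Example hook: counts how often it ran for each CPU. */
static void count_online(unsigned int cpu)
{
    assert(cpu < MAX_CPUS);
    hook_calls[cpu]++;
}

/* Registration only stores the hook; CPUs that are already up see nothing. */
static void register_hook(cpu_online_fn fn)
{
    online_hook = fn;
}

/* A CPU that comes up after registration does trigger the hook. */
static void bring_cpu_online(unsigned int cpu)
{
    assert(cpu < MAX_CPUS);
    cpu_is_online[cpu] = true;
    if (online_hook)
        online_hook(cpu);
}

/* Replay the event for CPUs that are already online, then register. */
static void register_hook_and_replay(cpu_online_fn fn)
{
    unsigned int cpu;

    for (cpu = 0; cpu < MAX_CPUS; cpu++)
        if (cpu_is_online[cpu])
            fn(cpu);
    register_hook(fn);
}
```

With plain register_hook() the boot CPU is never visited; with
register_hook_and_replay() it is visited exactly once at registration time.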



Re: [Xen-devel] [PATCH] build: use correct qemu path in systemd service file and init script

2015-07-27 Thread Ian Campbell
On Sun, 2015-07-26 at 00:19 +0800, Ting-Wei Lan wrote:
> Ian Campbell wrote on 2015-07-24 at 16:57:
> 
> > I think if $withval is yes and we are converting that to "qemu" then
> > QEMU_XEN_PATH should just be "qemu" and we should substitute that in
> > the initscript too. IOW the "taken from $PATH" applies just as much to
> > the initscript usage as it does to the toolstack.
> 
> Yes, we can use "qemu" in init scripts, but systemd service files 
> require absolute paths. We still have to do different things such as 
> "/usr/bin/env qemu" for systemd.

That's fine, IMHO systemd induced oddness ought to be confined to
systemd specific places.

Ian.



[Xen-devel] Xen master hangs

2015-07-27 Thread Doug Goldstein


I am currently trying to get Xen to boot on my Lenovo T430 laptop with
BIOS 2.68 and using master results in no output to the screen and the
machine being completely hung (must pull the power cable). I've gone
ahead and bisected it and the resultant commit is:

commit 4824bdfdabebd4042277461cb3cbefa61c624804
Author: Chao Peng 
Date:   Tue Jul 7 15:42:49 2015 +0200

    x86: add socket_cpumask

    Maintain socket_cpumask which contains all the HT and core siblings
    in the same socket.

    Signed-off-by: Chao Peng
    Acked-by: Jan Beulich

Since I'm not getting any output, I'm at a loss as to how to debug this
problem further. I do have Intel AMT set up with Serial Over LAN working, so
if it's at all possible to get more info over serial I will do that. I am
already booting Xen with console=vga,com1 com1=115200,amt loglvl=all, so I'm
not sure what else to add.

Thanks.
-- 
Doug Goldstein


[Xen-devel] How to build a linux based stub domain

2015-07-27 Thread Xuehan Xu
Hi, everyone.

Is there any way to run a stub domain based on Linux rather than Mini-OS?

Thanks:-)


Re: [Xen-devel] Interested in taking up a project

2015-07-27 Thread Abhinav Gupta
Hi everyone :) ,
   I'm quite familiar with the Linux powerclamp driver now.
I have also started looking into Xen's code as Dario suggested, but I am not
able to find proper documentation for Xen.
These are my questions:
1. I am looking for a brief explanation of the different fields in the
scheduler data structure in sched-if.h.
2. From where do the different fields of the scheduler structure get called?
3. Will the driver I'll be writing run at the host machine level or at the
guest OS level? As far as I understand, we should have it at the host
level to optimize the performance of all the guests: VMs deal with
an abstract interface (the VCPU), so they won't have an exact notion of
the various CPU parameters at runtime.

Please let me know if I'm wrong anywhere.

Thanks,
Abhinav

On Mon, Jul 13, 2015 at 3:15 PM, Dario Faggioli 
wrote:

> On Sat, 2015-07-11 at 02:03 +0530, Abhinav Gupta wrote:
> > Hi everyone,
> >
> Hey, :-)
>
> >   I'm sorry for the late update. Actually I had another  project going
> > on in parallel, didn't want to distribute efforts.
> >
> Sure, no problem.
>
> > I went through the implementation approach of powerclamp, it controls
> > power consumption by managing C states of the core. This was my
> > learning so far. Code makes a  little sense to me, I'll need some more
> > time to get hands-on with powerclamp's code (I have no experience with Linux
> > kernel code). After this I'll start exploring Xen.
> >
> Right. Bear in mind that, with respect to this, Linux and Xen are quite
> different. Or at least, that's certainly true for scheduling... for
> ACPI, there might be similarities due to the fact that ACPI support in
> Xen is inspired by the Linux one, but I'm no expert in that, so I don't
> really know.
>
> The point I wanted to make was, although some understanding on how
> things work in Linux, in order to figure out what PowerClamp really
> does, is necessary, start focusing on Xen ASAP, as that is your
> target! :-)
>
> > @Dario I'll look into how popular it is in the linux world and if
> > there are some real popular real space applications built on top of
> > it.  I'll put my findings here.
> >
> Ok, that would be great.
>
> Thanks and Regards,
> Dario
>
> --
> <> (Raistlin Majere)
> -
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
>


Re: [Xen-devel] [PATCH] libxc: fix memory leak in migration v2

2015-07-27 Thread Ian Campbell
On Sun, 2015-07-26 at 17:47 +0100, Andrew Cooper wrote:
> For 4.7 (which happens to coincide with the splitting up of libxc), I
> recommend introducing xc_unmap()

I thought I'd done that in my latest series, but looking for the
precise name now I see that there is no such function and
xenforeignmemory_map() still documents unmapping with munmap. I shall
make a note to introduce xenforeignmemory_unmap()!

Ian.



Re: [Xen-devel] [PATCH v2] libxc: fix memory leak in migration v2

2015-07-27 Thread Ian Campbell
On Sun, 2015-07-26 at 22:36 +0100, Andrew Cooper wrote:
> On 26/07/2015 22:34, Wei Liu wrote:
> > Originally there was only one counter to keep track of pages. It was
> > used erroneously to keep track of how many pages were mapped and how
> > many pages needed to be sent. In the end munmap(2) always had 0 as the
> > length argument, which resulted in leaking the mapping.
> >
> > This problem was discovered on 32bit toolstack because 32bit applications
> > have notably smaller address space. In fact this bug affects 64bit
> > toolstack too.
> >
> > Use a separate counter to keep track of the number of mapped pages to
> > solve this problem.
> >
> > Signed-off-by: Wei Liu 
> > ---
> > Cc: Andrew Cooper 
> 
> Reviewed-by: Andrew Cooper 

acked and applied, thanks all.
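
For readers skimming the archive, the fix described in the quoted commit
message can be sketched roughly as follows. This is not the libxc code: the
function name sketch_send_batch, the skip[] array and SKETCH_PAGE_SIZE are
assumptions made up for the example, and an anonymous mmap() stands in for the
foreign mapping. The only point is that the length handed to munmap(2) comes
from its own counter and is unaffected by how many pages end up being sent.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

#define SKETCH_PAGE_SIZE 4096UL   /* assumption for the sketch */

/*
 * Map nr_pages pages, count how many of them need sending, then unmap.
 * skip[i] is true for pages that do not need to be sent.
 * Returns 0 on success, -1 on failure.
 */
static int sketch_send_batch(const bool *skip, size_t nr_pages,
                             size_t *nr_sent)
{
    size_t nr_mapped = nr_pages;  /* fixed once the mapping exists */
    size_t nr_to_send = 0;        /* may legitimately end up as 0 */
    size_t i;
    void *map;

    assert(nr_pages > 0);

    map = mmap(NULL, nr_mapped * SKETCH_PAGE_SIZE, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (map == MAP_FAILED)
        return -1;

    for (i = 0; i < nr_pages; i++)
        if (!skip[i])
            nr_to_send++;

    *nr_sent = nr_to_send;

    /* Unmap by the mapped count, never by the to-send count. */
    return munmap(map, nr_mapped * SKETCH_PAGE_SIZE);
}
```

Even when every page is skipped and nr_to_send stays at 0, the whole mapping
is still released.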




Re: [Xen-devel] [PATCH v2 0/15+5+5] Begin to disentangle libxenctrl and provide some stable libraries

2015-07-27 Thread Ian Campbell
On Wed, 2015-07-22 at 12:12 +0100, Ian Campbell wrote:
> On Wed, 2015-07-15 at 16:53 +0100, Andrew Cooper wrote:
> > 
> > > I'm still mulling over putting everything into tools/libs/FOO
> > > instead of tools/libxenFOO
> > 
> > On balance, +1.  tools/ is already quite a mixed bag of stuff.
> 
> OK, I think I'm going to go ahead with this.
> 
> > > , I still haven't but I could if people think
> > > it is worthwhile. Eventually I'd like to split libxc into libxenguest
> > > and libxenctrl to cut down on the amount of strange cross talk...
> > 
> > Very much +1.
> 
> OK (eventually ;-))
> 
> > FWIW, also splitting xl and libxl into different directories.
> 
> Yes, good idea (also "eventually ;-)").
> 
> While I'm accumulating TODO items in this thread, every library which
> gets split out needs a careful review before it should be declared
> A[PBI] stable. I'm thinking at least:
> 
>   * Consistent error handling (return -1 + setting errno=EFOO)
>   * Correct types
>   * Const correctness
>   * General interface sanity check

One more for that list: an explicit unmap operation to match the foreign
map, rather than just using munmap (to allow valgrind hooks etc).

Ian.
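
As a rough illustration of two of the review items in this thread (the
"return -1 + setting errno" convention and an explicit unmap that pairs with
the map call), here is a small sketch. The names sketch_map, sketch_unmap and
struct sketch_mapping are hypothetical and are not the libxc or
xenforeignmemory interface; an anonymous mmap() stands in for a foreign
mapping.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

#define SKETCH_PAGE_SIZE 4096UL   /* assumption for the sketch */

/* Hypothetical handle: remembers what was mapped so unmap needs no length. */
struct sketch_mapping {
    void *addr;
    size_t nr_pages;
};

/* Returns 0 on success, -1 with errno set on failure. */
static int sketch_map(struct sketch_mapping *m, size_t nr_pages)
{
    void *p;

    if (!m || nr_pages == 0) {
        errno = EINVAL;
        return -1;
    }

    p = mmap(NULL, nr_pages * SKETCH_PAGE_SIZE, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;              /* errno already set by mmap() */

    m->addr = p;
    m->nr_pages = nr_pages;
    return 0;
}

/* Explicit counterpart to sketch_map(); same error convention. */
static int sketch_unmap(struct sketch_mapping *m)
{
    if (!m || !m->addr) {
        errno = EINVAL;
        return -1;
    }

    assert(m->nr_pages > 0);    /* invariant of a live mapping */

    if (munmap(m->addr, m->nr_pages * SKETCH_PAGE_SIZE) < 0)
        return -1;              /* errno already set by munmap() */

    m->addr = NULL;
    m->nr_pages = 0;
    return 0;
}
```

Having a dedicated unmap entry point also gives a single place where tooling
hooks could be attached later, which callers of raw munmap() would bypass.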




[Xen-devel] [qemu-mainline test] 60004: regressions - FAIL

2015-07-27 Thread osstest service owner
flight 60004 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/60004/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64                   5 xen-build     fail REGR. vs. 59059
 build-i386-xsm                5 xen-build     fail REGR. vs. 59059
 build-i386                    5 xen-build     fail REGR. vs. 59059
 build-amd64-pvops             5 kernel-build  fail REGR. vs. 59059
 build-i386-pvops              5 kernel-build  fail REGR. vs. 59059
 build-amd64-xsm               5 xen-build     fail REGR. vs. 59059
 build-armhf                   5 xen-build     fail REGR. vs. 59059
 build-armhf-pvops             5 kernel-build  fail REGR. vs. 59059

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt                           1 build-check(1)  blocked  n/a
 build-i386-libvirt                            1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-pvh-intel                 1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl                           1 build-check(1)  blocked  n/a
 test-amd64-i386-freebsd10-i386                1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64           1 build-check(1)  blocked  n/a
 test-amd64-i386-libvirt                       1 build-check(1)  blocked  n/a
 test-amd64-i386-libvirt-xsm                   1 build-check(1)  blocked  n/a
 test-amd64-amd64-libvirt-xsm                  1 build-check(1)  blocked  n/a
 test-amd64-amd64-libvirt                      1 build-check(1)  blocked  n/a
 test-armhf-armhf-xl-multivcpu                 1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-xsm                        1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-rtds                      1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64          1 build-check(1)  blocked  n/a
 test-armhf-armhf-xl-cubietruck                1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64      1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1      1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-multivcpu                 1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64     1 build-check(1)  blocked  n/a
 test-amd64-i386-xl                            1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-credit2                   1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-xsm                       1 build-check(1)  blocked  n/a
 test-armhf-armhf-xl-rtds                      1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-qemuu-winxpsp3            1 build-check(1)  blocked  n/a
 test-armhf-armhf-xl-arndale                   1 build-check(1)  blocked  n/a
 test-amd64-i386-freebsd10-amd64               1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-pvh-amd                   1 build-check(1)  blocked  n/a
 build-armhf-libvirt                           1 build-check(1)  blocked  n/a
 test-armhf-armhf-libvirt-xsm                  1 build-check(1)  blocked  n/a
 test-armhf-armhf-libvirt                      1 build-check(1)  blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-amd            1 build-check(1)  blocked  n/a
 test-armhf-armhf-xl-xsm                       1 build-check(1)  blocked  n/a
 test-armhf-armhf-xl                           1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemuu-winxpsp3             1 build-check(1)  blocked  n/a
 test-amd64-amd64-pair                         1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 1 build-check(1)  blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-intel          1 build-check(1)  blocked  n/a
 test-armhf-armhf-xl-credit2                   1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-qemuu-win7-amd64          1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemuu-win7-amd64           1 build-check(1)  blocked  n/a
 test-amd64-i386-pair                          1 build-check(1)  blocked  n/a

version targeted for testing:
 qemuu                f793d97e454a56d17e404004867985622ca1a63b
baseline version:
 qemuu                35360642d043c2a5366e8a04a10e5545e7353bd5

Last test of basis    59059  2015-07-05 10:39:20 Z   21 days
Failing since         59109  2015-07-06 14:58:21 Z   20 days   31 attempts
Testing same since    59877  2015-07-24 18:03:38 Z    2 days    6 attempts


People who touched revisions under test:
  Alberto Garcia 
  Alex Williamson 
  Alexander Graf 
  Alexey Kardashevskiy 
  Alvise Rigo 
  Amit Shah 
  Andreas Färber 
  Andrew Bennett 
  Andrew Jones 
  Artyom Tarasenko 
  Aurelien Jarno 
  Benjamin Herrenschmidt 
  Bharata B Rao 
  Bharata B Rao 
  Brian Kress 
  Chen Hanxiao 
  Christian Borntraeger 
  Christoffer Dall 
  Christoph Hellwig 
  Christophe Fergeau 
  Claudio Fontan

Re: [Xen-devel] [PATCH v4 07/11] x86/intel_pstate: the main body of the intel_pstate driver

2015-07-27 Thread Wang, Wei W
On 24/07/2015 21:54,  Jan Beulich wrote:
> >>> On 25.06.15 at 13:16,  wrote:
> > +int __init intel_pstate_init(void)
> > +{
> > +    int cpu, rc = 0;
> > +    const struct x86_cpu_id *id;
> > +    struct cpu_defaults *cpu_info;
> > +
> > +    id = x86_match_cpu(intel_pstate_cpu_ids);
> > +    if (!id)
> > +        return -ENODEV;
> > +
> > +    cpu_info = (struct cpu_defaults *)id->driver_data;
> > +
> > +    copy_pid_params(&cpu_info->pid_policy);
> > +    copy_cpu_funcs(&cpu_info->funcs);
> > +
> > +    if (intel_pstate_msrs_not_valid())
> > +        return -ENODEV;
> > +
> > +    all_cpu_data = xzalloc_array(struct cpudata *, NR_CPUS);
> > +    if (!all_cpu_data)
> > +        return -ENOMEM;
> > +
> > +    rc = cpufreq_register_driver(&intel_pstate_driver);
> > +    if (rc)
> > +        goto out;
> > +
> > +    return rc;
> > +out:
> > +    for_each_online_cpu(cpu) {
> > +        if (all_cpu_data[cpu]) {
> > +            kill_timer(&all_cpu_data[cpu]->timer);
> > +            xfree(all_cpu_data[cpu]);
> > +        }
> > +    }
> 
> I have a hard time seeing where in this function the setup happens that is
> being undone here (keeping in mind that the notifier registration inside
> cpufreq_register_driver() doesn't actually call the notifier function).
> 
> And then, looking at the diff between this and what Linux 4.2-rc3 has (which
> admittedly looks a little newer than what you sent, so I already subtract
> some of the delta), it is significantly larger than the source file itself.
> That surely doesn't suggest a clone-with-minimal-delta. Yet as said before -
> either you do that, or you accept us picking at things you inherited from
> Linux.

I think it's better to choose the latter - picking out things that are useful
for us from Linux.
Could you please take a look at this patch and summarize the comments? Thanks.

Best,
Wei



Re: [Xen-devel] [PATCHv2 10/10] xen/balloon: pre-allocate p2m entries for ballooned pages

2015-07-27 Thread David Vrabel
On 25/07/15 00:21, Julien Grall wrote:
> On 24/07/2015 12:47, David Vrabel wrote:
>> @@ -550,6 +551,11 @@ int alloc_xenballooned_pages(int nr_pages, struct page **pages)
>>          page = balloon_retrieve(true);
>>          if (page) {
>>              pages[pgno++] = page;
>> +#ifdef CONFIG_XEN_HAVE_PVMMU
>> +            ret = xen_alloc_p2m_entry(page_to_pfn(page));
> 
> Don't you want to call this function only when the guest is not using
> auto-translated physmap?

xen_alloc_p2m_entry() is a nop in auto-xlate guests, so no need for an
additional check here.

David



Re: [Xen-devel] [PATCH v4 0/2] xen: sched/cpupool: more fixing of (corner?) cases

2015-07-27 Thread George Dunlap
On Thu, Jul 23, 2015 at 6:03 PM, Dario Faggioli
 wrote:
> On Thu, 2015-07-23 at 17:04 +0100, Wei Liu wrote:
>> On Thu, Jul 23, 2015 at 09:49:49AM -0600, Jan Beulich wrote:
>> > >>> On 23.07.15 at 16:45,  wrote:
>
>> > > Dario Faggioli (2):
>> > >   xen: sched: reorganize cpu_disable_scheduler()
>> > >   xen: sched/cpupool: properly update affinity when removing a cpu from
>> > > a cpupool
>> >
>> > Especially for the first one, though, the title suggests mere cleanup
>> > (i.e. not to go in now), while the description of it looks more like a
>> > bug fix. Considering this I'd prefer to have a release ack.
>> >
>>
>> They both look like bug fixes to me.
>>
> They both are bugfixes indeed. But yes, in patch 1, taking care of the
> bugs calls for some cleanup (or so I thought it was best), and I probably
> could have made the bugfix nature of the patch more clear, given where
> we are in the release.
>
> Sorry, and thanks Jan for bringing this up with Wei...
>
>> Release-acked-by: Wei Liu 
>>
> ...and thanks Wei for the ack. :-)

Looks like this hasn't been checked in yet -- I think Jan is away;
Andy / Ian, can one of you check in these two patches?

Thanks!
 -George



Re: [Xen-devel] [PATCH v4 0/2] xen: sched/cpupool: more fixing of (corner?) cases

2015-07-27 Thread George Dunlap
On Mon, Jul 27, 2015 at 10:44 AM, George Dunlap
 wrote:
> On Thu, Jul 23, 2015 at 6:03 PM, Dario Faggioli
>  wrote:
>> On Thu, 2015-07-23 at 17:04 +0100, Wei Liu wrote:
>>> On Thu, Jul 23, 2015 at 09:49:49AM -0600, Jan Beulich wrote:
>>> > >>> On 23.07.15 at 16:45,  wrote:
>>
>>> > > Dario Faggioli (2):
>>> > >   xen: sched: reorganize cpu_disable_scheduler()
>>> > >   xen: sched/cpupool: properly update affinity when removing a cpu from
>>> > > a cpupool
>>> >
>>> > Especially for the first one, though, the title suggests mere cleanup
>>> > (i.e. not to go in now), while the description of it looks more like a
>>> > bug fix. Considering this I'd prefer to have a release ack.
>>> >
>>>
>>> They both look like bug fixes to me.
>>>
>> They both are bugfixes indeed. But yes, in patch 1, taking care of the
>> bugs calls for some cleanup (or so I thought it was best), and I probably
>> could have made the bugfix nature of the patch more clear, given where
>> we are in the release.
>>
>> Sorry, and thanks Jan for bringing this up with Wei...
>>
>>> Release-acked-by: Wei Liu 
>>>
>> ...and thanks Wei for the ack. :-)
>
> Looks like this hasn't been checked in yet -- I think Jan is away;
> Andy / Ian, can one of you check in these two patches?

Oops -- ignore this, I had the wrong branch checked out. :-/

 -George



Re: [Xen-devel] Xen master hangs

2015-07-27 Thread Andrew Cooper
On 24/07/15 17:52, Doug Goldstein wrote:
>
> I am currently trying to get Xen to boot on my Lenovo T430 laptop with
> BIOS 2.68 and using master results in no output to the screen and the
> machine being completely hung (must pull the power cable). I've gone
> ahead and bisected it and the resultant commit is:
>
> commit 4824bdfdabebd4042277461cb3cbefa61c624804
> Author: Chao Peng
> Date:   Tue Jul 7 15:42:49 2015 +0200
>
> x86: add socket_cpumask
>
> Maintain socket_cpumask which contains all the HT and core siblings
> in the same socket.
>
> Signed-off-by: Chao Peng
> Acked-by: Jan Beulich
>
> Since I'm not getting any output, I'm at a loss as to how to debug this
> problem further. I do have Intel AMT set up with Serial Over LAN working, so
> if it's at all possible to get more info over serial I will do that. I am
> already booting Xen with console=vga,com1 com1=115200,amt loglvl=all, so I'm
> not sure what else to add.

This commit broke booting on just about every server.

New master has the fix in it, although there is one further fix ahead of
that.  Try using bc299d01b925d934219b6e8c29fadcd1f1a9210b, which is two
ahead of current master.

~Andrew


Re: [Xen-devel] [PATCH] xen/events: Support event channel rebind on ARM

2015-07-27 Thread David Vrabel
On 25/07/15 18:34, Julien Grall wrote:
> Currently, the event channel rebind code is gated on the presence of
> the vector callback.
> 
> The virtual interrupt controller on ARM has the concept of per-CPU
> interrupts (PPIs), which allow us to support per-VCPU event channels.
> Therefore there is no need for a vector callback on ARM.
> 
> Xen is already using a free PPI to notify the guest VCPU of an event.
> Furthermore, the Xen initialization code in Linux (see
> arch/arm/xen/enlighten.c) correctly requests a per-CPU IRQ.
> 
> Introduce a new macro, xen_support_evtchn_rebind, to allow each architecture
> to decide whether rebinding an event channel is supported or not. It will
> always return 1 on ARM and keep the same behavior on x86.
> 
> This also allows us to drop the usage of xen_have_vector_callback
> entirely in the ARM code.

Reviewed-by: David Vrabel 

Provided you make xen_support_evtchn_rebind() an inline function.

David
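
A rough sketch of what the inline-function form requested above might look
like. Everything here is an assumption made for illustration: the sketch_
names, the SKETCH_ARCH_ARM switch, and the reduction of the x86 condition to a
single boolean are not taken from the patch, and the real x86 check may
involve more than this one flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the x86 state that currently gates rebinding (assumption). */
static bool sketch_have_vector_callback;

#ifdef SKETCH_ARCH_ARM
/* ARM: per-VCPU delivery uses a PPI, so rebinding is always possible. */
static inline bool sketch_support_evtchn_rebind(void)
{
    return true;
}
#else
/* x86: keep the existing behaviour, i.e. depend on the vector callback. */
static inline bool sketch_support_evtchn_rebind(void)
{
    return sketch_have_vector_callback;
}
#endif

/* Caller side: one check, no #ifdef. Returns 0 on success, -1 otherwise. */
static int sketch_rebind_to_vcpu(unsigned int vcpu, unsigned int nr_vcpus)
{
    assert(nr_vcpus > 0);

    if (!sketch_support_evtchn_rebind() || vcpu >= nr_vcpus)
        return -1;

    return 0;   /* the actual rebind would happen here */
}
```

Compared with a macro, the inline function has a real return type and is
type-checked by the compiler, while still costing nothing at run time.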



Re: [Xen-devel] [PATCHv2 08/10] xen/balloon: use hotplugged pages for foreign mappings etc.

2015-07-27 Thread David Vrabel
On 24/07/15 19:55, Konrad Rzeszutek Wilk wrote:
> On Fri, Jul 24, 2015 at 12:47:46PM +0100, David Vrabel wrote:
>> alloc_xenballooned_pages() is used to get ballooned pages to back
>> foreign mappings etc.  Instead of having to balloon out real pages,
>> use (if supported) hotplugged memory.
>>
>> This makes more memory available to the guest and reduces
>> fragmentation in the p2m.
>>
>> If userspace is lacking a udev rule (or similar) to online hotplugged
> 
> Is that udev rule already in distros?

Not all, which makes me think that this behaviour should be enabled by
userspace (via a module parameter).  This would also allow me to drop
the timeout and fallback path which I put in to handle the no udev rule
case.

David



[Xen-devel] [xen-unstable test] 59910: regressions - FAIL

2015-07-27 Thread osstest service owner
flight 59910 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59910/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail REGR. vs. 59817
 test-amd64-i386-xl-qemuu-ovmf-amd64  9 debian-hvm-install fail REGR. vs. 59817

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install fail like 59817
 test-armhf-armhf-xl-rtds 11 guest-start  fail   like 59817
 test-amd64-i386-xl-qemut-debianhvm-amd64 11 guest-saverestore  fail like 59817
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 59817
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 59817
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 12 guest-localmigrate fail like 59904-bisect

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-arndale          12 migrate-support-check fail  never pass
 test-amd64-amd64-xl-pvh-amd          11 guest-start           fail  never pass
 test-armhf-armhf-xl-xsm              12 migrate-support-check fail  never pass
 test-amd64-i386-libvirt-xsm          12 migrate-support-check fail  never pass
 test-armhf-armhf-xl                  12 migrate-support-check fail  never pass
 test-amd64-i386-libvirt              12 migrate-support-check fail  never pass
 test-armhf-armhf-libvirt             12 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu        12 migrate-support-check fail  never pass
 test-amd64-amd64-libvirt             12 migrate-support-check fail  never pass
 test-amd64-amd64-xl-pvh-intel        11 guest-start           fail  never pass
 test-armhf-armhf-xl-cubietruck       12 migrate-support-check fail  never pass
 test-armhf-armhf-xl-credit2          12 migrate-support-check fail  never pass
 test-armhf-armhf-libvirt-xsm         12 migrate-support-check fail  never pass
 test-amd64-amd64-libvirt-xsm         12 migrate-support-check fail  never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop            fail  never pass
 test-amd64-i386-xl-qemut-win7-amd64  16 guest-stop            fail  never pass

version targeted for testing:
 xen  19b17e240c5f31eb1ff744946ce75afa729bfe91
baseline version:
 xen  7c60c2da3160766a265cb84c7411ff2c9cbd8d0b

Last test of basis    59817  2015-07-22 07:29:29 Z    5 days
Failing since         59833  2015-07-23 10:56:30 Z    3 days    2 attempts
Testing same since    59910  2015-07-25 10:33:08 Z    1 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Andrew Cooper  for the x86 bits.
  Chao Peng 
  Chris (Christopher) Brand 
  Chris Brand 
  Daniel De Graaf 
  Dario Faggioli 
  Ed White 
  George Dunlap 
  Ian Campbell 
  Ian Jackson 
  Jan Beulich 
  Jonathan Creekmore 
  Juergen Gross 
  Julien Grall 
  Jun Nakajima 
  Kevin Tian 
  Martin Lucina 
  Ravi Sahita 
  Roger Pau Monné 
  Tamas K Lengyel 
  Tiejun Chen 
  Wei Liu 

jobs:
 build-amd64-xsm                                        pass
 build-armhf-xsm                                        pass
 build-i386-xsm                                         pass
 build-amd64                                            pass
 build-armhf                                            pass
 build-i386                                             pass
 build-amd64-libvirt                                    pass
 build-armhf-libvirt                                    pass
 build-i386-libvirt                                     pass
 build-amd64-oldkern                                    pass
 build-i386-oldkern                                     pass
 build-amd64-pvops                                      pass
 build-armhf-pvops                                      pass
 build-i386-pvops                                       pass
 build-amd64-rumpuserxen                                pass
 build-i386-rumpuserxen                                 pass
 test-amd64-amd64-xl                                    pass
 test-armhf-armhf-xl                                    pass
 test-amd64-i386-xl                                     pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm          pass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm           pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm          pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm           pass
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm  fail
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm   fail
 test-amd64-amd64-libvirt-xsm                           pass
 test-armhf-armhf-libvirt-xsm                           pass
 test-amd64-i386-libvirt-x

Re: [Xen-devel] [PATCH v3 05/32] libxc: make arch_setup_meminit a xc_dom_arch hook

2015-07-27 Thread Roger Pau Monné
On 06/07/15 at 14:23, Andrew Cooper wrote:
> On 03/07/15 12:34, Roger Pau Monne wrote:
>> This allows having different arch_setup_meminit implementations based on the
>> guest type. It should not introduce any functional changes.
>>
>> Signed-off-by: Roger Pau Monné 
>> Cc: Ian Jackson 
>> Cc: Stefano Stabellini 
>> Cc: Ian Campbell 
>> Cc: Wei Liu 
> 
> Reviewed-by: Andrew Cooper 
> 
> However, would you mind doing a cleanup patch which changes the use of
> __init.  Confusingly given its precedent in hypervisor code, xc_dom.h
> has "#define __init __attribute__ ((constructor))".  At a first guess,
> 'initcall' or even just 'constructor' would be more appropriate names.

If you don't mind, I would rather do that after this series. I'm
already carrying a non-trivial number of patches, and doing something
like this in the middle of the series would be hard to maintain rebase-wise.

Roger.




Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-27 Thread George Dunlap
On Fri, Jul 24, 2015 at 5:09 PM, Konrad Rzeszutek Wilk
 wrote:
> On Fri, Jul 24, 2015 at 05:58:29PM +0200, Dario Faggioli wrote:
>> On Fri, 2015-07-24 at 17:24 +0200, Juergen Gross wrote:
>> > On 07/24/2015 05:14 PM, Juergen Gross wrote:
>> > > On 07/24/2015 04:44 PM, Dario Faggioli wrote:
>>
>> > >> In fact, I think that it is the topology, i.e., what comes from MSRs,
>> > >> that needs to adapt, and follow vNUMA, as much as possible. Do we agree
>> > >> on this?
>> > >
>> > > I think we have to be very careful here. I see two possible scenarios:
>> > >
>> > > 1) The vcpus are not pinned 1:1 on physical cpus. The hypervisor will
>> > > try to schedule the vcpus according to their numa affinity. So they
>> > > can change pcpus at any time in case of very busy guests. I don't
>> > > think the linux kernel should treat the cpus differently in this
>> > > case as it will be in vain regarding the Xen scheduler's activity.
>> > > So we should use the "null" topology in this case.
>> >
>> > Sorry, the topology should reflect the vcpu<->numa-node relations, of
>> > course, but nothing else (so flat topology in each numa node).
>> >
>> Yeah, I was replying to this point saying something like this right
>> now... Luckily, I've seen this email! :-P
>>
>> With this semantic, I fully agree with this.
>>
>> > > 2) The vcpus of the guest are all pinned 1:1 to physical cpus. The Xen
>> > > scheduler can't move vcpus between pcpus, so the linux kernel should
>> > > see the real topology of the used pcpus in order to optimize for this
>> > > picture.
>> > >
>> >
>> Mmm... I did think about this too, but I'm not sure. I see the value of
>> this of course, and the reason why it makes sense. However, pinning can
>> change on-line, via `xl vcpu-pin' and stuff. Also migration could make
>> things less certain, I think. What happens if we build on top of the
>> initial pinning, and then things change?
>>
>> To be fair, there is stuff building on top of the initial pinning
>> already, e.g., from which physical NUMA node we allocate the memory
>> depends exactly on that. That being said, I'm not sure I'm
>> comfortable with adding more of this...
>>
>> Perhaps introduce an 'immutable_pinning' flag, which will prevent
>> affinity from being changed, and then bind the topology to pinning only if
>> that one is set?
>>
>> > >> Maybe, there is room for "fixing" this at this level, hooking up inside
>> > >> the scheduler code... but I'm shooting in the dark, without having checked
>> > >> whether and how this could be really feasible, should I?
>> > >
>> > > Uuh, I don't think a change of the scheduler on behalf of Xen is really
>> > > appreciated. :-)
>> > >
>> I'm sure it would (have been! :-)) a true and giant nightmare!! :-D
>>
>> > >> One thing I don't like about this approach is that it would potentially
>> > >> solve vNUMA and other scheduling anomalies, but...
>> > >>
>> > >>> cpuid instruction is available for user mode as well.
>> > >>>
>> > >> ...it would not do any good for other subsystems, and user level code
>> > >> and apps.
>> > >
>> > > Indeed. I think the optimal solution would be two-fold: give the
>> > > scheduler the information it needs to react correctly via a
>> > > kernel patch not relying on cpuid values and fiddle with the cpuid
>> > > values from xen tools according to any needs of other subsystems and/or
>> > > user code (e.g. licensing).
>> >
>> So, just to check if my understanding is correct: you'd like to add an
>> abstraction layer, in Linux, like in generic (or, perhaps, scheduling)
>> code, to hide the direct interaction with CPUID.
>> Such layer, on baremetal, would just read CPUID while, on PV-ops, it'd
>> check with Xen/match vNUMA/whatever... Is this what you are saying?
>>
>> If yes, I think I like it...
>
> I don't think this is workable. For example there are applications
> which use 'cpuid' and figure out the core/thread and use it for their own
> scheduling purposes.

Can you expand a little on this?  I'm having trouble figuring out
exactly what user-space applications are reading and how they're using
it -- and, how they work currently in virtual environments, given that
they (typically) will be moved between physical processors even if
they stay on the same virtual processor.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-27 Thread George Dunlap
On Mon, Jul 27, 2015 at 5:35 AM, Juergen Gross  wrote:
> On 07/24/2015 06:44 PM, Boris Ostrovsky wrote:
>>
>> On 07/24/2015 12:39 PM, Juergen Gross wrote:
>>>
>>>
>>>
>>> I don't say mangling cpuids can't solve the scheduling problem. It
>>> surely can. But it can't solve the scheduling problem without hiding
>>> information like number of sockets or cores which might be required
>>> for license purposes. If we don't care, fine.
>>>
>>
>> (this is somewhat repeating the email I just sent)
>>
>> Why can't we construct socket/core info with CPUID (and *possibly* ACPI
>> changes) so that we present a reasonable (licensing-wise) picture?
>>
>> Can you suggest an example where it will not work and then maybe we can
>> figure something out?
>
>
> Let's assume software with a license based on core count. You have a
> system with two 8-core processors and hyperthreads enabled, summing up
> to 32 logical processors. Your license is valid for up to 16 cores, so
> running the software on bare metal on your system is fine.
>
> Now you are running the software inside a virtual machine with 24 vcpus
> in a cpupool with 24 logical cpus limited to 12 cores (6 cores of each
> processor). As we have to hide hyperthreading in order not to have
> to pin each vcpu to just a single logical processor, the topology
> resulting from this picture will have to present 24 cores. The license
> will not cover this hardware.

But how does doing a PV topology help this situation?  Because we're
telling one thing to the OS (via our PV interface) and another thing
to applications (via direct CPUID access)?

 -George



Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-27 Thread Andrew Cooper
On 27/07/15 11:41, George Dunlap wrote:
> On Fri, Jul 24, 2015 at 5:09 PM, Konrad Rzeszutek Wilk
>  wrote:
>> On Fri, Jul 24, 2015 at 05:58:29PM +0200, Dario Faggioli wrote:
>>> On Fri, 2015-07-24 at 17:24 +0200, Juergen Gross wrote:
>>>> On 07/24/2015 05:14 PM, Juergen Gross wrote:
>>>>> On 07/24/2015 04:44 PM, Dario Faggioli wrote:
>>>>>> In fact, I think that it is the topology, i.e., what comes from MSRs,
>>>>>> that needs to adapt, and follow vNUMA, as much as possible. Do we agree
>>>>>> on this?
>>>>> I think we have to be very careful here. I see two possible scenarios:
>>>>>
>>>>> 1) The vcpus are not pinned 1:1 on physical cpus. The hypervisor will
>>>>> try to schedule the vcpus according to their numa affinity. So they
>>>>> can change pcpus at any time in case of very busy guests. I don't
>>>>> think the linux kernel should treat the cpus differently in this
>>>>> case as it will be in vain regarding the Xen scheduler's activity.
>>>>> So we should use the "null" topology in this case.
>>>> Sorry, the topology should reflect the vcpu<->numa-node relations, of
>>>> course, but nothing else (so flat topology in each numa node).
>>>>
>>> Yeah, I was replying to this point saying something like this right
>>> now... Luckily, I've seen this email! :-P
>>>
>>> With this semantic, I fully agree with this.
>>>
>>>>> 2) The vcpus of the guest are all pinned 1:1 to physical cpus. The Xen
>>>>> scheduler can't move vcpus between pcpus, so the linux kernel should
>>>>> see the real topology of the used pcpus in order to optimize for this
>>>>> picture.
>>>>>
>>> Mmm... I did think about this too, but I'm not sure. I see the value of
>>> this of course, and the reason why it makes sense. However, pinning can
>>> change on-line, via `xl vcpu-pin' and stuff. Also migration could make
>>> things less certain, I think. What happens if we build on top of the
>>> initial pinning, and then things change?
>>>
>>> To be fair, there is stuff building on top of the initial pinning
>>> already, e.g., from which physical NUMA node we allocate the memory
>>> depends exactly on that. That being said, I'm not sure I'm
>>> comfortable with adding more of this...
>>>
>>> Perhaps introduce an 'immutable_pinning' flag, which will prevent
>>> affinity from being changed, and then bind the topology to pinning only if
>>> that one is set?
>>>
>>>>>> Maybe, there is room for "fixing" this at this level, hooking up inside
>>>>>> the scheduler code... but I'm shooting in the dark, without having checked
>>>>>> whether and how this could be really feasible, should I?
>>>>> Uuh, I don't think a change of the scheduler on behalf of Xen is really
>>>>> appreciated. :-)
>>>>>
>>> I'm sure it would (have been! :-)) a true and giant nightmare!! :-D
>>>
>>>>>> One thing I don't like about this approach is that it would potentially
>>>>>> solve vNUMA and other scheduling anomalies, but...
>>>>>>
>>>>>>> cpuid instruction is available for user mode as well.
>>>>>>>
>>>>>> ...it would not do any good for other subsystems, and user level code
>>>>>> and apps.
>>>>> Indeed. I think the optimal solution would be two-fold: give the
>>>>> scheduler the information it needs to react correctly via a
>>>>> kernel patch not relying on cpuid values and fiddle with the cpuid
>>>>> values from xen tools according to any needs of other subsystems and/or
>>>>> user code (e.g. licensing).
>>> So, just to check if my understanding is correct: you'd like to add an
>>> abstraction layer, in Linux, like in generic (or, perhaps, scheduling)
>>> code, to hide the direct interaction with CPUID.
>>> Such layer, on baremetal, would just read CPUID while, on PV-ops, it'd
>>> check with Xen/match vNUMA/whatever... Is this what you are saying?
>>>
>>> If yes, I think I like it...
>> I don't think this is workable. For example there are applications
>> which use 'cpuid' and figure out the core/thread and use it for their own
>> scheduling purposes.
> Can you expand a little on this?  I'm having trouble figuring out
> exactly what user-space applications are reading and how they're using
> it -- and, how they work currently in virtual environments, given that
> they (typically) will be moved between physical processors even if
> they stay on the same virtual processor.

There are many examples of userspace applications using cpuid to modify
themselves.  Any serious application with processor optimisations will
use the cpuid feature bits to choose the most efficient algorithm.

hwloc is a perfect example which gathers all of the topology
information out of cpuid to work out how to most efficiently
pin/schedule tasks.
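
As a rough illustration of the kind of thing such userspace code does (this is not hwloc's actual implementation), the sketch below walks CPUID leaf 0xB, the extended topology enumeration leaf, and records the level type, APIC-ID shift and logical processor count of each level. It assumes GCC or Clang on x86 and the `<cpuid.h>` helpers `__get_cpuid_max` and `__cpuid_count`; the struct and function names are invented for this example. On other architectures it simply reports no levels.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* One level of CPUID leaf 0xB (hypothetical container for this sketch). */
struct topo_level {
    unsigned int type;      /* ECX[15:8]: 1 = SMT, 2 = core */
    unsigned int shift;     /* EAX[4:0]: x2APIC ID shift to reach next level */
    unsigned int nr_lcpus;  /* EBX[15:0]: logical processors at this level */
};

/*
 * Fill 'out' with up to 'max' topology levels. Returns the number of
 * levels found, or 0 if leaf 0xB is unavailable or this is not x86.
 */
static unsigned int read_topology(struct topo_level *out, unsigned int max)
{
    unsigned int n = 0;

    assert(out != NULL);
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    if (__get_cpuid_max(0, NULL) < 0xb)
        return 0;

    while (n < max) {
        __cpuid_count(0xb, n, eax, ebx, ecx, edx);
        unsigned int type = (ecx >> 8) & 0xff;

        if (type == 0)          /* level type 0 marks the end of the list */
            break;
        out[n].type = type;
        out[n].shift = eax & 0x1f;
        out[n].nr_lcpus = ebx & 0xffff;
        n++;
    }
#else
    (void)max;
#endif
    return n;
}
```

Whatever a guest kernel learns through a PV interface, code like this sees only what the cpuid instruction returns, which is the source of the possible mismatch.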

As to how they work, that is very much an open question.  "function"
might be a better term.

~Andrew



Re: [Xen-devel] [PATCH OSSTEST v2] No longer export $OSSTEST_CONFIG

2015-07-27 Thread Ian Campbell
On Fri, 2015-07-24 at 18:03 +0100, Ian Jackson wrote:
> Ian Campbell writes ("[PATCH OSSTEST v2] No longer export 
> $OSSTEST_CONFIG"):
> > > From cri-args-hostlists or invoke-daemon.
> > 
> > All sites now have a suitable $HOME/.xen-osstest/settings in place
> > which does this.
> > 
> > Signed-off-by: Ian Campbell 
> > ---
> > This was waiting to be applied once " allow instance specific
> > settings" passed the Cambridge push gate, which happened ages ago.
> 
> Verified that,

I think you meant "passed the Cambridge push gate" and not "in place which
does this". I've just reconfirmed the latter too.

>  and
> 
> Acked-by: Ian Jackson 

Ta



Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-27 Thread Juergen Gross

On 07/27/2015 12:43 PM, George Dunlap wrote:

> On Mon, Jul 27, 2015 at 5:35 AM, Juergen Gross  wrote:
>> On 07/24/2015 06:44 PM, Boris Ostrovsky wrote:
>>> On 07/24/2015 12:39 PM, Juergen Gross wrote:
>>>>
>>>> I don't say mangling cpuids can't solve the scheduling problem. It
>>>> surely can. But it can't solve the scheduling problem without hiding
>>>> information like number of sockets or cores which might be required
>>>> for license purposes. If we don't care, fine.
>>>
>>> (this is somewhat repeating the email I just sent)
>>>
>>> Why can't we construct socket/core info with CPUID (and *possibly* ACPI
>>> changes) so that we present a reasonable (licensing-wise) picture?
>>>
>>> Can you suggest an example where it will not work and then maybe we can
>>> figure something out?
>>
>> Let's assume software with a license based on core count. You have a
>> system with two 8-core processors and hyperthreads enabled, summing up
>> to 32 logical processors. Your license is valid for up to 16 cores, so
>> running the software on bare metal on your system is fine.
>>
>> Now you are running the software inside a virtual machine with 24 vcpus
>> in a cpupool with 24 logical cpus limited to 12 cores (6 cores of each
>> processor). As we have to hide hyperthreading in order not to have
>> to pin each vcpu to just a single logical processor, the topology
>> resulting from this picture will have to present 24 cores. The license
>> will not cover this hardware.
>
> But how does doing a PV topology help this situation?  Because we're
> telling one thing to the OS (via our PV interface) and another thing
> to applications (via direct CPUID access)?


Exactly.

In my example it would even work to not modify the cpuid information at
all. The kernel wouldn't try to be extremely clever regarding scheduling
and the user land would see the cpuid information from the real hardware
(only the 12 cores it is running on, of course).
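
To make the arithmetic of the example concrete, here is a small C sketch. The struct and function names are made up for illustration, and the license check is a deliberately simplified model (it only counts cores); the numbers are the ones from the example.

```c
#include <assert.h>

/* Hypothetical description of a topology as seen by the licensed software. */
struct topo {
    unsigned int sockets;
    unsigned int cores_per_socket;
    unsigned int threads_per_core;
};

/* Logical processors = sockets * cores * hardware threads per core. */
static unsigned int logical_cpus(const struct topo *t)
{
    return t->sockets * t->cores_per_socket * t->threads_per_core;
}

/* Cores that a core-count based license check would see. */
static unsigned int visible_cores(const struct topo *t)
{
    assert(t->threads_per_core > 0);
    return t->sockets * t->cores_per_socket;
}

/* Returns 1 if the visible core count fits within the license. */
static int license_ok(const struct topo *t, unsigned int licensed_cores)
{
    return visible_cores(t) <= licensed_cores;
}
```

With a 16-core license, the bare-metal view {2, 8, 2} gives 32 logical cpus on 16 cores and passes. A guest view with hyperthreading hidden, {1, 24, 1}, shows 24 cores and fails, while the unmodified view of the cpupool, {2, 6, 2}, shows 12 cores and passes.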


Juergen



[Xen-devel] [PATCH v2 1/2] cr-daily-branch: Begin to support other reasons for forcing a baseline.

2015-07-27 Thread Ian Campbell
By converting the current boolean $force_baseline into a keyword
indicating the reason.

Signed-off-by: Ian Campbell 
Acked-by: Ian Jackson 
---
 cr-daily-branch | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/cr-daily-branch b/cr-daily-branch
index 34b6d2b..7e3e69e 100755
--- a/cr-daily-branch
+++ b/cr-daily-branch
@@ -47,7 +47,7 @@ determine_version () {
local tversionvar=$1
local tbranch=$2
local treevarwhich=$3
-   if [ "x$tbranch" = "x$branch" ] && ! $force_baseline; then
+   if [ "x$tbranch" = "x$branch" ] && [ "x$force_baseline" = x ]; then
 if [ "x$FORCE_REVISION" != x ]; then
 tversion="$FORCE_REVISION"
 else
@@ -70,7 +70,7 @@ fetch_version () {
 
 treeurl=`./ap-print-url $branch`
 
-force_baseline=false
+force_baseline='' # Non-empty = indication why we are forcing baseline.
 skipidentical=true
 wantpush=$OSSTEST_PUSH
 
@@ -91,7 +91,7 @@ if [ "x$OSSTEST_NO_BASELINE" != xy ] ; then
if [ "x$testedflight" = x ]; then
wantpush=false
skipidentical=false
-   force_baseline=true
+   force_baseline='untested'
if [ "x$treeurl" != xnone: ]; then
treearg=--tree-$tree=$treeurl
fi
@@ -248,7 +248,8 @@ heading=tmp/$flight.heading-info
 : >$heading
 sgr_args+=" --info-headers --include-begin=$heading"
 
-if $force_baseline; then
+case "$force_baseline" in
+untested)
subject_prefix="[$branch baseline test] "
cat >>$heading 

Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-27 Thread Andrew Cooper
On 27/07/15 11:43, George Dunlap wrote:
> On Mon, Jul 27, 2015 at 5:35 AM, Juergen Gross  wrote:
>> On 07/24/2015 06:44 PM, Boris Ostrovsky wrote:
>>> On 07/24/2015 12:39 PM, Juergen Gross wrote:
>>>>
>>>> I don't say mangling cpuids can't solve the scheduling problem. It
>>>> surely can. But it can't solve the scheduling problem without hiding
>>>> information like number of sockets or cores which might be required
>>>> for license purposes. If we don't care, fine.
>>>>
>>> (this is somewhat repeating the email I just sent)
>>>
>>> Why can't we construct socket/core info with CPUID (and *possibly* ACPI
>>> changes) so that we present a reasonable (licensing-wise) picture?
>>>
>>> Can you suggest an example where it will not work and then maybe we can
>>> figure something out?
>>
>> Let's assume software with a license based on core count. You have a
>> system with two 8-core processors and hyperthreads enabled, summing up
>> to 32 logical processors. Your license is valid for up to 16 cores, so
>> running the software on bare metal on your system is fine.
>>
>> Now you are running the software inside a virtual machine with 24 vcpus
>> in a cpupool with 24 logical cpus limited to 12 cores (6 cores of each
>> processor). As we have to hide hyperthreading in order not to have
>> to pin each vcpu to just a single logical processor, the topology
>> resulting from this picture will have to present 24 cores. The license
>> will not cover this hardware.
> But how does doing a PV topology help this situation?  Because we're
> telling one thing to the OS (via our PV interface) and another thing
> to applications (via direct CPUID access)?

I expressed exactly these concerns right back at the start of the vnuma
work.

The OS and its userspace can and will use cpuid.  Most examples will
only use cpuid.  The only thing worse than providing no NUMA information
at all is providing conflicting information between cpuid and vnuma.

IMO, HVM guests should get all their NUMA information from the same
sources as native hardware would provide.  PV guests are admittedly
harder as in general we cannot hide the real topology information in
cpuid.

~Andrew



[Xen-devel] [PATCH v2 2/2] cambridge: arrange to test each new baseline

2015-07-27 Thread Ian Campbell
Provide a new cr-daily-branch setting OSSTEST_BASELINES_ONLY which
causes it to only attempt to test the current baseline (if it is
untested) and never the tip version. Such tests will not result in any
push.

Add a cronjob to Cambridge which runs in this manner, ensuring that
there will usually be some sort of reasonably up to date baseline for
any given branch which can be used for comparisons in adhoc testing or
bisections.

This will also give us some data on the success of various branches on
the set of machines in Cambridge, which can be useful/interesting.

Signed-off-by: Ian Campbell 
Acked-by: Ian Jackson 
---
v2: Wording tweak.
---
 cr-daily-branch   | 13 -
 crontab-cambridge |  1 +
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/cr-daily-branch b/cr-daily-branch
index 7e3e69e..ed8f8c0 100755
--- a/cr-daily-branch
+++ b/cr-daily-branch
@@ -85,7 +85,11 @@ check_tested () {
  "$@"
 }
 
-if [ "x$OSSTEST_NO_BASELINE" != xy ] ; then
+if [ "x$OSSTEST_BASELINES_ONLY" = xy ] ; then
+force_baseline=baselines-only
+wantpush=false
+skipidentical=true
+elif [ "x$OSSTEST_NO_BASELINE" != xy ] ; then
testedflight=`check_tested --revision-$tree="$OLD_REVISION"`
 
if [ "x$testedflight" = x ]; then
@@ -258,6 +262,13 @@ any, is the most recent actually tested revision.
 
 END
 ;;
+baselines-only)
+#subject-prefix="[... ] "
+cat >> $heading 

Re: [Xen-devel] [PATCHv2 10/10] xen/balloon: pre-allocate p2m entries for ballooned pages

2015-07-27 Thread Julien Grall
On 27/07/15 10:30, David Vrabel wrote:
> On 25/07/15 00:21, Julien Grall wrote:
>> On 24/07/2015 12:47, David Vrabel wrote:
>>> @@ -550,6 +551,11 @@ int alloc_xenballooned_pages(int nr_pages, struct
>>> page **pages)
>>>   page = balloon_retrieve(true);
>>>   if (page) {
>>>   pages[pgno++] = page;
>>> +#ifdef CONFIG_XEN_HAVE_PVMMU
>>> +ret = xen_alloc_p2m_entry(page_to_pfn(page));
>>
>> Don't you want to call this function only when the guest is not using
>> auto-translated physmap?
> 
> xen_alloc_p2m_entry() is a nop in auto-xlate guests, so no need for an
> additional check here.

I don't have the impression that this is the case, or at least it's not
obvious.

For instance xen_p2m_addr, used within xen_alloc_p2m_entry (old name
alloc_p2m), is never set for auto-xlate guests. Therefore its value
is 0.

Same for p2m_identity and p2m_missing & co.

Regards,

-- 
Julien Grall



Re: [Xen-devel] [PATCH] x86/HVM: honor p2m_ram_ro in hvm_map_guest_frame_rw()

2015-07-27 Thread Tim Deegan
At 13:02 +0100 on 24 Jul (1437742964), Andrew Cooper wrote:
> On 24/07/15 10:41, Jan Beulich wrote:
> > Beyond that log-dirty handling in _hvm_map_guest_frame() looks bogus
> > too: What if a XEN_DOMCTL_SHADOW_OP_* gets issued and acted upon
> > between the setting of the dirty flag and the actual write happening?
> > I.e. shouldn't the flag instead be set in hvm_unmap_guest_frame()?
> 
> It does indeed.  (Ideally the dirty bit should probably be held high for 
> the duration that a mapping exists, but that is absolutely infeasible to 
> do).

IMO that would not be very useful -- a well-behaved toolstack will
have to make sure that relevant mappings are torn down before
stop-and-copy.  Forcing the dirty bit high in the meantime just makes
every intermediate pass send a wasted copy of the page, without
actually closing the race window if the tools are buggy.

If we want to catch these bugs, it might be useful to have a flag
that the tools can set when stop-and-copy begins, to indicate any
subsequent mark_dirty() calls are "too late".
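
A minimal sketch of that idea, just to show the shape of it: the struct, the field names and both functions below are hypothetical and are not existing Xen interfaces. The real dirty-bitmap update is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-domain log-dirty state for this sketch only. */
struct logdirty_sketch {
    bool stop_and_copy;        /* set by the tools when the final pass begins */
    unsigned long late_marks;  /* dirty marks that arrived after that point */
};

/* Called (in this sketch) when the toolstack starts stop-and-copy. */
static void sketch_begin_stop_and_copy(struct logdirty_sketch *s)
{
    assert(s != NULL);
    s->stop_and_copy = true;
}

/*
 * Stand-in for a mark_dirty() path: returns true if the mark was in time,
 * false (and counts it) if it came after stop-and-copy began.
 */
static bool sketch_mark_dirty(struct logdirty_sketch *s, unsigned long pfn)
{
    assert(s != NULL);
    (void)pfn;  /* a real implementation would set the bit for pfn here */

    if (s->stop_and_copy) {
        s->late_marks++;
        return false;
    }
    return true;
}
```

A non-zero late_marks count at the end of migration would then point at a mapping that was not torn down in time.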

Cheers,

Tim.



Re: [Xen-devel] [PATCH v5 0/6] libxl: xs_restrict QEMU

2015-07-27 Thread Fabio Fantoni

On 23/07/2015 19:26, Stefano Stabellini wrote:

Hi all,

this patch series changes libxl to start QEMU as device model with the
new xsrestrict option (http://marc.info/?l=xen-devel&m=143341692707358).
It also starts a second QEMU to provide PV backends in userspace (qdisk)
to HVM guests.


Hi, I'm interested in testing this series.
Are the Xen patch "run QEMU as non-root" and the QEMU patch linked above
the only prerequisites, or are others needed?
I saw that the second patch is marked as [WIP]; is it usable, or must I
wait for it to be completed before testing this series?


Thanks for any reply, and sorry for my bad English.




Changes in v5:
- improve commit messages with security details

Changes in v4:
- update xenstore-paths.markdown
- add error message in case count > MAX_PHYSMAP_ENTRIES
- add a note to xenstore-paths.markdown about the possible change in
privilege level
- only change permissions if xsrestrict is supported

Changes in v3:
- use LIBXL_TOOLSTACK_DOMID instead of 0 in the commit message
- update commit message with more info on why it is safe
- add a limit on the number of physmap entries to save and restore
- add emulator_ids
- mark patch #3 as WIP
- use LIBXL_TOOLSTACK_DOMID instead of 0 in the commit message
- change xs path to include the emulator_id
- change qdisk-backend-pid path on xenstore
- use dcs->dmss.pvqemu to spawn the second QEMU
- keep track of the rc of both QEMUs before proceeding


Stefano Stabellini (6):
   libxl: do not add a vkb backend to hvm guests
   [WIP] libxl: xsrestrict QEMU
   libxl: allow /local/domain/$LIBXL_TOOLSTACK_DOMID/device-model/$DOMID to be written by $DOMID
   libxl: change xs path for QEMU
   libxl: change qdisk-backend-pid path on xenstore
   libxl: spawns two QEMUs for HVM guests

  docs/misc/xenstore-paths.markdown |   30 --
  tools/libxl/libxl.c   |2 +-
  tools/libxl/libxl_create.c|   58 +--
  tools/libxl/libxl_device.c|2 +-
  tools/libxl/libxl_dm.c|  115 +
  tools/libxl/libxl_dom.c   |   19 --
  tools/libxl/libxl_internal.c  |   19 --
  tools/libxl/libxl_internal.h  |   15 -
  tools/libxl/libxl_pci.c   |   14 ++---
  tools/libxl/libxl_utils.c |   10 
  10 files changed, 225 insertions(+), 59 deletions(-)

Cheers,

Stefano



[Xen-devel] [PATCH v5 03/22] xen: Add log2 functionality

2015-07-27 Thread vijay . kilari
From: Vijaya Kumar K 

The log2 helper APIs are ported from Linux
commit 13c07b0286d340275f2d97adf085cecda37ede37
(linux/log2.h: Fix rounddown_pow_of_two(1)).
Changes made for Xen are:
  - Only required functionality is retained
  - Replace fls_long with flsl

Signed-off-by: Vijaya Kumar K 
CC: Ian Campbell 
CC: Ian Jackson 
CC: Jan Beulich 
CC: Keir Fraser 
CC: Tim Deegan 
---
v4: - Only retained required functionality
- Replaced fls_long with flsl
- Removed fls_long implementation in bitops.h in v3 version
---
 xen/include/xen/log2.h |  167 
 1 file changed, 167 insertions(+)

diff --git a/xen/include/xen/log2.h b/xen/include/xen/log2.h
new file mode 100644
index 000..86bd861
--- /dev/null
+++ b/xen/include/xen/log2.h
@@ -0,0 +1,167 @@
+/* 
+ * Copyright (C) 2006 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowe...@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _XEN_LOG2_H
+#define _XEN_LOG2_H
+
+#include 
+#include 
+
+/*
+ * deal with unrepresentable constant logarithms
+ */
+extern __attribute__((const))
+int ilog2_NaN(void);
+
+/*
+ * non-constant log of base 2 calculators
+ * - the arch may override these in asm/bitops.h if they can be implemented
+ *   more efficiently than using fls() and fls64()
+ * - the arch is not required to handle n==0 if implementing the fallback
+ */
+static inline __attribute__((const))
+int __ilog2_u32(u32 n)
+{
+   return fls(n) - 1;
+}
+
+static inline __attribute__((const))
+int __ilog2_u64(u64 n)
+{
+   return flsl(n) - 1;
+}
+
+/*
+ * round up to nearest power of two
+ */
+static inline __attribute__((const))
+unsigned long __roundup_pow_of_two(unsigned long n)
+{
+   return 1UL << flsl(n - 1);
+}
+
+/**
+ * ilog2 - log of base 2 of 32-bit or a 64-bit unsigned value
+ * @n - parameter
+ *
+ * constant-capable log of base 2 calculation
+ * - this can be used to initialise global variables from constant data, hence
+ *   the massive ternary operator construction
+ *
+ * selects the appropriately-sized optimised version depending on sizeof(n)
+ */
+#define ilog2(n)   \
+(  \
+   __builtin_constant_p(n) ? ( \
+   (n) < 1 ? ilog2_NaN() : \
+   (n) & (1ULL << 63) ? 63 :   \
+   (n) & (1ULL << 62) ? 62 :   \
+   (n) & (1ULL << 61) ? 61 :   \
+   (n) & (1ULL << 60) ? 60 :   \
+   (n) & (1ULL << 59) ? 59 :   \
+   (n) & (1ULL << 58) ? 58 :   \
+   (n) & (1ULL << 57) ? 57 :   \
+   (n) & (1ULL << 56) ? 56 :   \
+   (n) & (1ULL << 55) ? 55 :   \
+   (n) & (1ULL << 54) ? 54 :   \
+   (n) & (1ULL << 53) ? 53 :   \
+   (n) & (1ULL << 52) ? 52 :   \
+   (n) & (1ULL << 51) ? 51 :   \
+   (n) & (1ULL << 50) ? 50 :   \
+   (n) & (1ULL << 49) ? 49 :   \
+   (n) & (1ULL << 48) ? 48 :   \
+   (n) & (1ULL << 47) ? 47 :   \
+   (n) & (1ULL << 46) ? 46 :   \
+   (n) & (1ULL << 45) ? 45 :   \
+   (n) & (1ULL << 44) ? 44 :   \
+   (n) & (1ULL << 43) ? 43 :   \
+   (n) & (1ULL << 42) ? 42 :   \
+   (n) & (1ULL << 41) ? 41 :   \
+   (n) & (1ULL << 40) ? 40 :   \
+   (n) & (1ULL << 39) ? 39 :   \
+   (n) & (1ULL << 38) ? 38 :   \
+   (n) & (1ULL << 37) ? 37 :   \
+   (n) & (1ULL << 36) ? 36 :   \
+   (n) & (1ULL << 35) ? 35 :   \
+   (n) & (1ULL << 34) ? 34 :   \
+   (n) & (1ULL << 33) ? 33 :   \
+   (n) & (1ULL << 32) ? 32 :   \
+   (n) & (1ULL << 31) ? 31 :   \
+   (n) & (1ULL << 30) ? 30 :   \
+   (n) & (1ULL << 29) ? 29 :   \
+   (n) & (1ULL << 28) ? 28 :   \
+   (n) & (1ULL << 27) ? 27 :   \
+   (n) & (1ULL << 26) ? 26 :   \
+   (n) & (1ULL << 25) ? 25 :   \
+   (n) & (1ULL << 24) ? 24 :   \
+   (n) & (1ULL << 23) ? 23 :   \
+   (n) & (1ULL << 22) ? 22 :   \
+   (n) & (1ULL << 21) ? 21 :   \
+   (n) & (1ULL << 20) ? 20 :   \
+   (n) & (1ULL << 19) ? 19 :   \
+   (n) & (1ULL << 18) ? 18 :   \
+   (n) & (1ULL << 17) ? 17 :   \
+   (n) & (1ULL << 16) ? 16 :   \
+   (n) & (1ULL << 15) ? 15 :

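For reference, the non-constant helpers in this patch reduce to "find last set bit" arithmetic. The sketch below reproduces that arithmetic outside Xen, with a naive loop standing in for flsl(); the function names are invented for this example, and the round-up helper is only valid while the result still fits in an unsigned long.

```c
#include <assert.h>

/*
 * Portable stand-in for flsl(): 1-based index of the most significant
 * set bit, or 0 if x == 0.
 */
static unsigned int sketch_flsl(unsigned long x)
{
    unsigned int r = 0;

    while (x) {
        r++;
        x >>= 1;
    }
    return r;
}

/* floor(log2(n)) for n != 0, i.e. flsl(n) - 1 as in __ilog2_u64(). */
static int sketch_ilog2(unsigned long n)
{
    assert(n != 0);
    return (int)sketch_flsl(n) - 1;
}

/*
 * Round n up to the next power of two, as in __roundup_pow_of_two():
 * 1UL << flsl(n - 1). Undefined if the result would not fit.
 */
static unsigned long sketch_roundup_pow_of_two(unsigned long n)
{
    assert(n != 0);
    return 1UL << sketch_flsl(n - 1);
}
```

Subtracting one before taking flsl() is what makes exact powers of two map to themselves instead of to the next power.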
[Xen-devel] [PATCH v5 02/22] xen/arm: Add bitmap_find_next_zero_area helper function

2015-07-27 Thread vijay . kilari
From: Vijaya Kumar K 

The bitmap_find_next_zero_area helper function will be used
by the physical ITS driver. It is imported from Linux 4.2.

Signed-off-by: Vijaya Kumar K 
CC: Ian Campbell 
CC: Ian Jackson 
CC: Jan Beulich 
CC: Keir Fraser 
CC: Tim Deegan 
---
v5: Ported from Linux 4.2.
Added bitmap_find_next_zero_area_off().
v4: Removed spaces and added tabs
Moved ALIGN macro to lib.h
v3: Moved changes to xen/common/bitmap.c and
xen/include/xen/bitmap.h
---
 xen/common/bitmap.c  |   39 +++
 xen/include/xen/bitmap.h |   16 
 xen/include/xen/lib.h|2 ++
 3 files changed, 57 insertions(+)

diff --git a/xen/common/bitmap.c b/xen/common/bitmap.c
index 61d1ea4..ad665d1 100644
--- a/xen/common/bitmap.c
+++ b/xen/common/bitmap.c
@@ -489,6 +489,45 @@ int bitmap_allocate_region(unsigned long *bitmap, int pos, int order)
 }
 EXPORT_SYMBOL(bitmap_allocate_region);
 
+/*
+ * bitmap_find_next_zero_area_off - find a contiguous aligned zero area
+ * @map: The address to base the search on
+ * @size: The bitmap size in bits
+ * @start: The bitnumber to start searching at
+ * @nr: The number of zeroed bits we're looking for
+ * @align_mask: Alignment mask for zero area
+ * @align_offset: Alignment offset for zero area.
+ *
+ * The @align_mask should be one less than a power of 2; the effect is that
+ * the bit offset of all zero areas this function finds plus @align_offset
+ * is multiple of that power of 2.
+ */
+unsigned long bitmap_find_next_zero_area_off(unsigned long *map,
+unsigned long size,
+unsigned long start,
+unsigned int nr,
+unsigned long align_mask,
+unsigned long align_offset)
+{
+   unsigned long index, end, i;
+again:
+   index = find_next_zero_bit(map, size, start);
+
+   /* Align allocation */
+   index = ALIGN_MASK(index + align_offset, align_mask) - align_offset;
+
+   end = index + nr;
+   if (end > size)
+   return end;
+   i = find_next_bit(map, end, index);
+   if (i < end) {
+   start = i + 1;
+   goto again;
+   }
+   return index;
+}
+EXPORT_SYMBOL(bitmap_find_next_zero_area_off);
+
 #ifdef __BIG_ENDIAN
 
 void bitmap_long_to_byte(uint8_t *bp, const unsigned long *lp, int nbits)
diff --git a/xen/include/xen/bitmap.h b/xen/include/xen/bitmap.h
index e2a3686..161f990 100644
--- a/xen/include/xen/bitmap.h
+++ b/xen/include/xen/bitmap.h
@@ -101,6 +101,22 @@ extern int bitmap_scnlistprintf(char *buf, unsigned int len,
 extern int bitmap_find_free_region(unsigned long *bitmap, int bits, int order);
 extern void bitmap_release_region(unsigned long *bitmap, int pos, int order);
 extern int bitmap_allocate_region(unsigned long *bitmap, int pos, int order);
+extern unsigned long bitmap_find_next_zero_area_off(unsigned long *map,
+   unsigned long size,
+   unsigned long start,
+   unsigned int nr,
+   unsigned long align_mask,
+   unsigned long align_offset);
+
+static inline unsigned long bitmap_find_next_zero_area(unsigned long *map,
+  unsigned long size,
+  unsigned long start,
+  unsigned int nr,
+  unsigned long align_mask)
+{
+   return bitmap_find_next_zero_area_off(map, size, start, nr,
+ align_mask, 0);
+}
 
 #define BITMAP_LAST_WORD_MASK(nbits)   \
 (  \
diff --git a/xen/include/xen/lib.h b/xen/include/xen/lib.h
index 4258912..e7d9d95 100644
--- a/xen/include/xen/lib.h
+++ b/xen/include/xen/lib.h
@@ -55,6 +55,8 @@
 
 #define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))
 
+#define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
+
 #define reserve_bootmem(_p,_l) ((void)0)
 
 struct domain;
-- 
1.7.9.5

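The search loop in this patch can be tried outside Xen with the sketch below. It is an approximation under stated assumptions: find_next_bit() and find_next_zero_bit() are replaced by a naive bit-by-bit scan, the align_offset parameter is dropped (i.e. it behaves like the bitmap_find_next_zero_area() wrapper), and all names are local to this example.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))

static int sketch_test_bit(const unsigned long *map, unsigned long bit)
{
    return (int)((map[bit / SKETCH_BITS_PER_LONG] >>
                  (bit % SKETCH_BITS_PER_LONG)) & 1UL);
}

/* Naive scan: first bit >= start whose value equals 'want', else 'size'. */
static unsigned long sketch_next_bit(const unsigned long *map,
                                     unsigned long size,
                                     unsigned long start, int want)
{
    unsigned long i;

    for (i = start; i < size; i++)
        if (sketch_test_bit(map, i) == want)
            return i;
    return size;
}

/*
 * Find 'nr' contiguous zero bits at an index aligned to (align_mask + 1).
 * Returns the start index, or a value greater than 'size' if none fits.
 */
static unsigned long sketch_find_zero_area(const unsigned long *map,
                                           unsigned long size,
                                           unsigned long start,
                                           unsigned int nr,
                                           unsigned long align_mask)
{
    unsigned long index, end, i;

    assert(nr > 0);
    for (;;) {
        index = sketch_next_bit(map, size, start, 0);
        index = SKETCH_ALIGN_MASK(index, align_mask);
        end = index + nr;
        if (end > size)
            return end;                 /* no room left */
        i = sketch_next_bit(map, end, index, 1);
        if (i >= end)
            return index;               /* [index, end) is all zero */
        start = i + 1;                  /* retry past the set bit */
    }
}
```

For a bitmap with bits 0, 1 and 4 set and a size of 16 bits, a request for two bits returns 2, a request for three bits returns 5, and a request for two bits with align_mask 3 returns 8.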



[Xen-devel] [PATCH v5 04/22] xen/arm: Set nr_cpu_ids to available number of cpus

2015-07-27 Thread vijay . kilari
From: Vijaya Kumar K 

nr_cpu_ids for ARM platforms is set to 128 irrespective of the
number of CPUs supported by the platform. Set it to the number of
CPUs actually available instead.

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/arm/setup.c  |1 +
 xen/arch/arm/smpboot.c|   11 +++
 xen/include/asm-arm/smp.h |1 +
 3 files changed, 13 insertions(+)

diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index a46c583..6ca787b 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -772,6 +772,7 @@ void __init start_xen(unsigned long boot_phys_offset,
 
 smp_init_cpus();
 cpus = smp_get_max_cpus();
+set_nr_cpu_ids(cpus);
 
 init_xen_time();
 
diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index a96cda2..233ad95 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -243,6 +243,17 @@ void __init smp_init_cpus(void)
 }
 }
 
+void __init set_nr_cpu_ids(unsigned int max_cpus)
+{
+if ( !max_cpus )
+max_cpus = 1;
+    else if ( max_cpus > NR_CPUS )
+max_cpus = NR_CPUS;
+
+printk(XENLOG_INFO "SMP: Allowing %u CPUs\n", max_cpus);
+nr_cpu_ids = max_cpus;
+}
+
 int __init
 smp_get_max_cpus (void)
 {
diff --git a/xen/include/asm-arm/smp.h b/xen/include/asm-arm/smp.h
index 91b1e52..c23bd0f 100644
--- a/xen/include/asm-arm/smp.h
+++ b/xen/include/asm-arm/smp.h
@@ -29,6 +29,7 @@ extern void init_secondary(void);
 extern void smp_init_cpus(void);
 extern void smp_clear_cpu_maps (void);
 extern int smp_get_max_cpus (void);
+extern void set_nr_cpu_ids(unsigned int max_cpus);
 #endif
 
 /*
-- 
1.7.9.5

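The clamping done by set_nr_cpu_ids() can be checked in isolation with the sketch below. The limit of 128 is taken from the commit message and is only an assumption about the build-time NR_CPUS value; the names are local to this example and the printk is omitted.

```c
#include <assert.h>

#define SKETCH_NR_CPUS 128u  /* assumed build-time limit, see lead-in */

/* Clamp the detected CPU count into the range [1, SKETCH_NR_CPUS]. */
static unsigned int sketch_clamp_nr_cpu_ids(unsigned int max_cpus)
{
    assert(SKETCH_NR_CPUS >= 1);

    if (max_cpus == 0)
        return 1;
    if (max_cpus > SKETCH_NR_CPUS)
        return SKETCH_NR_CPUS;
    return max_cpus;
}
```

A platform reporting 48 CPUs therefore ends up with nr_cpu_ids of 48 rather than 128, while a report of 0 falls back to 1.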



[Xen-devel] [PATCH v5 00/22] Add ITS support

2015-07-27 Thread vijay . kilari
From: Vijaya Kumar K 

This is based on DraftG version
http://xenbits.xen.org/people/ianc/vits/draftG.pdf

The following major features are supported:
 - GICv3 ITS support for arm64 platform
 - Only Dom0 is supported. DomU support requires the PCI
   passthrough feature.

Basic boot is tested with a single ITS node by adding
and assigning devices from platform initialization.

Changes in v5:
  - Added following new patches
  0001-xen-arm-Return-success-if-dt-node-does-not-have-irq-.patch
  0004-xen-arm-Set-nr_cpu_ids-to-available-number-of-cpus.patch
  0009-xen-arm-ITS-Export-ITS-info-to-Virtual-ITS.patch
  0013-xen-arm-ITS-Implement-gic_is_lpi-helper-function.patch
  - Split patch #12 into two patches #14 & #16
  0014-xen-arm-ITS-Allocate-irq-descriptors-for-LPIs.patch
  0016-xen-arm-ITS-Route-LPIs.patch
  - Introduce new API to route LPI (route_lpi_to_guest() )
  - Moved patch #8 in v4 as patch #19
  - irq descriptors for LPIs are allocated dynamically
  - Removed RB-tree for managing virtual ITS devices. Will be
    introduced when pci-passthrough is implemented
  - its_add_device() api now takes nr_ites and DT its node as parameters
  - some functions are kept as non-static when introduced in a patch for
    compilation purposes and eventually made static when used.
  - Tested compilation for arm32

Changes in v4:
  - Patch for rate limiting of error message is removed.
  - Patches #4 and #5 in v3 are merged
  - Merged #13 and #16 as one patch
  - hw_irq_controller is implemented for LPIs
  - GITS and GICR emulation for LPIs in separate patches
  - Removed build functions for ITS command in physical ITS driver
  - Added new patch to add and assign devices from platform file
  - Enable compilation of vits and pits driver in separate patch
  - Replace the msi-parent property in all PCI DT nodes with a single
    ITS node generated by Xen for Dom0

Vijaya Kumar K (22):
  xen/arm: Return success if dt node does not have irq mapping
  xen/arm: Add bitmap_find_next_zero_area helper function
  xen: Add log2 functionality
  xen/arm: Set nr_cpu_ids to available number of cpus
  xen/arm: ITS: Port ITS driver to Xen
  xen/arm: ITS: Add helper functions to manage its_devices
  xen/arm: ITS: Add virtual ITS driver
  xen/arm: ITS: Add virtual ITS commands support
  xen/arm: ITS: Export ITS info to Virtual ITS
  xen/arm: ITS: Add GITS registers emulation
  xen/arm: ITS: Enable physical and virtual ITS driver compilation
  xen/arm: ITS: Add GICR register emulation
  xen/arm: ITS: Implement gic_is_lpi helper function
  xen/arm: ITS: Allocate irq descriptors for LPIs
  xen/arm: ITS: implement hw_irq_controller for LPIs
  xen/arm: ITS: Route LPIs
  xen/arm: ITS: Initialize physical ITS
  xen/arm: ITS: Add domain specific ITS initialization
  xen/arm: ITS: Add APIs to add and assign device
  xen/arm: ITS: Map ITS translation space
  xen/arm: ITS: Generate ITS node for Dom0
  xen/arm: ITS: Add pci devices in ThunderX

 xen/arch/arm/Makefile |2 +
 xen/arch/arm/domain_build.c   |   17 +
 xen/arch/arm/gic-hip04.c  |   20 +-
 xen/arch/arm/gic-v2.c |   20 +-
 xen/arch/arm/gic-v3-its.c | 1527 +
 xen/arch/arm/gic-v3.c |  127 ++-
 xen/arch/arm/gic.c|   80 +-
 xen/arch/arm/irq.c|  195 -
 xen/arch/arm/platforms/Makefile   |1 +
 xen/arch/arm/platforms/thunderx.c |  151 
 xen/arch/arm/setup.c  |1 +
 xen/arch/arm/smpboot.c|   11 +
 xen/arch/arm/vgic-v2.c|5 +-
 xen/arch/arm/vgic-v3-its.c| 1178 
 xen/arch/arm/vgic-v3.c|  131 +++-
 xen/arch/arm/vgic.c   |   88 ++-
 xen/common/bitmap.c   |   39 +
 xen/common/device_tree.c  |2 +-
 xen/include/asm-arm/domain.h  |6 +
 xen/include/asm-arm/gic-its.h |  393 ++
 xen/include/asm-arm/gic.h |   44 +-
 xen/include/asm-arm/gic_v3_defs.h |   46 +-
 xen/include/asm-arm/irq.h |   18 +-
 xen/include/asm-arm/smp.h |1 +
 xen/include/asm-arm/vgic.h|5 +
 xen/include/xen/bitmap.h  |   16 +
 xen/include/xen/lib.h |2 +
 xen/include/xen/log2.h|  167 
 28 files changed, 4228 insertions(+), 65 deletions(-)
 create mode 100644 xen/arch/arm/gic-v3-its.c
 create mode 100644 xen/arch/arm/platforms/thunderx.c
 create mode 100644 xen/arch/arm/vgic-v3-its.c
 create mode 100644 xen/include/asm-arm/gic-its.h
 create mode 100644 xen/include/xen/log2.h

-- 
1.7.9.5




[Xen-devel] [PATCH v5 06/22] xen/arm: ITS: Add helper functions to manage its_devices

2015-07-27 Thread vijay . kilari
From: Vijaya Kumar K 

Helper functions to manage ITS devices using an RB-tree
are introduced in the physical ITS driver.

The tree is a global list of all the devices.

Signed-off-by: Vijaya Kumar K 
Acked-by: Ian Campbell 
Reviewed-by: Julien Grall 
---
v5: - Added assert on spinlock
v4: - Remove passing of root node as parameter
- Declare prototype in header file
- Rename find_its_device to its_find_device
---
 xen/arch/arm/gic-v3-its.c |   53 +
 xen/include/asm-arm/gic-its.h |3 +++
 2 files changed, 56 insertions(+)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index ba4110f..aa4d3c5 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -93,6 +93,8 @@ struct its_node {
 static LIST_HEAD(its_nodes);
 static DEFINE_SPINLOCK(its_lock);
 static struct rdist_prop  *gic_rdists;
+static struct rb_root rb_its_dev;
+static DEFINE_SPINLOCK(rb_its_dev_lock);
 
 #define gic_data_rdist()(this_cpu(rdist))
 
@@ -107,6 +109,55 @@ void dump_cmd(its_cmd_block *cmd)
 void dump_cmd(its_cmd_block *cmd) { do {} while ( 0 ); }
 #endif
 
+/* RB-tree helpers for its_device */
+struct its_device *its_find_device(u32 devid)
+{
+struct rb_node *node = rb_its_dev.rb_node;
+
+ASSERT(spin_is_locked(&rb_its_dev_lock));
+while ( node )
+{
+struct its_device *dev;
+
+dev = container_of(node, struct its_device, node);
+if ( devid < dev->device_id )
+node = node->rb_left;
+else if ( devid > dev->device_id )
+node = node->rb_right;
+else
+return dev;
+}
+
+return NULL;
+}
+
+int its_insert_device(struct its_device *dev)
+{
+struct rb_node **new, *parent;
+
+ASSERT(spin_is_locked(&rb_its_dev_lock));
+new = &rb_its_dev.rb_node;
+parent = NULL;
+while ( *new )
+{
+struct its_device *this;
+
+this  = container_of(*new, struct its_device, node);
+parent = *new;
+if ( dev->device_id < this->device_id )
+new = &((*new)->rb_left);
+else if ( dev->device_id > this->device_id )
+new = &((*new)->rb_right);
+else
+return -EEXIST;
+}
+
+rb_link_node(&dev->node, parent, new);
+rb_insert_color(&dev->node, &rb_its_dev);
+
+return 0;
+}
+
 #define ITS_CMD_QUEUE_SZSZ_64K
 #define ITS_CMD_QUEUE_NR_ENTRIES(ITS_CMD_QUEUE_SZ / sizeof(its_cmd_block))
 
@@ -942,6 +993,8 @@ static int its_probe(struct dt_device_node *node)
 list_add(&its->entry, &its_nodes);
 spin_unlock(&its_lock);
 
+rb_its_dev = RB_ROOT;
+
 return 0;
 
 out_free_tables:
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index 5f44d5f..3daba4b 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -19,6 +19,7 @@
 #define __ASM_ARM_GIC_ITS_H__
 
 #include 
+#include 
 
 /*
  * ITS registers, offsets from ITS_base
@@ -259,6 +260,8 @@ struct its_device {
 u32 nr_lpis;
 /* Physical Device id */
 u32 device_id;
+/* RB-tree entry */
+struct rb_node  node;
 };
 
 int its_init(struct rdist_prop *rdists);
-- 
1.7.9.5
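
For readers following the lookup and insert logic in this patch, here is a minimal standalone sketch of the same key-ordered descent. It is an illustration only: demo_dev, demo_find and demo_insert are made-up names, the tree is a plain unbalanced binary search tree, and the rebalancing (rb_insert_color()) and the lock assertion from the patch are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct its_device: only the key and tree links. */
struct demo_dev {
    uint32_t device_id;
    struct demo_dev *left, *right;
};

/* Walk down from the root, going left or right by comparing device IDs. */
static struct demo_dev *demo_find(struct demo_dev *root, uint32_t devid)
{
    while ( root )
    {
        if ( devid < root->device_id )
            root = root->left;
        else if ( devid > root->device_id )
            root = root->right;
        else
            return root;
    }
    return NULL;
}

/* Same descent, but remember the link to update; duplicates are refused. */
static int demo_insert(struct demo_dev **root, struct demo_dev *dev)
{
    struct demo_dev **link = root;

    assert(dev != NULL);
    while ( *link )
    {
        if ( dev->device_id < (*link)->device_id )
            link = &(*link)->left;
        else if ( dev->device_id > (*link)->device_id )
            link = &(*link)->right;
        else
            return -EEXIST;
    }
    dev->left = dev->right = NULL;
    *link = dev;
    return 0;
}
```

Inserting a second device with an ID already in the tree returns -EEXIST, matching the behaviour of its_insert_device() in the patch.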




[Xen-devel] [PATCH v5 10/22] xen/arm: ITS: Add GITS registers emulation

2015-07-27 Thread vijay . kilari
From: Vijaya Kumar K 

Emulate GITS* registers
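
Several of the emulated registers are 64 bits wide but may be read with 32-bit accesses, in which case one half of the value is returned. A minimal sketch of that selection, modelled on vits_get_word() in this patch (demo_get_word is an illustrative name, and the alignment assert is an assumption of the sketch):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the 32-bit half of a 64-bit register value selected by the
 * register offset: an 8-byte-aligned offset selects the low word,
 * that offset plus 4 selects the high word.
 */
static uint32_t demo_get_word(uint32_t reg_offset, uint64_t val)
{
    assert((reg_offset % 4) == 0);   /* word accesses are 4-byte aligned */

    if ( (reg_offset % 8) == 0 )
        return (uint32_t)val;
    return (uint32_t)(val >> 32);
}
```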

Signed-off-by: Vijaya Kumar K 
---
v4: - Removed GICR register emulation
---
 xen/arch/arm/irq.c|3 +
 xen/arch/arm/vgic-v3-its.c|  365 -
 xen/include/asm-arm/gic-its.h |   15 ++
 xen/include/asm-arm/gic.h |1 +
 4 files changed, 381 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
index 1f38605..85cacb0 100644
--- a/xen/arch/arm/irq.c
+++ b/xen/arch/arm/irq.c
@@ -31,6 +31,9 @@
 static unsigned int local_irqs_type[NR_LOCAL_IRQS];
 static DEFINE_SPINLOCK(local_irqs_type_lock);
 
+/* Number of LPI supported in XEN */
+unsigned int num_of_lpis = 8192;
+
 /* Describe an IRQ assigned to a guest */
 struct irq_guest
 {
diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 3a003d4..1c7d9b6 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -33,8 +33,16 @@
 #include 
 #include 
 
-#define DEBUG_ITS
-
+//#define DEBUG_ITS
+
+/* GITS_PIDRn register values for ARM implementations */
+#define GITS_PIDR0_VAL   (0x94)
+#define GITS_PIDR1_VAL   (0xb4)
+#define GITS_PIDR2_VAL   (0x3b)
+#define GITS_PIDR3_VAL   (0x00)
+#define GITS_PIDR4_VAL   (0x04)
+#define GITS_BASER_INIT_VAL  ((1UL << GITS_BASER_TYPE_SHIFT) | \
+ (0x7UL << GITS_BASER_ENTRY_SIZE_SHIFT))
 #ifdef DEBUG_ITS
 # define DPRINTK(fmt, args...) dprintk(XENLOG_DEBUG, fmt, ##args)
 #else
@@ -60,6 +68,14 @@ void vits_setup_hw(struct gic_its_info *its_info)
 vits_hw.info = its_info;
 }
 
+static inline uint32_t vits_get_max_collections(struct domain *d)
+{
+/* Collection ID is only 16 bit */
+ASSERT(d->max_vcpus < 256);
+
+return (d->max_vcpus + 1);
+}
+
 static int vits_access_guest_table(struct domain *d, paddr_t entry, void *addr,
uint32_t size, bool_t set)
 {
@@ -502,7 +518,7 @@ static int vits_read_virt_cmd(struct vcpu *v, struct vgic_its *vits,
 return 0;
 }
 
-int vits_process_cmd(struct vcpu *v, struct vgic_its *vits)
+static int vits_process_cmd(struct vcpu *v, struct vgic_its *vits)
 {
 its_cmd_block virt_cmd;
 
@@ -527,11 +543,338 @@ err:
 return 0;
 }
 
+static inline uint32_t vits_get_word(uint32_t reg_offset, uint64_t val)
+{
+if ( (reg_offset % 8) == 0 )
+return (u32)val;
+else
+return (u32)(val >> 32);
+}
+
+static inline void vits_spin_lock(struct vgic_its *vits)
+{
+spin_lock(&vits->lock);
+}
+
+static inline void vits_spin_unlock(struct vgic_its *vits)
+{
+spin_unlock(&vits->lock);
+}
+
+static int vgic_v3_gits_mmio_read(struct vcpu *v, mmio_info_t *info)
+{
+struct vgic_its *vits = v->domain->arch.vgic.vits;
+struct hsr_dabt dabt = info->dabt;
+struct cpu_user_regs *regs = guest_cpu_user_regs();
+register_t *r = select_user_reg(regs, dabt.reg);
+uint64_t val = 0;
+uint32_t gits_reg;
+
+gits_reg = info->gpa - vits->gits_base;
+DPRINTK("%pv: vITS: GITS_MMIO_READ offset 0x%"PRIx32"\n", v, gits_reg);
+
+switch ( gits_reg )
+{
+case GITS_CTLR:
+if ( dabt.size != DABT_WORD ) goto bad_width;
+vits_spin_lock(vits);
+*r = vits->ctrl | GITS_CTLR_QUIESCENT;
+vits_spin_unlock(vits);
+return 1;
+case GITS_IIDR:
+if ( dabt.size != DABT_WORD ) goto bad_width;
+*r = GICV3_GICD_IIDR_VAL;
+return 1;
+case GITS_TYPER:
+case GITS_TYPER + 4:
+/*
+ * GITS_TYPER.HCC = max_vcpus + 1 (max collection supported )
+ * GITS_TYPER.Devbits = HW supported Devbits size
+ * GITS_TYPER.IDbits = HW supported IDbits size
+ * GITS_TYPER.PTA = 0 (Target addresses are linear processor numbers)
+ * GITS_TYPER.ITTSize = Size of struct vitt
+ * GITS_TYPER.Physical = 1
+ */
+if ( dabt.size != DABT_DOUBLE_WORD &&
+ dabt.size != DABT_WORD ) goto bad_width;
+val = ((vits_get_max_collections(v->domain) << GITS_TYPER_HCC_SHIFT ) |
+   ((vits_hw.info->dev_bits - 1) << GITS_TYPER_DEVBITS_SHIFT) |
+   ((vits_hw.info->eventid_bits - 1) << GITS_TYPER_IDBITS_SHIFT)  |
+   ((sizeof(struct vitt) - 1) << GITS_TYPER_ITT_SIZE_SHIFT)   |
+ GITS_TYPER_PHYSICAL_LPIS);
+if ( dabt.size == DABT_DOUBLE_WORD )
+*r = val;
+else
+*r = vits_get_word(gits_reg, val);
+return 1;
+case 0x0010 ... 0x007c:
+case 0xc000 ... 0xffcc:
+/* Implementation defined -- read ignored */
+goto read_as_zero;
+case GITS_CBASER:
+case GITS_CBASER + 4:
+/* Only read support 32/64-bit access */
+if ( dabt.size != DABT_DOUBLE_WORD &&
+ dabt.size != DABT_WORD ) goto bad_width;
+vits_spin_lock(vits);
+if ( dabt.size == DABT_DOUBLE_WORD )
+*r = vits->cmd

[Xen-devel] [PATCH v5 09/22] xen/arm: ITS: Export ITS info to Virtual ITS

2015-07-27 Thread vijay . kilari
From: Vijaya Kumar K 

Export physical ITS information to the virtual ITS driver.

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/arm/gic-v3-its.c |   27 ++-
 xen/arch/arm/vgic-v3-its.c|9 +
 xen/include/asm-arm/gic-its.h |   14 ++
 3 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index aa4d3c5..e16fa03 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -94,6 +94,7 @@ static LIST_HEAD(its_nodes);
 static DEFINE_SPINLOCK(its_lock);
 static struct rdist_prop  *gic_rdists;
 static struct rb_root rb_its_dev;
+static struct gic_its_info its_data;
 static DEFINE_SPINLOCK(rb_its_dev_lock);
 
 #define gic_data_rdist()(this_cpu(rdist))
@@ -942,6 +943,8 @@ static int its_probe(struct dt_device_node *node)
 its->phys_size = its_size;
 typer = readl_relaxed(its_base + GITS_TYPER);
 its->ite_size = ((typer >> 4) & 0xf) + 1;
+its_data.eventid_bits = GITS_TYPER_IDBITS(typer);
+its_data.dev_bits = GITS_TYPER_DEVBITS(typer);
 
 its->cmd_base = xzalloc_bytes(ITS_CMD_QUEUE_SZ);
 if ( !its->cmd_base )
@@ -1032,7 +1035,10 @@ int its_cpu_init(void)
 
 int __init its_init(struct rdist_prop *rdists)
 {
+struct its_node *its;
+struct its_node_info *info;
 struct dt_device_node *np = NULL;
+uint32_t i, nr_its = 0;
 
 static const struct dt_device_match its_device_ids[] __initconst =
 {
@@ -1042,7 +1048,10 @@ int __init its_init(struct rdist_prop *rdists)
 
 for (np = dt_find_matching_node(NULL, its_device_ids); np;
  np = dt_find_matching_node(np, its_device_ids))
-its_probe(np);
+{
+if ( !its_probe(np) )
+nr_its++;
+}
 
 if ( list_empty(&its_nodes) )
 {
@@ -1050,6 +1059,22 @@ int __init its_init(struct rdist_prop *rdists)
 return -ENXIO;
 }
 
+info = xzalloc_array(struct its_node_info, nr_its);
+if ( !info )
+return -ENOMEM;
+
+i = 0;
+list_for_each_entry(its, &its_nodes, entry)
+{
+ info[i].phys_base = its->phys_base;
+ info[i].phys_size = its->phys_size;
+ i++;
+}
+
+its_data.nr_its = nr_its;
+its_data.its_hw = info;
+vits_setup_hw(&its_data);
+
 gic_rdists = rdists;
 its_lpi_init(rdists->id_bits);
 its_alloc_lpi_tables();
diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index dfa3435..3a003d4 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -51,6 +51,15 @@ static void dump_cmd(its_cmd_block *cmd)
 static void dump_cmd(its_cmd_block *cmd) { do {} while ( 0 ); }
 #endif
 
+static struct {
+struct gic_its_info *info;
+} vits_hw;
+
+void vits_setup_hw(struct gic_its_info *its_info)
+{
+vits_hw.info = its_info;
+}
+
 static int vits_access_guest_table(struct domain *d, paddr_t entry, void *addr,
uint32_t size, bool_t set)
 {
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index cdb786c..23ff66c 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -309,12 +309,26 @@ struct vitt {
 uint32_t vlpi;
 };
 
+struct its_node_info
+{
+paddr_t phys_base;
+unsigned long phys_size;
+};
+
+struct gic_its_info {
+uint32_t eventid_bits;
+uint32_t dev_bits;
+uint32_t nr_its;
+struct its_node_info *its_hw;
+};
+
 int its_init(struct rdist_prop *rdists);
 int its_cpu_init(void);
 int vits_get_vitt_entry(struct domain *d, uint32_t devid,
 uint32_t event, struct vitt *entry);
 int vits_get_vdevice_entry(struct domain *d, uint32_t devid,
struct vdevice_table *entry);
+void vits_setup_hw(struct gic_its_info *info);
 
 #endif /* __ASM_ARM_GIC_ITS_H__ */
 /*
-- 
1.7.9.5
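
The export step in its_init() above amounts to taking a snapshot of each node's base and size into an array. A rough standalone sketch of that idea follows. All demo_* names are hypothetical, calloc() stands in for xzalloc_array(), and unlike the patch the sketch counts the list itself rather than the number of successful probes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct its_node and struct its_node_info. */
struct demo_node {
    uint64_t phys_base;
    unsigned long phys_size;
    struct demo_node *next;
};

struct demo_node_info {
    uint64_t phys_base;
    unsigned long phys_size;
};

/*
 * Copy base/size of every node on the list into a freshly allocated array.
 * Returns the array (caller frees) and stores the count in *nr, or returns
 * NULL if the list is empty or the allocation fails.
 */
static struct demo_node_info *demo_export(const struct demo_node *head,
                                          uint32_t *nr)
{
    const struct demo_node *n;
    struct demo_node_info *info;
    uint32_t count = 0, i = 0;

    assert(nr != NULL);
    for ( n = head; n; n = n->next )
        count++;

    *nr = count;
    if ( !count )
        return NULL;

    info = calloc(count, sizeof(*info));
    if ( !info )
        return NULL;

    for ( n = head; n; n = n->next, i++ )
    {
        info[i].phys_base = n->phys_base;
        info[i].phys_size = n->phys_size;
    }
    return info;
}
```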




[Xen-devel] [PATCH v5 08/22] xen/arm: ITS: Add virtual ITS commands support

2015-07-27 Thread vijay . kilari
From: Vijaya Kumar K 

Add Virtual ITS command processing support to Virtual ITS driver

Signed-off-by: Vijaya Kumar K 
---
v5: - Rename vgic_its_*() to vits_*()
v4: - Use helper function to read from command queue
- Add MOVALL
- Removed check for entry in device in domain RB-tree
---
 xen/arch/arm/vgic-v3-its.c|  392 +
 xen/include/asm-arm/gic-its.h |   13 ++
 2 files changed, 405 insertions(+)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 60f8332..dfa3435 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -30,8 +30,27 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
+#define DEBUG_ITS
+
+#ifdef DEBUG_ITS
+# define DPRINTK(fmt, args...) dprintk(XENLOG_DEBUG, fmt, ##args)
+#else
+# define DPRINTK(fmt, args...) do {} while ( 0 )
+#endif
+
+#ifdef DEBUG_ITS
+static void dump_cmd(its_cmd_block *cmd)
+{
+printk("VITS:CMD[0] = 0x%lx CMD[1] = 0x%lx CMD[2] = 0x%lx CMD[3] = 0x%lx\n",
+   cmd->bits[0], cmd->bits[1], cmd->bits[2], cmd->bits[3]);
+}
+#else
+static void dump_cmd(its_cmd_block *cmd) { do {} while ( 0 ); }
+#endif
+
 static int vits_access_guest_table(struct domain *d, paddr_t entry, void *addr,
uint32_t size, bool_t set)
 {
@@ -152,6 +171,379 @@ int vits_get_vitt_entry(struct domain *d, uint32_t devid,
 return vits_vitt_entry(d, devid, event, entry, 0);
 }
 
+static int vits_process_sync(struct vcpu *v, struct vgic_its *vits,
+ its_cmd_block *virt_cmd)
+{
+/* Ignored */
+DPRINTK("%pv: vITS: SYNC: ta 0x%"PRIx32" \n", v, virt_cmd->sync.ta);
+
+return 0;
+}
+
+static int vits_process_mapvi(struct vcpu *v, struct vgic_its *vits,
+  its_cmd_block *virt_cmd)
+{
+struct vitt entry;
+struct domain *d = v->domain;
+uint8_t vcol_id, cmd;
+uint32_t vid, dev_id, event;
+
+vcol_id = virt_cmd->mapvi.col;
+vid = virt_cmd->mapvi.phy_id;
+cmd = virt_cmd->mapvi.cmd;
+dev_id = virt_cmd->mapvi.devid;
+
+DPRINTK("%pv: vITS: MAPVI: dev 0x%"PRIx32" vcol %"PRId32" vid %"PRId32"\n",
+ v, dev_id, vcol_id, vid);
+
+entry.valid = true;
+entry.vcollection = vcol_id;
+entry.vlpi = vid;
+
+if ( cmd == GITS_CMD_MAPI )
+vits_set_vitt_entry(d, dev_id, vid, &entry);
+else
+{
+event = virt_cmd->mapvi.event;
+vits_set_vitt_entry(d, dev_id, event, &entry);
+}
+
+return 0;
+}
+
+static int vits_process_movi(struct vcpu *v, struct vgic_its *vits,
+ its_cmd_block *virt_cmd)
+{
+struct vitt entry;
+struct domain *d = v->domain;
+uint32_t dev_id, event;
+uint8_t vcol_id;
+
+vcol_id = virt_cmd->movi.col;
+event = virt_cmd->movi.event;
+dev_id = virt_cmd->movi.devid;
+
+DPRINTK("%pv vITS: MOVI: dev_id 0x%"PRIx32" vcol %"PRId32" event %"PRId32"\n",
+v, dev_id, vcol_id, event);
+
+if ( vits_get_vitt_entry(d, dev_id, event, &entry) )
+return -EINVAL;
+
+entry.vcollection = vcol_id;
+
+if ( vits_set_vitt_entry(d, dev_id, event, &entry) )
+return -EINVAL;
+
+return 0;
+}
+
+static int vits_process_movall(struct vcpu *v, struct vgic_its *vits,
+   its_cmd_block *virt_cmd)
+{
+/* Ignored */
+DPRINTK("%pv: vITS: MOVALL: ta1 0x%"PRIx32" ta2 0x%"PRIx32" \n",
+v, virt_cmd->movall.ta1, virt_cmd->movall.ta2);
+
+return 0;
+}
+
+static int vits_process_discard(struct vcpu *v, struct vgic_its *vits,
+its_cmd_block *virt_cmd)
+{
+struct vitt entry;
+struct domain *d = v->domain;
+uint32_t event, dev_id;
+
+event = virt_cmd->discard.event;
+dev_id = virt_cmd->discard.devid;
+
+DPRINTK("%pv vITS: DISCARD: dev_id 0x%"PRIx32" id %"PRId32"\n",
+v, virt_cmd->discard.devid, event);
+
+if ( vits_get_vitt_entry(d, dev_id, event, &entry) )
+return -EINVAL;
+
+entry.valid = false;
+
+if ( vits_set_vitt_entry(d, dev_id, event, &entry) )
+return -EINVAL;
+
+return 0;
+}
+
+static int vits_process_inv(struct vcpu *v, struct vgic_its *vits,
+its_cmd_block *virt_cmd)
+{
+/* Ignored */
+DPRINTK("%pv vITS: INV: dev_id 0x%"PRIx32" id %"PRId32"\n",
+v, virt_cmd->inv.devid, virt_cmd->inv.event);
+
+return 0;
+}
+
+static int vits_process_clear(struct vcpu *v, struct vgic_its *vits,
+  its_cmd_block *virt_cmd)
+{
+/* Ignored */
+DPRINTK("%pv: vITS: CLEAR: dev_id 0x%"PRIx32" id %"PRId32"\n",
+ v, virt_cmd->clear.devid, virt_cmd->clear.event);
+
+return 0;
+}
+
+static int vits_process_invall(struct vcpu *v, struct vgic_its *vits,
+   its_cmd_block *virt_cmd)
+{
+/* Ignored */
+DPRINTK("%pv: vITS: INVALL: vCID %"PRId32"\n", v, virt_cmd->invall.col);
+
+ret

Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-27 Thread George Dunlap
On 07/27/2015 11:54 AM, Juergen Gross wrote:
> On 07/27/2015 12:43 PM, George Dunlap wrote:
>> On Mon, Jul 27, 2015 at 5:35 AM, Juergen Gross  wrote:
>>> On 07/24/2015 06:44 PM, Boris Ostrovsky wrote:

>>>> On 07/24/2015 12:39 PM, Juergen Gross wrote:
>>>>>
>>>>>
>>>>>
>>>>> I don't say mangling cpuids can't solve the scheduling problem. It
>>>>> surely can. But it can't solve the scheduling problem without hiding
>>>>> information like number of sockets or cores which might be required
>>>>> for license purposes. If we don't care, fine.
>>>>>
>>>>
>>>> (this is somewhat repeating the email I just sent)
>>>>
>>>> Why can't we construct socket/core info with CPUID (and *possibly* ACPI
>>>> changes) so that we present a reasonable (licensing-wise) picture?
>>>>
>>>> Can you suggest an example where it will not work and then maybe we can
>>>> figure something out?
>>>
>>>
>>> Let's assume a software with license based on core count. You have a
>>> system with two 8-core processors and hyperthreads enabled, summing up
>>> to 32 logical processors. Your license is valid for up to 16 cores, so
>>> running the software on bare metal on your system is fine.
>>>
>>> Now you are running the software inside a virtual machine with 24 vcpus
>>> in a cpupool with 24 logical cpus limited to 12 cores (6 cores of each
>>> processor). As we have to hide hyperthreading in order not to have
>>> to pin each vcpu to just a single logical processor, the topology
>>> resulting from this picture will have to present 24 cores. The license
>>> will not cover this hardware.
>>
>> But how does doing a PV topology help this situation?  Because we're
>> telling one thing to the OS (via our PV interface) and another thing
>> to applications (via direct CPUID access)?
> 
> Exactly.
> 
> In my example it would even work to not modify the cpuid information at
> all. The kernel wouldn't try to be extremely clever regarding scheduling
> and the user land would see the cpuid information from the real hardware
> (only the 12 cores it is running on, of course).

Right; so it seems

1. Userspace applications are in the habit of reading CPUID to determine
the topology of the system they're running on

2. Many use the topology information to help themselves make better
scheduling decisions.  Because a vcpu is not typically pinned to a
specific pcpu, we may need to lie here slightly (e.g., not mention
threads) to get the optimal behavior overall.

3. Others use the topology information to implement licensing
restrictions.  Because threads are treated differently to cores, we want
to tell the truth here (i.e., make sure we mention that some of these
are threads) to get the optimal behavior overall.

Numbers #2 and #3 lead to contradictory courses of action; we cannot
optimize for both at the same time.
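
To put numbers on that conflict, here is a rough sketch using the figures from Juergen's example quoted above (24 vcpus on 12 cores, license valid for 16 cores). The struct and function names are made up for illustration, and a real license check may of course count differently.

```c
#include <assert.h>

/* Hypothetical description of what a guest is told about its CPUs. */
struct demo_topology {
    unsigned int vcpus;            /* logical processors visible to the guest */
    unsigned int threads_per_core; /* 1 means hyperthreading is hidden */
};

/* Number of cores a core-counting license check would see. */
static unsigned int demo_visible_cores(const struct demo_topology *t)
{
    assert(t->threads_per_core > 0);
    return t->vcpus / t->threads_per_core;
}
```

With threads hidden (24 vcpus, 1 thread per core) the guest appears to have 24 cores and exceeds a 16-core license; with threads reported (24 vcpus, 2 threads per core) it appears to have 12 and stays within it.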

I think at some level we need to just try to accommodate both -- if the
user doesn't have licensing issues, or prefers performance over
licensing, then present a unified topology in PVH / HVM using CPUID,
ACPI, &c.  I think this should be the default.

If the user has licensing issues, and doesn't mind presenting a wonky or
unreliable topology to its guests, then let the raw CPUID through.  But
it would, in this case, be good to try to give the guest OS scheduler a
hint that it shouldn't really bother trying to read the topology or do
placement as a result, as any decisions will be unreliable.

Or alternately, if the user wants to give up on the "consolidation"
aspect of virtualization, they can pin vcpus to pcpus and then pass in
the actual host topology (hyperthreads and all).

 -George




[Xen-devel] [PATCH v5 07/22] xen/arm: ITS: Add virtual ITS driver

2015-07-27 Thread vijay . kilari
From: Vijaya Kumar K 

This patch introduces the virtual ITS driver with the following
functionality:
 - Introduces helper functions to manage device table and
   ITT table in guest memory
 - Helper function to handle virtual ITS devices assigned
   to domain

Signed-off-by: Vijaya Kumar K 
---
v5: - Removed RB tree that manages virtual ITS devices
v4: - Rename functions {find,remove,insert}_vits_* to
  vits_{find,remove,insert}.
- Add common helper function to map and read/write dt
  or vitt table entry.
- Removed unused code
---
 xen/arch/arm/vgic-v3-its.c|  162 +
 xen/include/asm-arm/domain.h  |2 +
 xen/include/asm-arm/gic-its.h |   36 +
 3 files changed, 200 insertions(+)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
new file mode 100644
index 000..60f8332
--- /dev/null
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -0,0 +1,162 @@
+/*
+ * Copyright (C) 2015 Cavium Inc.
+ * Vijaya Kumar K 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static int vits_access_guest_table(struct domain *d, paddr_t entry, void *addr,
+   uint32_t size, bool_t set)
+{
+struct page_info *page;
+uint64_t offset;
+p2m_type_t p2mt;
+void *p;
+
+page = get_page_from_gfn(d, paddr_to_pfn(entry), &p2mt, P2M_ALLOC);
+if ( !page )
+{
+printk(XENLOG_G_ERR "d%"PRId32": vITS: Failed to get table entry\n",
+   d->domain_id);
+return -EINVAL;
+}
+
+if ( !p2m_is_ram(p2mt) )
+{
+put_page(page);
+printk(XENLOG_G_ERR "d%"PRId32": vITS: with wrong attributes\n",
+   d->domain_id);
+return -EINVAL;
+}
+
+p = __map_domain_page(page);
+/* Offset within the mapped page */
+offset = entry & ~PAGE_MASK;
+
+if ( set )
+memcpy(p + offset, addr, size);
+else
+memcpy(addr, p + offset, size);
+
+unmap_domain_page(p);
+put_page(page);
+
+return 0;
+}
+
+/* ITS device table helper functions */
+static int vits_vdevice_entry(struct domain *d, uint32_t dev_id,
+  struct vdevice_table *entry, bool_t set)
+{
+uint64_t offset;
+paddr_t dt_entry;
+struct vgic_its *vits = d->arch.vgic.vits;
+
+BUILD_BUG_ON(sizeof(struct vdevice_table) != 16);
+
+offset = dev_id * sizeof(struct vdevice_table);
+if ( offset > vits->dt_size )
+{
+printk(XENLOG_G_ERR
+   "d%"PRId32":vITS:Out of range off 0x%"PRIx64" id 0x%"PRIx32"\n",
+   d->domain_id, offset, dev_id);
+return -EINVAL;
+}
+
+dt_entry = vits->dt_ipa + offset;
+
+return vits_access_guest_table(d, dt_entry, entry,
+   sizeof(struct vdevice_table), set);
+}
+
+int vits_set_vdevice_entry(struct domain *d, uint32_t devid,
+   struct vdevice_table *entry)
+{
+return vits_vdevice_entry(d, devid, entry, 1);
+}
+
+int vits_get_vdevice_entry(struct domain *d, uint32_t devid,
+   struct vdevice_table *entry)
+{
+return vits_vdevice_entry(d, devid, entry, 0);
+}
+
+static int vits_vitt_entry(struct domain *d, uint32_t devid,
+   uint32_t event, struct vitt *entry, bool_t set)
+{
+struct vdevice_table dt_entry;
+paddr_t vitt_entry;
+uint64_t offset;
+
+BUILD_BUG_ON(sizeof(struct vitt) != 8);
+
+if ( vits_get_vdevice_entry(d, devid, &dt_entry) )
+{
+printk(XENLOG_G_ERR
+"d%"PRId32": vITS: Fail to get vdevice for vdev 0x%"PRIx32"\n",
+d->domain_id, devid);
+return -EINVAL;
+}
+
+/* dt_entry is validated in vits_get_vdevice_entry */
+offset = event * sizeof(struct vitt);
+if ( offset > dt_entry.vitt_size )
+{
+printk(XENLOG_G_ERR "d%"PRId32": vITS: ITT out of range\n",
+   d->domain_id);
+return -EINVAL;
+}
+
+vitt_entry = dt_entry.vitt_ipa + offset;
+
+return vits_access_guest_table(d, vitt_entry, entry,
+   sizeof(struct vitt), set);
+}
+
+int vits_set_vitt_entry(struct domain *d, uint32_t devid,
+uint32_t event, struct vitt *entry)
+{
+return v

[Xen-devel] [PATCH v5 11/22] xen/arm: ITS: Enable physical and virtual ITS driver compilation

2015-07-27 Thread vijay . kilari
From: Vijaya Kumar K 

Compilation is delayed until this patch.
From now on, functions in the physical ITS and virtual ITS
drivers are required, so enable compilation.

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/arm/Makefile |2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 1ef39f7..14cbf12 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -14,6 +14,7 @@ obj-y += domain_build.o
 obj-y += gic.o gic-v2.o
 obj-$(CONFIG_ARM_32) += gic-hip04.o
 obj-$(HAS_GICV3) += gic-v3.o
+obj-$(HAS_GICV3) += gic-v3-its.o
 obj-y += io.o
 obj-y += irq.o
 obj-y += kernel.o
@@ -32,6 +33,7 @@ obj-y += shutdown.o
 obj-y += traps.o
 obj-y += vgic.o vgic-v2.o
 obj-$(CONFIG_ARM_64) += vgic-v3.o
+obj-$(HAS_GICV3) += vgic-v3-its.o
 obj-y += vtimer.o
 obj-y += vuart.o
 obj-y += hvm.o
-- 
1.7.9.5




[Xen-devel] [PATCH v5 18/22] xen/arm: ITS: Add domain specific ITS initialization

2015-07-27 Thread vijay . kilari
From: Vijaya Kumar K 

Add Domain and vcpu specific ITS initialization

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/arm/vgic-v3-its.c|   10 ++
 xen/arch/arm/vgic-v3.c|   10 ++
 xen/arch/arm/vgic.c   |3 +++
 xen/include/asm-arm/gic-its.h |2 ++
 xen/include/asm-arm/vgic.h|3 +++
 5 files changed, 28 insertions(+)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 5323192..e182cee 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -44,6 +44,7 @@
 #define GITS_PIDR4_VAL   (0x04)
 #define GITS_BASER_INIT_VAL  ((1UL << GITS_BASER_TYPE_SHIFT) | \
  (0x7UL << GITS_BASER_ENTRY_SIZE_SHIFT))
+//#define DEBUG_ITS
 #ifdef DEBUG_ITS
 # define DPRINTK(fmt, args...) dprintk(XENLOG_DEBUG, fmt, ##args)
 #else
@@ -1122,6 +1123,15 @@ int vits_domain_init(struct domain *d)
 return 0;
 }
 
+void vits_domain_free(struct domain *d)
+{
+   free_xenheap_pages(d->arch.vgic.vits->prop_page,
+   get_order_from_bytes(d->arch.vgic.vits->prop_size));
+   xfree(d->arch.vgic.pending_lpis);
+   xfree(d->arch.vgic.vits->collections);
+   xfree(d->arch.vgic.vits);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index 9e6e3ff..a09ba36 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -1299,12 +1299,22 @@ static int vgic_v3_domain_init(struct domain *d)
 
 d->arch.vgic.ctlr = VGICD_CTLR_DEFAULT;
 
+if ( is_hardware_domain(d) && gic_lpi_supported() )
+vits_domain_init(d);
+
 return 0;
 }
 
+void vgic_v3_domain_free(struct domain *d)
+{
+if ( is_hardware_domain(d) && gic_lpi_supported() )
+vits_domain_free(d);
+}
+
 static const struct vgic_ops v3_ops = {
 .vcpu_init   = vgic_v3_vcpu_init,
 .domain_init = vgic_v3_domain_init,
+.domain_free = vgic_v3_domain_free,
 .get_irq_priority = vgic_v3_get_irq_priority,
 .get_target_vcpu  = vgic_v3_get_target_vcpu,
 .emulate_sysreg  = vgic_v3_emulate_sysreg,
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 57c0f52..e2bfdb6 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -166,6 +166,9 @@ void domain_vgic_free(struct domain *d)
 xfree(d->arch.vgic.shared_irqs);
 xfree(d->arch.vgic.pending_irqs);
 xfree(d->arch.vgic.allocated_irqs);
+
+if ( d->arch.vgic.handler->domain_free )
+d->arch.vgic.handler->domain_free(d);
 }
 
 int vcpu_vgic_init(struct vcpu *v)
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index 870c9a8..da689a4 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -367,6 +367,8 @@ int its_cpu_init(void);
 void its_set_lpi_properties(struct irq_desc *desc,
 const cpumask_t *cpu_mask,
 unsigned int priority);
+int vits_domain_init(struct domain *d);
+void vits_domain_free(struct domain *d);
 int vits_get_vitt_entry(struct domain *d, uint32_t devid,
 uint32_t event, struct vitt *entry);
 int vits_get_vdevice_entry(struct domain *d, uint32_t devid,
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index b11faa0..853df04 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -114,6 +114,8 @@ struct vgic_ops {
 int (*vcpu_init)(struct vcpu *v);
 /* Domain specific initialization of vGIC */
 int (*domain_init)(struct domain *d);
+/* Free domain specific resources */
+void (*domain_free)(struct domain *d);
 /* Get priority for a given irq stored in vgic structure */
 int (*get_irq_priority)(struct vcpu *v, unsigned int irq);
 /* Get the target vcpu for a given virq. The rank lock is already taken
@@ -191,6 +193,7 @@ enum gic_sgi_mode;
 #define vgic_num_irqs(d)((d)->arch.vgic.nr_spis + 32)
 
 extern int domain_vgic_init(struct domain *d, unsigned int nr_spis);
+extern void vgic_its_init(void);
 extern void domain_vgic_free(struct domain *d);
 extern int vcpu_vgic_init(struct vcpu *v);
 extern struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int irq);
-- 
1.7.9.5
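
Since domain_free is an optional member of the ops table (only the v3 ops set it in this patch), the common teardown path has to call it only when it is non-NULL. A minimal sketch of that pattern, with a trimmed-down, hypothetical ops struct and a counter added purely for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down ops table with one optional hook. */
struct demo_ops {
    int  (*domain_init)(int domid);
    void (*domain_free)(int domid);   /* may be NULL */
};

static int demo_freed;  /* counts calls to the hook, for illustration */

static void demo_free_hook(int domid)
{
    (void)domid;
    demo_freed++;
}

/* Common teardown path: call the hook only if the backend provides one. */
static void demo_domain_free(const struct demo_ops *ops, int domid)
{
    assert(ops != NULL);
    if ( ops->domain_free )
        ops->domain_free(domid);
}
```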




[Xen-devel] [PATCH v5 16/22] xen/arm: ITS: Route LPIs

2015-07-27 Thread vijay . kilari
From: Vijaya Kumar K 

Allocate and initialize irq descriptor for LPIs and
route LPIs to guest
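
Part of routing an LPI is setting its priority in the LPI property table, which holds one configuration byte per LPI indexed from the first LPI number. The sketch below shows the byte update in isolation. The demo_* names are illustrative, and the mask value assumes the priority occupies bits [7:2] of the byte, with the two low bits (including the enable bit) preserved.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_LPI_PRIORITY_MASK  0xfcu  /* assumed: priority in bits [7:2] */
#define DEMO_LPI_LOW_BITS       0x03u  /* bits [1:0], preserved on update */

/*
 * Update the priority field of an LPI configuration byte while keeping the
 * two low bits (which include the enable bit) unchanged.
 */
static uint8_t demo_set_lpi_priority(uint8_t cfg, unsigned int priority)
{
    return (uint8_t)((cfg & DEMO_LPI_LOW_BITS) |
                     (priority & DEMO_LPI_PRIORITY_MASK));
}

/* Index of an LPI's byte in the property table, given the first LPI number. */
static uint32_t demo_lpi_index(uint32_t irq, uint32_t first_lpi)
{
    assert(irq >= first_lpi);
    return irq - first_lpi;
}
```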

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/arm/gic-v3-its.c |   20 +++
 xen/arch/arm/gic-v3.c |   17 +-
 xen/arch/arm/gic.c|   38 -
 xen/arch/arm/irq.c|  126 ++---
 xen/arch/arm/vgic-v2.c|5 +-
 xen/arch/arm/vgic-v3-its.c|   22 ++-
 xen/arch/arm/vgic-v3.c|   17 --
 xen/arch/arm/vgic.c   |   72 ++-
 xen/include/asm-arm/gic-its.h |   10 
 xen/include/asm-arm/gic.h |   11 +++-
 xen/include/asm-arm/irq.h |2 +
 xen/include/asm-arm/vgic.h|2 +
 12 files changed, 319 insertions(+), 23 deletions(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 5ffd52f..0d8582a 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -110,6 +110,11 @@ void dump_cmd(its_cmd_block *cmd)
 void dump_cmd(its_cmd_block *cmd) { do {} while ( 0 ); }
 #endif
 
+u32 its_get_nr_event_ids(void)
+{
+return (1 << its_data.eventid_bits);
+}
+
 /* RB-tree helpers for its_device */
 struct its_device *its_find_device(u32 devid)
 {
@@ -415,6 +420,21 @@ static void its_flush_and_invalidate_prop(struct irq_desc *desc, u8 *cfg)
 its_send_inv(its_dev, col, vid);
 }
 
+void its_set_lpi_properties(struct irq_desc *desc,
+const cpumask_t *cpu_mask,
+unsigned int priority)
+{
+unsigned long flags;
+u8 *cfg;
+
+spin_lock_irqsave(&its_lock, flags);
+cfg = gic_rdists->prop_page + desc->irq - FIRST_GIC_LPI;
+*cfg = (*cfg & 3) | (priority & LPI_PRIORITY_MASK) ;
+
+its_flush_and_invalidate_prop(desc, cfg);
+spin_unlock_irqrestore(&its_lock, flags);
+}
+
 static void its_set_lpi_state(struct irq_desc *desc, int enable)
 {
 u8 *cfg;
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 58e878e..8c7c5cf 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -535,6 +535,16 @@ static void gicv3_set_irq_properties(struct irq_desc *desc,
 spin_unlock(&gicv3.lock);
 }
 
+static void gicv3_set_properties(struct irq_desc *desc,
+ const cpumask_t *cpu_mask,
+ unsigned int priority)
+{
+if ( gic_is_lpi(desc->irq) )
+its_set_lpi_properties(desc, cpu_mask, priority);
+else
+gicv3_set_irq_properties(desc, cpu_mask, priority);
+}
+
 static void __init gicv3_dist_init(void)
 {
 uint32_t type;
@@ -912,7 +922,7 @@ static void gicv3_update_lr(int lr, const struct pending_irq *p,
 val |= ((uint64_t)p->priority & 0xff) << GICH_LR_PRIORITY_SHIFT;
 val |= ((uint64_t)p->irq & GICH_LR_VIRTUAL_MASK) << GICH_LR_VIRTUAL_SHIFT;
 
-   if ( p->desc != NULL )
+   if ( p->desc != NULL && !(gic_is_lpi(p->irq)) )
val |= GICH_LR_HW | (((uint64_t)p->desc->irq & GICH_LR_PHYSICAL_MASK)
<< GICH_LR_PHYSICAL_SHIFT);
 
@@ -1312,7 +1322,10 @@ static int __init gicv3_init(void)
 spin_lock(&gicv3.lock);
 
 if ( gicv3_dist_supports_lpis() )
+{
 gicv3_info.lpi_supported = 1;
+gicv3_info.nr_event_ids = its_get_nr_event_ids();
+}
 else
 gicv3_info.lpi_supported = 0;
 
@@ -1336,7 +1349,7 @@ static const struct gic_hw_operations gicv3_ops = {
 .eoi_irq = gicv3_eoi_irq,
 .deactivate_irq  = gicv3_dir_irq,
 .read_irq= gicv3_read_irq,
-.set_irq_properties  = gicv3_set_irq_properties,
+.set_irq_properties  = gicv3_set_properties,
 .send_SGI= gicv3_send_sgi,
 .disable_interface   = gicv3_disable_interface,
 .update_lr   = gicv3_update_lr,
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 092087d..f6be0e9 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 static void gic_restore_pending_irqs(struct vcpu *v);
@@ -67,11 +68,23 @@ bool_t gic_is_lpi(unsigned int irq)
 return gic_hw_ops->is_lpi(irq);
 }
 
+/* Returns number of PPIs/SGIs/SPIs supported */
 unsigned int gic_number_lines(void)
 {
 return gic_hw_ops->info->nr_lines;
 }
 
+/* Validates PPIs/SGIs/SPIs/LPIs supported */
+bool_t gic_is_valid_irq(unsigned int irq)
+{
+return ((irq < gic_hw_ops->info->nr_lines) || gic_is_lpi(irq));
+}
+
+unsigned int gic_nr_event_ids(void)
+{
+return gic_hw_ops->info->nr_event_ids;
+}
+
 bool_t gic_lpi_supported(void)
 {
 return gic_hw_ops->info->lpi_supported;
@@ -134,7 +147,8 @@ void gic_route_irq_to_xen(struct irq_desc *desc, const cpumask_t *cpu_mask,
   unsigned int priority)
 {
 ASSERT(priority <= 0xff); /* Only 8 bits of priority */
-ASSERT(desc->irq < gic_number_lines());/* Can't route interrupts that don't exist */
+/* Can't route interrupts that don't exist */
+ASSERT(gic_is_valid_irq(

[Xen-devel] [PATCH v5 01/22] xen/arm: Return success if dt node does not have irq mapping

2015-07-27 Thread vijay . kilari
From: Vijaya Kumar K 

dt_for_each_irq_map() returns an error if no irq mapping is found.
With this patch, ignore the missing mapping and return success.

Signed-off-by: Vijaya Kumar K 
---
 xen/common/device_tree.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
index 323c3be..1325ad5 100644
--- a/xen/common/device_tree.c
+++ b/xen/common/device_tree.c
@@ -1085,7 +1085,7 @@ int dt_for_each_irq_map(const struct dt_device_node *dev,
 if ( imap == NULL )
 {
 dt_dprintk(" -> no map, ignoring\n");
-goto fail;
+return 0;
 }
 imaplen /= sizeof(u32);
 
-- 
1.7.9.5
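
The behaviour this patch aims for can be sketched in isolation: a node without an interrupt map is treated as "nothing to iterate", not as a failure. The following is a minimal sketch under that assumption; struct demo_node, demo_for_each_irq_map() and the callback type are hypothetical and much simpler than the real device tree code.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*demo_irq_cb)(unsigned int irq, void *data);

/* Hypothetical node: imap is NULL when the node has no interrupt map. */
struct demo_node {
    const unsigned int *imap;
    size_t imaplen;
};

/* Invoke cb for every map entry; stop at the first callback error. */
static int demo_for_each_irq_map(const struct demo_node *dev,
                                 demo_irq_cb cb, void *data)
{
    size_t i;

    assert(dev != NULL && cb != NULL);

    if ( dev->imap == NULL )
        return 0; /* no map: nothing to do, and not an error */

    for ( i = 0; i < dev->imaplen; i++ )
    {
        int ret = cb(dev->imap[i], data);

        if ( ret )
            return ret;
    }

    return 0;
}

/* Example callback: count the entries that were visited. */
static int demo_count_cb(unsigned int irq, void *data)
{
    (void)irq;
    ++*(unsigned int *)data;
    return 0;
}
```

Callers that previously had to special-case nodes without a map can then check the return value uniformly.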




[Xen-devel] [PATCH v5 12/22] xen/arm: ITS: Add GICR register emulation

2015-07-27 Thread vijay . kilari
From: Vijaya Kumar K 

Emulate LPI related changes to GICR registers

Signed-off-by: Vijaya Kumar K 
---
v5: - Handle accesses of all sizes to the LPI configuration table
- Rename vits_unmap_lpi_prop as vits_map_lpi_prop
v4: - Added LPI configuration table emulation
- Rename function in line with vits
- Copied guest LPI configuration table to Xen
---
 xen/arch/arm/gic-v3.c |   10 +++
 xen/arch/arm/gic.c|5 ++
 xen/arch/arm/vgic-v3-its.c|  168 +
 xen/arch/arm/vgic-v3.c|  104 +++
 xen/include/asm-arm/domain.h  |3 +
 xen/include/asm-arm/gic-its.h |   12 +++
 xen/include/asm-arm/gic.h |5 ++
 xen/include/asm-arm/gic_v3_defs.h |1 +
 8 files changed, 294 insertions(+), 14 deletions(-)

diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index f2784a2..91c1b74 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -681,6 +681,11 @@ static int __init gicv3_populate_rdist(void)
 return -ENODEV;
 }
 
+static int gicv3_dist_supports_lpis(void)
+{
+return readl_relaxed(GICD + GICD_TYPER) & GICD_TYPER_LPIS_SUPPORTED;
+}
+
 static int __cpuinit gicv3_cpu_init(void)
 {
 int i;
@@ -1274,6 +1279,11 @@ static int __init gicv3_init(void)
 
 spin_lock(&gicv3.lock);
 
+if ( gicv3_dist_supports_lpis() )
+gicv3_info.lpi_supported = 1;
+else
+gicv3_info.lpi_supported = 0;
+
 gicv3_dist_init();
 res = gicv3_cpu_init();
 gicv3_hyp_init();
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 1757193..af8a34b 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -67,6 +67,11 @@ unsigned int gic_number_lines(void)
 return gic_hw_ops->info->nr_lines;
 }
 
+bool_t gic_lpi_supported(void)
+{
+return gic_hw_ops->info->lpi_supported;
+}
+
 void gic_save_state(struct vcpu *v)
 {
 ASSERT(!local_irq_is_enabled());
diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 1c7d9b6..4afb62b 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -76,6 +77,34 @@ static inline uint32_t vits_get_max_collections(struct domain *d)
 return (d->max_vcpus + 1);
 }
 
+static void vits_disable_lpi(struct vcpu *v, uint32_t vlpi)
+{
+struct pending_irq *p;
+
+p = irq_to_pending(v, vlpi);
+clear_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
+gic_remove_from_queues(v, vlpi);
+}
+
+static void vits_enable_lpi(struct vcpu *v, uint32_t vlpi, uint8_t priority)
+{
+struct pending_irq *p;
+unsigned long flags;
+
+p = irq_to_pending(v, vlpi);
+
+set_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
+
+spin_lock_irqsave(&v->arch.vgic.lock, flags);
+
+/*XXX: raise on right vcpu */
+if ( !list_empty(&p->inflight) &&
+ !test_bit(GIC_IRQ_GUEST_VISIBLE, &p->status) )
+gic_raise_guest_irq(v, vlpi, p->priority);
+
+spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
+}
+
 static int vits_access_guest_table(struct domain *d, paddr_t entry, void *addr,
uint32_t size, bool_t set)
 {
@@ -551,6 +580,145 @@ static inline uint32_t vits_get_word(uint32_t reg_offset, uint64_t val)
 return (u32)(val >> 32);
 }
 
+static int vgic_v3_gits_lpi_mmio_read(struct vcpu *v, mmio_info_t *info)
+{
+uint32_t offset;
+struct vgic_its *vits = v->domain->arch.vgic.vits;
+struct hsr_dabt dabt = info->dabt;
+struct cpu_user_regs *regs = guest_cpu_user_regs();
+register_t *r = select_user_reg(regs, dabt.reg);
+
+offset = info->gpa - (vits->propbase & MASK_4K);
+
+DPRINTK("%pv: vITS: LPI Table read offset 0x%"PRIx32"\n", v, offset);
+spin_lock(&vits->prop_lock);
+if ( dabt.size == DABT_DOUBLE_WORD )
+*r = *(u64 *)((u8 *)vits->prop_page + offset);
+else if ( dabt.size == DABT_WORD )
+*r = *(u32 *)((u8 *)vits->prop_page + offset);
+else if ( dabt.size == DABT_HALF_WORD )
+*r = *(u16 *)((u8 *)vits->prop_page + offset);
+else
+*r = *((u8 *)vits->prop_page + offset);
+spin_unlock(&vits->prop_lock);
+
+return 1;
+}
+
+static int vgic_v3_gits_lpi_mmio_write(struct vcpu *v, mmio_info_t *info)
+{
+uint32_t offset, vid;
+uint8_t cfg, *p, i, iter;
+bool_t enable;
+struct vgic_its *vits = v->domain->arch.vgic.vits;
+struct hsr_dabt dabt = info->dabt;
+struct cpu_user_regs *regs = guest_cpu_user_regs();
+register_t *r = select_user_reg(regs, dabt.reg);
+
+offset = info->gpa - (vits->propbase & MASK_4K);
+
+DPRINTK("%pv: vITS: LPI Table write offset 0x%"PRIx32"\n", v, offset);
+
+if ( dabt.size == DABT_DOUBLE_WORD )
+iter = 8;
+else if ( dabt.size == DABT_WORD )
+iter = 4;
+else if ( dabt.size == DABT_HALF_WORD )
+iter = 2;
+else
+iter = 1;
+
+spin_lock(&vits->prop_lock);
+p = ((u8*)vits->prop_page + offset);

[Xen-devel] [PATCH v5 14/22] xen/arm: ITS: Allocate irq descriptors for LPIs

2015-07-27 Thread vijay . kilari
From: Vijaya Kumar K 

Allocate irq descriptors for LPIs dynamically and
also update irq_to_pending helper for LPIs

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/arm/gic-v3-its.c |   15 +++
 xen/arch/arm/gic-v3.c |6 ++
 xen/arch/arm/irq.c|   12 +++-
 xen/arch/arm/vgic-v3-its.c|   22 ++
 xen/arch/arm/vgic.c   |   13 ++---
 xen/include/asm-arm/domain.h  |1 +
 xen/include/asm-arm/gic-its.h |6 ++
 xen/include/asm-arm/gic.h |2 ++
 xen/include/asm-arm/gic_v3_defs.h |3 ++-
 9 files changed, 75 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index e16fa03..0d17885 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -1038,6 +1038,7 @@ int __init its_init(struct rdist_prop *rdists)
 struct its_node *its;
 struct its_node_info *info;
 struct dt_device_node *np = NULL;
+struct irq_desc *desc;
 uint32_t i, nr_its = 0;
 
 static const struct dt_device_match its_device_ids[] __initconst =
@@ -1079,6 +1080,20 @@ int __init its_init(struct rdist_prop *rdists)
 its_lpi_init(rdists->id_bits);
 its_alloc_lpi_tables();
 
+/* Allocate LPI irq descriptors */
+irq_desc_lpi = xzalloc_array(struct irq_desc, num_of_lpis);
+if ( !irq_desc_lpi )
+return -ENOSPC;
+
+for ( i = 0; i < num_of_lpis; i++ )
+{
+   desc = &irq_desc_lpi[i];
+   init_one_irq_desc(desc);
+   desc->irq = FIRST_GIC_LPI + i;
+   desc->arch.type = DT_IRQ_TYPE_EDGE_BOTH;
+   desc->action = NULL;
+}
+
 return 0;
 }
 
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 3b4dea3..98d45bc 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1280,6 +1280,12 @@ static int __init gicv3_init(void)
  gicv3.rdist_stride);
 gicv3_init_v2(node, dbase);
 
+reg = readl_relaxed(GICD + GICD_TYPER);
+
+gicv3.rdist_data.id_bits = ((reg >> GICD_TYPE_ID_BITS_SHIFT) &
+GICD_TYPE_ID_BITS_MASK) + 1;
+gicv3_info.nr_id_bits = gicv3.rdist_data.id_bits;
+
 spin_lock_init(&gicv3.lock);
 
 spin_lock(&gicv3.lock);
diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
index 85cacb0..63feb43 100644
--- a/xen/arch/arm/irq.c
+++ b/xen/arch/arm/irq.c
@@ -31,6 +31,7 @@
 static unsigned int local_irqs_type[NR_LOCAL_IRQS];
 static DEFINE_SPINLOCK(local_irqs_type_lock);
 
+irq_desc_t *irq_desc_lpi;
 /* Number of LPI supported in XEN */
 unsigned int num_of_lpis = 8192;
 
@@ -64,7 +65,16 @@ static DEFINE_PER_CPU(irq_desc_t[NR_LOCAL_IRQS], local_irq_desc);
 irq_desc_t *__irq_to_desc(int irq)
 {
 if (irq < NR_LOCAL_IRQS) return &this_cpu(local_irq_desc)[irq];
-return &irq_desc[irq-NR_LOCAL_IRQS];
+else if ( irq >= NR_LOCAL_IRQS && irq < NR_IRQS)
+return &irq_desc[irq-NR_LOCAL_IRQS];
+#ifdef HAS_GICV3 
+else if ( gic_is_lpi(irq) )
+{
+ASSERT(irq_desc_lpi != NULL);
+return &irq_desc_lpi[irq - FIRST_GIC_LPI];
+}
+#endif
+return NULL;
 }
 
 int __init arch_init_one_irq_desc(struct irq_desc *desc)
diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 4afb62b..b8f32ed 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -69,6 +69,12 @@ void vits_setup_hw(struct gic_its_info *its_info)
 vits_hw.info = its_info;
 }
 
+bool_t is_domain_lpi(struct domain *d, unsigned int lpi)
+{
+return ((lpi >= FIRST_GIC_LPI) &&
+(lpi < (d->arch.vgic.nr_lpis + FIRST_GIC_LPI)));
+}
+
 static inline uint32_t vits_get_max_collections(struct domain *d)
 {
 /* Collection ID is only 16 bit */
@@ -1049,6 +1055,21 @@ int vits_domain_init(struct domain *d)
 
 vits = d->arch.vgic.vits;
 
+d->arch.vgic.pending_lpis = xzalloc_array(struct pending_irq,
+  d->arch.vgic.nr_lpis);
+if ( d->arch.vgic.pending_lpis == NULL )
+{
+xfree(d->arch.vgic.vits);
+return -ENOMEM;
+}
+
+for ( i = 0; i < d->arch.vgic.nr_lpis; i++ )
+{
+INIT_LIST_HEAD(&d->arch.vgic.pending_lpis[i].inflight);
+INIT_LIST_HEAD(&d->arch.vgic.pending_lpis[i].lr_queue);
+d->arch.vgic.pending_lpis[i].irq = FIRST_GIC_LPI + i;
+}
+
 spin_lock_init(&vits->lock);
 spin_lock_init(&vits->prop_lock);
 
@@ -1056,6 +1077,7 @@ int vits_domain_init(struct domain *d)
 if ( !vits->collections )
 {
 xfree(d->arch.vgic.vits);
+xfree(d->arch.vgic.pending_lpis);
 return -ENOMEM;
 }
 
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index a6835a8..ab5e81b 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -30,6 +30,7 @@
 
 #include 
 #include 
+#include 
 #include 
 
 static inline struct vgic_irq_rank *vgic_get_rank(struct vcpu *v, int rank)
@@ -375,13 +376,19 @@ int vgic_to_sgi(struct vcpu *v, register_t sgir,

[Xen-devel] [PATCH v5 13/22] xen/arm: ITS: Implement gic_is_lpi helper function

2015-07-27 Thread vijay . kilari
From: Vijaya Kumar K 

The helper function gic_is_lpi() reports whether
an irq is an LPI. On GICv2 platforms it always
returns 0.

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/arm/gic-hip04.c  |6 ++
 xen/arch/arm/gic-v2.c |6 ++
 xen/arch/arm/gic-v3.c |6 ++
 xen/arch/arm/gic.c|5 +
 xen/include/asm-arm/gic.h |2 ++
 5 files changed, 25 insertions(+)

diff --git a/xen/arch/arm/gic-hip04.c b/xen/arch/arm/gic-hip04.c
index c5ed545..fdd428a 100644
--- a/xen/arch/arm/gic-hip04.c
+++ b/xen/arch/arm/gic-hip04.c
@@ -87,6 +87,11 @@ static DEFINE_PER_CPU(u16, gic_cpu_id);
 #define HIP04_GICH_APR   0x70
 #define HIP04_GICH_LR0x80
 
+static bool_t hip04gic_is_lpi(unsigned int irq)
+{
+return 0;
+}
+
 static inline void writeb_gicd(uint8_t val, unsigned int offset)
 {
 writeb_relaxed(val, gicv2.map_dbase + offset);
@@ -727,6 +732,7 @@ const static struct gic_hw_operations hip04gic_ops = {
 .read_vmcr_priority  = hip04gic_read_vmcr_priority,
 .read_apr= hip04gic_read_apr,
 .make_hwdom_dt_node  = hip04gic_make_hwdom_dt_node,
+.is_lpi  = hip04gic_is_lpi,
 };
 
 /* Set up the GIC */
diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
index 596126d..5cc2bca 100644
--- a/xen/arch/arm/gic-v2.c
+++ b/xen/arch/arm/gic-v2.c
@@ -81,6 +81,11 @@ static DEFINE_PER_CPU(u8, gic_cpu_id);
 /* Maximum cpu interface per GIC */
 #define NR_GIC_CPU_IF 8
 
+static bool_t gicv2_is_lpi(unsigned int irq)
+{
+return 0;
+}
+
 static inline void writeb_gicd(uint8_t val, unsigned int offset)
 {
 writeb_relaxed(val, gicv2.map_dbase + offset);
@@ -713,6 +718,7 @@ const static struct gic_hw_operations gicv2_ops = {
 .read_vmcr_priority  = gicv2_read_vmcr_priority,
 .read_apr= gicv2_read_apr,
 .make_hwdom_dt_node  = gicv2_make_hwdom_dt_node,
+.is_lpi  = gicv2_is_lpi,
 };
 
 /* Set up the GIC */
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 91c1b74..3b4dea3 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -62,6 +62,11 @@ DEFINE_PER_CPU(struct rdist, rdist);
 #define GICD_RDIST_BASE(this_cpu(rdist).rbase)
 #define GICD_RDIST_SGI_BASE(GICD_RDIST_BASE + SZ_64K)
 
+static bool_t gicv3_is_lpi(u32 irq)
+{
+return (irq >= FIRST_GIC_LPI && irq < (1 << gicv3_info.nr_id_bits));
+}
+
 /*
  * Saves all 16(Max) LR registers. Though number of LRs implemented
  * is implementation specific.
@@ -1316,6 +1321,7 @@ static const struct gic_hw_operations gicv3_ops = {
 .read_apr= gicv3_read_apr,
 .secondary_init  = gicv3_secondary_cpu_init,
 .make_hwdom_dt_node  = gicv3_make_hwdom_dt_node,
+.is_lpi  = gicv3_is_lpi,
 };
 
 static int __init gicv3_preinit(struct dt_device_node *node, const void *data)
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index af8a34b..cb4cdc8 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -62,6 +62,11 @@ enum gic_version gic_hw_version(void)
return gic_hw_ops->info->hw_version;
 }
 
+bool_t gic_is_lpi(unsigned int irq)
+{
+return gic_hw_ops->is_lpi(irq);
+}
+
 unsigned int gic_number_lines(void)
 {
 return gic_hw_ops->info->nr_lines;
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index a9a5874..f80f291 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic.h
@@ -359,12 +359,14 @@ struct gic_hw_operations {
 int (*secondary_init)(void);
 int (*make_hwdom_dt_node)(const struct domain *d,
   const struct dt_device_node *node, void *fdt);
+bool_t (*is_lpi)(unsigned int irq);
 };
 
 void register_gic_ops(const struct gic_hw_operations *ops);
 int gic_make_hwdom_dt_node(const struct domain *d,
const struct dt_device_node *node,
void *fdt);
+bool_t gic_is_lpi(unsigned int irq);
 
 #endif /* __ASSEMBLY__ */
 #endif
-- 
1.7.9.5
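
The dispatch introduced by this patch can be sketched on its own: each GIC driver supplies an is_lpi callback, and the common layer forwards to whichever driver is registered. This is a minimal sketch, not the Xen code. The demo_* names are hypothetical, the 16-bit ID width is an assumed example value, and DEMO_FIRST_LPI reflects the GICv3 rule that LPI INTIDs start at 8192.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_FIRST_LPI 8192u /* GICv3 LPI INTIDs start at 8192 */

/* Hypothetical cut-down version of struct gic_hw_operations. */
struct demo_gic_ops {
    bool (*is_lpi)(unsigned int irq);
};

/* Assumed example: the distributor reports 16 interrupt ID bits. */
static unsigned int demo_nr_id_bits = 16;

/* GICv2 has no LPIs, so every irq is rejected. */
static bool demo_v2_is_lpi(unsigned int irq)
{
    (void)irq;
    return false;
}

/* GICv3: an LPI lies in [DEMO_FIRST_LPI, 2^id_bits). */
static bool demo_v3_is_lpi(unsigned int irq)
{
    return irq >= DEMO_FIRST_LPI && irq < (1u << demo_nr_id_bits);
}

static const struct demo_gic_ops demo_v2_ops = { .is_lpi = demo_v2_is_lpi };
static const struct demo_gic_ops demo_v3_ops = { .is_lpi = demo_v3_is_lpi };

/* The registered driver; set once at boot in the real code. */
static const struct demo_gic_ops *demo_ops;

/* Common-layer helper: forward to the registered driver. */
static bool demo_gic_is_lpi(unsigned int irq)
{
    assert(demo_ops != NULL && demo_ops->is_lpi != NULL);
    return demo_ops->is_lpi(irq);
}
```

Keeping the range check inside the v3 driver means common code never needs to know the ID width, and GICv2 platforms pay only for an indirect call that returns false.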




[Xen-devel] [PATCH v5 21/22] xen/arm: ITS: Generate ITS node for Dom0

2015-07-27 Thread vijay . kilari
From: Vijaya Kumar K 

Parse the host dt and generate an ITS node for Dom0.
The ITS node resides inside the GIC node, so look for
it when the GIC node is encountered.

Signed-off-by: Vijaya Kumar K 
---
v5: - Moved ITS dt node generation to ITS driver
v4: - Generate only one ITS node for Dom0
- Replace msi-parent references with a single ITS phandle
---
 xen/arch/arm/domain_build.c   |   17 ++
 xen/arch/arm/gic-v3-its.c |   74 +
 xen/arch/arm/gic-v3.c |   29 
 xen/arch/arm/gic.c|   18 ++
 xen/include/asm-arm/gic-its.h |3 ++
 xen/include/asm-arm/gic.h |7 
 6 files changed, 148 insertions(+)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 8556afd..6b6f013 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -469,6 +469,19 @@ static int write_properties(struct domain *d, struct kernel_info *kinfo,
 continue;
 }
 
+/*
+ * Replace all msi-parent phandle references to single ITS node
+ * generated for Dom0
+ */
+if ( dt_property_name_is_equal(prop, "msi-parent") )
+{
+fdt32_t phandle = gic_get_msi_handle();
+DPRINT(" Set msi-parent(ITS) phandle 0x%x\n",phandle);
+fdt_property(kinfo->fdt, prop->name, (void *)&phandle,
+ sizeof(phandle));
+continue;
+}
+
 res = fdt_property(kinfo->fdt, prop->name, prop_data, prop_len);
 
 xfree(new_data);
@@ -875,6 +888,10 @@ static int make_gic_node(const struct domain *d, void *fdt,
 return res;
 
 res = fdt_end_node(fdt);
+if ( res )
+return res;
+
+res = gic_its_hwdom_dt_node(d, node, fdt);
 
 return res;
 }
diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 99f6edc..042c70d 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -27,6 +27,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -96,6 +98,7 @@ static struct rdist_prop  *gic_rdists;
 static struct rb_root rb_its_dev;
 static struct gic_its_info its_data;
 static DEFINE_SPINLOCK(rb_its_dev_lock);
+static fdt32_t its_phandle;
 
 #define gic_data_rdist()(this_cpu(rdist))
 
@@ -1198,6 +1201,77 @@ static void its_cpu_init_collection(void)
 spin_unlock(&its_lock);
 }
 
+int its_make_dt_node(const struct domain *d,
+ const struct dt_device_node *node, void *fdt)
+{
+struct its_node *its;
+const struct dt_device_node *gic;
+const void *compatible = NULL;
+u32 len;
+__be32 *new_cells, *tmp;
+int res = 0;
+
+/* Will pass only first ITS node info */
+its = list_first_entry(&its_nodes, struct its_node, entry);
+if ( !its )
+{
+dprintk(XENLOG_ERR, "ITS node not found\n");
+return -FDT_ERR_XEN(ENOENT);
+}
+
+gic = its->dt_node;
+
+compatible = dt_get_property(gic, "compatible", &len);
+if ( !compatible )
+{
+dprintk(XENLOG_ERR, "Can't find compatible property for the its node\n");
+return -FDT_ERR_XEN(ENOENT);
+}
+
+res = fdt_begin_node(fdt, "gic-its");
+if ( res )
+return res;
+
+res = fdt_property(fdt, "compatible", compatible, len);
+if ( res )
+return res;
+
+res = fdt_property(fdt, "msi-controller", NULL, 0);
+if ( res )
+return res;
+
+len = dt_cells_to_size(dt_n_addr_cells(node) + dt_n_size_cells(node));
+
+new_cells = xzalloc_bytes(len);
+if ( new_cells == NULL )
+return -FDT_ERR_XEN(ENOMEM);
+tmp = new_cells;
+
+dt_set_range(&tmp, node, its->phys_base, its->phys_size);
+
+res = fdt_property(fdt, "reg", new_cells, len);
+xfree(new_cells);
+
+if ( node->phandle )
+{
+res = fdt_property_cell(fdt, "phandle", node->phandle);
+if ( res )
+return res;
+
+its_phandle = cpu_to_fdt32(node->phandle);
+}
+
+res = fdt_end_node(fdt);
+
+return res;
+}
+
+
+fdt32_t its_get_lpi_handle(void)
+{
+return its_phandle;
+}
+
 static int its_force_quiescent(void __iomem *base)
 {
 u32 count = 100;   /* 1s */
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 23eb47c..828bf27 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1143,6 +1143,34 @@ static int gicv3_make_hwdom_dt_node(const struct domain *d,
 return res;
 }
 
+static int gicv3_make_hwdom_its_dt_node(const struct domain *d,
+const struct dt_device_node *node,
+void *fdt)
+{
+struct dt_device_node *gic_child;
+int res = 0;
+
+static const struct dt_device_match its_matches[] __initconst =
+{
+DT_MATCH_GIC_ITS,
+{ /* sentinel */ },
+};
+
+dt_for_each_child_node(node, gic_child)
+{
+if ( gic_child != NULL )
+{
+  

[Xen-devel] [PATCH v5 17/22] xen/arm: ITS: Initialize physical ITS

2015-07-27 Thread vijay . kilari
From: Vijaya Kumar K 

Initialize the physical ITS driver from the GICv3 driver
if LPIs are supported by the hardware.

Signed-off-by: Vijaya Kumar K 
---
v5: Check that the ITS dt node is available before
setting the lpi_supported flag
---
 xen/arch/arm/gic-v3.c |   19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 8c7c5cf..23eb47c 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -714,6 +714,10 @@ static int __cpuinit gicv3_cpu_init(void)
 if ( gicv3_enable_redist() )
 return -ENODEV;
 
+/* Give LPIs a spin */
+if ( gicv3_info.lpi_supported )
+its_cpu_init();
+
 /* Set priority on PPI and SGI interrupts */
 priority = (GIC_PRI_IPI << 24 | GIC_PRI_IPI << 16 | GIC_PRI_IPI << 8 |
 GIC_PRI_IPI);
@@ -1323,11 +1327,18 @@ static int __init gicv3_init(void)
 
 if ( gicv3_dist_supports_lpis() )
 {
-gicv3_info.lpi_supported = 1;
-gicv3_info.nr_event_ids = its_get_nr_event_ids();
+/*
+ * LPI support is enabled only if HW supports it and
+ * ITS dt node is available
+ */
+if ( !its_init(&gicv3.rdist_data) )
+{
+gicv3_info.lpi_supported = 1;
+gicv3_info.nr_event_ids = its_get_nr_event_ids();
+}
+else
+gicv3_info.lpi_supported = 0;
 }
-else
-gicv3_info.lpi_supported = 0;
 
 gicv3_dist_init();
 res = gicv3_cpu_init();
-- 
1.7.9.5
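
The gating rule in this patch is that LPI support is advertised only when the distributor reports LPIs and the ITS driver initializes successfully. A minimal sketch of that decision follows. The demo_* names and the parameters are hypothetical: the two probes are passed in as plain values where the real code reads GICD_TYPER and calls its_init().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cut-down version of the GIC info structure. */
struct demo_gic_info {
    bool lpi_supported;
    unsigned int nr_event_ids;
};

/*
 * hw_has_lpis:     what the distributor reports
 * its_init_result: 0 on success, non-zero if no ITS could be set up
 * eventid_bits:    event ID width reported by the ITS
 */
static void demo_setup_lpis(struct demo_gic_info *info, bool hw_has_lpis,
                            int its_init_result, unsigned int eventid_bits)
{
    assert(info != NULL && eventid_bits < 32);

    /* Default to "no LPIs" so every failure path ends in a known state. */
    info->lpi_supported = false;
    info->nr_event_ids = 0;

    if ( hw_has_lpis && its_init_result == 0 )
    {
        info->lpi_supported = true;
        info->nr_event_ids = 1u << eventid_bits;
    }
}
```

Resetting both fields first keeps the structure consistent even when the hardware supports LPIs but no usable ITS node is present.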




[Xen-devel] [PATCH v5 15/22] xen/arm: ITS: implement hw_irq_controller for LPIs

2015-07-27 Thread vijay . kilari
From: Vijaya Kumar K 

Implement the hw_irq_controller APIs required
to handle LPIs

Signed-off-by: Vijaya Kumar K 
---
v5: - Fixed review comments
- Exposed gicv3_[host|guest]_irq_end and hook to its
v4: - Implement separate hw_irq_controller for LPIs
- Drop setting LPI affinity
- virq and vid are moved under union
- Introduced inv command handling
- its_device is stored in irq_desc
---
 xen/arch/arm/gic-hip04.c  |   14 -
 xen/arch/arm/gic-v2.c |   14 -
 xen/arch/arm/gic-v3-its.c |  124 -
 xen/arch/arm/gic-v3.c |   33 +--
 xen/arch/arm/gic.c|   14 -
 xen/arch/arm/irq.c|   46 +++
 xen/include/asm-arm/gic-its.h |2 +
 xen/include/asm-arm/gic.h |   13 -
 xen/include/asm-arm/irq.h |   15 -
 9 files changed, 258 insertions(+), 17 deletions(-)

diff --git a/xen/arch/arm/gic-hip04.c b/xen/arch/arm/gic-hip04.c
index fdd428a..9ad8b6a 100644
--- a/xen/arch/arm/gic-hip04.c
+++ b/xen/arch/arm/gic-hip04.c
@@ -634,6 +634,16 @@ static hw_irq_controller hip04gic_guest_irq_type = {
 .set_affinity = hip04gic_irq_set_affinity,
 };
 
+static hw_irq_controller *hip04gic_get_host_irq_type(unsigned int irq)
+{
+return &hip04gic_host_irq_type;
+}
+
+static hw_irq_controller *hip04gic_get_guest_irq_type(unsigned int irq)
+{
+return &hip04gic_guest_irq_type;
+}
+
 static int __init hip04gic_init(void)
 {
 int res;
@@ -716,8 +726,8 @@ const static struct gic_hw_operations hip04gic_ops = {
 .save_state  = hip04gic_save_state,
 .restore_state   = hip04gic_restore_state,
 .dump_state  = hip04gic_dump_state,
-.gic_host_irq_type   = &hip04gic_host_irq_type,
-.gic_guest_irq_type  = &hip04gic_guest_irq_type,
+.gic_get_host_irq_type   = hip04gic_get_host_irq_type,
+.gic_get_guest_irq_type  = hip04gic_get_guest_irq_type,
 .eoi_irq = hip04gic_eoi_irq,
 .deactivate_irq  = hip04gic_dir_irq,
 .read_irq= hip04gic_read_irq,
diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
index 5cc2bca..5c580bc 100644
--- a/xen/arch/arm/gic-v2.c
+++ b/xen/arch/arm/gic-v2.c
@@ -620,6 +620,16 @@ static hw_irq_controller gicv2_guest_irq_type = {
 .set_affinity = gicv2_irq_set_affinity,
 };
 
+static hw_irq_controller *gicv2_get_host_irq_type(unsigned int irq)
+{
+return &gicv2_host_irq_type;
+}
+
+static hw_irq_controller *gicv2_get_guest_irq_type(unsigned int irq)
+{
+return &gicv2_guest_irq_type;
+}
+
 static int __init gicv2_init(void)
 {
 int res;
@@ -702,8 +712,8 @@ const static struct gic_hw_operations gicv2_ops = {
 .save_state  = gicv2_save_state,
 .restore_state   = gicv2_restore_state,
 .dump_state  = gicv2_dump_state,
-.gic_host_irq_type   = &gicv2_host_irq_type,
-.gic_guest_irq_type  = &gicv2_guest_irq_type,
+.gic_get_host_irq_type   = gicv2_get_host_irq_type,
+.gic_get_guest_irq_type  = gicv2_get_guest_irq_type,
 .eoi_irq = gicv2_eoi_irq,
 .deactivate_irq  = gicv2_dir_irq,
 .read_irq= gicv2_read_irq,
diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 0d17885..5ffd52f 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -301,8 +301,8 @@ post:
 its_wait_for_range_completion(its, cmd, next_cmd);
 }
 
-void its_send_inv(struct its_device *dev, struct its_collection *col,
-  u32 event_id)
+static void its_send_inv(struct its_device *dev, struct its_collection *col,
+ u32 event_id)
 {
 its_cmd_block cmd;
 
@@ -390,6 +390,126 @@ void its_send_discard(struct its_device *dev, struct its_collection *col,
 its_send_single_command(dev->its, &cmd, col);
 }
 
+static void its_flush_and_invalidate_prop(struct irq_desc *desc, u8 *cfg)
+{
+struct its_collection *col;
+struct its_device *its_dev = get_irq_its_device(desc);
+u32 vid = irq_to_vid(desc);
+u16 col_id;
+
+ASSERT(vid < its_dev->nr_lpis);
+
+/*
+ * Make the above write visible to the redistributors.
+ * And yes, we're flushing exactly: One. Single. Byte.
+ * Humpf...
+ */
+if ( gic_rdists->flags & RDIST_FLAGS_PROPBASE_NEEDS_FLUSHING )
+clean_and_invalidate_dcache_va_range(cfg, sizeof(*cfg));
+else
+dsb(ishst);
+
+/* Get collection id for this event id */
+col_id = irqdesc_get_collection(desc);
+col = &its_dev->its->collections[col_id];
+its_send_inv(its_dev, col, vid);
+}
+
+static void its_set_lpi_state(struct irq_desc *desc, int enable)
+{
+u8 *cfg;
+
+ASSERT(spin_is_locked(&its_lock));
+
+cfg = gic_rdists->prop_page + desc->irq - FIRST_GIC_LPI;
+if ( enable )
+*cfg |= LPI_PROP_ENABLED;
+else
+*cfg &= ~LPI_PROP_ENABLED;
+
+its_flush_and_invalidate_prop(desc, cfg);
+}
+
+static void its_irq_enable(struct irq_desc *desc)
+{
+u

[Xen-devel] [PATCH v5 20/22] xen/arm: ITS: Map ITS translation space

2015-07-27 Thread vijay . kilari
From: Vijaya Kumar K 

The ITS translation space contains the GITS_TRANSLATOR register,
which a device writes to raise an LPI. This space needs to be
mapped into every domain's address space for all available
physical ITSes, so that devices can access the GITS_TRANSLATOR
register through the SMMU.

Signed-off-by: Vijaya Kumar K 
Acked-by: Ian Campbell 
---
 xen/arch/arm/vgic-v3-its.c |   38 +-
 1 file changed, 37 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index e182cee..27523f4 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -1060,6 +1060,42 @@ static const struct mmio_handler_ops vgic_gits_mmio_handler = {
 .write_handler = vgic_v3_gits_mmio_write,
 };
 
+/*
+ * Map the 64K ITS translation space in guest.
+ * This is required purely for device smmu writes.
+ */
+
+static int vits_map_translation_space(struct domain *d)
+{
+uint64_t addr, size;
+int ret;
+
+if ( !is_hardware_domain(d) )
+return 0;
+
+ASSERT(is_domain_direct_mapped(d));
+
+addr = d->arch.vgic.vits->gits_base + SZ_64K;
+size = SZ_64K;
+
+/* Using 1:1 mapping to map translation space */
+/* TODO: Handle DomU mapping */
+ret = map_mmio_regions(d,
+   paddr_to_pfn(addr & PAGE_MASK),
+   DIV_ROUND_UP(size, PAGE_SIZE),
+   paddr_to_pfn(addr & PAGE_MASK));
+
+if ( ret )
+{
+ dprintk(XENLOG_G_ERR, "vITS: Unable to map to dom%d access to"
+ " 0x%"PRIx64" - 0x%"PRIx64"\n",
+ d->domain_id,
+ addr & PAGE_MASK, PAGE_ALIGN(addr + size) - 1);
+}
+
+return ret;
+}
+
 int vits_domain_init(struct domain *d)
 {
 struct vgic_its *vits;
@@ -1120,7 +1156,7 @@ int vits_domain_init(struct domain *d)
 
 register_mmio_handler(d, &vgic_gits_mmio_handler, vits->gits_base, SZ_64K);
 
-return 0;
+return vits_map_translation_space(d);
 }
 
 void vits_domain_free(struct domain *d)
-- 
1.7.9.5
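
The address arithmetic behind the 1:1 mapping in this patch can be checked in isolation: the translation frame is the 64K region that follows the 64K control frame, and it is mapped in whole pages. The sketch below assumes 4K pages; the demo_* names and the base address used to exercise it are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12 /* assumes 4K pages */
#define DEMO_PAGE_SIZE  (UINT64_C(1) << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_MASK  (~(DEMO_PAGE_SIZE - 1))
#define DEMO_SZ_64K     UINT64_C(0x10000)

/* First page frame and page count of the region to map. */
struct demo_range {
    uint64_t start_pfn;
    uint64_t nr_pages;
};

/* The translation frame sits one 64K frame above the ITS base. */
static struct demo_range demo_translation_range(uint64_t gits_base)
{
    struct demo_range r;
    uint64_t addr = gits_base + DEMO_SZ_64K;

    assert(addr >= gits_base); /* the addition must not wrap */

    r.start_pfn = (addr & DEMO_PAGE_MASK) >> DEMO_PAGE_SHIFT;
    r.nr_pages = (DEMO_SZ_64K + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE;

    return r;
}
```

Because the mapping is 1:1, the same frame number serves as both the guest and the machine frame, which is why the patch passes one value twice to map_mmio_regions().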




[Xen-devel] [PATCH v5 22/22] xen/arm: ITS: Add pci devices in ThunderX

2015-07-27 Thread vijay . kilari
From: Vijaya Kumar K 

The ITS initialization required for all PCI devices on the
ThunderX platform is done by calling it from the
platform-specific mapping function.

This patch can be reverted once the Xen PCI passthrough
framework for arm64 is available.

For now all the PCI devices are assigned to Dom0

Signed-off-by: Vijaya Kumar K 
Acked-by: Ian Campbell 
---
 xen/arch/arm/platforms/Makefile   |1 +
 xen/arch/arm/platforms/thunderx.c |  151 +
 2 files changed, 152 insertions(+)

diff --git a/xen/arch/arm/platforms/Makefile b/xen/arch/arm/platforms/Makefile
index e173fec..d9f98f9 100644
--- a/xen/arch/arm/platforms/Makefile
+++ b/xen/arch/arm/platforms/Makefile
@@ -7,3 +7,4 @@ obj-$(CONFIG_ARM_32) += sunxi.o
 obj-$(CONFIG_ARM_32) += rcar2.o
 obj-$(CONFIG_ARM_64) += seattle.o
 obj-$(CONFIG_ARM_64) += xgene-storm.o
+obj-$(CONFIG_ARM_64) += thunderx.o
diff --git a/xen/arch/arm/platforms/thunderx.c b/xen/arch/arm/platforms/thunderx.c
new file mode 100644
index 000..7c335ba
--- /dev/null
+++ b/xen/arch/arm/platforms/thunderx.c
@@ -0,0 +1,151 @@
+/*
+ * xen/arch/arm/platforms/thunderx.c
+ *
+ * Cavium Thunder specific settings
+ *
+ * Vijaya Kumar K 
+ * Copyright (c) 2015 Cavium Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+
+struct pci_dev_list 
+{
+   uint32_t seg;
+   uint32_t bus;
+   uint32_t dev;
+   uint32_t func;
+};
+
+#define NUM_DEVIDS   54
+
+static struct pci_dev_list bdf[NUM_DEVIDS] =
+{
+{0, 0, 2, 0}, /* 1 */
+{0, 0, 6, 0},
+{0, 0, 7, 0},
+{0, 0, 10, 0},
+{0, 0, 11, 0},
+{0, 1, 0, 0},
+{0, 1, 0, 1},
+{0, 1, 0, 5},
+{0, 1, 1, 4},
+{0, 1, 9, 0}, /* 10 */
+{0, 1, 9, 1},
+{0, 1, 9, 2},
+{0, 1, 9, 3},
+{0, 1, 9, 4},
+{0, 1, 9, 5},
+{0, 1, 10, 0},
+{0, 1, 10, 1},
+{0, 1, 10, 2},
+{0, 1, 10, 3},
+{0, 1, 14, 0}, /* 20 */
+{0, 1, 14, 2},
+{0, 1, 14, 4},
+{0, 1, 16, 0},
+{0, 1, 16, 1},
+{0, 2, 0, 0},
+{0, 3, 0, 0},
+{0, 4, 0, 0},
+{1, 0, 8, 0},
+{1, 0, 9, 0},
+{1, 0, 10, 0},  /* 30 */
+{1, 0, 11, 0},
+{2, 0, 1, 0},
+{2, 0, 3, 0},
+{2, 1, 0, 0},
+{2, 1, 0, 1},
+{2, 1, 0, 2},
+{2, 1, 0, 3},
+{2, 1, 0, 4},
+{2, 1, 0, 5},
+{2, 1, 0, 6}, /* 40 */
+{2, 1, 0, 7},
+{2, 1, 1, 0},
+{2, 1, 1, 1},
+{2, 1, 1, 2},
+{2, 1, 1, 3},
+{2, 1, 1, 4},
+{2, 1, 1, 5},
+{2, 1, 1, 6},
+{2, 1, 1, 7},
+{2, 1, 2, 0}, /* 50 */
+{2, 1, 2, 1},
+{2, 1, 2, 2},
+{2, 1, 1, 7},
+{3, 0, 1, 0}, /* 54 */
+};
+
+#define BDF_TO_DEVID(seg, bus, dev, func) (seg << 16 | bus << 8 | dev << 3| func)
+
+/* TODO: add and assign devices using PCI framework */
+static int thunderx_specific_mapping(struct domain *d)
+{
+struct dt_device_node *dt_its;
+uint32_t devid, i;
+int res;
+
+static const struct dt_device_match its_device_ids[] __initconst =
+{
+DT_MATCH_GIC_ITS,
+{ /* sentinel */ },
+};
+
+for (dt_its = dt_find_matching_node(NULL, its_device_ids); dt_its;
+   dt_its = dt_find_matching_node(dt_its, its_device_ids))
+{
+break;
+}
+
+if ( dt_its == NULL )
+{
+dprintk(XENLOG_ERR, "ThunderX: ITS node not found to add device\n");
+return 0;
+}
+
+for ( i = 0; i < NUM_DEVIDS; i++ )
+{
+devid = BDF_TO_DEVID(bdf[i].seg, bdf[i].bus,bdf[i].dev, bdf[i].func);
+res = its_add_device(devid, 32, dt_its);
+if ( res )
+return res;
+res = its_assign_device(d, devid, devid);
+if ( res )
+return res;
+}
+
+return 0;
+}
+
+static const char * const thunderx_dt_compat[] __initconst =
+{
+"cavium,thunder-88xx",
+NULL
+};
+
+PLATFORM_START(thunderx, "THUNDERX")
+.compatible = thunderx_dt_compat,
+.specific_mapping = thunderx_specific_mapping,
+PLATFORM_END
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.7.9.5




[Xen-devel] [PATCH v5 05/22] xen/arm: ITS: Port ITS driver to Xen

2015-07-27 Thread vijay . kilari
From: Vijaya Kumar K 

The Linux driver is based on 4.1, at the commit ID below:

3ad2a5f57656a14d964b673a5a0e4ab0e583c870

Only the following code from the Linux ITS driver is ported
and compiled:
 - LPI initialization
 - ITS configuration code
 - Physical command queue management
 - ITS command building

Also, the redistributor information is split into the rdist and
rdist_prop structures.

The rdist_prop struct holds the information common to all
redistributors, and the rdist struct holds the per-CPU
information.

This per-CPU rdist is defined as a global and shared with the
physical ITS driver.

Signed-off-by: Vijaya Kumar K 
---
v5:
  - dump_cmd is called from its_flush_cmd
  - Added its_lpi_alloc_chunks, lpi_free, its_send_inv, its_send_mapd,
its_send_mapvi to this patch as these functions are ported from
linux and more logical to be in this patch.
For now these functions are non-static. Will be made static
when used.
  - Used this_cpu instead of per_cpu
  - Moved GITS_* definitions to gic-its.h
v4: Major changes
  - Redistributor refactoring patch is merged
  - Fixed comments from v3 related to coding style and
removing duplicate code.
  - Target address is stored from bits[48:16] to avoid
shifting of target address while building ITS commands
  - Removed non-static functions
  - Removed usage of command builder functions
  - Changed its_cmd_block union to include mix of bit and unsigned
variable types to define ITS command structure
v3:
  - Only the required changes from the Linux ITS driver are ported
  - Xen coding style is followed.
---
 xen/arch/arm/gic-v3-its.c | 1014 +
 xen/arch/arm/gic-v3.c |   15 +-
 xen/include/asm-arm/gic-its.h |  275 ++
 xen/include/asm-arm/gic.h |1 +
 xen/include/asm-arm/gic_v3_defs.h |   42 +-
 5 files changed, 1340 insertions(+), 7 deletions(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
new file mode 100644
index 000..ba4110f
--- /dev/null
+++ b/xen/arch/arm/gic-v3-its.c
@@ -0,0 +1,1014 @@
+/*
+ * Copyright (C) 2013, 2014 ARM Limited, All Rights Reserved.
+ * Author: Marc Zyngier 
+ *
+ * Xen changes:
+ * Vijaya Kumar K 
+ * Copyright (C) 2014, 2015 Cavium Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define its_print(lvl, fmt, ...)  \
+printk("GIC-ITS:" fmt, ## __VA_ARGS__)
+
+#define its_err(fmt, ...) its_print(XENLOG_ERR, fmt, ## __VA_ARGS__)
+/* TODO: ratelimit for Xen messages */
+#define its_err_ratelimited(fmt, ...) \
+its_print(XENLOG_ERR, fmt, ## __VA_ARGS__)
+
+#define its_dbg(fmt, ...) \
+its_print(XENLOG_DEBUG, fmt, ## __VA_ARGS__)
+
+#define its_info(fmt, ...)\
+its_print(XENLOG_INFO, fmt, ## __VA_ARGS__)
+
+#define its_warn(fmt, ...)\
+its_print(XENLOG_WARNING, fmt, ## __VA_ARGS__)
+
+//#define DEBUG_GIC_ITS
+
+#ifdef DEBUG_GIC_ITS
+# define DPRINTK(fmt, args...) printk(XENLOG_DEBUG fmt, ##args)
+#else
+# define DPRINTK(fmt, args...) do {} while ( 0 )
+#endif
+
+#define ITS_FLAGS_CMDQ_NEEDS_FLUSHING (1 << 0)
+#define RDIST_FLAGS_PROPBASE_NEEDS_FLUSHING   (1 << 0)
+
+/*
+ * The ITS structure - contains most of the infrastructure, with the
+ * msi_controller, the command queue, the collections, and the list of
+ * devices writing to it.
+ */
+struct its_node {
+spinlock_t  lock;
+struct list_headentry;
+void __iomem*base;
+paddr_t phys_base;
+paddr_t phys_size;
+its_cmd_block   *cmd_base;
+its_cmd_block   *cmd_write;
+void*tables[GITS_BASER_NR_REGS];
+u32 order[GITS_BASER_NR_REGS];
+struct its_collection   *collections;
+u64 flags;
+u32 ite_size;
+struct dt_device_node   *dt_node;
+};
+
+#define ITS_ITT_ALIGNSZ_256
+
+static LIST_HEAD(its_nodes);
+static DEFINE_SPINLOCK(its_lock);
+static struct rdist_prop  *gic_rdists;
+
+#define gic_data_rdis

Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-27 Thread Juergen Gross

On 07/27/2015 12:54 PM, Andrew Cooper wrote:

On 27/07/15 11:43, George Dunlap wrote:

On Mon, Jul 27, 2015 at 5:35 AM, Juergen Gross  wrote:

On 07/24/2015 06:44 PM, Boris Ostrovsky wrote:

On 07/24/2015 12:39 PM, Juergen Gross wrote:



I don't say mangling cpuids can't solve the scheduling problem. It
surely can. But it can't solve the scheduling problem without hiding
information like number of sockets or cores which might be required
for license purposes. If we don't care, fine.


(this is somewhat repeating the email I just sent)

Why can't we construct socket/core info with CPUID (and *possibly* ACPI
changes) so that we present a reasonable (licensing-wise) picture?

Can you suggest an example where it will not work and then maybe we can
figure something out?


Let's assume software with a license based on core count. You have a
system with two 8-core processors and hyperthreads enabled, summing up
to 32 logical processors. Your license is valid for up to 16 cores, so
running the software on bare metal on your system is fine.

Now you are running the software inside a virtual machine with 24 vcpus
in a cpupool with 24 logical cpus limited to 12 cores (6 cores of each
processor). As we have to hide hyperthreading in order not to have
to pin each vcpu to just a single logical processor, the topology
resulting from this picture will have to present 24 cores. The license
will not cover this hardware.
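
The arithmetic in this example can be checked with a small sketch. Everything below is hypothetical and only models the numbers from the paragraphs above: the struct topology type, the helper names, and the idea that a license check simply compares the visible core count against a limit are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of a CPU topology as software would see it. */
struct topology {
    unsigned int sockets;
    unsigned int cores_per_socket;
    unsigned int threads_per_core;
};

static inline unsigned int logical_cpus(const struct topology *t)
{
    return t->sockets * t->cores_per_socket * t->threads_per_core;
}

static inline unsigned int visible_cores(const struct topology *t)
{
    return t->sockets * t->cores_per_socket;
}

/* Assumed license rule: the visible core count must not exceed the limit. */
static inline bool license_covers(const struct topology *t,
                                  unsigned int licensed_cores)
{
    return visible_cores(t) <= licensed_cores;
}

/* Walks through the three cases discussed in the thread. */
void license_example(void)
{
    /* Bare metal: two 8-core processors with hyperthreading. */
    const struct topology host = { 2, 8, 2 };
    /* Guest with hyperthreading hidden: 24 vcpus shown as 24 cores. */
    const struct topology guest_no_ht = { 2, 12, 1 };
    /* Guest shown the real layout of its cpupool: 12 cores, 2 threads each. */
    const struct topology guest_real = { 2, 6, 2 };

    assert(logical_cpus(&host) == 32 && license_covers(&host, 16));
    assert(logical_cpus(&guest_no_ht) == 24 && !license_covers(&guest_no_ht, 16));
    assert(logical_cpus(&guest_real) == 24 && license_covers(&guest_real, 16));
}
```

The second case is the problematic one: the same 24 logical cpus are covered by a 16-core license when threads are reported truthfully, but not when each vcpu is presented as a core.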

But how does doing a PV topology help this situation?  Because we're
telling one thing to the OS (via our PV interface) and another thing
to applications (via direct CPUID access)?


I expressed exactly these concerns right back at the start of the vnuma
work.

The OS and its userspace can and will use cpuid.  Most examples will
only use cpuid.  The only thing worse than providing no NUMA information
at all is providing conflicting information between cpuid and vnuma.

IMO, HVM guests should get all their NUMA information from the same
sources as native hardware would provide.  PV guests are admittedly
harder as in general we cannot hide the real topology information in
cpuid.


Are you aware the same is true currently even without vNUMA?

The Linux kernel (and other OSes as well) will make scheduling decisions
based on cpuid data obtained during boot. The information will be
correct only by chance and the real relation between vcpus and pcpus is
changing all the time.

So without adapting the kernel to that scenario it won't run optimally.
You can either change the data to let the kernel make some sane
decisions (cpuid mangling) or you can adapt the kernel somehow, e.g.
by modifying the kernel internal tables used for making scheduling
decisions (my proposal).

Something should be done regardless of the vNUMA support.


Juergen




[Xen-devel] [PATCH v5 19/22] xen/arm: ITS: Add APIs to add and assign device

2015-07-27 Thread vijay . kilari
From: Vijaya Kumar K 

Add APIs to add devices to the RB-tree, and to assign devices
to and remove them from a domain.

Signed-off-by: Vijaya Kumar K 
---
v5: - Removed its_detach_device API
- Pass nr_ites as parameter to its_add_device
v4: - Introduced helper to populate its_device struct
- Fixed freeing of its_device memory
- its_device struct holds domain id
---
 xen/arch/arm/gic-v3-its.c |  224 +++--
 xen/arch/arm/irq.c|8 ++
 xen/include/asm-arm/gic-its.h |2 +
 xen/include/asm-arm/irq.h |1 +
 4 files changed, 226 insertions(+), 9 deletions(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 0d8582a..99f6edc 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -115,8 +115,21 @@ u32 its_get_nr_event_ids(void)
 return (1 << its_data.eventid_bits);
 }
 
+static struct its_node *its_get_phys_node(struct dt_device_node *dt)
+{
+struct its_node *its;
+
+list_for_each_entry(its, &its_nodes, entry)
+{
+if ( its->dt_node == dt )
+return its;
+}
+
+return NULL;
+}
+
 /* RB-tree helpers for its_device */
-struct its_device *its_find_device(u32 devid)
+static struct its_device *its_find_device(u32 devid)
 {
 struct rb_node *node = rb_its_dev.rb_node;
 
@@ -137,7 +150,7 @@ struct its_device *its_find_device(u32 devid)
 return NULL;
 }
 
-int its_insert_device(struct its_device *dev)
+static int its_insert_device(struct its_device *dev)
 {
 struct rb_node **new, *parent;
 
@@ -319,7 +332,7 @@ static void its_send_inv(struct its_device *dev, struct its_collection *col,
 its_send_single_command(dev->its, &cmd, col);
 }
 
-void its_send_mapd(struct its_device *dev, int valid)
+static void its_send_mapd(struct its_device *dev, int valid)
 {
 its_cmd_block cmd;
 unsigned long itt_addr;
@@ -357,8 +370,8 @@ static void its_send_mapc(struct its_node *its, struct its_collection *col,
 its_send_single_command(its, &cmd, col);
 }
 
-void its_send_mapvi(struct its_device *dev, struct its_collection *col,
-u32 phys_id, u32 event)
+static void its_send_mapvi(struct its_device *dev, struct its_collection *col,
+   u32 phys_id, u32 event)
 {
 its_cmd_block cmd;
 
@@ -383,8 +396,8 @@ static void its_send_invall(struct its_node *its, struct its_collection *col)
 its_send_single_command(its, &cmd, NULL);
 }
 
-void its_send_discard(struct its_device *dev, struct its_collection *col,
-  u32 event)
+static void its_send_discard(struct its_device *dev, struct its_collection *col,
+ u32 event)
 {
 its_cmd_block cmd;
 
@@ -573,7 +586,7 @@ static int its_lpi_init(u32 id_bits)
 return 0;
 }
 
-unsigned long *its_lpi_alloc_chunks(int nirqs, int *base)
+static unsigned long *its_lpi_alloc_chunks(int nirqs, int *base)
 {
 unsigned long *bitmap = NULL;
 int chunk_id, nr_chunks, nr_ids, i;
@@ -617,7 +630,7 @@ out:
 return bitmap;
 }
 
-void its_lpi_free(struct its_device *dev)
+static void its_lpi_free(struct its_device *dev)
 {
 int lpi;
 
@@ -639,6 +652,199 @@ void its_lpi_free(struct its_device *dev)
 xfree(dev->lpi_map);
 }
 
+static void its_discard_lpis(struct its_device *dev, u32 ids)
+{
+struct its_collection *col;
+int i;
+
+for ( i = 0; i < ids; i++)
+{
+   col = &dev->its->collections[(i % nr_cpu_ids)];
+   its_send_discard(dev, col, i);
+}
+}
+
+static inline u32 its_get_plpi(struct its_device *dev, u32 event)
+{
+return dev->lpi_base + event;
+}
+
+static int its_alloc_device_irq(struct its_device *dev, u32 *hwirq)
+{
+int idx;
+
+idx = find_first_zero_bit(dev->lpi_map, dev->nr_lpis);
+if ( idx == dev->nr_lpis )
+return -ENOSPC;
+
+*hwirq = its_get_plpi(dev, idx);
+set_bit(idx, dev->lpi_map);
+
+return 0;
+}
+
+static void its_free_device(struct its_device *dev)
+{
+xfree(dev->itt_addr);
+xfree(dev->lpi_map);
+xfree(dev);
+}
+
+static struct its_device *its_alloc_device(u32 devid, u32 nr_ites,
+   struct dt_device_node *dt_its)
+{
+struct its_device *dev;
+paddr_t *itt;
+unsigned long *lpi_map;
+int lpi_base, sz;
+
+dev = xzalloc(struct its_device);
+if ( dev == NULL )
+return NULL;
+
+dev->its = its_get_phys_node(dt_its);
+if (dev->its == NULL)
+{
+dprintk(XENLOG_G_ERR, "ITS: Failed to find ITS node 0x%"PRIx32"\n",
+devid);
+goto err;
+}
+
+sz = nr_ites * dev->its->ite_size;
+sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1;
+itt = xzalloc_bytes(sz);
+if ( !itt )
+goto err;
+
+lpi_map = its_lpi_alloc_chunks(nr_ites, &lpi_base);
+if ( !lpi_map )
+goto lpi_err;
+
+dev->itt_addr = itt;
+dev->lpi_map = lpi_map;
+dev->lpi_base = lpi_base;
+dev->nr_lpis = nr_ites;
+dev->device_id = devid;
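
As a rough illustration of the per-device LPI bookkeeping in its_alloc_device_irq() above (find the first clear bit in lpi_map, mark it used, and return lpi_base plus the index), here is a portable sketch. It does not use Xen's find_first_zero_bit() or set_bit(); the struct lpi_dev type and the helper names are hypothetical stand-ins, and -1 stands in for the -ENOSPC error of the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Simplified, hypothetical stand-in for the patch's struct its_device. */
struct lpi_dev {
    unsigned long *lpi_map;  /* one bit per event; a set bit means in use */
    unsigned int nr_lpis;    /* number of events (bits) for this device */
    uint32_t lpi_base;       /* physical LPI number of event 0 */
};

/* Portable replacement for find_first_zero_bit(); returns nbits if full. */
static inline unsigned int first_zero_bit(const unsigned long *map,
                                          unsigned int nbits)
{
    for (unsigned int i = 0; i < nbits; i++)
        if (!(map[i / BITS_PER_WORD] & (1UL << (i % BITS_PER_WORD))))
            return i;

    return nbits;
}

/*
 * Allocate the lowest free event of the device.
 * Returns 0 and stores the physical LPI in *hwirq, or -1 when every
 * event is already in use.
 */
static inline int alloc_device_irq(struct lpi_dev *dev, uint32_t *hwirq)
{
    unsigned int idx;

    assert(dev != NULL && hwirq != NULL);

    idx = first_zero_bit(dev->lpi_map, dev->nr_lpis);
    if (idx == dev->nr_lpis)
        return -1;

    dev->lpi_map[idx / BITS_PER_WORD] |= 1UL << (idx % BITS_PER_WORD);
    *hwirq = dev->lpi_base + idx;

    return 0;
}
```

For a device with three events and lpi_base 8192, successive calls would hand out 8192, 8193 and 8194, and the fourth call would fail.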

[Xen-devel] [PATCH v2] xen/events: Support event channel rebind on ARM

2015-07-27 Thread Julien Grall
Currently, the event channel rebind code is gated on the presence of
the vector callback.

The virtual interrupt controller on ARM has the concept of per-CPU
interrupts (PPIs), which allow us to support per-VCPU event channels.
Therefore there is no need for the vector callback on ARM.

Xen is already using a free PPI to notify the guest VCPU of an event.
Furthermore, the Xen initialization code in Linux (see
arch/arm/xen/enlighten.c) correctly requests a per-CPU IRQ.

Introduce a new helper, xen_support_evtchn_rebind, to allow the
architecture to decide whether rebinding an event channel is supported.
It always returns 1 on ARM and keeps the same behavior on x86.

This also allows us to drop the usage of xen_have_vector_callback
entirely in the ARM code.

Signed-off-by: Julien Grall 
Reviewed-by: David Vrabel 
Cc: Stefano Stabellini 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
---
Tested by setting the affinity manually via /proc on dom0 ARM64

Changes in v2:
- Use a static inline rather than a macro
- Add David's reviewed-by
---
 arch/arm/include/asm/xen/events.h   |  6 ++
 arch/arm/xen/enlighten.c|  4 
 arch/arm64/include/asm/xen/events.h |  6 ++
 arch/x86/include/asm/xen/events.h   | 14 ++
 drivers/xen/events/events_base.c|  6 +-
 include/xen/events.h|  1 -
 6 files changed, 27 insertions(+), 10 deletions(-)

diff --git a/arch/arm/include/asm/xen/events.h b/arch/arm/include/asm/xen/events.h
index 8b1f37b..2123aaa 100644
--- a/arch/arm/include/asm/xen/events.h
+++ b/arch/arm/include/asm/xen/events.h
@@ -20,4 +20,10 @@ static inline int xen_irqs_disabled(struct pt_regs *regs)
atomic64_t, \
counter), (val))
 
+/* Rebind event channel is supported by default */
+static inline bool xen_support_evtchn_rebind(void)
+{
+   return 1;
+}
+
 #endif /* _ASM_ARM_XEN_EVENTS_H */
diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 9055c92..c50c8d3 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -45,10 +45,6 @@ static struct vcpu_info __percpu *xen_vcpu_info;
 unsigned long xen_released_pages;
 struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS] __initdata;
 
-/* TODO: to be removed */
-__read_mostly int xen_have_vector_callback;
-EXPORT_SYMBOL_GPL(xen_have_vector_callback);
-
 static __read_mostly unsigned int xen_events_irq;
 
 static __initdata struct device_node *xen_node;
diff --git a/arch/arm64/include/asm/xen/events.h b/arch/arm64/include/asm/xen/events.h
index 8655321..ed4aa90 100644
--- a/arch/arm64/include/asm/xen/events.h
+++ b/arch/arm64/include/asm/xen/events.h
@@ -18,4 +18,10 @@ static inline int xen_irqs_disabled(struct pt_regs *regs)
 
 #define xchg_xen_ulong(ptr, val) xchg((ptr), (val))
 
+/* Rebind event channel is supported by default */
+static inline bool xen_support_evtchn_rebind(void)
+{
+   return 1;
+}
+
 #endif /* _ASM_ARM64_XEN_EVENTS_H */
diff --git a/arch/x86/include/asm/xen/events.h b/arch/x86/include/asm/xen/events.h
index 608a79d..37bc720 100644
--- a/arch/x86/include/asm/xen/events.h
+++ b/arch/x86/include/asm/xen/events.h
@@ -20,4 +20,18 @@ static inline int xen_irqs_disabled(struct pt_regs *regs)
 /* No need for a barrier -- XCHG is a barrier on x86. */
 #define xchg_xen_ulong(ptr, val) xchg((ptr), (val))
 
+/*
+ * Events delivered via platform PCI interrupts are always
+ * routed to vcpu 0 and hence cannot be rebound.
+ */
+static inline bool xen_support_evtchn_rebind(void)
+{
+   return (!xen_hvm_domain() || xen_have_vector_callback);
+}
+
+#define xen_support_evtchn_rebind()\
+   (!xen_hvm_domain() || xen_have_vector_callback)
+
+extern int xen_have_vector_callback;
+
 #endif /* _ASM_X86_XEN_EVENTS_H */
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 96093ae..ed620e5 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -1301,11 +1301,7 @@ static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
if (!VALID_EVTCHN(evtchn))
return -1;
 
-   /*
-* Events delivered via platform PCI interrupts are always
-* routed to vcpu 0 and hence cannot be rebound.
-*/
-   if (xen_hvm_domain() && !xen_have_vector_callback)
+   if (!xen_support_evtchn_rebind())
return -1;
 
/* Send future instances of this interrupt to other vcpu. */
diff --git a/include/xen/events.h b/include/xen/events.h
index 7d95fdf..88da2ab 100644
--- a/include/xen/events.h
+++ b/include/xen/events.h
@@ -92,7 +92,6 @@ void xen_hvm_callback_vector(void);
 #ifdef CONFIG_TRACING
 #define trace_xen_hvm_callback_vector xen_hvm_callback_vector
 #endif
-extern int xen_have_vector_callback;
 int
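
The decision this patch moves into xen_support_evtchn_rebind() can be modelled outside the kernel as follows. This is only a sketch: struct guest_state and the function names are hypothetical, and the two booleans stand in for what xen_hvm_domain() and xen_have_vector_callback report on x86.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state the x86 helper consults. */
struct guest_state {
    bool hvm_domain;            /* models xen_hvm_domain() */
    bool have_vector_callback;  /* models xen_have_vector_callback */
};

/*
 * x86 rule from the patch: only an HVM guest without the vector
 * callback cannot rebind, because its events arrive via the platform
 * PCI interrupt, which is always routed to vcpu 0.
 */
static inline bool x86_support_evtchn_rebind(const struct guest_state *g)
{
    return !g->hvm_domain || g->have_vector_callback;
}

/* ARM/ARM64 rule from the patch: rebinding is always supported. */
static inline bool arm_support_evtchn_rebind(void)
{
    return true;
}

/* Mirrors the early exit in rebind_irq_to_cpu(): -1 when unsupported. */
static inline int rebind_check(bool supported)
{
    return supported ? 0 : -1;
}

/* Walks through the full truth table for the x86 rule. */
void rebind_example(void)
{
    const struct guest_state pv         = { false, false };
    const struct guest_state hvm_vector = { true,  true  };
    const struct guest_state hvm_pci    = { true,  false };

    assert(rebind_check(x86_support_evtchn_rebind(&pv)) == 0);
    assert(rebind_check(x86_support_evtchn_rebind(&hvm_vector)) == 0);
    assert(rebind_check(x86_support_evtchn_rebind(&hvm_pci)) == -1);
    assert(rebind_check(arm_support_evtchn_rebind()) == 0);
}
```

Keeping the rule behind one per-architecture helper lets the common code in events_base.c stay free of x86-only symbols, which is what allows the ARM definition of xen_have_vector_callback to be dropped.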

Re: [Xen-devel] [PATCH v3 07/32] xen/x86: fix arch_set_info_guest for HVM guests

2015-07-27 Thread Roger Pau Monné
On 24/07/15 at 19:36, Konrad Rzeszutek Wilk wrote:
> On Fri, Jul 24, 2015 at 06:54:09PM +0200, Roger Pau Monné wrote:
>> I have the feeling that we are over-engineering this interface. IMHO we
>> should only allow the user to set the control registers, efer (on amd64)
>> and the GP registers. Segment selectors would be set by Xen to point to
>> a flat segment suitable for the mode the guest has selected. I think
>> that should be enough to get the guest OS into its entry point, and
>> then it can do whatever it wants.
> 
> If you are doing that why not use the old interface?

Because the current structure contains quite a lot of PV fields, like
the failsafe_callback, the GDT... Also the old structure is limited in a
way that you can only use the x86- format if your kernel is
compiled with , and we want to allow the vCPU to start in any
mode, regardless of the mode the kernel is compiled with.

>>
>> If that's not suitable, then my second option would be to allow the
>> guest to set the base, limit and AR bytes of the selectors, but not load
>> a GDT.
> 
> They should be able to do any of those operations as a normal HVM
> guest I would think?

Yes, hence I would like to limit setting anything that's not essential
to boot.




Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-27 Thread Juergen Gross

On 07/27/2015 01:11 PM, George Dunlap wrote:

On 07/27/2015 11:54 AM, Juergen Gross wrote:

On 07/27/2015 12:43 PM, George Dunlap wrote:

On Mon, Jul 27, 2015 at 5:35 AM, Juergen Gross  wrote:

On 07/24/2015 06:44 PM, Boris Ostrovsky wrote:


On 07/24/2015 12:39 PM, Juergen Gross wrote:




I don't say mangling cpuids can't solve the scheduling problem. It
surely can. But it can't solve the scheduling problem without hiding
information like number of sockets or cores which might be required
for license purposes. If we don't care, fine.



(this is somewhat repeating the email I just sent)

Why can't we construct socket/core info with CPUID (and *possibly* ACPI
changes) so that we present a reasonable (licensing-wise) picture?

Can you suggest an example where it will not work and then maybe we can
figure something out?



Let's assume software with a license based on core count. You have a
system with two 8-core processors and hyperthreads enabled, summing up
to 32 logical processors. Your license is valid for up to 16 cores, so
running the software on bare metal on your system is fine.

Now you are running the software inside a virtual machine with 24 vcpus
in a cpupool with 24 logical cpus limited to 12 cores (6 cores of each
processor). As we have to hide hyperthreading in order not to have
to pin each vcpu to just a single logical processor, the topology
resulting from this picture will have to present 24 cores. The license
will not cover this hardware.


But how does doing a PV topology help this situation?  Because we're
telling one thing to the OS (via our PV interface) and another thing
to applications (via direct CPUID access)?


Exactly.

In my example it would even work to not modify the cpuid information at
all. The kernel wouldn't try to be extremely clever regarding scheduling
and the user land would see the cpuid information from the real hardware
(only the 12 cores it is running on, of course).


Right; so it seems

1. Userspace applications are in the habit of reading CPUID to determine
the topology of the system they're running on

2. Many use the topology information to help themselves make better
scheduling decisions.  Because a vcpu is not typically pinned to a
specific pcpu, we may need to lie here slightly (e.g., not mention
threads) to get the optimal behavior overall.

3. Others use the topology information to implement licensing
restrictions.  Because threads are treated differently to cores, we want
to tell the truth here (i.e., make sure we mention that some of these
are threads) to get the optimal behavior overall.

Numbers #2 and #3 lead to contradictory courses of action; we cannot
optimize for both at the same time.

I think at some level we need to just try to accommodate both -- if the
user doesn't have licensing issues, or prefers performance over
licensing, then present a unified topology in PVH / HVM using CPUID,
ACPI, &c.  I think this should be the default.

If the user has licensing issues, and doesn't mind presenting a wonky or
unreliable topology to its guests, then let the raw CPUID through.  But
it would, in this case, be good to try to give the guest OS scheduler a
hint that it shouldn't really bother trying to read the topology or do
placement as a result, as any decisions will be unreliable.

Or alternately, if the user wants to give up on the "consolidation"
aspect of virtualization, they can pin vcpus to pcpus and then pass in
the actual host topology (hyperthreads and all).


There would be another solution, of course:

Support hyperthreads in the Xen scheduler via gang scheduling. While
this is not a simple solution, it is a fair one. Hyperthreads on one
core can influence each other quite a lot. With both threads always
running vcpus of the same guest the penalty/advantage would stay in the
same domain. The guest could make really sensible scheduling decisions
and the licensing would still work as desired.

Just an idea, but maybe worth exploring further instead of tweaking
more and more bits to make the virtual system somehow act sane.


Juergen



[Xen-devel] Xen Security Advisory 138 (CVE-2015-5154) - QEMU heap overflow flaw while processing certain ATAPI commands.

2015-07-27 Thread Xen . org security team
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Xen Security Advisory CVE-2015-5154 / XSA-138
  version 2

   QEMU heap overflow flaw while processing certain ATAPI commands.

UPDATES IN VERSION 2


Public release.

ISSUE DESCRIPTION
=

The QEMU security team has predisclosed the following advisory:

A heap overflow flaw was found in the way QEMU's IDE subsystem
handled I/O buffer access while processing certain ATAPI commands.

A privileged guest user in a guest with CDROM drive enabled could
potentially use this flaw to execute arbitrary code on the host
with the privileges of the host's QEMU process corresponding to
the guest.

IMPACT
==

An HVM guest which has access to an emulated IDE CDROM device
(e.g. with a device with "devtype=cdrom", or the "cdrom" convenience
alias, in the VBD configuration) can exploit this vulnerability to
take over the qemu process elevating its privilege to that of the qemu
process.

VULNERABLE SYSTEMS
==

All Xen systems running x86 HVM guests without stubdomains which have
been configured with an emulated CD-ROM driver model are vulnerable.

Systems using qemu-dm stubdomain device models (for example, by
specifying "device_model_stubdomain_override=1" in xl's domain
configuration files) are NOT vulnerable.

Both the traditional ("qemu-xen-traditional") or upstream-based
("qemu-xen") qemu device models are potentially vulnerable.

Systems running only PV guests are NOT vulnerable.

ARM systems are NOT vulnerable.

MITIGATION
==

Avoiding the use of emulated CD-ROM devices altogether, by not
specifying such devices in the domain configuration, will avoid this
issue.

Enabling stubdomains will mitigate this issue, by reducing the
escalation to only those privileges accorded to the service domain.
qemu-dm stubdomains are only available with "qemu-xen-traditional".

CREDITS
===

This issue was discovered by Kevin Wolf of Red Hat.

RESOLUTION
==

Applying the appropriate attached patch resolves this issue.

xsa138-qemut-{1,2}.patch qemu-xen-traditional, Xen unstable, Xen 4.5.x,
 Xen 4.4.x, Xen 4.3.x, Xen 4.2.x
xsa138-qemuu-{1,2,3}.patch   qemu-upstream, xen unstable, Xen 4.5.x,
 Xen 4.4.x, Xen 4.3.x
xsa138-qemuu-{1,3}.patch qemu-upstream, Xen 4.2.x

NOTE: xsa138-qemuu-2.patch is not required for Xen 4.2.x.

$ sha256sum xsa138*.patch
7e385455379d88658b8ab0d4c1ee9af21fff2e1dc0fe51cacc779afc83a4  xsa138-qemut-1.patch
c9a89082e36a0646a6fe002c6892d966d415d11ad5cfdcfea7e9c8d7a3f1316c  xsa138-qemut-2.patch
a076808f543c82aeac2f0239a4a46d9baadcd4e4b0a2f9ae7ded99cf59cffde6  xsa138-qemuu-1.patch
ed16dca7d2c179d0931d6e2503264d6593547a803eb3f08f6db7fff2127509a9  xsa138-qemuu-2.patch
090bdec00ede1f0ace1af52833038a74971e060d0c176b42bfca08511d36c644  xsa138-qemuu-3.patch
$

DEPLOYMENT DURING EMBARGO
=

Deployment of patches or mitigations is NOT permitted (except on
systems used and administered only by organisations which are members
of the Xen Project Security Issues Predisclosure List).  Specifically,
deployment on public cloud systems is NOT permitted.

The decision not to permit deployment was made by the group that, at
their discretion, disclosed the issue to the Xen Project Security
Team.

Deployment is permitted only AFTER the embargo ends.

(Note: this during-embargo deployment notice is retained in
post-embargo publicly released Xen Project advisories, even though it
is then no longer applicable.  This is to enable the community to have
oversight of the Xen Project Security Team's decisionmaking.)

For more information about permissible uses of embargoed information,
consult the Xen Project community's agreed Security Policy:
  http://www.xenproject.org/security-policy.html
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.12 (GNU/Linux)

iQEcBAEBAgAGBQJVth2LAAoJEIP+FMlX6CvZcd4IAJYWZrj86FDn9L5SqeTq8cLX
6tnptNaQb+uDQ/thV2R+nUVdJNaJt1UIhRhO2tD2g0dEqj/I7Vx/Hh95ncPCQ3fS
ec7ph9lcsdAy8E+7abNlhJnPsOVOazEwI0we2deKjdn3CqyfVXqA47rSDY4VChtc
kTV7lEIEebBlo1igz05/poUEhjkCP8UvSfpgpQY60N2y+C0OyIXPIog4q2LiEbeO
cq/deACYN3jOVwPTozkQNAAOq0++UfnGfDredOIYCbvqA5OtMf1DGlWyTQLIEuKJ
zCiatGudJI2klVYkHSVYfXr54WjreiRCOfLB9ilhBW7Yr2juWFQIAc+0Kf09uFo=
=I0Tz
-END PGP SIGNATURE-




[Xen-devel] [linux-3.18 test] 59942: regressions - trouble: broken/fail/pass

2015-07-27 Thread osstest service owner
flight 59942 linux-3.18 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59942/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail REGR. vs. 58581
 test-amd64-i386-xl-qemut-debianhvm-amd64 11 guest-saverestore fail REGR. vs. 58581

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl-cubietruck 3 host-install(3) broken in 59840 pass in 59942
 test-amd64-i386-qemut-rhel6hvm-intel 3 host-install(3) broken in 59840 pass in 59942
 test-amd64-amd64-xl-qemuu-win7-amd64 3 host-install(3) broken in 59840 pass in 59942
 test-armhf-armhf-xl-multivcpu  3 host-install(3)  broken pass in 59840
 test-amd64-amd64-rumpuserxen-amd64 15 rumpuserxen-demo-xenstorels/xenstorels.repeat fail in 59825 pass in 59942
 test-amd64-i386-pair 15 debian-install/dst_host fail in 59840 pass in 59942
 test-amd64-amd64-xl-qemut-debianhvm-amd64 9 debian-hvm-install fail in 59840 pass in 59942
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 9 debian-hvm-install fail in 59840 pass in 59942
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 11 guest-saverestore fail pass in 59825
 test-amd64-i386-xl-qemut-win7-amd64 15 guest-localmigrate/x10 fail pass in 59825

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt  6 xen-boot  fail REGR. vs. 58581
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install fail baseline untested
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 12 guest-localmigrate fail baseline untested
 test-armhf-armhf-xl-rtds 11 guest-start fail baseline untested
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 11 guest-saverestore fail in 59825 baseline untested
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 14 guest-localmigrate.2 fail in 59825 baseline untested
 test-armhf-armhf-xl-rtds 14 guest-start.2  fail in 59825 baseline untested
 test-armhf-armhf-xl-multivcpu  6 xen-boot fail in 59825 like 58581
 test-amd64-i386-rumpuserxen-i386 15 rumpuserxen-demo-xenstorels/xenstorels.repeat fail in 59840 like 58558
 test-armhf-armhf-xl-credit2   6 xen-boot fail   like 58581
 test-armhf-armhf-xl-xsm   6 xen-boot fail   like 58581
 test-armhf-armhf-xl   6 xen-boot fail   like 58581
 test-armhf-armhf-libvirt-xsm  6 xen-boot fail   like 58581
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 58581
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 58581
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 58581

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-rtds 12 migrate-support-check fail in 59825 never pass
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail in 59825 never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck  6 xen-boot fail never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check fail   never pass

version targeted for testing:
 linux                22a6cbf9f36ee3ae2878efcbdde33e6ca00b9c4b
baseline version:
 linux                d048c068d00da7d4cfa5ea7651933b99026958cf

Last test of basis    58581  2015-06-15 09:42:22 Z   42 days
Failing since         58976  2015-06-29 19:43:23 Z   27 days   35 attempts
Testing same since    59825  2015-07-22 17:19:07 Z    4 days    3 attempts


346 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops    pass
 build-armhf-pvops    pass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386

Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-27 Thread Tim Deegan
At 14:01 +0200 on 27 Jul (1438005701), Juergen Gross wrote:
> There would be another solution, of course:
> 
> Support hyperthreads in the Xen scheduler via gang scheduling. While
> this is not a simple solution, it is a fair one. Hyperthreads on one
> core can influence each other rather much. With both threads always
> running vcpus of the same guest the penalty/advantage would stay in the
> same domain. The guest could make really sensible scheduling decisions
> and the licensing would still work as desired.

This might also be interesting for people worried about side-channels
(e.g. http://sophia.re/RECON ).

Tim.



Re: [Xen-devel] [PATCH] libxl: check nesthvm and altp2m in libxl level

2015-07-27 Thread Julien Grall
Hi Wei,

On 24/07/15 16:39, Wei Liu wrote:
> In ea214001 ("x86/altp2m: add altp2mhvm HVM domain parameter"), a
> check was added to ensure nestedhvm and altp2m cannot be enabled at
> the same time. That check was added in xl, but in fact it should be in
> libxl because it should be the entity that decides whether
> the provided configuration is valid.
> 
> This patch moves the check to libxl. The code snippet is moved after
> calling libxl__domain_build_info_setdefault so that we can:
> 1. remove libxl_defbool_is_default in `if()';
> 2. detect mistake in libxl__domain_build_info_setdefault.
> 
> Signed-off-by: Wei Liu 
> ---
> Cc: Ed White 
> Cc: "Sahita, Ravi" 
> 
> I said I discovered an issue during review but also volunteered to fix
> it after the series is merged. Here is the patch to do that.
> ---
>  tools/libxl/libxl_create.c | 6 ++
>  tools/libxl/xl_cmdimpl.c   | 8 
>  2 files changed, 6 insertions(+), 8 deletions(-)
> 
> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> index 855b42c..de536ba 100644
> --- a/tools/libxl/libxl_create.c
> +++ b/tools/libxl/libxl_create.c
> @@ -883,6 +883,12 @@ static void initiate_domain_create(libxl__egc *egc,
>  goto error_out;
>  }
>  
> +if (libxl_defbool_val(d_config->b_info.u.hvm.nested_hvm) &&
> +libxl_defbool_val(d_config->b_info.u.hvm.altp2m)) {
> +LOG(ERROR, "nestedhvm and altp2mhvm cannot be used together");
> +goto error_out;
> +}
> +

The u.hvm.{nested_hvm,altp2m} can only be checked when the created
domain is an HVM.

But initiate_domain_create is called with either a PV or HVM domain. So
you may need to check if we are creating a HVM one.
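
A minimal sketch of the guard being suggested here, using hypothetical stand-in types rather than the real libxl structures (`sketch_build_info`, `SKETCH_DOMAIN_HVM` and `sketch_config_is_valid` are illustrative names, not libxl API). The HVM-only union members are read only after the domain type has been confirmed:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the libxl types; not the real definitions. */
enum sketch_domain_type { SKETCH_DOMAIN_PV, SKETCH_DOMAIN_HVM };

struct sketch_build_info {
    enum sketch_domain_type type;
    union {
        struct { bool nested_hvm; bool altp2m; } hvm;
        struct { bool e820_host; } pv;
    } u;
};

/*
 * Returns false only for an HVM config that enables both features.
 * The u.hvm members are not read unless the type says they are valid.
 */
bool sketch_config_is_valid(const struct sketch_build_info *info)
{
    assert(info != NULL);

    if (info->type != SKETCH_DOMAIN_HVM)
        return true;

    return !(info->u.hvm.nested_hvm && info->u.hvm.altp2m);
}
```

For a PV config the function returns before touching `u.hvm`, which is the property the review comment asks for.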

Regards,

-- 
Julien Grall



Re: [Xen-devel] [PATCH v2] xen/events: Support event channel rebind on ARM

2015-07-27 Thread David Vrabel
On 27/07/15 12:35, Julien Grall wrote:
> Currently, the event channel rebind code is gated with the presence of
> the vector callback.
> 
> The virtual interrupt controller on ARM has the concept of per-CPU
> interrupt (PPI) which allows us to support per-VCPU event channels.
> Therefore there is no need for a vector callback on ARM.
> 
> Xen is already using a free PPI to notify the guest VCPU of an event.
> Furthermore, the Xen initialization code in Linux (see
> arch/arm/xen/enlighten.c) correctly requests a per-CPU IRQ.
> 
> Introduce a new helper, xen_support_evtchn_rebind, to allow each
> architecture to decide whether rebinding an event channel is supported
> or not. It will always return 1 on ARM and keep the same behavior on x86.
> 
> This also allows us to drop the usage of xen_have_vector_callback
> entirely in the ARM code.

This did not apply cleanly.  Please always base patches on Linus's
master branch.

This also breaks the x86 build.

/local/davidvr/work/k.org/tip/arch/x86/include/asm/xen/events.h: In
function ‘xen_support_evtchn_rebind’:
/local/davidvr/work/k.org/tip/arch/x86/include/asm/xen/events.h:29:85:
error: ‘xen_have_vector_callback’ undeclared (first use in this function)
  return (!xen_hvm_domain() || xen_have_vector_callback);

> --- a/arch/arm/include/asm/xen/events.h
> +++ b/arch/arm/include/asm/xen/events.h
> @@ -20,4 +20,10 @@ static inline int xen_irqs_disabled(struct pt_regs *regs)
>   atomic64_t, \
>   counter), (val))
>  
> +/* Rebind event channel is supported by default */
> +static inline bool xen_support_evtchn_rebind(void)
> +{
> + return 1;

This should be true (similarly for arm64).
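
As a rough illustration of the two helpers under discussion, here is a self-contained sketch. The struct `sketch_x86_state` and both function names are hypothetical; in the kernel the x86 inputs would be `xen_hvm_domain()` and `xen_have_vector_callback`, which must be declared before the inline helper that uses them:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the x86 state the real helper consults. */
struct sketch_x86_state {
    bool hvm_domain;            /* stands in for xen_hvm_domain() */
    bool have_vector_callback;  /* stands in for xen_have_vector_callback */
};

/* ARM-style helper: rebind is always supported, so return the bool literal. */
bool sketch_arm_support_evtchn_rebind(void)
{
    return true;
}

/* x86-style helper, mirroring the expression quoted in the build error. */
bool sketch_x86_support_evtchn_rebind(const struct sketch_x86_state *s)
{
    assert(s != NULL);
    return !s->hvm_domain || s->have_vector_callback;
}
```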

David



Re: [Xen-devel] [PATCHv2 10/10] xen/balloon: pre-allocate p2m entries for ballooned pages

2015-07-27 Thread David Vrabel
On 27/07/15 12:01, Julien Grall wrote:
> On 27/07/15 10:30, David Vrabel wrote:
>> On 25/07/15 00:21, Julien Grall wrote:
>>> On 24/07/2015 12:47, David Vrabel wrote:
>>>> @@ -550,6 +551,11 @@ int alloc_xenballooned_pages(int nr_pages, struct page **pages)
>>>>   page = balloon_retrieve(true);
>>>>   if (page) {
>>>>   pages[pgno++] = page;
>>>> +#ifdef CONFIG_XEN_HAVE_PVMMU
>>>> +ret = xen_alloc_p2m_entry(page_to_pfn(page));
>>>
>>> Don't you want to call this function only when the guest is not using
>>> auto-translated physmap?
>>
>> xen_alloc_p2m_entry() is a nop in auto-xlate guests, so no need for an
>> additional check here.
> 
> I don't have the impression it's the case or it's not obvious.

Oops. You're right.  I'll add a

	if (xen_feature(XENFEAT_auto_translated_physmap))
		return true;

check at the top.
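
A minimal sketch of what that early return could look like, assuming a hypothetical `sketch_guest` structure in place of the kernel's `xen_feature()` query and a counter in place of the real p2m allocation (none of these names are kernel API):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical guest state; in the kernel this information would come
 * from xen_feature(XENFEAT_auto_translated_physmap). */
struct sketch_guest {
    bool auto_translated;     /* guest uses an auto-translated physmap */
    size_t p2m_entries_made;  /* counts allocations done by the sketch */
};

/* Sketch of the allocation helper with the early return at the top. */
bool sketch_alloc_p2m_entry(struct sketch_guest *g, unsigned long pfn)
{
    assert(g != NULL);
    (void)pfn; /* a real implementation would index the p2m by pfn */

    if (g->auto_translated)
        return true;          /* nothing to allocate for these guests */

    g->p2m_entries_made++;    /* placeholder for the real allocation */
    return true;
}
```

With the check inside the helper, the caller in alloc_xenballooned_pages() does not need a separate auto-translate test.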

David




Re: [Xen-devel] [PATCH v2] xen/events: Support event channel rebind on ARM

2015-07-27 Thread Julien Grall
On 27/07/15 13:35, David Vrabel wrote:
> On 27/07/15 12:35, Julien Grall wrote:
>> Currently, the event channel rebind code is gated with the presence of
>> the vector callback.
>>
>> The virtual interrupt controller on ARM has the concept of per-CPU
>> interrupt (PPI) which allows us to support per-VCPU event channels.
>> Therefore there is no need for a vector callback on ARM.
>>
>> Xen is already using a free PPI to notify the guest VCPU of an event.
>> Furthermore, the Xen initialization code in Linux (see
>> arch/arm/xen/enlighten.c) correctly requests a per-CPU IRQ.
>>
>> Introduce a new helper, xen_support_evtchn_rebind, to allow each
>> architecture to decide whether rebinding an event channel is supported
>> or not. It will always return 1 on ARM and keep the same behavior on x86.
>>
>> This also allows us to drop the usage of xen_have_vector_callback
>> entirely in the ARM code.
> 
> This did not apply cleanly.  Please always base patches on Linus's
> master branch.

My patch is based on Linus's master branch. However, it has [1] as a
prerequisite, which I sent a few seconds before it. That dependency only
matters for applying the patch; the two patches are otherwise unrelated.

I should have mentioned it in the mail, sorry.

> This also breaks the x86 build.

I forgot to test this new version when I turned the macro into a static
inline.

> /local/davidvr/work/k.org/tip/arch/x86/include/asm/xen/events.h: In
> function ‘xen_support_evtchn_rebind’:
> /local/davidvr/work/k.org/tip/arch/x86/include/asm/xen/events.h:29:85:
> error: ‘xen_have_vector_callback’ undeclared (first use in this function)
>   return (!xen_hvm_domain() || xen_have_vector_callback);
> 
>> --- a/arch/arm/include/asm/xen/events.h
>> +++ b/arch/arm/include/asm/xen/events.h
>> @@ -20,4 +20,10 @@ static inline int xen_irqs_disabled(struct pt_regs *regs)
>>  atomic64_t, \
>>  counter), (val))
>>  
>> +/* Rebind event channel is supported by default */
>> +static inline bool xen_support_evtchn_rebind(void)
>> +{
>> +return 1;
> 
> This should be true (similarly for arm64).

Will resend a new version, and remove the dependency on [1].

Regards,

[1] http://permalink.gmane.org/gmane.linux.kernel/2004690

-- 
Julien Grall



Re: [Xen-devel] [PATCH v3] xenconsole: Ensure exclusive access to console using locks

2015-07-27 Thread Martin Lucina
On Friday, 24.07.2015 at 17:01, Ian Jackson wrote:
> Martin Lucina writes ("[PATCH v3] xenconsole: Ensure exclusive access to 
> console using locks"):
> > If more than one instance of xenconsole is run against the same DOMID
> > then each instance will only get some data. This change ensures
> > exclusive access to the console by obtaining an exclusive lock on
> > /xenconsole..
> 
> Acked-by: Ian Jackson 

Can this also make it into the 4.6 release on the grounds of being a bugfix
for xenconsole, or is this change too invasive?

Martin



[Xen-devel] [PATCH v2 2/3] ts-debian-hvm-install: use di_installcmdline_core

2015-07-27 Thread Ian Campbell
This is primarily to get DEBIAN_FRONTEND=text, for easier-to-read
logging.

Previously the command line consisted of the console and
preseed/file=/preseed.cfg. After this it is more complex.

The preseed file uses file= which is an alias for preseed/file. Extra
options are given including DEBIAN_FRONTEND and DEBCONF_DEBUG and the
following are preseeded via the command line:

Previously implied were "auto=true preseed", which are now explicit.

In addition the following harmless (in this context) options are
added:
hw-detect/load_firmware=
hostname=
netcfg/dhcp_timeout=
netcfg/choose_interface=

The caller could also cause debconf/priority to be set, but doesn't
here.

ts-debian-di-install in the distro test series also uses
di_installcmdline_core for guest uses.

Signed-off-by: Ian Campbell 
---
v2: Refactor to use gcmdline to contain the repetitive bit.
---
 Osstest/Debian.pm |  4 +++-
 ts-debian-hvm-install | 28 +++-
 2 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/Osstest/Debian.pm b/Osstest/Debian.pm
index 7ce5d67..f0bcf06 100644
--- a/Osstest/Debian.pm
+++ b/Osstest/Debian.pm
@@ -627,6 +627,8 @@ our %preseed_cmds;
 sub di_installcmdline_core ($$;@) {
 my ($tho, $ps_url, %xopts) = @_;
 
+$xopts{PreseedScheme} //= 'url';
+
 $ps_url =~ s,^http://,,;
 
 my $netcfg_interface= get_host_property($tho,'interface force','auto');
@@ -640,7 +642,7 @@ sub di_installcmdline_core ($$;@) {
 push @cl, (
"DEBIAN_FRONTEND=$difront",
"hostname=$tho->{Name}",
-   "url=$ps_url",
+   "$xopts{PreseedScheme}=$ps_url",
"netcfg/dhcp_timeout=150",
"netcfg/choose_interface=$netcfg_interface"
);
diff --git a/ts-debian-hvm-install b/ts-debian-hvm-install
index 0c94c7e..d4639b3 100755
--- a/ts-debian-hvm-install
+++ b/ts-debian-hvm-install
@@ -97,23 +97,41 @@ END
 return $preseed_file;
 }
 
-sub grub_cfg () {
+sub gcmdline (;$) {
+my ($extra) = @_;
+my @dicmdline = ();
+my $gconsole = "console=ttyS0,115200n8";
+
+push @dicmdline, $gconsole;
+push @dicmdline, di_installcmdline_core($gho, '/preseed.cfg',
+   PreseedScheme => 'file');
+push @dicmdline, $extra if $extra;
+
+push @dicmdline, "--";
 # See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=762007 for
 # why console= is repeated.
+push @dicmdline, $gconsole;
+
+return join(" ", @dicmdline);
+}
+
+sub grub_cfg () {
+my $cmdline = gcmdline();
+
 return <<"END";
 set default="0"
 set timeout=5
 
 menuentry 'debian guest auto Install' {
-linux /install.amd/vmlinuz preseed/file=/preseed.cfg 
console=ttyS0,115200n8 -- console=ttyS0,115200n8
+linux /install.amd/vmlinuz $cmdline
 initrd /install.amd/initrd.gz
 }
 END
 }
 
 sub isolinux_cfg () {
-# See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=762007 for
-# why console= is repeated.
+my $cmdline = gcmdline("initrd=/install.amd/initrd.gz");
+
 return <<"END";
 default autoinstall
 prompt 0
@@ -121,7 +139,7 @@ sub isolinux_cfg () {
 
 label autoinstall
 kernel /install.amd/vmlinuz
-append preseed/file=/preseed.cfg initrd=/install.amd/initrd.gz 
console=ttyS0,115200n8 -- console=ttyS0,115200n8
+append $cmdline
 END
 }
 
-- 
2.1.4




[Xen-devel] [PATCH OSSTEST v2 0/3] fixes to ts-debian-hvm-install

2015-07-27 Thread Ian Campbell
The main one is the middle one which would have made
http://logs.test-lab.xenproject.org/osstest/logs/59681/test-amd64-i386-xl
-qemuu-debianhvm-amd64/info.html a lot easier to read due to the
DEBIAN_FRONTEND=text.

Since v1 I have applied some acks and refactored the middle patch to make
it all less repetitive.

Ian.




[Xen-devel] [PATCH v2 3/3] ts-debian-hvm-install: Use xargs -0 to avoid massive filelist in logs.

2015-07-27 Thread Ian Campbell
The current arrangement is a bit odd, I'm not sure why it would be
that way and it results in a huge list of files in the middle of the
log which is rather boring to scroll through.

Signed-off-by: Ian Campbell 
Acked-by: Ian Jackson 
---
 ts-debian-hvm-install | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ts-debian-hvm-install b/ts-debian-hvm-install
index d4639b3..8198a7a 100755
--- a/ts-debian-hvm-install
+++ b/ts-debian-hvm-install
@@ -155,7 +155,7 @@ sub prepare_initrd ($$$) {
   cd -
   rm -rf $initrddir
   cd $newiso
-  md5sum `find -L -type f -print0 | xargs -0` > md5sum.txt
+  find -L -type f -print0 | xargs -0 md5sum > md5sum.txt
   cd -
 END
 }
-- 
2.1.4




[Xen-devel] [PATCH v2 1/3] ts-debian-hvm-install: Remove VGA console runes.

2015-07-27 Thread Ian Campbell
I don't think there is any point in these since c60b6d20b0fd
"ts-debian-hvm-install: Arrange for installed guest to use a serial
console" and they represent an unexplained difference between the
isolinux and grub cases.

Signed-off-by: Ian Campbell 
Acked-by: Ian Jackson 
---
 ts-debian-hvm-install | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ts-debian-hvm-install b/ts-debian-hvm-install
index f05b1a7..0c94c7e 100755
--- a/ts-debian-hvm-install
+++ b/ts-debian-hvm-install
@@ -105,7 +105,7 @@ set default="0"
 set timeout=5
 
 menuentry 'debian guest auto Install' {
-linux /install.amd/vmlinuz console=vga preseed/file=/preseed.cfg 
console=ttyS0,115200n8 -- console=ttyS0,115200n8
+linux /install.amd/vmlinuz preseed/file=/preseed.cfg 
console=ttyS0,115200n8 -- console=ttyS0,115200n8
 initrd /install.amd/initrd.gz
 }
 END
@@ -121,7 +121,7 @@ sub isolinux_cfg () {
 
 label autoinstall
 kernel /install.amd/vmlinuz
-append video=vesa:ywrap,mtrr vga=788 preseed/file=/preseed.cfg 
initrd=/install.amd/initrd.gz console=ttyS0,115200n8 -- console=ttyS0,115200n8
+append preseed/file=/preseed.cfg initrd=/install.amd/initrd.gz 
console=ttyS0,115200n8 -- console=ttyS0,115200n8
 END
 }
 
-- 
2.1.4




[Xen-devel] [linux-3.4 test] 59961: regressions - FAIL

2015-07-27 Thread osstest service owner
flight 59961 linux-3.4 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59961/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-win7-amd64  6 xen-boot  fail REGR. vs. 30511

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-sedf-pin  6 xen-boot   fail in 58831 pass in 58798
 test-amd64-amd64-pair 10 xen-boot/dst_host  fail in 58831 pass in 59961
 test-amd64-i386-xl-qemuu-win7-amd64 9 windows-install fail in 58831 pass in 59961
 test-amd64-i386-pair 10 xen-boot/dst_host   fail pass in 58831
 test-amd64-i386-pair  9 xen-boot/src_host   fail pass in 58831
 test-amd64-amd64-pair 9 xen-boot/src_host   fail pass in 59550
 test-amd64-amd64-xl   6 xen-boot fail pass in 59576

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-xl-xsm 6 xen-boot fail baseline untested
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 6 xen-boot fail baseline untested
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 6 xen-boot fail baseline untested
 test-amd64-i386-libvirt-xsm 6 xen-boot fail baseline untested
 test-amd64-amd64-xl-multivcpu 6 xen-boot fail baseline untested
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 6 xen-boot fail baseline untested
 test-amd64-amd64-xl-xsm 6 xen-boot fail baseline untested
 test-amd64-amd64-xl-credit2 6 xen-boot fail baseline untested
 test-amd64-amd64-xl-rtds 6 xen-boot fail baseline untested
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 6 xen-boot fail baseline untested
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 6 xen-boot fail baseline untested
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 11 guest-saverestore fail baseline untested
 test-amd64-amd64-xl-sedf  6 xen-boot  fail in 58831 like 30406
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail in 58831 like 30511
 test-amd64-amd64-libvirt-xsm  6 xen-boot   fail in 59550 baseline untested
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 12 guest-localmigrate fail in 59550 baseline untested
 test-amd64-i386-libvirt  11 guest-start   fail in 59550 like 30511
 test-amd64-amd64-libvirt 11 guest-start   fail in 59550 like 30511
 test-amd64-i386-xl-qemut-win7-amd64 15 guest-localmigrate/x10  fail like 30394
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 30511
 test-amd64-amd64-xl-qemuu-ovmf-amd64  6 xen-boot fail like 53709-bisect
 test-amd64-i386-xl 6 xen-boot fail like 53725-bisect
 test-amd64-i386-freebsd10-amd64  6 xen-boot fail like 58780-bisect
 test-amd64-i386-xl-qemuu-winxpsp3  6 xen-boot   fail like 58786-bisect
 test-amd64-i386-qemut-rhel6hvm-intel  6 xen-boot fail like 58788-bisect
 test-amd64-i386-rumpuserxen-i386  6 xen-boot fail like 58799-bisect
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  6 xen-boot fail like 58801-bisect
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  6 xen-boot   fail like 58803-bisect
 test-amd64-amd64-xl-qemut-winxpsp3  6 xen-boot  fail like 58804-bisect
 test-amd64-i386-freebsd10-i386  6 xen-boot  fail like 58805-bisect
 test-amd64-i386-xl-qemuu-ovmf-amd64  6 xen-boot fail like 58806-bisect
 test-amd64-amd64-xl-qemuu-winxpsp3  6 xen-boot  fail like 58807-bisect
 test-amd64-i386-xl-qemut-winxpsp3  6 xen-boot   fail like 58808-bisect
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  6 xen-boot fail like 58809-bisect
 test-amd64-amd64-rumpuserxen-amd64  6 xen-boot  fail like 58810-bisect
 test-amd64-i386-xl-qemuu-debianhvm-amd64  6 xen-boot fail like 58811-bisect
 test-amd64-amd64-xl-qemut-debianhvm-amd64  6 xen-boot   fail like 58813-bisect
 test-amd64-i386-qemuu-rhel6hvm-intel  6 xen-boot fail like 58814-bisect
 test-amd64-i386-xl-qemut-debianhvm-amd64  6 xen-boot fail like 58815-bisect

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail never pass

version targeted for testing:
 linux                cf1b3dad6c5699b977273276bada8597636ef3e2
baseline version:
 linux                bb4a05a0400ed6d2f1e13d1f82f289ff74300a70

Last test of basis    30511  2014-09-29 16:37:46 Z  300 days
Failing since         32004  2014-12-02 04:10:03 Z  237 days  192 attempts
Testing same since 

Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-27 Thread Dario Faggioli
On Mon, 2015-07-27 at 11:49 +0100, Andrew Cooper wrote:
> On 27/07/15 11:41, George Dunlap wrote:

> > Can you expand a little on this?  I'm having trouble figuring out
> > exactly what user-space applications are reading and how they're using
> > it -- and, how they work currently in virtual environments, given that
> > they (typically) will be moved between physical processors even if
> > they stay on the same virtual processor.
> 
> There are many examples of userspace application using cpuid to modify
> themselves.  Any serious application with processor optimisations will
> use the cpuid feature bits to choose the most efficient algorithm.
> 
> hwloc is a perfect example which gathers all of the topology
> information out of cpuid to work out how to most efficiently
> pin/schedule tasks.
> 
And all of this is broken, right now, isn't it?

Where, saying this, I'm aiming at stressing the point (as the thread is
starting to become a bit hard to follow) that we must come up with a
decent solution, working reasonably well in a bunch of (conflicting!
*sigh*) scenarios, *not* to implement something new without breaking
what we have, as what we have is broken already!

Regards,
Dario
-- 
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)




Re: [Xen-devel] [PATCH v7] run QEMU as non-root

2015-07-27 Thread Fabio Fantoni

On 23/07/2015 19:08, Stefano Stabellini wrote:

Try to use "xen-qemudepriv-domid$domid" first, then
"xen-qemudepriv-shared" and root if everything else fails.

The uids need to be manually created by the user or, more likely, by the
xen package maintainer.

Expose a device_model_user setting in libxl_domain_build_info, so that
opinionated callers, such as libvirt, can set any user they like. Do not
fall back to root if device_model_user is set.

QEMU is going to setuid and setgid to the user ID and the group ID of
the specified user, soon after initialization, before starting to deal
with any guest IO.

To actually secure QEMU when running in Dom0, we need at least to
deprivilege the privcmd and xenstore interfaces, this is just the first
step in that direction.

Signed-off-by: Stefano Stabellini 


Thanks for this patch, I'll test it now.
I think it would also be good to add an xl cfg parameter for the domU to
set a custom user, instead of making this possible only through libxl from
external tools. Would it be possible to add it?
For example, in my case I use xl and I want to run each domU's QEMU as a
non-shared user, but I want to create only one user for each actual domU
rather than thousands of users.
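
To make the lookup order concrete, here is a rough, self-contained sketch of the fallback described in the commit message and in the qemu-deprivilege.txt hunk below. It is not the libxl implementation: `sketch_user_exists` and `sketch_pick_dm_user` are hypothetical names, the `requested` argument stands in for `device_model_user`, and the user names follow the documentation hunk (the commit message spells them `xen-qemudepriv-*`). Errors from `getpwnam()` are treated the same as "user not found", which is a simplification.

```c
#include <assert.h>
#include <pwd.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Returns 1 if the named user exists on this host, 0 otherwise. */
int sketch_user_exists(const char *name)
{
    return getpwnam(name) != NULL;
}

/*
 * Picks the user to run the device model as and writes it into buf.
 * If `requested` is non-NULL, only that user is considered (no fallback).
 * Otherwise: per-domid user, then the shared user, then root.
 * Returns buf on success, NULL if the requested user does not exist
 * or buf is too small.
 */
const char *sketch_pick_dm_user(const char *requested, int domid,
                                char *buf, size_t len)
{
    int n;

    assert(buf != NULL && len > 0);

    if (requested != NULL) {
        if (!sketch_user_exists(requested) || strlen(requested) >= len)
            return NULL;
        strcpy(buf, requested);
        return buf;
    }

    n = snprintf(buf, len, "xen-qemuuser-domid%d", domid);
    if (n > 0 && (size_t)n < len && sketch_user_exists(buf))
        return buf;

    n = snprintf(buf, len, "xen-qemuuser-shared");
    if (n > 0 && (size_t)n < len && sketch_user_exists(buf))
        return buf;

    /* Last resort: root, without checking that it exists. */
    n = snprintf(buf, len, "root");
    return (n > 0 && (size_t)n < len) ? buf : NULL;
}
```

An xl config option of the kind requested above would simply feed the `requested` argument, so a host could get by with one dedicated user per configured domU.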

Another comment below...



---
Changes in v7:
- do not fall back to root if the user explicitly set
b_info->device_model_user.

Changes in v6:
- add device_model_user to libxl_domain_build_info
- improve doc
- improve wording in commit message

Changes in v5:
- improve wording in doc
- fix wording in warning message
- fix example in doc
- drop xen-qemudepriv-$domname

Changes in v4:
- rename qemu-deprivilege to qemu-deprivilege.txt
- add a note about qemu-deprivilege.txt to INSTALL
- instead of xen-qemudepriv-base + $domid, try xen-qemudepriv-domid$domid
- introduce libxl__dm_runas_helper to make the code nicer

Changes in v3:
- clarify doc
- handle errno == ERANGE
---
  INSTALL|7 +
  docs/misc/qemu-deprivilege.txt |   31 ++
  tools/libxl/libxl.h|6 +
  tools/libxl/libxl_dm.c |   55 
  tools/libxl/libxl_internal.h   |4 +++
  tools/libxl/libxl_types.idl|1 +
  6 files changed, 104 insertions(+)
  create mode 100644 docs/misc/qemu-deprivilege.txt

diff --git a/INSTALL b/INSTALL
index a0f2e7b..fe83c20 100644
--- a/INSTALL
+++ b/INSTALL
@@ -297,6 +297,13 @@ systemctl enable xendomains.service
  systemctl enable xen-watchdog.service
  
  
+QEMU Deprivilege

+
+It is recommended to run QEMU as non-root.
+See docs/misc/qemu-deprivilege.txt for an explanation on what you need
+to do at installation time to run QEMU as a dedicated user.
+
+
  History of options
  ==
  
diff --git a/docs/misc/qemu-deprivilege.txt b/docs/misc/qemu-deprivilege.txt

new file mode 100644
index 000..12eb104
--- /dev/null
+++ b/docs/misc/qemu-deprivilege.txt
@@ -0,0 +1,31 @@
+For security reasons, libxl tries to pass a non-root username to QEMU as
+argument. During initialization QEMU calls setuid and setgid with the
+user ID and the group ID of the user passed as argument.
+Libxl looks for the following users in this order:
+
+1) a user named "xen-qemuuser-domid$domid",
+Where $domid is the domid of the domain being created.
+This requires the reservation of 65535 uids from xen-qemuuser-domid1
+to xen-qemuuser-domid65535. To use this mechanism, you might want to
+create a large number of users at installation time. For example:
+
+for ((i=1; i<65536; i++))
+do
+adduser --no-create-home --system xen-qemuuser-domid$i
+done
+
+You might want to consider passing --group to adduser to create a new
+group for each new user.
+
+
+2) a user named "xen-qemuuser-shared"
+As a fall back if 1) fails, libxl will use a single user for
+all QEMU instances. The user is named xen-qemuuser-shared. This is
+less secure but still better than running QEMU as root. Using this is as
+simple as creating just one more user on your host:
+
+adduser --no-create-home --system xen-qemuuser-shared
+
+
+3) root
+As a last resort, libxl will start QEMU as root.
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index efc0617..3f4283f 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -192,6 +192,12 @@
   * is not present, instead of ERROR_INVAL.
   */
  #define LIBXL_HAVE_ERROR_DOMAIN_NOTFOUND 1
+
+/* libxl_domain_build_info has device_model_user to specify the user to
+ * run the device model with. See docs/misc/qemu-deprivilege.txt.
+ */
+#define LIBXL_HAVE_DEVICE_MODEL_USER 1
+
  /*
   * libxl ABI compatibility
   *
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 0c6408d..24c43df 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -19,6 +19,8 @@
  
  #include "libxl_internal.h"

  #include 
+#include 
+#include 
  
  static const char *libxl_tapif_script(libxl__gc *gc)

  {
@@ -418,6 +420,33 @@ static char *dm_spice_options(libxl__gc *gc,
  return opt;
  }
  
+/* return 1

Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-27 Thread Dario Faggioli
On Mon, 2015-07-27 at 14:01 +0200, Juergen Gross wrote:
> On 07/27/2015 01:11 PM, George Dunlap wrote:

> > Or alternately, if the user wants to give up on the "consolidation"
> > aspect of virtualization, they can pin vcpus to pcpus and then pass in
> > the actual host topology (hyperthreads and all).
> 
> There would be another solution, of course:
> 
> Support hyperthreads in the Xen scheduler via gang scheduling. While
> this is not a simple solution, it is a fair one. Hyperthreads on one
> core can influence each other rather much. With both threads always
> running vcpus of the same guest the penalty/advantage would stay in the
> same domain. The guest could make really sensible scheduling decisions
> and the licensing would still work as desired.
> 
This is interesting indeed, but I'd much rather see it as something
orthogonal, which may indeed bring benefits in some of the scenarios
described here, but should not be considered *the* solution.

Implementing, enabling and asking users to use something like this will
impact the system behavior and performance, in ways that may not be
desirable for all use cases.

So, while I do think that this may be something nice to have and offer,
trying to use it for solving the problem we're debating here would make
things even more complex to configure.

Also, this would take care of HT related issues, but what about cores
(as in 'should vcpus be cores of sockets or full sockets') and !HT boxes
(like AMD)?

Not to mention, as you say yourself, that it's not easy to implement.

> Just an idea, but maybe worth to explore further instead of tweaking
> more and more bits to make the virtual system somehow act sane.
> 
Sure, and it's interesting indeed, for a bunch of reasons and
purposes (as Tim is also noting). Not so much --or at least not
necessarily-- for this one, IMO.

Regards,
Dario
-- 
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)




Re: [Xen-devel] [PATCH] libxl: check nesthvm and altp2m in libxl level

2015-07-27 Thread Wei Liu
On Mon, Jul 27, 2015 at 01:34:04PM +0100, Julien Grall wrote:
> Hi Wei,
> 
> On 24/07/15 16:39, Wei Liu wrote:
> > In ea214001 ("x86/altp2m: add altp2mhvm HVM domain parameter"), a
> > check was added to ensure nestedhvm and altp2m cannot be enabled at
> > the same time. That check was added in xl, but in fact it should be in
> > libxl because it should be the entity that decides whether
> > the provided configuration is valid.
> > 
> > This patch moves the check to libxl. The code snippet is moved after
> > calling libxl__domain_build_info_setdefault so that we can:
> > 1. remove libxl_defbool_is_default in `if()';
> > 2. detect mistake in libxl__domain_build_info_setdefault.
> > 
> > Signed-off-by: Wei Liu 
> > ---
> > Cc: Ed White 
> > Cc: "Sahita, Ravi" 
> > 
> > I said I discovered an issue during review but also volunteered to fix
> > it after the series is merged. Here is the patch to do that.
> > ---
> >  tools/libxl/libxl_create.c | 6 ++
> >  tools/libxl/xl_cmdimpl.c   | 8 
> >  2 files changed, 6 insertions(+), 8 deletions(-)
> > 
> > diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> > index 855b42c..de536ba 100644
> > --- a/tools/libxl/libxl_create.c
> > +++ b/tools/libxl/libxl_create.c
> > @@ -883,6 +883,12 @@ static void initiate_domain_create(libxl__egc *egc,
> >  goto error_out;
> >  }
> >  
> > +if (libxl_defbool_val(d_config->b_info.u.hvm.nested_hvm) &&
> > +libxl_defbool_val(d_config->b_info.u.hvm.altp2m)) {
> > +LOG(ERROR, "nestedhvm and altp2mhvm cannot be used together");
> > +goto error_out;
> > +}
> > +
> 
> The u.hvm.{nested_hvm,altp2m} can only be checked when the created
> domain is an HVM.
> 
> But initiate_domain_create is called with either a PV or HVM domain. So
> you may need to check if we are creating a HVM one.
> 

Good point. I will respin.

Wei.

> Regards,
> 
> -- 
> Julien Grall
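
For readers following the review comment above, here is a minimal sketch of why the domain type has to be tested before the HVM-only fields are read. The types and names below (`build_info`, `defbool`, `nested_altp2m_conflict`) are simplified stand-ins invented for this example, not the real libxl definitions; the only thing taken from the thread is the idea that `u.hvm` is a union member valid for HVM guests only.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the real libxl types are richer than this. */
enum domain_type { DOMAIN_TYPE_PV, DOMAIN_TYPE_HVM };

/* Tri-state boolean: "default" until a concrete value has been applied. */
enum defbool { DEFBOOL_DEFAULT, DEFBOOL_FALSE, DEFBOOL_TRUE };

struct build_info {
    enum domain_type type;
    union {
        struct { enum defbool nested_hvm, altp2m; } hvm;
        struct { int placeholder; } pv;
    } u;
};

/* Only meaningful once defaults have been filled in. */
static bool defbool_val(enum defbool b)
{
    return b == DEFBOOL_TRUE;
}

/* Returns true when the configuration must be rejected. */
static bool nested_altp2m_conflict(const struct build_info *b)
{
    /* u.hvm is only valid for HVM guests, so test the tag first. */
    if (b->type != DOMAIN_TYPE_HVM)
        return false;
    return defbool_val(b->u.hvm.nested_hvm) && defbool_val(b->u.hvm.altp2m);
}
```

With this shape, a PV configuration never reaches the `u.hvm` reads, which is the behaviour the review asks for.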



Re: [Xen-devel] [PATCH v3] xenconsole: Ensure exclusive access to console using locks

2015-07-27 Thread Wei Liu
On Mon, Jul 27, 2015 at 02:44:57PM +0200, Martin Lucina wrote:
> On Friday, 24.07.2015 at 17:01, Ian Jackson wrote:
> > Martin Lucina writes ("[PATCH v3] xenconsole: Ensure exclusive access to 
> > console using locks"):
> > > If more than one instance of xenconsole is run against the same DOMID
> > > then each instance will only get some data. This change ensures
> > > exclusive access to the console by obtaining an exclusive lock on
> > > /xenconsole..
> > 
> > Acked-by: Ian Jackson 
> 
> Can this also make it into the 4.6 release on the grounds of being a bugfix
> for xenconsole, or is this change too invasive?
> 

I would say this is a bugfix. The locking pattern and implementation are
well understood. The risk is minimal.

Release-acked-by: Wei Liu 

> Martin
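
As an illustration of the locking pattern discussed in this thread, the following is a rough sketch of taking an exclusive, non-blocking lock on a per-console lock file. It assumes `flock()` is the mechanism and that the caller supplies the lock path; the thread does not show the patch body, so both the helper name `console_lock` and the choice of `flock()` are assumptions made for this example only.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Try to take an exclusive, non-blocking lock on lock_path.
 * Returns an open fd holding the lock, or -1 if the lock is already
 * held (or on any other error).  The lock is released when the fd is
 * closed, including when the process exits.
 */
static int console_lock(const char *lock_path)
{
    int fd = open(lock_path, O_RDWR | O_CREAT, 0644);

    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A second instance calling `console_lock()` on the same path would get -1 while the first one still holds its fd, which is the "exclusive access" property the patch description asks for.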



[Xen-devel] [ovmf test] 59995: regressions - FAIL

2015-07-27 Thread osstest service owner
flight 59995 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59995/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemuu-ovmf-amd64 11 guest-saverestore  fail REGR. vs. 59964

version targeted for testing:
 ovmf cdc83ccf7195c66c77b869ee62155108e39a0246
baseline version:
 ovmf 88656abf1b3a690969851880ba2b134e0d144625

Last test of basis    59964  2015-07-26 15:23:09 Z    0 days
Testing same since    59995  2015-07-27 01:12:08 Z    0 days    1 attempts


People who touched revisions under test:
  Star Zeng 

jobs:
 build-amd64-xsm                                              pass
 build-i386-xsm                                               pass
 build-amd64                                                  pass
 build-i386                                                   pass
 build-amd64-libvirt                                          pass
 build-i386-libvirt                                           pass
 build-amd64-pvops                                            pass
 build-i386-pvops                                             pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64                         pass
 test-amd64-i386-xl-qemuu-ovmf-amd64                          fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit cdc83ccf7195c66c77b869ee62155108e39a0246
Author: Star Zeng 
Date:   Mon Jul 27 00:49:00 2015 +

MdeModulePkg Variable: Read MonotonicCount by ReadUnaligned64()

As variable HEADER_ALIGNMENT = 4, the MonotonicCount in
AUTHENTICATED_VARIABLE_HEADER may be not UINT64 aligned,
so go to use ReadUnaligned64() to ensure read data correctly.

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Star Zeng 
Reviewed-by: Jiewen Yao 

git-svn-id: https://svn.code.sf.net/p/edk2/code/trunk/edk2@18064 
6f19259b-4bc3-4df7-8a09-765794883524
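
The commit message above relies on reading a 64-bit field that may sit on a 4-byte boundary. As a rough illustration of the general technique (this is not the EDK2 implementation of ReadUnaligned64(); the helper name `read_unaligned64` is made up for this sketch), a portable C version copies the bytes instead of dereferencing a misaligned pointer:

```c
#include <stdint.h>
#include <string.h>

/*
 * Read a 64-bit value from an address that may not be 8-byte aligned.
 * memcpy() avoids the undefined behaviour of dereferencing a
 * misaligned uint64_t pointer; compilers typically turn it into a
 * plain load where the architecture allows that.
 */
static uint64_t read_unaligned64(const void *p)
{
    uint64_t v;

    memcpy(&v, p, sizeof v);
    return v;
}
```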



Re: [Xen-devel] How to build a linux based stub domain

2015-07-27 Thread Wei Liu
On Mon, Jul 27, 2015 at 02:29:14PM +0800, Xuehan Xu wrote:
> Hi, everyone.
> 
> Is there any way to run a stub domain on Linux instead of mini-os?
> 
> Thanks:-)

Use your favourite mailing list archive to search for message-id

<1423022775-7132-1-git-send-email-eshel...@pobox.com>

Wei.



Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-27 Thread Dario Faggioli
On Fri, 2015-07-24 at 13:11 -0400, Boris Ostrovsky wrote:
> On 07/24/2015 12:48 PM, Juergen Gross wrote:
> > On 07/24/2015 06:40 PM, Boris Ostrovsky wrote:
> >> On 07/24/2015 12:10 PM, Juergen Gross wrote:
> >>>
> >>> If we can fiddle with the masks on boot, we could do it in a running
> >>> system, too. Another advantage with not relying on cpuid. :-)
> >>
> >>
> >> I am trying to catch up with this thread so I may have missed it, but I
> >> still don't understand why we don't want to rely on CPUID.
> >>
> >> I think I saw Juergen said --- because it's HW-specific. But what's
> >> wrong with that? Hypervisor is building virtualized x86 (in this case)
> >> hardware and on such HW CPUID is the standard way of determining
> >> thread/core topology. Plus various ACPI tables and such.
> >>
> >> And having a solution that doesn't address userspace (when there *is* a
> >> solution that can do it) doesn't seem like the best approach. Yes, it
> >> still won't cover userspace for PV guests but neither will the kernel
> >> patch.
> >>
> >> As far as licensing is concerned --- are we sure this can't also be
> >> addressed by CPUID? BTW, if I was asked about who is most concerned
> >> about licensing my first answer would be --- databases. I.e. userspace.
> >
> > The problem is to construct cpuids which will enable the linux
> > scheduler to work correctly in spite of the hypervisor scheduler
> > moving vcpus between pcpus. The only way to do this is to emulate
> > single-threaded cores on the numa nodes without further grouping.
> > So either single-core sockets or one socket with many cores.
> 
> Right.
> 
So you see it now? If adhering to some guest virtual topology (for
whatever reason, e.g., performance) and licensing disagree, using CPUID
for both is always going to fail!

> > This might be problematic for licensing: the multi-socket solution
> > might require a higher license based on socket numbers. Or the
> > license is based on cores and will be more expensive as no hyperthreads
> > are detectable.
> 
> If we don't pin VCPUs appropriately (which I think is the scenario that 
> we are discussing) then CPUID can be used to find out the package ID. And 
> so any licensed SW will easily discover that it is running on different 
> packages.
> 
That's why Juergen is suggesting keeping the two things separate,
effectively decoupling them, AFAICU his proposal.

Note that I'm still a bit puzzled by the idea of presenting different
information to the guest OS and to the guest userspace, but that has
upsides, and this decoupling is one.

In fact, in the case you're describing, i.e., not pinned vcpus, etc:
 - the Linux kernel is not relying on CPUID when building scheduling
   domains and the like, so at least all the user space applications
   that do not call CPUID and/or rely on it for
   scheduling/performance matters *will* *be* *fine*
 - you can pass down, via tools, a mangled CPUID, e.g., making it best 
   fit your licensing needs.

Problems arise if you have both of the following kinds of applications
(or the same application doing both of the following operations):
 a) applications that poke at CPUID for licensing purposes
 b) applications that poke at CPUID for placement/performance purposes

In this case, it's quite possible that mangling CPUID to make a)
happy will make b) unhappy, and vice versa. And the fact that the
kernel does not rely on CPUID any longer may not help, as the
application is kind of bypassing it... Although, it won't harm either...
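
To make the a) versus b) conflict a bit more concrete, here is a small sketch of how an application that "pokes at CPUID" typically turns an x2APIC ID into thread, core and package numbers. It assumes the usual leaf 0xB scheme, where the shift widths come from EAX[4:0] of subleaf 0 (SMT level) and subleaf 1 (core level); the struct and function names are invented for this example, and the shifts are passed in as parameters so the sketch does not execute CPUID itself.

```c
#include <stdint.h>

/* Hypothetical result type for this sketch. */
struct topo_ids {
    uint32_t thread;   /* SMT sibling index within the core */
    uint32_t core;     /* core index within the package */
    uint32_t package;  /* what licensing code tends to count */
};

/*
 * Split an x2APIC ID using the shift widths a guest would read from
 * CPUID leaf 0xB.  Assumes smt_shift <= core_shift < 32.
 */
static struct topo_ids split_apic_id(uint32_t apic_id,
                                     unsigned smt_shift,
                                     unsigned core_shift)
{
    struct topo_ids t;

    t.thread  = apic_id & ((1u << smt_shift) - 1);
    t.core    = (apic_id >> smt_shift) &
                ((1u << (core_shift - smt_shift)) - 1);
    t.package = apic_id >> core_shift;
    return t;
}
```

With both shifts set to 0, every vcpu looks like its own single-core socket; with smt_shift 0 and a larger core_shift, the same vcpus look like cores of one socket. Those are the two layouts mentioned earlier in the thread, and a licensing check and a placement heuristic can easily prefer different ones.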

Regards,
Dario

-- 
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)




[Xen-devel] Regression in OVMF + RMRR series

2015-07-27 Thread Wei Liu
I found this in OSSTest

http://logs.test-lab.xenproject.org/osstest/logs/59910/test-amd64-amd64-xl-qemuu-ovmf-amd64/serial-fiano1.log

Jul 25 17:48:51.468985 (d1) HVM Loader
Jul 25 17:48:51.517028 (d1) Detected Xen v4.6-unstable
Jul 25 17:48:51.517056 (d1) Xenbus rings @0xfeffc000, event channel 1
Jul 25 17:48:51.525060 (d1) System requested OVMF
Jul 25 17:48:51.525097 (d1) CPU speed is 1800 MHz
Jul 25 17:48:51.525131 (d1) Relocating guest memory for lowmem MMIO space disabled
Jul 25 17:48:51.533055 (XEN) irq.c:276: Dom1 PCI link 0 changed 0 -> 5
Jul 25 17:48:51.541041 (d1) PCI-ISA link 0 routed to IRQ5
Jul 25 17:48:51.541070 (XEN) irq.c:276: Dom1 PCI link 1 changed 0 -> 10
Jul 25 17:48:51.549023 (d1) PCI-ISA link 1 routed to IRQ10
Jul 25 17:48:51.549051 (XEN) irq.c:276: Dom1 PCI link 2 changed 0 -> 11
Jul 25 17:48:51.557040 (d1) PCI-ISA link 2 routed to IRQ11
Jul 25 17:48:51.557067 (XEN) irq.c:276: Dom1 PCI link 3 changed 0 -> 5
Jul 25 17:48:51.557093 (d1) PCI-ISA link 3 routed to IRQ5
Jul 25 17:48:51.565085 (d1) pci dev 01:2 INTD->IRQ5
Jul 25 17:48:51.565111 (d1) pci dev 01:3 INTA->IRQ10
Jul 25 17:48:51.573018 (d1) pci dev 02:0 INTA->IRQ11
Jul 25 17:48:51.573050 (d1) pci dev 04:0 INTA->IRQ5
Jul 25 17:48:51.573080 (d1) No RAM in high memory; setting high_mem resource base to 1
Jul 25 17:48:51.589031 (d1) pci dev 03:0 bar 10 size 00200: 0f008
Jul 25 17:48:51.597018 (d1) pci dev 02:0 bar 14 size 00100: 0f208
Jul 25 17:48:51.597053 (d1) pci dev 04:0 bar 30 size 4: 0f300
Jul 25 17:48:51.605026 (d1) pci dev 03:0 bar 30 size 1: 0f304
Jul 25 17:48:51.605063 (d1) pci dev 03:0 bar 14 size 01000: 0f305
Jul 25 17:48:51.613035 (d1) pci dev 02:0 bar 10 size 00100: 0c001
Jul 25 17:48:51.613103 (d1) pci dev 04:0 bar 10 size 00100: 0c101
Jul 25 17:48:51.621075 (d1) pci dev 04:0 bar 14 size 00100: 0f3051000
Jul 25 17:48:51.621109 (d1) pci dev 01:2 bar 20 size 00020: 0c201
Jul 25 17:48:51.629023 (d1) pci dev 01:1 bar 20 size 00010: 0c221
Jul 25 17:48:51.637041 (d1) Multiprocessor initialisation:
Jul 25 17:48:51.637086 (d1)  - CPU0 ... 46-bit phys ... fixed MTRRs ... var MTRRs [1/8] ... done.
Jul 25 17:48:51.645025 (d1)  - CPU1 ... 46-bit phys ... fixed MTRRs ... var MTRRs [1/8] ... done.
Jul 25 17:48:51.653020 (d1) Testing HVM environment:
Jul 25 17:48:51.653052 (d1)  - REP INSB across page boundaries ... passed
Jul 25 17:48:51.661025 (d1)  - GS base MSRs and SWAPGS ... passed
Jul 25 17:48:51.661058 (d1) Passed 2 of 2 tests
Jul 25 17:48:51.661084 (d1) Writing SMBIOS tables ...
Jul 25 17:48:51.669048 (d1) Loading OVMF ...
Jul 25 17:48:51.669078 (XEN) d1v0 Over-allocation for domain 1: 196865 > 196864
Jul 25 17:48:51.677064 (XEN) memory.c:155:d1v0 Could not allocate order=0 extent: id=1 memflags=0 (0 of 1)
Jul 25 17:48:51.677111 (d1) Loading ACPI ...
Jul 25 17:48:51.685061 (d1) vm86 TSS at fc012d00
Jul 25 17:48:51.685107 (d1) BIOS map:
Jul 25 17:48:51.685145 (d1)  ffe0-: Main BIOS
Jul 25 17:48:51.693030 (d1) *** HVMLoader bug at e820.c:262
Jul 25 17:48:51.693064 (d1) *** HVMLoader crashed.

Git blame shows that the change that crashes hvmloader was part of the
RMRR series.

Tiejun, could you please fix this?  It should be easy to reproduce.

Guest configuration file is at:

http://logs.test-lab.xenproject.org/osstest/logs/59910/test-amd64-amd64-xl-qemuu-ovmf-amd64/fiano1--debianhvm.guest.osstest.cfg

Wei.



Re: [Xen-devel] Getting rid of invalid SYSCALL RSP under Xen?

2015-07-27 Thread Andrew Cooper
On 27/07/15 00:27, Andy Lutomirski wrote:
>>>>>
>>>>> For SYSRET, I think the way to go is to force Xen to always use the
>>>>> syscall slow path.  Instead, Xen could hook into
>>>>> syscall_return_via_sysret or even right before the opportunistic
>>>>> sysret stuff.  Then we could remove the USERGS_SYSRET hooks entirely.
>>>>>
>>>>> Would this work?
>>>> None of the opportunistic sysret stuff makes sense under Xen.  The path
>>>> will inevitably end up in xen_iret making a hypercall.  Short circuiting
>>>> all of this seems like a good idea, especially if it allows for the
>>>> removal of the USERGS_SYSRET.
>>> Doesn't Xen decide what to do based on VGCF_IN_SYSCALL?  Maybe Xen
>>> should have its own opportunistic VGCF_IN_SYSCALL logic.
>> VGCF_in_syscall affects whether the extra r11/rcx get restored or not,
>> as the hypercall itself is implemented using syscall.  As the extra
>> r11/rcx (and rax for that matter) are unconditionally saved in the
>> hypercall stub, I can't see anything Linux could usefully do,
>> opportunistically speaking.
> Xen does:
>
> /* %rbx: struct vcpu, interrupts disabled */
> restore_all_guest:
>         ASSERT_INTERRUPTS_DISABLED
>         RESTORE_ALL
>         testw $TRAP_syscall,4(%rsp)
>         jz    iret_exit_to_guest
>
>         /* Don't use SYSRET path if the return address is not canonical. */
>         movq  8(%rsp),%rcx
>         sarq  $47,%rcx
>         incl  %ecx
>         cmpl  $1,%ecx
>         ja    .Lforce_iret
>
>         cmpw  $FLAT_USER_CS32,16(%rsp)# CS
>         movq  8(%rsp),%rcx            # RIP
>         movq  24(%rsp),%r11           # RFLAGS
>         movq  32(%rsp),%rsp           # RSP
>         je    1f
>         sysretq
> 1:      sysretl
>
> That's essentially the same thing as opportunistic sysret.  If Linux
> stops setting VGCF_in_syscall, though, I think we'll bypass that code,
> which will hurt performance.  Whether this should be fixed in the
> hypervisor or in the guest kernel hooks, I don't know, but it would be
> easy to have a very simple xen_opportunistic_sysret path that checks
> rcx==rip and r11==rflags and, if so, sets VGCF_in_syscall.

I see your point.  I didn't intend to suggest that Linux should stop
setting VGCF_in_syscall, as it is the only entity which knows whether it
is safe to clobber rcx/r11 in user context.

Having said this, Xen could certainly do its own opportunistic sysret
calculations as well.  There are a number of issues in the Xen sysret
code which I plan to fix in due course, and I will see about making this
adjustment.
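
For anyone reading along, the canonical-address test in the quoted restore_all_guest fragment, and the rcx==rip / r11==rflags check suggested for a simple xen_opportunistic_sysret path, can be sketched in C roughly as follows. This is only an illustration: `struct frame` and both function names are made up for this example, and the check assumes 48-bit virtual addresses, as the `sarq $47` in the assembly does.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * An address is canonical (for 48-bit virtual addresses) when bits
 * 63..47 are all equal.  The assembly gets the same answer with
 * sarq $47 / incl / cmpl $1 / ja.
 */
static bool is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47;      /* the 17 bits 63..47 */

    return top == 0 || top == 0x1ffff;
}

/* Hypothetical view of the saved user registers. */
struct frame {
    uint64_t rip, rflags, rcx, r11;
};

/*
 * SYSRET loads RIP from rcx and RFLAGS from r11, so it can only
 * replace IRET when those registers already hold the return values
 * and the return address is canonical.
 */
static bool can_use_sysret(const struct frame *f)
{
    return f->rcx == f->rip &&
           f->r11 == f->rflags &&
           is_canonical_48(f->rip);
}
```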

>
>>> Hmm, maybe some of this would be easier to think about if, rather than
>>> having a paravirt op, we could have:
>>>
>>> ALTERNATIVE "", "jmp xen_pop_things_and_iret", X86_FEATURE_XEN
>>>
>>> Or just IF_XEN("jmp ...");
>>>
>>> As a practical matter, x86_64 has native and Xen -- I don't think
>>> there's any other paravirt platform that needs the asm hooks.
>> It would certainly seem so.  A careful use of IF_XEN() or two would make
>> the code far clearer to read, and to drop the hooks.
>>
> Want to add an IF_XEN macro?

I currently have a blocker bug against the impending Xen 4.6 release
which is higher on my todo list, but I will look into this as soon as I can.

~Andrew



[Xen-devel] [PATCH for-4.6 v2] libxl: check nesthvm and altp2m in libxl

2015-07-27 Thread Wei Liu
In ea214001 ("x86/altp2m: add altp2mhvm HVM domain parameter"), a
check was added to ensure nestedhvm and altp2m cannot be enabled at
the same time. That check was added in xl, but in fact it should be in
libxl because it should be the entity that decides whether
the provided configuration is valid.

This patch moves the check to libxl. The code snippet is moved after
calling libxl__domain_build_info_setdefault so that we can:
1. remove libxl_defbool_is_default in `if()';
2. detect mistake in libxl__domain_build_info_setdefault.

Signed-off-by: Wei Liu 
---
Cc: Ed White 
Cc: "Sahita, Ravi" 

v2: make sure we're building hvm domain before checking those flags
---
 tools/libxl/libxl_create.c | 7 +++
 tools/libxl/xl_cmdimpl.c   | 8 
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 855b42c..0294844 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -883,6 +883,13 @@ static void initiate_domain_create(libxl__egc *egc,
 goto error_out;
 }
 
+if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_HVM &&
+(libxl_defbool_val(d_config->b_info.u.hvm.nested_hvm) &&
+ libxl_defbool_val(d_config->b_info.u.hvm.altp2m))) {
+LOG(ERROR, "nestedhvm and altp2mhvm cannot be used together");
+goto error_out;
+}
+
 for (i = 0; i < d_config->num_disks; i++) {
 ret = libxl__device_disk_setdefault(gc, &d_config->disks[i]);
 if (ret) {
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index d0bf0cb..9755d55 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -1568,14 +1568,6 @@ static void parse_config_data(const char *config_source,
 
 xlu_cfg_get_defbool(config, "altp2mhvm", &b_info->u.hvm.altp2m, 0);
 
-if (!libxl_defbool_is_default(b_info->u.hvm.nested_hvm) &&
-libxl_defbool_val(b_info->u.hvm.nested_hvm) &&
-!libxl_defbool_is_default(b_info->u.hvm.altp2m) &&
-libxl_defbool_val(b_info->u.hvm.altp2m)) {
-fprintf(stderr, "ERROR: nestedhvm and altp2mhvm cannot be used together\n");
-exit(1);
-}
-
 xlu_cfg_replace_string(config, "smbios_firmware",
&b_info->u.hvm.smbios_firmware, 0);
 xlu_cfg_replace_string(config, "acpi_firmware",
-- 
2.1.4




Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-27 Thread Juergen Gross

On 07/27/2015 03:23 PM, Dario Faggioli wrote:
> On Mon, 2015-07-27 at 14:01 +0200, Juergen Gross wrote:
>> On 07/27/2015 01:11 PM, George Dunlap wrote:
>>> Or alternately, if the user wants to give up on the "consolidation"
>>> aspect of virtualization, they can pin vcpus to pcpus and then pass in
>>> the actual host topology (hyperthreads and all).
>>
>> There would be another solution, of course:
>>
>> Support hyperthreads in the Xen scheduler via gang scheduling. While
>> this is not a simple solution, it is a fair one. Hyperthreads on one
>> core can influence each other rather much. With both threads always
>> running vcpus of the same guest the penalty/advantage would stay in the
>> same domain. The guest could make really sensible scheduling decisions
>> and the licensing would still work as desired.
>
> This is interesting indeed, but I much rather see it as something
> orthogonal, which may indeed bring benefits in some of the scenarios
> described here, but should not be considered *the* solution.

Correct. I still think it should be done.

> Implementing, enabling and asking users to use something like this will
> impact the system behavior and performance, in ways that may not be
> desirable for all use cases.

I'd make it a scheduler parameter. So you could enable it for a
specific cpupool where you want it to be active.

> So, while I do think that this may be something nice to have and offer,
> trying to use it for solving the problem we're debating here would make
> things even more complex to configure.
>
> Also, this would take care of HT related issues, but what about cores
> (as in 'should vcpus be cores of sockets or full sockets') and !HT boxes
> (like AMD)?

!HT boxes will have no problem: We won't have to hide HT as cores...

Regarding many sockets with 1 core each or 1 socket with many cores: I
think 1 socket for the non-NUMA case is okay, we'll want multiple
sockets for NUMA.

> Not to mention, as you say yourself, that it's not easy to implement.

Yeah, but it will be fun. ;-)

>> Just an idea, but maybe worth exploring further instead of tweaking
>> more and more bits to make the virtual system somehow act sane.
>
> Sure, and it's interesting indeed, for a bunch of reasons and
> purposes (as Tim is also noting). Not so much --or at least not
> necessarily-- for this one, IMO.

It's especially interesting regarding accounting. A vcpu running for
1 second can do much more if no other vcpu is running on the same core.
This would be a problem of the guest then, like on bare metal.

For real time purposes it might even be interesting to schedule only 1
vcpu per core to have a reliable high speed of the vcpu.


Juergen



Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-27 Thread Dario Faggioli
On Mon, 2015-07-27 at 12:11 +0100, George Dunlap wrote:

> 1. Userspace applications are in the habit of reading CPUID to determine
> the topology of the system they're running on
> 
I'd add this item here:

1b. The Linux kernel uses CPUID to configure some bits of its scheduler.
The result of that will affect *all* the in-guest applications that do
not use CPUID for scheduling/performance purposes

Yes, I saw you mention this afterwards, but I think it really should be
here, in the analysis part.

> 2. Many use the topology information to help themselves make better
> scheduling decisions.  Because a vcpu is not typically pinned to a
> specific pcpu, we may need to lie here slightly (e.g., not mention
> threads) to get the optimal behavior overall.
> 
> 3. Others use the topology information to implement licensing
> restrictions.  Because threads are treated differently to cores, we want
> to tell the truth here (i.e., make sure we mention that some of these
> are threads) to get the optimal behavior overall.
> 
Define truth. I mean, is 'telling the truth' always good for this case?
E.g., in a 4 vcpus guest, on a 4 socket box, without any pinning, it may
be that when app X samples CPUID to check the license, each vcpu is
running on a different socket, so the truth means we need a licence for
4 sockets... I don't think this is ideal, is it?

Anyways...
> Numbers #2 and #3 lead to contradictory courses of action; we cannot
> optimize for both at the same time.
> 
...that's certainly true, IMO.

> I think at some level we need to just try to accommodate both -- if the
> user doesn't have licensing issues, or prefers performance over
> licensing, then present a unified topology in PVH / HVM using CPUID,
> ACPI, &c.  I think this should be the default.
> 
> If the user has licensing issues, and doesn't mind having wonky or
> unreliable topology to its guests, then let the raw CPUID through.  But
> it would, in this case, be good to try to give the guest OS scheduler a
> hint that it shouldn't really bother trying to read the topology or do
> placement as a result, as any decisions will be unreliable.
> 
This last sentence is basically Juergen's proposal, AFAICT, and I agree
that it would be good in case #3. But, thinking more about it, would it
harm in case #2? I don't think it would.

After all, it'd boil down to making peace with the fact that something
like CPUID, in a (not only PV!) VM, is just not reliable enough to be
used for building in-kernel scheduling-related data structures, like
Linux's scheduling domains. It is unreliable because its content may
conflict with vNUMA, but, really, even with no vNUMA, it is unreliable
because it depends on whether the VM's vcpus are pinned or not, and if
not, it depends on where they actually run, which is pure randomness,
from the guest point of view.

So, I'm really starting to think that a patch stopping the Linux kernel
from relying on CPUID, whether original or mangled, and for all kinds of
guests, would really be a nice one to have!

Regards,
Dario

-- 
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)




Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-27 Thread Dario Faggioli
On Fri, 2015-07-24 at 18:10 +0200, Juergen Gross wrote:
> On 07/24/2015 05:58 PM, Dario Faggioli wrote:

> > So, just to check if my understanding is correct: you'd like to add an
> > abstraction layer, in Linux, like in generic (or, perhaps, scheduling)
> > code, to hide the direct interaction with CPUID.
> > Such a layer, on baremetal, would just read CPUID while, on PV-ops, it'd
> > check with Xen/match vNUMA/whatever... Is this what you are saying?
> 
> Sort of, yes.
> 
> I just wouldn't add it, as it is already existing (more or less). It
> can deal right now with AMD and Intel, we would "just" have to add Xen.
> 
So, having gone through the rest of the thread (so far), and having
given a fair amount of thought to this, I really think that something
like this would be a good thing to have in Linux.

Of course, it's not that my opinion on what should be in Linux counts
that much! :-D   Nevertheless, I wanted to make it clear that, while
skeptical at the beginning, I now think this is (part of) the way to go,
as I said and explained in my reply to George.

Regards,
Dario
-- 
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)




Re: [Xen-devel] xen-unstabel + linux 4.2: MMIO emulation failed: d23v0 64bit @ 0010:ffffffff814e2b1c -> 66 89 02 48 8d 55 c0 48 89 5d c0 44 89 65 c8 e8

2015-07-27 Thread Andrew Cooper
On 24/07/15 19:56, li...@eikelenboom.it wrote:
> Hi All,
>
> On my AMD system running xen-unstable (last commit: ),
>  after a few restarts of a HVM guest with pci-passthrough i got these
> on shutdown of the guest:
> (never seen this before, so it should be something triggered by a
> recent commit)
>
> -- 
> Sander
>
>
>  (probably lost before but that's lost)
>
> (XEN) [2015-07-24 18:46:53.732] domain_crash called from io.c:166
> (XEN) [2015-07-24 18:46:53.732] io.c:165:d23v0 Weird HVM ioemulation
> status 1.
> (XEN) [2015-07-24 18:46:53.732] domain_crash called from io.c:166
> (XEN) [2015-07-24 18:46:53.732] io.c:165:d23v0 Weird HVM ioemulation
> status 1.
> (XEN) [2015-07-24 18:46:53.732] domain_crash called from io.c:166
> (XEN) [2015-07-24 18:46:53.732] io.c:165:d23v0 Weird HVM ioemulation
> status 1.

Paul: this is very likely an issue with your emulation series.

66 89 02 is mov %ax,(%rdx), but has ended up in handle_pio() which seems
wrong.

Sander: Please can you rerun with the following debug patch:

diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index d3b9cae..7560d08 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -163,7 +163,9 @@ int handle_pio(uint16_t port, unsigned int size, int dir)
 break;
 default:
 gdprintk(XENLOG_ERR, "Weird HVM ioemulation status %d.\n", rc);
-domain_crash(curr->domain);
+show_execution_state(curr);
+dump_execution_state();
+domain_crash_synchronous(curr->domain);
 break;
 }

~Andrew
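
As a side note on the decoding above: in `66 89 02`, 0x66 is the operand-size prefix, 0x89 is the MOV r/m, r opcode, and 0x02 is the ModRM byte. A tiny sketch of how the ModRM fields split (the struct and function names here are invented for this illustration and are not part of Xen's emulator):

```c
#include <stdint.h>

/* The three bit-fields of an x86 ModRM byte. */
struct modrm {
    uint8_t mod;   /* bits 7..6: addressing mode */
    uint8_t reg;   /* bits 5..3: register operand */
    uint8_t rm;    /* bits 2..0: register/memory operand */
};

static struct modrm split_modrm(uint8_t byte)
{
    struct modrm m;

    m.mod = byte >> 6;
    m.reg = (byte >> 3) & 7;
    m.rm  = byte & 7;
    return m;
}
```

For 0x02 this gives mod=0, reg=0, rm=2, i.e. a store of register 0 (%ax, because of the 0x66 prefix) to the memory pointed to by register 2 (%rdx). That is a memory access, not port I/O, which is why ending up in handle_pio() looks wrong.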



[Xen-devel] [libvirt test] 60006: regressions - FAIL

2015-07-27 Thread osstest service owner
flight 60006 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/60006/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-libvirt            5 libvirt-build    fail REGR. vs. 59907
 build-amd64-pvops             5 kernel-build     fail REGR. vs. 59907
 build-i386-pvops              5 kernel-build     fail REGR. vs. 59907
 build-armhf-pvops             5 kernel-build     fail REGR. vs. 59907

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-xsm  1 build-check(1)   blocked  n/a

version targeted for testing:
 libvirt  2094d01e2f54e5774c0d0d380e83154b42ea65be
baseline version:
 libvirt  be6c35e4acdff92c9f9a875de28474a84367f742

Last test of basis    59907  2015-07-25 10:30:59 Z    2 days
Failing since         59948  2015-07-26 04:19:54 Z    1 days    2 attempts
Testing same since    60006  2015-07-27 04:20:16 Z    0 days    1 attempts


People who touched revisions under test:
  Daniel Veillard 
  Laine Stump 

jobs:
 build-amd64-xsm                                              pass
 build-armhf-xsm                                              pass
 build-i386-xsm                                               pass
 build-amd64                                                  pass
 build-armhf                                                  pass
 build-i386                                                   pass
 build-amd64-libvirt                                          pass
 build-armhf-libvirt                                          pass
 build-i386-libvirt                                           fail
 build-amd64-pvops                                            fail
 build-armhf-pvops                                            fail
 build-i386-pvops                                             fail
 test-amd64-amd64-libvirt-xsm                                 blocked
 test-armhf-armhf-libvirt-xsm                                 blocked
 test-amd64-i386-libvirt-xsm                                  blocked
 test-amd64-amd64-libvirt                                     blocked
 test-armhf-armhf-libvirt                                     blocked
 test-amd64-i386-libvirt                                      blocked



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit 2094d01e2f54e5774c0d0d380e83154b42ea65be
Author: Daniel Veillard 
Date:   Mon Jul 27 10:17:05 2015 +0800

Renamed deconfigured-cpus to allow make dist

Simplest was just to rename that extra long name and move files in git
accordingly

commit e14310724071d5853ae5ca612756dfc6b156642e
Author: Laine Stump 
Date:   Thu Jun 25 12:55:12 2015 -0400

conf: add virDomainControllerDefNew()

There are some non-0 default values in virDomainControllerDef (and
will soon be more) that are easier to not forget if the remembering is
done by a single initializer function (rather than inline code after
allocating the object with generic VIR_ALLOC()).

commit 0726878297c75da15052df3b16d76c7abfd76013
Author: Laine Stump 
Date:   Thu Jun 25 12:02:32 2015 -0400

qemu: reorganize loop in qemuDomainAssignPCIAddresses

This loop occurs just after we've assured that all devices that
require a PCI device have been assigned and all necessary PCI
controllers have been added. It is the perfect place to add other
potentially auto-generated PCI controller attributes that are
dependent on the controller's PCI address (upcoming patch).

There is a convenient loop through all controllers at the end of the
function, but the patch to add new functionality will be cleaner if we
first rearrange that loop a bit.

Note that the loop originally was accessing info.addr.pci.bus prior to
determining that the pci part of the object was valid. This

Re: [Xen-devel] Interested in taking up a project

2015-07-27 Thread Dario Faggioli
On Sat, 2015-07-25 at 18:04 +0530, Abhinav Gupta wrote:
> Hii everyone :) , 
>
Hi,

>I'm quite familiar with the linux powerclamp driver now. 
>
Nice to hear. Is there anything about it that you think would be
useful/interesting to share here? 
> 
> I have also started looking into xen's code as Dario suggested, but am
> not able to find proper documentation for xen.
>
Heh, I think I see what you mean... Consider that, when you reach a
certain point, e.g., wanting to understand the code of a complex project
like Xen, the best documentation for the source code is the source code
itself.

> 1. Looking for a brief explanation of different fields in scheduler
> data structure in sched-if.h
>
> 2. From where do the different fields of the scheduler structure get
> called.
>
As said above, something like that does not exist for any reasonably big
and reasonably complex piece of software. There are many reasons. One,
for instance, is that it would take a great effort to be put together,
and it will get out of date in a matter of a few months, at most (but
even a few weeks, or a few days, if you're unlucky).

All that being said, you can have a look here:
 http://wiki.xen.org/wiki/Credit2_Scheduler_Development

It's not exactly what you're asking, but it's probably the closest
existing thing.

Also, as a proof that I was speaking the truth just above:
 1. it's incomplete
 2. it's (slightly) out of date already

 :-(

> 3. The driver i'll be writing will it be running at host machine level
> or guest OS level ?. 
>
Definitely host.

Actually, at the guest level, it's already there... it's powerclamp
itself, isn't it (at least in the case of Linux) ? :-D

> As far as my understanding goes we should have it at host level to
> optimize the performance of all the guests, 
>
Yes, exactly. Also, it's only the host that has a big enough picture,
and access to all the information and the data you need.

> since VMs deal with the abstract interface (VCPU) so they wont be
> having the exact notion of the various parameters of cpu at runtime.
>
Indeed. They won't have the exact notion of a bunch of stuff. :-)

Regards,
Dario

-- 
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)




Re: [Xen-devel] xen-unstabel + linux 4.2: MMIO emulation failed: d23v0 64bit @ 0010:ffffffff814e2b1c -> 66 89 02 48 8d 55 c0 48 89 5d c0 44 89 65 c8 e8

2015-07-27 Thread Paul Durrant
> -Original Message-
> From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
> Sent: 27 July 2015 15:08
> To: li...@eikelenboom.it; xen-devel@lists.xen.org
> Cc: Paul Durrant
> Subject: Re: [Xen-devel] xen-unstabel + linux 4.2: MMIO emulation failed:
> d23v0 64bit @ 0010:814e2b1c -> 66 89 02 48 8d 55 c0 48 89 5d c0 44 89
> 65 c8 e8
> 
> On 24/07/15 19:56, li...@eikelenboom.it wrote:
> > Hi All,
> >
> > On my AMD system running xen-unstable (last commit: ),
> >  after a few restarts of a HVM guest with pci-passthrough i got these
> > on shutdown of the guest:
> > (never seen this before, so it should be something triggered by a
> > recent commit)
> >
> > --
> > Sander
> >
> >
> >  (probably lost before but that's lost)
> >
> > (XEN) [2015-07-24 18:46:53.732] domain_crash called from io.c:166
> > (XEN) [2015-07-24 18:46:53.732] io.c:165:d23v0 Weird HVM ioemulation
> > status 1.
> > (XEN) [2015-07-24 18:46:53.732] domain_crash called from io.c:166
> > (XEN) [2015-07-24 18:46:53.732] io.c:165:d23v0 Weird HVM ioemulation
> > status 1.
> > (XEN) [2015-07-24 18:46:53.732] domain_crash called from io.c:166
> > (XEN) [2015-07-24 18:46:53.732] io.c:165:d23v0 Weird HVM ioemulation
> > status 1.
> 
> Paul: this is very likely an issue with your emulation series.
> 
> 66 89 02 is mov %ax,(%rdx), but has ended up in handle_pio() which seems
> wrong.
> 

It suggests that the MMIO emulation failure did not clean up and thus the 
subsequent handle_pio() found the state machine in a bad state.
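
For reference, the decoding of the three opcode bytes quoted above can be
checked mechanically. What follows is only a minimal sketch: the struct and
function names are made up for illustration, and REX prefixes, SIB bytes and
displacements are deliberately not handled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result record, for illustration only. */
struct decoded_mov {
    bool opsize16;  /* 0x66 operand-size override prefix was present */
    uint8_t mod;    /* ModRM bits 7:6 (0 = memory operand, no displacement) */
    uint8_t reg;    /* ModRM bits 5:3 (source register number) */
    uint8_t rm;     /* ModRM bits 2:0 (base register number) */
};

/* Returns true if buf starts with an optional 0x66 prefix followed by
 * opcode 0x89 (MOV r/m, r) and a ModRM byte. */
bool decode_mov_store(const uint8_t *buf, size_t len, struct decoded_mov *out)
{
    size_t i = 0;

    out->opsize16 = false;
    if (i < len && buf[i] == 0x66) {
        out->opsize16 = true;
        i++;
    }
    if (i + 1 >= len || buf[i] != 0x89)
        return false;

    uint8_t modrm = buf[i + 1];
    out->mod = modrm >> 6;
    out->reg = (modrm >> 3) & 7;
    out->rm = modrm & 7;
    return true;
}
```

For the bytes 66 89 02 this gives the 16-bit prefix, mod=0, reg=0 (%ax) and
rm=2 (%rdx as base), i.e. a 16-bit store to memory, not a port access.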

  Paul

> Sander: Please can you rerun with the following debug
> 
> diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
> index d3b9cae..7560d08 100644
> --- a/xen/arch/x86/hvm/io.c
> +++ b/xen/arch/x86/hvm/io.c
> @@ -163,7 +163,9 @@ int handle_pio(uint16_t port, unsigned int size, int
> dir)
>  break;
>  default:
>  gdprintk(XENLOG_ERR, "Weird HVM ioemulation status %d.\n", rc);
> -domain_crash(curr->domain);
> +show_execution_state(curr);
> +dump_execution_state();
> +domain_crash_synchronous(curr->domain);
>  break;
>  }
> 
> ~Andrew



Re: [Xen-devel] [PATCH] tools/libxl: Fixes to stream v2 task joining logic

2015-07-27 Thread Andrew Cooper
On 24/07/15 12:41, Ian Jackson wrote:
> Andrew Cooper writes ("[PATCH] tools/libxl: Fixes to stream v2 task joining 
> logic"):
>> During review of the libxl migration v2 series, I changed the task
>> joining logic, but clearly didn't think the result through
>> properly. This patch fixes several errors.
> This would have been much easier to review if it had been split into 3
> patches.  I have gone mostly by the commit message because it was hard
> to see which hunk belonged to what.

I have split them up. 

>
>> 3) Avoid stacking of check_all_finished() via synchronous teardown of
>> tasks.  If the _abort() functions call back synchronously,
>> stream->completion_callback() ends up getting called twice, as first and
>> last check_all_finished() frames observe each task being finished.
> I think this part of the patch is fine.
>
>
>> 1) Do not call check_all_finished() in the success cases of
>> libxl__xc_domain_{save,restore}_done().  It serves no specific purpose
>> as the save helper state will report itself as inactive by this point,
>> and avoids triggering a second stream->completion_callback() in the case
>> that write_toolstack_record()/stream_continue() record errors
>> synchronously themselves.
> "Serves no specific purpose" other than having a single exit path,
> which makes matters much less confusing.

"Serves no specific purpose" in so far as what check_all_finished()
would do in the success case.

>
> I think the problem may be that libxl__xc_domain_{save,restore}_done
> fail to "return" after "write_toolstack_record" and "stream_continue".
> That seems like simply a bug.  I'm sorry that I didn't notice it in
> review.

After some more thought, I don't believe that my fix is necessarily
correct.  If a condition were to exist where the stream had recorded an
error and abort()'ed the save helper, but the save helper was already
exiting with a success condition, then the callback wouldn't be fired at
all.  I have a proposed alternate solution.

>
> In general each callback function should set up exactly one other
> callback.  If it does anything else then reentrancy hazards arise.

The entire point of this logic is that there are multiple operations
going on in parallel.  It is not guaranteed that a save helper will ever
be spawned on the read side (although this would be a very useless
stream).  The state of the libxl stream read/write object is
deliberately separate from the save helper.

>
> Also, it is confusing and perhaps wrong that write_toolstack_record
> calls stream_complete.  What if there are other threads of control
> outstanding ?

This is exactly the problem which check_all_finished() is supposed to
solve, but currently doesn't.
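
To make the intended behaviour concrete, here is a minimal sketch of such a
join, assuming just two parallel tasks. All names and fields are hypothetical
and are not libxl's actual structures or functions. It shows a first-error-wins
rc, a guard against nested frames when an abort calls back synchronously, and
a completion callback that fires exactly once.

```c
#include <stdbool.h>

/* Hypothetical join state for two parallel tasks; not libxl's real types. */
struct join_state {
    bool stream_running;
    bool helper_running;
    bool in_check;        /* true while a check_all_finished() frame is active */
    bool completed;       /* completion callback already delivered */
    int rc;               /* first error recorded, 0 on success */
    int callbacks_fired;  /* stands in for the completion callback */
};

void check_all_finished(struct join_state *s, int rc);

/* A task reports that it has finished (rc != 0 on failure or abort). */
void stream_done(struct join_state *s, int rc)
{
    s->stream_running = false;
    check_all_finished(s, rc);
}

void helper_done(struct join_state *s, int rc)
{
    s->helper_running = false;
    check_all_finished(s, rc);
}

void check_all_finished(struct join_state *s, int rc)
{
    if (s->rc == 0)
        s->rc = rc;          /* keep only the first error */

    if (s->in_check)
        return;              /* nested frame: the outermost one finishes up */

    s->in_check = true;
    if (s->rc != 0) {
        /* Abort whatever is still running; these call back synchronously. */
        if (s->stream_running)
            stream_done(s, s->rc);
        if (s->helper_running)
            helper_done(s, s->rc);
    }
    s->in_check = false;

    if (!s->stream_running && !s->helper_running && !s->completed) {
        s->completed = true;
        s->callbacks_fired++;
    }
}
```

In this model a task that exits successfully after the other one has already
failed still ends with a single callback carrying the recorded error.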

>
>
>> 2) Only ever set stream->rc in stream_complete().  The first version of
>> the migration v2 series had separate rc and joined_rc parameters, where
>> this logic worked.  However when combining the two, the teardown path
>> fails to trigger if stream_done() records stream->rc itself.  A side
>> effect of this is that stream_done() needs to take an rc parameter.
> "the teardown path fails to trigger if stream_done() records
> stream->rc itself" but in the code I am looking at neither of the
> functions stream_done assign to rc.

I have no idea why I wrote what I did.  The code was correct but the
description was wrong.  I have fixed it in the split version of this patch.

~Andrew



Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-27 Thread Boris Ostrovsky

On 07/27/2015 10:09 AM, Dario Faggioli wrote:
> On Fri, 2015-07-24 at 18:10 +0200, Juergen Gross wrote:
>> On 07/24/2015 05:58 PM, Dario Faggioli wrote:
>>> So, just to check if my understanding is correct: you'd like to add an
>>> abstraction layer, in Linux, like in generic (or, perhaps, scheduling)
>>> code, to hide the direct interaction with CPUID.
>>> Such layer, on baremetal, would just read CPUID while, on PV-ops, it'd
>>> check with Xen/match vNUMA/whatever... Is this what you are saying?
>>
>> Sort of, yes.
>>
>> I just wouldn't add it, as it is already existing (more or less). It
>> can deal right now with AMD and Intel, we would "just" have to add Xen.
>
> So, having gone through the rest of the thread (so far), and having
> given a fair amount of thinking to this, I really think that something
> like this would be a good thing to have in Linux.
>
> Of course, it's not that my opinion on what should be in Linux counts
> that much! :-D   Nevertheless, I wanted to make it clear that, while
> skeptical at the beginning, I now think this is (part of) the way to go,
> as I said and explained in my reply to George.


And I continue to believe that kernel solution does not address the 
userland problem which is no less important than making kernel do proper 
scheduling decisions (and I suspect when this patch goes for review 
that's what the scheduling people are going to say).


Remember the original problem that started this thread was that kernel 
complained that topology didn't make sense and it turned off all 
topology-related decisions. Which means that kernel already has a 
solution for weird topology. Some enumeration doesn't trigger this 
warning, but we can come up with one that does. Or we can indeed have a 
patch in kernel that will, possibly silently, fail topology_sane() when 
virtualized and not pinned.


(This is what I assume kernel does when topology_sane() fails. And if it 
doesn't, that's a bug IMO)
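
As a rough illustration of that last option, the check could be modelled as
below. This is only a sketch under stated assumptions: the struct and function
names are hypothetical, and this is not the kernel's actual topology_sane()
code or signature. The only ideas taken from the discussion are that CPUs
which CPUID claims are siblings must sit on the same node, and that the check
could be made to fail outright for a virtualized, unpinned guest.

```c
#include <stdbool.h>

/* Hypothetical, simplified per-CPU record (not the kernel's own struct). */
struct cpu_topo {
    int numa_node;   /* (v)NUMA node the CPU is reported on */
    int package_id;  /* package ID derived from CPUID */
    int core_id;     /* core ID derived from CPUID */
};

/* Do the CPUID-derived IDs claim that a and b are threads of one core? */
bool cpuid_says_siblings(const struct cpu_topo *a, const struct cpu_topo *b)
{
    return a->package_id == b->package_id && a->core_id == b->core_id;
}

/*
 * Sketch of the sanity rule: claimed siblings must sit on the same node,
 * and (the extra rule floated above) CPUID is never trusted when running
 * virtualized without pinning.
 */
bool topo_sane_sketch(const struct cpu_topo *a, const struct cpu_topo *b,
                      bool virtualized, bool pinned)
{
    if (virtualized && !pinned)
        return false;
    return a->numa_node == b->numa_node;
}

/* Link two CPUs as siblings only if CPUID says so and the check passes. */
bool link_as_siblings(const struct cpu_topo *a, const struct cpu_topo *b,
                      bool virtualized, bool pinned)
{
    return cpuid_says_siblings(a, b) &&
           topo_sane_sketch(a, b, virtualized, pinned);
}
```

With the check failing, every CPU ends up without siblings, which corresponds
to turning off the topology-related decisions.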


The licensing problem that Juergen described can be solved by pinning
vcpus and exposing HT bit. Besides, creating a guest with 24 VCPUs and
hoping that 16-core licensing will work I think is pushing it a bit when
you know that VCPUs will jump around cores (i.e. "on average" you are
running on more than 16 cores -- multi-threaded or not -- which arguably
is what licensing is trying to prevent)


-boris



Re: [Xen-devel] Regression in OVMF + RMRR series

2015-07-27 Thread Wei Liu
I forgot to mention:  you need to use --enable-ovmf in ./configure to
enable OVMF support.

Wei.



Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-27 Thread Juergen Gross

On 07/27/2015 04:34 PM, Boris Ostrovsky wrote:
> On 07/27/2015 10:09 AM, Dario Faggioli wrote:
>> On Fri, 2015-07-24 at 18:10 +0200, Juergen Gross wrote:
>>> On 07/24/2015 05:58 PM, Dario Faggioli wrote:
>>>> So, just to check if my understanding is correct: you'd like to add an
>>>> abstraction layer, in Linux, like in generic (or, perhaps, scheduling)
>>>> code, to hide the direct interaction with CPUID.
>>>> Such layer, on baremetal, would just read CPUID while, on PV-ops, it'd
>>>> check with Xen/match vNUMA/whatever... Is this what you are saying?
>>>
>>> Sort of, yes.
>>>
>>> I just wouldn't add it, as it is already existing (more or less). It
>>> can deal right now with AMD and Intel, we would "just" have to add Xen.
>>
>> So, having gone through the rest of the thread (so far), and having
>> given a fair amount of thinking to this, I really think that something
>> like this would be a good thing to have in Linux.
>>
>> Of course, it's not that my opinion on what should be in Linux counts
>> that much! :-D   Nevertheless, I wanted to make it clear that, while
>> skeptical at the beginning, I now think this is (part of) the way to go,
>> as I said and explained in my reply to George.
>
> And I continue to believe that kernel solution does not address the
> userland problem which is no less important than making kernel do proper
> scheduling decisions (and I suspect when this patch goes for review
> that's what the scheduling people are going to say).
>
> Remember the original problem that started this thread was that kernel
> complained that topology didn't make sense and it turned off all
> topology-related decisions. Which means that kernel already has a
> solution for weird topology. Some enumeration doesn't trigger this
> warning, but we can come up with one that does. Or we can indeed have a
> patch in kernel that will, possibly silently, fail topology_sane() when
> virtualized and not pinned.

How would you come up with a topology the kernel is complaining about
and user mode scheduling will use for sane decisions ?

> (This is what I assume kernel does when topology_sane() fails. And if it
> doesn't, that's a bug IMO)
>
> The licensing problem that Juergen described can be solved by pinning
> vcpus and exposing HT bit. Besides, creating a guest with 24 VCPUs and

Hmm, yes. This way you sacrifice most of the virtualization advantages.

> hoping that 16-core licensing will work I think is pushing it a bit when
> you know that VCPUs will jump around cores (i.e. "on average" you are
> running on more than 16 cores -- multi-threaded or not -- which arguably
> is what licensing is trying to prevent)

On a machine with only 16 cores running on more than 16 cores? I have
some problems to believe this. The point was: if the license is happy on
bare metal it should be so when running on the same hardware as a guest.


Juergen



Re: [Xen-devel] xen-unstabel + linux 4.2: MMIO emulation failed: d23v0 64bit @ 0010:ffffffff814e2b1c -> 66 89 02 48 8d 55 c0 48 89 5d c0 44 89 65 c8 e8

2015-07-27 Thread Sander Eikelenboom

Monday, July 27, 2015, 4:07:39 PM, you wrote:

> On 24/07/15 19:56, li...@eikelenboom.it wrote:
>> Hi All,
>>
>> On my AMD system running xen-unstable (last commit: ),
>>  after a few restarts of a HVM guest with pci-passthrough i got these
>> on shutdown of the guest:
>> (never seen this before, so it should be something triggered by a
>> recent commit)
>>
>> -- 
>> Sander
>>
>>
>>  (probably lost before but that's lost)
>>
>> (XEN) [2015-07-24 18:46:53.732] domain_crash called from io.c:166
>> (XEN) [2015-07-24 18:46:53.732] io.c:165:d23v0 Weird HVM ioemulation
>> status 1.
>> (XEN) [2015-07-24 18:46:53.732] domain_crash called from io.c:166
>> (XEN) [2015-07-24 18:46:53.732] io.c:165:d23v0 Weird HVM ioemulation
>> status 1.
>> (XEN) [2015-07-24 18:46:53.732] domain_crash called from io.c:166
>> (XEN) [2015-07-24 18:46:53.732] io.c:165:d23v0 Weird HVM ioemulation
>> status 1.

> Paul: this is very likely an issue with your emulation series.

> 66 89 02 is mov %ax,(%rdx), but has ended up in handle_pio() which seems
> wrong.

> Sander: Please can you rerun with the following debug

Well, I have only seen this once now... it hasn't happened again so far, so
it's not very reproducible, I'm afraid.

I can see when i can make some time to run a script that does a loop
on creating and shutting down a guest with pci-passthrough, see if i can
get it to fail again.

--
Sander


> diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
> index d3b9cae..7560d08 100644
> --- a/xen/arch/x86/hvm/io.c
> +++ b/xen/arch/x86/hvm/io.c
> @@ -163,7 +163,9 @@ int handle_pio(uint16_t port, unsigned int size, int
> dir)
>  break;
>  default:
>  gdprintk(XENLOG_ERR, "Weird HVM ioemulation status %d.\n", rc);
> -domain_crash(curr->domain);
> +show_execution_state(curr);
> +dump_execution_state();
> +domain_crash_synchronous(curr->domain);
>  break;
>  }

> ~Andrew





Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-27 Thread Juergen Gross

On 07/27/2015 04:34 PM, Boris Ostrovsky wrote:
> On 07/27/2015 10:09 AM, Dario Faggioli wrote:
>> On Fri, 2015-07-24 at 18:10 +0200, Juergen Gross wrote:
>>> On 07/24/2015 05:58 PM, Dario Faggioli wrote:
>>>> So, just to check if my understanding is correct: you'd like to add an
>>>> abstraction layer, in Linux, like in generic (or, perhaps, scheduling)
>>>> code, to hide the direct interaction with CPUID.
>>>> Such layer, on baremetal, would just read CPUID while, on PV-ops, it'd
>>>> check with Xen/match vNUMA/whatever... Is this what you are saying?
>>>
>>> Sort of, yes.
>>>
>>> I just wouldn't add it, as it is already existing (more or less). It
>>> can deal right now with AMD and Intel, we would "just" have to add Xen.
>>
>> So, having gone through the rest of the thread (so far), and having
>> given a fair amount of thinking to this, I really think that something
>> like this would be a good thing to have in Linux.
>>
>> Of course, it's not that my opinion on what should be in Linux counts
>> that much! :-D   Nevertheless, I wanted to make it clear that, while
>> skeptical at the beginning, I now think this is (part of) the way to go,
>> as I said and explained in my reply to George.
>
> And I continue to believe that kernel solution does not address the
> userland problem which is no less important than making kernel do proper
> scheduling decisions (and I suspect when this patch goes for review
> that's what the scheduling people are going to say).

I didn't say it would solve that problem. It would decouple kernel
scheduling from the cpuid values, so that we are free to present to user
land whatever cpuid values are needed there.


Juergen



Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-27 Thread Boris Ostrovsky

On 07/27/2015 10:43 AM, Juergen Gross wrote:
> On 07/27/2015 04:34 PM, Boris Ostrovsky wrote:
>> On 07/27/2015 10:09 AM, Dario Faggioli wrote:
>>> On Fri, 2015-07-24 at 18:10 +0200, Juergen Gross wrote:
>>>> On 07/24/2015 05:58 PM, Dario Faggioli wrote:
>>>>> So, just to check if my understanding is correct: you'd like to add an
>>>>> abstraction layer, in Linux, like in generic (or, perhaps, scheduling)
>>>>> code, to hide the direct interaction with CPUID.
>>>>> Such layer, on baremetal, would just read CPUID while, on PV-ops, it'd
>>>>> check with Xen/match vNUMA/whatever... Is this what you are saying?
>>>>
>>>> Sort of, yes.
>>>>
>>>> I just wouldn't add it, as it is already existing (more or less). It
>>>> can deal right now with AMD and Intel, we would "just" have to add Xen.
>>>
>>> So, having gone through the rest of the thread (so far), and having
>>> given a fair amount of thinking to this, I really think that something
>>> like this would be a good thing to have in Linux.
>>>
>>> Of course, it's not that my opinion on what should be in Linux counts
>>> that much! :-D   Nevertheless, I wanted to make it clear that, while
>>> skeptical at the beginning, I now think this is (part of) the way to go,
>>> as I said and explained in my reply to George.
>>
>> And I continue to believe that kernel solution does not address the
>> userland problem which is no less important than making kernel do proper
>> scheduling decisions (and I suspect when this patch goes for review
>> that's what the scheduling people are going to say).
>>
>> Remember the original problem that started this thread was that kernel
>> complained that topology didn't make sense and it turned off all
>> topology-related decisions. Which means that kernel already has a
>> solution for weird topology. Some enumeration doesn't trigger this
>> warning, but we can come up with one that does. Or we can indeed have a
>> patch in kernel that will, possibly silently, fail topology_sane() when
>> virtualized and not pinned.
>
> How would you come up with a topology the kernel is complaining about
> and user mode scheduling will use for sane decisions ?

We need to understand first why Dario's box is apparently the only one
resulting in a warning and probably then emulate that enumeration.

And again, if that is not possible then just make topology_sane() fail.

>> (This is what I assume kernel does when topology_sane() fails. And if it
>> doesn't, that's a bug IMO)
>>
>> The licensing problem that Juergen described can be solved by pinning
>> vcpus and exposing HT bit. Besides, creating a guest with 24 VCPUs and
>
> Hmm, yes. This way you sacrifice most of the virtualization advantages.
>
>> hoping that 16-core licensing will work I think is pushing it a bit when
>> you know that VCPUs will jump around cores (i.e. "on average" you are
>> running on more than 16 cores -- multi-threaded or not -- which arguably
>> is what licensing is trying to prevent)
>
> On a machine with only 16 cores running on more than 16 cores? I have
> some problems to believe this. The point was: if the license is happy on
> bare metal it should be so when running on the same hardware as a guest.

Ok, that's not how I should have described it. I meant that IMO asking
for 24 VCPUs is somewhat akin to oversubscribing since you kind of know
that you don't have 24 PCPUs, you are just trying to fool the kernel
into thinking that threads are cores.


-boris




Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-27 Thread Dario Faggioli
On Mon, 2015-07-27 at 10:34 -0400, Boris Ostrovsky wrote:
> On 07/27/2015 10:09 AM, Dario Faggioli wrote:

> > Of course, it's not that my opinion on what should be in Linux counts
> > that much! :-D   Nevertheless, I wanted to make it clear that, while
> > skeptical at the beginning, I now think this is (part of) the way to go,
> > as I said and explained in my reply to George.
> 
> And I continue to believe that kernel solution does not address the 
> userland problem which is no less important than making kernel do proper 
> scheduling decisions 
>
Sure, but the key point now is, IMO, whether we recognise that we're
dealing with two problems, which I think is the case.

One problem is:
 1. CPUID in VMs is unreliable, can we rely on it as little as possible?

The other problem is:
 2. if someone wants to rely on CPUID, what should we do?

Avoiding having the scheduling domains (and/or whatever else!) rely on
something that is unreliable by definition, which is what is being
proposed, seems to me a pretty decent solution to 1.

> (and I suspect when this patch goes for review 
> that's what the scheduling people are going to say).
> 
Perhaps. However, I may be too optimistic, but I think that making them
realize that all we are doing is stopping relying on unreliable grounds
would not be impossible... Especially if the code really looks as
Juergen said, i.e., it's already quite friendly to implementing
something like this...

> Remember the original problem that started this thread was that kernel 
> complained that topology didn't make sense and it turned off all 
> topology-related decisions. Which means that kernel already has a 
> solution for weird topology. Some enumeration doesn't trigger this 
> warning, but we can come up with one that does.
>
Exactly, because it relies on CPUID, which is unreliable!! :-)

> Or we can indeed have a 
> patch in kernel that will, possibly silently, fail topology_sane() when 
> virtualized and not pinned.
> 
Well, I think that, if I were a Linux scheduler or x86 maintainer, I
would object much more to such a patch than to one that makes
the kernel stop relying on an instruction that gives pCPU-specific
results to build up long-lived information, in a context where pCPUs
change. :-)

Dario
-- 
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)




[Xen-devel] [linux-linus test] 60005: regressions - FAIL

2015-07-27 Thread osstest service owner
flight 60005 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/60005/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf-xsm   5 xen-build fail REGR. vs. 59254
 build-i386        5 xen-build fail REGR. vs. 59254
 build-amd64-pvops 5 kernel-build  fail REGR. vs. 59254
 build-i386-pvops  5 kernel-build  fail REGR. vs. 59254
 build-armhf-pvops 5 kernel-build  fail REGR. vs. 59254

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl            1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-xl-pvh-amd   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-intel  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-xsm        1 build-check(1)   blocked  n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-amd64-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-i386-qemut-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-i386-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemut-debianhvm-amd64  1 build-check(1) blocked n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-amd64-pair 1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-rtds  1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)   blocked  n/a
 test-amd64-i386-qemut-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm  1 build-check(1) blocked n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemut-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemut-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemut-winxpsp3  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 build-i386-libvirt            1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  1 build-check(1) blocked n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   blocked  n/a
 build-i386-rumpuserxen        1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-winxpsp3  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemut-winxpsp3  1 build-check(1)   blocked n/a
 test-amd64-amd64-xl-qemuu-winxpsp3  1 build-check(1)   bloc

Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-27 Thread Juergen Gross

On 07/27/2015 04:51 PM, Boris Ostrovsky wrote:
> On 07/27/2015 10:43 AM, Juergen Gross wrote:
>> On 07/27/2015 04:34 PM, Boris Ostrovsky wrote:
>>> On 07/27/2015 10:09 AM, Dario Faggioli wrote:
>>>> On Fri, 2015-07-24 at 18:10 +0200, Juergen Gross wrote:
>>>>> On 07/24/2015 05:58 PM, Dario Faggioli wrote:
>>>>>> So, just to check if my understanding is correct: you'd like to add an
>>>>>> abstraction layer, in Linux, like in generic (or, perhaps, scheduling)
>>>>>> code, to hide the direct interaction with CPUID.
>>>>>> Such layer, on baremetal, would just read CPUID while, on PV-ops, it'd
>>>>>> check with Xen/match vNUMA/whatever... Is this what you are saying?
>>>>>
>>>>> Sort of, yes.
>>>>>
>>>>> I just wouldn't add it, as it is already existing (more or less). It
>>>>> can deal right now with AMD and Intel, we would "just" have to add Xen.
>>>>
>>>> So, having gone through the rest of the thread (so far), and having
>>>> given a fair amount of thinking to this, I really think that something
>>>> like this would be a good thing to have in Linux.
>>>>
>>>> Of course, it's not that my opinion on what should be in Linux counts
>>>> that much! :-D   Nevertheless, I wanted to make it clear that, while
>>>> skeptical at the beginning, I now think this is (part of) the way to go,
>>>> as I said and explained in my reply to George.
>>>
>>> And I continue to believe that kernel solution does not address the
>>> userland problem which is no less important than making kernel do proper
>>> scheduling decisions (and I suspect when this patch goes for review
>>> that's what the scheduling people are going to say).
>>>
>>> Remember the original problem that started this thread was that kernel
>>> complained that topology didn't make sense and it turned off all
>>> topology-related decisions. Which means that kernel already has a
>>> solution for weird topology. Some enumeration doesn't trigger this
>>> warning, but we can come up with one that does. Or we can indeed have a
>>> patch in kernel that will, possibly silently, fail topology_sane() when
>>> virtualized and not pinned.
>>
>> How would you come up with a topology the kernel is complaining about
>> and user mode scheduling will use for sane decisions ?
>
> We need to understand first why Dario's box is apparently the only one
> resulting in a warning and probably then emulate that enumeration.

This will lead to other problems in user land e.g. with hwloc.

> And again, if that is not possible then just make topology_sane() fail.

And again: at one point you claim that kernel mode isn't everything, and
here you fail to respect possible user land requirements.

>>> (This is what I assume kernel does when topology_sane() fails. And if it
>>> doesn't, that's a bug IMO)
>>>
>>> The licensing problem that Juergen described can be solved by pinning
>>> vcpus and exposing HT bit. Besides, creating a guest with 24 VCPUs and
>>
>> Hmm, yes. This way you sacrifice most of the virtualization advantages.
>>
>>> hoping that 16-core licensing will work I think is pushing it a bit when
>>> you know that VCPUs will jump around cores (i.e. "on average" you are
>>> running on more than 16 cores -- multi-threaded or not -- which arguably
>>> is what licensing is trying to prevent)
>>
>> On a machine with only 16 cores running on more than 16 cores? I have
>> some problems to believe this. The point was: if the license is happy on
>> bare metal it should be so when running on the same hardware as a guest.
>
> Ok, that's not how I should have described it. I meant that IMO asking
> for 24 VCPUs is somewhat akin to oversubscribing since you kind of know
> that you don't have 24 PCPUs, you are just trying to fool the kernel
> into thinking that threads are cores.

/proc/cpuinfo on bare metal will list 32 cpus. xl info in dom0 will list
32 cpus. You have 32 entities where you can do scheduling. So what's the
problem having a domU with 24 vcpus? There are still 8 pcpus free for
e.g. dom0 then.


Juergen

