date:20151105

Re: [Xen-devel] [PATCH] smpboot: Add smpboot state variables instead of reusing CPU hotplug states

2015-11-05 Thread Daniel Wagner

Hi Paul,

I guess this patch got the summer conference period treatment. ACK,
NACK, completely STUPID idea?

cheers,
daniel

On 10/15/2015 01:32 PM, Daniel Wagner wrote:
> The cpu hotplug state machine in smpboot.c is reusing the states from
> cpu.h. That is confusing when it comes to the CPU_DEAD_FROZEN usage.
> Paul explained to me that he was in need of an additional state
> for distinguishing between CPU error states. For this he just
> picked CPU_DEAD_FROZEN.
> 
> 8038dad7e888581266c76df15d70ca457a3c5910 smpboot: Add common code for 
> notification from dying CPU
> 2a442c9c6453d3d043dfd89f2e03a1deff8a6f06 x86: Use common 
> outgoing-CPU-notification code
> 
> Instead of reusing the states, let's add new definitions inside
> the smpboot.c file with an explanation of what those states
> mean. Thanks Paul for providing them.
> 
> Signed-off-by: Daniel Wagner 
> Cc: Thomas Gleixner 
> Cc: "Paul E. McKenney" 
> Cc: Peter Zijlstra 
> Cc: xen-de...@lists.xenproject.org
> Cc: linux-ker...@vger.kernel.org
> ---
>  arch/x86/xen/smp.c  |  4 +--
>  include/linux/cpu.h |  3 +-
>  kernel/smpboot.c| 82 
> -
>  3 files changed, 67 insertions(+), 22 deletions(-)
> 
> diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
> index 3f4ebf0..804bf5c 100644
> --- a/arch/x86/xen/smp.c
> +++ b/arch/x86/xen/smp.c
> @@ -495,7 +495,7 @@ static int xen_cpu_up(unsigned int cpu, struct 
> task_struct *idle)
>   rc = HYPERVISOR_vcpu_op(VCPUOP_up, cpu, NULL);
>   BUG_ON(rc);
>  
> - while (cpu_report_state(cpu) != CPU_ONLINE)
> + while (!cpu_check_online(cpu))
>   HYPERVISOR_sched_op(SCHEDOP_yield, NULL);
>  
>   return 0;
> @@ -767,7 +767,7 @@ static int xen_hvm_cpu_up(unsigned int cpu, struct 
> task_struct *tidle)
>* This can happen if CPU was offlined earlier and
>* offlining timed out in common_cpu_die().
>*/
> - if (cpu_report_state(cpu) == CPU_DEAD_FROZEN) {
> + if (cpu_check_timeout(cpu)) {
>   xen_smp_intr_free(cpu);
>   xen_uninit_lock_cpu(cpu);
>   }
> diff --git a/include/linux/cpu.h b/include/linux/cpu.h
> index 23c30bd..f78ab46 100644
> --- a/include/linux/cpu.h
> +++ b/include/linux/cpu.h
> @@ -284,7 +284,8 @@ void arch_cpu_idle_dead(void);
>  
>  DECLARE_PER_CPU(bool, cpu_dead_idle);
>  
> -int cpu_report_state(int cpu);
> +int cpu_check_online(int cpu);
> +int cpu_check_timeout(int cpu);
>  int cpu_check_up_prepare(int cpu);
>  void cpu_set_state_online(int cpu);
>  #ifdef CONFIG_HOTPLUG_CPU
> diff --git a/kernel/smpboot.c b/kernel/smpboot.c
> index a818cbc..75e5724 100644
> --- a/kernel/smpboot.c
> +++ b/kernel/smpboot.c
> @@ -371,19 +371,63 @@ int smpboot_update_cpumask_percpu_thread(struct 
> smp_hotplug_thread *plug_thread,
>  }
>  EXPORT_SYMBOL_GPL(smpboot_update_cpumask_percpu_thread);
>  
> +/* The CPU is offline, and its last offline operation was
> + * successful and proceeded normally.  (Or, alternatively, the
> + * CPU never has come online, as this is the initial state.)
> + */
> +#define CPUHP_POST_DEAD  0x01
> +
> +/* The CPU is in the process of coming online.
> + * Simple architectures can skip this state, and just invoke
> + * cpu_set_state_online() unconditionally instead.
> + */
> +#define CPUHP_UP_PREPARE 0x02
> +
> +/* The CPU is now online.  Simple architectures can skip this
> + * state, and just invoke cpu_wait_death() and cpu_report_death()
> + * unconditionally instead.
> + */
> +#define CPUHP_ONLINE 0x03
> +
> +/* The CPU has gone offline, so that it may now be safely
> + * powered off (or whatever the architecture needs to do to it).
> + */
> +#define CPUHP_DEAD   0x04
> +
> +/* The CPU did not go offline in a timely fashion, if at all,
> + * so it might need special processing at the next online (for
> + * example, simply refusing to bring it online).
> + */
> +#define CPUHP_BROKEN 0x05
> +
> +/* The CPU eventually did go offline, but not in a timely
> + * fashion.  If some sort of reset operation is required before it
> + * can be brought online, that reset operation needs to be carried
> + * out at online time.  (Or, again, the architecture might simply
> + * refuse to bring it online.)
> + */
> +#define CPUHP_TIMEOUT 0x06
> +
>  static DEFINE_PER_CPU(atomic_t, cpu_hotplug_state) = 
> ATOMIC_INIT(CPU_POST_DEAD);
>  
>  /*
>   * Called to poll specified CPU's state, for example, when waiting for
>   * a CPU to come online.
>   */
> -int cpu_report_state(int cpu)
> +int cpu_check_online(int cpu)
> +{
> + return atomic_read(&per_cpu(cpu_hotplug_state, cpu)) ==
> +CPUHP_ONLINE;
> +}
> +
> +int cpu_check_timeout(int cpu)
>  {
> - return atomic_read(&per_cpu(cpu_hotplug_state, cpu));
> + return atomic_read(&per_cpu(cpu_hotplug_state, cpu)) ==
> +CPUHP_TIMEOUT;
>  }
>  
>  /*
> - * If CPU has died properly, set its state to CPU_UP_PREPARE an

Re: [Xen-devel] [PATCH 4/4] xen/public: arm: rework the macro set_xen_guest_handle_raw

2015-11-05 Thread Jan Beulich

>>> On 04.11.15 at 18:06,  wrote:
> Jan Beulich writes ("Re: [PATCH 4/4] xen/public: arm: rework the macro 
> set_xen_guest_handle_raw"):
>> On 04.11.15 at 17:50,  wrote:
>> > If we don't provide a get_xen_guest_handle, a kernel developer will be
>> > sorely tempted to make one.
>> 
>> What use would it be to them? Kernels only write handles, they
>> shouldn't have a need for reading them.
> 
> I foresee situations where a kernel might like to update a proposed
> hypercall argument structure in place, which might involve reading the
> handles.

I guess you think of e.g. the privcmd filtering done in XenServer, but
I think this is an odd thing for a kernel to do: Down to the final actual
hypercall invocation, it should deal with pointers, not handles.
Filtering should either be done prior to reaching that layer (obviously
not an option for privcmd, but that layer is guarded against issues
with the compiler doing the wrong thing afaict), or would better be
left to the hypervisor (said filtering in XenServer could likely be moved
into the hypervisor, with a flag added to the hypercall number
indicating whether to invoke the filtering, which the privcmd layer
then would set unconditionally).

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [V9 1/3] x86/xsaves: enable xsaves/xrstors/xsavec in xen

2015-11-05 Thread Jan Beulich

>>> On 05.11.15 at 02:34,  wrote:
> On Wed, Nov 04, 2015 at 10:04:33AM -0700, Jan Beulich wrote:
>> >>> On 03.11.15 at 07:27,  wrote:
>> > @@ -158,6 +334,20 @@ void xsave(struct vcpu *v, uint64_t mask)
>> >  ptr->fpu_sse.x[FPU_WORD_SIZE_OFFSET] = word_size;
>> >  }
>> > +#define XSTATE_FIXUP ".section .fixup,\"ax\"  \n"\
>> > + "2: mov %5,%%ecx \n"\
>> > + "   xor %1,%1\n"\
>> > + "   rep stosb\n"\
>> > + "   lea %2,%0\n"\
>> > + "   mov %3,%1\n"\
>> > + "   jmp 1b   \n"\
>> > + ".previous   \n"\
>> > + _ASM_EXTABLE(1b, 2b)\
>> > + : "+&D" (ptr), "+&a" (lmask)\
>> > + : "m" (*ptr), "g" (lmask), "d" (hmask), \
>> > +   "m" (xsave_cntxt_size)\
>> > + : "ecx"
>> > +
>> >  void xrstor(struct vcpu *v, uint64_t mask)
>> >  {
>> >  uint32_t hmask = mask >> 32;
>> > @@ -187,39 +377,22 @@ void xrstor(struct vcpu *v, uint64_t mask)
>> >  switch ( __builtin_expect(ptr->fpu_sse.x[FPU_WORD_SIZE_OFFSET], 8) )
>> >  {
>> >  default:
>> > -asm volatile ( "1: .byte 0x48,0x0f,0xae,0x2f\n"
>> > -   ".section .fixup,\"ax\"  \n"
>> > -   "2: mov %5,%%ecx \n"
>> > -   "   xor %1,%1\n"
>> > -   "   rep stosb\n"
>> > -   "   lea %2,%0\n"
>> > -   "   mov %3,%1\n"
>> > -   "   jmp 1b   \n"
>> > -   ".previous   \n"
>> > -   _ASM_EXTABLE(1b, 2b)
>> > -   : "+&D" (ptr), "+&a" (lmask)
>> > -   : "m" (*ptr), "g" (lmask), "d" (hmask),
>> > - "m" (xsave_cntxt_size)
>> > -   : "ecx" );
>> > +alternative_input("1: "".byte 0x48,0x0f,0xae,0x2f",
>> > +  ".byte 0x48,0x0f,0xc7,0x1f",
>> > +  X86_FEATURE_XSAVES,
>> > +  "D" (ptr), "m" (*ptr), "a" (lmask), "d" (hmask));
>> > +asm volatile (XSTATE_FIXUP);
>> >  break;
>> >  case 4: case 2:
>> > -asm volatile ( "1: .byte 0x0f,0xae,0x2f\n"
>> > -   ".section .fixup,\"ax\" \n"
>> > -   "2: mov %5,%%ecx\n"
>> > -   "   xor %1,%1   \n"
>> > -   "   rep stosb   \n"
>> > -   "   lea %2,%0   \n"
>> > -   "   mov %3,%1   \n"
>> > -   "   jmp 1b  \n"
>> > -   ".previous  \n"
>> > -   _ASM_EXTABLE(1b, 2b)
>> > -   : "+&D" (ptr), "+&a" (lmask)
>> > -   : "m" (*ptr), "g" (lmask), "d" (hmask),
>> > - "m" (xsave_cntxt_size)
>> > -   : "ecx" );
>> > +alternative_input("1: "".byte 0x0f,0xae,0x2f",
>> > +  ".byte 0x0f,0xc7,0x1f",
>> > +  X86_FEATURE_XSAVES,
>> > +  "D" (ptr), "m" (*ptr), "a" (lmask), "d" (hmask));
>> > +asm volatile (XSTATE_FIXUP);
>> >  break;
>> >  }
>> >  }
>> > +#undef XSTATE_FIXUP
>> 
>> Repeating my comment on v8: "I wonder whether at least for the
>> restore side alternative asm wouldn't result in better readable code
>> and at the same time in a smaller patch." Did you at least look into
>> that option?
>> 
> I may have misunderstood your meaning. I addressed the comment by changing
> the restore side to use alternative_input. Is "alternative_input" not what
> you want?
> If it is not what you want, please give me some suggestions on how to
> address this.

Oh, I'm sorry, I should have looked more closely. The fact that
XSTATE_FIXUP survived made me draw wrong conclusions without
looking more closely. Now the bad news is - you can't split things
like this, as the compiler doesn't make any guarantees as to
register values between two asm()-s. The whole construct needs
to end up as a single asm(), which is why XSTATE_FIXUP is
unlikely to be of much use here (at least in the context of this
patch; a separate cleanup patch might eliminate the redundancy).

Jan


Re: [Xen-devel] Linux 4.4 MW: Boot under Xen fails with CONFIG_DEBUG_WX enabled: RIP: ptdump_walk_pgd_level_core

2015-11-05 Thread Sander Eikelenboom


On 2015-11-05 00:13, Boris Ostrovsky wrote:

On 11/04/2015 03:02 PM, Sander Eikelenboom wrote:

On 2015-11-04 19:47, Stephen Smalley wrote:

On 11/04/2015 01:28 PM, Sander Eikelenboom wrote:

On 2015-11-04 16:52, Stephen Smalley wrote:

On 11/04/2015 06:55 AM, Sander Eikelenboom wrote:

Hi All,

I just tried to boot with the current linus mergewindow tree under Xen.
It fails with a kernel panic at boot with the new "CONFIG_DEBUG_WX"
option enabled.
Disabling it makes the kernel boot fine.

The splat:
[   18.424241] Freeing unused kernel memory: 1104K 
(822fc000 -

8241)
[   18.430314] Write protecting the kernel read-only data: 18432k
[   18.441054] Freeing unused kernel memory: 1144K 
(880001ae2000 -

880001c0)
[   18.447966] Freeing unused kernel memory: 1560K 
(88000207a000 -

88000220)
[   18.453947] BUG: unable to handle kernel paging request at
88055c883000
[   18.459943] IP: []
ptdump_walk_pgd_level_core+0x20e/0x440
[   18.465847] PGD 2212067 PUD 0
[   18.471564] Oops:  [#1] SMP
[   18.477248] Modules linked in:
[   18.482918] CPU: 2 PID: 1 Comm: swapper/0 Not tainted
4.3.0-mw-20151104-linus-doflr+ #1
[   18.488804] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , 
BIOS

V1.8B1 09/13/2010
[   18.494778] task: 880059b9 ti: 880059b98000 
task.ti:

880059b98000
[   18.500852] RIP: e030:[] []
ptdump_walk_pgd_level_core+0x20e/0x440
[   18.507102] RSP: e02b:880059b9be48  EFLAGS: 00010296
[   18.513351] RAX: 88055c883000 RBX: 81ae2000 RCX:
8800
[   18.519733] RDX: 0067 RSI: 880059b9be98 RDI:
88001000
[   18.526129] RBP: 880059b9bf00 R08:  R09:

[   18.532522] R10: 88005fd0e790 R11: 0001 R12:
88008000
[   18.538891] R13: cfff R14: 880059b9be98 R15:

[   18.545247] FS:  () 
GS:88005f68()

knlGS:
[   18.551708] CS:  e033 DS:  ES:  CR0: 8005003b
[   18.558153] CR2: 88055c883000 CR3: 02211000 CR4:
0660
[   18.564686] Stack:
[   18.571106]  000159b9be50 82211000 88055c884000
0800
[   18.577704]  8000 88055c883000 0007
88005fd0e790
[   18.584291]  880059b9bed8 81156ace 0001

[   18.590916] Call Trace:
[   18.597458]  [] ? 
free_reserved_area+0x11e/0x120

[   18.604180]  []
ptdump_walk_pgd_level_checkwx+0x12/0x20
[   18.611014]  [] mark_rodata_ro+0xe9/0xf0
[   18.617819]  [] ? rest_init+0x80/0x80
[   18.624512]  [] kernel_init+0x18/0xe0
[   18.631095]  [] ret_from_fork+0x3f/0x70
[   18.637650]  [] ? rest_init+0x80/0x80
[   18.644178] Code: 70 ff ff ff 48 3b 85 58 ff ff ff 0f 84 c0 fe 
ff ff
48 8b 85 68 ff ff ff 48 c1 e0 10 48 c1 f8 10 48 89 45 b0 48 8b 85 
70 ff
ff ff <48> 8b 38 48 85 ff 0f 85 4e ff ff ff b9 02 00 00 00 31 d2 
4c 89

[   18.658246] RIP  []
ptdump_walk_pgd_level_core+0x20e/0x440
[   18.665211]  RSP 
[   18.672073] CR2: 88055c883000
[   18.678852] ---[ end trace d84e34461c40637a ]---
[   18.685641] Kernel panic - not syncing: Attempted to kill init!
exitcode=0x0009
[   18.685641]
[   18.699520] Kernel Offset: disable



What's your .config?  Does cat /sys/kernel/debug/kernel_page_tables
produce a similar fault even with CONFIG_DEBUG_WX=n?


.config is attached

Hmm that sysfs file doesn't seem to exist then:
# cat /sys/kernel/debug/kernel_page_tables
cat: /sys/kernel/debug/kernel_page_tables: No such file or directory


Needs CONFIG_X86_PTDUMP=y.
Also assumes you have debugfs mounted there.


Recompiled, and the result is that it also blows up:



Can you try this:


diff --git a/arch/x86/mm/dump_pagetables.c 
b/arch/x86/mm/dump_pagetables.c

index 1bf417e..b534216 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -362,8 +362,13 @@ static void ptdump_walk_pgd_level_core(struct
seq_file *m, pgd_t *pgd,
bool checkwx)
 {
 #ifdef CONFIG_X86_64
+/* 8000 - 87ff is reserved for hypervisor */
+#define is_hypervisor_range(idx)  (paravirt_enabled() && \
+  ((idx >= pgd_index(__PAGE_OFFSET) - 16) && \
+   (idx < pgd_index(__PAGE_OFFSET))))
 pgd_t *start = (pgd_t *) &init_level4_pgt;
 #else
+#define is_hypervisor_range(idx)   0
 pgd_t *start = swapper_pg_dir;
 #endif
 pgprotval_t prot;
@@ -381,7 +386,7 @@ static void ptdump_walk_pgd_level_core(struct
seq_file *m, pgd_t *pgd,

 for (i = 0; i < PTRS_PER_PGD; i++) {
 st.current_address = normalize_addr(i * PGD_LEVEL_MULT);
-if (!pgd_none(*start)) {
+if (!pgd_none(*start) && !is_hypervisor_range(i)) {
 if (pgd_large(*start) || !pgd_present(*start)) {
 prot = pgd_flags(*start);
 note_page(m, &st, __pgprot(prot), 1);


Hi Boris,

Thanks for your patch!
It makes "cat /

Re: [Xen-devel] [PATCH v3 2/4] arm64: Add xen_boot module file

2015-11-05 Thread Fu Wei

Hi Ian,

On 3 November 2015 at 23:22, Ian Campbell  wrote:
> On Tue, 2015-11-03 at 22:57 +0800, Fu Wei wrote:
>> Hi Vladimir,
>>
>> After discussing with Ian Campbell: since we can already load all
>> the necessary binaries for Xen boot on arm64, we don't really
>> need the "xen_module" command now.
>> But maybe someday Xen will need a new type of binary at boot time, and
>> then we will still need this support.
>
> You mean support for "--type" passed to the xen_module command, right? I
> thought the xen_module stuff had been applied. Or am I misunderstanding
> which bits have been applied?

Actually, I mean: the xen-module command exists for "--type" support. If we
don't need "--type" now, we can delete the xen-module code (which Vladimir
has already deleted from my patch, so upstream grub currently has no
--type support).
Vladimir has applied most of my patch, except the xen-module command code.

>
>> So I will submit a "xen_module" command patch soon, in case we need
>> it.
>
> Just to clarify, my suggestion was to repost the bits which were omitted
> from the prior patches just so that they are available in the ML archives
> etc should anyone ever want to resurrect them in the future.

Yes, that is what I am going to do.

>
> Ian.
>



-- 
Best regards,

Fu Wei
Software Engineer
Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
Ph: +86 21 61221326(direct)
Ph: +86 186 2020 4684 (mobile)
Room 1512, Regus One Corporate Avenue,Level 15,
One Corporate Avenue,222 Hubin Road,Huangpu District,
Shanghai,China 200021


[Xen-devel] Getting the XSAVE size from userspace

2015-11-05 Thread Razvan Cojocaru

Hello,

I need to get the XSAVE size from userspace. The easiest way seems to be
to use the XEN_DOMCTL_getvcpuextstate hypercall, but that hypercall is
not public / there's no xenctrl.h wrapper for it.

There's also struct hvm_hw_cpu_xsave, which I can get to, but it doesn't
have a size member:

/*
 * The save area of XSAVE/XRSTOR.
 */

struct hvm_hw_cpu_xsave {
    uint64_t xfeature_mask;        /* Ignored */
    uint64_t xcr0;                 /* Updated by XSETBV */
    uint64_t xcr0_accum;           /* Updated by XSETBV */
    struct {
        struct { char x[512]; } fpu_sse;

        struct {
            uint64_t xstate_bv;    /* Updated by XRSTOR */
            uint64_t reserved[7];
        } xsave_hdr;               /* The 64-byte header */

        struct { char x[0]; } ymm; /* YMM */
    } save_area;
};

I see that in the hypervisor code the length is computed by using the
HVM_CPU_XSAVE_SIZE() macro:

#define HVM_CPU_XSAVE_SIZE(xcr0) (offsetof(struct hvm_hw_cpu_xsave, \
                                           save_area) + \
                                  xstate_ctxt_size(xcr0))

where:

static unsigned int _xstate_ctxt_size(u64 xcr0)
{
    u64 act_xcr0 = get_xcr0();
    u32 eax, ebx = 0, ecx, edx;
    bool_t ok = set_xcr0(xcr0);

    ASSERT(ok);
    cpuid_count(XSTATE_CPUID, 0, &eax, &ebx, &ecx, &edx);
    ASSERT(ebx <= ecx);
    ok = set_xcr0(act_xcr0);
    ASSERT(ok);

    return ebx;
}

/* Fastpath for common xstate size requests, avoiding reloads of xcr0. */
unsigned int xstate_ctxt_size(u64 xcr0)
{
    if ( xcr0 == xfeature_mask )
        return xsave_cntxt_size;

    if ( xcr0 == 0 )
        return 0;

    return _xstate_ctxt_size(xcr0);
}

But that doesn't seem to translate cleanly to userspace code.

I had hoped that I would be able to get this with no custom Xen patches,
is there a simpler way I'm not aware of to get to this information? And
if there isn't, would you prefer a libxc patch that exposes
XEN_DOMCTL_getvcpuextstate, or one that adds a size member to struct
hvm_hw_cpu_xsave (I'd guess the latter)?


Thanks,
Razvan


[Xen-devel] [distros-debian-wheezy test] 38249: all pass

2015-11-05 Thread Platform Team regression test user

flight 38249 distros-debian-wheezy real [real]
http://osstest.xs.citrite.net/~osstest/testlogs/logs/38249/

Perfect :-)
All tests in this flight passed
baseline version:
 flight   38221

jobs:
 build-amd64                                   pass
 build-armhf                                   pass
 build-i386                                    pass
 build-amd64-pvops                             pass
 build-armhf-pvops                             pass
 build-i386-pvops                              pass
 test-amd64-amd64-amd64-wheezy-netboot-pvgrub  pass
 test-amd64-i386-i386-wheezy-netboot-pvgrub    pass
 test-amd64-i386-amd64-wheezy-netboot-pygrub   pass
 test-amd64-amd64-i386-wheezy-netboot-pygrub   pass



sg-report-flight on osstest.xs.citrite.net
logs: /home/osstest/logs
images: /home/osstest/images

Logs, config files, etc. are available at
http://osstest.xs.citrite.net/~osstest/testlogs/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Push not applicable.



Re: [Xen-devel] [V9 1/3] x86/xsaves: enable xsaves/xrstors/xsavec in xen

2015-11-05 Thread Shuai Ruan

On Thu, Nov 05, 2015 at 02:06:25AM -0700, Jan Beulich wrote:
> >>> On 05.11.15 at 02:34,  wrote:
> > On Wed, Nov 04, 2015 at 10:04:33AM -0700, Jan Beulich wrote:
> >> >>> On 03.11.15 at 07:27,  wrote:
> >> > @@ -158,6 +334,20 @@ void xsave(struct vcpu *v, uint64_t mask)
> >> >  case 4: case 2:
> >> > -asm volatile ( "1: .byte 0x0f,0xae,0x2f\n"
> >> > -   ".section .fixup,\"ax\" \n"
> >> > -   "2: mov %5,%%ecx\n"
> >> > -   "   xor %1,%1   \n"
> >> > -   "   rep stosb   \n"
> >> > -   "   lea %2,%0   \n"
> >> > -   "   mov %3,%1   \n"
> >> > -   "   jmp 1b  \n"
> >> > -   ".previous  \n"
> >> > -   _ASM_EXTABLE(1b, 2b)
> >> > -   : "+&D" (ptr), "+&a" (lmask)
> >> > -   : "m" (*ptr), "g" (lmask), "d" (hmask),
> >> > - "m" (xsave_cntxt_size)
> >> > -   : "ecx" );
> >> > +alternative_input("1: "".byte 0x0f,0xae,0x2f",
> >> > +  ".byte 0x0f,0xc7,0x1f",
> >> > +  X86_FEATURE_XSAVES,
> >> > +  "D" (ptr), "m" (*ptr), "a" (lmask), "d" (hmask));
> >> > +asm volatile (XSTATE_FIXUP);
> >> >  break;
> >> >  }
> >> >  }
> >> > +#undef XSTATE_FIXUP
> >> 
> >> Repeating my comment on v8: "I wonder whether at least for the
> >> restore side alternative asm wouldn't result in better readable code
> >> and at the same time in a smaller patch." Did you at least look into
> >> that option?
> >> 
> > I may have misunderstood your meaning. I addressed the comment by changing
> > the restore side to use alternative_input. Is "alternative_input" not what
> > you want?
> > If it is not what you want, please give me some suggestions on how to
> > address this.
> 
> Oh, I'm sorry, I should have looked more closely. The fact that
> XSTATE_FIXUP survived made me draw wrong conclusions without
> looking more closely. Now the bad news is - you can't split things
> like this, as the compiler doesn't make any guarantees as to
> register values between two asm()-s. The whole construct needs
> to end up as a single asm(), which is why XSTATE_FIXUP is
> unlikely to be of much use here (at least in the context of this
> patch; a separate cleanup patch might eliminate the redundancy).
> 
OK. So alternative_input will not be used here (i.e. go back to the way
xrstor is done in Patch 8)? Or should XSTATE_FIXUP go into alternative_input?
Which one is OK with you?

Thanks
> 
> 


Re: [Xen-devel] [V9 1/3] x86/xsaves: enable xsaves/xrstors/xsavec in xen

2015-11-05 Thread Jan Beulich

>>> On 05.11.15 at 10:57,  wrote:
> OK. So alternative_input will not be used here (i.e. go back to the way
> xrstor is done in Patch 8)? Or should XSTATE_FIXUP go into alternative_input?
> Which one is OK with you?

The latter, if necessary by extending alternative_input() accordingly
(or provide a second, more flexible variant if need be; iirc Linux has
gained a couple of variants over the years).

Jan


Re: [Xen-devel] [V9 2/3] x86/xsaves: enable xsaves/xrstors for hvm guest

2015-11-05 Thread Jan Beulich

>>> On 03.11.15 at 07:27,  wrote:
> @@ -640,6 +640,14 @@ static void vmx_save_msr(struct vcpu *v, struct hvm_msr 
> *ctxt)
>  }
>  
>  vmx_vmcs_exit(v);
> +
> +if ( cpu_has_xsaves )
> +{
> +ctxt->msr[ctxt->count].val = v->arch.hvm_vcpu.msr_xss;
> +if ( ctxt->msr[ctxt->count].val )
> +ctxt->msr[ctxt->count++].index = MSR_IA32_XSS;
> +}
> +
>  }

Stray blank line (not the first time I have to make this comment on
this series).

With it removed,
Reviewed-by: Jan Beulich 



Re: [Xen-devel] [PATCH v1 07/11] xsplice: Implement payload loading

2015-11-05 Thread Jan Beulich

>>> On 04.11.15 at 23:21,  wrote:
>> +int xsplice_perform_rela(struct xsplice_elf *elf,
>> + struct xsplice_elf_sec *base,
>> + struct xsplice_elf_sec *rela)
>> +{
>> +Elf64_Rela *r;
>> +int symndx, i;
> 
> unsigned int
> 
>> +uint64_t val;
>> +uint8_t *dest;
>> +
> 
> Can you double check that rela->sec->sh_entsize is not zero first?

Perhaps not just not zero, but at least a certain minimum? Or even
equaling some sizeof()?

Jan
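
[Editor's illustration of the stricter check suggested above. A minimal
sketch, not the xSplice code: the helper name rela_entsize_ok is
hypothetical, and it uses the Elf64_Shdr and Elf64_Rela types from the
glibc <elf.h> header rather than Xen's elfstructs.h. It requires the
entry size to equal sizeof(Elf64_Rela) instead of merely being non-zero,
and also that the section holds a whole number of entries.]

```c
#include <elf.h>      /* Elf64_Shdr, Elf64_Rela (glibc / Linux header) */
#include <stdbool.h>

/* Hypothetical helper, not part of the posted patch. */
static bool rela_entsize_ok(const Elf64_Shdr *sh)
{
    /* Stricter than "non-zero": each entry must be exactly one Elf64_Rela. */
    if ( sh->sh_entsize != sizeof(Elf64_Rela) )
        return false;

    /* The section must contain a whole number of entries. */
    return (sh->sh_size % sh->sh_entsize) == 0;
}
```

A caller would run this before dividing sh_size by sh_entsize to obtain
the number of relocations, which also avoids a division by zero.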



Re: [Xen-devel] [PATCH v1 05/11] elf: Add relocation types to elfstructs.h

2015-11-05 Thread Jan Beulich

>>> On 03.11.15 at 19:16,  wrote:
> --- a/xen/include/xen/elfstructs.h
> +++ b/xen/include/xen/elfstructs.h
> @@ -348,6 +348,27 @@ typedef struct {
>  #define  ELF64_R_TYPE(info)  ((info) & 0x)
>  #define ELF64_R_INFO(s,t)(((s) << 32) + (u_int32_t)(t))
>  
> +/* x86-64 relocation types */
> +#define R_X86_64_NONE       0   /* No reloc */
> +#define R_X86_64_64         1   /* Direct 64 bit */
> +#define R_X86_64_PC32       2   /* PC relative 32 bit signed */
> +#define R_X86_64_GOT32      3   /* 32 bit GOT entry */
> +#define R_X86_64_PLT32      4   /* 32 bit PLT address */
> +#define R_X86_64_COPY       5   /* Copy symbol at runtime */
> +#define R_X86_64_GLOB_DAT   6   /* Create GOT entry */
> +#define R_X86_64_JUMP_SLOT  7   /* Create PLT entry */
> +#define R_X86_64_RELATIVE   8   /* Adjust by program base */
> +#define R_X86_64_GOTPCREL   9   /* 32 bit signed pc relative
> +                                   offset to GOT */
> +#define R_X86_64_32         10  /* Direct 32 bit zero extended */
> +#define R_X86_64_32S        11  /* Direct 32 bit sign extended */
> +#define R_X86_64_16         12  /* Direct 16 bit zero extended */
> +#define R_X86_64_PC16       13  /* 16 bit sign extended pc relative */
> +#define R_X86_64_8          14  /* Direct 8 bit sign extended */
> +#define R_X86_64_PC8        15  /* 8 bit sign extended pc relative */
> +
> +#define R_X86_64_NUM        16

Since the set isn't complete anyway - any reason not to drop
everything that's of no relevance to xSplice?

Jan



Re: [Xen-devel] Getting the XSAVE size from userspace

2015-11-05 Thread Jan Beulich

>>> On 05.11.15 at 10:52,  wrote:
> I need to get the XSAVE size from userspace. The easiest way seems to be
> to use the XEN_DOMCTL_getvcpuextstate hypercall, but that hypercall is
> not public / there's no xenctrl.h wrapper for it.

Before going into any detail of the rest of your mail - any reason you
can't just consult CPUID output?

Jan



Re: [Xen-devel] Getting the XSAVE size from userspace

2015-11-05 Thread Razvan Cojocaru

On 11/05/2015 12:42 PM, Jan Beulich wrote:
 On 05.11.15 at 10:52,  wrote:
>> I need to get the XSAVE size from userspace. The easiest way seems to be
>> to use the XEN_DOMCTL_getvcpuextstate hypercall, but that hypercall is
>> not public / there's no xenctrl.h wrapper for it.
> 
> Before going into any detail of the rest of your mail - any reason you
> can't just consult CPUID output?

That's because the userspace application doesn't live in dom0, but in a
dedicated privileged domain, and I'm unsure if a CPUID issued there
yields the same results as a CPUID issued in dom0. So I thought the
safest way is to get the information directly from the hypervisor. Is
this assumption incorrect?

Thanks,
Razvan


Re: [Xen-devel] [PATCH v1 01/11] xsplice: Design document (v2).

2015-11-05 Thread Ross Lagerwall


On 11/04/2015 09:10 PM, Konrad Rzeszutek Wilk wrote:
snip

+The payload **MUST** contain enough data to allow us to apply the update
+and also safely reverse it. As such we **MUST** know:
+
+ * The locations in memory to be patched. This can be determined dynamically
+   via symbols or via virtual addresses.
+ * The new code that will be patched in.
+ * Signature to verify the payload.


Argh. We need to move the 'Signature to verify' in the 'v2' section
as I don't think we can get that done in time.


No, not for V1.




+
+This binary format can be constructed using a custom binary format but
+there are severe disadvantages to that:
+
+ * The format might need to be changed and we need a mechanism to accommodate
+   that.
+ * It has to be platform agnostic.
+ * Easily constructed using existing tools.
+
+As such having the payload in an ELF file is the sensible way. We would be
+carrying the various sets of structures (and data) in the ELF sections under
+different names and with definitions. The prefix for the ELF section name
+would always be: *.xsplice* to match up to the names of the structures.
+
+Note that every structure has padding. This is added so that the hypervisor
+can re-use those fields as it sees fit.
+
+Earlier design attempted to ineptly explain the relations of the ELF sections
+to each other without using proper ELF mechanism (sh_info, sh_link, data
+structures using Elf types, etc). This design will explain in detail
+the structures and how they are used together and not dig in the ELF
+format - except mention that the section names should match the
+structure names.
+
+The xSplice payload is a relocatable ELF binary. A typical binary would have:
+
+ * One or more .text sections
+ * Zero or more read-only data sections
+ * Zero or more data sections
+ * Relocations for each of these sections
+
+It may also have some architecture-specific sections. For example:
+
+ * Alternatives instructions
+ * Bug frames
+ * Exception tables
+ * Relocations for each of these sections
+
+The xSplice core code loads the payload as a standard ELF binary, relocates it
+and handles the architecture-specific sections as needed. This process is much
+like what the Linux kernel module loader does. It contains no xSplice-specific
+details and thus will not be discussed further.


What is 'it'? The 'process of what module loader does'?


'It' refers to the process of module loading in the previous sentence.




+
+Importantly, the payload also contains a section with an array of structures
+describing the functions to be patched:
+
+struct xsplice_patch_func {
+unsigned long new_addr;
+unsigned long new_size;
+unsigned long old_addr;
+unsigned long old_size;
+char *name;
+uint8_t pad[64];
+};
+


Uh, so 104 bytes ? Or did you mean to s/64/24/ so the structure is nicely
padded to 64-bytes?

I think that is what you meant.


OK. I'm not too fussed about exact sizes for V1 anyway, it's likely to 
change at some point.


--
Ross Lagerwall
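
[Editor's illustration of the size arithmetic discussed above. A minimal
sketch under the assumption of an LP64 target (8-byte unsigned long and
pointers); the two struct names are hypothetical copies of the posted
layout, differing only in the pad size. The five leading fields take
4 * 8 + 8 = 40 bytes, so pad[64] gives 104 bytes and pad[24] gives 64.]

```c
#include <stdint.h>
#include <stddef.h>

/* Layout as posted: pad[64]. */
struct patch_func_pad64 {
    unsigned long new_addr;
    unsigned long new_size;
    unsigned long old_addr;
    unsigned long old_size;
    char *name;
    uint8_t pad[64];
};

/* Variant suggested in review: pad[24]. */
struct patch_func_pad24 {
    unsigned long new_addr;
    unsigned long new_size;
    unsigned long old_addr;
    unsigned long old_size;
    char *name;
    uint8_t pad[24];
};

/* Only meaningful where long and pointers are 8 bytes (LP64 assumption). */
#define IS_LP64 (sizeof(unsigned long) == 8 && sizeof(void *) == 8)

_Static_assert(!IS_LP64 || sizeof(struct patch_func_pad64) == 104,
               "pad[64] layout is 104 bytes on LP64");
_Static_assert(!IS_LP64 || sizeof(struct patch_func_pad24) == 64,
               "pad[24] layout is 64 bytes on LP64");
```

Compile-time assertions of this kind catch an accidental size change when
fields are later added and the padding is not adjusted to match.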


Re: [Xen-devel] [xen-unstable test] 63540: regressions - FAIL

2015-11-05 Thread Jan Beulich

>>> On 05.11.15 at 04:01,  wrote:
> flight 63540 xen-unstable real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/63540/ 
> 
> Regressions :-(
> 
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  test-amd64-amd64-xl-qemut-winxpsp3  6 xen-boot  fail REGR. vs. 63475

Hmm, did something go wrong during install? The first boot
after install appears to be a kernel booted natively, and then
nothing else.

Jan



Re: [Xen-devel] Getting the XSAVE size from userspace

2015-11-05 Thread Andrew Cooper

On 05/11/15 10:42, Jan Beulich wrote:
 On 05.11.15 at 10:52,  wrote:
>> I need to get the XSAVE size from userspace. The easiest way seems to be
>> to use the XEN_DOMCTL_getvcpuextstate hypercall, but that hypercall is
>> not public / there's no xenctrl.h wrapper for it.
> Before going into any detail of the rest of your mail - any reason you
> can't just consult CPUID output?

It depends on precisely what you want.

CPUID.0xD[0].ecx gives you the maximum xsave area on this processor
CPUID.0xD[0].ebx gives you the current size for the value in xcr0, but
that is not very useful from userspace.

~Andrew
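
[Editor's illustration of reading the two values described above. A
minimal sketch, assuming GCC or Clang on x86 and the <cpuid.h> helper
__get_cpuid_count(); the function name xsave_sizes_from_cpuid is
hypothetical. It reports whatever CPUID returns in the domain where it
runs, so it does not by itself settle whether a guest sees the same
values as dom0.]

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper: __get_cpuid_count() */
#endif

/*
 * Hypothetical helper. Returns 0 on success, -1 if CPUID leaf 0xD is
 * not available (or the target is not x86).
 */
static int xsave_sizes_from_cpuid(uint32_t *cur, uint32_t *max)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* Leaf 0xD, sub-leaf 0: XSAVE area sizes. */
    if ( !__get_cpuid_count(0xD, 0, &eax, &ebx, &ecx, &edx) )
        return -1;

    *cur = ebx;   /* size for the features currently enabled in XCR0 */
    *max = ecx;   /* size for all features the processor supports */
    return 0;
#else
    (void)cur;
    (void)max;
    return -1;
#endif
}
```

For sizing a buffer, the maximum (ECX) is the conservative choice, since
it cannot be smaller than the size for any particular XCR0 value.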


Re: [Xen-devel] Getting the XSAVE size from userspace

2015-11-05 Thread Jan Beulich

>>> On 05.11.15 at 11:49,  wrote:
> On 05/11/15 10:42, Jan Beulich wrote:
> On 05.11.15 at 10:52,  wrote:
>>> I need to get the XSAVE size from userspace. The easiest way seems to be
>>> to use the XEN_DOMCTL_getvcpuextstate hypercall, but that hypercall is
>>> not public / there's no xenctrl.h wrapper for it.
>> Before going into any detail of the rest of your mail - any reason you
>> can't just consult CPUID output?
> 
> It depends on precisely what you want.
> 
> CPUID.0xD[0].ecx gives you the maximum xsave area on this processor
> CPUID.0xD[0].ebx gives you the current size for the value in xcr0, but
> that is not very useful from userspace.

Why would the maximum size not be sufficient for most (all?) user
mode purposes?

Jan



Re: [Xen-devel] Getting the XSAVE size from userspace

2015-11-05 Thread Jan Beulich

>>> On 05.11.15 at 11:47,  wrote:
> On 11/05/2015 12:42 PM, Jan Beulich wrote:
> On 05.11.15 at 10:52,  wrote:
>>> I need to get the XSAVE size from userspace. The easiest way seems to be
>>> to use the XEN_DOMCTL_getvcpuextstate hypercall, but that hypercall is
>>> not public / there's no xenctrl.h wrapper for it.
>> 
>> Before going into any detail of the rest of your mail - any reason you
>> can't just consult CPUID output?
> 
> That's because the userspace application doesn't live in dom0, but in a
> dedicated privileged domain, and I'm unsure if a CPUID issued there
> yields the same results as a CPUID issued in dom0. So I thought the
> safest way is to get the information directly from the hypervisor. Is
> this assumption incorrect?

See my other reply (to Andrew) - as long as there's no problem with
using the maximum possible size, I don't see why you couldn't use
just CPUID.

Jan



Re: [Xen-devel] Getting the XSAVE size from userspace

2015-11-05 Thread Andrew Cooper

On 05/11/15 10:47, Razvan Cojocaru wrote:
> On 11/05/2015 12:42 PM, Jan Beulich wrote:
> On 05.11.15 at 10:52,  wrote:
>>> I need to get the XSAVE size from userspace. The easiest way seems to be
>>> to use the XEN_DOMCTL_getvcpuextstate hypercall, but that hypercall is
>>> not public / there's no xenctrl.h wrapper for it.
>> Before going into any detail of the rest of your mail - any reason you
>> can't just consult CPUID output?
> That's because the userspace application doesn't live in dom0, but in a
> dedicated privileged domain, and I'm unsure if a CPUID issued there
> yields the same results as a CPUID issued in dom0. So I thought the
> safest way is to get the information directly from the hypervisor. Is
> this assumption incorrect?

What purpose are you wanting the information for?

Using cpuid (should) get you the information concerning your domain,
which is liable to be different to what another domain might see.

Currently, the information available through the domain cpuid policy is
inaccurate, and *not* migration safe.  I am working on fixing this as
part 2 of my cpuid levelling fixes.

~Andrew


[Xen-devel] [xen-4.3-testing test] 63569: regressions - FAIL

2015-11-05 Thread osstest service owner

flight 63569 xen-4.3-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/63569/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-migrupgrade 21 guest-migrate/src_host/dst_host fail REGR. vs. 63212

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl                   3 host-install(3)       broken in 63524 pass in 63569
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 13 guest-localmigrate fail pass in 63524

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop            fail like 63212

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64    1 build-check(1)        blocked n/a
 test-amd64-i386-rumpuserxen-i386      1 build-check(1)        blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  9 debian-hvm-install    fail never pass
 build-amd64-rumpuserxen               6 xen-build             fail never pass
 build-i386-rumpuserxen                6 xen-build             fail never pass
 test-amd64-i386-xl-qemuu-ovmf-amd64   9 debian-hvm-install    fail never pass
 test-amd64-i386-migrupgrade          21 guest-migrate/src_host/dst_host fail never pass
 test-armhf-armhf-xl-vhd               6 xen-boot              fail never pass
 test-amd64-i386-libvirt              12 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale           6 xen-boot              fail never pass
 test-armhf-armhf-libvirt-qcow2        6 xen-boot              fail never pass
 test-armhf-armhf-libvirt              6 xen-boot              fail never pass
 test-armhf-armhf-xl-multivcpu         6 xen-boot              fail never pass
 test-armhf-armhf-xl-cubietruck        6 xen-boot              fail never pass
 test-armhf-armhf-xl-credit2           6 xen-boot              fail never pass
 test-armhf-armhf-libvirt-raw          6 xen-boot              fail never pass
 test-amd64-i386-xl-qemut-win7-amd64  17 guest-stop            fail never pass
 test-amd64-amd64-libvirt-vhd         11 migrate-support-check fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop            fail never pass
 test-amd64-amd64-libvirt             12 migrate-support-check fail never pass
 test-armhf-armhf-xl                   6 xen-boot              fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64  17 guest-stop            fail never pass
 test-amd64-i386-xend-qemut-winxpsp3  21 leak-check/check      fail never pass

version targeted for testing:
 xen  e875e0e5fcc5912f71422b53674a97e5c0ae77be
baseline version:
 xen  85ca813ec23c5a60680e4a13777dad530065902b

Last test of basis    63212  2015-10-22 10:03:01 Z   14 days
Failing since         63360  2015-10-29 13:39:04 Z    6 days    5 attempts
Testing same since    63381  2015-10-30 18:44:54 Z    5 days    4 attempts


People who touched revisions under test:
  Andrew Cooper 
  Ian Campbell 
  Ian Jackson 
  Jan Beulich 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-prev pass
 build-i386-prev  pass
 build-amd64-pvops pass
 build-armhf-pvops pass
 build-i386-pvops pass
 build-amd64-rumpuserxen  fail
 build-i386-rumpuserxen   fail
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  fail
 test-amd64-i386-xl   pass
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64pass
 test-amd64-i386-xl-qemut-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 fail
 test-amd64-i386-xl-qemuu-ovmf-amd64  fail
 test-amd64-amd64-rumpuserxen-amd64   blocked
 test-amd64-amd64-xl-qemut-win7-amd64 fail
 test-amd64-i386-xl-qemut-win7-amd64  fail
 test-amd64-amd64-xl

Re: [Xen-devel] [V9 2/3] x86/xsaves: enable xsaves/xrstors for hvm guest

2015-11-05 Thread Shuai Ruan

On Thu, Nov 05, 2015 at 03:28:47AM -0700, Jan Beulich wrote:
> >>> On 03.11.15 at 07:27,  wrote:
> > @@ -640,6 +640,14 @@ static void vmx_save_msr(struct vcpu *v, struct hvm_msr *ctxt)
> >      }
> >  
> >      vmx_vmcs_exit(v);
> > +
> > +    if ( cpu_has_xsaves )
> > +    {
> > +        ctxt->msr[ctxt->count].val = v->arch.hvm_vcpu.msr_xss;
> > +        if ( ctxt->msr[ctxt->count].val )
> > +            ctxt->msr[ctxt->count++].index = MSR_IA32_XSS;
> > +    }
> > +
> >  }
> 
> Stray blank line (not the first time I have to make this comment on
> this series).
Sorry for that.
> 
> With it removed,
> Reviewed-by: Jan Beulich 
> 
Thanks.


Re: [Xen-devel] [PATCH v1 07/11] xsplice: Implement payload loading

2015-11-05 Thread Ross Lagerwall


On 11/04/2015 10:21 PM, Konrad Rzeszutek Wilk wrote:
snip


+
+/*
+ * The following functions prepare an xSplice module to be executed by
+ * allocating space, loading the allocated sections, resolving symbols,
+ * performing relocations, etc.
+ */
+#ifdef CONFIG_X86
+static void *alloc_module(size_t size)


s/module/payload/


My intention was that all the code which implements the "module loader" 
functionality (and is sort of independent from xSplice) uses the term 
"module" whereas the payload implies the loaded module + the other 
xSplice-specific bits. Your thoughts?



+{
+    mfn_t *mfn, *mfn_ptr;
+    size_t pages, i;
+    struct page_info *pg;
+    unsigned long hole_start, hole_end, cur;
+    struct payload *data, *data2;
+
+    ASSERT(size);
+
+    pages = PFN_UP(size);
+    mfn = xmalloc_array(mfn_t, pages);
+    if ( mfn == NULL )
+        return NULL;
+
+    for ( i = 0; i < pages; i++ )
+    {
+        pg = alloc_domheap_page(NULL, 0);
+        if ( pg == NULL )
+            goto error;
+        mfn[i] = _mfn(page_to_mfn(pg));
+    }


This looks like 'vmalloc'. Why not use that?
(That explanation should be part of the commit description probably)


vmalloc allocates pages and then maps them to an arbitrary virtual 
address with PAGE_HYPERVISOR. I needed to use a specific virtual address 
with PAGE_HYPERVISOR_RWX.





+
+    hole_start = (unsigned long)module_virt_start;
+    hole_end = hole_start + pages * PAGE_SIZE;
+    spin_lock(&payload_list_lock);
+    list_for_each_entry ( data, &payload_list, list )
+    {
+        list_for_each_entry ( data2, &payload_list, list )
+        {
+            unsigned long start, end;
+
+            start = (unsigned long)data2->module_address;
+            end = start + data2->module_pages * PAGE_SIZE;
+            if ( hole_end > start && hole_start < end )
+            {
+                hole_start = end;
+                hole_end = hole_start + pages * PAGE_SIZE;
+                break;
+            }
+        }
+        if ( &data2->list == &payload_list )
+            break;
+    }
+    spin_unlock(&payload_list_lock);


This could be made in a nice function. 'find_hole' perhaps?
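
A rough sketch of what such a helper might look like, written against a plain
array rather than payload_list so that it stands alone. This is not code from
the series: struct range, the parameter names and the "return 0 on failure"
convention are all assumptions made for illustration, and locking is left to
the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the address range occupied by one payload. */
struct range {
    unsigned long start;  /* inclusive */
    unsigned long end;    /* exclusive */
};

/*
 * Find the lowest address >= region_start where len bytes fit without
 * overlapping any entry in used[]. Returns 0 if nothing fits below
 * region_end. Address arithmetic is assumed not to overflow.
 */
static unsigned long find_hole(const struct range *used, size_t n,
                               unsigned long region_start,
                               unsigned long region_end,
                               unsigned long len)
{
    unsigned long hole_start = region_start;
    size_t i = 0;

    assert(len > 0 && region_start > 0 && region_start < region_end);

    while ( i < n )
    {
        if ( hole_start + len > used[i].start && hole_start < used[i].end )
        {
            /* Collision: move past this range and rescan from the start. */
            hole_start = used[i].end;
            i = 0;
            continue;
        }
        i++;
    }

    if ( hole_start >= region_end || len > region_end - hole_start )
        return 0;

    return hole_start;
}
```

Like the quoted loop, this restarts the scan after every collision, so it
does not depend on the list being sorted. Unlike the quoted code, it accepts
a hole that ends exactly at region_end.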


+
+    if ( hole_end >= module_virt_end )
+        goto error;
+
+    for ( cur = hole_start, mfn_ptr = mfn; pages--; ++mfn_ptr, cur += PAGE_SIZE )
+    {
+        if ( map_pages_to_xen(cur, mfn_x(*mfn_ptr), 1, PAGE_HYPERVISOR_RWX) )
+        {
+            if ( cur != hole_start )
+                destroy_xen_mappings(hole_start, cur);


I think 'destroy_xen_mappings' is OK handling hole_start == cur.


+            goto error;
+        }
+    }
+    xfree(mfn);
+    return (void *)hole_start;
+
+ error:
+    while ( i-- )
+        free_domheap_page(mfn_to_page(mfn_x(mfn[i])));
+    xfree(mfn);
+    return NULL;
+}
+#else
+static void *alloc_module(size_t size)


s/module/payload/

+{
+    return NULL;
+}
+#endif
+
+static void free_module(struct payload *payload)
+{
+    int i;


unsigned int;


+    struct page_info *pg;
+    PAGE_LIST_HEAD(pg_list);
+    void *va = payload->module_address;
+    unsigned long addr = (unsigned long)va;
+
+    if ( !payload->module_address )
+        return;


How about 'if ( !addr )
return;
?


+
+    payload->module_address = NULL;
+
+    for ( i = 0; i < payload->module_pages; i++ )
+        page_list_add(vmap_to_page(va + i * PAGE_SIZE), &pg_list);
+
+    destroy_xen_mappings(addr, addr + payload->module_pages * PAGE_SIZE);
+
+    while ( (pg = page_list_remove_head(&pg_list)) != NULL )
+        free_domheap_page(pg);
+
+    payload->module_pages = 0;
+}
+
+static void alloc_section(struct xsplice_elf_sec *sec, size_t *core_size)


s/alloc/compute/?


+{
+    size_t align_size = ROUNDUP(*core_size, sec->sec->sh_addralign);
+    sec->sec->sh_entsize = align_size;
+    *core_size = sec->sec->sh_size + align_size;
+}
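
To illustrate the "compute" reading of this helper, here is a small
standalone sketch of the same offset arithmetic. It is not from the patch:
round_up() and place_section() are hypothetical names, and the explicit
handling of an alignment of 0 or 1 is an addition. ELF allows sh_addralign
to be 0 or 1 to mean "no constraint", and a mask-based ROUNDUP (if that is
how the macro is defined) would not cope with 0.

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to a multiple of align; 0 and 1 both mean "no constraint". */
static size_t round_up(size_t x, size_t align)
{
    if ( align <= 1 )
        return x;
    return (x + align - 1) / align * align;
}

/*
 * Reserve room for one section of size bytes in a region that is currently
 * *core_size bytes long. Returns the section's offset and grows *core_size.
 */
static size_t place_section(size_t *core_size, size_t size, size_t align)
{
    size_t offset;

    assert(core_size != NULL);

    offset = round_up(*core_size, align);
    *core_size = offset + size;
    return offset;
}
```

The quoted patch stores the computed offset in sh_entsize; the sketch returns
it instead so that nothing is written back into the section header.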
+
+static int move_module(struct payload *payload, struct xsplice_elf *elf)
+{
+    uint8_t *buf;
+    int i;


unsigned int i;


+    size_t core_size = 0;
+
+    /* Allocate text regions */


s/Allocate/Compute/


+    for ( i = 0; i < elf->hdr->e_shnum; i++ )
+    {
+        if ( (elf->sec[i].sec->sh_flags & (SHF_ALLOC|SHF_EXECINSTR)) ==
+             (SHF_ALLOC|SHF_EXECINSTR) )
+            alloc_section(&elf->sec[i], &core_size);
+    }
+
+    /* Allocate rw data */
+    for ( i = 0; i < elf->hdr->e_shnum; i++ )
+    {
+        if ( (elf->sec[i].sec->sh_flags & SHF_ALLOC) &&
+             !(elf->sec[i].sec->sh_flags & SHF_EXECINSTR) &&
+             (elf->sec[i].sec->sh_flags & SHF_WRITE) )
+            alloc_section(&elf->sec[i], &core_size);
+    }
+
+    /* Allocate ro data */
+    for ( i = 0; i < elf->hdr->e_shnum; i++ )
+    {
+        if ( (elf->sec[i].sec->sh_flags & SHF_ALLOC) &&
+             !(elf->sec[i].sec->sh_flags & SHF_EXECINSTR) &&
+             !(elf->sec[i].sec->sh_flags & SHF_WRITE) )
+            alloc_section(&elf->sec[i], &core_size);
+    }
+
+buf = alloc_module(core_si

Re: [Xen-devel] [RFC PATCH] x86/paravirt: Kill some unused patching functions

2015-11-05 Thread Juergen Gross


On 11/03/2015 10:18 AM, Borislav Petkov wrote:

From: Borislav Petkov 

paravirt_patch_ignore() is completely unused and paravirt_patch_nop()
doesn't do a whole lot. Remove them both.

Signed-off-by: Borislav Petkov 


Reviewed-by: Juergen Gross 


Cc: Andrew Morton 
Cc: Andy Lutomirski 
Cc: Chris Wright 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: Jeremy Fitzhardinge 
Cc: Juergen Gross 
Cc: "Peter Zijlstra (Intel)" 
Cc: Rusty Russell 
Cc: Thomas Gleixner 
Cc: virtualizat...@lists.linux-foundation.org
Cc: xen-de...@lists.xenproject.org
---
  arch/x86/include/asm/paravirt_types.h |  2 --
  arch/x86/kernel/paravirt.c            | 13 +------------
  2 files changed, 1 insertion(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 31247b5bff7c..e1f31dfc3b31 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -402,10 +402,8 @@ extern struct pv_lock_ops pv_lock_ops;
        __visible extern const char start_##ops##_##name[], end_##ops##_##name[];   \
        asm(NATIVE_LABEL("start_", ops, name) code NATIVE_LABEL("end_", ops, name))

-unsigned paravirt_patch_nop(void);
  unsigned paravirt_patch_ident_32(void *insnbuf, unsigned len);
  unsigned paravirt_patch_ident_64(void *insnbuf, unsigned len);
-unsigned paravirt_patch_ignore(unsigned len);
  unsigned paravirt_patch_call(void *insnbuf,
 const void *target, u16 tgt_clobbers,
 unsigned long addr, u16 site_clobbers,
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index c2130aef3f9d..4f32a10979db 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -74,16 +74,6 @@ void __init default_banner(void)
  /* Undefined instruction for dealing with missing ops pointers. */
  static const unsigned char ud2a[] = { 0x0f, 0x0b };

-unsigned paravirt_patch_nop(void)
-{
-   return 0;
-}
-
-unsigned paravirt_patch_ignore(unsigned len)
-{
-   return len;
-}
-
  struct branch {
unsigned char opcode;
u32 delta;
@@ -152,8 +142,7 @@ unsigned paravirt_patch_default(u8 type, u16 clobbers, void *insnbuf,
                /* If there's no function, patch it with a ud2a (BUG) */
                ret = paravirt_patch_insns(insnbuf, len, ud2a, ud2a+sizeof(ud2a));
        else if (opfunc == _paravirt_nop)
-               /* If the operation is a nop, then nop the callsite */
-               ret = paravirt_patch_nop();
+               ret = 0;

        /* identity functions just return their single argument */
        else if (opfunc == _paravirt_ident_32)





[Xen-devel] [libvirt test] 63578: regressions - FAIL

2015-11-05 Thread osstest service owner

flight 63578 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/63578/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf-libvirt   5 libvirt-build fail REGR. vs. 63340

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)        blocked n/a
 test-armhf-armhf-libvirt-raw    1 build-check(1)        blocked n/a
 test-armhf-armhf-libvirt-xsm    1 build-check(1)        blocked n/a
 test-armhf-armhf-libvirt        1 build-check(1)        blocked n/a
 test-amd64-amd64-libvirt       12 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm   12 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-vhd   11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-xsm    12 migrate-support-check fail never pass
 test-amd64-i386-libvirt        12 migrate-support-check fail never pass

version targeted for testing:
 libvirt  ac339206bfe98e78925b183cba058d0e2e7f03e3
baseline version:
 libvirt  3c7590e0a435d833895fc7b5be489e53e223ad95

Last test of basis    63340  2015-10-28 04:19:47 Z    8 days
Failing since         63352  2015-10-29 04:20:29 Z    7 days    6 attempts
Testing same since    63373  2015-10-30 04:21:45 Z    6 days    5 attempts


People who touched revisions under test:
  Laine Stump 
  Luyao Huang 
  Maxim Perevedentsev 
  Michal Privoznik 
  Roman Bogorodskiy 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  fail
 build-i386-libvirt   pass
 build-amd64-pvops pass
 build-armhf-pvops pass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm pass
 test-amd64-amd64-libvirt-xsm pass
 test-armhf-armhf-libvirt-xsm blocked 
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-libvirt pass
 test-armhf-armhf-libvirt blocked 
 test-amd64-i386-libvirt  pass
 test-amd64-amd64-libvirt-pair pass
 test-amd64-i386-libvirt-pair pass
 test-armhf-armhf-libvirt-qcow2   blocked 
 test-armhf-armhf-libvirt-raw blocked 
 test-amd64-amd64-libvirt-vhd pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit ac339206bfe98e78925b183cba058d0e2e7f03e3
Author: Laine Stump 
Date:   Thu Oct 29 14:09:59 2015 -0400

util: set max wait for IPv6 DAD to 20 seconds

This was originally set to 5 seconds, but times of 5.5 to 7 seconds
were experienced. Since it's an arbitrary number intended to prevent
an infinite hang, having it a bit too high won't hurt anything, and 20
seconds looks to be adequate (i.e. I think/hope we don't need to make
it tunable in libvirtd.conf)

commit d41a64a1948c88ccec5b4cff34fd04d3aae7a71e
Author: Luyao Huang 
Date:   Thu Oct 29 17:47:33 2015 +0800

util: set error if DAD is not finished

If DAD not finished in 5 seconds, user will get an
unknown error like this:

 # virsh net-start ipv6

Re: [Xen-devel] Getting the XSAVE size from userspace

2015-11-05 Thread Andrei LUTAS


On 11/5/2015 12:51 PM, Jan Beulich wrote:
> On 05.11.15 at 11:49,  wrote:
>> On 05/11/15 10:42, Jan Beulich wrote:
>>> On 05.11.15 at 10:52,  wrote:
>>>> I need to get the XSAVE size from userspace. The easiest way seems to be
>>>> to use the XEN_DOMCTL_getvcpuextstate hypercall, but that hypercall is
>>>> not public / there's no xenctrl.h wrapper for it.
>>> Before going into any detail of the rest of your mail - any reason you
>>> can't just consult CPUID output?
>> It depends on precisely what you want.
>>
>> CPUID.0xD[0].ecx gives you the maximum xsave area on this processor
>> CPUID.0xD[0].ebx gives you the current size for the value in xcr0, but
>> that is not very useful from userspace.
> Why would the maximum size not be sufficient for most (all?) user
> mode purposes?
>
> Jan


Hello,

The use case is the following: whenever an EPT violation is triggered
inside a monitored VM, the introspection logic needs to know how many
bytes were accessed (read/written). This is done by inspecting the
faulting instruction and directly inferring the size, which is not
straightforward for the XSAVE/XRSTOR family. Using the maximum possible
size is wrong, as at any given moment the OS may or may not want to
XSAVE/XRSTOR the entire state (and assuming that the instruction
accesses more than it actually does may have undesired effects).
Therefore, the size needed for the currently enabled features of the
monitored guest is required instead. Normally, it could be computed by
running CPUID with eax = 0xD and ecx = i, where i >= 2 and XCR0[i] is 1
(XCR0 belongs to the monitored guest), but I am unsure whether using
CPUID this way would be safe/desired: will Xen expose the same CPUID
features, for XSAVE-related functionality, on all VMs? (Using CPUID with
eax = 0xD and ecx = 0 would give us the needed size for the SVA, and as
I said, using the maximum size would not be safe, even if it's the same
across all VMs on a given host.) Also, I'm unsure how this would
interact with migration...


Thanks,
Andrei.
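
As an illustration of the per-component approach described above, a minimal
sketch of the size computation for the standard (non-compacted) XSAVE format.
It is not from the thread: struct xstate_component and xsave_area_size() are
hypothetical names, and the table is assumed to have been filled from
CPUID.0xD[i] (EAX = size, EBX = offset) as seen by the monitored guest. A
real XSAVE also ANDs the mask in EDX:EAX with XCR0, so the caller would pass
that combined mask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define XSAVE_LEGACY_AND_HEADER 576u  /* 512-byte legacy area + 64-byte header */

/* Hypothetical container for CPUID.0xD[i] output: EAX = size, EBX = offset. */
struct xstate_component {
    uint32_t size;
    uint32_t offset;
};

/*
 * Size of the standard-format XSAVE area covering the components selected
 * in mask. comp[i] describes state component i; entries 0 and 1 are ignored
 * because x87 and SSE state live in the legacy area.
 */
static uint32_t xsave_area_size(uint64_t mask,
                                const struct xstate_component *comp,
                                size_t ncomp)
{
    uint32_t size = XSAVE_LEGACY_AND_HEADER;
    size_t i;

    assert(comp != NULL || ncomp == 0);
    assert(ncomp <= 64);

    for ( i = 2; i < ncomp; i++ )
    {
        uint32_t end;

        if ( !(mask & (1ULL << i)) )
            continue;

        /* In the standard format each component sits at a fixed offset. */
        end = comp[i].offset + comp[i].size;
        if ( end > size )
            size = end;
    }

    return size;
}
```

With only x87, SSE and AVX enabled (mask 0x7) and the usual AVX component of
256 bytes at offset 576, this yields 832 bytes.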


[Xen-devel] [PATCH] ocaml/xc: correct shutdown_reason enumeration

2015-11-05 Thread Simon Rowe

As defined by the Xen public header the fifth value of
shutdown_reason is watchdog.

Signed-off-by: Simon Rowe 
---
 tools/ocaml/libs/xc/xenctrl.ml  |2 +-
 tools/ocaml/libs/xc/xenctrl.mli |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
index b7ba8b7..beb95b8 100644
--- a/tools/ocaml/libs/xc/xenctrl.ml
+++ b/tools/ocaml/libs/xc/xenctrl.ml
@@ -89,7 +89,7 @@ type compile_info =
compile_date : string;
 }
 
-type shutdown_reason = Poweroff | Reboot | Suspend | Crash | Halt
+type shutdown_reason = Poweroff | Reboot | Suspend | Crash | Watchdog
 
 type domain_create_flag = CDF_HVM | CDF_HAP
 
diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
index bc4af56..8928a2e 100644
--- a/tools/ocaml/libs/xc/xenctrl.mli
+++ b/tools/ocaml/libs/xc/xenctrl.mli
@@ -61,7 +61,7 @@ type compile_info = {
   compile_domain : string;
   compile_date : string;
 }
-type shutdown_reason = Poweroff | Reboot | Suspend | Crash | Halt
+type shutdown_reason = Poweroff | Reboot | Suspend | Crash | Watchdog
 
 type domain_create_flag = CDF_HVM | CDF_HAP
 
-- 
1.7.10.4



Re: [Xen-devel] Getting the XSAVE size from userspace

2015-11-05 Thread Andrew Cooper

On 05/11/15 11:35, Andrei LUTAS wrote:
> On 11/5/2015 12:51 PM, Jan Beulich wrote:
>> On 05.11.15 at 11:49,  wrote:
>>> On 05/11/15 10:42, Jan Beulich wrote:
>>>> On 05.11.15 at 10:52,  wrote:
>>>>> I need to get the XSAVE size from userspace. The easiest way seems to be
>>>>> to use the XEN_DOMCTL_getvcpuextstate hypercall, but that hypercall is
>>>>> not public / there's no xenctrl.h wrapper for it.
>>>> Before going into any detail of the rest of your mail - any reason you
>>>> can't just consult CPUID output?
>>> It depends on precisely what you want.
>>>
>>> CPUID.0xD[0].ecx gives you the maximum xsave area on this processor
>>> CPUID.0xD[0].ebx gives you the current size for the value in xcr0, but
>>> that is not very useful from userspace.
>> Why would the maximum size not be sufficient for most (all?) user
>> mode purposes?
>>
>> Jan
>>
>>
> Hello,
>
> The use-case is the following: whenever an EPT violation is triggered
> inside a monitored VM, the introspection logic needs to know how many
> bytes were accessed (read/written). This is done by inspecting the
> faulting instruction and directly inferring the size, which is not
> straight-forward for XSAVE/XRSTOR family. Using the maximum possible
> size is wrong, as in any given moment the OS may or may not desire to
> XSAVE/XRSTOR the entire state (and thinking that the instruction tries
> to access more than it actually does may yield undesired effects).
> Therefore, the size needed for the currently enabled features of the
> monitored guest is required instead. Normally, it could be done by
> running CPUID with eax = 0xD and ecx = i, where i >= 2 and XCR0[i] is
> 1 (XCR0 belongs to the monitored guest), but I am unsure if using
> CPUID this way would be safe/desired: will Xen expose the same CPUID
> features, for XSAVE related functionality, on all VMs? (using CPUID
> with eax = 0xD and ecx = 0 would give us the needed size for the SVA,
> and like I said, using the maximum size would not be safe, even if
> it's the same across all VMs on a given host). Also, I'm unsure how
> this would get along with migration...

Hmm yes - there is no way to do this currently.

Xen's CPUID handling for xsave related things is broken in levelling and
migration scenarios, which is why it is *still* disabled by default in
XenServer.

I am working on fixing it, and will take this usecase into account
(although I think I had already included enough for this usecase to work).

At the point of the xsave/xrstor trap, you need to know xcr0 and be
able to perform a cpuid instruction in the context of the target domain, to
make use of 0xD[0].ebx to get the "current size based on xcr0".

~Andrew


Re: [Xen-devel] [PATCH v1 08/11] xsplice: Implement support for applying patches

2015-11-05 Thread Ross Lagerwall


On 11/05/2015 03:17 AM, Konrad Rzeszutek Wilk wrote:
snip

diff --git a/xen/arch/x86/xsplice.c b/xen/arch/x86/xsplice.c
index dbff0d5..31e4124 100644
--- a/xen/arch/x86/xsplice.c
+++ b/xen/arch/x86/xsplice.c
@@ -3,6 +3,25 @@
  #include 
  #include 

+#define PATCH_INSN_SIZE 5
+
+void xsplice_apply_jmp(struct xsplice_patch_func *func)


Don't we want it to be 'int'?


Only if an error is expected.


+{
+    uint32_t val;
+    uint8_t *old_ptr;
+
+    old_ptr = (uint8_t *)func->old_addr;
+    memcpy(func->undo, old_ptr, PATCH_INSN_SIZE);


And perhaps use something which can catch an exception (#GP) so that
this can error out?


Why would this fail?


+    *old_ptr++ = 0xe9; /* Relative jump */
+    val = func->new_addr - func->old_addr - PATCH_INSN_SIZE;
+    memcpy(old_ptr, &val, sizeof val);
+}
+
+void xsplice_revert_jmp(struct xsplice_patch_func *func)
+{
+    memcpy((void *)func->old_addr, func->undo, PATCH_INSN_SIZE);
+}
+
  int xsplice_verify_elf(uint8_t *data, ssize_t len)
  {

diff --git a/xen/common/xsplice.c b/xen/common/xsplice.c
index 5e88c55..4476be5 100644
--- a/xen/common/xsplice.c
+++ b/xen/common/xsplice.c
@@ -11,16 +11,21 @@
  #include 
  #include 
  #include 
+#include 
  #include 
+#include 
  #include 
  #include 
  #include 

  #include 
+#include 

  static DEFINE_SPINLOCK(payload_list_lock);
  static LIST_HEAD(payload_list);

+static LIST_HEAD(applied_list);
+
  static unsigned int payload_cnt;
  static unsigned int payload_version = 1;

@@ -29,15 +34,34 @@ struct payload {
      int32_t rc;                         /* 0 or -EXX. */

      struct list_head   list;            /* Linked to 'payload_list'. */
+    struct list_head   applied_list;    /* Linked to 'applied_list'. */

+    struct xsplice_patch_func *funcs;
+    int nfuncs;


unsigned int;


      void *module_address;
      size_t module_pages;

      char  id[XEN_XSPLICE_NAME_SIZE + 1];    /* Name of it. */
  };

+/* Defines an outstanding patching action. */
+struct xsplice_work
+{
+    atomic_t semaphore;      /* Used for rendezvous */
+    atomic_t irq_semaphore;  /* Used to signal all IRQs disabled */
+    struct payload *data;    /* The payload on which to act */
+    volatile bool_t do_work; /* Signals work to do */
+    volatile bool_t ready;   /* Signals all CPUs synchronized */
+    uint32_t cmd;            /* Action request. XSPLICE_ACTION_* */


Now since you have a pointer to 'data' can't you follow that for the
cmd? Or at least the 'data->state'?


I moved cmd out of the payload and into xsplice_work since cmd is only 
needed when there is work to do.
data->state contains the current state of the payload (i.e. before the 
action has been performed) so it provides no indication of what command 
needs to be performed.




Missing full stops.

+};
+
+static DEFINE_SPINLOCK(xsplice_work_lock);
+/* There can be only one outstanding patching action. */
+static struct xsplice_work xsplice_work;
+
  static int load_module(struct payload *payload, uint8_t *raw, ssize_t len);
  static void free_module(struct payload *payload);
+static int schedule_work(struct payload *data, uint32_t cmd);

  static const char *state2str(int32_t state)
  {
@@ -341,28 +365,22 @@ static int xsplice_action(xen_sysctl_xsplice_action_t *action)
      case XSPLICE_ACTION_REVERT:
          if ( data->state == XSPLICE_STATE_APPLIED )
          {
-            /* No implementation yet. */
-            data->state = XSPLICE_STATE_CHECKED;
-            data->rc = 0;
-            rc = 0;
+            data->rc = -EAGAIN;
+            rc = schedule_work(data, action->cmd);
          }
          break;
      case XSPLICE_ACTION_APPLY:
          if ( (data->state == XSPLICE_STATE_CHECKED) )
          {
-            /* No implementation yet. */
-            data->state = XSPLICE_STATE_APPLIED;
-            data->rc = 0;
-            rc = 0;
+            data->rc = -EAGAIN;
+            rc = schedule_work(data, action->cmd);
          }
          break;
      case XSPLICE_ACTION_REPLACE:
          if ( data->state == XSPLICE_STATE_CHECKED )
          {
-            /* No implementation yet. */
-            data->state = XSPLICE_STATE_CHECKED;
-            data->rc = 0;
-            rc = 0;
+            data->rc = -EAGAIN;
+            rc = schedule_work(data, action->cmd);
          }
          break;
      default:
@@ -637,6 +655,24 @@ static int perform_relocs(struct xsplice_elf *elf)
  return 0;
  }

+static int find_special_sections(struct payload *payload,
+                                 struct xsplice_elf *elf)
+{
+    struct xsplice_elf_sec *sec;
+
+    sec = xsplice_elf_sec_by_name(elf, ".xsplice.funcs");
+    if ( !sec )
+    {
+        printk(XENLOG_ERR ".xsplice.funcs is missing\n");
+        return -1;
+    }
+
+    payload->funcs = (struct xsplice_patch_func *)sec->load_addr;
+    payload->nfuncs = sec->sec->sh_size / (sizeof *payload->funcs);
+
+    return 0;
+}


That looks like it should belong to another patch?


Why? The ar

Re: [Xen-devel] [PATCH v1 07/11] xsplice: Implement payload loading

2015-11-05 Thread Ross Lagerwall


On 11/05/2015 10:35 AM, Jan Beulich wrote:

On 04.11.15 at 23:21,  wrote:

+int xsplice_perform_rela(struct xsplice_elf *elf,
+                         struct xsplice_elf_sec *base,
+                         struct xsplice_elf_sec *rela)
+{
+    Elf64_Rela *r;
+    int symndx, i;


unsigned int


+    uint64_t val;
+    uint8_t *dest;
+


Can you double check that rela->sec->sh_entsize is not zero first?


Perhaps not just not zero, but at least a certain minimum? Or even
equaling some sizeof()?



Well, it only makes sense if rela->sec->sh_entsize == sizeof(Elf64_Rela),
so that is what I shall check for.
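
A possible shape for that check, shown standalone. This is a sketch and not
the code that went into the series: struct rela64 is a local mirror of
Elf64_Rela so the example compiles without ELF headers, rela_entry_count()
is a hypothetical name, and the extra "size is a whole number of entries"
test is an assumption about what else one might want to reject.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of the ELF64 Elf64_Rela record (three 64-bit fields). */
struct rela64 {
    uint64_t r_offset;
    uint64_t r_info;
    int64_t  r_addend;
};

/*
 * Validate a RELA section's header fields before walking it.
 * Returns the number of entries, or -1 if the section looks malformed.
 */
static long rela_entry_count(uint64_t sh_size, uint64_t sh_entsize)
{
    assert(sizeof(struct rela64) == 24);

    /* Also rejects sh_entsize == 0, so the division below is safe. */
    if ( sh_entsize != sizeof(struct rela64) )
        return -1;

    /* A trailing partial entry would be read out of bounds. */
    if ( sh_size % sh_entsize != 0 )
        return -1;

    return (long)(sh_size / sh_entsize);
}
```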


--
Ross Lagerwall


Re: [Xen-devel] [PATCH v1 05/11] elf: Add relocation types to elfstructs.h

2015-11-05 Thread Ross Lagerwall


On 11/05/2015 10:38 AM, Jan Beulich wrote:

On 03.11.15 at 19:16,  wrote:

--- a/xen/include/xen/elfstructs.h
+++ b/xen/include/xen/elfstructs.h
@@ -348,6 +348,27 @@ typedef struct {
  #define   ELF64_R_TYPE(info)  ((info) & 0xFFFFFFFF)
  #define ELF64_R_INFO(s,t) (((s) << 32) + (u_int32_t)(t))

+/* x86-64 relocation types */
+#define R_X86_64_NONE       0   /* No reloc */
+#define R_X86_64_64         1   /* Direct 64 bit  */
+#define R_X86_64_PC32       2   /* PC relative 32 bit signed */
+#define R_X86_64_GOT32      3   /* 32 bit GOT entry */
+#define R_X86_64_PLT32      4   /* 32 bit PLT address */
+#define R_X86_64_COPY       5   /* Copy symbol at runtime */
+#define R_X86_64_GLOB_DAT   6   /* Create GOT entry */
+#define R_X86_64_JUMP_SLOT  7   /* Create PLT entry */
+#define R_X86_64_RELATIVE   8   /* Adjust by program base */
+#define R_X86_64_GOTPCREL   9   /* 32 bit signed pc relative
+                                   offset to GOT */
+#define R_X86_64_32         10  /* Direct 32 bit zero extended */
+#define R_X86_64_32S        11  /* Direct 32 bit sign extended */
+#define R_X86_64_16         12  /* Direct 16 bit zero extended */
+#define R_X86_64_PC16       13  /* 16 bit sign extended pc relative */
+#define R_X86_64_8          14  /* Direct 8 bit sign extended  */
+#define R_X86_64_PC8        15  /* 8 bit sign extended pc relative */
+
+#define R_X86_64_NUM        16


Since the set isn't complete anyway - any reason not to drop
everything that's of no relevance to xSplice?



I copied these definitions from Linux (wrongly) assuming that they were 
complete. I shall remove the unused ones.


--
Ross Lagerwall


Re: [Xen-devel] [PATCH] ocaml/xc: correct shutdown_reason enumeration

2015-11-05 Thread David Scott


> On 5 Nov 2015, at 11:39, Simon Rowe  wrote:
> 
> As defined by the Xen public header the fifth value of
> shutdown_reason is watchdog.

I’ve always been a bit suspicious about having both “Poweroff” and “Halt” 
there. Perhaps there was some confusion between what could be written to 
‘control/shutdown’ in xenstore and legal arguments to `xc_domain_shutdown` and 
`SCHEDOP_shutdown`?

Anyway you’re clearly right, `Watchdog` is the 5th value. So I think this is 
fine.
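
For reference, the numbering in question, as a small standalone sketch. The
SHUTDOWN_* values are the ones listed in Xen's public sched.h; the enum type
name and shutdown_code_name() are hypothetical, added only to show that
index 4 is the watchdog case. The sketch assumes the OCaml constructors are
matched to these codes by position, which is what makes their order matter.

```c
#include <assert.h>
#include <stddef.h>

/* Reason codes as listed in Xen's public sched.h. */
enum shutdown_code {
    SHUTDOWN_poweroff = 0,
    SHUTDOWN_reboot   = 1,
    SHUTDOWN_suspend  = 2,
    SHUTDOWN_crash    = 3,
    SHUTDOWN_watchdog = 4
};

/* Map a reason code to the corresponding OCaml constructor name. */
static const char *shutdown_code_name(unsigned int code)
{
    static const char *const names[] = {
        "Poweroff", "Reboot", "Suspend", "Crash", "Watchdog"
    };

    assert(sizeof(names) / sizeof(names[0]) == SHUTDOWN_watchdog + 1);

    if ( code >= sizeof(names) / sizeof(names[0]) )
        return NULL;

    return names[code];
}
```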

Acked-by: David Scott 

I happened to notice there’s a type with the same name in “xenopsd”[1], so I’ve 
cc:d xen-api@lists as a heads-up.

Thanks,
Dave

[1] 
https://github.com/xapi-project/xenopsd/blob/7818ab896d9969c5f5462a2f0d0ae62703b104b6/xc/domain.ml#L268

> 
> Signed-off-by: Simon Rowe 
> ---
> tools/ocaml/libs/xc/xenctrl.ml  |2 +-
> tools/ocaml/libs/xc/xenctrl.mli |2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
> index b7ba8b7..beb95b8 100644
> --- a/tools/ocaml/libs/xc/xenctrl.ml
> +++ b/tools/ocaml/libs/xc/xenctrl.ml
> @@ -89,7 +89,7 @@ type compile_info =
>   compile_date : string;
> }
> 
> -type shutdown_reason = Poweroff | Reboot | Suspend | Crash | Halt
> +type shutdown_reason = Poweroff | Reboot | Suspend | Crash | Watchdog
> 
> type domain_create_flag = CDF_HVM | CDF_HAP
> 
> diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
> index bc4af56..8928a2e 100644
> --- a/tools/ocaml/libs/xc/xenctrl.mli
> +++ b/tools/ocaml/libs/xc/xenctrl.mli
> @@ -61,7 +61,7 @@ type compile_info = {
>   compile_domain : string;
>   compile_date : string;
> }
> -type shutdown_reason = Poweroff | Reboot | Suspend | Crash | Halt
> +type shutdown_reason = Poweroff | Reboot | Suspend | Crash | Watchdog
> 
> type domain_create_flag = CDF_HVM | CDF_HAP
> 
> -- 
> 1.7.10.4
> 



Re: [Xen-devel] [PATCH v7 27/32] xen/x86: allow HVM guests to use hypercalls to bring up vCPUs

2015-11-05 Thread Roger Pau Monné

On 19/10/15 at 17:48, Jan Beulich wrote:
 On 02.10.15 at 17:48,  wrote:
>> @@ -1176,6 +1177,190 @@ int arch_set_info_guest(
>>  #undef c
>>  }
>>  
>> +/* Called by VCPUOP_initialise for HVM guests. */
>> +int arch_set_info_hvm_guest(struct vcpu *v, vcpu_hvm_context_t *ctx)
> 
> const ... *ctx

Sure.

>> +{
>> +struct cpu_user_regs *uregs = &v->arch.user_regs;
>> +struct segment_register cs, ds, ss, es, tr;
>> +
>> +switch ( ctx->mode )
>> +{
>> +default:
>> +return -EINVAL;
>> +
>> +case VCPU_HVM_MODE_32B:
>> +{
>> +const struct vcpu_hvm_x86_32 *regs = &ctx->cpu_regs.x86_32;
>> +uint32_t limit;
>> +
>> +#define SEG(s, r)   \
>> +(struct segment_register){ .sel = 0, .base = (r)->s ## _base,   \
>> +.limit = (r)->s ## _limit, .attr.bytes = (r)->s ## _ar }
>> +cs = SEG(cs, regs);
>> +ds = SEG(ds, regs);
>> +ss = SEG(ss, regs);
>> +es = SEG(es, regs);
>> +tr = SEG(tr, regs);
>> +#undef SEG
>> +
>> +/* Basic sanity checks. */
>> +if ( cs.attr.fields.pad != 0 || ds.attr.fields.pad != 0 ||
>> + ss.attr.fields.pad != 0 || es.attr.fields.pad != 0 ||
>> + tr.attr.fields.pad != 0 )
>> +{
>> +gprintk(XENLOG_ERR, "Attribute bits 12-15 of the segments are not null\n");
>> +return -EINVAL;
>> +}
>> +
>> +limit = cs.limit * (cs.attr.fields.g ? PAGE_SIZE : 1);
>> +if ( regs->eip > limit )
>> +{
>> +gprintk(XENLOG_ERR, "EIP address is outside of the CS limit\n");
>> +return -EINVAL;
>> +}
>> +
>> +if ( ds.attr.fields.dpl > cs.attr.fields.dpl )
> 
> Checks like this imo need to take into account cases where the effect
> of a null selector loaded into the register is intended (in which case I
> would assume DPL to not matter). Speaking of which - with all these
> DPL checks done, what about non-code segments loaded into CS or
> other illegal things? Question is whether the
> hvm_set_segment_register() calls below could be made take care of
> these instead of having to enumerate everything here.

hvm_set_segment_register is just an inline wrapper around
hvm_funcs.set_segment_register. I could turn that into a proper function
with checks, but that would be a shame: hvm_load_segment_selector already
performs some of these checks, yet it can only be used with a valid GDT
loaded, which we don't have here.

I don't mind adding some more checks to the current ones:

 - Check that all segments that are not null selectors have the
'present' bit set.
 - Check that CS.type matches a code segment.
 - Check that all segments except CS don't have the 'code' type.
 - Don't perform the DPL check if the segment is a null selector.
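
The four checks listed above could look roughly like the sketch below. It is only an illustration under stated assumptions: struct seg_attr is a hypothetical, simplified stand-in for Xen's struct segment_register, the `null` flag assumes the caller already knows which selectors are null, and check_segment() is not the stub from the actual patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for Xen's struct segment_register. */
struct seg_attr {
    uint8_t type;    /* 4-bit descriptor type */
    uint8_t dpl;     /* descriptor privilege level, 0-3 */
    bool s;          /* S bit: set for code/data, clear for system (e.g. TSS) */
    bool present;    /* the 'present' bit */
    bool null;       /* assumption: the caller flags null selectors */
};

enum seg_kind { SEG_KIND_CS, SEG_KIND_OTHER };

/* A descriptor is a code segment when S is set and type bit 3 is set. */
bool seg_is_code(const struct seg_attr *a)
{
    return a->s && (a->type & 0x8) != 0;
}

/* Returns 0 when the segment passes the four checks, -1 otherwise. */
int check_segment(const struct seg_attr *a, enum seg_kind kind, uint8_t cs_dpl)
{
    /* 1. Segments that are not null selectors must be present. */
    if ( !a->null && !a->present )
        return -1;

    /* 2. CS must be a code segment. */
    if ( kind == SEG_KIND_CS )
        return seg_is_code(a) ? 0 : -1;

    /* 3. Every other segment must not be a code segment. */
    if ( seg_is_code(a) )
        return -1;

    /* 4. The DPL comparison is skipped for null selectors. */
    if ( !a->null && a->dpl > cs_dpl )
        return -1;

    return 0;
}
```

Taking the S bit into account matters for TR: a TSS descriptor can have type bit 3 set without being a code segment.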

I'm adding a small inline stub to do these checks.

>> --- a/xen/common/compat/domain.c
>> +++ b/xen/common/compat/domain.c
>> @@ -10,6 +10,9 @@
>>  #include 
>>  #include 
>>  #include 
>> +#ifdef CONFIG_X86
>> +#include 
>> +#endif
> 
> I'd avoid such #if-s in this file, since it's only x86 that uses compat
> code right now.

OK, knowing that the compat code is only used in x86 helps to simplify 
some of this code also.

>> --- a/xen/common/domain.c
>> +++ b/xen/common/domain.c
>> @@ -1207,11 +1207,35 @@ void unmap_vcpu_info(struct vcpu *v)
>>  put_page_and_type(mfn_to_page(mfn));
>>  }
>>  
>> +static int default_initialize_vcpu(struct vcpu *v,
>> +   XEN_GUEST_HANDLE_PARAM(void) arg)
>> +{
>> +struct vcpu_guest_context *ctxt;
>> +struct domain *d = v->domain;
>> +int rc;
>> +
>> +if ( (ctxt = alloc_vcpu_guest_context()) == NULL )
>> +return -ENOMEM;
>> +
>> +if ( copy_from_guest(ctxt, arg, 1) )
>> +{
>> +free_vcpu_guest_context(ctxt);
>> +return -EFAULT;
>> +}
>> +
>> +domain_lock(d);
>> +rc = v->is_initialised ? -EEXIST : arch_set_info_guest(v, ctxt);
>> +domain_unlock(d);
>> +
>> +free_vcpu_guest_context(ctxt);
>> +
>> +return rc;
>> +}
>> +
>>  long do_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) 
>> arg)
>>  {
>>  struct domain *d = current->domain;
>>  struct vcpu *v;
>> -struct vcpu_guest_context *ctxt;
>>  long rc = 0;
>>  
>>  if ( vcpuid >= d->max_vcpus || (v = d->vcpu[vcpuid]) == NULL )
>> @@ -1223,20 +1247,28 @@ long do_vcpu_op(int cmd, unsigned int vcpuid, 
>> XEN_GUEST_HANDLE_PARAM(void) arg)
>>  if ( v->vcpu_info == &dummy_vcpu_info )
>>  return -EINVAL;
>>  
>> -if ( (ctxt = alloc_vcpu_guest_context()) == NULL )
>> -return -ENOMEM;
>> -
>> -if ( copy_from_guest(ctxt, arg, 1) )
>> +#if defined(CONFIG_X86)
> 
> Looks like you went from one extreme to the other: Now there's no
> per-arch function anymore, and hence you need this ugly #ifdef-ery.
> Why don't you add default_initialize_

Re: [Xen-devel] [PATCH v1 07/11] xsplice: Implement payload loading

2015-11-05 Thread Jan Beulich

>>> On 05.11.15 at 12:51,  wrote:
> On 11/05/2015 10:35 AM, Jan Beulich wrote:
> On 04.11.15 at 23:21,  wrote:
 +int xsplice_perform_rela(struct xsplice_elf *elf,
 + struct xsplice_elf_sec *base,
 + struct xsplice_elf_sec *rela)
 +{
 +Elf64_Rela *r;
 +int symndx, i;
>>>
>>> unsigned int
>>>
 +uint64_t val;
 +uint8_t *dest;
 +
>>>
>>> Can you double check that rela->sec->sh_entsize is not zero first?
>>
>> Perhaps not just not zero, but at least a certain minimum? Or even
>> equaling some sizeof()?
>>
> 
> Well it only makes sense if rela->sec->sh_entsize == sizeof(Elf64_Rela) 
> so that is what I shall check for.

The question whether to use == or >= really depends on whether
we expect (theoretical) additions to the structure to be backwards
compatible.
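
The two options can be compared in a small sketch. The Elf64_Rela layout is the standard ELF64 one; the helper rela_entry_count(), its `strict` switch and the extra "size must be a multiple of entsize" check are assumptions made for illustration and are not taken from the xSplice patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard ELF64 relocation entry with addend (24 bytes). */
typedef struct {
    uint64_t r_offset;
    uint64_t r_info;
    int64_t  r_addend;
} Elf64_Rela;

/*
 * Hypothetical helper: validate sh_entsize and compute the number of
 * entries in a SHT_RELA section.
 *   strict != 0: require sh_entsize == sizeof(Elf64_Rela)
 *   strict == 0: accept sh_entsize >= sizeof(Elf64_Rela), i.e. tolerate
 *                backwards compatible additions to the structure
 * Returns 0 and sets *count on success, -1 otherwise.
 */
int rela_entry_count(uint64_t sh_size, uint64_t sh_entsize, int strict,
                     uint64_t *count)
{
    if ( strict ? sh_entsize != sizeof(Elf64_Rela)
                : sh_entsize < sizeof(Elf64_Rela) )
        return -1;

    /* Assumption: a trailing partial entry is treated as an error. */
    if ( sh_size % sh_entsize )
        return -1;

    *count = sh_size / sh_entsize;
    return 0;
}
```

Either variant also rules out a zero sh_entsize, so the later division cannot fault.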

Jan



[Xen-devel] [PATCH] x86/hvm: make sure stdvga cache cannot be re-enabled

2015-11-05 Thread Paul Durrant

As soon as the cache is disabled, it will become out-of-sync with the
VGA device model and since no mechanism exists to acquire current VRAM
state from the device model, re-enabling it leads to stale data
being seen by the guest.

The problem can be seen by deliberately crashing a Windows guest; the
BSOD output is corrupted.

This patch changes the existing 'cache' boolean in hvm_hw_stdvga into a
tri-state enum and only allows the state to move from 'uninitialized' to
'enabled'. Once the cache state becomes 'disabled' it will remain so for
the lifetime of the VM.
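
The intended transitions can be modelled in isolation as follows. This is a standalone sketch with hypothetical names (cache_state, cache_try_enable() and so on), not the code from the patch below; it only demonstrates that 'disabled' is a terminal state.

```c
#include <stdbool.h>

/* Hypothetical standalone model of the tri-state cache flag. */
enum cache_state {
    CACHE_UNINITIALIZED,
    CACHE_ENABLED,
    CACHE_DISABLED
};

/* Only the very first enable request succeeds. */
void cache_try_enable(enum cache_state *s)
{
    if ( *s == CACHE_UNINITIALIZED )
        *s = CACHE_ENABLED;
}

/* Disabling is a no-op unless the cache is currently enabled. */
void cache_disable(enum cache_state *s)
{
    if ( *s == CACHE_ENABLED )
        *s = CACHE_DISABLED;
}

bool cache_is_enabled(const enum cache_state *s)
{
    return *s == CACHE_ENABLED;
}
```

Note that a disable request on a cache that was never enabled leaves it 'uninitialized', so a later first enable is still allowed.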

Signed-off-by: Paul Durrant 
Cc: Keir Fraser 
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
 xen/arch/x86/hvm/save.c  |  2 +-
 xen/arch/x86/hvm/stdvga.c| 50 
 xen/include/asm-x86/hvm/io.h |  8 ++-
 3 files changed, 45 insertions(+), 15 deletions(-)

diff --git a/xen/arch/x86/hvm/save.c b/xen/arch/x86/hvm/save.c
index 4660beb..f7d4999 100644
--- a/xen/arch/x86/hvm/save.c
+++ b/xen/arch/x86/hvm/save.c
@@ -73,7 +73,7 @@ int arch_hvm_load(struct domain *d, struct hvm_save_header 
*hdr)
 d->arch.hvm_domain.sync_tsc = rdtsc();
 
 /* VGA state is not saved/restored, so we nobble the cache. */
-d->arch.hvm_domain.stdvga.cache = 0;
+d->arch.hvm_domain.stdvga.cache = STDVGA_CACHE_DISABLED;
 
 return 0;
 }
diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c
index 02a97f9..246c629 100644
--- a/xen/arch/x86/hvm/stdvga.c
+++ b/xen/arch/x86/hvm/stdvga.c
@@ -101,6 +101,37 @@ static void vram_put(struct hvm_hw_stdvga *s, void *p)
 unmap_domain_page(p);
 }
 
+static void stdvga_try_cache_enable(struct hvm_hw_stdvga *s)
+{
+/*
+ * Caching mode can only be enabled if the cache has
+ * never been used before. As soon as it is disabled, it will
+ * become out-of-sync with the VGA device model and since no
+ * mechanism exists to acquire current VRAM state from the
+ * device model, re-enabling it would lead to stale data being
+ * seen by the guest.
+ */
+if ( s->cache != STDVGA_CACHE_UNINITIALIZED )
+return;
+
+gdprintk(XENLOG_INFO, "entering caching mode\n");
+s->cache = STDVGA_CACHE_ENABLED;
+}
+
+static void stdvga_cache_disable(struct hvm_hw_stdvga *s)
+{
+if ( s->cache != STDVGA_CACHE_ENABLED )
+return;
+
+gdprintk(XENLOG_INFO, "leaving caching mode\n");
+s->cache = STDVGA_CACHE_DISABLED;
+}
+
+static bool_t stdvga_cache_is_enabled(struct hvm_hw_stdvga *s)
+{
+return s->cache == STDVGA_CACHE_ENABLED;
+}
+
 static int stdvga_outb(uint64_t addr, uint8_t val)
 {
 struct hvm_hw_stdvga *s = &current->domain->arch.hvm_domain.stdvga;
@@ -139,12 +170,8 @@ static int stdvga_outb(uint64_t addr, uint8_t val)
 
 if ( !prev_stdvga && s->stdvga )
 {
-/*
- * (Re)start caching of video buffer.
- * XXX TODO: In case of a restart the cache could be unsynced.
- */
-s->cache = 1;
-gdprintk(XENLOG_INFO, "entering stdvga and caching modes\n");
+gdprintk(XENLOG_INFO, "entering stdvga mode\n");
+stdvga_try_cache_enable(s);
 }
 else if ( prev_stdvga && !s->stdvga )
 {
@@ -441,7 +468,7 @@ static int stdvga_mem_write(const struct hvm_io_handler 
*handler,
 };
 struct hvm_ioreq_server *srv;
 
-if ( !s->cache || !s->stdvga )
+if ( !stdvga_cache_is_enabled(s) || !s->stdvga )
 goto done;
 
 /* Intercept mmio write */
@@ -515,15 +542,12 @@ static bool_t stdvga_mem_accept(const struct 
hvm_io_handler *handler,
  * not active since we can assert, when in stdvga mode, that writes
  * to VRAM have no side effect and thus we can try to buffer them.
  */
-if ( s->cache )
-{
-gdprintk(XENLOG_INFO, "leaving caching mode\n");
-s->cache = 0;
-}
+stdvga_cache_disable(s);
 
 goto reject;
 }
-else if ( p->dir == IOREQ_READ && (!s->cache || !s->stdvga) )
+else if ( p->dir == IOREQ_READ &&
+  (!stdvga_cache_is_enabled(s) || !s->stdvga) )
 goto reject;
 
 /* s->lock intentionally held */
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index 8585a1f..ceefa2e 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -128,13 +128,19 @@ void hvm_dpci_eoi(struct domain *d, unsigned int 
guest_irq,
 void msix_write_completion(struct vcpu *);
 void msixtbl_init(struct domain *d);
 
+enum stdvga_cache_state {
+STDVGA_CACHE_UNINITIALIZED,
+STDVGA_CACHE_ENABLED,
+STDVGA_CACHE_DISABLED
+};
+
 struct hvm_hw_stdvga {
 uint8_t sr_index;
 uint8_t sr[8];
 uint8_t gr_index;
 uint8_t gr[9];
 bool_t stdvga;
-bool_t cache;
+enum stdvga_cache_state cache;
 uint32_t latch;
 struct page_info *vram_page[64];  /* shadow of 0xa-0xa */
 spinlock_t lock;
-- 
2.1.4



Re: [Xen-devel] Getting the XSAVE size from userspace

2015-11-05 Thread Razvan Cojocaru

On 11/05/2015 01:44 PM, Andrew Cooper wrote:
> On 05/11/15 11:35, Andrei LUTAS wrote:
>> The use-case is the following: whenever an EPT violation is triggered
>> inside a monitored VM, the introspection logic needs to know how many
>> bytes were accessed (read/written). This is done by inspecting the
>> faulting instruction and directly inferring the size, which is not
>> straight-forward for XSAVE/XRSTOR family. Using the maximum possible
>> size is wrong, as in any given moment the OS may or may not desire to
>> XSAVE/XRSTOR the entire state (and thinking that the instruction tries
>> to access more than it actually does may yield undesired effects).
>> Therefore, the size needed for the currently enabled features of the
>> monitored guest is required instead. Normally, it could be done by
>> running CPUID with eax = 0xD and ecx = i, where i >= 2 and XCR0[i] is
>> 1 (XCR0 belongs to the monitored guest), but I am unsure if using
>> CPUID this way would be safe/desired: will Xen expose the same CPUID
>> features, for XSAVE related functionality, on all VMs? (using XCPUID
>> with eax = 0xD and ecx = 0 would give us the needed size for the SVA,
>> and like I said, using the maximum size would not be safe, even if
>> it's the same across all VMs on a given host). Also, I'm unsure how
>> this would get along with migration...
> 
> Hmm yes - there is no way to do this currently.
> 
> Xen's CPUID handling for xsave related things is broken in levelling and
> migration scenarios, which is why it is *still* disabled by default in
> XenServer.
> 
> I am working on fixing it, and will take this usecase into account
> (although I think I had already included enough for this usecase to work).
> 
> At the point of the xsave/xrstor trap, you need to know xcr0 and be
> able to perform a cpuid instruction in the context of a target domain, to
> make use of 0xD[0].ebx to get the "current size based on xcr0".
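
For illustration, the per-component calculation described earlier in the thread can be sketched as below. It assumes the standard (non-compacted) XSAVE format, where CPUID leaf 0xD sub-leaf i reports the size (EAX) and offset (EBX) of state component i. The table type, the helper name and the way the table gets filled are hypothetical; nothing here is an existing Xen interface.

```c
#include <stdint.h>

/* 512-byte legacy FXSAVE region plus the 64-byte XSAVE header. */
#define XSAVE_LEGACY_AND_HEADER 576u

/* Hypothetical table entry: values as reported by CPUID.(EAX=0xD,ECX=i). */
struct xstate_component {
    uint32_t offset;   /* EBX: offset from the start of the XSAVE area */
    uint32_t size;     /* EAX: size of the component in bytes */
};

/*
 * Size of the standard-format XSAVE area needed for the features enabled
 * in xcr0: the end of the highest enabled component, and never less than
 * the legacy region plus header.
 */
uint32_t xsave_size_for_xcr0(uint64_t xcr0,
                             const struct xstate_component comp[64])
{
    uint32_t size = XSAVE_LEGACY_AND_HEADER;
    unsigned int i;

    for ( i = 2; i < 64; i++ )
    {
        uint32_t end;

        if ( !(xcr0 & (1ULL << i)) )
            continue;

        end = comp[i].offset + comp[i].size;
        if ( end > size )
            size = end;
    }

    return size;
}
```

The open question from the thread remains where the table comes from: it has to reflect what the monitored guest sees, not what the host CPU reports to the introspection tool.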

So then the closest thing to what we need would be to add a size field
to struct hvm_hw_cpu_xsave, and just assign the size variable to it in
hvm_save_cpu_xsave_states (migration aside)?

static int hvm_save_cpu_xsave_states(struct domain *d,
                                     hvm_domain_context_t *h)
{
    struct vcpu *v;
    struct hvm_hw_cpu_xsave *ctxt;

    if ( !cpu_has_xsave )
        return 0;   /* do nothing */

    for_each_vcpu ( d, v )
    {
        unsigned int size = HVM_CPU_XSAVE_SIZE(v->arch.xcr0_accum);

        if ( !xsave_enabled(v) )
            continue;
        if ( _hvm_init_entry(h, CPU_XSAVE_CODE, v->vcpu_id, size) )
            return 1;
        ctxt = (struct hvm_hw_cpu_xsave *)&h->data[h->cur];
        h->cur += size;

        ctxt->xfeature_mask = xfeature_mask;
        ctxt->xcr0 = v->arch.xcr0;
        ctxt->xcr0_accum = v->arch.xcr0_accum;
        memcpy(&ctxt->save_area, v->arch.xsave_area,
               size - offsetof(struct hvm_hw_cpu_xsave, save_area));
    }

    return 0;
}


Thanks,
Razvan


Re: [Xen-devel] [PATCH] x86/hvm: make sure stdvga cache cannot be re-enabled

2015-11-05 Thread Andrew Cooper

On 05/11/15 12:17, Paul Durrant wrote:
> As soon as the cache is disabled, it will become out-of-sync with the
> VGA device model and since no mechanism exists to acquire current VRAM
> state from the device model, re-enabling it leads to stale data
> being seen by the guest.
>
> The problem can be seen by deliberately crashing a Windows guest; the
> BSOD output is corrupted.
>
> This patch changes the existing 'cache' boolean in hvm_hw_stdvga into a
> tri-state enum and only allows the state to move from 'uninitialized' to
> 'enabled'. Once the cache state becomes 'disabled' it will remain so for
> the lifetime of the VM.

Should identify that this is a regression introduced by c/s
3bbaaec09b1b942f5624dee176da6e416d31f982

>
> Signed-off-by: Paul Durrant 
> Cc: Keir Fraser 
> Cc: Jan Beulich 
> Cc: Andrew Cooper 

Reviewed-by: Andrew Cooper , with one small
issue which could be fixed on commit...

> ---
>  xen/arch/x86/hvm/save.c  |  2 +-
>  xen/arch/x86/hvm/stdvga.c| 50 
> 
>  xen/include/asm-x86/hvm/io.h |  8 ++-
>  3 files changed, 45 insertions(+), 15 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/save.c b/xen/arch/x86/hvm/save.c
> index 4660beb..f7d4999 100644
> --- a/xen/arch/x86/hvm/save.c
> +++ b/xen/arch/x86/hvm/save.c
> @@ -73,7 +73,7 @@ int arch_hvm_load(struct domain *d, struct hvm_save_header 
> *hdr)
>  d->arch.hvm_domain.sync_tsc = rdtsc();
>  
>  /* VGA state is not saved/restored, so we nobble the cache. */
> -d->arch.hvm_domain.stdvga.cache = 0;
> +d->arch.hvm_domain.stdvga.cache = STDVGA_CACHE_DISABLED;
>  
>  return 0;
>  }
> diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c
> index 02a97f9..246c629 100644
> --- a/xen/arch/x86/hvm/stdvga.c
> +++ b/xen/arch/x86/hvm/stdvga.c
> @@ -101,6 +101,37 @@ static void vram_put(struct hvm_hw_stdvga *s, void *p)
>  unmap_domain_page(p);
>  }
>  
> +static void stdvga_try_cache_enable(struct hvm_hw_stdvga *s)
> +{
> +/*
> + * Caching mode can only be enabled if the cache has
> + * never been used before. As soon as it is disabled, it will
> + * become out-of-sync with the VGA device model and since no
> + * mechanism exists to acquire current VRAM state from the
> + * device model, re-enabling it would lead to stale data being
> + * seen by the guest.
> + */
> +if ( s->cache != STDVGA_CACHE_UNINITIALIZED )
> +return;
> +
> +gdprintk(XENLOG_INFO, "entering caching mode\n");
> +s->cache = STDVGA_CACHE_ENABLED;
> +}
> +
> +static void stdvga_cache_disable(struct hvm_hw_stdvga *s)
> +{
> +if ( s->cache != STDVGA_CACHE_ENABLED )
> +return;
> +
> +gdprintk(XENLOG_INFO, "leaving caching mode\n");
> +s->cache = STDVGA_CACHE_DISABLED;
> +}
> +
> +static bool_t stdvga_cache_is_enabled(struct hvm_hw_stdvga *s)

const struct hvm_hw_stdvga *s

~Andrew


Re: [Xen-devel] [PATCH] x86/hvm: make sure stdvga cache cannot be re-enabled

2015-11-05 Thread Paul Durrant

> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
> Sent: 05 November 2015 12:32
> To: Paul Durrant; xen-de...@lists.xenproject.org
> Cc: Keir (Xen.org); Jan Beulich
> Subject: Re: [PATCH] x86/hvm: make sure stdvga cache cannot be re-
> enabled
> 
> On 05/11/15 12:17, Paul Durrant wrote:
> > As soon as the cache is disabled, it will become out-of-sync with the
> > VGA device model and since no mechanism exists to acquire current VRAM
> > state from the device model, re-enabling it leads to stale data
> > being seen by the guest.
> >
> > The problem can be seen by deliberately crashing a Windows guest; the
> > BSOD output is corrupted.
> >
> > This patch changes the existing 'cache' boolean in hvm_hw_stdvga into a
> > tri-state enum and only allows the state to move from 'uninitialized' to
> > 'enabled'. Once the cache state becomes 'disabled' it will remain so for
> > the lifetime of the VM.
> 
> Should identify that this is a regression introduced by c/s
> 3bbaaec09b1b942f5624dee176da6e416d31f982
> 
> >
> > Signed-off-by: Paul Durrant 
> > Cc: Keir Fraser 
> > Cc: Jan Beulich 
> > Cc: Andrew Cooper 
> 
> Reviewed-by: Andrew Cooper , with one
> small
> issue which could be fixed on commit...
> 
> > ---
> >  xen/arch/x86/hvm/save.c  |  2 +-
> >  xen/arch/x86/hvm/stdvga.c| 50
> 
> >  xen/include/asm-x86/hvm/io.h |  8 ++-
> >  3 files changed, 45 insertions(+), 15 deletions(-)
> >
> > diff --git a/xen/arch/x86/hvm/save.c b/xen/arch/x86/hvm/save.c
> > index 4660beb..f7d4999 100644
> > --- a/xen/arch/x86/hvm/save.c
> > +++ b/xen/arch/x86/hvm/save.c
> > @@ -73,7 +73,7 @@ int arch_hvm_load(struct domain *d, struct
> hvm_save_header *hdr)
> >  d->arch.hvm_domain.sync_tsc = rdtsc();
> >
> >  /* VGA state is not saved/restored, so we nobble the cache. */
> > -d->arch.hvm_domain.stdvga.cache = 0;
> > +d->arch.hvm_domain.stdvga.cache = STDVGA_CACHE_DISABLED;
> >
> >  return 0;
> >  }
> > diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c
> > index 02a97f9..246c629 100644
> > --- a/xen/arch/x86/hvm/stdvga.c
> > +++ b/xen/arch/x86/hvm/stdvga.c
> > @@ -101,6 +101,37 @@ static void vram_put(struct hvm_hw_stdvga *s,
> void *p)
> >  unmap_domain_page(p);
> >  }
> >
> > +static void stdvga_try_cache_enable(struct hvm_hw_stdvga *s)
> > +{
> > +/*
> > + * Caching mode can only be enabled if the cache has
> > + * never been used before. As soon as it is disabled, it will
> > + * become out-of-sync with the VGA device model and since no
> > + * mechanism exists to acquire current VRAM state from the
> > + * device model, re-enabling it would lead to stale data being
> > + * seen by the guest.
> > + */
> > +if ( s->cache != STDVGA_CACHE_UNINITIALIZED )
> > +return;
> > +
> > +gdprintk(XENLOG_INFO, "entering caching mode\n");
> > +s->cache = STDVGA_CACHE_ENABLED;
> > +}
> > +
> > +static void stdvga_cache_disable(struct hvm_hw_stdvga *s)
> > +{
> > +if ( s->cache != STDVGA_CACHE_ENABLED )
> > +return;
> > +
> > +gdprintk(XENLOG_INFO, "leaving caching mode\n");
> > +s->cache = STDVGA_CACHE_DISABLED;
> > +}
> > +
> > +static bool_t stdvga_cache_is_enabled(struct hvm_hw_stdvga *s)
> 
> const struct hvm_hw_stdvga *s
> 

I'll re-spin with this fixed and the regression-introducing commit mentioned
in the commit message.

  Paul

> ~Andrew


[Xen-devel] [PATCH v2] x86/hvm: make sure stdvga cache cannot be re-enabled

2015-11-05 Thread Paul Durrant

As soon as the cache is disabled, it will become out-of-sync with the
VGA device model and since no mechanism exists to acquire current VRAM
state from the device model, re-enabling it leads to stale data
being seen by the guest.

The problem was introduced by commit 3bbaaec0 ("x86/hvm: unify stdvga
mmio intercept with standard mmio intercept") and can be seen by
deliberately crashing a Windows guest; the BSOD output is corrupted.

This patch changes the existing 'cache' boolean in hvm_hw_stdvga into a
tri-state enum and only allows the state to move from 'uninitialized' to
'enabled'. Once the cache state becomes 'disabled' it will remain so for
the lifetime of the VM.

Signed-off-by: Paul Durrant 
Cc: Keir Fraser 
Cc: Jan Beulich 
Reviewed-by: Andrew Cooper 
---
 xen/arch/x86/hvm/save.c  |  2 +-
 xen/arch/x86/hvm/stdvga.c| 50 
 xen/include/asm-x86/hvm/io.h |  8 ++-
 3 files changed, 45 insertions(+), 15 deletions(-)

diff --git a/xen/arch/x86/hvm/save.c b/xen/arch/x86/hvm/save.c
index 4660beb..f7d4999 100644
--- a/xen/arch/x86/hvm/save.c
+++ b/xen/arch/x86/hvm/save.c
@@ -73,7 +73,7 @@ int arch_hvm_load(struct domain *d, struct hvm_save_header 
*hdr)
 d->arch.hvm_domain.sync_tsc = rdtsc();
 
 /* VGA state is not saved/restored, so we nobble the cache. */
-d->arch.hvm_domain.stdvga.cache = 0;
+d->arch.hvm_domain.stdvga.cache = STDVGA_CACHE_DISABLED;
 
 return 0;
 }
diff --git a/xen/arch/x86/hvm/stdvga.c b/xen/arch/x86/hvm/stdvga.c
index 02a97f9..86c94d2 100644
--- a/xen/arch/x86/hvm/stdvga.c
+++ b/xen/arch/x86/hvm/stdvga.c
@@ -101,6 +101,37 @@ static void vram_put(struct hvm_hw_stdvga *s, void *p)
 unmap_domain_page(p);
 }
 
+static void stdvga_try_cache_enable(struct hvm_hw_stdvga *s)
+{
+/*
+ * Caching mode can only be enabled if the cache has
+ * never been used before. As soon as it is disabled, it will
+ * become out-of-sync with the VGA device model and since no
+ * mechanism exists to acquire current VRAM state from the
+ * device model, re-enabling it would lead to stale data being
+ * seen by the guest.
+ */
+if ( s->cache != STDVGA_CACHE_UNINITIALIZED )
+return;
+
+gdprintk(XENLOG_INFO, "entering caching mode\n");
+s->cache = STDVGA_CACHE_ENABLED;
+}
+
+static void stdvga_cache_disable(struct hvm_hw_stdvga *s)
+{
+if ( s->cache != STDVGA_CACHE_ENABLED )
+return;
+
+gdprintk(XENLOG_INFO, "leaving caching mode\n");
+s->cache = STDVGA_CACHE_DISABLED;
+}
+
+static bool_t stdvga_cache_is_enabled(const struct hvm_hw_stdvga *s)
+{
+return s->cache == STDVGA_CACHE_ENABLED;
+}
+
 static int stdvga_outb(uint64_t addr, uint8_t val)
 {
 struct hvm_hw_stdvga *s = &current->domain->arch.hvm_domain.stdvga;
@@ -139,12 +170,8 @@ static int stdvga_outb(uint64_t addr, uint8_t val)
 
 if ( !prev_stdvga && s->stdvga )
 {
-/*
- * (Re)start caching of video buffer.
- * XXX TODO: In case of a restart the cache could be unsynced.
- */
-s->cache = 1;
-gdprintk(XENLOG_INFO, "entering stdvga and caching modes\n");
+gdprintk(XENLOG_INFO, "entering stdvga mode\n");
+stdvga_try_cache_enable(s);
 }
 else if ( prev_stdvga && !s->stdvga )
 {
@@ -441,7 +468,7 @@ static int stdvga_mem_write(const struct hvm_io_handler 
*handler,
 };
 struct hvm_ioreq_server *srv;
 
-if ( !s->cache || !s->stdvga )
+if ( !stdvga_cache_is_enabled(s) || !s->stdvga )
 goto done;
 
 /* Intercept mmio write */
@@ -515,15 +542,12 @@ static bool_t stdvga_mem_accept(const struct 
hvm_io_handler *handler,
  * not active since we can assert, when in stdvga mode, that writes
  * to VRAM have no side effect and thus we can try to buffer them.
  */
-if ( s->cache )
-{
-gdprintk(XENLOG_INFO, "leaving caching mode\n");
-s->cache = 0;
-}
+stdvga_cache_disable(s);
 
 goto reject;
 }
-else if ( p->dir == IOREQ_READ && (!s->cache || !s->stdvga) )
+else if ( p->dir == IOREQ_READ &&
+  (!stdvga_cache_is_enabled(s) || !s->stdvga) )
 goto reject;
 
 /* s->lock intentionally held */
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index 8585a1f..ceefa2e 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -128,13 +128,19 @@ void hvm_dpci_eoi(struct domain *d, unsigned int 
guest_irq,
 void msix_write_completion(struct vcpu *);
 void msixtbl_init(struct domain *d);
 
+enum stdvga_cache_state {
+STDVGA_CACHE_UNINITIALIZED,
+STDVGA_CACHE_ENABLED,
+STDVGA_CACHE_DISABLED
+};
+
 struct hvm_hw_stdvga {
 uint8_t sr_index;
 uint8_t sr[8];
 uint8_t gr_index;
 uint8_t gr[9];
 bool_t stdvga;
-bool_t cache;
+enum stdvga_cache_state cache;
 uint32_t latch;
 struct page_info *vram_page[64];  /* shadow of 0

Re: [Xen-devel] [PATCH v9] run QEMU as non-root

2015-11-05 Thread Stefano Stabellini

On Tue, 3 Nov 2015, Ian Campbell wrote:
> On Tue, 2015-11-03 at 16:49 +, Ian Campbell wrote:
> > On Mon, 2015-11-02 at 12:30 +, Stefano Stabellini wrote:
> > > Try to use "xen-qemudepriv-domid$domid" first, then
> > > "xen-qemudepriv-shared" and root if everything else fails.
> > >
> > > The uids need to be manually created by the user or, more likely, by
> > > the
> > > xen package maintainer.
> > >
> > > Expose a device_model_user setting in libxl_domain_build_info, so that
> > > opinionated callers, such as libvirt, can set any user they like. Do
> > > not
> > > fall back to root if device_model_user is set. Users can also set
> > > device_model_user by hand in the xl domain config file.
> > >
> > > QEMU is going to setuid and setgid to the user ID and the group ID of
> > > the specified user, soon after initialization, before starting to deal
> > > with any guest IO.
> > >
> > > To actually secure QEMU when running in Dom0, we need at least to
> > > deprivilege the privcmd and xenstore interfaces, this is just the first
> > > step in that direction.
> > >
> > > Signed-off-by: Stefano Stabellini 
> >
> > Acked-by: Ian Campbell 
>
> There were some minor conflicts against some patches committed at the start
> of October. I had fixed them up (I think) but then I noticed
> that docs/misc/qemu-deprivilege.txt in my working tree wasn't actually
> committed.
>
> Since this patch refers to it, but didn't include it I checked before
> acking that it was already in tree some how, but didn't realise it wasn't
> actually committed (somehow, not sure how). Was it supposed to be in this
> patch or was it supposed to be in some earlier patch?
>
> In any case given something odd is clearly going on I don't want to just
> commit some random version of that doc which I just found in my working
> directory along with this patch. Please can you resubmit with that file
> included (or in a precursor patch).

Done, see v10


> Also please check the coding style of the comment in libxl.h, the "/*"
> should be by itself.

Sorry I forgot this change! Feel free to fix it as you commit if that's
OK for you.


> Thanks,
> Ian.
>
> >
> > (based on previous plus eyeballing only the changes from:
> > >  
> > > Changes in v9:
> > > - add a device_model_user option to the xl domain config file
> >
> > Ian.
> >

[Xen-devel] [PATCH v10] run QEMU as non-root

2015-11-05 Thread Stefano Stabellini

Try to use "xen-qemuuser-domid$domid" first, then
"xen-qemuuser-shared" and root if everything else fails.

The uids need to be manually created by the user or, more likely, by the
xen package maintainer.

Expose a device_model_user setting in libxl_domain_build_info, so that
opinionated callers, such as libvirt, can set any user they like. Do not
fall back to root if device_model_user is set. Users can also set
device_model_user by hand in the xl domain config file.

QEMU is going to setuid and setgid to the user ID and the group ID of
the specified user, soon after initialization, before starting to deal
with any guest IO.

To actually secure QEMU when running in Dom0, we need at least to
deprivilege the privcmd and xenstore interfaces, this is just the first
step in that direction.

Signed-off-by: Stefano Stabellini 

---

Changes in v10:
- rebase
- git add docs/misc/qemu-deprivilege.txt
- fix commit message to reflect the names chosen (xen-qemudepriv ->
  xen-qemuuser)

Changes in v9:
- add a device_model_user option to the xl domain config file

Changes in v8:
- no need to pass the -runas option if the user requested for root
- return ERROR_FAIL from libxl__dm_runas_helper in case of errors
- return NULL from libxl__build_device_model_args_new if libxl__dm_runas_helper 
failed
- fix line too long
- remove setting errno
- replace retry goto loop, with a while loop
- const char * as argument to libxl__dm_runas_helper
- fix comment

Changes in v7:
- do not fall back to root if the user explicitly set
b_info->device_model_user.

Changes in v6:
- add device_model_user to libxl_domain_build_info
- improve doc
- improve wording in commit message

Changes in v5:
- improve wording in doc
- fix wording in warning message
- fix example in doc
- drop xen-qemudepriv-$domname

Changes in v4:
- rename qemu-deprivilege to qemu-deprivilege.txt
- add a note about qemu-deprivilege.txt to INSTALL
- instead of xen-qemudepriv-base + $domid, try xen-qemudepriv-domid$domid
- introduce libxl__dm_runas_helper to make the code nicer

Changes in v3:
- clarify doc
- handle errno == ERANGE
---
 INSTALL|7 +
 docs/man/xl.cfg.pod.5  |5 +++
 docs/misc/qemu-deprivilege.txt |   31 +++
 tools/libxl/libxl.h|5 +++
 tools/libxl/libxl_dm.c |   67 +++-
 tools/libxl/libxl_internal.h   |5 +++
 tools/libxl/libxl_types.idl|1 +
 tools/libxl/xl_cmdimpl.c   |3 ++
 8 files changed, 123 insertions(+), 1 deletion(-)
 create mode 100644 docs/misc/qemu-deprivilege.txt

diff --git a/INSTALL b/INSTALL
index 56e2950..b7e426c 100644
--- a/INSTALL
+++ b/INSTALL
@@ -304,6 +304,13 @@ systemctl enable xendomains.service
 systemctl enable xen-watchdog.service
 
 
+QEMU Deprivilege
+
+It is recommended to run QEMU as non-root.
+See docs/misc/qemu-deprivilege.txt for an explanation on what you need
+to do at installation time to run QEMU as a dedicated user.
+
+
 History of options
 ==
 
diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index b63846a..2aca8dd 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -1825,6 +1825,11 @@ Pass additional arbitrary options on the device-model 
command line for
 an HVM device model only. Each element in the list is passed as an
 option to the device-model.
 
+=item B<device_model_user="username">
+
+Run the device model as user "username", instead of
+xen-qemudepriv-domid$domid or xen-qemudepriv-shared or root.
+
 =back
 
 =head2 Keymaps
diff --git a/docs/misc/qemu-deprivilege.txt b/docs/misc/qemu-deprivilege.txt
new file mode 100644
index 0000000..dde74ab
--- /dev/null
+++ b/docs/misc/qemu-deprivilege.txt
@@ -0,0 +1,31 @@
+For security reasons, libxl tries to pass a non-root username to QEMU as
+argument. During initialization QEMU calls setuid and setgid with the
+user ID and the group ID of the user passed as argument.
+Libxl looks for the following users in this order:
+
+1) a user named "xen-qemuuser-domid$domid",
+where $domid is the domid of the domain being created.
+This requires the reservation of 65535 uids from xen-qemuuser-domid1
+to xen-qemuuser-domid65535. To use this mechanism, you might want to
+create a large number of users at installation time. For example:
+
+for ((i=1; i<65536; i++))
+do
+adduser --no-create-home --system xen-qemuuser-domid$i
+done
+
+You might want to consider passing --group to adduser to create a new
+group for each new user.
+
+
+2) a user named "xen-qemuuser-shared"
+As a fallback if 1) fails, libxl will use a single user for
+all QEMU instances. The user is named xen-qemuuser-shared. This is
+less secure but still better than running QEMU as root. Using this is as
+simple as creating just one more user on your host:
+
+adduser --no-create-home --system xen-qemuuser-shared
+
+
+3) root
+As a last resort, libxl will start QEMU as root.
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 168fedd..5edeb30
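
The lookup order described in docs/misc/qemu-deprivilege.txt above can be
sketched roughly as follows. This is only an illustration and not the libxl
code: user_exists() and pick_dm_user() are hypothetical helper names, the user
names are the ones used in the text file in this patch, and the ERANGE retry
mirrors the "handle errno == ERANGE" note in the v3 changelog. Lookup errors
other than ERANGE are simply treated as "user not found" here.

```c
#include <errno.h>
#include <pwd.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the named user exists, 0 otherwise. */
static int user_exists(const char *name)
{
    struct passwd pw, *res = NULL;
    long hint = sysconf(_SC_GETPW_R_SIZE_MAX);
    size_t len = hint > 0 ? (size_t)hint : 1024;
    char *buf = NULL;
    int rc;

    for (;;) {
        char *tmp = realloc(buf, len);
        if (!tmp) {
            free(buf);
            return 0;
        }
        buf = tmp;
        rc = getpwnam_r(name, &pw, buf, len, &res);
        if (rc != ERANGE)
            break;
        len *= 2;   /* buffer too small: grow it and retry */
    }
    free(buf);
    /* Sketch only: any other error is treated like "no such user". */
    return rc == 0 && res != NULL;
}

/* Hypothetical helper: choose the user to pass to QEMU for this domid. */
static const char *pick_dm_user(unsigned int domid, char *buf, size_t buflen)
{
    /* 1) per-domain user */
    snprintf(buf, buflen, "xen-qemuuser-domid%u", domid);
    if (user_exists(buf))
        return buf;
    /* 2) single shared user for all QEMU instances */
    if (user_exists("xen-qemuuser-shared"))
        return "xen-qemuuser-shared";
    /* 3) last resort */
    return "root";
}
```

On a host where neither kind of user has been created, pick_dm_user() falls
through to "root", which matches step 3) of the text file.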

Re: [Xen-devel] [VOTE] Release cycle scheme

2015-11-05 Thread Tim Deegan

At 13:47 +0000 on 02 Nov (1446472041), Wei Liu wrote:
> So I propose we use the following scheme:
> 
> - 6 months release cycle from unstable branch.
>   - 4 months development.
>   - 2 months freeze.
>   - Eat into next cycle if doesn't release on time.
> - Fixed cut-off date: the Fridays of the week in which the last day of
>   March and September falls.
> - No more freeze exception, but heads-up mails about freeze will be
>   sent a few weeks before hand.
> - Stable branch maintained for 18 months full support plus 18 months
>   security support. No mixed maintainership for stable trees.
> 
> Please vote to ack or nack this proposal.

This seems like a reasonable plan.  Since I'm not actively involved in
releases or large feature review, and I don't want to dictate things
that really only affect other people, I vote 0.

Cheers,

Tim.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] Hackathon 2016 Location Preferences

2015-11-05 Thread Lars Kurth

Hi all,

I wanted to do a quick straw poll regarding Hackathon locations for next year.
Before I do this though, I wanted to let you know that the 2016 Developer
Summit will most likely be in Berlin in October (I am in the process of
finalising space, budget and contract details, which will need to be approved by
the Advisory Board).

We do have two options for a Hackathon: China (either Shanghai, Hangzhou or
Beijing - details TBC) and Cambridge, UK. We are still in the early planning
phase and the budget for the Hackathon has not yet been approved.

Do let me know your preference, and I will see whether I can work with the
vendor(s) who are willing to host the 2016 Hackathon and choose a location
that suits a majority of developers.

Best Regards
Lars



[Xen-devel] [PATCH v11 5/5] xen/arm: account for stolen ticks

2015-11-05 Thread Stefano Stabellini

Register the runstate_memory_area with the hypervisor.
Use pv_time_ops.steal_clock to account for stolen ticks.

Signed-off-by: Stefano Stabellini 

---

Changes in v4:
- don't use paravirt_steal_rq_enabled: we do not support retrieving
stolen ticks for vcpus other than the one we are running on.

Changes in v3:
- use BUG_ON and smp_processor_id.
---
 arch/arm/xen/enlighten.c |   21 +
 1 file changed, 21 insertions(+)

diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index fc7ea52..15621b1 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -14,7 +14,10 @@
 #include 
 #include 
 #include 
+#include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -79,6 +82,19 @@ int xen_unmap_domain_gfn_range(struct vm_area_struct *vma,
 }
 EXPORT_SYMBOL_GPL(xen_unmap_domain_gfn_range);
 
+static unsigned long long xen_stolen_accounting(int cpu)
+{
+   struct vcpu_runstate_info state;
+
+   BUG_ON(cpu != smp_processor_id());
+
+   xen_get_runstate_snapshot(&state);
+
+   WARN_ON(state.state != RUNSTATE_running);
+
+   return state.time[RUNSTATE_runnable] + state.time[RUNSTATE_offline];
+}
+
 static void xen_percpu_init(void)
 {
struct vcpu_register_vcpu_info info;
@@ -104,6 +120,8 @@ static void xen_percpu_init(void)
BUG_ON(err);
per_cpu(xen_vcpu, cpu) = vcpup;
 
+   xen_setup_runstate_info(cpu);
+
 after_register_vcpu_info:
enable_percpu_irq(xen_events_irq, 0);
put_cpu();
@@ -271,6 +289,9 @@ static int __init xen_guest_init(void)
 
register_cpu_notifier(&xen_cpu_notifier);
 
+   pv_time_ops.steal_clock = xen_stolen_accounting;
+   static_key_slow_inc(&paravirt_steal_enabled);
+
return 0;
 }
 early_initcall(xen_guest_init);
-- 
1.7.10.4
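
For readers who want to see the accounting in isolation, here is a minimal
user-space sketch of what the patch above wires together: a steal_clock hook
that returns the time a vcpu spent runnable or offline, guarded by an
"enabled" switch. It is an illustration under stated assumptions, not kernel
code: fake_runstate stands in for the per-CPU area that Xen updates,
steal_enabled stands in for the paravirt_steal_enabled static key, and all
names ending in _sketch are hypothetical.

```c
#include <stdint.h>

/* Runstate indices as defined in Xen's public vcpu.h interface. */
enum {
    RUNSTATE_running  = 0,
    RUNSTATE_runnable = 1,
    RUNSTATE_blocked  = 2,
    RUNSTATE_offline  = 3,
};

struct vcpu_runstate_info {
    int      state;             /* current RUNSTATE_* value */
    uint64_t state_entry_time;  /* when that state was entered */
    uint64_t time[4];           /* ns spent in each runstate */
};

/* Mirrors the shape of the ARM pv_time_ops: a single hook. */
struct pv_time_ops_sketch {
    unsigned long long (*steal_clock)(int cpu);
};

/* Stand-in for the per-CPU runstate area registered with the hypervisor. */
static struct vcpu_runstate_info fake_runstate[2];

static struct pv_time_ops_sketch ops_sketch;
static int steal_enabled;       /* stand-in for the static key */

/* Stolen time = time the vcpu could have run (runnable) or was offline. */
static unsigned long long stolen_accounting_sketch(int cpu)
{
    const struct vcpu_runstate_info *s = &fake_runstate[cpu];

    return s->time[RUNSTATE_runnable] + s->time[RUNSTATE_offline];
}

/* What guest init does in the patch: install the hook, then enable it. */
static void guest_init_sketch(void)
{
    ops_sketch.steal_clock = stolen_accounting_sketch;
    steal_enabled = 1;
}

/* Callers only go through the hook once it has been enabled. */
static unsigned long long steal_clock_or_zero(int cpu)
{
    return steal_enabled ? ops_sketch.steal_clock(cpu) : 0;
}
```

Blocked time is deliberately not counted: a blocked vcpu had nothing to run,
so no time was taken away from it.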



[Xen-devel] [PATCH v11 2/5] missing include asm/paravirt.h in cputime.c

2015-11-05 Thread Stefano Stabellini

Add include asm/paravirt.h to cputime.c, as steal_account_process_tick
calls paravirt_steal_clock, which is defined in asm/paravirt.h.

The ifdef CONFIG_PARAVIRT is necessary because not all archs have an
asm/paravirt.h to include.

Signed-off-by: Stefano Stabellini 
CC: mi...@redhat.com
CC: pet...@infradead.org

---

Changes in v11:
- add ifdef CONFIG_PARAVIRT to cputime.c, because not all architectures
  have an asm/paravirt.h header file to include
- drop the removal of ifdef CONFIG_PARAVIRT from kernel/sched/core.c for
  the same reason
---
 kernel/sched/cputime.c |3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 8cbc3db..c7a27c4 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -5,6 +5,9 @@
 #include 
 #include 
 #include "sched.h"
+#ifdef CONFIG_PARAVIRT
+#include 
+#endif
 
 
 #ifdef CONFIG_IRQ_TIME_ACCOUNTING
-- 
1.7.10.4



[Xen-devel] [PATCH v11 3/5] arm: introduce CONFIG_PARAVIRT, PARAVIRT_TIME_ACCOUNTING and pv_time_ops

2015-11-05 Thread Stefano Stabellini

Introduce CONFIG_PARAVIRT and PARAVIRT_TIME_ACCOUNTING on ARM.

The only paravirt interface supported is pv_time_ops.steal_clock, so no
runtime pvops patching is needed.

This allows us to make use of steal_account_process_tick for stolen
ticks accounting.

Signed-off-by: Stefano Stabellini 
Acked-by: Christopher Covington 
Acked-by: Ian Campbell 
CC: li...@arm.linux.org.uk
CC: will.dea...@arm.com
CC: n...@linaro.org
CC: marc.zyng...@arm.com
CC: c...@codeaurora.org
CC: a...@arndb.de
CC: o...@lixom.net

---

Changes in v10:
- replace "---help---" with "help"

Changes in v7:
- ifdef CONFIG_PARAVIRT the content of paravirt.h.

Changes in v3:
- improve commit description and Kconfig help text;
- no need to initialize pv_time_ops;
- add PARAVIRT_TIME_ACCOUNTING.
---
 arch/arm/Kconfig|   20 
 arch/arm/include/asm/paravirt.h |   20 
 arch/arm/kernel/Makefile|1 +
 arch/arm/kernel/paravirt.c  |   25 +
 4 files changed, 66 insertions(+)
 create mode 100644 arch/arm/include/asm/paravirt.h
 create mode 100644 arch/arm/kernel/paravirt.c

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index f1ed110..60be104 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1823,6 +1823,25 @@ config SWIOTLB
 config IOMMU_HELPER
def_bool SWIOTLB
 
+config PARAVIRT
+   bool "Enable paravirtualization code"
+   help
+ This changes the kernel so it can modify itself when it is run
+ under a hypervisor, potentially improving performance significantly
+ over full virtualization.
+
+config PARAVIRT_TIME_ACCOUNTING
+   bool "Paravirtual steal time accounting"
+   select PARAVIRT
+   default n
+   help
+ Select this option to enable fine granularity task steal time
+ accounting. Time spent executing other tasks in parallel with
+ the current vCPU is discounted from the vCPU power. To account for
+ that, there can be a small performance impact.
+
+ If in doubt, say N here.
+
 config XEN_DOM0
def_bool y
depends on XEN
@@ -1836,6 +1855,7 @@ config XEN
select ARCH_DMA_ADDR_T_64BIT
select ARM_PSCI
select SWIOTLB_XEN
+   select PARAVIRT
help
  Say Y if you want to run Linux in a Virtual Machine on Xen on ARM.
 
diff --git a/arch/arm/include/asm/paravirt.h b/arch/arm/include/asm/paravirt.h
new file mode 100644
index 0000000..8435ff59
--- /dev/null
+++ b/arch/arm/include/asm/paravirt.h
@@ -0,0 +1,20 @@
+#ifndef _ASM_ARM_PARAVIRT_H
+#define _ASM_ARM_PARAVIRT_H
+
+#ifdef CONFIG_PARAVIRT
+struct static_key;
+extern struct static_key paravirt_steal_enabled;
+extern struct static_key paravirt_steal_rq_enabled;
+
+struct pv_time_ops {
+   unsigned long long (*steal_clock)(int cpu);
+};
+extern struct pv_time_ops pv_time_ops;
+
+static inline u64 paravirt_steal_clock(int cpu)
+{
+   return pv_time_ops.steal_clock(cpu);
+}
+#endif
+
+#endif
diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index af9e59b..3e6e937 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -81,6 +81,7 @@ obj-$(CONFIG_VDSO)+= vdso.o
 ifneq ($(CONFIG_ARCH_EBSA110),y)
   obj-y+= io.o
 endif
+obj-$(CONFIG_PARAVIRT) += paravirt.o
 
 head-y := head$(MMUEXT).o
 obj-$(CONFIG_DEBUG_LL) += debug.o
diff --git a/arch/arm/kernel/paravirt.c b/arch/arm/kernel/paravirt.c
new file mode 100644
index 0000000..53f371e
--- /dev/null
+++ b/arch/arm/kernel/paravirt.c
@@ -0,0 +1,25 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Copyright (C) 2013 Citrix Systems
+ *
+ * Author: Stefano Stabellini 
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+struct static_key paravirt_steal_enabled;
+struct static_key paravirt_steal_rq_enabled;
+
+struct pv_time_ops pv_time_ops;
+EXPORT_SYMBOL_GPL(pv_time_ops);
-- 
1.7.10.4



Re: [Xen-devel] Getting the XSAVE size from userspace

2015-11-05 Thread Razvan Cojocaru

On 11/05/2015 04:05 PM, Andrew Cooper wrote:
> On 05/11/15 12:26, Razvan Cojocaru wrote:
>> On 11/05/2015 01:44 PM, Andrew Cooper wrote:
>>> On 05/11/15 11:35, Andrei LUTAS wrote:
>>>> The use-case is the following: whenever an EPT violation is triggered
>>>> inside a monitored VM, the introspection logic needs to know how many
>>>> bytes were accessed (read/written). This is done by inspecting the
>>>> faulting instruction and directly inferring the size, which is not
>>>> straight-forward for XSAVE/XRSTOR family. Using the maximum possible
>>>> size is wrong, as in any given moment the OS may or may not desire to
>>>> XSAVE/XRSTOR the entire state (and thinking that the instruction tries
>>>> to access more than it actually does may yield undesired effects).
>>>> Therefore, the size needed for the currently enabled features of the
>>>> monitored guest is required instead. Normally, it could be done by
>>>> running CPUID with eax = 0xD and ecx = i, where i >= 2 and XCR0[i] is
>>>> 1 (XCR0 belongs to the monitored guest), but I am unsure if using
>>>> CPUID this way would be safe/desired: will Xen expose the same CPUID
>>>> features, for XSAVE related functionality, on all VMs? (using XCPUID
>>>> with eax = 0xD and ecx = 0 would give us the needed size for the SVA,
>>>> and like I said, using the maximum size would not be safe, even if
>>>> it's the same across all VMs on a given host). Also, I'm unsure how
>>>> this would get along with migration...
>>> Hmm yes - there is no way to do this currently.
>>>
>>> Xen's CPUID handling for xsave related things is broken in levelling and
>>> migration scenarios, which is why it is *still* disabled by default in
>>> XenServer.
>>>
>>> I am working on fixing it, and will take this usecase into account
>>> (although I think I had already included enough for this usecase to work).
>>>
>>> At the point of the xsave/xrstor trap, you need to know xcr0 and be
>>> able to perform a cpuid instruction in the context of a target domain, to
>>> make use of 0xD[0].ebx to get the "current size based on xcr0".
>> So then the closest thing to what we need would be to add a size field
>> to struct hvm_hw_cpu_xsave, and just assign the size variable to it in
>> hvm_save_cpu_xsave_states (migration aside)?
>>
>> static int hvm_save_cpu_xsave_states(struct domain *d,
>>                                      hvm_domain_context_t *h)
>> {
>>     struct vcpu *v;
>>     struct hvm_hw_cpu_xsave *ctxt;
>>
>>     if ( !cpu_has_xsave )
>>         return 0;   /* do nothing */
>>
>>     for_each_vcpu ( d, v )
>>     {
>>         unsigned int size = HVM_CPU_XSAVE_SIZE(v->arch.xcr0_accum);
>>
>>         if ( !xsave_enabled(v) )
>>             continue;
>>         if ( _hvm_init_entry(h, CPU_XSAVE_CODE, v->vcpu_id, size) )
>>             return 1;
>>         ctxt = (struct hvm_hw_cpu_xsave *)&h->data[h->cur];
>>         h->cur += size;
>>
>>         ctxt->xfeature_mask = xfeature_mask;
>>         ctxt->xcr0 = v->arch.xcr0;
>>         ctxt->xcr0_accum = v->arch.xcr0_accum;
>>         memcpy(&ctxt->save_area, v->arch.xsave_area,
>>                size - offsetof(struct hvm_hw_cpu_xsave, save_area));
>>     }
>>
>>     return 0;
>> }
> 
> I don't see any difference between this pasted code and the current
> hvm_save_cpu_xsave_states().  What have you changed?

I haven't changed anything, I was just pointing out what code I'm
referring to (which size variable I'm talking about), sorry for not
being as clear as possible.

> You can't use this size value, as it is derived from the accumulated xcr0
> over the life of the VM, not the xcr0 in use at the time of the intercepted
> instruction.

OK.

> You also can't blindly modify the ctxt structure, or you will break
> migration.

Well, yes, not blindly, that assumes that something like a patch for
mainline is agreed upon, or that migration is disabled for guests that
need this, and so on.

> The xcr0 -> size mapping is static, and won't change going forwards. 
> Your best bet is just to query each one and stash all the results.

OK.


Thanks,
Razvan
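
As a rough illustration of the "query each one and stash all the results"
suggestion: once the per-component values from CPUID leaf 0xD have been
recorded (sub-leaf i >= 2 reports the component size in EAX and its offset in
EBX for the standard, non-compacted format), the size for a given xcr0 can be
computed without running CPUID again. This is a sketch under those
assumptions, not Xen code; struct xsave_component and xsave_size_for_xcr0()
are hypothetical names, and the table is assumed to have been filled in
elsewhere.

```c
#include <stdint.h>

/* One stashed CPUID.(EAX=0xD, ECX=i) result for state component i. */
struct xsave_component {
    uint32_t size;    /* EAX: size of the component in bytes */
    uint32_t offset;  /* EBX: offset from the start of the XSAVE area */
};

/* 512-byte legacy FXSAVE region plus the 64-byte XSAVE header. */
#define XSAVE_LEGACY_AND_HEADER 576u

/*
 * Size of the standard-format XSAVE area for the features enabled in
 * xcr0: the end of the highest enabled extended component, or just the
 * legacy region and header if only bits 0 and 1 are set.
 */
static uint32_t xsave_size_for_xcr0(uint64_t xcr0,
                                    const struct xsave_component comp[64])
{
    uint32_t size = XSAVE_LEGACY_AND_HEADER;

    for (unsigned int i = 2; i < 64; i++) {
        uint32_t end;

        if (!(xcr0 & (1ULL << i)))
            continue;
        end = comp[i].offset + comp[i].size;
        if (end > size)
            size = end;
    }
    return size;
}
```

The key point from the thread is that the xcr0 passed in must be the value in
use at the time of the intercepted instruction, not xcr0_accum.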


[Xen-devel] [PATCH v11 4/5] arm64: introduce CONFIG_PARAVIRT, PARAVIRT_TIME_ACCOUNTING and pv_time_ops

2015-11-05 Thread Stefano Stabellini

Introduce CONFIG_PARAVIRT and PARAVIRT_TIME_ACCOUNTING on ARM64.
Necessary duplication of paravirt.h and paravirt.c with ARM.

The only paravirt interface supported is pv_time_ops.steal_clock, so no
runtime pvops patching is needed.

This allows us to make use of steal_account_process_tick for stolen
ticks accounting.

Signed-off-by: Stefano Stabellini 
Acked-by: Marc Zyngier 
CC: will.dea...@arm.com
CC: n...@linaro.org
CC: marc.zyng...@arm.com
CC: c...@codeaurora.org
CC: a...@arndb.de
CC: o...@lixom.net
CC: catalin.mari...@arm.com

---

Changes in v10:
- replace "---help---" with "help"

Changes in v7:
- ifdef CONFIG_PARAVIRT the content of paravirt.h.
---
 arch/arm64/Kconfig|   20 
 arch/arm64/include/asm/paravirt.h |   20 
 arch/arm64/kernel/Makefile|1 +
 arch/arm64/kernel/paravirt.c  |   25 +
 4 files changed, 66 insertions(+)
 create mode 100644 arch/arm64/include/asm/paravirt.h
 create mode 100644 arch/arm64/kernel/paravirt.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 7b10647..659e286 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -533,6 +533,25 @@ config SECCOMP
  and the task is only allowed to execute a few safe syscalls
  defined by each seccomp mode.
 
+config PARAVIRT
+   bool "Enable paravirtualization code"
+   help
+ This changes the kernel so it can modify itself when it is run
+ under a hypervisor, potentially improving performance significantly
+ over full virtualization.
+
+config PARAVIRT_TIME_ACCOUNTING
+   bool "Paravirtual steal time accounting"
+   select PARAVIRT
+   default n
+   help
+ Select this option to enable fine granularity task steal time
+ accounting. Time spent executing other tasks in parallel with
+ the current vCPU is discounted from the vCPU power. To account for
+ that, there can be a small performance impact.
+
+ If in doubt, say N here.
+
 config XEN_DOM0
def_bool y
depends on XEN
@@ -541,6 +560,7 @@ config XEN
bool "Xen guest support on ARM64"
depends on ARM64 && OF
select SWIOTLB_XEN
+   select PARAVIRT
help
  Say Y if you want to run Linux in a Virtual Machine on Xen on ARM64.
 
diff --git a/arch/arm64/include/asm/paravirt.h b/arch/arm64/include/asm/paravirt.h
new file mode 100644
index 0000000..fd5f428
--- /dev/null
+++ b/arch/arm64/include/asm/paravirt.h
@@ -0,0 +1,20 @@
+#ifndef _ASM_ARM64_PARAVIRT_H
+#define _ASM_ARM64_PARAVIRT_H
+
+#ifdef CONFIG_PARAVIRT
+struct static_key;
+extern struct static_key paravirt_steal_enabled;
+extern struct static_key paravirt_steal_rq_enabled;
+
+struct pv_time_ops {
+   unsigned long long (*steal_clock)(int cpu);
+};
+extern struct pv_time_ops pv_time_ops;
+
+static inline u64 paravirt_steal_clock(int cpu)
+{
+   return pv_time_ops.steal_clock(cpu);
+}
+#endif
+
+#endif
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 474691f..ca9fbe1 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -41,6 +41,7 @@ arm64-obj-$(CONFIG_EFI)   += efi.o efi-entry.stub.o
 arm64-obj-$(CONFIG_PCI)+= pci.o
 arm64-obj-$(CONFIG_ARMV8_DEPRECATED)   += armv8_deprecated.o
 arm64-obj-$(CONFIG_ACPI)   += acpi.o
+arm64-obj-$(CONFIG_PARAVIRT)   += paravirt.o
 
 obj-y  += $(arm64-obj-y) vdso/
 obj-m  += $(arm64-obj-m)
diff --git a/arch/arm64/kernel/paravirt.c b/arch/arm64/kernel/paravirt.c
new file mode 100644
index 0000000..53f371e
--- /dev/null
+++ b/arch/arm64/kernel/paravirt.c
@@ -0,0 +1,25 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Copyright (C) 2013 Citrix Systems
+ *
+ * Author: Stefano Stabellini 
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+struct static_key paravirt_steal_enabled;
+struct static_key paravirt_steal_rq_enabled;
+
+struct pv_time_ops pv_time_ops;
+EXPORT_SYMBOL_GPL(pv_time_ops);
-- 
1.7.10.4



[Xen-devel] [PATCH v11 0/5] xen/arm/arm64: CONFIG_PARAVIRT and stolen ticks accounting

2015-11-05 Thread Stefano Stabellini

Hi all,

I dusted off this series from Jan 2014. Patches #2 and #3 still need an ack.


This patch series introduces stolen ticks accounting for Xen on ARM and
ARM64.  Stolen ticks are clocksource ticks that have been "stolen" from
the cpu, typically because Linux is running in a virtual machine and the
vcpu has been descheduled.  To account for these ticks we introduce
CONFIG_PARAVIRT and pv_time_ops so that we can make use of:

kernel/sched/cputime.c:steal_account_process_tick


Changes in v11:
- add ifdef CONFIG_PARAVIRT to kernel/sched/cputime.c, because not all
  architectures have an asm/paravirt.h header file to include
- drop the removal of ifdef CONFIG_PARAVIRT from kernel/sched/core.c for
  the same reason


Stefano Stabellini (5):
  xen: move xen_setup_runstate_info and get_runstate_snapshot to drivers/xen/time.c
  missing include asm/paravirt.h in cputime.c
  arm: introduce CONFIG_PARAVIRT, PARAVIRT_TIME_ACCOUNTING and pv_time_ops
  arm64: introduce CONFIG_PARAVIRT, PARAVIRT_TIME_ACCOUNTING and pv_time_ops
  xen/arm: account for stolen ticks

 arch/arm/Kconfig  |   20 
 arch/arm/include/asm/paravirt.h   |   20 
 arch/arm/kernel/Makefile  |1 +
 arch/arm/kernel/paravirt.c|   25 ++
 arch/arm/xen/enlighten.c  |   21 +
 arch/arm64/Kconfig|   20 
 arch/arm64/include/asm/paravirt.h |   20 
 arch/arm64/kernel/Makefile|1 +
 arch/arm64/kernel/paravirt.c  |   25 ++
 arch/x86/xen/time.c   |   76 +--
 drivers/xen/Makefile  |2 +-
 drivers/xen/time.c|   91 +
 include/xen/xen-ops.h |5 ++
 kernel/sched/cputime.c|3 ++
 14 files changed, 254 insertions(+), 76 deletions(-)
 create mode 100644 arch/arm/include/asm/paravirt.h
 create mode 100644 arch/arm/kernel/paravirt.c
 create mode 100644 arch/arm64/include/asm/paravirt.h
 create mode 100644 arch/arm64/kernel/paravirt.c
 create mode 100644 drivers/xen/time.c



Cheers,

Stefano


[Xen-devel] [PATCH v11 1/5] xen: move xen_setup_runstate_info and get_runstate_snapshot to drivers/xen/time.c

2015-11-05 Thread Stefano Stabellini

Signed-off-by: Stefano Stabellini 
Acked-by: Ian Campbell 
Reviewed-by: Konrad Rzeszutek Wilk 
CC: konrad.w...@oracle.com

---

Changes in v10:
- rebase
---
 arch/x86/xen/time.c   |   76 +
 drivers/xen/Makefile  |2 +-
 drivers/xen/time.c|   91 +
 include/xen/xen-ops.h |5 +++
 4 files changed, 98 insertions(+), 76 deletions(-)
 create mode 100644 drivers/xen/time.c

diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
index f1ba6a0..041d4cd 100644
--- a/arch/x86/xen/time.c
+++ b/arch/x86/xen/time.c
@@ -32,86 +32,12 @@
 #define TIMER_SLOP 100000
 #define NS_PER_TICK (1000000000LL / HZ)
 
-/* runstate info updated by Xen */
-static DEFINE_PER_CPU(struct vcpu_runstate_info, xen_runstate);
-
 /* snapshots of runstate info */
 static DEFINE_PER_CPU(struct vcpu_runstate_info, xen_runstate_snapshot);
 
 /* unused ns of stolen time */
 static DEFINE_PER_CPU(u64, xen_residual_stolen);
 
-/* return an consistent snapshot of 64-bit time/counter value */
-static u64 get64(const u64 *p)
-{
-   u64 ret;
-
-   if (BITS_PER_LONG < 64) {
-   u32 *p32 = (u32 *)p;
-   u32 h, l;
-
-   /*
-* Read high then low, and then make sure high is
-* still the same; this will only loop if low wraps
-* and carries into high.
-* XXX some clean way to make this endian-proof?
-*/
-   do {
-   h = p32[1];
-   barrier();
-   l = p32[0];
-   barrier();
-   } while (p32[1] != h);
-
-   ret = (((u64)h) << 32) | l;
-   } else
-   ret = *p;
-
-   return ret;
-}
-
-/*
- * Runstate accounting
- */
-static void get_runstate_snapshot(struct vcpu_runstate_info *res)
-{
-   u64 state_time;
-   struct vcpu_runstate_info *state;
-
-   BUG_ON(preemptible());
-
-   state = this_cpu_ptr(&xen_runstate);
-
-   /*
-* The runstate info is always updated by the hypervisor on
-* the current CPU, so there's no need to use anything
-* stronger than a compiler barrier when fetching it.
-*/
-   do {
-   state_time = get64(&state->state_entry_time);
-   barrier();
-   *res = *state;
-   barrier();
-   } while (get64(&state->state_entry_time) != state_time);
-}
-
-/* return true when a vcpu could run but has no real cpu to run on */
-bool xen_vcpu_stolen(int vcpu)
-{
-   return per_cpu(xen_runstate, vcpu).state == RUNSTATE_runnable;
-}
-
-void xen_setup_runstate_info(int cpu)
-{
-   struct vcpu_register_runstate_memory_area area;
-
-   area.addr.v = &per_cpu(xen_runstate, cpu);
-
-   if (HYPERVISOR_vcpu_op(VCPUOP_register_runstate_memory_area,
-  cpu, &area))
-   BUG();
-}
-
 static void do_stolen_accounting(void)
 {
struct vcpu_runstate_info state;
@@ -119,7 +45,7 @@ static void do_stolen_accounting(void)
s64 runnable, offline, stolen;
cputime_t ticks;
 
-   get_runstate_snapshot(&state);
+   xen_get_runstate_snapshot(&state);
 
WARN_ON(state.state != RUNSTATE_running);
 
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index aa8a7f7..9b7a35c 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -1,6 +1,6 @@
 obj-$(CONFIG_HOTPLUG_CPU)  += cpu_hotplug.o
 obj-$(CONFIG_X86)  += fallback.o
-obj-y  += grant-table.o features.o balloon.o manage.o preempt.o
+obj-y  += grant-table.o features.o balloon.o manage.o preempt.o time.o
 obj-y  += events/
 obj-y  += xenbus/
 
diff --git a/drivers/xen/time.c b/drivers/xen/time.c
new file mode 100644
index 0000000..433fe24
--- /dev/null
+++ b/drivers/xen/time.c
@@ -0,0 +1,91 @@
+/*
+ * Xen stolen ticks accounting.
+ */
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* runstate info updated by Xen */
+static DEFINE_PER_CPU(struct vcpu_runstate_info, xen_runstate);
+
+/* return an consistent snapshot of 64-bit time/counter value */
+static u64 get64(const u64 *p)
+{
+   u64 ret;
+
+   if (BITS_PER_LONG < 64) {
+   u32 *p32 = (u32 *)p;
+   u32 h, l;
+
+   /*
+* Read high then low, and then make sure high is
+* still the same; this will only loop if low wraps
+* and carries into high.
+* XXX some clean way to make this endian-proof?
+*/
+   do {
+   h = p32[1];
+   barrier();
+   l = p32[0];
+   barrier();
+   } while (p32[1] != h);
+
+   ret = (((u64)h) << 32) | l;
+   } else
+   ret = *p;
+
+
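
The get64() helper being moved here reads a 64-bit counter on a 32-bit machine
without tearing: read the high word, read the low word, then re-read the high
word and retry if it changed. A stand-alone sketch of the same idea follows.
It is not the kernel function: read64_consistent() is a hypothetical name, it
takes the two halves as explicit pointers (which sidesteps the endianness
question raised in the original comment), and it relies on volatile accesses
instead of barrier() to keep the compiler from reordering the loads. Like the
original, it assumes the writer runs on the same CPU, so no hardware memory
barrier is used.

```c
#include <stdint.h>

/*
 * Read a 64-bit value stored as two 32-bit halves that may be updated
 * between our loads. The loop only repeats if the low half wrapped and
 * carried into the high half while we were reading.
 */
static uint64_t read64_consistent(const volatile uint32_t *lo,
                                  const volatile uint32_t *hi)
{
    uint32_t h, l;

    do {
        h = *hi;          /* read high first */
        l = *lo;          /* then low */
    } while (*hi != h);   /* high changed underneath us: retry */

    return ((uint64_t)h << 32) | l;
}
```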

Re: [Xen-devel] [PATCH 1/2] rwlock: add per-cpu reader-writer locks

2015-11-05 Thread Malcolm Crossley

On 05/11/15 13:48, Marcos E. Matsunaga wrote:
> Hi Malcolm,
> 
> I tried your patches against staging yesterday and as soon as I started a
> guest, it panicked. I have lock_profile enabled and applied your patches against:

I tested with a non-debug version of Xen (because I was analysing the
performance of Xen), so those ASSERTs were never run.

The ASSERTs can be safely removed. The rwlock behaviour is slightly different
in that it's possible for a writer to hold the write lock whilst a reader is
progressing through the read critical section. This is safe because the writer
waits for the percpu variables to clear before actually progressing through
its own critical section.

I have an updated version of the patch series which fixes this. Do you want me
to post it, or are you happy to remove the ASSERTs yourself (or switch to a
non-debug build of Xen)?

Sorry for not catching this before it hit the list.

Malcolm

> 
> 6f04de658574833688c3f9eab310e7834d56a9c0 x86: cleanup of early cpuid handling
> 
> 
> 
> (XEN) HVM1 save: CPU
> (XEN) HVM1 save: PIC
> (XEN) HVM1 save: IOAPIC
> (XEN) HVM1 save: LAPIC
> (XEN) HVM1 save: LAPIC_REGS
> (XEN) HVM1 save: PCI_IRQ
> (XEN) HVM1 save: ISA_IRQ
> (XEN) HVM1 save: PCI_LINK
> (XEN) HVM1 save: PIT
> (XEN) HVM1 save: RTC
> (XEN) HVM1 save: HPET
> (XEN) HVM1 save: PMTIMER
> (XEN) HVM1 save: MTRR
> (XEN) HVM1 save: VIRIDIAN_DOMAIN
> (XEN) HVM1 save: CPU_XSAVE
> (XEN) HVM1 save: VIRIDIAN_VCPU
> (XEN) HVM1 save: VMCE_VCPU
> (XEN) HVM1 save: TSC_ADJUST
> (XEN) HVM1 restore: CPU 0
> [  394.163143] loop: module loaded
> (XEN) Assertion 'rw_is_locked(&t->lock)' failed at grant_table.c:215
> (XEN) [ Xen-4.7-unstable  x86_64  debug=y  Tainted:C ]
> (XEN) CPU:0
> (XEN) RIP:e008:[] do_grant_table_op+0x63f/0x2e04
> (XEN) RFLAGS: 00010246   CONTEXT: hypervisor (d0v0)
> (XEN) rax:    rbx: 83400f9dc9e0   rcx: 
> (XEN) rdx: 0001   rsi: 82d080342b10   rdi: 83400819b784
> (XEN) rbp: 8300774ffef8   rsp: 8300774ffdf8   r8: 0002
> (XEN) r9:  0002   r10: 0002   r11: 
> (XEN) r12:    r13:    r14: 83400819b780
> (XEN) r15: 83400f9d   cr0: 80050033   cr4: 001526e0
> (XEN) cr3: 01007f613000   cr2: 8800746182b8
> (XEN) ds:    es:    fs:    gs:    ss: e010   cs: e008
> (XEN) Xen stack trace from rsp=8300774ffdf8:
> (XEN)8300774ffe08 82d0 8300774ffef8 82d08017fc9b
> (XEN)82d080342b28 83400f9d8600 82d080342b10 
> (XEN)83400f9dca20 8321 834008188000 0001
> (XEN)0001772ee000 8801e98d03e0 8300774ffe88 
> (XEN) 8300774fff18 0021d0269c10 0001001a
> (XEN)0001  0246 7ff7de45a407
> (XEN)0100 7ff7de45a407 0033 8300772ee000
> (XEN)8801eb0e3c00 880004bf57e8 8801e98d03e0 8801eb0a5938
> (XEN)7cff88b000c7 82d08023d952 8100128a 0014
> (XEN) 0001 8801f6e18388 81d3d740
> (XEN)8801efb7bd40 88000542e780 0282 
> (XEN)8801e98d03a0 8801efe07000 0014 8100128a
> (XEN)0001 8801e98d03e0  00010100
> (XEN)8100128a e033 0282 8801efb7bce0
> (XEN)e02b   
> (XEN)  8300772ee000 
> (XEN)
> (XEN) Xen call trace:
> (XEN)[] do_grant_table_op+0x63f/0x2e04
> (XEN)[] lstar_enter+0xe2/0x13c
> (XEN)
> (XEN)
> (XEN) 
> (XEN) Panic on CPU 0:
> (XEN) Assertion 'rw_is_locked(&t->lock)' failed at grant_table.c:215
> (XEN) 
> (XEN)
> (XEN) Manual reset required ('noreboot' specified)
> 
> 
> Thanks for your help.
> 
> On 11/03/2015 12:58 PM, Malcolm Crossley wrote:
>> Per-cpu read-write locks allow for the fast path read case to have low
>> overhead by only setting/clearing a per-cpu variable for using the read lock.
>> The per-cpu read fast path also avoids locked compare swap operations which
>> can be particularly slow on coherent multi-socket systems, particularly if
>> there is heavy usage of the read lock itself.
>>
>> The per-cpu reader-writer lock uses a global variable to control the read
>> lock fast path. This allows a writer to disable the fast path and ensure the
>> readers use the underlying read-write lock implementation.
>>
>> Once the writer has taken the write lock and disabled the fast path, it must
>> poll the per-cpu variable for all CPU's which have ente
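
The scheme described in the quoted commit message can be sketched in
user-space C11 roughly as below. This is an illustration of the idea only and
not the code from the patch: every name is hypothetical, NR_CPUS is a fixed
stand-in, the "cpu" argument is just a slot index that one thread is assumed
to own for the duration of its read section, and the underlying lock is a
simple spinning counter. All atomics use the default sequentially consistent
ordering, which is what lets the reader's "set flag, then check writer" and
the writer's "set writer, then check flags" never miss each other.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 4   /* hypothetical fixed CPU count for the sketch */

struct percpu_rwlock {
    atomic_bool writer_active;     /* global switch: fast path disabled */
    atomic_bool reading[NR_CPUS];  /* per-CPU "inside read section" flag */
    atomic_int  slow;              /* underlying lock: >0 readers, -1 writer */
};

/* Underlying (slow) reader-writer lock: a plain spinning counter. */
static void slow_read_lock(atomic_int *l)
{
    for (;;) {
        int v = atomic_load(l);
        if (v >= 0 && atomic_compare_exchange_weak(l, &v, v + 1))
            return;
    }
}

static void slow_read_unlock(atomic_int *l)
{
    atomic_fetch_sub(l, 1);
}

static void slow_write_lock(atomic_int *l)
{
    int expected = 0;
    while (!atomic_compare_exchange_weak(l, &expected, -1))
        expected = 0;
}

static void slow_write_unlock(atomic_int *l)
{
    atomic_store(l, 0);
}

/* Returns true if the fast path was taken; pass that to read_unlock. */
static bool percpu_read_lock(struct percpu_rwlock *l, unsigned int cpu)
{
    atomic_store(&l->reading[cpu], true);
    if (!atomic_load(&l->writer_active))
        return true;                       /* fast path: no locked RMW */
    /* A writer disabled the fast path: back off and use the slow lock. */
    atomic_store(&l->reading[cpu], false);
    slow_read_lock(&l->slow);
    return false;
}

static void percpu_read_unlock(struct percpu_rwlock *l, unsigned int cpu,
                               bool fast)
{
    if (fast)
        atomic_store(&l->reading[cpu], false);
    else
        slow_read_unlock(&l->slow);
}

static void percpu_write_lock(struct percpu_rwlock *l)
{
    slow_write_lock(&l->slow);              /* excludes slow-path readers */
    atomic_store(&l->writer_active, true);  /* disable the fast path */
    /* Wait for readers already on the fast path to leave. */
    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
        while (atomic_load(&l->reading[cpu]))
            ;
}

static void percpu_write_unlock(struct percpu_rwlock *l)
{
    atomic_store(&l->writer_active, false);
    slow_write_unlock(&l->slow);
}
```

Note that in this sketch, too, the writer holds the underlying write lock
while fast-path readers may still be inside their critical sections, which is
why an assertion that the underlying lock is held by a reader would not hold
on the fast path.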

Re: [Xen-devel] [xen-unstable test] 63540: regressions - FAIL

2015-11-05 Thread Ian Campbell

On Thu, 2015-11-05 at 03:49 -0700, Jan Beulich wrote:
> > > > On 05.11.15 at 04:01,  wrote:
> > flight 63540 xen-unstable real [real]
> > http://logs.test-lab.xenproject.org/osstest/logs/63540/ 
> > 
> > Regressions :-(
> > 
> > Tests which did not succeed and are blocking,
> > including tests which could not be run:
> >  test-amd64-amd64-xl-qemut-winxpsp3  6 xen-boot  fail REGR. vs. 63475
> 
> Hmm, did something go wrong during install? The first boot
> after install appears to be a kernel booted natively, and then
> nothing else.

This is commonly a sign that the host has forgotten its boot order, the
merlot machines have some form for this (merlot0 is currently out of
rotation for this reason).

Ian, do we need to look into merlot1 again too?

BTW the osstest-admin@ alias goes to real people (Ian and myself) so it
is useful to keep it in the reply in cases like this.

Ian.


Re: [Xen-devel] [PATCH v7 16/32] xen/x86: allow disabling the pmtimer

2015-11-05 Thread Andrew Cooper

On 04/11/15 16:17, Jan Beulich wrote:
 On 04.11.15 at 17:05,  wrote:
>> El 03/11/15 a les 13.41, Jan Beulich ha escrit:
>> On 03.11.15 at 11:57,  wrote:
 On 03/11/15 07:21, Jan Beulich wrote:
 On 30.10.15 at 16:36,  wrote:
>> On 30/10/15 13:16, Jan Beulich wrote:
>> On 30.10.15 at 13:50,  wrote:
 El 14/10/15 a les 16.37, Jan Beulich ha escrit:
 On 02.10.15 at 17:48,  wrote:
>> Signed-off-by: Roger Pau Monné 
>> Cc: Jan Beulich 
>> Cc: Andrew Cooper 
>> ---
>> Changes since v6:
>>  - Return ENODEV in pmtimer_load if the timer is disabled.
>>  - hvm_acpi_power_button and hvm_acpi_sleep_button become noops if 
>> the
>>pmtimer is disabled.
> But how are those two features connected? I don't think you can
> assume absence of a PM block just because there's no PM timer.
> Or if you want to tie them together for now, the predicate needs
> to be renamed.
>
>>  - Return ENODEV if pmtimer_change_ioport is called with the pmtimer
>>disabled.
> Same here.
 What about changing XEN_X86_EMU_PMTIMER into XEN_X86_EMU_PM and this
 flags disables all PM stuff?
>>> Ah, right, that's a reasonable option.
>> It still might be a nice idea to split them in two, given future work.
>>
>> To support hotplug properly (cpu, ram and pci), Xen needs to inject
>> GPEs, which comes from part of the PM infrastructure.  To support PCI
>> devices in the future without the whole PM infrastructure, it would be
>> nice to keep the split.
> Coming back to this - I'm not sure: The hotplug aspect as you
> mention it should matter for Dom0 only. DomU could (and perhaps
> should) use a PV interface instead.
 I disagree.

 All PVH guests should use the same mechanism; making a split between
 dom0 and domU will only make our lives harder.

 Where reasonable, we should follow what happens on native; one of the
 underlying points of PVH is to have less of an impact on the guest
 side.  In some cases it is indeed nasty, but has the advantage of being
 well understood.
>>> What meaning would ACPI have to a PVH DomU?
>>>
> So I'd like to suggest quite the opposite: Don't call the thing PM,
> but make it more general and call it ACPI. And instead of
> separating HPET, we might have this fall under ACPI as well, or
> we might have a second TIMER flag, requiring both to be set
> for there to be a HPET and PMTMR. This leaves open the option
> of Dom0 getting ACPI enabled (despite this then being "real",
> not emulated ACPI), but TIMER left off.
 An HPET can exist independently of other features such as ACPI.  It
 should have its own option.
>>> Without ACPI there's no defined way to discover it. Doing what
>>> Linux does - applying chipset knowledge - won't work on PVH either,
>>> because there's no emulated chipset. Which would leave scanning
>>> physical memory, but if there is none, none can be found.
>>>
 +1 to having an ACPI option, but as indicated above, I expect it to be
 used in the longterm even for domU.
>>> Again - why and how?
>> I think that at this point in the design it's not so important to have
>> all the XEN_X86_EMU_* properly defined. This is not a public interface,
>> so we can expand/reduce them whenever we want. Would it be fine, for the
>> time being to just have a XEN_X86_EMU_PM and control both the PM and the
>> PMTMR?
> I think so, yes.

Also +1 for now.

~Andrew

Re: [Xen-devel] [PATCH 1/2] rwlock: add per-cpu reader-writer locks

2015-11-05 Thread Marcos E. Matsunaga


Hi Malcolm,

If you can post the updated patches, that would be great. I think it 
would be better for me to test with your update.


Thanks again.

On 11/05/2015 10:20 AM, Malcolm Crossley wrote:

On 05/11/15 13:48, Marcos E. Matsunaga wrote:

Hi Malcolm,

I tried your patches against staging yesterday and as soon as I started
a guest, it panicked. I have lock_profile enabled and applied your
patches against:

I tested with a non-debug version of Xen (because I was analysing the
performance of Xen) and thus those ASSERTs were never run.

The ASSERTs can be safely removed. The rwlock behaviour is slightly
different in that it's possible for a writer to hold the write lock
whilst a reader is progressing through the read critical section. This
is safe because the writer waits for the percpu variables to clear
before actually progressing through its own critical section.

I have an updated version of the patch series which fixes this. Do you
want me to post it, or are you happy to remove the ASSERTs yourself (or
switch to a non-debug build of Xen)?

Sorry for not catching this before it hit the list.

Malcolm


6f04de658574833688c3f9eab310e7834d56a9c0 x86: cleanup of early cpuid handling



(XEN) HVM1 save: CPU
(XEN) HVM1 save: PIC
(XEN) HVM1 save: IOAPIC
(XEN) HVM1 save: LAPIC
(XEN) HVM1 save: LAPIC_REGS
(XEN) HVM1 save: PCI_IRQ
(XEN) HVM1 save: ISA_IRQ
(XEN) HVM1 save: PCI_LINK
(XEN) HVM1 save: PIT
(XEN) HVM1 save: RTC
(XEN) HVM1 save: HPET
(XEN) HVM1 save: PMTIMER
(XEN) HVM1 save: MTRR
(XEN) HVM1 save: VIRIDIAN_DOMAIN
(XEN) HVM1 save: CPU_XSAVE
(XEN) HVM1 save: VIRIDIAN_VCPU
(XEN) HVM1 save: VMCE_VCPU
(XEN) HVM1 save: TSC_ADJUST
(XEN) HVM1 restore: CPU 0
[  394.163143] loop: module loaded
(XEN) Assertion 'rw_is_locked(&t->lock)' failed at grant_table.c:215
(XEN) [ Xen-4.7-unstable  x86_64  debug=y  Tainted:C ]
(XEN) CPU:0
(XEN) RIP:e008:[] do_grant_table_op+0x63f/0x2e04
(XEN) RFLAGS: 00010246   CONTEXT: hypervisor (d0v0)
(XEN) rax:    rbx: 83400f9dc9e0   rcx: 
(XEN) rdx: 0001   rsi: 82d080342b10   rdi: 83400819b784
(XEN) rbp: 8300774ffef8   rsp: 8300774ffdf8   r8: 0002
(XEN) r9:  0002   r10: 0002   r11: 
(XEN) r12:    r13:    r14: 83400819b780
(XEN) r15: 83400f9d   cr0: 80050033   cr4: 001526e0
(XEN) cr3: 01007f613000   cr2: 8800746182b8
(XEN) ds:    es:    fs:    gs:    ss: e010   cs: e008
(XEN) Xen stack trace from rsp=8300774ffdf8:
(XEN)8300774ffe08 82d0 8300774ffef8 82d08017fc9b
(XEN)82d080342b28 83400f9d8600 82d080342b10 
(XEN)83400f9dca20 8321 834008188000 0001
(XEN)0001772ee000 8801e98d03e0 8300774ffe88 
(XEN) 8300774fff18 0021d0269c10 0001001a
(XEN)0001  0246 7ff7de45a407
(XEN)0100 7ff7de45a407 0033 8300772ee000
(XEN)8801eb0e3c00 880004bf57e8 8801e98d03e0 8801eb0a5938
(XEN)7cff88b000c7 82d08023d952 8100128a 0014
(XEN) 0001 8801f6e18388 81d3d740
(XEN)8801efb7bd40 88000542e780 0282 
(XEN)8801e98d03a0 8801efe07000 0014 8100128a
(XEN)0001 8801e98d03e0  00010100
(XEN)8100128a e033 0282 8801efb7bce0
(XEN)e02b   
(XEN)  8300772ee000 
(XEN)
(XEN) Xen call trace:
(XEN)[] do_grant_table_op+0x63f/0x2e04
(XEN)[] lstar_enter+0xe2/0x13c
(XEN)
(XEN)
(XEN) 
(XEN) Panic on CPU 0:
(XEN) Assertion 'rw_is_locked(&t->lock)' failed at grant_table.c:215
(XEN) 
(XEN)
(XEN) Manual reset required ('noreboot' specified)


Thanks for your help.

On 11/03/2015 12:58 PM, Malcolm Crossley wrote:

Per-cpu read-write locks allow for the fast path read case to have low overhead
by only setting/clearing a per-cpu variable for using the read lock.
The per-cpu read fast path also avoids locked compare swap operations which can
be particularly slow on coherent multi-socket systems, particularly if there is
heavy usage of the read lock itself.

The per-cpu reader-writer lock uses a global variable to control the read lock
fast path. This allows a writer to disable the fast path and ensure the readers
use the underlying read-write lock implementation.
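
The scheme described above can be sketched in userspace C. This is only an illustration under stated assumptions, not the code from the patch: all names (`percpu_rwlock`, `NR_SLOTS`, the `slow_*` helpers) are hypothetical, "per-cpu" is modelled as a fixed array of slots, and the underlying read-write lock is a toy counter built on C11 atomics. Readers set their slot and check a global flag; a writer takes the underlying write lock, sets the flag to disable the fast path, then polls every slot until it clears.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_SLOTS 4                      /* hypothetical: one slot per CPU */

struct percpu_rwlock {
    atomic_int  slow;                   /* toy rwlock: 0 free, >0 readers, -1 writer */
    atomic_bool writer_active;          /* global flag: read fast path disabled */
    atomic_bool reader_in[NR_SLOTS];    /* per-CPU "inside a read section" flag */
};

static void percpu_rwlock_init(struct percpu_rwlock *l)
{
    atomic_init(&l->slow, 0);
    atomic_init(&l->writer_active, false);
    for (int i = 0; i < NR_SLOTS; i++)
        atomic_init(&l->reader_in[i], false);
}

/* Underlying (slow path) lock: spin until no writer holds it. */
static void slow_read_lock(atomic_int *s)
{
    for (;;) {
        int v = atomic_load(s);
        if (v >= 0 && atomic_compare_exchange_weak(s, &v, v + 1))
            return;
    }
}

/* Underlying (slow path) lock: spin until there is no reader or writer. */
static void slow_write_lock(atomic_int *s)
{
    int expected = 0;
    while (!atomic_compare_exchange_weak(s, &expected, -1))
        expected = 0;
}

/* Returns true if the per-CPU fast path was taken. */
static bool percpu_read_lock(struct percpu_rwlock *l, int cpu)
{
    assert(cpu >= 0 && cpu < NR_SLOTS);
    atomic_store(&l->reader_in[cpu], true);      /* announce the reader first */
    if (!atomic_load(&l->writer_active))
        return true;                             /* fast path: no locked cmpxchg */
    atomic_store(&l->reader_in[cpu], false);     /* writer active: back out */
    slow_read_lock(&l->slow);
    return false;
}

static void percpu_read_unlock(struct percpu_rwlock *l, int cpu, bool fast)
{
    if (fast)
        atomic_store(&l->reader_in[cpu], false);
    else
        atomic_fetch_sub(&l->slow, 1);
}

static void percpu_write_lock(struct percpu_rwlock *l)
{
    slow_write_lock(&l->slow);                   /* excludes slow-path readers */
    atomic_store(&l->writer_active, true);       /* disable the fast path */
    for (int i = 0; i < NR_SLOTS; i++)           /* wait for fast-path readers */
        while (atomic_load(&l->reader_in[i]))
            ;                                    /* busy-wait */
}

static void percpu_write_unlock(struct percpu_rwlock *l)
{
    atomic_store(&l->writer_active, false);
    atomic_store(&l->slow, 0);
}
```

Note that in this sketch the writer already holds the underlying write lock while fast-path readers may still be inside their critical sections; it only proceeds once every per-CPU flag has cleared.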

Once the writer has taken the write lock and disabled the fast path, it must
poll the per-cpu variable for all CPU's which have entered the

Re: [Xen-devel] Getting the XSAVE size from userspace

2015-11-05 Thread Andrew Cooper

On 05/11/15 12:26, Razvan Cojocaru wrote:
> On 11/05/2015 01:44 PM, Andrew Cooper wrote:
>> On 05/11/15 11:35, Andrei LUTAS wrote:
>>> The use-case is the following: whenever an EPT violation is triggered
>>> inside a monitored VM, the introspection logic needs to know how many
>>> bytes were accessed (read/written). This is done by inspecting the
>>> faulting instruction and directly inferring the size, which is not
>>> straight-forward for XSAVE/XRSTOR family. Using the maximum possible
>>> size is wrong, as in any given moment the OS may or may not desire to
>>> XSAVE/XRSTOR the entire state (and thinking that the instruction tries
>>> to access more than it actually does may yield undesired effects).
>>> Therefore, the size needed for the currently enabled features of the
>>> monitored guest is required instead. Normally, it could be done by
>>> running CPUID with eax = 0xD and ecx = i, where i >= 2 and XCR0[i] is
>>> 1 (XCR0 belongs to the monitored guest), but I am unsure if using
>>> CPUID this way would be safe/desired: will Xen expose the same CPUID
>>> features, for XSAVE related functionality, on all VMs? (using CPUID
>>> with eax = 0xD and ecx = 0 would give us the needed size for the SVA,
>>> and like I said, using the maximum size would not be safe, even if
>>> it's the same across all VMs on a given host). Also, I'm unsure how
>>> this would get along with migration...
>> Hmm yes - there is no way to do this currently.
>>
>> Xen's CPUID handling for xsave related things is broken in levelling and
>> migration scenarios, which is why it is *still* disabled by default in
>> XenServer.
>>
>> I am working on fixing it, and will take this usecase into account
>> (although I think I had already included enough for this usecase to work).
>>
>> At the point of the xsave/xrstor trap, you need to know xcr0 and be
>> able to perform a cpuid instruction in the context of a target domain, to
>> make use of 0xD[0].ebx to get the "current size based on xcr0".
> So then the closest thing to what we need would be to add a size field
> to struct hvm_hw_cpu_xsave, and just assign the size variable to it in
> hvm_save_cpu_xsave_states (migration aside)?
>
> static int hvm_save_cpu_xsave_states(struct domain *d,
>                                      hvm_domain_context_t *h)
> {
>     struct vcpu *v;
>     struct hvm_hw_cpu_xsave *ctxt;
>
>     if ( !cpu_has_xsave )
>         return 0;   /* do nothing */
>
>     for_each_vcpu ( d, v )
>     {
>         unsigned int size = HVM_CPU_XSAVE_SIZE(v->arch.xcr0_accum);
>
>         if ( !xsave_enabled(v) )
>             continue;
>         if ( _hvm_init_entry(h, CPU_XSAVE_CODE, v->vcpu_id, size) )
>             return 1;
>         ctxt = (struct hvm_hw_cpu_xsave *)&h->data[h->cur];
>         h->cur += size;
>
>         ctxt->xfeature_mask = xfeature_mask;
>         ctxt->xcr0 = v->arch.xcr0;
>         ctxt->xcr0_accum = v->arch.xcr0_accum;
>         memcpy(&ctxt->save_area, v->arch.xsave_area,
>                size - offsetof(struct hvm_hw_cpu_xsave, save_area));
>     }
>
>     return 0;
> }

I don't see any difference between this pasted code and the current
hvm_save_cpu_xsave_states().  What have you changed?

You can't use this size value: it is derived from the accumulated xcr0
over the life of the VM, not the xcr0 in use at the time of the
intercepted instruction.

You also can't blindly modify the ctxt structure, or you will break
migration.

The xcr0 -> size mapping is static, and won't change going forwards. 
Your best bet is just to query each one and stash all the results.
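
As a rough sketch of "query each one and stash the results": for the standard (non-compacted) XSAVE format, the size for a given xcr0 is the end of the highest enabled state component, with a floor of 576 bytes for the legacy area plus header. The helper below assumes the per-component offset and size have already been read from CPUID leaf 0xD sub-leaves (EBX and EAX respectively) into a table; the struct and function names are hypothetical, and the table is passed in so the logic can be exercised without executing CPUID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 512-byte legacy FXSAVE region plus the 64-byte XSAVE header. */
#define XSAVE_LEGACY_AND_HEADER 576u

/* One entry per state component i, as reported by CPUID.(EAX=0xD, ECX=i). */
struct xsave_component {
    uint32_t offset;    /* EBX: offset of the component in the standard format */
    uint32_t size;      /* EAX: size of the component in bytes */
};

/*
 * Size of the standard-format XSAVE area for the given xcr0.
 * comp[] must have 64 entries; entries 0 and 1 (x87, SSE) are ignored
 * because they live in the fixed legacy region.
 */
static uint32_t xsave_size_for_xcr0(uint64_t xcr0,
                                    const struct xsave_component *comp)
{
    uint32_t size = XSAVE_LEGACY_AND_HEADER;

    assert(comp != NULL);
    for (unsigned int i = 2; i < 64; i++) {
        if (!(xcr0 & (UINT64_C(1) << i)))
            continue;                        /* component not enabled */
        uint32_t end = comp[i].offset + comp[i].size;
        if (end > size)
            size = end;
    }
    return size;
}
```

With the AVX component (bit 2) at offset 576 and size 256, an xcr0 of 0x7 gives 832 bytes, while an xcr0 of 0x3 stays at 576.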

~Andrew

Re: [Xen-devel] ovmf fail to compile

2015-11-05 Thread Hao, Xudong

> -Original Message-
> From: Wei Liu [mailto:wei.l...@citrix.com]
> Sent: Wednesday, November 4, 2015 6:19 PM
> To: Hao, Xudong 
> Cc: Wei Liu ; xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] ovmf fail to compile
> 
> On Wed, Nov 04, 2015 at 08:27:56AM +, Hao, Xudong wrote:
> > "git clean -fdx" doesn't change the error result with gcc 4.4.7. Is the
> > gcc version "gcc-4.4.?" in your Debian Jessie?
> >
> 
> Debian Jessie's gcc-4.4 has the same version 4.4.7.
> 
> As the other sub-thread suggests, can you try passing more f's to git?
> 
> A somewhat related question, are you only interested in xen-unstable branch?
> Have you tried latest OVMF from upstream? If that builds for you I can easily
> send another patch to update Config.mk again.
>

I'm busy with other urgent work today. I will try the two suggestions
above tomorrow and share the result later.

-Xudong 

Re: [Xen-devel] [PATCH V8 3/7] libxl: add pvusb API

2015-11-05 Thread George Dunlap

On Wed, Nov 4, 2015 at 6:31 AM, Chun Yan Liu  wrote:
> Ian & George, any comments?

Hey Chunyan,

I did actually spend a chunk of time looking at this last week.
Looking at the diff-of-diffs, it looks like you've addressed
everything I asked you to address.  I still want to take a longer look
at it before giving it a reviewed-by.  Unfortunately this will have to
wait until next week.

One thing that came up though in an offline discussion between IanJ
and me was that we would like you to actually address the
DEFINE_DEVICE_REMOVE_EXT code duplication issue before this is checked
in.  Let me know if you understand the request clearly; I'd be willing
to send you a patch you can fold in if that would be helpful.

IanJ said he has some more comments on the AO stuff as well.

 -George

>
 On 10/21/2015 at 05:08 PM, in message
> <1445418510-19614-4-git-send-email-cy...@suse.com>, Chunyan Liu
>  wrote:
>> Add pvusb APIs, including:
>>  - attach/detach (create/destroy) virtual usb controller.
>>  - attach/detach usb device
>>  - list usb controller and usb devices
>>  - some other helper functions
>>
>> Signed-off-by: Chunyan Liu 
>> Signed-off-by: Simon Cao 
>>
>> ---
>> changes:
>>   - update COMPARE_USB to compare ctrl and port
>>   - add check in usb_add/remove to disable non-Dom0 backends, so that
>> we need not worry about code which works on Dom0 but is not
>> compatible with a non-Dom0 backend.
>>   - define READ_SUBPATH macro within functions
>>   - do not initialize rc but give it value in each return case
>>   - libxl__strdup gc or NOGC update, internal function using gc,
>> external using NOGC.
>>   - address other comments from George and Ian J.
>>
>>  tools/libxl/Makefile |2 +-
>>  tools/libxl/libxl.c  |   53 ++
>>  tools/libxl/libxl.h  |   74 ++
>>  tools/libxl/libxl_device.c   |5 +-
>>  tools/libxl/libxl_internal.h |   18 +
>>  tools/libxl/libxl_osdeps.h   |   13 +
>>  tools/libxl/libxl_pvusb.c| 1451
>> ++
>>  tools/libxl/libxl_types.idl  |   57 ++
>>  tools/libxl/libxl_types_internal.idl |1 +
>>  tools/libxl/libxl_utils.c|   16 +
>>  tools/libxl/libxl_utils.h|5 +
>>  11 files changed, 1693 insertions(+), 2 deletions(-)
>>  create mode 100644 tools/libxl/libxl_pvusb.c
>>
>> diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
>> index c5ecec1..ef9ccd3 100644
>> --- a/tools/libxl/Makefile
>> +++ b/tools/libxl/Makefile
>> @@ -103,7 +103,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o
>> libxl_dm.o libxl_pci.o \
>>   libxl_stream_read.o libxl_stream_write.o \
>>   libxl_save_callout.o _libxl_save_msgs_callout.o \
>>   libxl_qmp.o libxl_event.o libxl_fork.o \
>> - libxl_dom_suspend.o $(LIBXL_OBJS-y)
>> + libxl_dom_suspend.o libxl_pvusb.o $(LIBXL_OBJS-y)
>>  LIBXL_OBJS += libxl_genid.o
>>  LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o
>>
>> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
>> index dacfaae..a050e8b 100644
>> --- a/tools/libxl/libxl.c
>> +++ b/tools/libxl/libxl.c
>> @@ -4218,11 +4218,54 @@ DEFINE_DEVICE_REMOVE(vtpm, destroy, 1)
>>
>>
>> /
>> **/
>>
>> +/* Macro for defining device remove/destroy functions for usbctrl */
>> +/* Following functions are defined:
>> + * libxl_device_usbctrl_remove
>> + * libxl_device_usbctrl_destroy
>> + */
>> +
>> +#define DEFINE_DEVICE_REMOVE_EXT(type, removedestroy, f)\
>> +int libxl_device_##type##_##removedestroy(libxl_ctx *ctx,   \
>> +uint32_t domid, libxl_device_##type *type,  \
>> +const libxl_asyncop_how *ao_how)\
>> +{   \
>> +AO_CREATE(ctx, domid, ao_how);  \
>> +libxl__device *device;  \
>> +libxl__ao_device *aodev;\
>> +int rc; \
>> +\
>> +GCNEW(device);  \
>> +rc = libxl__device_from_##type(gc, domid, type, device);\
>> +if (rc != 0) goto out;  \
>> +\
>> +GCNEW(aodev);   \
>> +libxl__prepare_ao_device(ao, aodev);\
>> +aodev->action = LIBXL__DEVICE_ACTION_REMOVE;\
>> +aodev->dev = device;

Re: [Xen-devel] [PATCH v7 16/32] xen/x86: allow disabling the pmtimer

2015-11-05 Thread Andrew Cooper

On 03/11/15 12:41, Jan Beulich wrote:
 On 03.11.15 at 11:57,  wrote:
>> On 03/11/15 07:21, Jan Beulich wrote:
>> On 30.10.15 at 16:36,  wrote:
 On 30/10/15 13:16, Jan Beulich wrote:
 On 30.10.15 at 13:50,  wrote:
>> El 14/10/15 a les 16.37, Jan Beulich ha escrit:
>> On 02.10.15 at 17:48,  wrote:
 Signed-off-by: Roger Pau Monné 
 Cc: Jan Beulich 
 Cc: Andrew Cooper 
 ---
 Changes since v6:
  - Return ENODEV in pmtimer_load if the timer is disabled.
  - hvm_acpi_power_button and hvm_acpi_sleep_button become noops if the
pmtimer is disabled.
>>> But how are those two features connected? I don't think you can
>>> assume absence of a PM block just because there's no PM timer.
>>> Or if you want to tie them together for now, the predicate needs
>>> to be renamed.
>>>
  - Return ENODEV if pmtimer_change_ioport is called with the pmtimer
disabled.
>>> Same here.
>> What about changing XEN_X86_EMU_PMTIMER into XEN_X86_EMU_PM, so that this
>> flag disables all PM stuff?
> Ah, right, that's a reasonable option.
 It still might be a nice idea to split them in two, given future work.

 To support hotplug properly (cpu, ram and pci), Xen needs to inject
 GPEs, which comes from part of the PM infrastructure.  To support PCI
 devices in the future without the whole PM infrastructure, it would be
 nice to keep the split.
>>> Coming back to this - I'm not sure: The hotplug aspect as you
>>> mention it should matter for Dom0 only. DomU could (and perhaps
>>> should) use a PV interface instead.
>> I disagree.
>>
>> All PVH guests should use the same mechanism; making a split between
>> dom0 and domU will only make our lives harder.
>>
>> Where reasonable, we should follow what happens on native; one of the
>> underlying points of PVH is to have less of an impact on the guest
>> side.  In some cases it is indeed nasty, but has the advantage of being
>> well understood.
> What meaning would ACPI have to a PVH DomU?

Whatever is covered in the tables provided.

For hotplug, this is at minimum a PM block which can be used to inject GPEs.

>
>>> So I'd like to suggest quite the opposite: Don't call the thing PM,
>>> but make it more general and call it ACPI. And instead of
>>> separating HPET, we might have this fall under ACPI as well, or
>>> we might have a second TIMER flag, requiring both to be set
>>> for there to be a HPET and PMTMR. This leaves open the option
>>> of Dom0 getting ACPI enabled (despite this then being "real",
>>> not emulated ACPI), but TIMER left off.
>> An HPET can exist independently of other features such as ACPI.  It
>> should have its own option.
> Without ACPI there's no defined way to discover it. Doing what
> Linux does - applying chipset knowledge - won't work on PVH either,
> because there's no emulated chipset. Which would leave scanning
> physical memory, but if there is none, none can be found.

In reality, the legacy HPET always lives at 0xfed00000, so only a single
MMIO read is required to locate one.

As for the Linux chipset behaviour, that reminds me that I need to do
something similar in Xen to deny MMIO access.  At the moment, if the
legacy HPET is not exposed in the ACPI tables, Xen doesn't find the HPET
but Linux does, and attempts to play with interrupts.  It doesn't get
very far, but the kexec environment finds itself without a timesource,
as Linux disables legacy broadcast mode.

~Andrew

[Xen-devel] [PATCH v4 9/9] libxc: create p2m list outside of kernel mapping if supported

2015-11-05 Thread Juergen Gross

In case the kernel of a new pv-domU indicates it is supporting a p2m
list outside the initial kernel mapping by specifying INIT_P2M, let
the domain builder allocate the memory for the p2m list from physical
guest memory only and map it to the address the kernel is expecting.

This will enable loading pv-domUs larger than 512 GB.
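
To give a feel for the sizes involved, here is a small worked calculation of how large the p2m list becomes. It follows the `p2m_size * sizeof_pfn` computation and the `round_pg_up()` rounding used in this patch, but the helper name `p2m_list_bytes` is hypothetical, and 4 KiB pages with 8-byte entries for a 64-bit guest are assumptions of the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_X86 12
#define PAGE_SIZE_X86  (UINT64_C(1) << PAGE_SHIFT_X86)
#define round_pg_up(addr) \
    (((addr) + PAGE_SIZE_X86 - 1) & ~(PAGE_SIZE_X86 - 1))

/*
 * Bytes needed for a p2m list covering guest_bytes of guest memory,
 * with one entry of sizeof_pfn bytes per guest page, rounded up to
 * a whole number of pages.
 */
static uint64_t p2m_list_bytes(uint64_t guest_bytes, unsigned int sizeof_pfn)
{
    assert(sizeof_pfn == 4 || sizeof_pfn == 8);
    uint64_t nr_pfns = guest_bytes >> PAGE_SHIFT_X86;
    return round_pg_up(nr_pfns * sizeof_pfn);
}
```

For a 512 GiB guest with 8-byte entries this yields 2^27 entries, i.e. a p2m list of exactly 1 GiB.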

Signed-off-by: Juergen Gross 
---
 tools/libxc/include/xc_dom.h |  1 +
 tools/libxc/xc_dom_core.c| 15 +++-
 tools/libxc/xc_dom_x86.c | 56 ++--
 3 files changed, 64 insertions(+), 8 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 7c157c3..ad8e47e 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -238,6 +238,7 @@ struct xc_dom_arch {
 char *native_protocol;
 int page_shift;
 int sizeof_pfn;
+int p2m_base_supported;
 int arch_private_size;
 
 struct xc_dom_arch *next;
diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c
index ad91b35..5d6c3ba 100644
--- a/tools/libxc/xc_dom_core.c
+++ b/tools/libxc/xc_dom_core.c
@@ -777,6 +777,7 @@ struct xc_dom_image *xc_dom_allocate(xc_interface *xch,
 dom->parms.virt_hypercall = UNSET_ADDR;
 dom->parms.virt_hv_start_low = UNSET_ADDR;
 dom->parms.elf_paddr_offset = UNSET_ADDR;
+dom->parms.p2m_base = UNSET_ADDR;
 
 dom->alloc_malloc += sizeof(*dom);
 return dom;
@@ -1096,7 +1097,11 @@ int xc_dom_build_image(struct xc_dom_image *dom)
 }
 
 /* allocate other pages */
-if ( dom->arch_hooks->alloc_p2m_list &&
+if ( !dom->arch_hooks->p2m_base_supported ||
+ dom->parms.p2m_base >= dom->parms.virt_base ||
+ (dom->parms.p2m_base & (XC_DOM_PAGE_SIZE(dom) - 1)) )
+dom->parms.p2m_base = UNSET_ADDR;
+if ( dom->arch_hooks->alloc_p2m_list && dom->parms.p2m_base == UNSET_ADDR &&
  dom->arch_hooks->alloc_p2m_list(dom) != 0 )
 goto err;
 if ( dom->arch_hooks->alloc_magic_pages(dom) != 0 )
@@ -1124,6 +1129,14 @@ int xc_dom_build_image(struct xc_dom_image *dom)
 dom->initrd_len = page_size * dom->ramdisk_seg.pages;
 }
 
+/* Allocate p2m list if outside of initial kernel mapping. */
+if ( dom->arch_hooks->alloc_p2m_list && dom->parms.p2m_base != UNSET_ADDR )
+{
+if ( dom->arch_hooks->alloc_p2m_list(dom) != 0 )
+goto err;
+dom->p2m_seg.vstart = dom->parms.p2m_base;
+}
+
 return 0;
 
  err:
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index 497aa55..147468c 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -69,6 +69,7 @@
 #define bits_to_mask(bits)   (((xen_vaddr_t)1 << (bits))-1)
 #define round_down(addr, mask)   ((addr) & ~(mask))
 #define round_up(addr, mask) ((addr) | (mask))
+#define round_pg_up(addr)  (((addr) + PAGE_SIZE_X86 - 1) & ~(PAGE_SIZE_X86 - 1))
 
 struct xc_dom_params {
 unsigned levels;
@@ -90,7 +91,7 @@ struct xc_dom_x86_mapping {
 
 struct xc_dom_image_x86 {
 unsigned n_mappings;
-#define MAPPING_MAX 1
+#define MAPPING_MAX 2
 struct xc_dom_x86_mapping maps[MAPPING_MAX];
 struct xc_dom_params *params;
 };
@@ -483,11 +484,8 @@ static int setup_pgtables_x86_64(struct xc_dom_image *dom)
 
 /*  */
 
-static int alloc_p2m_list(struct xc_dom_image *dom)
+static int alloc_p2m_list(struct xc_dom_image *dom, size_t p2m_alloc_size)
 {
-size_t p2m_alloc_size = dom->p2m_size * dom->arch_hooks->sizeof_pfn;
-
-/* allocate phys2mach table */
 if ( xc_dom_alloc_segment(dom, &dom->p2m_seg, "phys2mach",
   0, p2m_alloc_size) )
 return -1;
@@ -498,6 +496,40 @@ static int alloc_p2m_list(struct xc_dom_image *dom)
 return 0;
 }
 
+static int alloc_p2m_list_x86_32(struct xc_dom_image *dom)
+{
+size_t p2m_alloc_size = dom->p2m_size * dom->arch_hooks->sizeof_pfn;
+
+p2m_alloc_size = round_pg_up(p2m_alloc_size);
+return alloc_p2m_list(dom, p2m_alloc_size);
+}
+
+static int alloc_p2m_list_x86_64(struct xc_dom_image *dom)
+{
+struct xc_dom_image_x86 *domx86 = dom->arch_private;
+struct xc_dom_x86_mapping *map = domx86->maps + domx86->n_mappings;
+size_t p2m_alloc_size = dom->p2m_size * dom->arch_hooks->sizeof_pfn;
+xen_vaddr_t from, to;
+unsigned lvl;
+
+p2m_alloc_size = round_pg_up(p2m_alloc_size);
+if ( dom->parms.p2m_base != UNSET_ADDR )
+{
+from = dom->parms.p2m_base;
+to = from + p2m_alloc_size - 1;
+if ( count_pgtables(dom, from, to, dom->pfn_alloc_end) )
+return -1;
+
+map->area.pfn = dom->pfn_alloc_end;
+for ( lvl = 0; lvl < 4; lvl++ )
+map->lvls[lvl].pfn += p2m_alloc_size >> PAGE_SHIFT_X86;
+domx86->n_mappings++;
+p2m_alloc_size += map->area.pgtables << PAGE_SHIFT_X86;
+}
+
+return alloc_p2m_list(dom, p2m_alloc_size);
+}
+
 /*

[Xen-devel] [PATCH v4 7/9] libxc: split p2m allocation in domain builder from other magic pages

2015-11-05 Thread Juergen Gross

Carve out the p2m list allocation from the .alloc_magic_pages hook of
the domain builder in order to prepare allocating the p2m list outside
of the initial kernel mapping. This will be needed to support loading
domains with huge memory (>512 GB).

Signed-off-by: Juergen Gross 
Acked-by: Ian Campbell 
Acked-by: Wei Liu 
---
 tools/libxc/include/xc_dom.h |  1 +
 tools/libxc/xc_dom_core.c|  3 +++
 tools/libxc/xc_dom_x86.c | 11 ++-
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 2358012..7c157c3 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -221,6 +221,7 @@ struct xc_dom_arch {
 /* pagetable setup */
 int (*alloc_magic_pages) (struct xc_dom_image * dom);
 int (*alloc_pgtables) (struct xc_dom_image * dom);
+int (*alloc_p2m_list) (struct xc_dom_image * dom);
 int (*setup_pgtables) (struct xc_dom_image * dom);
 
 /* arch-specific data structs setup */
diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c
index 7b48b1f..ad91b35 100644
--- a/tools/libxc/xc_dom_core.c
+++ b/tools/libxc/xc_dom_core.c
@@ -1096,6 +1096,9 @@ int xc_dom_build_image(struct xc_dom_image *dom)
 }
 
 /* allocate other pages */
+if ( dom->arch_hooks->alloc_p2m_list &&
+ dom->arch_hooks->alloc_p2m_list(dom) != 0 )
+goto err;
 if ( dom->arch_hooks->alloc_magic_pages(dom) != 0 )
 goto err;
 if ( dom->arch_hooks->alloc_pgtables(dom) != 0 )
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index 3c6bb9c..dd448cb 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -475,7 +475,7 @@ pfn_error:
 
 /*  */
 
-static int alloc_magic_pages(struct xc_dom_image *dom)
+static int alloc_p2m_list(struct xc_dom_image *dom)
 {
 size_t p2m_alloc_size = dom->p2m_size * dom->arch_hooks->sizeof_pfn;
 
@@ -487,6 +487,13 @@ static int alloc_magic_pages(struct xc_dom_image *dom)
 if ( dom->p2m_guest == NULL )
 return -1;
 
+return 0;
+}
+
+/*  */
+
+static int alloc_magic_pages(struct xc_dom_image *dom)
+{
 /* allocate special pages */
 dom->start_info_pfn = xc_dom_alloc_page(dom, "start info");
 dom->xenstore_pfn = xc_dom_alloc_page(dom, "xenstore");
@@ -1667,6 +1674,7 @@ static struct xc_dom_arch xc_dom_32_pae = {
 .arch_private_size = sizeof(struct xc_dom_image_x86),
 .alloc_magic_pages = alloc_magic_pages,
 .alloc_pgtables = alloc_pgtables_x86_32_pae,
+.alloc_p2m_list = alloc_p2m_list,
 .setup_pgtables = setup_pgtables_x86_32_pae,
 .start_info = start_info_x86_32,
 .shared_info = shared_info_x86_32,
@@ -1684,6 +1692,7 @@ static struct xc_dom_arch xc_dom_64 = {
 .arch_private_size = sizeof(struct xc_dom_image_x86),
 .alloc_magic_pages = alloc_magic_pages,
 .alloc_pgtables = alloc_pgtables_x86_64,
+.alloc_p2m_list = alloc_p2m_list,
 .setup_pgtables = setup_pgtables_x86_64,
 .start_info = start_info_x86_64,
 .shared_info = shared_info_x86_64,
-- 
2.1.4


[Xen-devel] [PATCH v4 6/9] libxc: create unmapped initrd in domain builder if supported

2015-11-05 Thread Juergen Gross

In case the kernel of a new pv-domU indicates it is supporting an
unmapped initrd, don't waste precious virtual space for the initrd,
but allocate only guest physical memory for it.

Signed-off-by: Juergen Gross 
Acked-by: Wei Liu 
---
 tools/libxc/include/xc_dom.h |  5 +
 tools/libxc/xc_dom_core.c| 19 +--
 tools/libxc/xc_dom_x86.c |  8 
 3 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 0ba9821..2358012 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -94,6 +94,11 @@ struct xc_dom_image {
 xen_pfn_t pfn_alloc_end;
 xen_vaddr_t virt_alloc_end;
 xen_vaddr_t bsd_symtab_start;
+
+/* initrd parameters as specified in start_info page */
+unsigned long initrd_start;
+unsigned long initrd_len;
+
 unsigned int alloc_bootstack;
 xen_vaddr_t virt_pgtab_end;
 
diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c
index 3a31222..7b48b1f 100644
--- a/tools/libxc/xc_dom_core.c
+++ b/tools/libxc/xc_dom_core.c
@@ -1041,6 +1041,7 @@ static int xc_dom_build_ramdisk(struct xc_dom_image *dom)
 int xc_dom_build_image(struct xc_dom_image *dom)
 {
 unsigned int page_size;
+bool unmapped_initrd;
 
 DOMPRINTF_CALLED(dom->xch);
 
@@ -1064,11 +1065,15 @@ int xc_dom_build_image(struct xc_dom_image *dom)
 if ( dom->kernel_loader->loader(dom) != 0 )
 goto err;
 
-/* load ramdisk */
-if ( dom->ramdisk_blob )
+/* Don't load ramdisk now if no initial mapping required. */
+unmapped_initrd = dom->parms.unmapped_initrd && !dom->ramdisk_seg.vstart;
+
+if ( dom->ramdisk_blob && !unmapped_initrd )
 {
 if ( xc_dom_build_ramdisk(dom) != 0 )
 goto err;
+dom->initrd_start = dom->ramdisk_seg.vstart;
+dom->initrd_len = dom->ramdisk_seg.vend - dom->ramdisk_seg.vstart;
 }
 
 /* load devicetree */
@@ -1106,6 +,16 @@ int xc_dom_build_image(struct xc_dom_image *dom)
 if ( dom->virt_pgtab_end && xc_dom_alloc_pad(dom, dom->virt_pgtab_end) )
 return -1;
 
+/* Load ramdisk if no initial mapping required. */
+if ( dom->ramdisk_blob && unmapped_initrd )
+{
+if ( xc_dom_build_ramdisk(dom) != 0 )
+goto err;
+dom->flags |= SIF_MOD_START_PFN;
+dom->initrd_start = dom->ramdisk_seg.pfn;
+dom->initrd_len = page_size * dom->ramdisk_seg.pages;
+}
+
 return 0;
 
  err:
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index aba50df..3c6bb9c 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -663,8 +663,8 @@ static int start_info_x86_32(struct xc_dom_image *dom)
 
 if ( dom->ramdisk_blob )
 {
-start_info->mod_start = dom->ramdisk_seg.vstart;
-start_info->mod_len = dom->ramdisk_seg.vend - dom->ramdisk_seg.vstart;
+start_info->mod_start = dom->initrd_start;
+start_info->mod_len = dom->initrd_len;
 }
 
 if ( dom->cmdline )
@@ -710,8 +710,8 @@ static int start_info_x86_64(struct xc_dom_image *dom)
 
 if ( dom->ramdisk_blob )
 {
-start_info->mod_start = dom->ramdisk_seg.vstart;
-start_info->mod_len = dom->ramdisk_seg.vend - dom->ramdisk_seg.vstart;
+start_info->mod_start = dom->initrd_start;
+start_info->mod_len = dom->initrd_len;
 }
 
 if ( dom->cmdline )
-- 
2.1.4


[Xen-devel] [PATCH v4 8/9] libxc: rework of domain builder's page table handler

2015-11-05 Thread Juergen Gross

To prepare for a p2m list outside of the initial kernel mapping, rework
the domain builder's page table handler. The goal is to use common
helpers for page table allocation and setup, both for the initial
kernel page tables and for the page tables mapping the p2m list.
This is achieved by supporting multiple mapping areas. The virtual
address ranges of the individual areas must not overlap, while some
page tables of a newly added area might already be present. In
particular, the top level page table exists only once.

For now, restrict the number of mappings to 1, because the only mapping
at this point is the initial mapping created by the toolstack. This
patch should introduce no behaviour change and no guest-visible change.

Signed-off-by: Juergen Gross 
---
 tools/libxc/xc_dom_x86.c | 478 ---
 tools/libxc/xg_private.h |  39 +---
 2 files changed, 251 insertions(+), 266 deletions(-)

diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index dd448cb..497aa55 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -69,13 +70,29 @@
 #define round_down(addr, mask)   ((addr) & ~(mask))
 #define round_up(addr, mask) ((addr) | (mask))
 
-struct xc_dom_image_x86 {
-/* initial page tables */
+struct xc_dom_params {
+unsigned levels;
+xen_vaddr_t vaddr_mask;
+x86_pgentry_t lvl_prot[4];
+};
+
+struct xc_dom_x86_mapping_lvl {
+xen_vaddr_t from;
+xen_vaddr_t to;
+xen_pfn_t pfn;
 unsigned int pgtables;
-unsigned int pg_l4;
-unsigned int pg_l3;
-unsigned int pg_l2;
-unsigned int pg_l1;
+};
+
+struct xc_dom_x86_mapping {
+struct xc_dom_x86_mapping_lvl area;
+struct xc_dom_x86_mapping_lvl lvls[4];
+};
+
+struct xc_dom_image_x86 {
+unsigned n_mappings;
+#define MAPPING_MAX 1
+struct xc_dom_x86_mapping maps[MAPPING_MAX];
+struct xc_dom_params *params;
 };
 
 /* get guest IO ABI protocol */
@@ -105,102 +122,159 @@ const char *xc_domain_get_native_protocol(xc_interface *xch,
 return protocol;
 }
 
-static unsigned long
-nr_page_tables(struct xc_dom_image *dom,
-   xen_vaddr_t start, xen_vaddr_t end, unsigned long bits)
+static int count_pgtables(struct xc_dom_image *dom, xen_vaddr_t from,
+  xen_vaddr_t to, xen_pfn_t pfn)
 {
-xen_vaddr_t mask = bits_to_mask(bits);
-int tables;
+struct xc_dom_image_x86 *domx86 = dom->arch_private;
+struct xc_dom_x86_mapping *map, *map_cmp;
+xen_pfn_t pfn_end;
+xen_vaddr_t mask;
+unsigned bits;
+int l, m;
 
-if ( bits == 0 )
-return 0;  /* unused */
+if ( domx86->n_mappings == MAPPING_MAX )
+{
+xc_dom_panic(dom->xch, XC_OUT_OF_MEMORY,
+ "%s: too many mappings\n", __FUNCTION__);
+return -ENOMEM;
+}
+map = domx86->maps + domx86->n_mappings;
 
-if ( bits == (8 * sizeof(unsigned long)) )
+pfn_end = pfn + ((to - from) >> PAGE_SHIFT_X86);
+if ( pfn_end >= dom->p2m_size )
 {
-/* must be pgd, need one */
-start = 0;
-end = -1;
-tables = 1;
+xc_dom_panic(dom->xch, XC_OUT_OF_MEMORY,
+ "%s: not enough memory for initial mapping (%#"PRIpfn" > %#"PRIpfn")",
+ __FUNCTION__, pfn_end, dom->p2m_size);
+return -ENOMEM;
 }
-else
+for ( m = 0; m < domx86->n_mappings; m++ )
 {
-start = round_down(start, mask);
-end = round_up(end, mask);
-tables = ((end - start) >> bits) + 1;
+map_cmp = domx86->maps + m;
+if ( from < map_cmp->area.to && to > map_cmp->area.from )
+{
+xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
+ "%s: overlapping mappings\n", __FUNCTION__);
+return -1;
+}
 }
 
-DOMPRINTF("%s: 0x%016" PRIx64 "/%ld: 0x%016" PRIx64
-  " -> 0x%016" PRIx64 ", %d table(s)",
-  __FUNCTION__, mask, bits, start, end, tables);
-return tables;
+memset(map, 0, sizeof(*map));
+map->area.from = from & domx86->params->vaddr_mask;
+map->area.to = to & domx86->params->vaddr_mask;
+
+for ( l = domx86->params->levels - 1; l >= 0; l-- )
+{
+map->lvls[l].pfn = pfn + map->area.pgtables;
+if ( l == domx86->params->levels - 1 )
+{
+if ( domx86->n_mappings == 0 )
+{
+map->lvls[l].from = 0;
+map->lvls[l].to = domx86->params->vaddr_mask;
+map->lvls[l].pgtables = 1;
+map->area.pgtables++;
+}
+continue;
+}
+
+bits = PAGE_SHIFT_X86 + (l + 1) * PGTBL_LEVEL_SHIFT_X86;
+mask = bits_to_mask(bits);
+map->lvls[l].from = map->area.from & ~mask;
+map->lvls[l].to = map->area.to | mask;
+
+if ( domx86->params->levels == 3 && domx86->n_

[Xen-devel] [linux-3.4 test] 63567: regressions - FAIL

2015-11-05 Thread osstest service owner

flight 63567 linux-3.4 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/63567/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-rumpuserxen-i386  6 xen-boot  fail REGR. vs. 62277
 test-amd64-i386-qemuu-rhel6hvm-intel  6 xen-boot  fail REGR. vs. 62277
 test-amd64-i386-xl-qemuu-debianhvm-amd64  6 xen-boot  fail REGR. vs. 62277
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  6 xen-boot  fail REGR. vs. 62277
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm  6 xen-boot fail REGR. vs. 62277
 test-amd64-i386-xl-qemuu-ovmf-amd64  6 xen-boot   fail REGR. vs. 62277
 test-amd64-i386-xl 6 xen-boot  fail REGR. vs. 62277
 test-amd64-i386-freebsd10-amd64  6 xen-boot   fail REGR. vs. 62277
 test-amd64-amd64-xl-xsm   6 xen-boot  fail REGR. vs. 62277
 test-amd64-i386-xl-qemut-debianhvm-amd64  6 xen-boot  fail REGR. vs. 62277
 test-amd64-amd64-xl-multivcpu  6 xen-boot fail REGR. vs. 62277
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  6 xen-boot fail REGR. vs. 62277
 test-amd64-i386-qemut-rhel6hvm-intel  6 xen-boot  fail REGR. vs. 62277
 test-amd64-amd64-xl-qemuu-ovmf-amd64  6 xen-boot  fail REGR. vs. 62277
 test-amd64-amd64-xl-qemut-winxpsp3  6 xen-boot fail REGR. vs. 62277
 test-amd64-i386-xl-qemuu-winxpsp3  6 xen-boot fail REGR. vs. 62277

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-amd64-pvgrub  3 host-install(3) broken in 63294 pass in 63567
 test-amd64-i386-qemuu-rhel6hvm-amd 3 host-install(3) broken in 63294 pass in 63567
 test-amd64-amd64-xl-qemuu-winxpsp3 3 host-install(3) broken in 63294 pass in 63567
 test-amd64-i386-xl-xsm 3 host-install(3)  broken in 63310 pass in 63567
 test-amd64-amd64-xl-qcow2 3 host-install(3)  broken in 63310 pass in 63567
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 3 host-install(3) broken in 63310 pass in 63567
 test-amd64-amd64-xl-qemut-winxpsp3 3 host-install(3) broken in 63310 pass in 63567
 test-amd64-amd64-xl-credit2   3 host-install(3)  broken in 63324 pass in 63567
 test-amd64-i386-xl-raw 3 host-install(3)  broken in 63324 pass in 63567
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 3 host-install(3) broken in 63324 pass in 63567
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 3 host-install(3) broken in 63324 pass in 63567
 test-amd64-i386-qemut-rhel6hvm-amd 3 host-install(3) broken in 63324 pass in 63567
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 13 guest-localmigrate fail in 63324 pass in 63567
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 9 windows-install fail in 63485 pass in 63567
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 6 xen-boot fail pass in 63228
 test-amd64-amd64-xl-rtds  6 xen-boot fail pass in 63228
 test-amd64-amd64-i386-pvgrub  6 xen-boot fail pass in 63294
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 15 guest-localmigrate.2 fail pass in 63294
 test-amd64-i386-pair 10 xen-boot/dst_host   fail pass in 63310
 test-amd64-i386-pair  9 xen-boot/src_host   fail pass in 63310
 test-amd64-amd64-libvirt-pair 10 xen-boot/dst_host  fail pass in 63310
 test-amd64-amd64-libvirt-pair  9 xen-boot/src_host  fail pass in 63310
 test-amd64-amd64-amd64-pvgrub  6 xen-boot   fail pass in 63324
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  6 xen-boot   fail pass in 63324
 test-amd64-amd64-xl-qcow2 6 xen-boot fail pass in 63338
 test-amd64-i386-libvirt-pair 10 xen-boot/dst_host   fail pass in 63374
 test-amd64-i386-libvirt-pair  9 xen-boot/src_host   fail pass in 63374
 test-amd64-amd64-pair 10 xen-boot/dst_host   fail pass in 63404
 test-amd64-amd64-pair 9 xen-boot/src_host   fail pass in 63404
 test-amd64-i386-qemut-rhel6hvm-amd  6 xen-boot  fail pass in 63485

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-libvirt-xsm   6 xen-boot  fail REGR. vs. 62277
 test-amd64-amd64-libvirt-xsm  6 xen-boot  fail REGR. vs. 62277
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 15 guest-localmigrate.2 fail in 63228 blocked in 62277
 test-amd64-amd64-rumpuserxen-amd64 15 rumpuserxen-demo-xenstorels/xenstorels.repeat fail like 62277
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail like 62277
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail like 62277
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop  fail like 62277

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail in 63228 never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-xl-pvh-

[Xen-devel] [PATCH v4 5/9] libxc: use domain builder architecture private data for x86 pv domains

2015-11-05 Thread Juergen Gross

Move data that is private to the x86 domain builder into the private
data section. Remove extra_pages, as it is not used anywhere.

Signed-off-by: Juergen Gross 
Acked-by: Wei Liu 
---
 tools/libxc/include/xc_dom.h |  8 
 tools/libxc/xc_dom_x86.c | 48 +---
 2 files changed, 32 insertions(+), 24 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 09f73cd..0ba9821 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -94,15 +94,7 @@ struct xc_dom_image {
 xen_pfn_t pfn_alloc_end;
 xen_vaddr_t virt_alloc_end;
 xen_vaddr_t bsd_symtab_start;
-
-/* initial page tables */
-unsigned int pgtables;
-unsigned int pg_l4;
-unsigned int pg_l3;
-unsigned int pg_l2;
-unsigned int pg_l1;
 unsigned int alloc_bootstack;
-unsigned int extra_pages;
 xen_vaddr_t virt_pgtab_end;
 
 /* other state info */
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index ea32b00..aba50df 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -69,6 +69,15 @@
 #define round_down(addr, mask)   ((addr) & ~(mask))
 #define round_up(addr, mask) ((addr) | (mask))
 
+struct xc_dom_image_x86 {
+/* initial page tables */
+unsigned int pgtables;
+unsigned int pg_l4;
+unsigned int pg_l3;
+unsigned int pg_l2;
+unsigned int pg_l1;
+};
+
 /* get guest IO ABI protocol */
 const char *xc_domain_get_native_protocol(xc_interface *xch,
   uint32_t domid)
@@ -132,9 +141,9 @@ static int alloc_pgtables(struct xc_dom_image *dom, int pae,
 int pages, extra_pages;
 xen_vaddr_t try_virt_end;
 xen_pfn_t try_pfn_end;
+struct xc_dom_image_x86 *domx86 = dom->arch_private;
 
 extra_pages = dom->alloc_bootstack ? 1 : 0;
-extra_pages += dom->extra_pages;
 extra_pages += 128; /* 512kB padding */
 pages = extra_pages;
 for ( ; ; )
@@ -152,29 +161,30 @@ static int alloc_pgtables(struct xc_dom_image *dom, int pae,
 return -ENOMEM;
 }
 
-dom->pg_l4 =
+domx86->pg_l4 =
 nr_page_tables(dom, dom->parms.virt_base, try_virt_end, l4_bits);
-dom->pg_l3 =
+domx86->pg_l3 =
 nr_page_tables(dom, dom->parms.virt_base, try_virt_end, l3_bits);
-dom->pg_l2 =
+domx86->pg_l2 =
 nr_page_tables(dom, dom->parms.virt_base, try_virt_end, l2_bits);
-dom->pg_l1 =
+domx86->pg_l1 =
 nr_page_tables(dom, dom->parms.virt_base, try_virt_end, l1_bits);
 if (pae && try_virt_end < 0xc0000000)
 {
 DOMPRINTF("%s: PAE: extra l2 page table for l3#3",
   __FUNCTION__);
-dom->pg_l2++;
+domx86->pg_l2++;
 }
-dom->pgtables = dom->pg_l4 + dom->pg_l3 + dom->pg_l2 + dom->pg_l1;
-pages = dom->pgtables + extra_pages;
+domx86->pgtables = domx86->pg_l4 + domx86->pg_l3 +
+   domx86->pg_l2 + domx86->pg_l1;
+pages = domx86->pgtables + extra_pages;
 if ( dom->virt_alloc_end + pages * PAGE_SIZE_X86 <= try_virt_end + 1 )
 break;
 }
 dom->virt_pgtab_end = try_virt_end + 1;
 
 return xc_dom_alloc_segment(dom, &dom->pgtables_seg, "page tables", 0,
-dom->pgtables * PAGE_SIZE_X86);
+domx86->pgtables * PAGE_SIZE_X86);
 }
 
 /*  */
@@ -262,9 +272,10 @@ static xen_pfn_t move_l3_below_4G(struct xc_dom_image *dom,
 
 static int setup_pgtables_x86_32_pae(struct xc_dom_image *dom)
 {
+struct xc_dom_image_x86 *domx86 = dom->arch_private;
 xen_pfn_t l3pfn = dom->pgtables_seg.pfn;
-xen_pfn_t l2pfn = l3pfn + dom->pg_l3;
-xen_pfn_t l1pfn = l2pfn + dom->pg_l2;
+xen_pfn_t l2pfn = l3pfn + domx86->pg_l3;
+xen_pfn_t l1pfn = l2pfn + domx86->pg_l2;
 l3_pgentry_64_t *l3tab;
 l2_pgentry_64_t *l2tab = NULL;
 l1_pgentry_64_t *l1tab = NULL;
@@ -373,10 +384,11 @@ static int alloc_pgtables_x86_64(struct xc_dom_image *dom)
 
 static int setup_pgtables_x86_64(struct xc_dom_image *dom)
 {
+struct xc_dom_image_x86 *domx86 = dom->arch_private;
 xen_pfn_t l4pfn = dom->pgtables_seg.pfn;
-xen_pfn_t l3pfn = l4pfn + dom->pg_l4;
-xen_pfn_t l2pfn = l3pfn + dom->pg_l3;
-xen_pfn_t l1pfn = l2pfn + dom->pg_l2;
+xen_pfn_t l3pfn = l4pfn + domx86->pg_l4;
+xen_pfn_t l2pfn = l3pfn + domx86->pg_l3;
+xen_pfn_t l1pfn = l2pfn + domx86->pg_l2;
 l4_pgentry_64_t *l4tab = xc_dom_pfn_to_ptr(dom, l4pfn, 1);
 l3_pgentry_64_t *l3tab = NULL;
 l2_pgentry_64_t *l2tab = NULL;
@@ -619,6 +631,7 @@ static int alloc_magic_pages_hvm(struct xc_dom_image *dom)
 
 static int start_info_x86_32(struct xc_dom_image *dom)
 {
+struct xc_dom_image_x86 *domx86 = dom->arch_private;
 start_info_x86_32_t *star

[Xen-devel] [PATCH v4 4/9] libxc: introduce domain builder architecture specific data

2015-11-05 Thread Juergen Gross

Reorganize struct xc_dom_image to contain a pointer to domain builder
architecture specific private data. This separates the architecture or
domain type specific data from the generally used data.

The new area is allocated as soon as the domain type is known.
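
As a rough illustration of the lookup described above, the following
sketch selects the hooks matching the guest type and allocates a zeroed
private area of the size the hooks ask for. The struct and function
names are trimmed-down, hypothetical stand-ins for struct xc_dom_arch,
struct xc_dom_image and xc_dom_set_arch_hooks(); calloc() is used here
in place of the malloc()+memset() pair in the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced stand-in for struct xc_dom_arch. */
struct arch_hooks {
    const char *guest_type;
    size_t arch_private_size;   /* 0: no private area needed */
    struct arch_hooks *next;
};

/* Hypothetical, reduced stand-in for struct xc_dom_image. */
struct dom_image {
    const char *guest_type;
    struct arch_hooks *arch_hooks;
    void *arch_private;
};

/* Returns 0 on success, -1 if no hooks match or allocation fails. */
int set_arch_hooks(struct dom_image *dom, struct arch_hooks *first)
{
    for (struct arch_hooks *h = first; h != NULL; h = h->next) {
        if (strcmp(h->guest_type, dom->guest_type) != 0)
            continue;
        if (h->arch_private_size) {
            /* Zero-initialised private area, owned by the image. */
            dom->arch_private = calloc(1, h->arch_private_size);
            if (dom->arch_private == NULL)
                return -1;
        }
        dom->arch_hooks = h;
        return 0;
    }
    return -1;
}

/* Counterpart of the free() added to xc_dom_release(). */
void release_image(struct dom_image *dom)
{
    free(dom->arch_private);    /* free(NULL) is a no-op */
    dom->arch_private = NULL;
    dom->arch_hooks = NULL;
}
```

Returning an int instead of the hooks pointer lets the caller tell
"unknown guest type" and "out of memory" apart from success with a
single check.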

Signed-off-by: Juergen Gross 
Acked-by: Wei Liu 
---
 stubdom/grub/kexec.c |  6 +-
 tools/libxc/include/xc_dom.h |  6 +-
 tools/libxc/xc_dom_core.c| 27 +++
 3 files changed, 29 insertions(+), 10 deletions(-)

diff --git a/stubdom/grub/kexec.c b/stubdom/grub/kexec.c
index 2300318..8fd9ff9 100644
--- a/stubdom/grub/kexec.c
+++ b/stubdom/grub/kexec.c
@@ -272,7 +272,11 @@ void kexec(void *kernel, long kernel_size, void *module, long module_size, char
 #endif
 
 /* equivalent of xc_dom_mem_init */
-dom->arch_hooks = xc_dom_find_arch_hooks(xc_handle, dom->guest_type);
+if (xc_dom_set_arch_hooks(dom)) {
+grub_printf("xc_dom_set_arch_hooks failed\n");
+errnum = ERR_EXEC_FORMAT;
+goto out;
+}
 dom->total_pages = start_info.nr_pages;
 
 /* equivalent of arch_setup_meminit */
diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 19d45f4..09f73cd 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -175,6 +175,9 @@ struct xc_dom_image {
 unsigned int *vnode_to_pnode;
 unsigned int nr_vnodes;
 
+/* domain type/architecture specific data */
+void *arch_private;
+
 /* kernel loader */
 struct xc_dom_arch *arch_hooks;
 /* allocate up to pfn_alloc_end */
@@ -237,6 +240,7 @@ struct xc_dom_arch {
 char *native_protocol;
 int page_shift;
 int sizeof_pfn;
+int arch_private_size;
 
 struct xc_dom_arch *next;
 };
@@ -290,7 +294,7 @@ int xc_dom_devicetree_mem(struct xc_dom_image *dom, const void *mem,
   size_t memsize);
 
 int xc_dom_parse_image(struct xc_dom_image *dom);
-struct xc_dom_arch *xc_dom_find_arch_hooks(xc_interface *xch, char *guest_type);
+int xc_dom_set_arch_hooks(struct xc_dom_image *dom);
 int xc_dom_build_image(struct xc_dom_image *dom);
 int xc_dom_update_guest_p2m(struct xc_dom_image *dom);
 
diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c
index 74de3c3..3a31222 100644
--- a/tools/libxc/xc_dom_core.c
+++ b/tools/libxc/xc_dom_core.c
@@ -710,19 +710,30 @@ void xc_dom_register_arch_hooks(struct xc_dom_arch *hooks)
 first_hook = hooks;
 }
 
-struct xc_dom_arch *xc_dom_find_arch_hooks(xc_interface *xch, char *guest_type)
+int xc_dom_set_arch_hooks(struct xc_dom_image *dom)
 {
 struct xc_dom_arch *hooks = first_hook;
 
 while (  hooks != NULL )
 {
-if ( !strcmp(hooks->guest_type, guest_type))
-return hooks;
+if ( !strcmp(hooks->guest_type, dom->guest_type) )
+{
+if ( hooks->arch_private_size )
+{
+dom->arch_private = malloc(hooks->arch_private_size);
+if ( dom->arch_private == NULL )
+return -1;
+memset(dom->arch_private, 0, hooks->arch_private_size);
+dom->alloc_malloc += hooks->arch_private_size;
+}
+dom->arch_hooks = hooks;
+return 0;
+}
 hooks = hooks->next;
 }
-xc_dom_panic(xch, XC_INVALID_KERNEL,
- "%s: not found (type %s)", __FUNCTION__, guest_type);
-return NULL;
+xc_dom_panic(dom->xch, XC_INVALID_KERNEL,
+ "%s: not found (type %s)", __FUNCTION__, dom->guest_type);
+return -1;
 }
 
 /*  */
@@ -734,6 +745,7 @@ void xc_dom_release(struct xc_dom_image *dom)
 if ( dom->phys_pages )
 xc_dom_unmap_all(dom);
 xc_dom_free_all(dom);
+free(dom->arch_private);
 free(dom);
 }
 
@@ -924,8 +936,7 @@ int xc_dom_mem_init(struct xc_dom_image *dom, unsigned int mem_mb)
 unsigned int page_shift;
 xen_pfn_t nr_pages;
 
-dom->arch_hooks = xc_dom_find_arch_hooks(dom->xch, dom->guest_type);
-if ( dom->arch_hooks == NULL )
+if ( xc_dom_set_arch_hooks(dom) )
 {
 xc_dom_panic(dom->xch, XC_INTERNAL_ERROR, "%s: arch hooks not set",
  __FUNCTION__);
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v4 0/9] libxc: support building large pv-domains

2015-11-05 Thread Juergen Gross

The Xen hypervisor supports starting a dom0 with large memory (up to
the TB range) by not including the initrd and p2m list in the initial
kernel mapping. The p2m list in particular can grow larger than the
available virtual space in the initial mapping.

The kernel being started indicates support for each feature via
ELF notes.

This series enables the domain builder in libxc to do the same as the
hypervisor, so that huge pv-domUs can be started via xl.

Unmapped initrd is supported for 64 and 32 bit domains; omitting the
p2m from the initial kernel mapping is possible for 64 bit domains only.

Tested with:
- 32 bit domU (kernel not supporting unmapped initrd)
- 32 bit domU (kernel supporting unmapped initrd)
- 1 GB 64 bit domU (kernel supporting unmapped initrd, not p2m)
- 1 GB 64 bit domU (kernel supporting unmapped initrd and p2m)
- 900GB 64 bit domU (kernel supporting unmapped initrd and p2m)
- HVM domU

Changes in v4:
- updated patch 1 as suggested by Wei Liu (comment and variable name)
- modify comment in patch 6 as suggested by Wei Liu
- rework of patch 8 reducing line count by nearly 100
- added some additional plausibility checks to patch 8 as suggested by
  Wei Liu
- renamed round_pg() to round_pg_up() in patch 9 as suggested by Wei Liu

Changes in v3:
- Rebased the complete series to new staging (hvm builder patches by
  Roger Pau Monne)
- Removed old patch 1 as it broke stubdom build
- Introduced new Patch 1 to make allocation of guest memory more clear
  regarding virtual/physical memory allocation (requested by Ian Campbell)
- Change name of flag to indicate support of unmapped initrd in patch 2
  (requested by Ian Campbell)
- Introduce new patches 3, 4, 5 ("rename domain builder count_pgtables to
  alloc_pgtables", "introduce domain builder architecture specific data",
  "use domain builder architecture private data for x86 pv domains") to
  assist later page table work
- don't fiddle with initrd virtual address in patch 6 (was patch 3 in v2),
  add explicit initrd parameters for start_info in struct xc_dom_image
  instead (requested by Ian Campbell)
- Introduce new patch 8 ("rework of domain builder's page table handler")
  to be able to use common helpers for unmapped p2m list (requested by
  Ian Campbell)
- use now known segment size in pages for p2m list in patch 9 (was patch
  5 in v2) instead of fiddling with segment end address (requested by
  Ian Campbell)
- split alloc_p2m_list() in patch 9 (was patch 5 in v2) to 32/64 bit
  variants (requested by Ian Campbell)

Changes in v2:
- patch 2 has been removed as it has been applied already
- introduced new patch 2 as suggested by Ian Campbell: add a flag
  indicating support of an unmapped initrd to the parsed elf data of
  the elf_dom_parms structure
- updated patch description of patch 3 as requested by Ian Campbell


Juergen Gross (9):
  libxc: reorganize domain builder guest memory allocator
  xen: add generic flag to elf_dom_parms indicating support of unmapped
initrd
  libxc: rename domain builder count_pgtables to alloc_pgtables
  libxc: introduce domain builder architecture specific data
  libxc: use domain builder architecture private data for x86 pv domains
  libxc: create unmapped initrd in domain builder if supported
  libxc: split p2m allocation in domain builder from other magic pages
  libxc: rework of domain builder's page table handler
  libxc: create p2m list outside of kernel mapping if supported

 stubdom/grub/kexec.c   |  12 +-
 tools/libxc/include/xc_dom.h   |  34 +--
 tools/libxc/xc_dom_arm.c   |   6 +-
 tools/libxc/xc_dom_core.c  | 180 
 tools/libxc/xc_dom_x86.c   | 563 +
 tools/libxc/xg_private.h   |  39 +--
 xen/arch/x86/domain_build.c|   4 +-
 xen/common/libelf/libelf-dominfo.c |   3 +
 xen/include/xen/libelf.h   |   1 +
 9 files changed, 490 insertions(+), 352 deletions(-)

-- 
2.1.4


[Xen-devel] [PATCH v4 2/9] xen: add generic flag to elf_dom_parms indicating support of unmapped initrd

2015-11-05 Thread Juergen Gross

The kernel of a domain indicates support of an unmapped initrd via ELF
notes. So that the tools do not have to inspect raw ELF data to detect
this, add a flag to the parsed data area indicating that the kernel
supports the feature.

Switch the hypervisor domain builder over to using this flag.
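
A minimal sketch of the idea, with hypothetical names throughout (struct
parms, the note_type enum, parse_note() and mapping_end() are made up for
illustration; the real note numbers live in Xen's public elfnote.h): the
note is evaluated once while parsing, and later code only looks at the
boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096u     /* assumption: 4 KiB pages */

/* Hypothetical subset of the parsed ELF parameters. */
struct parms {
    uint64_t virt_kend;
    bool unmapped_initrd;
};

/* Hypothetical note identifiers, not the real numeric values. */
enum note_type { NOTE_INIT_P2M, NOTE_MOD_START_PFN };

uint64_t round_pgup(uint64_t addr)
{
    return (addr + PAGE_SIZE - 1) & ~(uint64_t)(PAGE_SIZE - 1);
}

/* Parsing step: translate the raw note value into a flag. */
void parse_note(struct parms *p, enum note_type type, uint64_t val)
{
    if (type == NOTE_MOD_START_PFN)
        p->unmapped_initrd = !!val;
}

/* Consumer: end of the initial mapping, with or without the initrd. */
uint64_t mapping_end(const struct parms *p, uint64_t initrd_len)
{
    uint64_t vend = round_pgup(p->virt_kend);

    if (!p->unmapped_initrd)
        vend += round_pgup(initrd_len);
    return vend;
}
```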

Cc: andrew.coop...@citrix.com
Cc: jbeul...@suse.com
Cc: k...@xen.org
Suggested-by: Ian Campbell 
Signed-off-by: Juergen Gross 
Acked-by: Jan Beulich 
---
 xen/arch/x86/domain_build.c| 4 ++--
 xen/common/libelf/libelf-dominfo.c | 3 +++
 xen/include/xen/libelf.h   | 1 +
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
index c2ef87a..d02dc4b 100644
--- a/xen/arch/x86/domain_build.c
+++ b/xen/arch/x86/domain_build.c
@@ -353,7 +353,7 @@ static unsigned long __init compute_dom0_nr_pages(
 
 vstart = parms->virt_base;
 vend = round_pgup(parms->virt_kend);
-if ( !parms->elf_notes[XEN_ELFNOTE_MOD_START_PFN].data.num )
+if ( !parms->unmapped_initrd )
 vend += round_pgup(initrd_len);
 end = vend + nr_pages * sizeof_long;
 
@@ -1037,7 +1037,7 @@ int __init construct_dom0(
 v_start  = parms.virt_base;
 vkern_start  = parms.virt_kstart;
 vkern_end= parms.virt_kend;
-if ( parms.elf_notes[XEN_ELFNOTE_MOD_START_PFN].data.num )
+if ( parms.unmapped_initrd )
 {
 vinitrd_start  = vinitrd_end = 0;
 vphysmap_start = round_pgup(vkern_end);
diff --git a/xen/common/libelf/libelf-dominfo.c b/xen/common/libelf/libelf-dominfo.c
index 3de1c23..c9243e4 100644
--- a/xen/common/libelf/libelf-dominfo.c
+++ b/xen/common/libelf/libelf-dominfo.c
@@ -190,6 +190,9 @@ elf_errorstatus elf_xen_parse_note(struct elf_binary *elf,
 case XEN_ELFNOTE_INIT_P2M:
 parms->p2m_base = val;
 break;
+case XEN_ELFNOTE_MOD_START_PFN:
+parms->unmapped_initrd = !!val;
+break;
 case XEN_ELFNOTE_PADDR_OFFSET:
 parms->elf_paddr_offset = val;
 break;
diff --git a/xen/include/xen/libelf.h b/xen/include/xen/libelf.h
index de788c7..6da4cc0 100644
--- a/xen/include/xen/libelf.h
+++ b/xen/include/xen/libelf.h
@@ -423,6 +423,7 @@ struct elf_dom_parms {
 char loader[16];
 enum xen_pae_type pae;
 bool bsd_symtab;
+bool unmapped_initrd;
 uint64_t virt_base;
 uint64_t virt_entry;
 uint64_t virt_hypercall;
-- 
2.1.4


[Xen-devel] [PATCH] tools: pygrub: if partition table is empty, try treating as a whole disk

2015-11-05 Thread Ian Campbell

pygrub (in identify_disk_image()) detects a DOS style partition table
via the presence of the 0xaa55 signature at the end of the first
sector of the disk.

However this signature is also present in whole-disk configurations
when there is an MBR on the disk. Many filesystems (e.g. ext[234])
include leading padding in their on disk format specifically to enable
this.

So if we think we have a DOS partition table but do not find any
actual partition table entries we may as well try looking at it as a
whole disk image. Worst case is we probe and find there isn't anything
there.
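
The fallback can be pictured roughly as follows. pygrub itself is
Python; this is only a simplified C sketch of the logic, assuming a plain
MBR with four primary entries and ignoring the special partition types
pygrub handles. has_dos_signature() and partition_offsets() are
hypothetical names, and the sketch folds the signature check and the
table walk into one function, whereas the patch only touches
get_partition_offsets().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE   512
#define PTABLE_OFFSET 446   /* first of four 16-byte partition entries */
#define MAX_PARTS     4

/* True if the first sector ends with the bytes 0x55 0xaa. */
int has_dos_signature(const uint8_t *sector)
{
    return sector[510] == 0x55 && sector[511] == 0xaa;
}

/*
 * Fill offs[] with byte offsets at which a file system may start and
 * return how many were found. Simplification: only primary entries
 * with a non-zero type byte count as partitions.
 */
size_t partition_offsets(const uint8_t *sector, uint64_t offs[MAX_PARTS])
{
    size_t n = 0;

    if (has_dos_signature(sector)) {
        for (int i = 0; i < MAX_PARTS; i++) {
            const uint8_t *e = sector + PTABLE_OFFSET + 16 * i;
            /* Start LBA is a little-endian 32-bit value at offset 8. */
            uint32_t lba = (uint32_t)e[8] | (uint32_t)e[9] << 8 |
                           (uint32_t)e[10] << 16 | (uint32_t)e[11] << 24;

            if (e[4] != 0)      /* type 0 marks an unused entry */
                offs[n++] = (uint64_t)lba * SECTOR_SIZE;
        }
    }
    /* No usable entries: treat the image as a whole disk. */
    if (n == 0)
        offs[n++] = 0;
    return n;
}
```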

This was reported by Sjors Gielen in Debian bug #745419. The fix was
inspired by a patch by Adi Kriegisch in
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=745419#27

Tested by genext2fs'ing my /boot into a new raw image (works) and
then:
   dd if=/usr/lib/grub/i386-pc/g2ldr.mbr of=img conv=notrunc bs=512 count=1

to add an MBR (with 0xaa55 signature) to it, which after this patch
also works.

Signed-off-by: Ian Campbell 
Cc: 745419-forwar...@bugs.debian.org
---
 tools/pygrub/src/pygrub | 5 +
 1 file changed, 5 insertions(+)

diff --git a/tools/pygrub/src/pygrub b/tools/pygrub/src/pygrub
index e4aedda..40f9584 100755
--- a/tools/pygrub/src/pygrub
+++ b/tools/pygrub/src/pygrub
@@ -156,6 +156,11 @@ def get_partition_offsets(file):
 else:
 part_offs.append(offset)
 
+# We thought we had a DOS partition table, but didn't find any
+# actual valid partition entries. This can happen because an MBR
+# (e.g. grub's) may contain the same signature.
+if not part_offs: part_offs = [0]
+
 return part_offs
 
 class GrubLineEditor(curses.textpad.Textbox):
-- 
2.1.4


[Xen-devel] [PATCH v4 3/9] libxc: rename domain builder count_pgtables to alloc_pgtables

2015-11-05 Thread Juergen Gross

Rename the count_pgtables hook of the domain builder to alloc_pgtables
and do the allocation of the guest memory for page tables inside this
hook. This will remove the need for accessing the x86 specific pgtables
member of struct xc_dom_image in the generic domain builder code.
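
The effect on the generic code can be sketched like this. All names
below are hypothetical stand-ins (struct arch_ops for struct xc_dom_arch,
alloc_segment() for xc_dom_alloc_segment(), and a fixed table count in
place of the real calculation); the point is only that the caller no
longer needs to know how many tables were counted.

```c
#include <assert.h>
#include <stddef.h>

struct image;

/* Hypothetical, reduced hook table. */
struct arch_ops {
    int (*alloc_pgtables)(struct image *img);
};

struct image {
    const struct arch_ops *ops;
    size_t pgtable_pages;   /* stand-in for the page table segment */
};

/* Stand-in for the segment allocator: just records the reservation. */
int alloc_segment(struct image *img, size_t pages)
{
    img->pgtable_pages = pages;
    return 0;
}

/* Guest that starts with paging disabled: nothing to do. */
int alloc_pgtables_none(struct image *img)
{
    (void)img;
    return 0;
}

/* Guest that needs tables: count and allocate inside the same hook. */
int alloc_pgtables_fixed(struct image *img)
{
    size_t count = 4;   /* placeholder; real code derives this from the layout */

    return alloc_segment(img, count);
}

/* Generic code: one unconditional call, no arch-specific fields read. */
int build_image(struct image *img)
{
    return img->ops->alloc_pgtables(img) != 0 ? -1 : 0;
}
```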

Signed-off-by: Juergen Gross 
Acked-by: Wei Liu 
---
 tools/libxc/include/xc_dom.h |  2 +-
 tools/libxc/xc_dom_arm.c |  6 +++---
 tools/libxc/xc_dom_core.c| 11 ++-
 tools/libxc/xc_dom_x86.c | 26 +-
 4 files changed, 23 insertions(+), 22 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 68d6848..19d45f4 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -220,7 +220,7 @@ void xc_dom_register_loader(struct xc_dom_loader *loader);
 struct xc_dom_arch {
 /* pagetable setup */
 int (*alloc_magic_pages) (struct xc_dom_image * dom);
-int (*count_pgtables) (struct xc_dom_image * dom);
+int (*alloc_pgtables) (struct xc_dom_image * dom);
 int (*setup_pgtables) (struct xc_dom_image * dom);
 
 /* arch-specific data structs setup */
diff --git a/tools/libxc/xc_dom_arm.c b/tools/libxc/xc_dom_arm.c
index 397eef0..d9a6371 100644
--- a/tools/libxc/xc_dom_arm.c
+++ b/tools/libxc/xc_dom_arm.c
@@ -49,7 +49,7 @@ const char *xc_domain_get_native_protocol(xc_interface *xch,
  * arm guests are hybrid and start off with paging disabled, therefore no
  * pagetables and nothing to do here.
  */
-static int count_pgtables_arm(struct xc_dom_image *dom)
+static int alloc_pgtables_arm(struct xc_dom_image *dom)
 {
 DOMPRINTF_CALLED(dom->xch);
 return 0;
@@ -534,7 +534,7 @@ static struct xc_dom_arch xc_dom_32 = {
 .page_shift = PAGE_SHIFT_ARM,
 .sizeof_pfn = 8,
 .alloc_magic_pages = alloc_magic_pages,
-.count_pgtables = count_pgtables_arm,
+.alloc_pgtables = alloc_pgtables_arm,
 .setup_pgtables = setup_pgtables_arm,
 .start_info = start_info_arm,
 .shared_info = shared_info_arm,
@@ -550,7 +550,7 @@ static struct xc_dom_arch xc_dom_64 = {
 .page_shift = PAGE_SHIFT_ARM,
 .sizeof_pfn = 8,
 .alloc_magic_pages = alloc_magic_pages,
-.count_pgtables = count_pgtables_arm,
+.alloc_pgtables = alloc_pgtables_arm,
 .setup_pgtables = setup_pgtables_arm,
 .start_info = start_info_arm,
 .shared_info = shared_info_arm,
diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c
index a14d477..74de3c3 100644
--- a/tools/libxc/xc_dom_core.c
+++ b/tools/libxc/xc_dom_core.c
@@ -1082,15 +1082,8 @@ int xc_dom_build_image(struct xc_dom_image *dom)
 /* allocate other pages */
 if ( dom->arch_hooks->alloc_magic_pages(dom) != 0 )
 goto err;
-if ( dom->arch_hooks->count_pgtables )
-{
-if ( dom->arch_hooks->count_pgtables(dom) != 0 )
-goto err;
-if ( (dom->pgtables > 0) &&
- (xc_dom_alloc_segment(dom, &dom->pgtables_seg, "page tables", 0,
-   dom->pgtables * page_size) != 0) )
-goto err;
-}
+if ( dom->arch_hooks->alloc_pgtables(dom) != 0 )
+goto err;
 if ( dom->alloc_bootstack )
 dom->bootstack_pfn = xc_dom_alloc_page(dom, "boot stack");
 DOMPRINTF("%-20s: virt_alloc_end : 0x%" PRIx64 "",
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index ed43c28..ea32b00 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -126,7 +126,7 @@ nr_page_tables(struct xc_dom_image *dom,
 return tables;
 }
 
-static int count_pgtables(struct xc_dom_image *dom, int pae,
+static int alloc_pgtables(struct xc_dom_image *dom, int pae,
   int l4_bits, int l3_bits, int l2_bits, int l1_bits)
 {
 int pages, extra_pages;
@@ -172,7 +172,9 @@ static int count_pgtables(struct xc_dom_image *dom, int pae,
 break;
 }
 dom->virt_pgtab_end = try_virt_end + 1;
-return 0;
+
+return xc_dom_alloc_segment(dom, &dom->pgtables_seg, "page tables", 0,
+dom->pgtables * PAGE_SIZE_X86);
 }
 
 /*  */
@@ -182,9 +184,9 @@ static int count_pgtables(struct xc_dom_image *dom, int pae,
 #define L2_PROT (_PAGE_PRESENT|_PAGE_RW|_PAGE_ACCESSED|_PAGE_DIRTY|_PAGE_USER)
 #define L3_PROT (_PAGE_PRESENT)
 
-static int count_pgtables_x86_32_pae(struct xc_dom_image *dom)
+static int alloc_pgtables_x86_32_pae(struct xc_dom_image *dom)
 {
-return count_pgtables(dom, 1, 0, 32,
+return alloc_pgtables(dom, 1, 0, 32,
   L3_PAGETABLE_SHIFT_PAE, L2_PAGETABLE_SHIFT_PAE);
 }
 
@@ -355,9 +357,9 @@ pfn_error:
 /*  */
 /* x86_64 pagetables*/
 
-static int count_pgtables_x86_64(struct xc_dom_image *dom)
+static int alloc_pgtables_x86_64(struct xc_dom_image *dom)

Re: [Xen-devel] Hackathon 2016 Location Preferences

2015-11-05 Thread Wei Liu

On Thu, Nov 05, 2015 at 03:21:18PM +, Lars Kurth wrote:
> Hi all,
> 
> I wanted to do quick straw-poll regarding Hackathon Locations for next
> year. Before I do this though, I wanted to let you know that the 2016
> Developer Summit will most likely be in Berlin in October (I am in the
> process of finalising space, budget and contract details which will
> need to be approved by the Advisory Board).
> 
> We do have two options for a Hackathon: China (either Shanghai,
> Hangzhou or Beijing - details TBC) and Cambridge, UK. We are still in
> the early planning phase and the budget for the Hackathon has not yet
> been approved. 
> 

I lived in Hangzhou for a while -- it is a nice city in my humble
opinion. :-)

Wei.

> Do let me know of your preference, and I will see whether I can work
> with the vendor(s) who are willing to host the 2016 Hackathon and
> choose a location, which suits a majority of developers.
> 
> Best Regards Lars
> 
> 
> ___ Xen-devel mailing list
> Xen-devel@lists.xen.org http://lists.xen.org/xen-devel

Re: [Xen-devel] ovmf fail to compile

2015-11-05 Thread Wei Liu

On Thu, Nov 05, 2015 at 02:07:26PM +, Hao, Xudong wrote:
> > -Original Message-
> > From: Wei Liu [mailto:wei.l...@citrix.com]
> > Sent: Wednesday, November 4, 2015 6:19 PM
> > To: Hao, Xudong 
> > Cc: Wei Liu ; xen-devel@lists.xen.org
> > Subject: Re: [Xen-devel] ovmf fail to compile
> > 
> > On Wed, Nov 04, 2015 at 08:27:56AM +, Hao, Xudong wrote:
> > > "git clean -fdx" doesn't change the error result with gcc 4.4.7. Gcc 
> > > version is
> > "gcc-4.4.?" in Debian Jessie of yours?
> > >
> > 
> > Debian Jessie's gcc-4.4 has the same version 4.4.7.
> > 
> > As the other sub-thread suggests, can you try passing more f's to git?
> > 
> > A somewhat related question, are you only interested in xen-unstable branch?
> > Have you tried latest OVMF from upstream? If that builds for you I can 
> > easily
> > send another patch to update Config.mk again.
> >
> 
> I'm busy on other urgent today, will try the two above tomorrow and share the 
> result later.
> 

No worries.

Wei.

> -Xudong 

Re: [Xen-devel] [Xen-API] Hackathon 2016 Location Preferences

2015-11-05 Thread Anil Madhavapeddy

On 5 Nov 2015, at 15:21, Lars Kurth  wrote:
> 
> Hi all,
> 
> I wanted to do quick straw-poll regarding Hackathon Locations for next year. 
> Before I do this though, I wanted to let you know that the 2016 Developer 
> Summit will most likely be in Berlin in October (I am in the process of 
> finalising space, budget and contract details which will need to be approved 
> by the Advisory Board).
> 
> We do have two options for a Hackathon: China (either Shanghai, Hangzhou or 
> Beijing - details TBC) and Cambridge, UK. We are still in the early planning 
> phase and the budget for the Hackathon has not yet been approved. 

A lot of unikernel hackers could show up if it's in Cambridge, but 
unfortunately not if it's in China (despite being a much more exciting 
location!).

best,
Anil


Re: [Xen-devel] [win-pv-devel] Hackathon 2016 Location Preferences

2015-11-05 Thread Paul Durrant

> -Original Message-
> From: win-pv-devel-boun...@lists.xenproject.org [mailto:win-pv-devel-
> boun...@lists.xenproject.org] On Behalf Of Lars Kurth
> Sent: 05 November 2015 15:21
> To: Xen-devel; mirageos-devel; xen-...@lists.xenproject.org; Win-pv-
> de...@lists.xenproject.org
> Subject: [win-pv-devel] Hackathon 2016 Location Preferences
> 
> Hi all,
> 
> I wanted to do a quick straw poll regarding Hackathon locations for next year.
> Before I do this though, I wanted to let you know that the 2016 Developer
> Summit will most likely be in Berlin in October (I am in the process of
> finalising space, budget and contract details which will need to be approved
> by the Advisory Board).
> 
> We do have two options for a Hackathon: China (either Shanghai, Hangzhou
> or Beijing - details TBC) and Cambridge, UK. We are still in the early
> planning phase and the budget for the Hackathon has not yet been approved.
> 
> Do let me know of your preference, and I will see whether I can work with
> the vendor(s) who are willing to host the 2016 Hackathon and choose a
> location, which suits a majority of developers.
> 

Since this year's was in Shanghai, my vote would be for Cambridge.

  Paul

> Best Regards
> Lars
> 
> 
> ___
> win-pv-devel mailing list
> win-pv-de...@lists.xenproject.org
> http://lists.xenproject.org/cgi-bin/mailman/listinfo/win-pv-devel


Re: [Xen-devel] [PATCH v11 2/5] missing include asm/paravirt.h in cputime.c

2015-11-05 Thread Peter Zijlstra



How can this be missing? Things compile fine now, right? So please
better explain why we do this change.


[Xen-devel] [PATCH v4 1/9] libxc: reorganize domain builder guest memory allocator

2015-11-05 Thread Juergen Gross

Guest memory allocation in the domain builder of libxc is done via
virtual addresses only. In order to support preallocated areas that are
not virtually mapped, reorganize the memory allocator to keep track of
allocated pages globally and in allocated segments.

This requires an interface change of the allocate callback of the
domain builder which currently is using the last mapped virtual
address as a parameter. This is no problem as the only user of this
callback is stubdom/grub/kexec.c using this virtual address to
calculate the last used pfn.

Signed-off-by: Juergen Gross 
---
 stubdom/grub/kexec.c |   6 +--
 tools/libxc/include/xc_dom.h |  13 +++---
 tools/libxc/xc_dom_core.c| 107 ---
 3 files changed, 79 insertions(+), 47 deletions(-)

diff --git a/stubdom/grub/kexec.c b/stubdom/grub/kexec.c
index 0b2f4f3..2300318 100644
--- a/stubdom/grub/kexec.c
+++ b/stubdom/grub/kexec.c
@@ -100,9 +100,9 @@ static void do_exchange(struct xc_dom_image *dom, xen_pfn_t 
target_pfn, xen_pfn_
 dom->p2m_host[target_pfn] = source_mfn;
 }
 
-int kexec_allocate(struct xc_dom_image *dom, xen_vaddr_t up_to)
+int kexec_allocate(struct xc_dom_image *dom)
 {
-unsigned long new_allocated = (up_to - dom->parms.virt_base) / PAGE_SIZE;
+unsigned long new_allocated = dom->pfn_alloc_end - dom->rambase_pfn;
 unsigned long i;
 
 pages = realloc(pages, new_allocated * sizeof(*pages));
@@ -319,8 +319,6 @@ void kexec(void *kernel, long kernel_size, void *module, 
long module_size, char
 
 /* Make sure the bootstrap page table does not RW-map any of our current
  * page table frames */
-kexec_allocate(dom, dom->virt_pgtab_end);
-
 if ( (rc = xc_dom_update_guest_p2m(dom))) {
 grub_printf("xc_dom_update_guest_p2m returned %d\n", rc);
 errnum = ERR_BOOT_FAILURE;
diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index ccc5926..68d6848 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -29,6 +29,7 @@ struct xc_dom_seg {
 xen_vaddr_t vstart;
 xen_vaddr_t vend;
 xen_pfn_t pfn;
+xen_pfn_t pages;
 };
 
 struct xc_dom_mem {
@@ -90,6 +91,7 @@ struct xc_dom_image {
 xen_pfn_t xenstore_pfn;
 xen_pfn_t shared_info_pfn;
 xen_pfn_t bootstack_pfn;
+xen_pfn_t pfn_alloc_end;
 xen_vaddr_t virt_alloc_end;
 xen_vaddr_t bsd_symtab_start;
 
@@ -175,8 +177,8 @@ struct xc_dom_image {
 
 /* kernel loader */
 struct xc_dom_arch *arch_hooks;
-/* allocate up to virt_alloc_end */
-int (*allocate) (struct xc_dom_image * dom, xen_vaddr_t up_to);
+/* allocate up to pfn_alloc_end */
+int (*allocate) (struct xc_dom_image * dom);
 
 /* Container type (HVM or PV). */
 enum {
@@ -360,14 +362,11 @@ static inline void *xc_dom_seg_to_ptr_pages(struct 
xc_dom_image *dom,
   struct xc_dom_seg *seg,
   xen_pfn_t *pages_out)
 {
-xen_vaddr_t segsize = seg->vend - seg->vstart;
-unsigned int page_size = XC_DOM_PAGE_SIZE(dom);
-xen_pfn_t pages = (segsize + page_size - 1) / page_size;
 void *retval;
 
-retval = xc_dom_pfn_to_ptr(dom, seg->pfn, pages);
+retval = xc_dom_pfn_to_ptr(dom, seg->pfn, seg->pages);
 
-*pages_out = retval ? pages : 0;
+*pages_out = retval ? seg->pages : 0;
 return retval;
 }
 
diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c
index fbe4464..a14d477 100644
--- a/tools/libxc/xc_dom_core.c
+++ b/tools/libxc/xc_dom_core.c
@@ -535,56 +535,75 @@ void *xc_dom_pfn_to_ptr_retcount(struct xc_dom_image 
*dom, xen_pfn_t pfn,
 return phys->ptr;
 }
 
-int xc_dom_alloc_segment(struct xc_dom_image *dom,
- struct xc_dom_seg *seg, char *name,
- xen_vaddr_t start, xen_vaddr_t size)
+static int xc_dom_chk_alloc_pages(struct xc_dom_image *dom, char *name,
+  xen_pfn_t pages)
 {
 unsigned int page_size = XC_DOM_PAGE_SIZE(dom);
-xen_pfn_t pages = (size + page_size - 1) / page_size;
-xen_pfn_t pfn;
-void *ptr;
 
-if ( start == 0 )
-start = dom->virt_alloc_end;
+if ( pages > dom->total_pages || /* multiple test avoids overflow probs */
+ dom->pfn_alloc_end - dom->rambase_pfn > dom->total_pages ||
+ pages > dom->total_pages - dom->pfn_alloc_end + dom->rambase_pfn )
+{
+xc_dom_panic(dom->xch, XC_OUT_OF_MEMORY,
+ "%s: segment %s too large (0x%"PRIpfn" > "
+ "0x%"PRIpfn" - 0x%"PRIpfn" pages)", __FUNCTION__, name,
+ pages, dom->total_pages,
+ dom->pfn_alloc_end - dom->rambase_pfn);
+return -1;
+}
+
+dom->pfn_alloc_end += pages;
+dom->virt_alloc_end += pages * page_size;
+
+return 0;
+}
 
-if ( start & (page_size - 1) )
+static int xc_dom_alloc_pad(struct xc_dom_image *dom, xen_vaddr_t boundary)
+{
+

Re: [Xen-devel] [PATCH] x86/PoD: tighten conditions for checking super page

2015-11-05 Thread Jan Beulich

>>> On 02.11.15 at 17:29,  wrote:
> * steal_for_cache may now be wrong.  I realize that since now ram == 0
> that all the subsequent "steal_for_cache" expressions will end up as
> "false" anyway, but leaving invariants in an invalid state is sort of
> asking for trouble.
> 
> I'd prefer you just update steal_for_cache; but if not, at least leave a
> comment there saying that it may be wrong and why it doesn't matter.

I've just done the other things, but I don't think steal_for_cache
can have changed at this point: p2m_pod_cache_add() increments
p2m->pod.count by the same value by which
p2m_pod_zero_check_superpage() bumps p2m->pod.entry_count
right after having called p2m_pod_cache_add(). I could leave a
comment or ASSERT() to that effect, unless I'm overlooking
something.

Jan


Re: [Xen-devel] [PATCH v11 1/5] xen: move xen_setup_runstate_info and get_runstate_snapshot to drivers/xen/time.c

2015-11-05 Thread Mark Rutland

Hi,

> +static u64 get64(const u64 *p)
> +{
> + u64 ret;
> +
> + if (BITS_PER_LONG < 64) {
> + u32 *p32 = (u32 *)p;
> + u32 h, l;
> +
> + /*
> +  * Read high then low, and then make sure high is
> +  * still the same; this will only loop if low wraps
> +  * and carries into high.
> +  * XXX some clean way to make this endian-proof?
> +  */
> + do {
> + h = p32[1];
> + barrier();
> + l = p32[0];
> + barrier();
> + } while (p32[1] != h);

I realise this is simply a move of existing code, but it may be better
to instead have:

do {
h = READ_ONCE(p32[1]);
l = READ_ONCE(p32[0]);
} while (READ_ONCE(p32[1]) != h);

Which ensures that each load is a single access (though it almost
certainly would be anyway), and prevents the compiler from having to
reload any other memory locations (which the current barrier() usage
forces).
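
A minimal userspace sketch of the suggested loop, assuming a little-endian layout (low word first) as in the quoted code. READ_ONCE_U32 and read_split_u64 are hypothetical stand-ins, not kernel APIs; the volatile cast only approximates what READ_ONCE() asks of the compiler and implies no CPU memory barrier:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for READ_ONCE() on a 32-bit value: the volatile
 * cast makes the compiler emit exactly one load per use. */
#define READ_ONCE_U32(x) (*(const volatile uint32_t *)&(x))

/* Hypothetical helper: read a 64-bit counter stored as two 32-bit halves
 * (halves[0] = low word, halves[1] = high word) while a writer may be
 * incrementing it concurrently. */
static uint64_t read_split_u64(const uint32_t halves[2])
{
    uint32_t h, l;

    assert(halves != NULL);

    /* Read high, then low, then re-check high. The loop repeats only if
     * the high word changed in between, i.e. low wrapped and carried. */
    do {
        h = READ_ONCE_U32(halves[1]);
        l = READ_ONCE_U32(halves[0]);
    } while (READ_ONCE_U32(halves[1]) != h);

    return ((uint64_t)h << 32) | l;
}
```

With no concurrent writer the loop body runs exactly once and the two halves are simply combined.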

> +
> + ret = (((u64)h) << 32) | l;
> + } else
> + ret = *p;

Likewise, this would be better as READ_ONCE(*p), to force a single
access.

> +
> + return ret;
> +}

> + do {
> + state_time = get64(&state->state_entry_time);
> + barrier();
> + *res = *state;
> + barrier();

You can also have:

*res = READ_ONCE(*state);

That will handle the barriers implicitly.

Thanks,
Mark.

> + } while (get64(&state->state_entry_time) != state_time);
> +}


[Xen-devel] [PATCH 0/2] wallclock time on arm

2015-11-05 Thread Stefano Stabellini

Hi all,

this small series enables wallclock time on arm; it consists mostly of
code movement from x86 to common code.


Stefano Stabellini (2):
  xen: move wallclock functions from x86 to common
  arm: export platform_op XENPF_settime

 xen/arch/arm/Makefile |1 +
 xen/arch/arm/domain.c |3 ++
 xen/arch/arm/platform_hypercall.c |   62 
 xen/arch/arm/time.c   |5 --
 xen/arch/arm/traps.c  |1 +
 xen/arch/x86/time.c   |   92 +---
 xen/common/time.c |   94 +
 xen/include/xsm/dummy.h   |   12 ++---
 xen/include/xsm/xsm.h |   13 ++---
 9 files changed, 175 insertions(+), 108 deletions(-)
 create mode 100644 xen/arch/arm/platform_hypercall.c

Cheers,

Stefano
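
For readers following the code movement: the do_settime() helper moved by patch 1/2 derives the wallclock base by subtracting the system time from the requested time and splitting the result into seconds and nanoseconds. A small standalone sketch of that arithmetic, assuming all inputs are expressed in nanoseconds where noted; struct wc_base and wallclock_base are hypothetical names, not Xen symbols:

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical container for the wallclock base (time at system_time 0). */
struct wc_base {
    uint64_t sec;
    uint32_t nsec;
};

/* Compute the wallclock base from a requested time (secs, nsecs) and the
 * system time in nanoseconds at which that request was sampled. */
static struct wc_base wallclock_base(uint64_t secs, uint32_t nsecs,
                                     uint64_t system_time_ns)
{
    uint64_t x = secs * NSEC_PER_SEC + nsecs;
    struct wc_base wc;

    /* This sketch assumes the requested time is not before boot. */
    assert(x >= system_time_ns);
    x -= system_time_ns;

    wc.sec = x / NSEC_PER_SEC;
    wc.nsec = (uint32_t)(x % NSEC_PER_SEC);
    return wc;
}
```

For example, a request of 1000.25 s sampled at a system time of 10.75 s yields a base of 989 s and 500000000 ns.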


[Xen-devel] [PATCH 2/2] arm: export platform_op XENPF_settime

2015-11-05 Thread Stefano Stabellini

Call update_domain_wallclock_time at domain initialization.

Signed-off-by: Stefano Stabellini 
Signed-off-by: Ian Campbell 
---
 xen/arch/arm/Makefile |1 +
 xen/arch/arm/domain.c |3 ++
 xen/arch/arm/platform_hypercall.c |   62 +
 xen/arch/arm/traps.c  |1 +
 xen/include/xsm/dummy.h   |   12 +++
 xen/include/xsm/xsm.h |   13 
 6 files changed, 80 insertions(+), 12 deletions(-)
 create mode 100644 xen/arch/arm/platform_hypercall.c

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 1ef39f7..240aa29 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -23,6 +23,7 @@ obj-y += percpu.o
 obj-y += guestcopy.o
 obj-y += physdev.o
 obj-y += platform.o
+obj-y += platform_hypercall.o
 obj-y += setup.o
 obj-y += bootfdt.o
 obj-y += time.o
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index b2bfc7d..ac9b1b3 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -742,6 +742,9 @@ int arch_set_info_guest(
 v->arch.ttbr1 = ctxt->ttbr1;
 v->arch.ttbcr = ctxt->ttbcr;
 
+if ( v->vcpu_id == 0 )
+update_domain_wallclock_time(v->domain);
+
 v->is_initialised = 1;
 
 if ( ctxt->flags & VGCF_online )
diff --git a/xen/arch/arm/platform_hypercall.c 
b/xen/arch/arm/platform_hypercall.c
new file mode 100644
index 000..f60d7b3
--- /dev/null
+++ b/xen/arch/arm/platform_hypercall.c
@@ -0,0 +1,62 @@
+/**
+ * platform_hypercall.c
+ * 
+ * Hardware platform operations. Intended for use by domain-0 kernel.
+ * 
+ * Copyright (c) 2015, Citrix
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+DEFINE_SPINLOCK(xenpf_lock);
+
+long do_platform_op(XEN_GUEST_HANDLE_PARAM(xen_platform_op_t) u_xenpf_op)
+{
+long ret;
+struct xen_platform_op curop, *op = &curop;
+
+if ( copy_from_guest(op, u_xenpf_op, 1) )
+return -EFAULT;
+
+if ( op->interface_version != XENPF_INTERFACE_VERSION )
+return -EACCES;
+
+ret = xsm_platform_op(XSM_PRIV, op->cmd);
+if ( ret )
+return ret;
+
+spin_lock(&xenpf_lock);
+
+switch ( op->cmd )
+{
+case XENPF_settime32:
+do_settime(op->u.settime32.secs,
+   op->u.settime32.nsecs,
+   op->u.settime32.system_time);
+break;
+
+case XENPF_settime64:
+if ( likely(!op->u.settime64.mbz) )
+do_settime(op->u.settime64.secs,
+   op->u.settime64.nsecs,
+   op->u.settime64.system_time);
+else
+ret = -EINVAL;
+break;
+
+default:
+ret = -ENOSYS;
+break;
+}
+
+spin_unlock(&xenpf_lock);
+return ret;
+}
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 9d2bd6a..c49bd3f 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1233,6 +1233,7 @@ static arm_hypercall_t arm_hypercall_table[] = {
 HYPERCALL(hvm_op, 2),
 HYPERCALL(grant_table_op, 3),
 HYPERCALL(multicall, 2),
+HYPERCALL(platform_op, 1),
 HYPERCALL_ARM(vcpu_op, 3),
 };
 
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index 9fe372c..aec5a9b 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -583,6 +583,12 @@ static XSM_INLINE int xsm_mem_sharing(XSM_DEFAULT_ARG 
struct domain *d)
 return xsm_default_action(action, current->domain, d);
 }
 #endif
+ 
+static XSM_INLINE int xsm_platform_op(XSM_DEFAULT_ARG uint32_t op)
+{
+XSM_ASSERT_ACTION(XSM_PRIV);
+return xsm_default_action(action, current->domain, NULL);
+}
 
 #ifdef CONFIG_X86
 static XSM_INLINE int xsm_do_mca(XSM_DEFAULT_VOID)
@@ -639,12 +645,6 @@ static XSM_INLINE int xsm_apic(XSM_DEFAULT_ARG struct 
domain *d, int cmd)
 return xsm_default_action(action, d, NULL);
 }
 
-static XSM_INLINE int xsm_platform_op(XSM_DEFAULT_ARG uint32_t op)
-{
-XSM_ASSERT_ACTION(XSM_PRIV);
-return xsm_default_action(action, current->domain, NULL);
-}
-
 static XSM_INLINE int xsm_machine_memory_map(XSM_DEFAULT_VOID)
 {
 XSM_ASSERT_ACTION(XSM_PRIV);
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index ba3caed..f48cf60 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -164,6 +164,8 @@ struct xsm_operations {
 int (*mem_sharing) (struct domain *d);
 #endif
 
+int (*platform_op) (uint32_t cmd);
+
 #ifdef CONFIG_X86
 int (*do_mca) (void);
 int (*shadow_control) (struct domain *d, uint32_t op);
@@ -175,7 +177,6 @@ struct xsm_operations {
 int (*mem_sharing_op) (struct domain *d, struct domain *cd, int op);
 int (*apic) (struct domain *d, int cmd);
 int (*memtype) (uint32_t access);
-int (*platform_op) (uint32_t cmd);
 int (*machine_memory_map) (void);
 int (*domain_memory_map) (struct domain *d);
 #define XSM_MMU_UPDATE_READ  1
@@ -6

[Xen-devel] [PATCH 1/2] xen: move wallclock functions from x86 to common

2015-11-05 Thread Stefano Stabellini

Remove dummy arm implementation of wallclock_time.
Use shared_info() in common code rather than x86-ism to access it.

Signed-off-by: Stefano Stabellini 
Signed-off-by: Ian Campbell 
---
 xen/arch/arm/time.c |5 ---
 xen/arch/x86/time.c |   92 +
 xen/common/time.c   |   94 +++
 3 files changed, 95 insertions(+), 96 deletions(-)

diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c
index 5ded30c..6207615 100644
--- a/xen/arch/arm/time.c
+++ b/xen/arch/arm/time.c
@@ -280,11 +280,6 @@ void domain_set_time_offset(struct domain *d, int64_t 
time_offset_seconds)
 /* XXX update guest visible wallclock time */
 }
 
-struct tm wallclock_time(uint64_t *ns)
-{
-return (struct tm) { 0 };
-}
-
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index bbb7e6c..764d7dc 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -47,9 +47,6 @@ string_param("clocksource", opt_clocksource);
 unsigned long __read_mostly cpu_khz;  /* CPU clock frequency in kHz. */
 DEFINE_SPINLOCK(rtc_lock);
 unsigned long pit0_ticks;
-static unsigned long wc_sec; /* UTC time at last 'time update'. */
-static unsigned int wc_nsec;
-static DEFINE_SPINLOCK(wc_lock);
 
 struct cpu_time {
 u64 local_tsc_stamp;
@@ -900,37 +897,6 @@ void force_update_vcpu_system_time(struct vcpu *v)
 __update_vcpu_system_time(v, 1);
 }
 
-void update_domain_wallclock_time(struct domain *d)
-{
-uint32_t *wc_version;
-unsigned long sec;
-
-spin_lock(&wc_lock);
-
-wc_version = &shared_info(d, wc_version);
-*wc_version = version_update_begin(*wc_version);
-wmb();
-
-sec = wc_sec + d->time_offset_seconds;
-if ( likely(!has_32bit_shinfo(d)) )
-{
-d->shared_info->native.wc_sec= sec;
-d->shared_info->native.wc_nsec   = wc_nsec;
-d->shared_info->native.wc_sec_hi = sec >> 32;
-}
-else
-{
-d->shared_info->compat.wc_sec = sec;
-d->shared_info->compat.wc_nsec= wc_nsec;
-d->shared_info->compat.arch.wc_sec_hi = sec >> 32;
-}
-
-wmb();
-*wc_version = version_update_end(*wc_version);
-
-spin_unlock(&wc_lock);
-}
-
 static void update_domain_rtc(void)
 {
 struct domain *d;
@@ -988,27 +954,6 @@ int cpu_frequency_change(u64 freq)
 return 0;
 }
 
-/* Set clock to  after 00:00:00 UTC, 1 January, 1970. */
-void do_settime(unsigned long secs, unsigned int nsecs, u64 system_time_base)
-{
-u64 x;
-u32 y;
-struct domain *d;
-
-x = SECONDS(secs) + nsecs - system_time_base;
-y = do_div(x, 10);
-
-spin_lock(&wc_lock);
-wc_sec  = x;
-wc_nsec = y;
-spin_unlock(&wc_lock);
-
-rcu_read_lock(&domlist_read_lock);
-for_each_domain ( d )
-update_domain_wallclock_time(d);
-rcu_read_unlock(&domlist_read_lock);
-}
-
 /* Per-CPU communication between rendezvous IRQ and softirq handler. */
 struct cpu_calibration {
 u64 local_tsc_stamp;
@@ -1608,25 +1553,6 @@ void send_timer_event(struct vcpu *v)
 send_guest_vcpu_virq(v, VIRQ_TIMER);
 }
 
-/* Return secs after 00:00:00 localtime, 1 January, 1970. */
-unsigned long get_localtime(struct domain *d)
-{
-return wc_sec + (wc_nsec + NOW()) / 10ULL 
-+ d->time_offset_seconds;
-}
-
-/* Return microsecs after 00:00:00 localtime, 1 January, 1970. */
-uint64_t get_localtime_us(struct domain *d)
-{
-return (SECONDS(wc_sec + d->time_offset_seconds) + wc_nsec + NOW())
-   / 1000UL;
-}
-
-unsigned long get_sec(void)
-{
-return wc_sec + (wc_nsec + NOW()) / 10ULL;
-}
-
 /* "cmos_utc_offset" is the difference between UTC time and CMOS time. */
 static long cmos_utc_offset; /* in seconds */
 
@@ -1635,7 +1561,7 @@ int time_suspend(void)
 if ( smp_processor_id() == 0 )
 {
 cmos_utc_offset = -get_cmos_time();
-cmos_utc_offset += (wc_sec + (wc_nsec + NOW()) / 10ULL);
+cmos_utc_offset += get_sec();
 kill_timer(&calibration_timer);
 
 /* Sync platform timer stamps. */
@@ -1715,22 +1641,6 @@ int hwdom_pit_access(struct ioreq *ioreq)
 return 0;
 }
 
-struct tm wallclock_time(uint64_t *ns)
-{
-uint64_t seconds, nsec;
-
-if ( !wc_sec )
-return (struct tm) { 0 };
-
-seconds = NOW() + SECONDS(wc_sec) + wc_nsec;
-nsec = do_div(seconds, 10);
-
-if ( ns )
-*ns = nsec;
-
-return gmtime(seconds);
-}
-
 /*
  * PV SoftTSC Emulation.
  */
diff --git a/xen/common/time.c b/xen/common/time.c
index 29fdf52..306c5dc 100644
--- a/xen/common/time.c
+++ b/xen/common/time.c
@@ -16,7 +16,13 @@
  */
 
 #include 
+#include 
+#include 
+#include 
 #include 
+#include 
+#include 
+
 
 /* Nonzero if YEAR is a leap year (every 4 years,
except every 100th isn't, and every 400th is).  */
@@ -34,6 +40,10 @@ const unsigned short int __mon_lengths[2][12] = {
 #define SECS_PER_HOUR (60 * 60)
 #define

Re: [Xen-devel] [PATCH v11 5/5] xen/arm: account for stolen ticks

2015-11-05 Thread Mark Rutland

>  static void xen_percpu_init(void)
>  {
>   struct vcpu_register_vcpu_info info;
> @@ -104,6 +120,8 @@ static void xen_percpu_init(void)
>   BUG_ON(err);
>   per_cpu(xen_vcpu, cpu) = vcpup;
>  
> + xen_setup_runstate_info(cpu);

Does the runstate memory area get unregistered when a kernel tears
things down, or is kexec somehow inhibited for xen guests?

I couldn't spot either happening, but I may have missed it.

Mark.

> +
>  after_register_vcpu_info:
>   enable_percpu_irq(xen_events_irq, 0);
>   put_cpu();
> @@ -271,6 +289,9 @@ static int __init xen_guest_init(void)
>  
>   register_cpu_notifier(&xen_cpu_notifier);
>  
> + pv_time_ops.steal_clock = xen_stolen_accounting;
> + static_key_slow_inc(&paravirt_steal_enabled);
> +
>   return 0;
>  }
>  early_initcall(xen_guest_init);
> -- 
> 1.7.10.4
> 
> 
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 


[Xen-devel] Design doc of adding ACPI support for arm64 on Xen - version 6

2015-11-05 Thread Shannon Zhao

This document is going to explain the design details of Xen booting with
ACPI on ARM. Any comments are welcome.

Changes v5->v6:
* add a new node "uefi" under /hypervisor to pass UEFI information to
  Dom0 instead of the nodes under /chosen.
* change creation of MADT table, get the information from
  domain->arch.vgic struct
* Reuse grant table region which will not be used by Dom0 when booting
  through ACPI to store the new created ACPI tables.

Changes v4->v5:
* change the description of section 4 to make it more generic
* place EFI and ACPI tables at non-RAM space of Dom0

Changes v3->v4:
* add explanation for minimal DT and the properties
* drop "linux," prefix of the properties
* add explanation for the event channel flag
* create RSDP table since the "xsdt_physical_address" is changed
* since hypervisor_id, introduced by ACPI 6.0, is used to notify Dom0 of
  the hypervisor ID, limit the minimum supported ACPI version for Xen on
  ARM to 6.0.

Changes v2->v3:
* remove the two HVM_PARAMs for grant table and let linux kernel use
  xlated_setup_gnttab_pages() to setup grant table.
* don't modify GTDT table
* add definition of event-channel interrupt flag
* state that route all Xen unused interrupt to Dom0
* state that reusing existing PCI bus_notifier for PCI devices MMIO
  mapping

Booting Xen itself with ACPI is similar to booting the Linux kernel, except
that Xen doesn't parse the DSDT table. So I'll skip this part and focus on
how Xen prepares ACPI tables for Dom0 and how Xen passes them to Dom0.

1. Create minimal DT to pass required information to Dom0
--
When booting in UEFI mode on ARM64, Xen needs to pass some UEFI
information to Dom0. The necessary information is the address of the EFI
System table and of the EFI Memory Descriptor table, the size of the EFI
Memory Descriptor table, the size of an EFI Memory Descriptor, and the
version of the EFI Memory Descriptor. Xen passes this information through
the "uefi" node under /hypervisor of this minimal DT. Dom0 should parse
this DT to get the Xen UEFI information in the same way the Linux kernel
gets normal UEFI information. Also, it should check whether the DT contains
only the /hypervisor and /chosen nodes to know whether it boots with DT or
ACPI.

In addition, Dom0 should parse the DT to learn whether it runs on the Xen
hypervisor; if so, it should execute a Xen-specific UEFI routine to
initialize UEFI.

An example of the minimal DT:
/ {
#address-cells = <2>;
#size-cells = <2>;
hypervisor {
compatible = "xen,xen-4.3", "xen,xen";
reg = <0 0xb000 0 0x2>;  /* Only need for booting
without ACPI */
interrupts = <1 15 0xf08>; /* Only need for booting without ACPI */
uefi {
xen,uefi-system-table = <0x>;
xen,uefi-mmap-start = <0x>;
xen,uefi-mmap-size = <0x>;
xen,uefi-mmap-desc-size = <0x>;
xen,uefi-mmap-desc-ver = <0x>;
};
};

chosen {
bootargs = "kernel=Image console=hvc0 earlycon=pl011,0x1c09
root=/dev/vda2 rw rootfstype=ext4 init=/bin/sh acpi=force";
linux,initrd-start = <0x>;
linux,initrd-end = <0x>;
};
};

For details look at (this will be updated by a patch to the Linux kernel)
https://github.com/torvalds/linux/blob/master/Documentation/devicetree/bindings/arm/xen.txt
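
To illustrate why the descriptor size is passed separately from the map size: a consumer is expected to step through the memory map using the reported descriptor size as the stride, because it may be larger than the structure the consumer knows about. A minimal sketch, assuming the descriptor layout from the UEFI specification and 4 KiB EFI pages; struct efi_mem_desc and efi_conventional_bytes are hypothetical names, and this is not Linux or Xen code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EFI_CONVENTIONAL_MEMORY 7u  /* EfiConventionalMemory in the UEFI spec */
#define EFI_PAGE_SHIFT 12           /* EFI pages are 4 KiB */

/* Field layout of EFI_MEMORY_DESCRIPTOR as the UEFI spec defines it. */
struct efi_mem_desc {
    uint32_t type;
    uint32_t pad;
    uint64_t phys_addr;
    uint64_t virt_addr;
    uint64_t num_pages;
    uint64_t attribute;
};

/* Sum the bytes of conventional (usable RAM) memory in a memory map.
 * mmap/mmap_size/desc_size correspond to the values a consumer would get
 * from the uefi-mmap-start, uefi-mmap-size and uefi-mmap-desc-size
 * properties (after mapping the table). */
static uint64_t efi_conventional_bytes(const void *mmap, size_t mmap_size,
                                       size_t desc_size)
{
    const unsigned char *p = mmap;
    uint64_t bytes = 0;
    size_t off;

    /* The stride must be at least as large as the fields we read. */
    assert(desc_size >= sizeof(struct efi_mem_desc));

    for (off = 0; off + desc_size <= mmap_size; off += desc_size) {
        struct efi_mem_desc d;

        /* Copy out to avoid alignment assumptions about the buffer. */
        memcpy(&d, p + off, sizeof(d));
        if (d.type == EFI_CONVENTIONAL_MEMORY)
            bytes += d.num_pages << EFI_PAGE_SHIFT;
    }
    return bytes;
}
```

Stepping by sizeof(struct efi_mem_desc) instead of desc_size would misparse every entry after the first whenever the firmware, or here Xen, reports a larger descriptor.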

2. Copy and change some EFI and ACPI tables
---
a) Create EFI_SYSTEM_TABLE table
Create a new EFI System table. Copy the table header from host original
EFI System table. Change the value of HeaderSize, CRC32 and Revision
fields in this EFI System table header. Assign new values for
FirmwareVendor and FirmwareRevision fields of EFI System table. Create
one ConfigurationTable and assign the value of VendorGuid field to
ACPI_20_TABLE_GUID, the value of VendorTable field to the address of
ACPI RSDP table. This EFI System Table will be passed to Dom0 through
the property "uefi-system-table" in the above minimal DT. So Dom0 could
get ACPI root table address through the ConfigurationTable.

b) Create EFI_MEMORY_DESCRIPTOR table
Xen needs to tell Dom0 where the RAM regions are. Add the memory start and
size information of Dom0 to this table. It is passed to Dom0 through the
properties "uefi-mmap-start", "uefi-mmap-size", "uefi-mmap-desc-size"
and "uefi-mmap-desc-ver" of the minimal DT. Dom0 then gets the memory
information from this EFI table.

c) Create FADT table
First copy the contents of the host FADT table to the newly created FADT
table. Then change the value of arm_boot_flags to enable PSCI and HVC.

d) Create MADT table
The MADT table needs to be changed to restrict the number of vCPUs.
First copy the contents of the host MADT table, except the interrupt
controller structures, to the newly created MADT table. For GICv2, it needs
to add dom0_max_vcpus number of GICC entries and one GICD entry. For
GICv3, it needs to add one GICD

Re: [Xen-devel] [PATCH 2/2] arm: export platform_op XENPF_settime

2015-11-05 Thread David Vrabel

On 05/11/15 16:57, Stefano Stabellini wrote:
> +case XENPF_settime32:
> +do_settime(op->u.settime32.secs,
> +   op->u.settime32.nsecs,
> +   op->u.settime32.system_time);
> +break;

I don't think you want to provide this hypercall -- only provide the
XENPF_settime64 one.

David



Re: [Xen-devel] [PATCH RFC] vmalloc/vzalloc: Add memflags parameter.

2015-11-05 Thread Jan Beulich

>>> On 02.11.15 at 18:12,  wrote:
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -1223,7 +1223,7 @@ long do_vcpu_op(int cmd, unsigned int vcpuid, 
> XEN_GUEST_HANDLE_PARAM(void) arg)
>  if ( v->vcpu_info == &dummy_vcpu_info )
>  return -EINVAL;
>  
> -if ( (ctxt = alloc_vcpu_guest_context()) == NULL )
> +if ( (ctxt = alloc_vcpu_guest_context(MEMF_node(domain_to_node(d)))) == NULL )

This one's a temporary allocation that gets freed a few lines down.
Hence best performance would be achieved by using the current
CPU's node, which iiuc will result if you pass just zero here.

> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -492,7 +492,7 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) 
> u_domctl)
>   < sizeof(struct compat_vcpu_guest_context));
>  #endif
>  ret = -ENOMEM;
> -if ( (c.nat = alloc_vcpu_guest_context()) == NULL )
> +if ( (c.nat = alloc_vcpu_guest_context(MEMF_node(domain_to_node(d)))) == NULL )

Same here.

> --- a/xen/include/asm-x86/domain.h
> +++ b/xen/include/asm-x86/domain.h
> @@ -577,9 +577,9 @@ void domain_cpuid(struct domain *d,
>  
>  #define domain_max_vcpus(d) (is_hvm_domain(d) ? HVM_MAX_VCPUS : 
> MAX_VIRT_CPUS)
>  
> -static inline struct vcpu_guest_context *alloc_vcpu_guest_context(void)
> +static inline struct vcpu_guest_context *alloc_vcpu_guest_context(unsigned 
> int memflags)
>  {
> -return vmalloc(sizeof(struct vcpu_guest_context));
> +return vmalloc(sizeof(struct vcpu_guest_context), memflags);

With the above you won't need to add a parameter to the
function anymore, but if for some reason you did you'd need
to mirror this to ARM code.

Jan



[Xen-devel] [PATCH 0/3] Xen wallclock on arm and arm64

2015-11-05 Thread Stefano Stabellini

Hi all,

this series introduces PV wallclock time support on arm and arm64.


Stefano Stabellini (3):
  xen/arm: introduce xen_read_wallclock
  xen/arm: introduce HYPERVISOR_dom0_op on arm and arm64
  xen/arm: set the system time in Xen via the XENPF_settime hypercall

 arch/arm/Kconfig |1 +
 arch/arm/include/asm/xen/hypercall.h |2 +
 arch/arm/xen/enlighten.c |   82 ++
 arch/arm/xen/hypercall.S |1 +
 arch/arm64/xen/hypercall.S   |1 +
 5 files changed, 87 insertions(+)


Cheers,

Stefano
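
The wallclock fields in the shared info page are guarded by a version counter: the writer makes the version odd while updating and even when done, and the reader retries if it saw an odd version or the version changed across its reads. A minimal C11 sketch of that protocol, under the assumption that atomics and fences stand in for the kernel's rmb()/wmb(); struct wallclock, wc_write and wc_read are hypothetical names, not the Xen or Linux definitions:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct wallclock {
    _Atomic uint32_t version;  /* odd while an update is in progress */
    _Atomic uint32_t sec;
    _Atomic uint32_t nsec;
};

/* Writer side: bump version to odd, publish the fields, bump to even. */
static void wc_write(struct wallclock *wc, uint32_t sec, uint32_t nsec)
{
    uint32_t v = atomic_load_explicit(&wc->version, memory_order_relaxed);

    /* Single writer assumed: no update may already be in flight. */
    assert((v & 1) == 0);

    atomic_store_explicit(&wc->version, v + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);  /* version before data */
    atomic_store_explicit(&wc->sec, sec, memory_order_relaxed);
    atomic_store_explicit(&wc->nsec, nsec, memory_order_relaxed);
    /* Release store: data becomes visible before the even version. */
    atomic_store_explicit(&wc->version, v + 2, memory_order_release);
}

/* Reader side: retry until a stable, even version brackets the reads. */
static void wc_read(struct wallclock *wc, uint32_t *sec, uint32_t *nsec)
{
    uint32_t v0, v1;

    do {
        v0 = atomic_load_explicit(&wc->version, memory_order_acquire);
        *sec = atomic_load_explicit(&wc->sec, memory_order_relaxed);
        *nsec = atomic_load_explicit(&wc->nsec, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);  /* data before re-check */
        v1 = atomic_load_explicit(&wc->version, memory_order_relaxed);
    } while ((v0 & 1) || v0 != v1);
}
```

The reader never blocks the writer; it only pays with a retry when it races with an update.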


[Xen-devel] [PATCH 3/3] xen/arm: set the system time in Xen via the XENPF_settime hypercall

2015-11-05 Thread Stefano Stabellini

If Linux is running as dom0, call XENPF_settime to update the system
time in Xen on pvclock_gtod notifications.

Signed-off-by: Stefano Stabellini 
Signed-off-by: Ian Campbell 
---
 arch/arm/xen/enlighten.c |   52 +-
 1 file changed, 51 insertions(+), 1 deletion(-)

diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index b6aea9c..0176db0 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -123,6 +124,50 @@ static void xen_read_wallclock(struct timespec *ts)
set_normalized_timespec(ts, now.tv_sec, now.tv_nsec);
 }
 
+static int xen_pvclock_gtod_notify(struct notifier_block *nb,
+  unsigned long was_set, void *priv)
+{
+   /* Protected by the calling core code serialization */
+   static struct timespec next_sync;
+
+   struct xen_platform_op op;
+   struct timespec now;
+
+   now = __current_kernel_time();
+
+   /*
+* We only take the expensive HV call when the clock was set
+* or when the 11 minutes RTC synchronization time elapsed.
+*/
+   if (!was_set && timespec_compare(&now, &next_sync) < 0)
+   return NOTIFY_OK;
+
+   op.interface_version = XENPF_INTERFACE_VERSION;
+   op.cmd = XENPF_settime;
+   op.u.settime.secs = now.tv_sec;
+   op.u.settime.nsecs = now.tv_nsec;
+   op.u.settime.system_time = arch_timer_read_counter();
+   printk("GTOD: Setting to %ld.%ld at %lld\n",
+  (long)op.u.settime.secs,
+  (long)op.u.settime.nsecs,
+  (long long)op.u.settime.system_time);
+   (void)HYPERVISOR_dom0_op(&op);
+
+   /*
+* Move the next drift compensation time 11 minutes
+* ahead. That's emulating the sync_cmos_clock() update for
+* the hardware RTC.
+*/
+   next_sync = now;
+   next_sync.tv_sec += 11 * 60;
+
+   return NOTIFY_OK;
+}
+
+static struct notifier_block xen_pvclock_gtod_notifier = {
+   .notifier_call = xen_pvclock_gtod_notify,
+};
+
 static void xen_percpu_init(void)
 {
struct vcpu_register_vcpu_info info;
@@ -321,7 +366,12 @@ static int __init xen_guest_init(void)
pv_time_ops.steal_clock = xen_stolen_accounting;
    static_key_slow_inc(&paravirt_steal_enabled);
xen_read_wallclock(&ts);
-   do_settimeofday(&ts);
+   if (xen_initial_domain())
+   pvclock_gtod_register_notifier(&xen_pvclock_gtod_notifier);
+   else {
+   xen_read_wallclock(&ts);
+   do_settimeofday(&ts);
+   }
 
return 0;
 }
-- 
1.7.10.4
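
The throttling in xen_pvclock_gtod_notify() above can be looked at in isolation: the hypercall is issued only when the clock was explicitly set or when the 11-minute deadline has passed, and the deadline is then pushed forward. A minimal userspace sketch of just that decision, assuming POSIX struct timespec; ts_compare and should_sync are hypothetical names, not kernel functions:

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

#define SYNC_INTERVAL_SEC (11 * 60)

/* Three-way comparison of two timespecs: <0, 0 or >0. */
static int ts_compare(const struct timespec *a, const struct timespec *b)
{
    if (a->tv_sec != b->tv_sec)
        return a->tv_sec < b->tv_sec ? -1 : 1;
    if (a->tv_nsec != b->tv_nsec)
        return a->tv_nsec < b->tv_nsec ? -1 : 1;
    return 0;
}

/* Return true when the (expensive) hypercall should be issued. On true,
 * *next_sync is advanced to 11 minutes after *now. */
static bool should_sync(const struct timespec *now,
                        struct timespec *next_sync, bool was_set)
{
    assert(now && next_sync);

    /* Skip unless the clock was set or the deadline has elapsed. */
    if (!was_set && ts_compare(now, next_sync) < 0)
        return false;

    *next_sync = *now;
    next_sync->tv_sec += SYNC_INTERVAL_SEC;
    return true;
}
```

Because next_sync starts out zeroed, the first notification always triggers a sync.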



[Xen-devel] [PATCH 2/3] xen/arm: introduce HYPERVISOR_dom0_op on arm and arm64

2015-11-05 Thread Stefano Stabellini

Signed-off-by: Stefano Stabellini 
---
 arch/arm/include/asm/xen/hypercall.h |2 ++
 arch/arm/xen/enlighten.c |1 +
 arch/arm/xen/hypercall.S |1 +
 arch/arm64/xen/hypercall.S   |1 +
 4 files changed, 5 insertions(+)

diff --git a/arch/arm/include/asm/xen/hypercall.h 
b/arch/arm/include/asm/xen/hypercall.h
index 712b50e..7a8ee15 100644
--- a/arch/arm/include/asm/xen/hypercall.h
+++ b/arch/arm/include/asm/xen/hypercall.h
@@ -35,6 +35,7 @@
 
 #include 
 #include 
+#include 
 
 long privcmd_call(unsigned call, unsigned long a1,
unsigned long a2, unsigned long a3,
@@ -49,6 +50,7 @@ int HYPERVISOR_memory_op(unsigned int cmd, void *arg);
 int HYPERVISOR_physdev_op(int cmd, void *arg);
 int HYPERVISOR_vcpu_op(int cmd, int vcpuid, void *extra_args);
 int HYPERVISOR_tmem_op(void *arg);
+int HYPERVISOR_dom0_op(void *arg);
 int HYPERVISOR_multicall(struct multicall_entry *calls, uint32_t nr);
 
 static inline int
diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index f07383d..b6aea9c 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -359,5 +359,6 @@ EXPORT_SYMBOL_GPL(HYPERVISOR_memory_op);
 EXPORT_SYMBOL_GPL(HYPERVISOR_physdev_op);
 EXPORT_SYMBOL_GPL(HYPERVISOR_vcpu_op);
 EXPORT_SYMBOL_GPL(HYPERVISOR_tmem_op);
+EXPORT_SYMBOL_GPL(HYPERVISOR_dom0_op);
 EXPORT_SYMBOL_GPL(HYPERVISOR_multicall);
 EXPORT_SYMBOL_GPL(privcmd_call);
diff --git a/arch/arm/xen/hypercall.S b/arch/arm/xen/hypercall.S
index 10fd99c..89db58f 100644
--- a/arch/arm/xen/hypercall.S
+++ b/arch/arm/xen/hypercall.S
@@ -89,6 +89,7 @@ HYPERCALL2(memory_op);
 HYPERCALL2(physdev_op);
 HYPERCALL3(vcpu_op);
 HYPERCALL1(tmem_op);
+HYPERCALL1(dom0_op);
 HYPERCALL2(multicall);
 
 ENTRY(privcmd_call)
diff --git a/arch/arm64/xen/hypercall.S b/arch/arm64/xen/hypercall.S
index 8bbe940..3840b1a 100644
--- a/arch/arm64/xen/hypercall.S
+++ b/arch/arm64/xen/hypercall.S
@@ -80,6 +80,7 @@ HYPERCALL2(memory_op);
 HYPERCALL2(physdev_op);
 HYPERCALL3(vcpu_op);
 HYPERCALL1(tmem_op);
+HYPERCALL1(dom0_op);
 HYPERCALL2(multicall);
 
 ENTRY(privcmd_call)
-- 
1.7.10.4



[Xen-devel] [PATCH 1/3] xen/arm: introduce xen_read_wallclock

2015-11-05 Thread Stefano Stabellini

Read the wallclock from the shared info page at boot time.

Signed-off-by: Stefano Stabellini 
---
 arch/arm/Kconfig |1 +
 arch/arm/xen/enlighten.c |   31 +++
 2 files changed, 32 insertions(+)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 60be104..a9de420 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1852,6 +1852,7 @@ config XEN
depends on CPU_V7 && !CPU_V6
depends on !GENERIC_ATOMIC64
depends on MMU
+   depends on HAVE_ARM_ARCH_TIMER
select ARCH_DMA_ADDR_T_64BIT
select ARM_PSCI
select SWIOTLB_XEN
diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 15621b1..f07383d 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -28,6 +28,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 
@@ -95,6 +97,32 @@ static unsigned long long xen_stolen_accounting(int cpu)
return state.time[RUNSTATE_runnable] + state.time[RUNSTATE_offline];
 }
 
+static void xen_read_wallclock(struct timespec *ts)
+{
+   u32 version;
+   u64 delta;
+   struct timespec now;
+   struct shared_info *s = HYPERVISOR_shared_info;
+   struct pvclock_wall_clock *wall_clock = &(s->wc);
+
+   /* get wallclock at system boot */
+   do {
+   version = wall_clock->version;
+   rmb();  /* fetch version before time */
+   now.tv_sec  = wall_clock->sec;
+   now.tv_nsec = wall_clock->nsec;
+   rmb();  /* fetch time before checking version */
+   } while ((wall_clock->version & 1) || (version != wall_clock->version));
+
+   delta = arch_timer_read_counter();  /* time since system boot */
+   delta += now.tv_sec * (u64)NSEC_PER_SEC + now.tv_nsec;
+
+   now.tv_nsec = do_div(delta, NSEC_PER_SEC);
+   now.tv_sec = delta;
+
+   set_normalized_timespec(ts, now.tv_sec, now.tv_nsec);
+}
+
 static void xen_percpu_init(void)
 {
struct vcpu_register_vcpu_info info;
@@ -218,6 +246,7 @@ static int __init xen_guest_init(void)
struct shared_info *shared_info_page = NULL;
struct resource res;
phys_addr_t grant_frames;
+   struct timespec ts;
 
if (!xen_domain())
return 0;
@@ -291,6 +320,8 @@ static int __init xen_guest_init(void)
 
pv_time_ops.steal_clock = xen_stolen_accounting;
static_key_slow_inc(&paravirt_steal_enabled);
+   xen_read_wallclock(&ts);
+   do_settimeofday(&ts);
 
return 0;
 }
-- 
1.7.10.4
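
The version-checked read loop in xen_read_wallclock() above can be tried out in isolation. What follows is only a minimal user-space sketch, not the kernel code: struct wall_clock, struct simple_ts and read_wallclock() are made-up stand-ins, the C11 fence takes the place of rmb(), and the time since boot is assumed to be passed in already converted to nanoseconds.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Stand-in for struct pvclock_wall_clock (assumed layout, illustration only). */
struct wall_clock {
    volatile uint32_t version;  /* odd while an update is in progress */
    volatile uint32_t sec;
    volatile uint32_t nsec;
};

/* Stand-in for struct timespec. */
struct simple_ts {
    uint64_t sec;
    uint32_t nsec;
};

/*
 * Read a consistent (sec, nsec) pair, retrying while the version is odd
 * or changes underneath us, then add the time elapsed since boot.
 */
static struct simple_ts read_wallclock(const struct wall_clock *wc,
                                       uint64_t ns_since_boot)
{
    uint32_t version, sec, nsec;
    uint64_t total;
    struct simple_ts ts;

    assert(wc != NULL);

    do {
        version = wc->version;
        atomic_thread_fence(memory_order_acquire);  /* version before time */
        sec = wc->sec;
        nsec = wc->nsec;
        atomic_thread_fence(memory_order_acquire);  /* time before re-check */
    } while ((version & 1) || version != wc->version);

    /* Normalise total nanoseconds into seconds plus a sub-second remainder. */
    total = (uint64_t)sec * NSEC_PER_SEC + nsec + ns_since_boot;
    ts.sec = total / NSEC_PER_SEC;
    ts.nsec = (uint32_t)(total % NSEC_PER_SEC);
    return ts;
}
```

The plain division and modulo here play the role of do_div() in the patch, which returns the remainder and leaves the quotient in its first argument.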



Re: [Xen-devel] [PATCH RFC] domain: Compile with lock_profile=y enabled.

2015-11-05 Thread Jan Beulich

>>> On 02.11.15 at 18:12,  wrote:
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -237,6 +237,7 @@ struct domain *alloc_domain_struct(void)
>  #ifdef CONFIG_BIGMEM
>  const unsigned int bits = 0;
>  #else
> +int order = get_order_from_bytes(sizeof(*d));

unsigned int

> @@ -247,10 +248,12 @@ struct domain *alloc_domain_struct(void)
>   bits = _domain_struct_bits();
>  #endif
>  
> -BUILD_BUG_ON(sizeof(*d) > PAGE_SIZE);

Not unconditionally (i.e. at least non-debug builds should continue
to have this).

> -d = alloc_xenheap_pages(0, MEMF_bits(bits));
> +d = alloc_xenheap_pages(order, MEMF_bits(bits));
>  if ( d != NULL )
> -clear_page(d);
> +{
> +for ( ; order >= 0; order-- )
> +clear_page((void *)d + PAGE_SIZE*order);

This loop works for orders 0 and 1, but not anything else (not
clearing all of the pages).

Jan
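
To illustrate the last remark: the quoted loop touches order + 1 pages, while an allocation of a given order spans 2^order pages, so the two only agree for orders 0 and 1. A minimal sketch of a loop that covers the whole allocation, assuming a 4 KiB page and using memset() as a stand-in for Xen's clear_page() (clear_pages() is a hypothetical helper name):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096u  /* assumed page size for this sketch */

/* Stand-in for clear_page(): zero exactly one page. */
static void clear_page(void *p)
{
    memset(p, 0, PAGE_SIZE);
}

/* Clear all 2^order pages of an order-sized allocation. */
static void clear_pages(void *base, unsigned int order)
{
    unsigned int i, nr = 1u << order;

    assert(base != NULL);

    for (i = 0; i < nr; i++)
        clear_page((char *)base + (size_t)i * PAGE_SIZE);
}
```

Counting upwards also keeps the loop valid once order is an unsigned int, where a "order >= 0" condition would never become false.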



Re: [Xen-devel] [MirageOS-devel] Hackathon 2016 Location Preferences

2015-11-05 Thread Richard Mortier

On 5 November 2015 at 16:24, Wei Liu  wrote:
>> We do have two options for a Hackathon: China (either Shanghai,
>> Hangzhou or Beijing - details TBC) and Cambridge, UK. We are still in
>> the early planning phase and the budget for the Hackathon has not yet
>> been approved.
>
> I lived in Hangzhou for a while -- it is a nice city in my humble
> opinion. :-)

I have visited Hangzhou and it is certainly a nice city!
But Cambridge would get my vote for convenience I'm afraid :)

-- 
Richard Mortier
m...@cantab.net


Re: [Xen-devel] [PATCH 1/2] xen: move wallclock functions from x86 to common

2015-11-05 Thread Jan Beulich

>>> On 05.11.15 at 17:57,  wrote:
> --- a/xen/common/time.c
> +++ b/xen/common/time.c
> @@ -16,7 +16,13 @@
>   */
>  
>  #include 
> +#include 
> +#include 
> +#include 
>  #include 
> +#include 
> +#include 
> +
>  
>  /* Nonzero if YEAR is a leap year (every 4 years,

Stray blank line being added.

Also please take the opportunity to remove xen/config.h here.

> @@ -85,3 +95,87 @@ struct tm gmtime(unsigned long t)
>  
>  return tbuf;
>  }
> +
> +/* Explicitly OR with 1 just in case version number gets out of sync. */
> +#define version_update_begin(v) (((v)+1)|1)
> +#define version_update_end(v)   ((v)+1)

This should be moved to a header instead of getting defined a second
time here. Also please add spaces to match our coding style.
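
As a rough illustration of both points, this is how the two macros might look with the spacing applied, together with a writer that uses them. It is a sketch only: struct wc_shared and wc_publish() are hypothetical names, and the C11 release fences stand in for wmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Explicitly OR with 1 just in case version number gets out of sync. */
#define version_update_begin(v) (((v) + 1) | 1)
#define version_update_end(v)   ((v) + 1)

/* Hypothetical shared wallclock record, for illustration only. */
struct wc_shared {
    uint32_t version;  /* odd while an update is in progress */
    uint64_t sec;
    uint32_t nsec;
};

/* Publish a new time: the version is odd during the update, even afterwards. */
static void wc_publish(struct wc_shared *wc, uint64_t sec, uint32_t nsec)
{
    wc->version = version_update_begin(wc->version);
    atomic_thread_fence(memory_order_release);  /* stand-in for wmb() */

    wc->sec = sec;
    wc->nsec = nsec;

    atomic_thread_fence(memory_order_release);  /* stand-in for wmb() */
    wc->version = version_update_end(wc->version);

    assert(!(wc->version & 1));
}
```

The OR with 1 means a version that was left odd by mistake still ends up odd after "begin", so a complete begin/end pair always finishes on an even value.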

> +struct tm wallclock_time(uint64_t *ns)
> +{
> +uint64_t seconds, nsec;
> +
> +if ( !wc_sec )
> +return (struct tm) { 0 };
> +
> +seconds = NOW() + SECONDS(wc_sec) + wc_nsec;
> +nsec = do_div(seconds, 1000000000);
> +
> +if ( ns )
> +*ns = nsec;
> +
> +return gmtime(seconds);
> +}
> +
> +

Stray blank lines again.

Jan



Re: [Xen-devel] [PATCH 1/2] xen: move wallclock functions from x86 to common

2015-11-05 Thread Julien Grall

Hi,

You forgot to CC the x86 maintainers.

Regards,

On 05/11/15 16:57, Stefano Stabellini wrote:
> Remove dummy arm implementation of wallclock_time.
> Use shared_info() in common code rather than x86-ism to access it.
> 
> Signed-off-by: Stefano Stabellini 
> Signed-off-by: Ian Campbell 
> ---
>  xen/arch/arm/time.c |5 ---
>  xen/arch/x86/time.c |   92 +
>  xen/common/time.c   |   94 
> +++
>  3 files changed, 95 insertions(+), 96 deletions(-)
> 
> diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c
> index 5ded30c..6207615 100644
> --- a/xen/arch/arm/time.c
> +++ b/xen/arch/arm/time.c
> @@ -280,11 +280,6 @@ void domain_set_time_offset(struct domain *d, int64_t 
> time_offset_seconds)
>  /* XXX update guest visible wallclock time */
>  }
>  
> -struct tm wallclock_time(uint64_t *ns)
> -{
> -return (struct tm) { 0 };
> -}
> -
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
> index bbb7e6c..764d7dc 100644
> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -47,9 +47,6 @@ string_param("clocksource", opt_clocksource);
>  unsigned long __read_mostly cpu_khz;  /* CPU clock frequency in kHz. */
>  DEFINE_SPINLOCK(rtc_lock);
>  unsigned long pit0_ticks;
> -static unsigned long wc_sec; /* UTC time at last 'time update'. */
> -static unsigned int wc_nsec;
> -static DEFINE_SPINLOCK(wc_lock);
>  
>  struct cpu_time {
>  u64 local_tsc_stamp;
> @@ -900,37 +897,6 @@ void force_update_vcpu_system_time(struct vcpu *v)
>  __update_vcpu_system_time(v, 1);
>  }
>  
> -void update_domain_wallclock_time(struct domain *d)
> -{
> -uint32_t *wc_version;
> -unsigned long sec;
> -
> -spin_lock(&wc_lock);
> -
> -wc_version = &shared_info(d, wc_version);
> -*wc_version = version_update_begin(*wc_version);
> -wmb();
> -
> -sec = wc_sec + d->time_offset_seconds;
> -if ( likely(!has_32bit_shinfo(d)) )
> -{
> -d->shared_info->native.wc_sec= sec;
> -d->shared_info->native.wc_nsec   = wc_nsec;
> -d->shared_info->native.wc_sec_hi = sec >> 32;
> -}
> -else
> -{
> -d->shared_info->compat.wc_sec = sec;
> -d->shared_info->compat.wc_nsec= wc_nsec;
> -d->shared_info->compat.arch.wc_sec_hi = sec >> 32;
> -}
> -
> -wmb();
> -*wc_version = version_update_end(*wc_version);
> -
> -spin_unlock(&wc_lock);
> -}
> -
>  static void update_domain_rtc(void)
>  {
>  struct domain *d;
> @@ -988,27 +954,6 @@ int cpu_frequency_change(u64 freq)
>  return 0;
>  }
>  
> -/* Set clock to  after 00:00:00 UTC, 1 January, 1970. */
> -void do_settime(unsigned long secs, unsigned int nsecs, u64 system_time_base)
> -{
> -u64 x;
> -u32 y;
> -struct domain *d;
> -
> -x = SECONDS(secs) + nsecs - system_time_base;
> -y = do_div(x, 1000000000);
> -
> -spin_lock(&wc_lock);
> -wc_sec  = x;
> -wc_nsec = y;
> -spin_unlock(&wc_lock);
> -
> -rcu_read_lock(&domlist_read_lock);
> -for_each_domain ( d )
> -update_domain_wallclock_time(d);
> -rcu_read_unlock(&domlist_read_lock);
> -}
> -
>  /* Per-CPU communication between rendezvous IRQ and softirq handler. */
>  struct cpu_calibration {
>  u64 local_tsc_stamp;
> @@ -1608,25 +1553,6 @@ void send_timer_event(struct vcpu *v)
>  send_guest_vcpu_virq(v, VIRQ_TIMER);
>  }
>  
> -/* Return secs after 00:00:00 localtime, 1 January, 1970. */
> -unsigned long get_localtime(struct domain *d)
> -{
> -return wc_sec + (wc_nsec + NOW()) / 1000000000ULL
> -+ d->time_offset_seconds;
> -}
> -
> -/* Return microsecs after 00:00:00 localtime, 1 January, 1970. */
> -uint64_t get_localtime_us(struct domain *d)
> -{
> -return (SECONDS(wc_sec + d->time_offset_seconds) + wc_nsec + NOW())
> -   / 1000UL;
> -}
> -
> -unsigned long get_sec(void)
> -{
> -return wc_sec + (wc_nsec + NOW()) / 1000000000ULL;
> -}
> -
>  /* "cmos_utc_offset" is the difference between UTC time and CMOS time. */
>  static long cmos_utc_offset; /* in seconds */
>  
> @@ -1635,7 +1561,7 @@ int time_suspend(void)
>  if ( smp_processor_id() == 0 )
>  {
>  cmos_utc_offset = -get_cmos_time();
> -cmos_utc_offset += (wc_sec + (wc_nsec + NOW()) / 1000000000ULL);
> +cmos_utc_offset += get_sec();
>  kill_timer(&calibration_timer);
>  
>  /* Sync platform timer stamps. */
> @@ -1715,22 +1641,6 @@ int hwdom_pit_access(struct ioreq *ioreq)
>  return 0;
>  }
>  
> -struct tm wallclock_time(uint64_t *ns)
> -{
> -uint64_t seconds, nsec;
> -
> -if ( !wc_sec )
> -return (struct tm) { 0 };
> -
> -seconds = NOW() + SECONDS(wc_sec) + wc_nsec;
> -nsec = do_div(seconds, 1000000000);
> -
> -if ( ns )
> -*ns = nsec;
> -
> -return gmtime(seconds);
> -}
> -
>  /*
>   * PV SoftTSC Emulation.
>   */
> diff --git a/xen/

Re: [Xen-devel] [PATCH 1/2] rwlock: add per-cpu reader-writer locks

2015-11-05 Thread Marcos E. Matsunaga


Hi Malcolm,

I tried your patches against staging yesterday and as soon as I started 
a guest, it panicked. I have lock_profile enabled and applied your patches 
against:


6f04de658574833688c3f9eab310e7834d56a9c0 x86: cleanup of early cpuid 
handling




(XEN) HVM1 save: CPU
(XEN) HVM1 save: PIC
(XEN) HVM1 save: IOAPIC
(XEN) HVM1 save: LAPIC
(XEN) HVM1 save: LAPIC_REGS
(XEN) HVM1 save: PCI_IRQ
(XEN) HVM1 save: ISA_IRQ
(XEN) HVM1 save: PCI_LINK
(XEN) HVM1 save: PIT
(XEN) HVM1 save: RTC
(XEN) HVM1 save: HPET
(XEN) HVM1 save: PMTIMER
(XEN) HVM1 save: MTRR
(XEN) HVM1 save: VIRIDIAN_DOMAIN
(XEN) HVM1 save: CPU_XSAVE
(XEN) HVM1 save: VIRIDIAN_VCPU
(XEN) HVM1 save: VMCE_VCPU
(XEN) HVM1 save: TSC_ADJUST
(XEN) HVM1 restore: CPU 0
[  394.163143] loop: module loaded
(XEN) Assertion 'rw_is_locked(&t->lock)' failed at grant_table.c:215
(XEN) [ Xen-4.7-unstable  x86_64  debug=y  Tainted:C ]
(XEN) CPU:0
(XEN) RIP:e008:[] do_grant_table_op+0x63f/0x2e04
(XEN) RFLAGS: 00010246   CONTEXT: hypervisor (d0v0)
(XEN) rax:    rbx: 83400f9dc9e0   rcx: 
(XEN) rdx: 0001   rsi: 82d080342b10   rdi: 83400819b784
(XEN) rbp: 8300774ffef8   rsp: 8300774ffdf8   r8: 0002
(XEN) r9:  0002   r10: 0002   r11: 
(XEN) r12:    r13:    r14: 83400819b780
(XEN) r15: 83400f9d   cr0: 80050033   cr4: 001526e0
(XEN) cr3: 01007f613000   cr2: 8800746182b8
(XEN) ds:    es:    fs:    gs:    ss: e010   cs: e008
(XEN) Xen stack trace from rsp=8300774ffdf8:
(XEN)8300774ffe08 82d0 8300774ffef8 82d08017fc9b
(XEN)82d080342b28 83400f9d8600 82d080342b10 
(XEN)83400f9dca20 8321 834008188000 0001
(XEN)0001772ee000 8801e98d03e0 8300774ffe88 
(XEN) 8300774fff18 0021d0269c10 0001001a
(XEN)0001  0246 7ff7de45a407
(XEN)0100 7ff7de45a407 0033 8300772ee000
(XEN)8801eb0e3c00 880004bf57e8 8801e98d03e0 8801eb0a5938
(XEN)7cff88b000c7 82d08023d952 8100128a 0014
(XEN) 0001 8801f6e18388 81d3d740
(XEN)8801efb7bd40 88000542e780 0282 
(XEN)8801e98d03a0 8801efe07000 0014 8100128a
(XEN)0001 8801e98d03e0  00010100
(XEN)8100128a e033 0282 8801efb7bce0
(XEN)e02b   
(XEN)  8300772ee000 
(XEN)
(XEN) Xen call trace:
(XEN)[] do_grant_table_op+0x63f/0x2e04
(XEN)[] lstar_enter+0xe2/0x13c
(XEN)
(XEN)
(XEN) 
(XEN) Panic on CPU 0:
(XEN) Assertion 'rw_is_locked(&t->lock)' failed at grant_table.c:215
(XEN) 
(XEN)
(XEN) Manual reset required ('noreboot' specified)


Thanks for your help.

On 11/03/2015 12:58 PM, Malcolm Crossley wrote:

Per-cpu read-write locks allow for the fast path read case to have low overhead
by only setting/clearing a per-cpu variable for using the read lock.
The per-cpu read fast path also avoids locked compare swap operations which can
be particularly slow on coherent multi-socket systems, particularly if there is
heavy usage of the read lock itself.

The per-cpu reader-writer lock uses a global variable to control the read lock
fast path. This allows a writer to disable the fast path and ensure the readers
use the underlying read-write lock implementation.

Once the writer has taken the write lock and disabled the fast path, it must
poll the per-cpu variable for all CPUs which have entered the critical section
for the specific read-write lock the writer is attempting to take. This design
allows for a single per-cpu variable to be used for read/write locks belonging
to separate data structures as long as multiple per-cpu read locks are not
simultaneously held by one particular cpu. This also means per-cpu
reader-writer locks are not recursion safe.

Slow path readers which are unblocked set the per-cpu variable and drop the
read lock. This simplifies the implementation and allows for fairness in the
underlying read-write lock to be taken advantage of.

There may be slightly more overhead on the per-cpu write lock path due to
checking each CPU's fast path read variable, but this overhead is likely to be hidden
by the required delay of waiting for readers to exit the critical section.
The loop is optimised to only iterate over the per-cpu data of active readers
of the rwlock.

Signed-off-by: Malcolm Crossley 
---
  xen/common/spinlock.c
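
For readers following the description above, here is a rough user-space sketch of the scheme. It is not the code from the patch: the names, the fixed NR_CPUS array standing in for real per-cpu storage, and the tiny atomic counter used as the underlying read-write lock are all assumptions, and the writer simply scans every slot rather than only the active readers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4  /* illustrative; real code would use per-cpu storage */

typedef struct percpu_rwlock {
    atomic_int  rwlock;             /* underlying lock: -1 writer, >0 readers */
    atomic_bool writer_activating;  /* set by a writer to disable the fast path */
} percpu_rwlock_t;

/* One slot per CPU: the lock, if any, that CPU currently holds for reading. */
static _Atomic(percpu_rwlock_t *) percpu_owner[NR_CPUS];

static void base_read_lock(percpu_rwlock_t *l)
{
    for (;;) {
        int v = atomic_load(&l->rwlock);
        if (v >= 0 && atomic_compare_exchange_weak(&l->rwlock, &v, v + 1))
            return;
    }
}

static void base_read_unlock(percpu_rwlock_t *l)
{
    atomic_fetch_sub(&l->rwlock, 1);
}

static void base_write_lock(percpu_rwlock_t *l)
{
    int expected = 0;
    while (!atomic_compare_exchange_weak(&l->rwlock, &expected, -1))
        expected = 0;
}

static void base_write_unlock(percpu_rwlock_t *l)
{
    atomic_store(&l->rwlock, 0);
}

static void percpu_read_lock(unsigned int cpu, percpu_rwlock_t *l)
{
    /* Not recursion safe: this CPU must not already hold a per-cpu read lock. */
    assert(atomic_load(&percpu_owner[cpu]) == NULL);

    /* Fast path: publish ownership, then check whether a writer is active. */
    atomic_store(&percpu_owner[cpu], l);
    if (atomic_load(&l->writer_activating)) {
        /* Slow path: back out, wait on the underlying lock, publish, drop it. */
        atomic_store(&percpu_owner[cpu], NULL);
        base_read_lock(l);
        atomic_store(&percpu_owner[cpu], l);
        base_read_unlock(l);
    }
}

static void percpu_read_unlock(unsigned int cpu, percpu_rwlock_t *l)
{
    assert(atomic_load(&percpu_owner[cpu]) == l);
    atomic_store(&percpu_owner[cpu], NULL);
}

static void percpu_write_lock(percpu_rwlock_t *l)
{
    unsigned int cpu;

    base_write_lock(l);
    atomic_store(&l->writer_activating, true);

    /* Wait for every fast path reader of this lock to leave. */
    for (cpu = 0; cpu < NR_CPUS; cpu++)
        while (atomic_load(&percpu_owner[cpu]) == l)
            ;  /* spin */
}

static void percpu_write_unlock(percpu_rwlock_t *l)
{
    atomic_store(&l->writer_activating, false);
    base_write_unlock(l);
}
```

The sketch relies on the default sequentially consistent atomics so that a reader's "publish, then check" and a writer's "set flag, then poll" cannot both miss each other; real hypervisor code would use explicit barriers instead.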

Re: [Xen-devel] [PATCH 2/3] xen/arm: introduce HYPERVISOR_dom0_op on arm and arm64

2015-11-05 Thread Jan Beulich

>>> On 05.11.15 at 18:09,  wrote:
> --- a/arch/arm/xen/hypercall.S
> +++ b/arch/arm/xen/hypercall.S
> @@ -89,6 +89,7 @@ HYPERCALL2(memory_op);
>  HYPERCALL2(physdev_op);
>  HYPERCALL3(vcpu_op);
>  HYPERCALL1(tmem_op);
> +HYPERCALL1(dom0_op);

Assuming this somehow tries to mirror x86 naming - time to rename it
there? I don't see why you'd want to introduce a dom0_op when it
has been renamed to platform_op many years ago - see
public/dom0_ops.h.

Jan


Re: [Xen-devel] Linux 4.4 MW: Boot under Xen fails with CONFIG_DEBUG_WX enabled: RIP: ptdump_walk_pgd_level_core

2015-11-05 Thread Boris Ostrovsky


On 11/05/2015 04:13 AM, Sander Eikelenboom wrote:


It makes "cat /sys/kernel/debug/kernel_page_tables" work and
prevents a kernel with CONFIG_DEBUG_WX=y from crashing at boot.


Great. Our nightly runs also failed spectacularly due to this bug.



It now does give a warning about an insecure W+X mapping, so 
CONFIG_DEBUG_WX=y
seems to be working. No idea how to interpret it though (and if it's a 
legit

warning).

--
Sander

[   19.034706] Freeing unused kernel memory: 1104K (822fc000 - 
8241)

[   19.041339] Write protecting the kernel read-only data: 18432k
[   19.052596] Freeing unused kernel memory: 1144K (880001ae2000 - 
880001c0)
[   19.060285] Freeing unused kernel memory: 1560K (88000207a000 - 
88000220)

[   19.067079] [ cut here ]
[   19.073931] WARNING: CPU: 5 PID: 1 at 
arch/x86/mm/dump_pagetables.c:225 note_page+0x619/0x7e0()


Yes, this apparently is a known issue: https://lkml.org/lkml/2015/11/4/476

-boris




Re: [Xen-devel] [VOTE] Release cycle scheme

2015-11-05 Thread Stefano Stabellini

On Mon, 2 Nov 2015, Wei Liu wrote:
> Hi committers,

I am not a xen.git committer, but I am the qemu-xen.git committer, and
since I maintain all the qemu-xen stable trees and releases, I think
that my vote should count, at least for this proposal.


> There doesn't seem to be consensus on how release cycle should be
> managed. In the survey [0] about release cycle there were following
> proposed schemes:
> 
> #1. 6 months release cycle + current stable release scheme
> #2. 6 months release cycle + LTS scheme
> #3. 6 months release cycle + extended security support
> #4. 9 months release cycle + current stable release scheme (no change
> at all)
> 
> And the tally:
> 
>   #1  #2  #3  #4
> George+1  +2  -2
> Dario +1  +2  -2
> Stefano   +1  +2  -2
> Ian C +1  +1  +1  -1
> Olaf  +1   0  +1   0
> Juergen0  -1  +1
> Ian J +2  +1  +1  -2
> Andrew+1  +1  -1
> Jan   -1  -1   0  +1
> 
> 
> There are comments made by individuals that couldn't be clearly
> represented in the tally. The most acceptable option to stable tree
> maintainers is #1.
> 
> So I propose we use the following scheme:
> 
> - 6 months release cycle from unstable branch.
>   - 4 months development.
>   - 2 months freeze.
>   - Eat into next cycle if doesn't release on time.

+2


> - Fixed cut-off date: the Fridays of the week in which the last day of
>   March and September falls.

+1


> - No more freeze exception, but heads-up mails about freeze will be
>   sent a few weeks before hand.

+1


> - Stable branch maintained for 18 months full support plus 18 months
>   security support. No mixed maintainership for stable trees.

-1


If I need to give an overall vote, I'll give +1.

 
> Please vote to ack or nack this proposal.
> 
> 
> Thanks
> Wei.
> 
> [0]: <20151012173222.ge2...@zion.uk.xensource.com>
> 
> 


Re: [Xen-devel] Linux 4.4 MW: Boot under Xen fails with CONFIG_DEBUG_WX enabled: RIP: ptdump_walk_pgd_level_core

2015-11-05 Thread Sander Eikelenboom


Thursday, November 5, 2015, 2:53:40 PM, you wrote:

> On 11/05/2015 04:13 AM, Sander Eikelenboom wrote:
>>
>> It makes "cat /sys/kernel/debug/kernel_page_tables" work and
>> prevents a kernel with CONFIG_DEBUG_WX=y from crashing at boot.

> Great. Our nightly runs also failed spectacularly due to this bug.

>>
>> It now does give a warning about an insecure W+X mapping, so 
>> CONFIG_DEBUG_WX=y
>> seems to be working. No idea how to interpret it though (and if it's a 
>> legit
>> warning).
>>
>> -- 
>> Sander
>>
>> [   19.034706] Freeing unused kernel memory: 1104K (822fc000 - 
>> 8241)
>> [   19.041339] Write protecting the kernel read-only data: 18432k
>> [   19.052596] Freeing unused kernel memory: 1144K (880001ae2000 - 
>> 880001c0)
>> [   19.060285] Freeing unused kernel memory: 1560K (88000207a000 - 
>> 88000220)
>> [   19.067079] [ cut here ]
>> [   19.073931] WARNING: CPU: 5 PID: 1 at 
>> arch/x86/mm/dump_pagetables.c:225 note_page+0x619/0x7e0()

> Yes, this apparently is a known issue: https://lkml.org/lkml/2015/11/4/476

> -boris

Ah thx for the pointer :)

--
Sander






Re: [Xen-devel] [PATCH v11 2/5] missing include asm/paravirt.h in cputime.c

2015-11-05 Thread Stefano Stabellini

On Thu, 5 Nov 2015, Peter Zijlstra wrote:
> How can this be missing? Things compile fine now, right?

Fair enough.

> So please better explain why we do this change.

asm/paravirt.h is included by one of the other headers included in
kernel/sched/cputime.c on x86, but not on other architectures. On arm and
arm64, where I am about to introduce asm/paravirt.h and stolen time
support, without #include <asm/paravirt.h> in cputime.c I would get:

kernel/sched/cputime.c: In function ‘steal_account_process_tick’:
kernel/sched/cputime.c:260:24: error: ‘paravirt_steal_enabled’ undeclared 
(first use in this function)
  if (static_key_false(&paravirt_steal_enabled)) {

A bit of digging on x86 (using gcc -E on cputime.c) tells me that
asm/paravirt.h is coming from the following include chain:

#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include

Re: [Xen-devel] [PATCH 2/2] arm: export platform_op XENPF_settime

2015-11-05 Thread Julien Grall

Hi Stefano,

You forgot to CC Daniel for the XSM part. Please use
scripts/get_maintainer.pl to get the relevant maintainers.

On 05/11/15 16:57, Stefano Stabellini wrote:
> Call update_domain_wallclock_time at domain initialization.

It's not really what you are doing in the code. You are calling
update_domain_wallclock_time when the first vCPU is initialized.

Also some rationale to explain why this call should be done here would
be good.

Finally, I'm a bit surprised that you only need to call
update_domain_wallclock_time when the domain is created. x86 needs to
call it in various places.

For instance we may want to call update_domain_wallclock_time in
construct_dom0 before clearing the pause flags. This is because the
wallclock may be out of sync as constructing DOM0 takes some time.

> Signed-off-by: Stefano Stabellini 
> Signed-off-by: Ian Campbell 
> ---
>  xen/arch/arm/Makefile |1 +
>  xen/arch/arm/domain.c |3 ++
>  xen/arch/arm/platform_hypercall.c |   62 
> +
>  xen/arch/arm/traps.c  |1 +
>  xen/include/xsm/dummy.h   |   12 +++
>  xen/include/xsm/xsm.h |   13 

You also have to fix xsm/flask/hooks.c.

>  6 files changed, 80 insertions(+), 12 deletions(-)
>  create mode 100644 xen/arch/arm/platform_hypercall.c

[..]

> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index b2bfc7d..ac9b1b3 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -742,6 +742,9 @@ int arch_set_info_guest(
>  v->arch.ttbr1 = ctxt->ttbr1;
>  v->arch.ttbcr = ctxt->ttbcr;
>  
> +if ( v->vcpu_id == 0 )
> +update_domain_wallclock_time(v->domain);
> +
>  v->is_initialised = 1;
>  
>  if ( ctxt->flags & VGCF_online )
> diff --git a/xen/arch/arm/platform_hypercall.c 
> b/xen/arch/arm/platform_hypercall.c
> new file mode 100644
> index 000..f60d7b3
> --- /dev/null
> +++ b/xen/arch/arm/platform_hypercall.c
> @@ -0,0 +1,62 @@
> +/**
> + * platform_hypercall.c
> + * 
> + * Hardware platform operations. Intended for use by domain-0 kernel.
> + * 
> + * Copyright (c) 2015, Citrix
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +DEFINE_SPINLOCK(xenpf_lock);
> +
> +long do_platform_op(XEN_GUEST_HANDLE_PARAM(xen_platform_op_t) u_xenpf_op)
> +{

Would it make sense to introduce common platform code which takes care
of the common hypercalls? See for instance do_domctl and arch_do_domctl.

> +long ret;
> +struct xen_platform_op curop, *op = &curop;
> +
> +if ( copy_from_guest(op, u_xenpf_op, 1) )
> +return -EFAULT;
> +
> +if ( op->interface_version != XENPF_INTERFACE_VERSION )
> +return -EACCES;
> +
> +ret = xsm_platform_op(XSM_PRIV, op->cmd);
> +if ( ret )
> +return ret;
> +
> +spin_lock(&xenpf_lock);
> +
> +switch ( op->cmd )
> +{
> +case XENPF_settime32:
> +do_settime(op->u.settime32.secs,
> +   op->u.settime32.nsecs,
> +   op->u.settime32.system_time);
> +break;

Do we really want to support settime32 on ARM?

> +
> +case XENPF_settime64:
> +if ( likely(!op->u.settime64.mbz) )
> +do_settime(op->u.settime64.secs,
> +   op->u.settime64.nsecs,
> +   op->u.settime64.system_time);
> +else
> +ret = -EINVAL;
> +break;
> +
> +default:
> +ret = -ENOSYS;
> +break;
> +}
> +
> +spin_unlock(&xenpf_lock);
> +return ret;
> +}
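
To make the common/arch split suggested earlier in this review more concrete, here is a small stand-alone sketch of the dispatch shape, loosely modelled on how do_domctl hands unknown commands to arch_do_domctl. Everything in it is hypothetical: the command numbers, struct platform_req, common_platform_op() and arch_platform_op() are invented for illustration and are not Xen's interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command numbers and request layout, for illustration only. */
enum { OP_SETTIME64 = 1, OP_ARCH_ONLY = 100 };

struct platform_req {
    uint32_t cmd;
    uint32_t mbz;    /* must be zero */
    uint64_t secs;
    uint32_t nsecs;
};

/* Stand-in for the wallclock state that do_settime() would update. */
static uint64_t wall_secs;
static uint32_t wall_nsecs;

/* Arch hook: handles only the commands specific to one architecture. */
static long arch_platform_op(struct platform_req *req)
{
    switch (req->cmd) {
    case OP_ARCH_ONLY:
        return 0;
    default:
        return -ENOSYS;
    }
}

/* Common entry point: handles shared commands, defers the rest to the arch. */
static long common_platform_op(struct platform_req *req)
{
    assert(req != NULL);

    switch (req->cmd) {
    case OP_SETTIME64:
        if (req->mbz)
            return -EINVAL;
        wall_secs = req->secs;
        wall_nsecs = req->nsecs;
        return 0;
    default:
        return arch_platform_op(req);
    }
}
```

With this shape the settime handling would live in one place, and each architecture would only add a hook for whatever it supports beyond that.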

Regards,

-- 
Julien Grall


Re: [Xen-devel] [PATCH v4 2/9] xen: add generic flag to elf_dom_parms indicating support of unmapped initrd

2015-11-05 Thread Andrew Cooper

On 05/11/15 14:36, Juergen Gross wrote:
> Support of an unmapped initrd is indicated by the kernel of the domain
> via elf notes. In order not to have to use raw elf data in the tools
> for support of an unmapped initrd add a flag to the parsed data area
> to indicate the kernel supporting this feature.
>
> Switch using this flag in the hypervisor domain builder.
>
> Cc: andrew.coop...@citrix.com
> Cc: jbeul...@suse.com
> Cc: k...@xen.org
> Suggested-by: Ian Campbell 
> Signed-off-by: Juergen Gross 
> Acked-by: Jan Beulich 

Reviewed-by: Andrew Cooper 

