[Xen-devel] [PATCH 26/45] gntdev.h: include stdint.h in userspace

2015-02-18 Thread Mikko Rapeli
Fixes compilation error:

xen/gntdev.h:38:2: error: unknown type name ‘uint32_t’

Signed-off-by: Mikko Rapeli 
---
 include/uapi/xen/gntdev.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/include/uapi/xen/gntdev.h b/include/uapi/xen/gntdev.h
index 5304bd3..f724f75 100644
--- a/include/uapi/xen/gntdev.h
+++ b/include/uapi/xen/gntdev.h
@@ -33,6 +33,12 @@
 #ifndef __LINUX_PUBLIC_GNTDEV_H__
 #define __LINUX_PUBLIC_GNTDEV_H__
 
+#ifdef __KERNEL__
+#include <linux/types.h>
+#else
+#include <stdint.h>
+#endif
+
 struct ioctl_gntdev_grant_ref {
/* The domain ID of the grant to be mapped. */
uint32_t domid;
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
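
The header pattern used by this patch can be tried outside the kernel tree. The sketch below is illustrative only: struct example_grant_ref and make_ref() are made-up names rather than the contents of gntdev.h, and <linux/types.h> on the kernel side is an assumption. It shows that a userspace build only sees uint32_t once <stdint.h> has been included.

```c
#include <assert.h>
#ifdef __KERNEL__
#include <linux/types.h>   /* assumed kernel-side source of fixed-width types */
#else
#include <stdint.h>        /* userspace gets uint32_t from the C99 header */
#endif

/* Illustrative stand-in for a uapi struct; not copied from gntdev.h. */
struct example_grant_ref {
    uint32_t domid;  /* domain ID of the grant */
    uint32_t ref;    /* grant reference */
};

/* Two 32-bit members should occupy exactly eight bytes. */
static_assert(sizeof(struct example_grant_ref) == 8, "two 32-bit fields");

/* Hypothetical helper: builds the struct from its two fields. */
static struct example_grant_ref make_ref(uint32_t domid, uint32_t ref)
{
    struct example_grant_ref r = { domid, ref };
    return r;
}
```

Removing the #include <stdint.h> line from a userspace build of this file should reproduce the "unknown type name" error quoted in the commit message.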


[Xen-devel] [PATCH 25/45] gntalloc.h: include stdint.h in userspace

2015-02-18 Thread Mikko Rapeli
Fixes compilation error:

xen/gntalloc.h:22:2: error: unknown type name ‘uint16_t’

Signed-off-by: Mikko Rapeli 
---
 include/uapi/xen/gntalloc.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/include/uapi/xen/gntalloc.h b/include/uapi/xen/gntalloc.h
index 76bd580..184df7e 100644
--- a/include/uapi/xen/gntalloc.h
+++ b/include/uapi/xen/gntalloc.h
@@ -11,6 +11,12 @@
 #ifndef __LINUX_PUBLIC_GNTALLOC_H__
 #define __LINUX_PUBLIC_GNTALLOC_H__
 
+#ifdef __KERNEL__
+#include <linux/types.h>
+#else
+#include <stdint.h>
+#endif
+
 /*
  * Allocates a new page and creates a new grant reference.
  */
-- 
2.1.4




[Xen-devel] [xen-4.5-testing test] 34638: regressions - FAIL

2015-02-18 Thread xen . org
flight 34638 xen-4.5-testing real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/34638/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt  5 xen-boot                      fail REGR. vs. 34200
 test-amd64-amd64-pair    14 leak-check/basis/src_host(14) fail REGR. vs. 34200

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel             9 guest-start           fail never pass
 test-armhf-armhf-xl-sedf                 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-sedf-pin             10 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu            10 migrate-support-check fail never pass
 test-armhf-armhf-xl-midway               10 migrate-support-check fail never pass
 test-armhf-armhf-xl                      10 migrate-support-check fail never pass
 test-amd64-amd64-xl-pvh-amd               9 guest-start           fail never pass
 test-armhf-armhf-libvirt                 10 migrate-support-check fail never pass
 test-amd64-amd64-xl-pcipt-intel           9 guest-start           fail never pass
 test-amd64-i386-libvirt                  10 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2               5 xen-boot              fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64     14 guest-stop            fail never pass
 test-amd64-i386-xl-winxpsp3              14 guest-stop            fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop            fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1       14 guest-stop            fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3        14 guest-stop            fail never pass
 test-amd64-i386-xl-win7-amd64            14 guest-stop            fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop            fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64      14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64     14 guest-stop            fail never pass
 test-amd64-amd64-xl-win7-amd64           14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3       14 guest-stop            fail never pass
 test-amd64-i386-xl-qemut-win7-amd64      14 guest-stop            fail never pass
 test-amd64-i386-pair                     17 guest-migrate/src_host/dst_host fail never pass
 test-amd64-i386-xl-qemut-winxpsp3        14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3       14 guest-stop            fail never pass
 test-amd64-amd64-xl-winxpsp3             14 guest-stop            fail never pass

version targeted for testing:
 xen  2417e243bb510dafdbd589a5aeedb29095e62c10
baseline version:
 xen  d8e78d691d9b4bcc945d8f0b0ed2b48713931c4d


People who touched revisions under test:
  Ian Campbell 
  Ian Jackson 
  Jim Fehlig 
  Julien Grall 
  Wei Liu 


jobs:
 build-amd64                                pass
 build-armhf                                pass
 build-i386                                 pass
 build-amd64-libvirt                        pass
 build-armhf-libvirt                        pass
 build-i386-libvirt                         pass
 build-amd64-pvops                          pass
 build-armhf-pvops                          pass
 build-i386-pvops                           pass
 build-amd64-rumpuserxen                    pass
 build-i386-rumpuserxen                     pass
 test-amd64-amd64-xl                        pass
 test-armhf-armhf-xl                        pass
 test-amd64-i386-xl                         pass
 test-amd64-amd64-xl-pvh-amd                fail
 test-amd64-i386-rhel6hvm-amd               pass
 test-amd64-i386-qemut-rhel6hvm-amd         pass
 test-amd64-i386-qemuu-rhel6hvm-amd         pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64  pass
 test-amd64-i386-xl-qemut-debianhvm-amd64   pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64   pass
 test-amd64-i386-freebsd10-amd64            pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64       pass
 test-amd64-i386-xl-qemuu-ovmf-amd64        pass
 test-amd64-amd64-rumpuserxen-amd64         pass
 test-amd64-amd64-xl-qemut-win7-amd64       fail
 test-amd64-i386-xl-qemut-win7-amd64        fail
 test-amd64-amd64-xl-qemuu-win7-amd64       fail
 test-amd64-i386-xl-qemuu-win7-amd64        fai

Re: [Xen-devel] [PATCH v4 14/29] MdePkg/BaseSynchronizationLib: implement 16-bit compare-exchange

2015-02-18 Thread Ard Biesheuvel
On 17 February 2015 at 18:40, Jordan Justen  wrote:
> Ard,
>
> For the subject, I think
> MdePkg/BaseSynchronizationLib: Add InterlockedCompareExchange16
> would be better.
>

OK

> Acked-by: Jordan Justen 
>

Thanks

> Thanks for working to move this to a common location.
>

No problem

> Mike,
>
> I think Anthony tested the IA32 and X64 implementations with Xen. For
> IPF, I don't think it has been tested.
>

As mentioned in the other thread, Anthony seems to be incommunicado,
and some other of the Xen guys have replied that Xen on OVMF/x86 is
broken for other reasons so they cannot positively confirm that
everything still works, even though the x86 changes other than the
PCI/xenbus split are primarily refactoring of existing code.

As far as IPF is concerned: I don't think Xen or Ovmf can be built for
IPF anyway, but I think a build test and visual inspection of the .S
file should be sufficient here?

Thanks,
Ard.


> On 2015-02-12 03:19:06, Ard Biesheuvel wrote:
>> This implements the function InterlockedCompareExchange16 () for all
>> architectures, using architecture and toolchain specific intrinsics
>> or primitive assembler instructions.
>>
>> Contributed-under: TianoCore Contribution Agreement 1.0
>> Reviewed-by: Olivier Martin 
>> Signed-off-by: Ard Biesheuvel 
>> ---
>>  MdePkg/Include/Library/SynchronizationLib.h | 26 ++
>>  MdePkg/Library/BaseSynchronizationLib/AArch64/Synchronization.S | 44
>>  MdePkg/Library/BaseSynchronizationLib/Arm/Synchronization.S | 44
>>  MdePkg/Library/BaseSynchronizationLib/Arm/Synchronization.asm | 44
>>  MdePkg/Library/BaseSynchronizationLib/BaseSynchronizationLib.inf | 5 +
>>  MdePkg/Library/BaseSynchronizationLib/BaseSynchronizationLibInternals.h | 26 ++
>>  MdePkg/Library/BaseSynchronizationLib/Ebc/Synchronization.c | 31 +++
>>  MdePkg/Library/BaseSynchronizationLib/Ia32/GccInline.c | 42 ++
>>  MdePkg/Library/BaseSynchronizationLib/Ia32/InterlockedCompareExchange16.asm | 46 ++
>>  MdePkg/Library/BaseSynchronizationLib/Ia32/InterlockedCompareExchange16.c | 51 +++
>>  MdePkg/Library/BaseSynchronizationLib/Ipf/InterlockedCompareExchange16.s | 30 ++
>>  MdePkg/Library/BaseSynchronizationLib/Synchronization.c | 31 +++
>>  MdePkg/Library/BaseSynchronizationLib/SynchronizationGcc.c | 31 +++
>>  MdePkg/Library/BaseSynchronizationLib/SynchronizationMsc.c | 31 +++
>>  MdePkg/Library/BaseSynchronizationLib/X64/GccInline.c | 44
>>  MdePkg/Library/BaseSynchronizationLib/X64/InterlockedCompareExchange16.asm | 42 ++
>>  MdePkg/Library/BaseSynchronizationLib/X64/InterlockedCompareExchange16.c | 54 ++
>>  17 files changed, 622 insertions(+)
>>
>> diff --git a/MdePkg/Include/Library/SynchronizationLib.h b/MdePkg/Include/Library/SynchronizationLib.h
>> index f97569739914..7b97683ca0af 100644
>> --- a/MdePkg/Include/Library/SynchronizationLib.h
>> +++ b/MdePkg/Include/Library/SynchronizationLib.h
>> @@ -184,6 +184,32 @@ InterlockedDecrement (
>>
>>
>>  /**
>> +  Performs an atomic compare exchange operation on a 16-bit unsigned integer.
>> +
>> +  Performs an atomic compare exchange operation on the 16-bit unsigned integer
>> +  specified by Value.  If Value is equal to CompareValue, then Value is set to
>> +  ExchangeValue and CompareValue is returned.  If Value is not equal to CompareValue,
>> +  then Value is returned.  The compare exchange operation must be performed using
>> +  MP safe mechanisms.
>> +
>> +  If Value is NULL, then ASSERT().
>> +
>> +  @param  Value         A pointer to the 16-bit value for the compare exchange
>> +                        operation.
>> +  @param  CompareValue  16-bit value used in compare operation.
>> +  @param  ExchangeValue 16-bit value used in exchange operation.
>> +
>> +  @return The original *Value before exchange.
>> +**/
>> +UINT16
>> +EFIAPI
>> +InterlockedCompareExchange16 (
>> +  IN OUT  UINT16  *Value,
>> +  IN      UINT16  CompareValue,
>> +  IN      UINT16  ExchangeValue
>> +  );
>> +
>> +/**
>>    Performs an atomic compare exchange operation on a 32-bit unsigned integer.
>>
>>Perf
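
The contract described in the quoted header comment can be sketched in portable C. This is not the code from the patch, which uses per-architecture assembly and toolchain intrinsics. The name cas16 is hypothetical, and the sketch assumes a GCC or Clang toolchain that provides the __sync_val_compare_and_swap builtin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Atomically compare *value with `compare`; if they are equal, store
 * `exchange`. Returns the value *value held before the operation,
 * whether or not the exchange happened.
 */
static uint16_t cas16(volatile uint16_t *value, uint16_t compare,
                      uint16_t exchange)
{
    assert(value != NULL);  /* mirrors "If Value is NULL, then ASSERT()" */
    return __sync_val_compare_and_swap(value, compare, exchange);
}
```

A caller can tell whether the exchange took place by comparing the return value with the compare value it passed in.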

[Xen-devel] [PATCH 0/2] x86: tboot adjustments

2015-02-18 Thread Jan Beulich
1: invalidate FIX_TBOOT_MAP_ADDRESS mapping after use
2: simplify DMAR table copying

Signed-off-by: Jan Beulich 




[Xen-devel] [PATCH 1/2] x86/tboot: invalidate FIX_TBOOT_MAP_ADDRESS mapping after use

2015-02-18 Thread Jan Beulich
In order for commit cbeeaa7d ("x86/nmi: fix shootdown of pcpus
running in VMX non-root mode")'s re-use of that fixmap entry to not
cause undesirable (in crash context) cross-CPU TLB flushes, invalidate
the fixmap entry right after use.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/tboot.c
+++ b/xen/arch/x86/tboot.c
@@ -138,6 +138,7 @@ void __init tboot_probe(void)
   TXT_PUB_CONFIG_REGS_BASE + TXTCR_SINIT_BASE);
 tboot_copy_memory((unsigned char *)&sinit_size, sizeof(sinit_size),
   TXT_PUB_CONFIG_REGS_BASE + TXTCR_SINIT_SIZE);
+__set_fixmap(FIX_TBOOT_MAP_ADDRESS, 0, 0);
 }
 
 /* definitions from xen/drivers/passthrough/vtd/iommu.h
@@ -476,6 +477,8 @@ int __init tboot_parse_dmar_table(acpi_t
 dmar_table_raw = xmalloc_array(unsigned char, dmar_table_length);
 tboot_copy_memory(dmar_table_raw, dmar_table_length, pa);
 dmar_table = (struct acpi_table_header *)dmar_table_raw;
+__set_fixmap(FIX_TBOOT_MAP_ADDRESS, 0, 0);
+
 rc = dmar_handler(dmar_table);
 xfree(dmar_table_raw);
 





[Xen-devel] [PATCH 2/2] x86/tboot: simplify DMAR table copying

2015-02-18 Thread Jan Beulich
There's no need for more than one variable, no need for casts, and no
point in using the type-safe xmalloc_array() here.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/tboot.c
+++ b/xen/arch/x86/tboot.c
@@ -435,13 +435,12 @@ int __init tboot_protect_mem_regions(voi
 
 int __init tboot_parse_dmar_table(acpi_table_handler dmar_handler)
 {
-struct acpi_table_header *dmar_table;
 int rc;
 uint64_t size;
 uint32_t dmar_table_length;
 unsigned long pa;
 sinit_mle_data_t sinit_mle_data;
-unsigned char *dmar_table_raw;
+void *dmar_table;
 
 if ( !tboot_in_measured_env() )
 return acpi_table_parse(ACPI_SIG_DMAR, dmar_handler);
@@ -474,13 +473,12 @@ int __init tboot_parse_dmar_table(acpi_t
 tboot_copy_memory((unsigned char *)&dmar_table_length,
   sizeof(dmar_table_length),
   pa + sizeof(char) * ACPI_NAME_SIZE);
-dmar_table_raw = xmalloc_array(unsigned char, dmar_table_length);
-tboot_copy_memory(dmar_table_raw, dmar_table_length, pa);
-dmar_table = (struct acpi_table_header *)dmar_table_raw;
+dmar_table = xmalloc_bytes(dmar_table_length);
+tboot_copy_memory(dmar_table, dmar_table_length, pa);
 __set_fixmap(FIX_TBOOT_MAP_ADDRESS, 0, 0);
 
 rc = dmar_handler(dmar_table);
-xfree(dmar_table_raw);
+xfree(dmar_table);
 
 /* acpi_parse_dmar() zaps APCI DMAR signature in TXT heap table */
 /* but dom0 will read real table, so must zap it there too */
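
The simplification can be illustrated with standard C, where a void pointer converts implicitly to any object pointer type. In this sketch malloc(), memcpy() and free() stand in for xmalloc_bytes(), tboot_copy_memory() and xfree(); struct table_header, table_handler, parse_copied_table() and length_of() are hypothetical names. The allocation check is an addition of the sketch and is not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct acpi_table_header. */
struct table_header {
    char     signature[4];
    uint32_t length;
};

/* Hypothetical handler type, similar in shape to acpi_table_handler. */
typedef int (*table_handler)(struct table_header *table);

/* Copy len bytes from src into a fresh buffer and pass it to handler. */
static int parse_copied_table(const void *src, uint32_t len,
                              table_handler handler)
{
    void *table;  /* a single variable; void * needs no casts in C */
    int rc;

    assert(len >= sizeof(struct table_header));
    table = malloc(len);        /* plays the role of xmalloc_bytes() */
    if (table == NULL)
        return -1;
    memcpy(table, src, len);    /* plays the role of tboot_copy_memory() */
    rc = handler(table);        /* implicit conversion to the struct pointer */
    free(table);                /* plays the role of xfree() */
    return rc;
}

/* Example handler: report the length field of the copied header. */
static int length_of(struct table_header *table)
{
    return (int)table->length;
}
```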





Re: [Xen-devel] [PATCH V5 06/12] x86/hvm: factor out and rename vm_event related functions

2015-02-18 Thread Jan Beulich
>>> On 17.02.15 at 18:37,  wrote:
> On Tue, Feb 17, 2015 at 12:56 PM, Jan Beulich  wrote:
> On 13.02.15 at 17:33,  wrote:
>>> +static void hvm_event_cr(uint32_t reason, unsigned long value,
>>> +unsigned long old)
>>> +{
>>> +vm_event_request_t req = {
>>> +.reason = reason,
>>> +.vcpu_id = current->vcpu_id,
>>> +.u.mov_to_cr.new_value = value,
>>> +.u.mov_to_cr.old_value = old
>>> +};
>>> +uint64_t parameters = 0;
>>> +
>>> +switch(reason)
>>
>> Coding style. Also I continue to think using switch() here rather than
>> having the caller pass both VM_EVENT_* and HVM_PARAM_* is ugly/
>> inefficient (even if the compiler may be able to sort this out for you).
> 
> It's getting retired in the series so there isn't much point in
> tweaking it here.

I realized that looking at later patches in this series, but then you
could similarly argue that the other requested adjustments are
unnecessary. But please always keep in mind that series may get
applied partially. And of course ideally a series wouldn't introduce
code just for a later patch to delete it again - i.e. if you already
find you want/need to do that, then please accept that coding
style remarks are still being made and considered relevant.

Jan
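
The two shapes under discussion can be compared in a small sketch. All names and constant values below are hypothetical placeholders, not Xen's definitions: variant A derives the parameter from the reason inside the callee, variant B has the caller pass both values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical placeholders for VM_EVENT_* and HVM_PARAM_* constants. */
enum { EV_REASON_CR0 = 1, EV_REASON_CR3 = 2 };
enum { PARAM_NONE = 0, PARAM_CR0 = 10, PARAM_CR3 = 11 };

/* Variant A: the callee maps the reason to a parameter with a switch. */
static int param_by_switch(uint32_t reason)
{
    switch ( reason )
    {
    case EV_REASON_CR0: return PARAM_CR0;
    case EV_REASON_CR3: return PARAM_CR3;
    default:            return PARAM_NONE;
    }
}

/* Variant B: the caller knows both values and passes them in. */
static int param_by_argument(uint32_t reason, int param)
{
    (void)reason;                 /* still used for the request itself */
    assert(param != PARAM_NONE);  /* the caller must supply a real parameter */
    return param;
}
```

Variant B needs no mapping code in the callee, at the cost of one extra argument at each call site.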




Re: [Xen-devel] [PATCH v4] modify the IO_TLB_SEGSIZE and IO_TLB_DEFAULT_SIZE configurable as flexible requirement about SW-IOMMU.

2015-02-18 Thread Wang, Xiaoming
Dear Jan

> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Tuesday, February 17, 2015 6:09 PM
> To: Wang, Xiaoming
> Cc: ch...@chris-wilson.co.uk; david.vra...@citrix.com;
> lau...@codeaurora.org; heiko.carst...@de.ibm.com; li...@horizon.com;
> Liu, Chuansheng; Zhang, Dongxing; takahiro.aka...@linaro.org;
> a...@linux-foundation.org; linux-m...@linux-mips.org; ralf@linux-
> mips.org; xen-de...@lists.xenproject.org; boris.ostrov...@oracle.com;
> konrad.w...@oracle.com; d.kasat...@samsung.com; pebo...@tiscali.nl;
> linux-ker...@vger.kernel.org
> Subject: Re: [Xen-devel] [PATCH v4] modify the IO_TLB_SEGSIZE and
> IO_TLB_DEFAULT_SIZE configurable as flexible requirement about SW-
> IOMMU.
> 
> >>> On 17.02.15 at 07:51,  wrote:
> > --- a/Documentation/kernel-parameters.txt
> > +++ b/Documentation/kernel-parameters.txt
> > @@ -3438,10 +3438,12 @@ bytes respectively. Such letter suffixes can
> > also be entirely omitted.
> > it if 0 is given (See
> Documentation/cgroups/memory.txt)
> >
> > swiotlb=[ARM,IA-64,PPC,MIPS,X86]
> > -   Format: {  | force }
> > +   Format: {  | force |  | }
> >  -- Number of I/O TLB slabs
> > force -- force using of bounce buffers even if they
> >  wouldn't be automatically used by the kernel
> > +-- Maximum allowable number of contiguous
> slabs to map
> > +-- The size of SW-MMU mapped.
> 
> This makes no sense - the new numbers added aren't position independent
> (nor were the previous  and "force").
> 
They can be separated with ",", one after another.
We do that in lib/swiotlb.c.
> Also you are (supposedly) removing all uses of IO_TLB_DEFAULT_SIZE, yet
> you don't seem to remove the definition itself.
> 
I have changed all uses of IO_TLB_DEFAULT_SIZE to io_tlb_default_size in
lib/swiotlb.c.
> Finally - are arbitrary numbers really okay for the newly added command line
> options? I.e. shouldn't you add some checking of their validity?
> 
I have verified that this code works.
Example:
BOARD_KERNEL_CMDLINE += swiotlb=, ,512,268435456
io_tlb_segsize has been changed from 128 to 512
io_tlb_default_size has been changed from 64M to 268435456 (256M)

> Jan
Xiaoming.




Re: [Xen-devel] [PATCH 13/13] xen: allow more than 512 GB of RAM for 64 bit pv-domains

2015-02-18 Thread Paul Bolle
On Wed, 2015-02-18 at 07:52 +0100, Juergen Gross wrote:
> 64 bit pv-domains under Xen are limited to 512 GB of RAM today. The
> main reason has been the 3 level p2m tree, which was replaced by the
> virtual mapped linear p2m list. Parallel to the p2m list which is
> being used by the kernel itself there is a 3 level mfn tree for usage
> by the Xen tools and eventually for crash dump analysis. For this tree
> the linear p2m list can serve as a replacement, too. As the kernel
> can't know whether the tools are capable of dealing with the p2m list
> instead of the mfn tree, the limit of 512 GB can't be dropped in all
> cases.
> 
> This patch replaces the hard limit by a kernel parameter which tells
> the kernel to obey the 512 GB limit or not. The default is selected by
> a configuration parameter which specifies whether the 512 GB limit
> should be active per default for dom0 (only crash dump analysis is
> affected) and/or for domUs (additionally domain save/restore/migration
> are affected).
> 
> Memory above the domain limit is returned to the hypervisor instead of
> being identity mapped, which was wrong anyways.
> 
> The kernel configuration parameter to specify the maximum size of a
> domain can be deleted, as it is not relevant any more.
> 
> Signed-off-by: Juergen Gross 
> ---
>  Documentation/kernel-parameters.txt |  7
>  arch/x86/include/asm/xen/page.h     |  4 ---
>  arch/x86/xen/Kconfig                | 31 +++-
>  arch/x86/xen/p2m.c                  | 10 +++---
>  arch/x86/xen/setup.c                | 72 ++---
>  5 files changed, 93 insertions(+), 31 deletions(-)

[...]

> --- a/arch/x86/xen/Kconfig
> +++ b/arch/x86/xen/Kconfig
> @@ -23,14 +23,29 @@ config XEN_PVHVM
>   def_bool y
>   depends on XEN && PCI && X86_LOCAL_APIC
>  
> -config XEN_MAX_DOMAIN_MEMORY
> -   int
> -   default 500 if X86_64
> -   default 64 if X86_32
> -   depends on XEN
> -   help
> - This only affects the sizing of some bss arrays, the unused
> - portions of which are freed.
> +if X86_64

Not
&& XEN
?

> +choice
> + prompt "Support pv-domains larger than 512GB"
> + default XEN_512GB_NONE
> + help
> +   Support paravirtualized domains with more than 512GB of RAM.
> +
> +   The Xen tools and crash dump analysis tools might not support
> +   pv-domains with more than 512 GB of RAM. This option controls the
> +   default setting of the kernel to use only up to 512 GB or more.
> +   It is always possible to change the default via specifying the
> +   boot parameter "xen_512gb_limit".
> +
> + config XEN_512GB_NONE
> + bool "neither dom0 nor domUs can be larger than 512GB"
> + config XEN_512GB_DOM0
> + bool "dom0 can be larger than 512GB, domUs not"
> + config XEN_512GB_DOMU
> + bool "domUs can be larger than 512GB, dom0 not"
> + config XEN_512GB_ALL
> + bool "dom0 and domUs can be larger than 512GB"
> +endchoice

So there are actually two independent limits, configured through a
choice with four entries. Would using just two separate Kconfig symbols
(XEN_512GB_DOM0 and XEN_512GB_DOMU) without a choice wrapper also work?
Because ...

> +endif
>  
>  config XEN_SAVE_RESTORE
> bool

[...]
 
> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> index 84a6473..16d94de 100644
> --- a/arch/x86/xen/setup.c
> +++ b/arch/x86/xen/setup.c
> @@ -32,6 +32,8 @@
>  #include "p2m.h"
>  #include "mmu.h"
>  
> +#define GB(x) ((uint64_t)(x) * 1024 * 1024 * 1024)
> +
>  /* Amount of extra memory space we add to the e820 ranges */
>  struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS] __initdata;
>  
> @@ -85,6 +87,27 @@ static struct {
>   */
>  #define EXTRA_MEM_RATIO  (10)
>  
> +static bool xen_dom0_512gb_limit __initdata =
> + IS_ENABLED(CONFIG_XEN_512GB_NONE) || IS_ENABLED(CONFIG_XEN_512GB_DOMU);

... then this could be something like:
static bool xen_dom0_512gb_limit __initdata = 
!IS_ENABLED(CONFIG_XEN_512GB_DOM0);

> +static bool xen_domu_512gb_limit __initdata =
> + IS_ENABLED(CONFIG_XEN_512GB_NONE) || IS_ENABLED(CONFIG_XEN_512GB_DOM0);
> +

and this likewise:
static bool xen_domu_512gb_limit __initdata = 
!IS_ENABLED(CONFIG_XEN_512GB_DOMU);

Correct?

> +static int __init xen_parse_512gb(char *arg)
> +{
> + bool val = false;
> +
> + if (!arg)
> + val = true;
> + else if (strtobool(arg, &val))
> + return 1;
> +
> + xen_dom0_512gb_limit = val;
> + xen_domu_512gb_limit = val;
> +
> + return 0;
> +}
> +early_param("xen_512gb_limit", xen_parse_512gb);
> +
>  static void __init xen_add_extra_mem(phys_addr_t start, phys_addr_t size)
>  {
>   int i;

So one can configure these two limits separately, but the kernel
parameter is used for both. Any particular reason?

Thanks,


Paul Bolle
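
The equivalence asked about above can be checked mechanically. The sketch below models the four choice entries as an enum and the proposed independent dom0 symbol as a bool; every name in it is hypothetical, and it covers only the dom0 limit (the domU case is symmetric).

```c
#include <assert.h>
#include <stdbool.h>

/* The four choice entries from the patch, modelled as an enum. */
enum choice { NONE, DOM0, DOMU, ALL };

/* dom0 limit as the patch computes it: active for NONE and DOMU. */
static bool dom0_limit_choice(enum choice c)
{
    assert(c >= NONE && c <= ALL);
    return c == NONE || c == DOMU;
}

/* With two independent symbols the limit is just the negation. */
static bool dom0_limit_symbols(bool dom0_large)
{
    return !dom0_large;
}

/* Would the independent dom0 symbol be set for this choice entry? */
static bool choice_allows_large_dom0(enum choice c)
{
    return c == DOM0 || c == ALL;
}

/* True if both formulations agree for every choice entry. */
static bool schemes_agree(void)
{
    for (int c = NONE; c <= ALL; c++) {
        bool large = choice_allows_large_dom0((enum choice)c);

        if (dom0_limit_choice((enum choice)c) != dom0_limit_symbols(large))
            return false;
    }
    return true;
}
```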



Re: [Xen-devel] [PATCH V5 07/12] xen: Introduce monitor_op domctl

2015-02-18 Thread Jan Beulich
>>> On 17.02.15 at 19:20,  wrote:
> On Tue, Feb 17, 2015 at 3:02 PM, Jan Beulich  wrote:
> On 13.02.15 at 17:33,  wrote:
>>>  rc = vm_event_enable(d, vec, ved, _VPF_mem_access,
>>> -HVM_PARAM_MONITOR_RING_PFN,
>>> -mem_access_notification);
>>> -
>>> -if ( vec->op == XEN_VM_EVENT_MONITOR_ENABLE_INTROSPECTION
>>> - && !rc )
>>> -p2m_setup_introspection(d);
>>> -
>>> -}
>>> -break;
>>> + HVM_PARAM_MONITOR_RING_PFN,
>>> + mem_access_notification);
>>
>> I don't see what changes for these two lines. If it's indentation, it
>> should be done right when the code gets added.
> 
> Indentation can't be fixed in the code addition as it breaks git -M.
> It reverts to the old format where it just removes the whole file and
> adds the new one. I think it's a waste to add a whole new separate
> patch just to fix indentations so I just fix it here.

Considering that indentation is broken already prior to your
series, this is perhaps acceptable. But at least if indentation
was correct before the rename, it should be afterwards. You'd
have to make use of git's -B option to control the resulting diff.

>>> +#include 
>>> +
>>> +static inline
>>> +int monitor_domctl(struct xen_domctl_monitor_op *op, struct domain *d)
>>
>> The includes above are insufficient for the types used, or you should
>> forward declare _both_ structs and not have any includes.
> 
> Just including sched.h additionally should be enough IMHO.

Resulting in a huge pile of further dependencies. Our goal really
should be to get the dependencies down, not up - improving build
time. Hence forward declarations are very likely the better choice
here.

>>> --- a/xen/include/asm-x86/domain.h
>>> +++ b/xen/include/asm-x86/domain.h
>>> @@ -241,6 +241,24 @@ struct time_scale {
>>>  u32 mul_frac;
>>>  };
>>>
>>> +//
>>> +/*monitor event options */
>>> +//
>>> +struct mov_to_cr {
>>> +uint8_t enabled;
>>> +uint8_t sync;
>>> +uint8_t onchangeonly;
>>> +};
>>> +
>>> +struct mov_to_msr {
>>> +uint8_t enabled;
>>> +uint8_t extended_capture;
>>> +};
>>> +
>>> +struct debug_event {
>>> +uint8_t enabled;
>>> +};
>>
>> These are all internal structures - is there anything wrong with using
>> bitfields here?
> 
> The use of bitfields is not good performance-wise AFAIK. Would there
> be any benefit that would offset that?

As Andrew already said - total structure size. Also I'm pretty
convinced "or $, " as well as "and $~,"
aren't much worse than "mov $,", and the code
writing these fields shouldn't be performance critical. And
"test $," and "cmp $," (as well as
their split up alternatives, should the compiler elect to do so)
ought to be equal performance wise.

Jan
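
The size argument can be made concrete with a sketch. For brevity it folds the six flags from the three quoted structures into one structure, which is a simplification and not the patch's layout. The struct and function names are hypothetical, and the exact sizes depend on the ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One byte per flag, as in the quoted patch (combined into one struct). */
struct monitor_bytes {
    uint8_t cr_enabled, cr_sync, cr_onchangeonly;
    uint8_t msr_enabled, msr_extended_capture;
    uint8_t debug_enabled;
};

/* The same six flags packed as single-bit fields. */
struct monitor_bits {
    unsigned int cr_enabled:1, cr_sync:1, cr_onchangeonly:1;
    unsigned int msr_enabled:1, msr_extended_capture:1;
    unsigned int debug_enabled:1;
};

/* Expected to hold on common ABIs where unsigned int is four bytes. */
static_assert(sizeof(struct monitor_bits) <= sizeof(struct monitor_bytes),
              "packing flags into bits should not grow the structure");

/* Writing and testing bit-fields reads like ordinary member access. */
static int enable_cr_sync(struct monitor_bits *m)
{
    m->cr_enabled = 1;
    m->cr_sync = 1;
    return m->cr_enabled && m->cr_sync && !m->cr_onchangeonly;
}
```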




Re: [Xen-devel] [PATCH V5 10/12] xen/vm_event: Relocate memop checks

2015-02-18 Thread Jan Beulich
>>> On 17.02.15 at 19:47,  wrote:
> On Tue, Feb 17, 2015 at 3:25 PM, Jan Beulich  wrote:
> On 13.02.15 at 17:33,  wrote:
>>> -int mem_paging_memop(struct domain *d, xen_mem_paging_op_t *mpo)
>>> +int mem_paging_memop(unsigned long cmd,
>>> + XEN_GUEST_HANDLE_PARAM(xen_mem_paging_op_t) arg)
>>>  {
>>> -int rc = -ENODEV;
>>> +int rc;
>>> +xen_mem_paging_op_t mpo;
>>> +struct domain *d;
>>> +
>>> +rc = -EFAULT;
>>> +if ( copy_from_guest(&mpo, arg, 1) )
>>> +return rc;
>>
>> Please don't make things more complicated than they need to be:
>> You only use the -EFAULT once here, so no reason to assign it to
>> rc up front.
> 
> This return will be a "goto out;" where the rcu is getting unlocked as well.

How that? You didn't take the RCU lock yet (which is even visible
from the rest of the hunk above).

Jan
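
The simpler form being asked for looks roughly like the sketch below. struct paging_op, copy_in() and paging_memop() are hypothetical stand-ins for xen_mem_paging_op_t, copy_from_guest() and mem_paging_memop(); the point is only that an early failure, before any lock is taken, can return the error code directly.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the guest operation structure. */
struct paging_op { int op; };

/* Hypothetical stand-in for copy_from_guest(): non-zero on failure. */
static int copy_in(struct paging_op *dst, const struct paging_op *src)
{
    assert(dst != NULL);
    if ( src == NULL )
        return 1;
    memcpy(dst, src, sizeof(*dst));
    return 0;
}

static int paging_memop(const struct paging_op *arg)
{
    struct paging_op mpo;

    /* Nothing is locked yet, so the error can be returned directly. */
    if ( copy_in(&mpo, arg) )
        return -EFAULT;

    return mpo.op;
}
```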




Re: [Xen-devel] [PATCH V5 12/12] xen/vm_event: Add RESUME option to vm_event_op domctl

2015-02-18 Thread Jan Beulich
>>> On 17.02.15 at 19:32,  wrote:
> On Tue, Feb 17, 2015 at 3:31 PM, Jan Beulich  wrote:
> On 13.02.15 at 17:33,  wrote:
>>> @@ -611,13 +611,22 @@ int vm_event_domctl(struct domain *d, 
> xen_domctl_vm_event_op_t *vec,
>>>  }
>>>  break;
>>>
>>> -case XEN_VM_EVENT_PAGING_DISABLE:
>>> +case XEN_VM_EVENT_DISABLE:
>>>  {
>>>  if ( ved->ring_page )
>>>  rc = vm_event_disable(d, ved);
>>>  }
>>>  break;
>>>
>>> +case XEN_VM_EVENT_RESUME:
>>> +{
>>> +if ( ved->ring_page )
>>> +vm_event_resume(d, ved);
>>> +else
>>> +rc = -ENODEV;
>>> +}
>>> +break;
>>
>> Stray braces again.
> 
> Ack.
> 
>>
>> I also find it confusing that the same set of changes repeats three
>> times here - is that an indication of a problem with an earlier patch?
> 
> No it's not. There are three rings vm_event can use, thus three rings
> that can be resumed.

But if the code ends up being almost identical, this loudly calls for
consolidation into e.g. a helper function.

Jan
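
A helper of the kind suggested could look like the sketch below. struct ring and ring_resume() are hypothetical and much smaller than the real vm_event structures; the resumed counter exists only so the effect can be observed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, minimal stand-in for a vm_event ring descriptor. */
struct ring {
    void *ring_page;  /* NULL while the ring is not set up */
    int   resumed;    /* counts resume calls, for illustration only */
};

/* Shared helper: resume one ring, or report -ENODEV if it is absent. */
static int ring_resume(struct ring *r)
{
    assert(r != NULL);
    if ( r->ring_page == NULL )
        return -ENODEV;
    r->resumed++;
    return 0;
}
```

Each of the three ring cases could then reduce to a single call with its own ring, instead of repeating the check and the error path.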




Re: [Xen-devel] [PATCH v4] modify the IO_TLB_SEGSIZE and IO_TLB_DEFAULT_SIZE configurable as flexible requirement about SW-IOMMU.

2015-02-18 Thread Jan Beulich
>>> On 18.02.15 at 10:09,  wrote:
>> From: Jan Beulich [mailto:jbeul...@suse.com]
>> Sent: Tuesday, February 17, 2015 6:09 PM
>> >>> On 17.02.15 at 07:51,  wrote:
>> > --- a/Documentation/kernel-parameters.txt
>> > +++ b/Documentation/kernel-parameters.txt
>> > @@ -3438,10 +3438,12 @@ bytes respectively. Such letter suffixes can
>> > also be entirely omitted.
>> >it if 0 is given (See
>> Documentation/cgroups/memory.txt)
>> >
>> >swiotlb=[ARM,IA-64,PPC,MIPS,X86]
>> > -  Format: {  | force }
>> > +  Format: {  | force |  | }
>> > -- Number of I/O TLB slabs
>> >force -- force using of bounce buffers even if they
>> > wouldn't be automatically used by the kernel
>> > +   -- Maximum allowable number of contiguous
>> slabs to map
>> > +   -- The size of SW-MMU mapped.
>> 
>> This makes no sense - the new numbers added aren't position independent
>> (nor were the previous  and "force").
>> 
> They can be separated with ",", one after another.
> We do that in lib/swiotlb.c.

Right, but the documentation above doesn't say so.

>> Also you are (supposedly) removing all uses of IO_TLB_DEFAULT_SIZE, yet
>> you don't seem to remove the definition itself.
>> 
> I have changed all uses of IO_TLB_DEFAULT_SIZE to io_tlb_default_size in
> lib/swiotlb.c.

Then are there any left elsewhere? If not, again - why don't you
remove the definition of IO_TLB_DEFAULT_SIZE?

>> Finally - are arbitrary numbers really okay for the newly added command line
>> options? I.e. shouldn't you add some checking of their validity?
>> 
> I have verified that this code works.
> Example:
> BOARD_KERNEL_CMDLINE += swiotlb=, ,512,268435456
> io_tlb_segsize has been changed from 128 to 512
> io_tlb_default_size has been changed from 64M to 268435456 (256M)

I specifically said "arbitrary numbers", which in particular includes
zero and non-power-of-2 values. If there are any restrictions on
which numbers can validly be passed here (and it very much looks
like there are), such restrictions should be enforced imo.

Jan
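
The kind of check being requested might look like the sketch below. It assumes that the restriction is "non-zero power of two", which the thread suggests but does not confirm; is_power_of_two() and set_segsize() are hypothetical names and not functions from lib/swiotlb.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True when n is a non-zero power of two. */
static bool is_power_of_two(unsigned long n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical setter: accept a requested segment size only if it
 * passes validation, otherwise keep the current value and fail.
 */
static bool set_segsize(unsigned long *segsize, unsigned long requested)
{
    assert(segsize != NULL);
    if (!is_power_of_two(requested))
        return false;
    *segsize = requested;
    return true;
}
```

With a check like this, a command line carrying 0 or 100 for the segment size would leave the built-in default untouched rather than install an unusable value.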




Re: [Xen-devel] [PATCH 13/13] xen: allow more than 512 GB of RAM for 64 bit pv-domains

2015-02-18 Thread Juergen Gross

On 02/18/2015 10:21 AM, Paul Bolle wrote:
> On Wed, 2015-02-18 at 07:52 +0100, Juergen Gross wrote:
>> 64 bit pv-domains under Xen are limited to 512 GB of RAM today. The
>> main reason has been the 3 level p2m tree, which was replaced by the
>> virtual mapped linear p2m list. Parallel to the p2m list which is
>> being used by the kernel itself there is a 3 level mfn tree for usage
>> by the Xen tools and eventually for crash dump analysis. For this tree
>> the linear p2m list can serve as a replacement, too. As the kernel
>> can't know whether the tools are capable of dealing with the p2m list
>> instead of the mfn tree, the limit of 512 GB can't be dropped in all
>> cases.
>>
>> This patch replaces the hard limit by a kernel parameter which tells
>> the kernel to obey the 512 GB limit or not. The default is selected by
>> a configuration parameter which specifies whether the 512 GB limit
>> should be active per default for dom0 (only crash dump analysis is
>> affected) and/or for domUs (additionally domain save/restore/migration
>> are affected).
>>
>> Memory above the domain limit is returned to the hypervisor instead of
>> being identity mapped, which was wrong anyways.
>>
>> The kernel configuration parameter to specify the maximum size of a
>> domain can be deleted, as it is not relevant any more.
>>
>> Signed-off-by: Juergen Gross
>> ---
>>   Documentation/kernel-parameters.txt |  7 
>>   arch/x86/include/asm/xen/page.h |  4 ---
>>   arch/x86/xen/Kconfig| 31 +++-
>>   arch/x86/xen/p2m.c  | 10 +++---
>>   arch/x86/xen/setup.c| 72 ++---
>>   5 files changed, 93 insertions(+), 31 deletions(-)
>
> [...]
>
>> --- a/arch/x86/xen/Kconfig
>> +++ b/arch/x86/xen/Kconfig
>> @@ -23,14 +23,29 @@ config XEN_PVHVM
>> def_bool y
>> depends on XEN && PCI && X86_LOCAL_APIC
>>
>> -config XEN_MAX_DOMAIN_MEMORY
>> -   int
>> -   default 500 if X86_64
>> -   default 64 if X86_32
>> -   depends on XEN
>> -   help
>> - This only affects the sizing of some bss arrays, the unused
>> - portions of which are freed.
>> +if X86_64
>
> Not
>  && XEN
> ?

The complete directory is made only if CONFIG_XEN is set.

>> +choice
>> +   prompt "Support pv-domains larger than 512GB"
>> +   default XEN_512GB_NONE
>> +   help
>> + Support paravirtualized domains with more than 512GB of RAM.
>> +
>> + The Xen tools and crash dump analysis tools might not support
>> + pv-domains with more than 512 GB of RAM. This option controls the
>> + default setting of the kernel to use only up to 512 GB or more.
>> + It is always possible to change the default via specifying the
>> + boot parameter "xen_512gb_limit".
>> +
>> +   config XEN_512GB_NONE
>> +   bool "neither dom0 nor domUs can be larger than 512GB"
>> +   config XEN_512GB_DOM0
>> +   bool "dom0 can be larger than 512GB, domUs not"
>> +   config XEN_512GB_DOMU
>> +   bool "domUs can be larger than 512GB, dom0 not"
>> +   config XEN_512GB_ALL
>> +   bool "dom0 and domUs can be larger than 512GB"
>> +endchoice
>
> So there are actually two independent limits, configured through a
> choice with four entries. Would using just two separate Kconfig symbols
> (XEN_512GB_DOM0 and XEN_512GB_DOMU) without a choice wrapper also work?

Yes.

> Because ...
>
>> +endif
>>
>>   config XEN_SAVE_RESTORE
>>  bool
>
> [...]
>
>> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
>> index 84a6473..16d94de 100644
>> --- a/arch/x86/xen/setup.c
>> +++ b/arch/x86/xen/setup.c
>> @@ -32,6 +32,8 @@
>>   #include "p2m.h"
>>   #include "mmu.h"
>>
>> +#define GB(x) ((uint64_t)(x) * 1024 * 1024 * 1024)
>> +
>>   /* Amount of extra memory space we add to the e820 ranges */
>>   struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS] __initdata;
>>
>> @@ -85,6 +87,27 @@ static struct {
>>    */
>>   #define EXTRA_MEM_RATIO   (10)
>>
>> +static bool xen_dom0_512gb_limit __initdata =
>> +   IS_ENABLED(CONFIG_XEN_512GB_NONE) || IS_ENABLED(CONFIG_XEN_512GB_DOMU);
>
> ... then this could be something like:
>  static bool xen_dom0_512gb_limit __initdata =
> !IS_ENABLED(CONFIG_XEN_512GB_DOM0);
>
>> +static bool xen_domu_512gb_limit __initdata =
>> +   IS_ENABLED(CONFIG_XEN_512GB_NONE) || IS_ENABLED(CONFIG_XEN_512GB_DOM0);
>> +
>
> and this likewise:
>  static bool xen_domu_512gb_limit __initdata =
> !IS_ENABLED(CONFIG_XEN_512GB_DOMU);
>
> Correct?

Yes.

That's a matter of taste, I think.

>> +static int __init xen_parse_512gb(char *arg)
>> +{
>> +   bool val = false;
>> +
>> +   if (!arg)
>> +   val = true;
>> +   else if (strtobool(arg, &val))
>> +   return 1;
>> +
>> +   xen_dom0_512gb_limit = val;
>> +   xen_domu_512gb_limit = val;
>> +
>> +   return 0;
>> +}
>> +early_param("xen_512gb_limit", xen_parse_512gb);
>> +
>>   static void __init xen_add_extra_mem(phys_addr_t start, phys_addr_t size)
>>   {
>> int i;
>
> So one can configure these two limits separately, but the kernel
> parameter is used for both. Any particular reason?

Yes. A kernel is running only either as Dom0 or as domU at a given time.
Having two parameters here would be nonsense, as only one could apply.

And being able to configure both limits separately does make sense,
of course.

Re: [Xen-devel] [PATCH V6 04/13] xen: Rename mem_event to vm_event

2015-02-18 Thread Jan Beulich
>>> On 18.02.15 at 01:11,  wrote:
> diff --git a/xen/common/mem_event.c b/xen/common/vm_event.c
> similarity index 59%
> rename from xen/common/mem_event.c
> rename to xen/common/vm_event.c

Looking at this already quite huge delta I can't really see why
adjusting white space at once would make it much worse. In any
case better than leaving white space damage behind.

Jan




[Xen-devel] [xen-4.4-testing test] 34688: regressions - FAIL

2015-02-18 Thread xen . org
flight 34688 xen-4.4-testing real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/34688/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-xend   5 xen-build fail REGR. vs. 34151
 build-amd64-xend  5 xen-build fail REGR. vs. 34151
 build-amd64   5 xen-build fail REGR. vs. 34151
 build-i3865 xen-build fail REGR. vs. 34151

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-pv   1 build-check(1)   blocked  n/a
 test-amd64-i386-xend-winxpsp3  1 build-check(1)   blocked  n/a
 build-i386-rumpuserxen1 build-check(1)   blocked  n/a
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-sedf-pin  1 build-check(1)   blocked  n/a
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 build-amd64-rumpuserxen   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-winxpsp3  1 build-check(1)   blocked  n/a
 test-amd64-i386-pv1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-amd64-xl-win7-amd64  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-sedf  1 build-check(1)   blocked  n/a
 test-amd64-amd64-pair 1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-winxpsp3  1 build-check(1)   blocked n/a
 test-amd64-i386-xend-qemut-winxpsp3  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-win7-amd64  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemut-winxpsp3  1 build-check(1)   blocked n/a
 test-amd64-i386-xl-winxpsp3-vcpus1  1 build-check(1)   blocked n/a
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl   1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemut-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-rhel6hvm-amd  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1)blocked n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-pcipt-intel  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemut-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-qemut-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-i386-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemut-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-qemut-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64  1 build-check(1)blocked n/a
 test-amd64-i386-rhel6hvm-intel  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  9 guest-start  fail   never pass
 test-armhf-armhf-xl-multivcpu 10 migrate-support-checkfail  never pass
 test-armhf-armhf-xl  10 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-sedf-pin 10 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-sedf 10 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-midway   10 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2   5 xen-boot fail   never pass

version targeted for testing:
 xen  50a61de84f323e32a7ce0cfdf0ce8db39330de3d
baseline version:
 xen  52e190cacf95046c99a52947aa12d7c0a2225b4d

-

Re: [Xen-devel] [PATCH 13/13] xen: allow more than 512 GB of RAM for 64 bit pv-domains

2015-02-18 Thread Jan Beulich
>>> On 18.02.15 at 10:37,  wrote:
> On 02/18/2015 10:21 AM, Paul Bolle wrote:
>> On Wed, 2015-02-18 at 07:52 +0100, Juergen Gross wrote:
>>> --- a/arch/x86/xen/Kconfig
>>> +++ b/arch/x86/xen/Kconfig
>>> @@ -23,14 +23,29 @@ config XEN_PVHVM
>>> def_bool y
>>> depends on XEN && PCI && X86_LOCAL_APIC
>>>
>>> -config XEN_MAX_DOMAIN_MEMORY
>>> -   int
>>> -   default 500 if X86_64
>>> -   default 64 if X86_32
>>> -   depends on XEN
>>> -   help
>>> - This only affects the sizing of some bss arrays, the unused
>>> - portions of which are freed.
>>> +if X86_64
>>
>> Not
>>  && XEN
>> ?
> 
> The complete directory is made only if CONFIG_XEN is set.

But that doesn't mean this file gets used only when XEN is enabled.
I would think though that an eventual "if XEN" should have wider
scope than just this option (i.e. likely almost the entire file).

Jan




Re: [Xen-devel] [PATCH 13/13] xen: allow more than 512 GB of RAM for 64 bit pv-domains

2015-02-18 Thread Juergen Gross

On 02/18/2015 10:49 AM, Jan Beulich wrote:
> On 18.02.15 at 10:37,  wrote:
>> On 02/18/2015 10:21 AM, Paul Bolle wrote:
>>> On Wed, 2015-02-18 at 07:52 +0100, Juergen Gross wrote:
>>>> --- a/arch/x86/xen/Kconfig
>>>> +++ b/arch/x86/xen/Kconfig
>>>> @@ -23,14 +23,29 @@ config XEN_PVHVM
>>>> def_bool y
>>>> depends on XEN && PCI && X86_LOCAL_APIC
>>>>
>>>> -config XEN_MAX_DOMAIN_MEMORY
>>>> -   int
>>>> -   default 500 if X86_64
>>>> -   default 64 if X86_32
>>>> -   depends on XEN
>>>> -   help
>>>> - This only affects the sizing of some bss arrays, the unused
>>>> - portions of which are freed.
>>>> +if X86_64
>>>
>>> Not
>>>   && XEN
>>> ?
>>
>> The complete directory is made only if CONFIG_XEN is set.
>
> But that doesn't mean this file gets used only when XEN is enabled.

Oh, you are right. I seem to have mixed up make and Kconfig of the
directory.

> I would think though that an eventual "if XEN" should have wider
> scope than just this option (i.e. likely almost the entire file).


Indeed.

So either I'll add the XEN dependency for the new option or I do
another patch adding "if XEN" just below configuring XEN and remove
the XEN dependencies in the rest of the entries.

As Luis is just doing a rework of XEN Kconfig stuff, I think I'll add
the XEN dependency.


Juergen




Re: [Xen-devel] [PATCH v4] modify the IO_TLB_SEGSIZE and IO_TLB_DEFAULT_SIZE configurable as flexible requirement about SW-IOMMU.

2015-02-18 Thread Wang, Xiaoming
Dear Jan

> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Wednesday, February 18, 2015 5:35 PM
> To: Wang, Xiaoming
> Cc: ch...@chris-wilson.co.uk; david.vra...@citrix.com;
> lau...@codeaurora.org; heiko.carst...@de.ibm.com; li...@horizon.com;
> Liu, Chuansheng; Zhang, Dongxing; takahiro.aka...@linaro.org;
> a...@linux-foundation.org; linux-m...@linux-mips.org; ralf@linux-
> mips.org; xen-de...@lists.xenproject.org; boris.ostrov...@oracle.com;
> konrad.w...@oracle.com; d.kasat...@samsung.com; pebo...@tiscali.nl;
> linux-ker...@vger.kernel.org
> Subject: RE: [Xen-devel] [PATCH v4] modify the IO_TLB_SEGSIZE and
> IO_TLB_DEFAULT_SIZE configurable as flexible requirement about SW-
> IOMMU.
> 
> >>> On 18.02.15 at 10:09,  wrote:
> >> From: Jan Beulich [mailto:jbeul...@suse.com]
> >> Sent: Tuesday, February 17, 2015 6:09 PM
> >> >>> On 17.02.15 at 07:51,  wrote:
> >> > --- a/Documentation/kernel-parameters.txt
> >> > +++ b/Documentation/kernel-parameters.txt
> >> > @@ -3438,10 +3438,12 @@ bytes respectively. Such letter suffixes
> >> > can also be entirely omitted.
> >> >  it if 0 is given (See
> >> Documentation/cgroups/memory.txt)
> >> >
> >> >  swiotlb=[ARM,IA-64,PPC,MIPS,X86]
> >> > -Format: {  | force }
> >> > +Format: {  | force |  | }
> >> >   -- Number of I/O TLB slabs
> >> >  force -- force using of bounce buffers even if 
> >> > they
> >> >   wouldn't be automatically used by the 
> >> > kernel
> >> > + -- Maximum allowable number of contiguous
> >> slabs to map
> >> > + -- The size of SW-MMU mapped.
> >>
> >> This makes no sense - the new numbers added aren't position
> >> independent (nor were the previous  and "force").
> >>
> > Use ","  can separate them one by one.
> > We do it at lib/swiotlb.c
> 
> Right, but the documentation above doesn't say so.
> 
OK, I will add some comments in the next patch version.
> >> Also you are (supposedly) removing all uses of IO_TLB_DEFAULT_SIZE,
> >> yet you don't seem to remove the definition itself.
> >>
> > I have changed all uses of IO_TLB_DEFAULT_SIZE to io_tlb_default_size
> > in lib/swiotlb.c
> 
> Then are there any left elsewhere? If not, again - why don't you remove the
> definition of IO_TLB_DEFAULT_SIZE?
> 
There isn't any IO_TLB_DEFAULT_SIZE left.
I checked the code: IO_TLB_DEFAULT_SIZE is only used in lib/swiotlb.c,
and I have removed the definition of IO_TLB_DEFAULT_SIZE in my patch:

@@ -120,15 +146,13 @@ unsigned long swiotlb_nr_tbl(void)
 }
 EXPORT_SYMBOL_GPL(swiotlb_nr_tbl);
 
-/* default to 64MB */
-#define IO_TLB_DEFAULT_SIZE (64UL<<20)

> >> Finally - are arbitrary numbers really okay for the newly added
> >> command line options? I.e. shouldn't you add some checking of their
> validity?
> >>
> > I have validated that this code is OK.
> > Example:
> > BOARD_KERNEL_CMDLINE += swiotlb=, ,512,268435456
> > Io_tlb_segsize has been changed from 128 to 512
> > Io_tlb_default_size has been changed from 64M to 268435456  (256M)
> 
> I specifically said "arbitrary numbers", which in particular includes zero and
> non-power-of-2 values. If there are any restrictions on which numbers can
> validly be passed here (and it very much looks like there are), such
> restrictions should be enforced imo.
> 
OK, we will validate these variables' values in the next patch version.
> Jan
Xiaoming
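
The positional format discussed above can be sketched as follows. This
is only an illustration, assuming four comma-separated fields in the
order slab count, "force", segment size, default size, where an empty
field keeps its previous value (as in the "swiotlb=, ,512,268435456"
example). The struct and function names are hypothetical; this is not
the code in lib/swiotlb.c or in the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the four positional fields. */
struct swiotlb_opts {
    unsigned long nslabs;
    int force;
    unsigned long segsize;
    unsigned long default_size;
};

/* Skip blanks at the start of a field. */
static const char *skip_blanks(const char *s)
{
    while (*s == ' ')
        s++;
    return s;
}

/* Advance to the start of the next field, or to the end of the string. */
static const char *next_field(const char *s)
{
    while (*s && *s != ',')
        s++;
    return *s == ',' ? s + 1 : s;
}

/* Parse one optional number; *out is left untouched for an empty field. */
static const char *parse_num_field(const char *s, unsigned long *out)
{
    s = skip_blanks(s);
    if (isdigit((unsigned char)*s)) {
        char *end;

        *out = strtoul(s, &end, 0);
        s = end;
    }
    return next_field(s);
}

static void parse_swiotlb(const char *s, struct swiotlb_opts *o)
{
    assert(s != NULL && o != NULL);

    s = parse_num_field(s, &o->nslabs);

    /* Second field: the literal keyword "force", or empty. */
    s = skip_blanks(s);
    if (strncmp(s, "force", 5) == 0 && (s[5] == ',' || s[5] == '\0'))
        o->force = 1;
    s = next_field(s);

    s = parse_num_field(s, &o->segsize);
    parse_num_field(s, &o->default_size);
}
```

Because the fields are positional, the documentation would need to spell
out the order and the meaning of an empty field, which is the gap pointed
out in the review.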



Re: [Xen-devel] [RFC v1 0/8] xen: kconfig changes

2015-02-18 Thread David Vrabel
On 17/02/15 07:39, Juergen Gross wrote:
> 
> If we have neither XEN_PV nor XEN_PVH set, why do we have to build
> enlighten.c? It will never be used. Same should apply to several other
> files in arch/x86/xen.

Can we limit this series to only Kconfig changes?  I don't really like
scope-creep in patch series.

David



Re: [Xen-devel] [RFC v1 0/8] xen: kconfig changes

2015-02-18 Thread Juergen Gross

On 02/18/2015 11:03 AM, David Vrabel wrote:
> On 17/02/15 07:39, Juergen Gross wrote:
>>
>> If we have neither XEN_PV nor XEN_PVH set, why do we have to build
>> enlighten.c? It will never be used. Same should apply to several other
>> files in arch/x86/xen.
>
> Can we limit this series to only Kconfig changes?  I don't really like
> scope-creep in patch series.


Are you sure this is possible? XEN will be configured in more cases than
today: this is the result of being able to build pv-drivers for hvm
domains.

BTW: it was you who wanted XEN_PVHVM to be implying XEN.

So today the complete directory arch/x86/xen isn't built for non-pv
kernels. Do you really want to change this? I don't think this is
acceptable.


Juergen



Re: [Xen-devel] [PATCH 02/13] xen: anchor linear p2m list in shared info structure

2015-02-18 Thread David Vrabel
On 18/02/15 06:51, Juergen Gross wrote:
> The linear p2m list should be anchored in the shared info structure

I'm not really sure what you mean by "anchored".

> read by the Xen tools to be able to support 64 bit pv-domains larger
> than 512 GB. Additionally the linear p2m list interface includes a
> generation count which is changed prior to and after each mapping
> change of the p2m list. Reading the generation count the Xen tools can
> detect changes of the mappings and re-read the p2m list eventually.
[...]
> --- a/arch/x86/xen/p2m.c
> +++ b/arch/x86/xen/p2m.c
> @@ -256,6 +256,10 @@ void xen_setup_mfn_list_list(void)
>   HYPERVISOR_shared_info->arch.pfn_to_mfn_frame_list_list =
>   virt_to_mfn(p2m_top_mfn);
>   HYPERVISOR_shared_info->arch.max_pfn = xen_max_p2m_pfn;
> + HYPERVISOR_shared_info->arch.p2m_generation = 0;
> + HYPERVISOR_shared_info->arch.p2m_vaddr = (unsigned long)xen_p2m_addr;
> + HYPERVISOR_shared_info->arch.p2m_cr3 =
> + xen_pfn_to_cr3(virt_to_mfn(swapper_pg_dir));
>  }
>  
>  /* Set up p2m_top to point to the domain-builder provided p2m pages */
> @@ -469,8 +473,10 @@ static pte_t *alloc_p2m_pmd(unsigned long addr, pte_t 
> *pte_pg)
>  
>   ptechk = lookup_address(vaddr, &level);
>   if (ptechk == pte_pg) {
> + HYPERVISOR_shared_info->arch.p2m_generation++;
>   set_pmd(pmdp,
>   __pmd(__pa(pte_newpg[i]) | _KERNPG_TABLE));
> + HYPERVISOR_shared_info->arch.p2m_generation++;

Do these increments of p2m_generation need to be atomic?

David



Re: [Xen-devel] [PATCH 13/13] xen: allow more than 512 GB of RAM for 64 bit pv-domains

2015-02-18 Thread Paul Bolle
On Wed, 2015-02-18 at 10:37 +0100, Juergen Gross wrote:
> On 02/18/2015 10:21 AM, Paul Bolle wrote:
> > On Wed, 2015-02-18 at 07:52 +0100, Juergen Gross wrote:
> >> +choice
> >> +  prompt "Support pv-domains larger than 512GB"
> >> +  default XEN_512GB_NONE
> >> +  help
> >> +Support paravirtualized domains with more than 512GB of RAM.
> >> +
> >> +The Xen tools and crash dump analysis tools might not support
> >> +pv-domains with more than 512 GB of RAM. This option controls the
> >> +default setting of the kernel to use only up to 512 GB or more.
> >> +It is always possible to change the default via specifying the
> >> +boot parameter "xen_512gb_limit".
> >> +
> >> +  config XEN_512GB_NONE
> >> +  bool "neither dom0 nor domUs can be larger than 512GB"
> >> +  config XEN_512GB_DOM0
> >> +  bool "dom0 can be larger than 512GB, domUs not"
> >> +  config XEN_512GB_DOMU
> >> +  bool "domUs can be larger than 512GB, dom0 not"
> >> +  config XEN_512GB_ALL
> >> +  bool "dom0 and domUs can be larger than 512GB"
> >> +endchoice
> >
> > So there are actually two independent limits, configured through a
> > choice with four entries. Would using just two separate Kconfig symbols
> > (XEN_512GB_DOM0 and XEN_512GB_DOMU) without a choice wrapper also work?
> 
> Yes.
> 
> > Because ...
> >
> >> +endif

[...]

> >> @@ -85,6 +87,27 @@ static struct {
> >>*/
> >>   #define EXTRA_MEM_RATIO  (10)
> >>
> >> +static bool xen_dom0_512gb_limit __initdata =
> >> +  IS_ENABLED(CONFIG_XEN_512GB_NONE) || IS_ENABLED(CONFIG_XEN_512GB_DOMU);
> >
> > ... then this could be something like:
> >  static bool xen_dom0_512gb_limit __initdata = 
> > !IS_ENABLED(CONFIG_XEN_512GB_DOM0);
> >
> >> +static bool xen_domu_512gb_limit __initdata =
> >> +  IS_ENABLED(CONFIG_XEN_512GB_NONE) || IS_ENABLED(CONFIG_XEN_512GB_DOM0);
> >> +
> >
> > and this likewise:
> >  static bool xen_domu_512gb_limit __initdata = 
> > !IS_ENABLED(CONFIG_XEN_512GB_DOMU);
> >
> > Correct?
> 
> Yes.
> 
> That's a matter of taste, I think.

Well, my suggestion does look simpler. Anyhow, I'll be glad to let the
maintainers decide.

> >
> >> +static int __init xen_parse_512gb(char *arg)
> >> +{
> >> +  bool val = false;
> >> +
> >> +  if (!arg)
> >> +  val = true;
> >> +  else if (strtobool(arg, &val))
> >> +  return 1;
> >> +
> >> +  xen_dom0_512gb_limit = val;
> >> +  xen_domu_512gb_limit = val;
> >> +
> >> +  return 0;
> >> +}
> >> +early_param("xen_512gb_limit", xen_parse_512gb);
> >> +
> >>   static void __init xen_add_extra_mem(phys_addr_t start, phys_addr_t size)
> >>   {
> >>int i;
> >
> > So one can configure these two limits separately, but the kernel
> > parameter is used for both. Any particular reason?
> 
> Yes. A kernel is running only either as Dom0 or as domU at a given time.
> Having two parameters here would be nonsense, as only one could apply.

I see.

> And being able to configure both limits separately does make sense,
> of course.

Thanks,


Paul Bolle
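
The equivalence of the two formulations discussed above can be checked
with a small userspace model. This is only a sketch: the enum stands in
for the four-entry Kconfig choice from the patch, and the struct is a
hypothetical model of the alternative with two independent bool symbols.
None of these C names exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* The four mutually exclusive entries of the Kconfig choice in the patch. */
enum xen_512gb_choice {
    XEN_512GB_NONE,
    XEN_512GB_DOM0,
    XEN_512GB_DOMU,
    XEN_512GB_ALL,
};

/* Defaults as written in the patch. */
static bool dom0_limit_choice(enum xen_512gb_choice c)
{
    return c == XEN_512GB_NONE || c == XEN_512GB_DOMU;
}

static bool domu_limit_choice(enum xen_512gb_choice c)
{
    return c == XEN_512GB_NONE || c == XEN_512GB_DOM0;
}

/* Hypothetical model of the alternative: two independent bool symbols. */
struct xen_512gb_bools {
    bool dom0_large;   /* stands for CONFIG_XEN_512GB_DOM0=y */
    bool domu_large;   /* stands for CONFIG_XEN_512GB_DOMU=y */
};

/* Translate a choice entry into the corresponding pair of bools. */
static struct xen_512gb_bools bools_for(enum xen_512gb_choice c)
{
    struct xen_512gb_bools b = {
        .dom0_large = (c == XEN_512GB_DOM0 || c == XEN_512GB_ALL),
        .domu_large = (c == XEN_512GB_DOMU || c == XEN_512GB_ALL),
    };
    return b;
}

/* Asserts that both formulations yield the same limits for one setting. */
static void check_equivalent(enum xen_512gb_choice c)
{
    struct xen_512gb_bools b = bools_for(c);

    assert(dom0_limit_choice(c) == !b.dom0_large);
    assert(domu_limit_choice(c) == !b.domu_large);
}
```

Calling check_equivalent() for all four entries covers every
configuration, so under this model the two variants differ only in how
the options are presented in Kconfig, not in the resulting defaults.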




Re: [Xen-devel] [PATCH 1/2] x86/tboot: invalidate FIX_TBOOT_MAP_ADDRESS mapping after use

2015-02-18 Thread Andrew Cooper
On 18/02/15 09:03, Jan Beulich wrote:
> In order for commit cbeeaa7d ("x86/nmi: fix shootdown of pcpus
> running in VMX non-root mode")'s re-use of that fixmap entry to not
> cause undesirable (in crash context) cross-CPU TLB flushes, invalidate
> the fixmap entry right after use.
>
> Signed-off-by: Jan Beulich 

Reviewed-by: Andrew Cooper 

>
> --- a/xen/arch/x86/tboot.c
> +++ b/xen/arch/x86/tboot.c
> @@ -138,6 +138,7 @@ void __init tboot_probe(void)
>TXT_PUB_CONFIG_REGS_BASE + TXTCR_SINIT_BASE);
>  tboot_copy_memory((unsigned char *)&sinit_size, sizeof(sinit_size),
>TXT_PUB_CONFIG_REGS_BASE + TXTCR_SINIT_SIZE);
> +__set_fixmap(FIX_TBOOT_MAP_ADDRESS, 0, 0);
>  }
>  
>  /* definitions from xen/drivers/passthrough/vtd/iommu.h
> @@ -476,6 +477,8 @@ int __init tboot_parse_dmar_table(acpi_t
>  dmar_table_raw = xmalloc_array(unsigned char, dmar_table_length);
>  tboot_copy_memory(dmar_table_raw, dmar_table_length, pa);
>  dmar_table = (struct acpi_table_header *)dmar_table_raw;
> +__set_fixmap(FIX_TBOOT_MAP_ADDRESS, 0, 0);
> +
>  rc = dmar_handler(dmar_table);
>  xfree(dmar_table_raw);
>  
>
>
>




Re: [Xen-devel] [PATCH 2/2] x86/tboot: simplify DMAR table copying

2015-02-18 Thread Andrew Cooper
On 18/02/15 09:03, Jan Beulich wrote:
> There's no need for more than one variable, no need for casts, and no
> point in using the type-safe xmalloc_array() here.
>
> Signed-off-by: Jan Beulich 

Reviewed-by: Andrew Cooper 

>
> --- a/xen/arch/x86/tboot.c
> +++ b/xen/arch/x86/tboot.c
> @@ -435,13 +435,12 @@ int __init tboot_protect_mem_regions(voi
>  
>  int __init tboot_parse_dmar_table(acpi_table_handler dmar_handler)
>  {
> -struct acpi_table_header *dmar_table;
>  int rc;
>  uint64_t size;
>  uint32_t dmar_table_length;
>  unsigned long pa;
>  sinit_mle_data_t sinit_mle_data;
> -unsigned char *dmar_table_raw;
> +void *dmar_table;
>  
>  if ( !tboot_in_measured_env() )
>  return acpi_table_parse(ACPI_SIG_DMAR, dmar_handler);
> @@ -474,13 +473,12 @@ int __init tboot_parse_dmar_table(acpi_t
>  tboot_copy_memory((unsigned char *)&dmar_table_length,
>sizeof(dmar_table_length),
>pa + sizeof(char) * ACPI_NAME_SIZE);
> -dmar_table_raw = xmalloc_array(unsigned char, dmar_table_length);
> -tboot_copy_memory(dmar_table_raw, dmar_table_length, pa);
> -dmar_table = (struct acpi_table_header *)dmar_table_raw;
> +dmar_table = xmalloc_bytes(dmar_table_length);
> +tboot_copy_memory(dmar_table, dmar_table_length, pa);
>  __set_fixmap(FIX_TBOOT_MAP_ADDRESS, 0, 0);
>  
>  rc = dmar_handler(dmar_table);
> -xfree(dmar_table_raw);
> +xfree(dmar_table);
>  
>  /* acpi_parse_dmar() zaps APCI DMAR signature in TXT heap table */
>  /* but dom0 will read real table, so must zap it there too */
>
>
>




Re: [Xen-devel] [PATCH 02/13] xen: anchor linear p2m list in shared info structure

2015-02-18 Thread Juergen Gross

On 02/18/2015 11:32 AM, David Vrabel wrote:
> On 18/02/15 06:51, Juergen Gross wrote:
>> The linear p2m list should be anchored in the shared info structure
>
> I'm not really sure what you mean by "anchored".


Bad wording? What about:

The virtual address of the linear p2m list should be stored in the
shared info structure.

>> read by the Xen tools to be able to support 64 bit pv-domains larger
>> than 512 GB. Additionally the linear p2m list interface includes a
>> generation count which is changed prior to and after each mapping
>> change of the p2m list. Reading the generation count the Xen tools can
>> detect changes of the mappings and re-read the p2m list eventually.
>
> [...]
>
>> --- a/arch/x86/xen/p2m.c
>> +++ b/arch/x86/xen/p2m.c
>> @@ -256,6 +256,10 @@ void xen_setup_mfn_list_list(void)
>> HYPERVISOR_shared_info->arch.pfn_to_mfn_frame_list_list =
>> virt_to_mfn(p2m_top_mfn);
>> HYPERVISOR_shared_info->arch.max_pfn = xen_max_p2m_pfn;
>> +   HYPERVISOR_shared_info->arch.p2m_generation = 0;
>> +   HYPERVISOR_shared_info->arch.p2m_vaddr = (unsigned long)xen_p2m_addr;
>> +   HYPERVISOR_shared_info->arch.p2m_cr3 =
>> +   xen_pfn_to_cr3(virt_to_mfn(swapper_pg_dir));
>>   }
>>
>>   /* Set up p2m_top to point to the domain-builder provided p2m pages */
>> @@ -469,8 +473,10 @@ static pte_t *alloc_p2m_pmd(unsigned long addr, pte_t
>> *pte_pg)
>>
>> ptechk = lookup_address(vaddr, &level);
>> if (ptechk == pte_pg) {
>> +   HYPERVISOR_shared_info->arch.p2m_generation++;
>> set_pmd(pmdp,
>> __pmd(__pa(pte_newpg[i]) | _KERNPG_TABLE));
>> +   HYPERVISOR_shared_info->arch.p2m_generation++;
>
> Do these increments of p2m_generation need to be atomic?


Hmm, they are done under lock. I don't think the compiler is allowed to
reorder the writes to p2m_generation across set_pmd().


Juergen



[Xen-devel] [PATCH OSSTEST] Debian: Add "fdt chosen" to boot script

2015-02-18 Thread Ian Campbell
This causes u-boot to fill in the various fields in the chosen node
(specifically the bootargs) which would otherwise not be done until
the bootz command. Doing it manually means the following "fdt print
/chosen" will print what is actually going to be used.

This change means that instead of whatever /chosen/bootargs is
embedded in the firmware FDT we end up printing what we will actually
use.

Signed-off-by: Ian Campbell 
---
 Osstest/Debian.pm | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Osstest/Debian.pm b/Osstest/Debian.pm
index 7633b51..f6874af 100644
--- a/Osstest/Debian.pm
+++ b/Osstest/Debian.pm
@@ -237,6 +237,8 @@ fdt set /chosen/module\@1 compatible "xen,linux-initrd" 
"xen,multiboot-module"
 fdt set /chosen/module\@1 reg <\\\${ramdisk_addr_r} 
${size_hex_prefix}\\\${filesize}>
 echo Loaded $initrd to \\\${ramdisk_addr_r} (\\\${filesize})
 
+fdt chosen
+
 fdt print /chosen
 
 echo Booting \\\${xen_addr_r} - \\\${fdt_addr}
-- 
2.1.4




Re: [Xen-devel] [PATCH 02/13] xen: anchor linear p2m list in shared info structure

2015-02-18 Thread Andrew Cooper
On 18/02/15 10:42, Juergen Gross wrote:
>
>>>   /* Set up p2m_top to point to the domain-builder provided p2m
>>> pages */
>>> @@ -469,8 +473,10 @@ static pte_t *alloc_p2m_pmd(unsigned long addr,
>>> pte_t *pte_pg)
>>>
>>>   ptechk = lookup_address(vaddr, &level);
>>>   if (ptechk == pte_pg) {
>>> +HYPERVISOR_shared_info->arch.p2m_generation++;
>>>   set_pmd(pmdp,
>>>   __pmd(__pa(pte_newpg[i]) | _KERNPG_TABLE));
>>> +HYPERVISOR_shared_info->arch.p2m_generation++;
>>
>> Do these increments of p2m_generation need to be atomic?
>
> Hmm, they are done under lock. I don't think the compiler is allowed to
> reorder the writes to p2m_generation across set_pmd().

They do need smp_wmb() to guarantee that the increment is visible before
the update occurs, just as the toolstack will need smp_rmb() to read.

They also need to be protected from concurrent update inside the kernel,
for which a lock should appear to suffice.

~Andrew




Re: [Xen-devel] [PATCH 02/13] xen: anchor linear p2m list in shared info structure

2015-02-18 Thread David Vrabel
On 18/02/15 10:42, Juergen Gross wrote:
> On 02/18/2015 11:32 AM, David Vrabel wrote:
>> On 18/02/15 06:51, Juergen Gross wrote:
>>> The linear p2m list should be anchored in the shared info structure
>>
>> I'm not really sure what you mean by "anchored".
> 
> Bad wording? What about:
> 
> The virtual address of the linear p2m list should be stored in the
> shared info structure.

This is better.

>>> read by the Xen tools to be able to support 64 bit pv-domains larger
>>> than 512 GB. Additionally the linear p2m list interface includes a
>>> generation count which is changed prior to and after each mapping
>>> change of the p2m list. Reading the generation count the Xen tools can
>>> detect changes of the mappings and re-read the p2m list eventually.
>> [...]
>>> --- a/arch/x86/xen/p2m.c
>>> +++ b/arch/x86/xen/p2m.c
>>> @@ -256,6 +256,10 @@ void xen_setup_mfn_list_list(void)
>>>   HYPERVISOR_shared_info->arch.pfn_to_mfn_frame_list_list =
>>>   virt_to_mfn(p2m_top_mfn);
>>>   HYPERVISOR_shared_info->arch.max_pfn = xen_max_p2m_pfn;
>>> +HYPERVISOR_shared_info->arch.p2m_generation = 0;
>>> +HYPERVISOR_shared_info->arch.p2m_vaddr = (unsigned
>>> long)xen_p2m_addr;
>>> +HYPERVISOR_shared_info->arch.p2m_cr3 =
>>> +xen_pfn_to_cr3(virt_to_mfn(swapper_pg_dir));
>>>   }
>>>
>>>   /* Set up p2m_top to point to the domain-builder provided p2m pages */
>>> @@ -469,8 +473,10 @@ static pte_t *alloc_p2m_pmd(unsigned long addr,
>>> pte_t *pte_pg)
>>>
>>>   ptechk = lookup_address(vaddr, &level);
>>>   if (ptechk == pte_pg) {
>>> +HYPERVISOR_shared_info->arch.p2m_generation++;
>>>   set_pmd(pmdp,
>>>   __pmd(__pa(pte_newpg[i]) | _KERNPG_TABLE));
>>> +HYPERVISOR_shared_info->arch.p2m_generation++;
>>
>> Do these increments of p2m_generation need to be atomic?
> 
> Hmm, they are done under lock.

Ok, atomic isn't necessary.

> I don't think the compiler is allowed to
> reorder the writes to p2m_generation across set_pmd().

Ok, but I think you also need to prevent the processor from reordering
the writes, so some write barriers are needed here.  (The toolstack
would then need the corresponding read barriers when checking
p2m_generation.)

David
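
The writer/reader protocol discussed in this thread can be sketched in
userspace C11 as below. This is only an illustration under stated
assumptions: the struct is a hypothetical stand-in for the shared info
fields, the C11 fences merely play the role of the kernel's write and
read barriers (they are not wmb()/rmb() themselves), and "odd generation
means update in flight" assumes the counter starts at an even value, as
with the initialisation to 0 in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared fields discussed above. */
struct shared_p2m {
    _Atomic uint64_t generation;  /* odd while an update is in flight */
    _Atomic uint64_t mapping;     /* stand-in for the p2m contents */
};

/* Writer side: assumed to run with the kernel's own lock held. */
static void p2m_update(struct shared_p2m *s, uint64_t new_mapping)
{
    uint64_t gen = atomic_load_explicit(&s->generation, memory_order_relaxed);

    assert((gen & 1) == 0);  /* lock held, so no other update is in flight */

    atomic_store_explicit(&s->generation, gen + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);  /* role of the first barrier */
    atomic_store_explicit(&s->mapping, new_mapping, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);  /* role of the second barrier */
    atomic_store_explicit(&s->generation, gen + 2, memory_order_relaxed);
}

/*
 * Reader side (the toolstack in the discussion): returns false when the
 * snapshot may be inconsistent and the caller should retry.
 */
static bool p2m_try_read(struct shared_p2m *s, uint64_t *out)
{
    uint64_t before, after, value;

    before = atomic_load_explicit(&s->generation, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);  /* role of a read barrier */
    value = atomic_load_explicit(&s->mapping, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);  /* role of a read barrier */
    after = atomic_load_explicit(&s->generation, memory_order_relaxed);

    if ((before & 1) || before != after)
        return false;
    *out = value;
    return true;
}
```

The reader never blocks the writer; it only detects that an update
overlapped its read and retries. Which kernel barrier primitive is
appropriate when the reader lives in another domain is exactly the
question being debated in the thread, and this sketch does not settle it.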



Re: [Xen-devel] [PATCH 02/13] xen: anchor linear p2m list in shared info structure

2015-02-18 Thread David Vrabel
On 18/02/15 10:50, Andrew Cooper wrote:
> On 18/02/15 10:42, Juergen Gross wrote:
>>
>>>>   /* Set up p2m_top to point to the domain-builder provided p2m
>>>> pages */
>>>> @@ -469,8 +473,10 @@ static pte_t *alloc_p2m_pmd(unsigned long addr,
>>>> pte_t *pte_pg)
>>>>
>>>>   ptechk = lookup_address(vaddr, &level);
>>>>   if (ptechk == pte_pg) {
>>>> +HYPERVISOR_shared_info->arch.p2m_generation++;
>>>>   set_pmd(pmdp,
>>>>   __pmd(__pa(pte_newpg[i]) | _KERNPG_TABLE));
>>>> +HYPERVISOR_shared_info->arch.p2m_generation++;
>>>
>>> Do these increments of p2m_generation need to be atomic?
>>
>> Hmm, they are done under lock. I don't think the compiler is allowed to
>> reorder the writes to p2m_generation across set_pmd().
> 
> They do need smp_wmb() to guarantee that the increment is visible before
> the update occurs, just as the toolstack will need smp_rmb() to read.

smp_wmb() isn't good enough since you need the barrier even on non-smp
-- you need a wmb().

David



Re: [Xen-devel] [PATCH 02/13] xen: anchor linear p2m list in shared info structure

2015-02-18 Thread Juergen Gross

On 02/18/2015 11:50 AM, Andrew Cooper wrote:
> On 18/02/15 10:42, Juergen Gross wrote:
>>
>>>>   /* Set up p2m_top to point to the domain-builder provided p2m
>>>> pages */
>>>> @@ -469,8 +473,10 @@ static pte_t *alloc_p2m_pmd(unsigned long addr,
>>>> pte_t *pte_pg)
>>>>
>>>>   ptechk = lookup_address(vaddr, &level);
>>>>   if (ptechk == pte_pg) {
>>>> +HYPERVISOR_shared_info->arch.p2m_generation++;
>>>>   set_pmd(pmdp,
>>>>   __pmd(__pa(pte_newpg[i]) | _KERNPG_TABLE));
>>>> +HYPERVISOR_shared_info->arch.p2m_generation++;
>>>
>>> Do these increments of p2m_generation need to be atomic?
>>
>> Hmm, they are done under lock. I don't think the compiler is allowed to
>> reorder the writes to p2m_generation across set_pmd().
>
> They do need smp_wmb() to guarantee that the increment is visible before
> the update occurs, just as the toolstack will need smp_rmb() to read.


Okay, I'll add smp_wmb() before and after calling set_pmd().


Juergen




Re: [Xen-devel] [PATCH 02/13] xen: anchor linear p2m list in shared info structure

2015-02-18 Thread Juergen Gross

On 02/18/2015 11:54 AM, David Vrabel wrote:

On 18/02/15 10:50, Andrew Cooper wrote:

On 18/02/15 10:42, Juergen Gross wrote:



   /* Set up p2m_top to point to the domain-builder provided p2m
pages */
@@ -469,8 +473,10 @@ static pte_t *alloc_p2m_pmd(unsigned long addr,
pte_t *pte_pg)

   ptechk = lookup_address(vaddr, &level);
   if (ptechk == pte_pg) {
+HYPERVISOR_shared_info->arch.p2m_generation++;
   set_pmd(pmdp,
   __pmd(__pa(pte_newpg[i]) | _KERNPG_TABLE));
+HYPERVISOR_shared_info->arch.p2m_generation++;


Do these increments of p2m_generation need to be atomic?


Hmm, they are done under lock. I don't think the compiler is allowed to
reorder the writes to p2m_generation across set_pmd().


They do need smp_wmb() to guarantee that the increment is visible before
the update occurs, just as the toolstack will need smp_rmb() to read.


smp_wmb() isn't good enough since you need the barrier even on non-smp
-- you need a wmb().


Okay, will do.

Juergen




Re: [Xen-devel] [PATCH 02/13] xen: anchor linear p2m list in shared info structure

2015-02-18 Thread Andrew Cooper
On 18/02/15 10:54, David Vrabel wrote:
> On 18/02/15 10:50, Andrew Cooper wrote:
>> On 18/02/15 10:42, Juergen Gross wrote:
>>>>>   /* Set up p2m_top to point to the domain-builder provided p2m pages */
>>>>> @@ -469,8 +473,10 @@ static pte_t *alloc_p2m_pmd(unsigned long addr, pte_t *pte_pg)
>>>>>
>>>>>   ptechk = lookup_address(vaddr, &level);
>>>>>   if (ptechk == pte_pg) {
>>>>> +HYPERVISOR_shared_info->arch.p2m_generation++;
>>>>>   set_pmd(pmdp,
>>>>>   __pmd(__pa(pte_newpg[i]) | _KERNPG_TABLE));
>>>>> +HYPERVISOR_shared_info->arch.p2m_generation++;
>>>> Do these increments of p2m_generation need to be atomic?
>>> Hmm, they are done under lock. I don't think the compiler is allowed to
>>> reorder the writes to p2m_generation across set_pmd().
>> They do need smp_wmb() to guarantee that the increment is visible before
>> the update occurs, just as the toolstack will need smp_rmb() to read.
> smp_wmb() isn't good enough since you need the barrier even on non-smp
> -- you need a wmb().

Ah yes.  I was thinking in the wrong context for smp.  In this case we
need to guarantee interdomain consistency even with a UP guest kernel.

~Andrew
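
The ordering agreed on in this thread (bump the generation count, barrier, update the entry, barrier, bump again) can be sketched in portable C11. This is a userspace illustration under stated assumptions, not the kernel patch: struct shared_info_sketch, update_mapping() and read_mapping() are hypothetical names, the counter is assumed to start at an even value, and atomic_thread_fence() merely stands in for the kernel's wmb()/rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared page; not the real shared_info layout. */
struct shared_info_sketch {
    _Atomic uint64_t p2m_generation; /* odd while an update is in flight */
    _Atomic uint64_t mapping;        /* stands in for the entry set_pmd() writes */
};

/* Writer side: generation++, barrier, update, barrier, generation++. */
static void update_mapping(struct shared_info_sketch *s, uint64_t new_val)
{
    uint64_t gen = atomic_load_explicit(&s->p2m_generation,
                                        memory_order_relaxed);

    assert((gen & 1) == 0); /* single writer: no update may be in flight */

    atomic_store_explicit(&s->p2m_generation, gen + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);  /* plays the role of wmb() */
    atomic_store_explicit(&s->mapping, new_val, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);  /* plays the role of wmb() */
    atomic_store_explicit(&s->p2m_generation, gen + 2, memory_order_relaxed);
}

/* Reader side: returns 1 and fills *out on a consistent read, 0 to retry. */
static int read_mapping(struct shared_info_sketch *s, uint64_t *out)
{
    uint64_t before, after, val;

    before = atomic_load_explicit(&s->p2m_generation, memory_order_acquire);
    if (before & 1)
        return 0;                               /* update in progress */

    val = atomic_load_explicit(&s->mapping, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);  /* plays the role of rmb() */
    after = atomic_load_explicit(&s->p2m_generation, memory_order_relaxed);
    if (before != after)
        return 0;                               /* raced with an update */

    *out = val;
    return 1;
}
```

A reader that gets 0 back simply retries. The fences are unconditional in this sketch, which mirrors the point made above: the other party is a different domain, so the barrier is needed even when the guest kernel is built for a single CPU.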



Re: [Xen-devel] dprintk() and gdprintk() to be compiled out when NDEBUG

2015-02-18 Thread Ian Campbell
On Wed, 2015-02-11 at 07:50 +, Jan Beulich wrote:
> All,
> 
> I'd like to propose to honor the 'd' in these functions' names (which
> I understand to mean "debug") in that such functions should be
> no-ops in non-debug builds. I'd then be inclined to introduce a
> gprintk() automatically adding XENLOG_GUEST and the printing of
> current using the %pv format.

Sounds fine to me.

>  Quite likely the (mis-)use of these
> two functions may then temporarily result in messages not meant
> to be debugging ones to become hidden in non-debug builds. If
> others agree, I'd try to make one pass through the tree to try to
> identify such,

Thanks, that would be useful I think. Will you cover arch/arm too?

> but I'd like to ask others to also keep an eye on that aspect.

I'll certainly try.

Ian.
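
The idea of a debug-only print that becomes a no-op in non-debug builds can be sketched as below. This is a minimal userspace illustration, not Xen's implementation: sketch_printk(), sketch_dprintk(), log_line() and messages_emitted are hypothetical names, and the real functions take a log level and a format string rather than a single message.

```c
#include <assert.h>
#include <stdio.h>

static unsigned int messages_emitted; /* counts messages that reach the log */

static void log_line(const char *prefix, const char *msg)
{
    assert(prefix && msg); /* both strings must be non-NULL */
    messages_emitted++;
    printf("%s%s\n", prefix, msg);
}

/* Always emitted, release builds included. */
#define sketch_printk(msg)   log_line("", (msg))

/* Emitted only in debug builds; expands to nothing when NDEBUG is defined. */
#ifndef NDEBUG
#define sketch_dprintk(msg)  log_line("(debug) ", (msg))
#else
#define sketch_dprintk(msg)  ((void)0)
#endif
```

Note that in the NDEBUG variant the argument is not evaluated at all, which is exactly why messages that were never meant to be debug-only would silently disappear and why an audit of existing callers is needed.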




Re: [Xen-devel] dprintk() and gdprintk() to be compiled out when NDEBUG

2015-02-18 Thread Jan Beulich
>>> On 18.02.15 at 11:58,  wrote:
> On Wed, 2015-02-11 at 07:50 +, Jan Beulich wrote:
>>  Quite likely the (mis-)use of these
>> two functions may then temporarily result in messages not meant
>> to be debugging ones to become hidden in non-debug builds. If
>> others agree, I'd try to make one pass through the tree to try to
>> identify such,
> 
> Thanks, that would be useful I think. Will you cover arch/arm too?

I did the patch and the auditing pass already, and I skipped - as
you kind of expected - arch/arm. Since we're not under pressure
and everyone should be doing debug builds right now anyway, I
don't think applying the to-be-posted patch without ARM
adjustments will do much harm; let me know if you think otherwise.

Jan




Re: [Xen-devel] [PATCH v18 05/16] x86/VPMU: Interface for setting PMU mode and flags

2015-02-18 Thread Dietmar Hahn
On Monday, 16 February 2015, 17:26:48, Boris Ostrovsky wrote:
> Add runtime interface for setting PMU mode and flags. Three main modes are
> provided:
> * XENPMU_MODE_OFF:  PMU is not virtualized
> * XENPMU_MODE_SELF: Guests can access PMU MSRs and receive PMU interrupts.
> * XENPMU_MODE_HV: Same as XENPMU_MODE_SELF for non-privileged guests, dom0
>   can profile itself and the hypervisor.
> 
> Note that PMU modes are different from what can be provided at Xen's boot line
> with 'vpmu' argument. An 'off' (or '0') value is equivalent to 
> XENPMU_MODE_OFF.
> Any other value, on the other hand, will cause VPMU mode to be set to
> XENPMU_MODE_SELF during boot.
> 
> For feature flags only Intel's BTS is currently supported.
> 
> Mode and flags are set via HYPERVISOR_xenpmu_op hypercall.
> 
> Signed-off-by: Boris Ostrovsky 
> Acked-by: Daniel De Graaf 
> ---
>  tools/flask/policy/policy/modules/xen/xen.te |   3 +
>  xen/arch/x86/domain.c|   6 +-
>  xen/arch/x86/hvm/svm/vpmu.c  |  25 ++-
>  xen/arch/x86/hvm/vmx/vmcs.c  |   7 +-
>  xen/arch/x86/hvm/vmx/vpmu_core2.c|  27 ++-
>  xen/arch/x86/hvm/vpmu.c  | 240 
> +--
>  xen/arch/x86/oprofile/nmi_int.c  |   3 +-
>  xen/arch/x86/x86_64/compat/entry.S   |   4 +
>  xen/arch/x86/x86_64/entry.S  |   4 +
>  xen/include/asm-x86/hvm/vmx/vmcs.h   |   7 +-
>  xen/include/asm-x86/hvm/vpmu.h   |  33 +++-
>  xen/include/public/pmu.h |  45 +
>  xen/include/public/xen.h |   1 +
>  xen/include/xen/hypercall.h  |   4 +
>  xen/include/xlat.lst |   1 +
>  xen/include/xsm/dummy.h  |  15 ++
>  xen/include/xsm/xsm.h|   6 +
>  xen/xsm/dummy.c  |   1 +
>  xen/xsm/flask/hooks.c|  18 ++
>  xen/xsm/flask/policy/access_vectors  |   2 +
>  20 files changed, 417 insertions(+), 35 deletions(-)
> 
> diff --git a/tools/flask/policy/policy/modules/xen/xen.te 
> b/tools/flask/policy/policy/modules/xen/xen.te
> index c0128aa..870ff81 100644
> --- a/tools/flask/policy/policy/modules/xen/xen.te
> +++ b/tools/flask/policy/policy/modules/xen/xen.te
> @@ -68,6 +68,9 @@ allow dom0_t xen_t:xen2 {
>  resource_op
>  psr_cmt_op
>  };
> +allow dom0_t xen_t:xen2 {
> +pmu_ctrl
> +};
>  allow dom0_t xen_t:mmu memorymap;
>  
>  # Allow dom0 to use these domctls on itself. For domctls acting on other
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index eb8ac3a..b0e3c3d 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -1536,7 +1536,7 @@ void context_switch(struct vcpu *prev, struct vcpu 
> *next)
>  if ( is_hvm_vcpu(prev) )
>  {
>  if (prev != next)
> -vpmu_save(vcpu_vpmu(prev));
> +vpmu_switch_from(vcpu_vpmu(prev), vcpu_vpmu(next));
>  
>  if ( !list_empty(&prev->arch.hvm_vcpu.tm_list) )
>  pt_save_timer(prev);
> @@ -1579,9 +1579,9 @@ void context_switch(struct vcpu *prev, struct vcpu 
> *next)
> !is_hardware_domain(next->domain));
>  }
>  
> -if (is_hvm_vcpu(next) && (prev != next) )
> +if ( is_hvm_vcpu(next) && (prev != next) )
>  /* Must be done with interrupts enabled */
> -vpmu_load(vcpu_vpmu(next));
> +vpmu_switch_to(vcpu_vpmu(prev), vcpu_vpmu(next));
>  
>  context_saved(prev);
>  
> diff --git a/xen/arch/x86/hvm/svm/vpmu.c b/xen/arch/x86/hvm/svm/vpmu.c
> index 72e2561..2cfdf08 100644
> --- a/xen/arch/x86/hvm/svm/vpmu.c
> +++ b/xen/arch/x86/hvm/svm/vpmu.c
> @@ -253,6 +253,26 @@ static int amd_vpmu_save(struct vpmu_struct *vpmu)
>  return 1;
>  }
>  
> +static void amd_vpmu_unload(struct vpmu_struct *vpmu)
> +{
> +struct vcpu *v;
> +
> +if ( vpmu_is_set(vpmu, VPMU_CONTEXT_LOADED | VPMU_FROZEN) )
> +{
> +unsigned int i;
> +
> +for ( i = 0; i < num_counters; i++ )
> +wrmsrl(ctrls[i], 0);
> +context_save(vpmu);
> +}
> +
> +v = vpmu_vcpu(vpmu);
> +if ( has_hvm_container_vcpu(v) && is_msr_bitmap_on(vpmu) )
> +amd_vpmu_unset_msr_bitmap(v);
> +
> +vpmu_reset(vpmu, VPMU_FROZEN);
> +}
> +
>  static void context_update(unsigned int msr, u64 msr_content)
>  {
>  unsigned int i;
> @@ -471,17 +491,18 @@ struct arch_vpmu_ops amd_vpmu_ops = {
>  .arch_vpmu_destroy = amd_vpmu_destroy,
>  .arch_vpmu_save = amd_vpmu_save,
>  .arch_vpmu_load = amd_vpmu_load,
> +.arch_vpmu_unload = amd_vpmu_unload,
>  .arch_vpmu_dump = amd_vpmu_dump
>  };
>  
> -int svm_vpmu_initialise(struct vcpu *v, unsigned int vpmu_flags)
> +int svm_vpmu_initialise(struct vcpu *v)
>  {
>  struct vpmu_struct *vpmu = vcpu_vpmu(v);
>  uint8_t family = current_cpu_data.x86;
>  int ret = 0;
>  
>  /* vpmu enabled? */
> -if 

Re: [Xen-devel] [PATCH 13/13] xen: allow more than 512 GB of RAM for 64 bit pv-domains

2015-02-18 Thread David Vrabel
On 18/02/15 06:52, Juergen Gross wrote:
> 
> +if X86_64
> +choice
> + prompt "Support pv-domains larger than 512GB"
> + default XEN_512GB_NONE
> + help
> +   Support paravirtualized domains with more than 512GB of RAM.
> +
> +   The Xen tools and crash dump analysis tools might not support
> +   pv-domains with more than 512 GB of RAM. This option controls the
> +   default setting of the kernel to use only up to 512 GB or more.
> +   It is always possible to change the default via specifying the
> +   boot parameter "xen_512gb_limit".
> +
> + config XEN_512GB_NONE
> + bool "neither dom0 nor domUs can be larger than 512GB"
> + config XEN_512GB_DOM0
> + bool "dom0 can be larger than 512GB, domUs not"
> + config XEN_512GB_DOMU
> + bool "domUs can be larger than 512GB, dom0 not"
> + config XEN_512GB_ALL
> + bool "dom0 and domUs can be larger than 512GB"
> +endchoice
> +endif

This configuration option doesn't look useful to me.  Can we get rid of
it with runtime checking.  e.g.,

If dom0, enable >512G.
If domU, enable >512G if requested by command line option /or/ toolstack
indicates that it supports the linear p2m.

And

If max_pfn < 512G, populate 3-level p2m /unless/ toolstack indicates it
supports the linear p2m.

People using crash analysis tools that need the 3-level p2m can clamp
dom0 memory with the Xen command line option.  FWIW, the tool we use
doesn't need this.

David
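
The runtime checks proposed above can be written down as two small predicates. This is only a sketch of the rules as stated in this mail: struct boot_facts and both function names are hypothetical, how the toolstack would signal linear p2m support is left open, and the 3-level rule is applied to dom0 and domU alike, which the mail does not settle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12
/* Number of 4 KiB page frames in 512 GiB (2^27). */
#define SKETCH_PFN_512G ((uint64_t)512 << (30 - SKETCH_PAGE_SHIFT))

/* Hypothetical summary of what the kernel knows early during boot. */
struct boot_facts {
    bool is_dom0;
    bool cmdline_allows_large;  /* requested via a command line option */
    bool toolstack_linear_p2m;  /* toolstack says it copes with the linear p2m */
    uint64_t max_pfn;
};

/* "If dom0, enable >512G. If domU, enable >512G if requested by command
 * line option or toolstack indicates that it supports the linear p2m." */
static bool allow_more_than_512g(const struct boot_facts *f)
{
    assert(f != NULL);
    if (f->is_dom0)
        return true;
    return f->cmdline_allows_large || f->toolstack_linear_p2m;
}

/* "If max_pfn < 512G, populate 3-level p2m unless toolstack indicates it
 * supports the linear p2m." */
static bool populate_3level_p2m(const struct boot_facts *f)
{
    assert(f != NULL);
    return f->max_pfn < SKETCH_PFN_512G && !f->toolstack_linear_p2m;
}
```

With rules of this shape the build-time choice disappears; the only remaining inputs are the domain type, the command line and whatever the toolstack advertises.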




Re: [Xen-devel] [RFC] Tweaking the release process for Xen 4.6

2015-02-18 Thread Ian Campbell
On Tue, 2015-02-10 at 15:04 +, Wei Liu wrote:

Not much to add... a couple of comments.

> Counting from the point that we forked the tree, it took ~11 months to
> ship 4.5. The time spent on development was 7 months (Feb 21 to Sept
> 24), and the time spent on freeze was ~4 months (Sept 24 to Jan 6).
> 
> The good thing was that code quality was ensured, the downside was that
> such long freeze 

FWIW I was really starting to feel like the freeze had gone on forever
by the end of it, and was feeling like there had been stuff which was
pushed on at the end of the summer which had now been sitting in limbo
for far too long.

> With the above proposal in mind, here is my proposed time frame for
> Xen 4.6 release:
> 
>   Development start: 6  Jan 2015
>   Feature freeze:10 Jul 2015
>   Release date:  9  Oct 2015 (could release earlier)

This is 6 months of dev + 3 of freeze, for 9 months in total.

> Note that the release date is not a goal. It is more like a deadline
> we try to keep up with. We might be well able to release earlier if
> everything is in good shape.

In fact I think we could reasonably aim for a two month freeze (~6 rcs)
and pencil in the third month as potential slippage time which we don't
plan to use.

> Any thought on this tweaked process? Comments are welcome.
> 
> Wei.
> 





Re: [Xen-devel] [PATCH 1/2] arm/xen: Correctly check if the event channel interrupt is present

2015-02-18 Thread Ian Campbell
On Thu, 2015-02-12 at 06:34 +, Julien Grall wrote:
> The function irq_of_parse_and_map returns 0 when the IRQ is not found.
> 
> Furthermore xen_events_irq is only read when the CPU is brought up, so
> it's not necessary to use the attribute __read_mostly.

Part of the purpose of __read_mostly is to move such things out of
sharing cachelines with other more hot read/write things, as much as it
is to group all the "read only" things together.
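
As a rough illustration of both points in this exchange, the sketch below places a rarely written variable in its own section, away from a frequently written counter, and treats 0 as "no IRQ found". It is a userspace approximation that assumes GCC or Clang on an ELF target; sketch_read_mostly, the section name and all function names are hypothetical and are not the kernel's definitions.

```c
#include <assert.h>

/*
 * Userspace approximation of a read-mostly annotation: variables tagged
 * with it are grouped in a separate section instead of being interleaved
 * with hot read/write data. The section name is illustrative.
 */
#define sketch_read_mostly __attribute__((__section__(".data.read_mostly")))

/* Written once during setup, read on every event afterwards. */
static unsigned int sketch_events_irq sketch_read_mostly = 0;

/* Hot, frequently written data that should not share a cacheline with it. */
static unsigned long hot_event_count;

/* Returns 0 on success, -1 if the lookup reported "no IRQ" (value 0). */
static int sketch_set_events_irq(unsigned int irq)
{
    if (irq == 0)
        return -1;
    sketch_events_irq = irq;
    return 0;
}

static unsigned int sketch_handle_event(void)
{
    assert(sketch_events_irq != 0); /* must be set up before events arrive */
    hot_event_count++;
    return sketch_events_irq;
}
```

The annotation does not make the variable read-only; it only influences where the linker places it.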





Re: [Xen-devel] [PATCH 0/2] arm/arm64: Detect Xen support earlier

2015-02-18 Thread Ian Campbell
On Thu, 2015-02-12 at 06:34 +, Julien Grall wrote:
> Hello,
> 
> This small patch series moves the detection of running on Xen earlier. This is
> required in order to support earlyprintk via Xen and selecting the preferred
> console.

Thanks for doing this, having all of the init done in an initcall (even
a relatively early one) has been a niggle I've wanted to address for ages,
for exactly earlyprintk and preferred console reasons.

I had a very minor comment on #1 but nonetheless both patches:
Acked-by: Ian Campbell 

Ian.




Re: [Xen-devel] dprintk() and gdprintk() to be compiled out when NDEBUG

2015-02-18 Thread Ian Campbell
On Wed, 2015-02-18 at 11:05 +, Jan Beulich wrote:
> >>> On 18.02.15 at 11:58,  wrote:
> > On Wed, 2015-02-11 at 07:50 +, Jan Beulich wrote:
> >>  Quite likely the (mis-)use of these
> >> two functions may then temporarily result in messages not meant
> >> to be debugging ones to become hidden in non-debug builds. If
> >> others agree, I'd try to make one pass through the tree to try to
> >> identify such,
> > 
> > Thanks, that would be useful I think. Will you cover arch/arm too?
> 
> I did the patch and the auditing pass already, and I skipped - as
> you kind of expected - arch/arm. Since we're not under pressure
> and everyone should be doing debug builds right now anyway, I
> don't think applying the to-be-posted patch without ARM
> adjustments will do much harm; let me know if you think otherwise.

No problem, please go ahead and I'll try and find a moment to audit the
ARM ones separately.

Ian.




Re: [Xen-devel] [PATCH 1/2] arm/xen: Correctly check if the event channel interrupt is present

2015-02-18 Thread Julien Grall

Hi Ian,

On 18/02/2015 11:27, Ian Campbell wrote:

On Thu, 2015-02-12 at 06:34 +, Julien Grall wrote:

The function irq_of_parse_and_map returns 0 when the IRQ is not found.

Furthermore xen_events_irq is only read when the CPU is brought up, so
it's not necessary to use the attribute __read_mostly.


Part of the purpose of __read_mostly is to move such things out of
sharing cachelines with other more hot read/write things, as much as it
is to group all the "read only" things together.


Hmmm... You are right. I didn't understand this macro like that.

I will resend the series with this patch dropped and add Ard's patch [1].

Regards,

[1] https://patches.linaro.org/44633/

--
Julien Grall



Re: [Xen-devel] [PATCH for 4.6 13/13] xen/iommu: smmu: Add Xen specific code to be able to use the driver

2015-02-18 Thread Julien Grall

Hi Manish,

On 18/02/2015 01:02, Manish wrote:


On 17/12/14 1:38 am, Julien Grall wrote:

The main goal is to modify the Linux code as little as possible, so that
new features added to the driver in the Linux repo can be ported easily.

To achieve that we:
 - Add helpers to Linux function not implemented on Xen
 - Add callbacks used by Xen to do our own stuff and call Linux ones
 - Only modify when required the code which comes from Linux. If so a
 comment has been added with /* Xen: ... */ explaining why it's
 necessary.

The support for PCI has been commented because it's not yet supported by
Xen ARM and therefore won't compile.

Signed-off-by: Julien Grall 
---
  xen/drivers/passthrough/arm/Makefile |   1 +
  xen/drivers/passthrough/arm/smmu.c   | 668
+++
  2 files changed, 602 insertions(+), 67 deletions(-)

diff --git a/xen/drivers/passthrough/arm/Makefile
b/xen/drivers/passthrough/arm/Makefile
index 0484b79..f4cd26e 100644
--- a/xen/drivers/passthrough/arm/Makefile
+++ b/xen/drivers/passthrough/arm/Makefile
@@ -1 +1,2 @@
  obj-y += iommu.o
+obj-y += smmu.o
diff --git a/xen/drivers/passthrough/arm/smmu.c
b/xen/drivers/passthrough/arm/smmu.c
index 8a6514f..3cf1773 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -18,6 +18,13 @@
   *
   * Author: Will Deacon 
   *
+ * Based on Linux drivers/iommu/arm-smmu.c
+ *=> commit e6b5be2be4e30037eb551e0ed09dd97bd00d85d3
+ *
+ * Xen modification:
+ * Julien Grall 
+ * Copyright (C) 2014 Linaro Limited.
+ *
   * This driver currently supports:
   *- SMMUv1 and v2 implementations
   *- Stream-matching and stream-indexing
@@ -28,26 +35,154 @@
   *- Context fault reporting
   */


<<>>


+/* Xen: Dummy iommu_domain */
+struct iommu_domain
+{
+	struct arm_smmu_domain *priv;
+
+	/* Used to link domain contexts for a same domain */
+	struct list_head list;
+};
+
+/* Xen: Describes informations required for a Xen domain */
+struct arm_smmu_xen_domain {
+	spinlock_t lock;
+	/* List of context (i.e iommu_domain) associated to this domain */
+	struct list_head contexts;
+};
+
+/* Xen: Information about each device stored in dev->archdata.iommu */
+struct arm_smmu_xen_device {
+	struct iommu_domain *domain;
+	struct iommu_group *group;
+};
+
+#define dev_archdata(dev) ((struct arm_smmu_xen_device *)dev->archdata.iommu)
+#define dev_iommu_domain(dev) (dev_archdata(dev)->domain)
+#define dev_iommu_group(dev) (dev_archdata(dev)->group)
+
+/* Xen: Dummy iommu_group */
+struct iommu_group
+{
+	struct arm_smmu_master_cfg *cfg;
+
+	atomic_t ref;
+};
+
+

The naming needs to be revisited in this patch. Original driver from
Will has arm_smmu_domain. This patch adds  iommu_domain,
arm_smmu_xen_domain, iommu_group.


I can't change the naming of the structures. iommu_domain and iommu_group
are from Linux. As we don't have them on Xen, I have to add dummy
structures for them.



Could you please add some description about the relation and hierarchy
of these data structures.


Good point, I will try to add more comments and explain why we have to do it.

Regards,

--
Julien Grall



Re: [Xen-devel] [PATCH 1/2] arm/xen: Correctly check if the event channel interrupt is present

2015-02-18 Thread Julien Grall



On 18/02/2015 11:41, Julien Grall wrote:

Hi Ian,

On 18/02/2015 11:27, Ian Campbell wrote:

On Thu, 2015-02-12 at 06:34 +, Julien Grall wrote:

The function irq_of_parse_and_map returns 0 when the IRQ is not found.

Furthermore xen_events_irq is only read when the CPU is brought up, so
it's not necessary to use the attribute __read_mostly.


Part of the purpose of __read_mostly is to move such things out of
sharing cachelines with other more hot read/write things, as much as it
is to group all the "read only" things together.


Hmmm... You are right. I didn't understand this macro like that.

I will resend the series with this patch dropped and add Ard's patch [1].


Actually I'm stupid... This patch is still useful except the __read_mostly.



Regards,

[1] https://patches.linaro.org/44633/



--
Julien Grall



Re: [Xen-devel] [PATCH 0/2] arm/arm64: Detect Xen support earlier

2015-02-18 Thread Julien Grall

Hi Ian,

On 18/02/2015 11:30, Ian Campbell wrote:

On Thu, 2015-02-12 at 06:34 +, Julien Grall wrote:

Hello,

This small patch series moves the detection of running on Xen earlier. This is
required in order to support earlyprintk via Xen and selecting the preferred
console.


Thanks for doing this, having all of the init done in an initcall (even
a relatively early one) has been a niggle I've wanted to address for ages,
for exactly earlyprintk and preferred console reasons.

I had a very minor comment on #1 but nonetheless both patches:
Acked-by: Ian Campbell 


Can I keep your ack on the first one with the __read_mostly dropped?

Regards,

--
Julien Grall



Re: [Xen-devel] [PATCH 13/13] xen: allow more than 512 GB of RAM for 64 bit pv-domains

2015-02-18 Thread Juergen Gross

On 02/18/2015 12:18 PM, David Vrabel wrote:

On 18/02/15 06:52, Juergen Gross wrote:


+if X86_64
+choice
+   prompt "Support pv-domains larger than 512GB"
+   default XEN_512GB_NONE
+   help
+ Support paravirtualized domains with more than 512GB of RAM.
+
+ The Xen tools and crash dump analysis tools might not support
+ pv-domains with more than 512 GB of RAM. This option controls the
+ default setting of the kernel to use only up to 512 GB or more.
+ It is always possible to change the default via specifying the
+ boot parameter "xen_512gb_limit".
+
+   config XEN_512GB_NONE
+   bool "neither dom0 nor domUs can be larger than 512GB"
+   config XEN_512GB_DOM0
+   bool "dom0 can be larger than 512GB, domUs not"
+   config XEN_512GB_DOMU
+   bool "domUs can be larger than 512GB, dom0 not"
+   config XEN_512GB_ALL
+   bool "dom0 and domUs can be larger than 512GB"
+endchoice
+endif


This configuration option doesn't look useful to me.  Can we get rid of
it with runtime checking.  e.g.,

If dom0, enable >512G.
If domU, enable >512G if requested by command line option /or/ toolstack
indicates that it supports the linear p2m.


How is the toolstack supposed to indicate this?

I'd be more than happy to get rid of that option. For Dom0 you seem to
have changed your mind (you rejected enabling >512GB as default last
year).

Doing some more tests I found the command line option is problematic:
The option seems to be evaluated only after it is needed (I did the
first tests using the config option). Can we get rid of the option
even for domU? Or do I have to pre-scan the command line for the option?


And

If max_pfn < 512G, populate 3-level p2m /unless/ toolstack indicates it
supports the linear p2m.


What about Dom0?


People using crash analysis tools that need the 3-level p2m can clamp
dom0 memory with the Xen command line option.  FWIW, the tool we use
doesn't need this.


Interesting. Which tool are you using?


Juergen




Re: [Xen-devel] [PATCH 25/45] gntalloc.h: include stdint.h in userspace

2015-02-18 Thread David Vrabel
On 16/02/15 23:05, Mikko Rapeli wrote:
> Fixes compilation error:
> 
> xen/gntalloc.h:22:2: error: unknown type name ‘uint16_t’
> 
> Signed-off-by: Mikko Rapeli 
> ---
>  include/uapi/xen/gntalloc.h | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/include/uapi/xen/gntalloc.h b/include/uapi/xen/gntalloc.h
> index 76bd580..184df7e 100644
> --- a/include/uapi/xen/gntalloc.h
> +++ b/include/uapi/xen/gntalloc.h
> @@ -11,6 +11,12 @@
>  #ifndef __LINUX_PUBLIC_GNTALLOC_H__
>  #define __LINUX_PUBLIC_GNTALLOC_H__
>  
> +#ifdef __KERNEL__
> +#include 
> +#else
> +#include 
> +#endif

I think it would be preferable to #include  only and
switch to using the __u32 etc. types (as others have suggested).

David
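
For illustration, a userspace-visible structure written with the fixed-width kernel types might look like the sketch below. It assumes the header meant above is <linux/types.h>, which provides __u16, __u32 and __u64 to both kernel and userspace builds (so it needs the Linux UAPI headers installed). The structure and function are hypothetical and are not the real gntalloc ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <linux/types.h>

/* Hypothetical ioctl argument; field names are for illustration only. */
struct sketch_alloc_request {
    __u16 domid;  /* domain the request refers to */
    __u16 flags;
    __u32 count;  /* number of items requested */
    __u64 index;  /* filled in by the kernel */
};

/* Returns the size that userspace and kernel must agree on. */
static size_t sketch_request_size(void)
{
    /* The 64-bit field sits after 2 + 2 + 4 bytes, so no padding is needed. */
    assert(offsetof(struct sketch_alloc_request, index) == 8);
    return sizeof(struct sketch_alloc_request);
}
```

With this approach the header needs a single include and no __KERNEL__ conditional, since the same type names are valid on both sides.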



Re: [Xen-devel] [PATCH 13/13] xen: allow more than 512 GB of RAM for 64 bit pv-domains

2015-02-18 Thread Juergen Gross

Sorry, used Reply instead of Reply-all

On 02/18/2015 12:18 PM, David Vrabel wrote:

On 18/02/15 06:52, Juergen Gross wrote:


+if X86_64
+choice
+   prompt "Support pv-domains larger than 512GB"
+   default XEN_512GB_NONE
+   help
+ Support paravirtualized domains with more than 512GB of RAM.
+
+ The Xen tools and crash dump analysis tools might not support
+ pv-domains with more than 512 GB of RAM. This option controls the
+ default setting of the kernel to use only up to 512 GB or more.
+ It is always possible to change the default via specifying the
+ boot parameter "xen_512gb_limit".
+
+   config XEN_512GB_NONE
+   bool "neither dom0 nor domUs can be larger than 512GB"
+   config XEN_512GB_DOM0
+   bool "dom0 can be larger than 512GB, domUs not"
+   config XEN_512GB_DOMU
+   bool "domUs can be larger than 512GB, dom0 not"
+   config XEN_512GB_ALL
+   bool "dom0 and domUs can be larger than 512GB"
+endchoice
+endif


This configuration option doesn't look useful to me.  Can we get rid of
it with runtime checking.  e.g.,

If dom0, enable >512G.
If domU, enable >512G if requested by command line option /or/ toolstack
indicates that it supports the linear p2m.


How is the toolstack supposed to indicate this?

I'd be more than happy to get rid of that option. For Dom0 you seem to
have changed your mind (you rejected enabling >512GB as default last
year).

Doing some more tests I found the command line option is problematic:
The option seems to be evaluated only after it is needed (I did the
first tests using the config option). Can we get rid of the option
even for domU? Or do I have to pre-scan the command line for the option?


And

If max_pfn < 512G, populate 3-level p2m /unless/ toolstack indicates it
supports the linear p2m.


What about Dom0?


People using crash analysis tools that need the 3-level p2m can clamp
dom0 memory with the Xen command line option.  FWIW, the tool we use
doesn't need this.


Interesting. Which tool are you using?


Juergen




Re: [Xen-devel] [PATCH for 4.6 13/13] xen/iommu: smmu: Add Xen specific code to be able to use the driver

2015-02-18 Thread Julien Grall
BTW, I have sent a few versions of this series since then. Please comment
on the latest series as the code may have changed.

Nonetheless, your comment is still valid for v3 :).

Regards,

On 18/02/2015 11:47, Julien Grall wrote:

Hi Manish,

On 18/02/2015 01:02, Manish wrote:


On 17/12/14 1:38 am, Julien Grall wrote:

The main goal is to modify the Linux code as little as possible, so that
new features added to the driver in the Linux repo can be ported easily.

To achieve that we:
 - Add helpers to Linux function not implemented on Xen
 - Add callbacks used by Xen to do our own stuff and call Linux ones
 - Only modify when required the code which comes from Linux. If
so a
 comment has been added with /* Xen: ... */ explaining why it's
 necessary.

The support for PCI has been commented because it's not yet supported by
Xen ARM and therefore won't compile.

Signed-off-by: Julien Grall 
---
  xen/drivers/passthrough/arm/Makefile |   1 +
  xen/drivers/passthrough/arm/smmu.c   | 668
+++
  2 files changed, 602 insertions(+), 67 deletions(-)

diff --git a/xen/drivers/passthrough/arm/Makefile
b/xen/drivers/passthrough/arm/Makefile
index 0484b79..f4cd26e 100644
--- a/xen/drivers/passthrough/arm/Makefile
+++ b/xen/drivers/passthrough/arm/Makefile
@@ -1 +1,2 @@
  obj-y += iommu.o
+obj-y += smmu.o
diff --git a/xen/drivers/passthrough/arm/smmu.c
b/xen/drivers/passthrough/arm/smmu.c
index 8a6514f..3cf1773 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -18,6 +18,13 @@
   *
   * Author: Will Deacon 
   *
+ * Based on Linux drivers/iommu/arm-smmu.c
+ *=> commit e6b5be2be4e30037eb551e0ed09dd97bd00d85d3
+ *
+ * Xen modification:
+ * Julien Grall 
+ * Copyright (C) 2014 Linaro Limited.
+ *
   * This driver currently supports:
   *- SMMUv1 and v2 implementations
   *- Stream-matching and stream-indexing
@@ -28,26 +35,154 @@
   *- Context fault reporting
   */


<<>>


+/* Xen: Dummy iommu_domain */
+struct iommu_domain
+{
+	struct arm_smmu_domain *priv;
+
+	/* Used to link domain contexts for a same domain */
+	struct list_head list;
+};
+
+/* Xen: Describes informations required for a Xen domain */
+struct arm_smmu_xen_domain {
+	spinlock_t lock;
+	/* List of context (i.e iommu_domain) associated to this domain */
+	struct list_head contexts;
+};
+
+/* Xen: Information about each device stored in dev->archdata.iommu */
+struct arm_smmu_xen_device {
+	struct iommu_domain *domain;
+	struct iommu_group *group;
+};
+
+#define dev_archdata(dev) ((struct arm_smmu_xen_device *)dev->archdata.iommu)
+#define dev_iommu_domain(dev) (dev_archdata(dev)->domain)
+#define dev_iommu_group(dev) (dev_archdata(dev)->group)
+
+/* Xen: Dummy iommu_group */
+struct iommu_group
+{
+	struct arm_smmu_master_cfg *cfg;
+
+	atomic_t ref;
+};
+
+

The naming needs to be revisited in this patch. Original driver from
Will has arm_smmu_domain. This patch adds  iommu_domain,
arm_smmu_xen_domain, iommu_group.


I can't change the naming of the structures. iommu_domain and iommu_group
are from Linux. As we don't have them on Xen, I have to add dummy
structures for them.


Could you please add some description about the relation and hierarchy
of these data structures.


Good point, I will try to add more comments and explain why we have to do
it.

Regards,



--
Julien Grall



Re: [Xen-devel] [PATCH 13/13] xen: allow more than 512 GB of RAM for 64 bit pv-domains

2015-02-18 Thread Andrew Cooper
On 18/02/15 11:51, Juergen Gross wrote:
>
>
>> People using crash analysis tools that need the 3-level p2m can clamp
>> dom0 memory with the Xen command line option.  FWIW, the tool we use
>> doesn't need this.
>
> Interesting. Which tool are you using?

https://github.com/xenserver/xen-crashdump-analyser

It walks dom0 by locating the real pagetables from struct vcpu, so needs
no p2m/m2p fiddling at all.

~Andrew



Re: [Xen-devel] [PATCH OSSTEST] Debian: Add "fdt chosen" to boot script

2015-02-18 Thread Ian Jackson
Ian Campbell writes ("[PATCH OSSTEST] Debian: Add "fdt chosen" to boot script"):
> This causes u-boot to fill in the various fields in the chosen node
> (specifically the bootargs) which would otherwise not be done until
> the bootz command. Doing it manually means the following "fdt print
> /chosen" will print what is actually going to be used.
> 
> This change means that instead of whatever /chosen/bootargs is
> embedded in the firmware FDT we end up printing what we will actually
> use.

FWIW

Acked-by: Ian Jackson 

but really I mean that although I have no knowledge of what this does
or why it is helpful beyond what you have written, I have no objection
to the patch :-).

Maybe this should be combined with Wei's XSM series ?

Ian.



Re: [Xen-devel] [PATCH 0/2] arm/arm64: Detect Xen support earlier

2015-02-18 Thread Ian Campbell
On Wed, 2015-02-18 at 11:51 +, Julien Grall wrote:
> Hi Ian,
> 
> On 18/02/2015 11:30, Ian Campbell wrote:
> > On Thu, 2015-02-12 at 06:34 +, Julien Grall wrote:
> >> Hello,
> >>
> >> This small patch series moves the detection of running on Xen earlier.
> >> This is required in order to support earlyprintk via Xen and selecting the
> >> preferred console.
> >
> > Thanks for doing this, having all of the init done in an initcall (even
> > a relatively early one) has been a niggle I've wanted to address for ages,
> > for exactly earlyprintk and preferred console reasons.
> >
> > I had a very minor comment on #1 but nonetheless both patches:
> > Acked-by: Ian Campbell 
> 
> Can I keep your ack on the first one with the __read_mostly dropped?

Sure.

BTW, when reposting you might want to CC the arch/arm* maintainers on
this intro mail as well as just the second patch.

Ian.




Re: [Xen-devel] [PATCH OSSTEST] Debian: Add "fdt chosen" to boot script

2015-02-18 Thread Ian Campbell
On Wed, 2015-02-18 at 11:56 +, Ian Jackson wrote:
> Ian Campbell writes ("[PATCH OSSTEST] Debian: Add "fdt chosen" to boot script"):
> > This causes u-boot to fill in the various fields in the chosen node
> > (specifically the bootargs) which would otherwise not be done until
> > the bootz command. Doing it manually means the following "fdt print
> > /chosen" will print what is actually going to be used.
> > 
> > This change means that instead of whatever /chosen/bootargs is
> > embedded in the firmware FDT we end up printing what we will actually
> > use.
> 
> FWIW
> 
> Acked-by: Ian Jackson 
> 
> but really I mean that although I have no knowledge of what this does
> or why it is helpful beyond what you have written, I have no objection
> to the patch :-).

;-) Would you like me to try and clarify or do you not really care?

> Maybe this should be combined with Wei's XSM series ?

That's as good a place as any.

Ian.




Re: [Xen-devel] [PATCH V5 06/12] x86/hvm: factor out and rename vm_event related functions

2015-02-18 Thread Tamas K Lengyel
On Wed Feb 18 2015 10:07:29 AM CET, Jan Beulich  wrote:

> > > > On 17.02.15 at 18:37,  wrote:
> > On Tue, Feb 17, 2015 at 12:56 PM, Jan Beulich 
> > wrote:
> > > > > > On 13.02.15 at 17:33,  wrote:
> > > > +static void hvm_event_cr(uint32_t reason, unsigned long value,
> > > > +                         unsigned long old)
> > > > +{
> > > > +       vm_event_request_t req = {
> > > > +               .reason = reason,
> > > > +               .vcpu_id = current->vcpu_id,
> > > > +               .u.mov_to_cr.new_value = value,
> > > > +               .u.mov_to_cr.old_value = old
> > > > +       };
> > > > +       uint64_t parameters = 0;
> > > > +
> > > > +       switch(reason)
> > > 
> > > Coding style. Also I continue to think using switch() here rather
> > > than having the caller pass both VM_EVENT_* and HVM_PARAM_* is ugly/
> > > inefficient (even if the compiler may be able to sort this out for
> > > you).
> > 
> > It's getting retired in the series so there isn't much point in
> > tweaking it here.
> 
> I realized that looking at later patches in this series, but then you
> could similarly argue that the other requested adjustments are
> unnecessary. But please always keep in mind that series may get
> applied partially. And of course ideally a series wouldn't introduce
> code just for a later patch to delete it again - i.e. if you already
> find you want/need to do that, then please accept that coding
> style remarks are still being made and considered relevant.
> 
> Jan

I do consider coding style issues relevant. Here we were talking about 
optimizing a method that is being retired in the series anyway but it is your 
call. In v6 I already made the changes you requested. 

Tamas



Re: [Xen-devel] [PATCH V5 07/12] xen: Introduce monitor_op domctl

2015-02-18 Thread Tamas K Lengyel
On Wed Feb 18 2015 10:26:00 AM CET, Jan Beulich  wrote:

> > > > On 17.02.15 at 19:20,  wrote:
> > On Tue, Feb 17, 2015 at 3:02 PM, Jan Beulich  wrote:
> > > > > > On 13.02.15 at 17:33,  wrote:
> > > > rc = vm_event_enable(d, vec, ved, _VPF_mem_access,
> > > > -                                    HVM_PARAM_MONITOR_RING_PFN,
> > > > -                                    mem_access_notification);
> > > > -
> > > > -            if ( vec->op == XEN_VM_EVENT_MONITOR_ENABLE_INTROSPECTION
> > > > -                 && !rc )
> > > > -                p2m_setup_introspection(d);
> > > > -
> > > > -        }
> > > > -        break;
> > > > +                                 HVM_PARAM_MONITOR_RING_PFN,
> > > > +                                 mem_access_notification);
> > > 
> > > I don't see what changes for these two lines. If it's indentation, it
> > > should be done right when the code gets added.
> > 
> > Indentation can't be fixed in the code addition as it breaks git -M.
> > It reverts to the old format where it just removes the whole file and
> > adds the new one. I think its a waste to add a whole new separate
> > patch just to fix indentations so I just fix it here.
> 
> Considering that indentation is broken already prior to your
> series, this is perhaps acceptable. But at least if indentation
> was correct before the rename, it should be afterwards. You'd
> have to use of git's -B option to control the resulting diff.
> 
> > > > +#include 
> > > > +
> > > > +static inline
> > > > +int monitor_domctl(struct xen_domctl_monitor_op *op, struct domain *d)
> > > 
> > > The includes above are insufficient for the types used, or you should
> > > forward declare _both_ structs and not have any includes.
> > 
> > Just including sched.h additionally should be enough IMHO.
> 
> Resulting in a huge pile of further dependencies. Our goal really
> should be to get the dependencies down, not up - improving build
> time. Hence forward declarations are very likely the better choice
> here.
> 
> > > > --- a/xen/include/asm-x86/domain.h
> > > > +++ b/xen/include/asm-x86/domain.h
> > > > @@ -241,6 +241,24 @@ struct time_scale {
> > > > u32 mul_frac;
> > > > };
> > > > 
> > > > +//
> > > > +/*                       monitor event options                         */
> > > > +//
> > > > +struct mov_to_cr {
> > > > +       uint8_t enabled;
> > > > +       uint8_t sync;
> > > > +       uint8_t onchangeonly;
> > > > +};
> > > > +
> > > > +struct mov_to_msr {
> > > > +       uint8_t enabled;
> > > > +       uint8_t extended_capture;
> > > > +};
> > > > +
> > > > +struct debug_event {
> > > > +       uint8_t enabled;
> > > > +};
> > > 
> > > These are all internal structures - is there anything wrong with
> > > using bitfields here?
> > 
> > The use if bitfields is not good performance-wise AFAIK. Would there
> > be any benefit that would offset that?
> 
> As Andrew already said - total structure size. Also I'm pretty
> convinced "or $, " as well as "and $~,"
> aren't much worse than "mov $,", and the code
> writing these fields shouldn't be performance critical. And
> "test $," and "cmp $," (as well as
> their split up alternatives, should the compiler elect to do so)
> ought to be equal performance wise.
> 
> Jan

OK, I'll switch to bitfields and adjust the patch accordingly.

Tamas
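
As a rough illustration of the structure-size argument in this exchange, the
sketch below puts a byte-per-flag layout next to a bitfield layout. The struct
names with the _bytes and _bits suffixes are made up for this example and do
not appear in the patch. Bitfields of type uint8_t are implementation-defined
in ISO C, although GCC and Clang both accept them.

```c
#include <stdint.h>

/* One byte per flag, as in the quoted hunk. */
struct mov_to_cr_bytes {
    uint8_t enabled;
    uint8_t sync;
    uint8_t onchangeonly;
};

/* The same three flags packed as single-bit fields (hypothetical layout). */
struct mov_to_cr_bits {
    uint8_t enabled:1;
    uint8_t sync:1;
    uint8_t onchangeonly:1;
};

/* Reading a flag looks identical at the source level for both layouts. */
static inline int cr_event_wanted(const struct mov_to_cr_bits *opts)
{
    return opts->enabled && !opts->onchangeonly;
}
```

With GCC and Clang on common targets the bitfield variant fits in one byte,
while the byte-per-flag variant takes three.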




Re: [Xen-devel] [PATCH V5 10/12] xen/vm_event: Relocate memop checks

2015-02-18 Thread Tamas K Lengyel
On Wed Feb 18 2015 10:29:40 AM CET, Jan Beulich  wrote:

> > > > On 17.02.15 at 19:47,  wrote:
> > On Tue, Feb 17, 2015 at 3:25 PM, Jan Beulich  wrote:
> > > > > > On 13.02.15 at 17:33,  wrote:
> > > > -int mem_paging_memop(struct domain *d, xen_mem_paging_op_t *mpo)
> > > > +int mem_paging_memop(unsigned long cmd,
> > > > +                     XEN_GUEST_HANDLE_PARAM(xen_mem_paging_op_t) arg)
> > > > {
> > > > -       int rc = -ENODEV;
> > > > +       int rc;
> > > > +       xen_mem_paging_op_t mpo;
> > > > +       struct domain *d;
> > > > +
> > > > +       rc = -EFAULT;
> > > > +       if ( copy_from_guest(&mpo, arg, 1) )
> > > > +               return rc;
> > > 
> > > Please don't make things more complicated than they need to be:
> > > You only use the -EFAULT once here, so no reason to assign it to
> > > rc up front.
> > 
> > This return will be a "goto out;" where the rcu is getting unlocked as
> > well.
> 
> How that? You didn't take the RCU lock yet (which is even visible
> from the rest of the hunk above).
> 
> Jan

Sorry, was just replying mechanically as most returns here turn into goto outs. 
Ack.

Tamas
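
A minimal sketch of the pattern being asked for: while nothing has been locked
yet, an early failure can return the error code directly instead of staging it
in rc. The types and the copy_from_guest_sketch() helper below are hypothetical
stand-ins so that the fragment compiles on its own; they are not Xen's real
definitions.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the guest-supplied argument structure. */
struct paging_op_sketch {
    unsigned int op;
    unsigned long gfn;
};

/* Hypothetical stand-in for copy_from_guest(): nonzero means failure. */
static int copy_from_guest_sketch(struct paging_op_sketch *dst,
                                  const struct paging_op_sketch *src)
{
    if ( src == NULL )
        return 1;
    memcpy(dst, src, sizeof(*dst));
    return 0;
}

static int memop_sketch(const struct paging_op_sketch *arg)
{
    struct paging_op_sketch mpo;

    /* No lock is held at this point, so there is nothing to unwind. */
    if ( copy_from_guest_sketch(&mpo, arg) )
        return -EFAULT;

    /* Failures after a lock has been taken would use "goto out" instead. */
    return 0;
}
```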




Re: [Xen-devel] [PATCH 0/2] arm/arm64: Detect Xen support earlier

2015-02-18 Thread Julien Grall



On 18/02/2015 12:03, Ian Campbell wrote:
> On Wed, 2015-02-18 at 11:51 +, Julien Grall wrote:
> > Hi Ian,
> >
> > On 18/02/2015 11:30, Ian Campbell wrote:
> > > On Thu, 2015-02-12 at 06:34 +, Julien Grall wrote:
> > > > Hello,
> > > >
> > > > This small patch series move the detection of running on Xen earlier. This is
> > > > required in order to support earlyprintk via Xen and selecting the preferred
> > > > console.
> > >
> > > Thanks for doing this, having all of the init done in an initcall (even
> > > a relatively early one) has been a niggle I've wanted address for ages,
> > > for exactly earlyprintk and preferred console reasons.
> > >
> > > I had a very minor comment on #1 but nonetheless both patches:
> > > Acked-by: Ian Campbell
> >
> > Can I keep you ack on the first one with the __read_mostly dropped?
>
> Sure.
>
> BTW, when reposting you might want to CC the arch/arm* maintainers on
> this intro mail as well as just the second patch.


Ok. I wasn't sure if this patches should go via the Xen tree or ARM one.

--
Julien Grall



Re: [Xen-devel] [PATCH V5 12/12] xen/vm_event: Add RESUME option to vm_event_op domctl

2015-02-18 Thread Tamas K Lengyel
On Wed Feb 18 2015 10:31:06 AM CET, Jan Beulich  wrote:

> > > > On 17.02.15 at 19:32,  wrote:
> > On Tue, Feb 17, 2015 at 3:31 PM, Jan Beulich  wrote:
> > > > > > On 13.02.15 at 17:33,  wrote:
> > > > @@ -611,13 +611,22 @@ int vm_event_domctl(struct domain *d, xen_domctl_vm_event_op_t *vec,
> > > > }
> > > > break;
> > > > 
> > > > -               case XEN_VM_EVENT_PAGING_DISABLE:
> > > > +               case XEN_VM_EVENT_DISABLE:
> > > > {
> > > > if ( ved->ring_page )
> > > > rc = vm_event_disable(d, ved);
> > > > }
> > > > break;
> > > > 
> > > > +               case XEN_VM_EVENT_RESUME:
> > > > +               {
> > > > +                       if ( ved->ring_page )
> > > > +                               vm_event_resume(d, ved);
> > > > +                       else
> > > > +                               rc = -ENODEV;
> > > > +               }
> > > > +               break;
> > > 
> > > Stray braces again.
> > 
> > Ack.
> > 
> > > 
> > > I also find it confusing that the same set of changes repeats three
> > > times here - is that an indication of a problem with an earlier
> > > patch?
> > 
> > No it's not. There are three rings vm_event can use, thus three rings
> > that can be resumed.
> 
> But if the code ends up being almost identical, this loudly calls for
> consolidation into e.g. a helper function.
> 
> Jan

That could be done, but considering we are talking about only a couple of
lines of code, I'm not sure that would improve readability by much.

I think the question I raised earlier, whether we need the resume option to
the domctl to begin with, is what needs discussion. IMHO the event channel
method is enough, so maybe I'll just turn this patch into deprecating the
resume options in the memops.

Thanks,
Tamas
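
For reference, the consolidation suggested in the quoted review could look
roughly like the sketch below: one helper carries the "resume if the ring is
set up, otherwise -ENODEV" logic and each of the three ring cases calls it. The
struct and function names are invented for this illustration and are not taken
from the patch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one vm_event ring. */
struct ring_sketch {
    void *ring_page;       /* NULL while the ring is not set up */
    unsigned int resumed;  /* counts resume calls, for the example only */
};

/* Hypothetical stand-in for the per-ring resume operation. */
static void ring_resume_sketch(struct ring_sketch *ved)
{
    ved->resumed++;
}

/* Shared helper: resume the ring if it exists, otherwise report -ENODEV. */
static int resume_if_enabled(struct ring_sketch *ved)
{
    if ( !ved->ring_page )
        return -ENODEV;

    ring_resume_sketch(ved);
    return 0;
}
```

Each case label would then shrink to a single "rc = resume_if_enabled(ved);"
line.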



[Xen-devel] [PATCH v2 2/3] xen/arm: Add early printk support for ThunderX platform

2015-02-18 Thread vijay . kilari
From: Vijaya Kumar K 

ThunderX platform uses pl011 uart.

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/arm/Rules.mk |4 
 1 file changed, 4 insertions(+)

diff --git a/xen/arch/arm/Rules.mk b/xen/arch/arm/Rules.mk
index c7bd227..54efa91 100644
--- a/xen/arch/arm/Rules.mk
+++ b/xen/arch/arm/Rules.mk
@@ -113,6 +113,10 @@ ifeq ($(CONFIG_EARLY_PRINTK), lager)
 EARLY_PRINTK_INC := scif
 EARLY_UART_BASE_ADDRESS := 0xe6e6
 endif
+ifeq ($(CONFIG_EARLY_PRINTK), thunderx)
+EARLY_PRINTK_INC := pl011
+EARLY_UART_BASE_ADDRESS := 0x87e02400
+endif
 
 ifneq ($(EARLY_PRINTK_INC),)
 EARLY_PRINTK := y
-- 
1.7.9.5




[Xen-devel] [PATCH v2 1/3] xen/arm: Add ThunderX platform support

2015-02-18 Thread vijay . kilari
From: Vijaya Kumar K 

Add basic support for Cavium ThunderX platform

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/arm/platforms/Makefile   |1 +
 xen/arch/arm/platforms/thunderx.c |   66 +
 xen/arch/arm/setup.c  |1 +
 3 files changed, 68 insertions(+)

diff --git a/xen/arch/arm/platforms/Makefile b/xen/arch/arm/platforms/Makefile
index e173fec..d9f98f9 100644
--- a/xen/arch/arm/platforms/Makefile
+++ b/xen/arch/arm/platforms/Makefile
@@ -7,3 +7,4 @@ obj-$(CONFIG_ARM_32) += sunxi.o
 obj-$(CONFIG_ARM_32) += rcar2.o
 obj-$(CONFIG_ARM_64) += seattle.o
 obj-$(CONFIG_ARM_64) += xgene-storm.o
+obj-$(CONFIG_ARM_64) += thunderx.o
diff --git a/xen/arch/arm/platforms/thunderx.c b/xen/arch/arm/platforms/thunderx.c
new file mode 100644
index 000..96560e1
--- /dev/null
+++ b/xen/arch/arm/platforms/thunderx.c
@@ -0,0 +1,66 @@
+/*
+ * xen/arch/arm/platforms/thunderx.c
+ *
+ * Cavium Thunder specific settings
+ *
+ * Vijaya Kumar K 
+ * Copyright (c) 2015 Cavium Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+
+static int thunderx_specific_mapping(struct domain *d)
+{
+uint64_t addr, size;
+int res;
+
+/* Mappings GSER region required for dom0 */
+addr = 0x87e09000;
+size = 0xd00;
+
+res = map_mmio_regions(d,
+   paddr_to_pfn(addr & PAGE_MASK),
+   DIV_ROUND_UP(size, PAGE_SIZE),
+   paddr_to_pfn(addr & PAGE_MASK));
+if ( res )
+{
+ printk(XENLOG_ERR "Unable to map to dom%d region"
+" 0x%"PRIx64" - 0x%"PRIx64"\n",
+d->domain_id,
+addr & PAGE_MASK, PAGE_ALIGN(addr + size) - 1);
+}
+
+return 0;
+}
+
+static const char * const thunderx_dt_compat[] __initconst =
+{
+"cavium,thunder-88xx",
+NULL
+};
+
+PLATFORM_START(thunderx, "THUNDERX")
+.compatible = thunderx_dt_compat,
+.specific_mapping = thunderx_specific_mapping,
+.dom0_gnttab_start = 0x400,
+.dom0_gnttab_size = 0x2,
+PLATFORM_END
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index a916ca6..43b626b 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -66,6 +66,7 @@ static void __init init_idle_domain(void)
 static const char * __initdata processor_implementers[] = {
 ['A'] = "ARM Limited",
 ['B'] = "Broadcom Corporation",
+['C'] = "Cavium Inc.",
 ['D'] = "Digital Equipment Corp",
 ['M'] = "Motorola, Freescale Semiconductor Inc.",
 ['P'] = "Applied Micro",
-- 
1.7.9.5




[Xen-devel] [PATCH v2 3/3] xen/arm: Skip parsing psci-0.2 from host device tree

2015-02-18 Thread vijay . kilari
From: Vijaya Kumar K 

The psci node is generated by Xen for dom0.
If the host device tree has a psci-0.2 node, skip parsing it
and avoid copying it from the host device tree to the dom0 device tree.

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/arm/domain_build.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index c2dcb49..0be639b 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1029,6 +1029,7 @@ static int handle_node(struct domain *d, struct kernel_info *kinfo,
 DT_MATCH_COMPATIBLE("xen,multiboot-module"),
 DT_MATCH_COMPATIBLE("multiboot,module"),
 DT_MATCH_COMPATIBLE("arm,psci"),
+DT_MATCH_COMPATIBLE("arm,psci-0.2"),
 DT_MATCH_PATH("/cpus"),
 DT_MATCH_TYPE("memory"),
 /* The memory mapped timer is not supported by Xen. */
-- 
1.7.9.5




[Xen-devel] [PATCH v2 0/3] Add ThunderX platform support

2015-02-18 Thread vijay . kilari
From: Vijaya Kumar K 

Changes in v2:
 - Updated patch 3 commit message
 - Updated processor_implementers[] with implementor info
   in xen/arch/arm/setup.c

Changes in v1:
 - Add support for ThunderX platform
 - Add early printk support
 - Add psci-0.2 check while parsing dt node

Vijaya Kumar K (3):
  xen/arm: Add ThunderX platform support
  xen/arm: Add early printk support for ThunderX platform
  xen/arm: Skip parsing psci-0.2 from host device tree

 xen/arch/arm/Rules.mk |4 +++
 xen/arch/arm/domain_build.c   |1 +
 xen/arch/arm/platforms/Makefile   |1 +
 xen/arch/arm/platforms/thunderx.c |   66 +
 xen/arch/arm/setup.c  |1 +
 5 files changed, 73 insertions(+)
 create mode 100644 xen/arch/arm/platforms/thunderx.c

-- 
1.7.9.5




Re: [Xen-devel] [PATCH V6 04/13] xen: Rename mem_event to vm_event

2015-02-18 Thread Tamas K Lengyel
On Wed Feb 18 2015 10:46:02 AM CET, Jan Beulich  wrote:

> > > > On 18.02.15 at 01:11,  wrote:
> > diff --git a/xen/common/mem_event.c b/xen/common/vm_event.c
> > similarity index 59%
> > rename from xen/common/mem_event.c
> > rename to xen/common/vm_event.c
> 
> Looking at this already quite huge delta I can't really see why
> adjusting white space at once would make it much worse. In any
> case better than leaving white space damage behind.
> 
> Jan

I did that in the first version of the patch and the feedback I got was that it 
is unreviewable that way.

Tamas



[Xen-devel] [PATCH] x86: Adjust rdtsc inline assembly

2015-02-18 Thread Andrew Cooper
Currently there are three related rdtsc macros, all of which are lowercase and
not obviously macros, which write by value to their parameters.

This is non-intuitive to program with, being contrary to C semantics for code
appearing to be a regular function call.  It also causes Coverity to
conclude that __udelay() has an infinite loop, as all of its loop conditions
are constant.

Two of these macros (rdtsc() and rdtscl()) have only a handful of uses while
the vast majority of code uses the rdtscll() variant.  rdtsc() and rdtscll()
are equivalent, while rdtscl() discards the high word.

Replace all 3 macros with a static inline which returns the complete tsc.

Most of this patch is a mechanical change of

  - rdtscll($FOO);
  + $FOO = rdtsc();

And a diff of the generated assembly confirms that this is no change at all.

The single use of the old rdtsc() in emulate_privileged_op() is altered to use
the new rdtsc() and the rdmsr_writeback path to set eax/edx appropriately.

The pair of uses of rdtscl() in __udelay() are extended to use full 64bit
values, which makes the overflow edge condition (and early exit from the loop)
far rarer.

Signed-off-by: Andrew Cooper 
CC: Keir Fraser 
CC: Jan Beulich 
---
 xen/arch/x86/apic.c   |4 ++--
 xen/arch/x86/cpu/mcheck/mce.c |2 +-
 xen/arch/x86/delay.c  |4 ++--
 xen/arch/x86/hvm/hvm.c|4 ++--
 xen/arch/x86/hvm/save.c   |4 ++--
 xen/arch/x86/hvm/svm/svm.c|2 +-
 xen/arch/x86/platform_hypercall.c |4 ++--
 xen/arch/x86/smpboot.c|2 +-
 xen/arch/x86/time.c   |   31 +++
 xen/arch/x86/traps.c  |5 -
 xen/include/asm-x86/msr.h |   15 ++-
 xen/include/asm-x86/time.h|4 +---
 12 files changed, 39 insertions(+), 42 deletions(-)

diff --git a/xen/arch/x86/apic.c b/xen/arch/x86/apic.c
index 39cd9e5..3217bdf 100644
--- a/xen/arch/x86/apic.c
+++ b/xen/arch/x86/apic.c
@@ -1148,7 +1148,7 @@ static int __init calibrate_APIC_clock(void)
  * We wrapped around just now. Let's start:
  */
 if (cpu_has_tsc)
-rdtscll(t1);
+t1 = rdtsc();
 tt1 = apic_read(APIC_TMCCT);
 
 /*
@@ -1159,7 +1159,7 @@ static int __init calibrate_APIC_clock(void)
 
 tt2 = apic_read(APIC_TMCCT);
 if (cpu_has_tsc)
-rdtscll(t2);
+t2 = rdtsc();
 
 /*
  * The APIC bus clock counter is 32 bits only, it
diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
index 05a86fb..3a3b4dc 100644
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -235,7 +235,7 @@ static void mca_init_bank(enum mca_source who,
 
 if (who == MCA_CMCI_HANDLER) {
 mib->mc_ctrl2 = mca_rdmsr(MSR_IA32_MC0_CTL2 + bank);
-rdtscll(mib->mc_tsc);
+mib->mc_tsc = rdtsc();
 }
 }
 
diff --git a/xen/arch/x86/delay.c b/xen/arch/x86/delay.c
index bc1772e..ef6bc5d 100644
--- a/xen/arch/x86/delay.c
+++ b/xen/arch/x86/delay.c
@@ -21,10 +21,10 @@ void __udelay(unsigned long usecs)
 unsigned long ticks = usecs * (cpu_khz / 1000);
 unsigned long s, e;
 
-rdtscl(s);
+s = rdtsc();
 do
 {
 rep_nop();
-rdtscl(e);
+e = rdtsc();
 } while ((e-s) < ticks);
 }
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index a52c6e0..72e383f 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -292,7 +292,7 @@ void hvm_set_guest_tsc_fixed(struct vcpu *v, u64 guest_tsc, u64 at_tsc)
 }
 else
 {
-rdtscll(tsc);
+tsc = rdtsc();
 }
 
 delta_tsc = guest_tsc - tsc;
@@ -326,7 +326,7 @@ u64 hvm_get_guest_tsc_fixed(struct vcpu *v, uint64_t at_tsc)
 }
 else
 {
-rdtscll(tsc);
+tsc = rdtsc();
 }
 
 return tsc + v->arch.hvm_vcpu.cache_tsc_offset;
diff --git a/xen/arch/x86/hvm/save.c b/xen/arch/x86/hvm/save.c
index 6af19be..61f780d 100644
--- a/xen/arch/x86/hvm/save.c
+++ b/xen/arch/x86/hvm/save.c
@@ -36,7 +36,7 @@ void arch_hvm_save(struct domain *d, struct hvm_save_header *hdr)
 hdr->gtsc_khz = d->arch.tsc_khz;
 
 /* Time when saving started */
-rdtscll(d->arch.hvm_domain.sync_tsc);
+d->arch.hvm_domain.sync_tsc = rdtsc();
 }
 
 int arch_hvm_load(struct domain *d, struct hvm_save_header *hdr)
@@ -71,7 +71,7 @@ int arch_hvm_load(struct domain *d, struct hvm_save_header *hdr)
 hvm_set_rdtsc_exiting(d, 1);
 
 /* Time when restore started  */
-rdtscll(d->arch.hvm_domain.sync_tsc);
+d->arch.hvm_domain.sync_tsc = rdtsc();
 
 /* VGA state is not saved/restored, so we nobble the cache. */
 d->arch.hvm_domain.stdvga.cache = 0;
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 018dd70..c83c483 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -805,7 +805,7 @@ static void svm_set_tsc_offset(struct vcpu *v, u64 offset, u64 at_tsc)
 if ( at_t
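
The __udelay() edge condition described in the commit message of this patch can
be reproduced with plain integer arithmetic, without executing rdtsc at all.
The sketch below is only an illustration: the helper names low_word() and
delay_elapsed() and the sample counter values are assumptions made for the
example and are not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mimics the old rdtscl(): keep only the low 32 bits of a counter value. */
static inline uint64_t low_word(uint64_t full_tsc)
{
    return (uint32_t)full_tsc;
}

/* Exit test of a __udelay()-style busy-wait loop. */
static inline bool delay_elapsed(uint64_t start, uint64_t now, uint64_t ticks)
{
    /* Unsigned subtraction wraps around if "now" is smaller than "start". */
    return (now - start) >= ticks;
}
```

Take a start value of 0xFFFFFFF0 and a later value of 0x100000010, so that only
0x20 ticks have really passed. Feeding the low words into delay_elapsed() makes
the subtraction wrap to a very large number and the loop would exit early,
whereas the full 64-bit values give the expected "not yet" for any delay longer
than 0x20 ticks.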

Re: [Xen-devel] [PATCH v2 1/3] xen/arm: Add ThunderX platform support

2015-02-18 Thread Julien Grall

Hello Vijay,

On 18/02/2015 12:19, vijay.kil...@gmail.com wrote:
> From: Vijaya Kumar K
>
> Add basic support for Cavium ThunderX platform
>
> Signed-off-by: Vijaya Kumar K
> ---
>   xen/arch/arm/platforms/Makefile   |1 +
>   xen/arch/arm/platforms/thunderx.c |   66 +
>   xen/arch/arm/setup.c  |1 +
>   3 files changed, 68 insertions(+)
>
> diff --git a/xen/arch/arm/platforms/Makefile b/xen/arch/arm/platforms/Makefile
> index e173fec..d9f98f9 100644
> --- a/xen/arch/arm/platforms/Makefile
> +++ b/xen/arch/arm/platforms/Makefile
> @@ -7,3 +7,4 @@ obj-$(CONFIG_ARM_32) += sunxi.o
>   obj-$(CONFIG_ARM_32) += rcar2.o
>   obj-$(CONFIG_ARM_64) += seattle.o
>   obj-$(CONFIG_ARM_64) += xgene-storm.o
> +obj-$(CONFIG_ARM_64) += thunderx.o
> diff --git a/xen/arch/arm/platforms/thunderx.c b/xen/arch/arm/platforms/thunderx.c
> new file mode 100644
> index 000..96560e1
> --- /dev/null
> +++ b/xen/arch/arm/platforms/thunderx.c
> @@ -0,0 +1,66 @@
> +/*
> + * xen/arch/arm/platforms/thunderx.c
> + *
> + * Cavium Thunder specific settings
> + *
> + * Vijaya Kumar K
> + * Copyright (c) 2015 Cavium Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include
> +
> +static int thunderx_specific_mapping(struct domain *d)
> +{
> +uint64_t addr, size;
> +int res;
> +
> +/* Mappings GSER region required for dom0 */

Mapping

> +addr = 0x87e09000;
> +size = 0xd00;
> +
> +res = map_mmio_regions(d,
> +   paddr_to_pfn(addr & PAGE_MASK),
> +   DIV_ROUND_UP(size, PAGE_SIZE),
> +   paddr_to_pfn(addr & PAGE_MASK));

OOI, why this region is not described in the DT? Is it a PCI device?

> +if ( res )
> +{
> + printk(XENLOG_ERR "Unable to map to dom%d region"

It would be more clear to specify the name of the region i.e "GSER region"

Regards,

--
Julien Grall



Re: [Xen-devel] Inplace upgrading 4.4.x -> 4.5.0

2015-02-18 Thread Ian Campbell
On Mon, 2015-02-09 at 20:36 +1100, Steven Haigh wrote:
> > This sounds like a packaging issue -- Debian's packages for example jump
> > through some hoops to make sure multiple tools packages can be installed
> > in parallel and the correct ones selected for the currently running
> > hypervisor.
> 
> Hmmm - that sounds very hacky :\

I've been slowly unpicking the Debian patches and upstreaming bits of
them, I'm not sure if I'll manage to get this stuff upstream though
since it is a bit more invasive than the other stuff.
> Hmmm Andrew is correct, the errors are all:
> 
> = xl info ==
> libxl: error: libxl.c:5044:libxl_get_physinfo: getting physinfo:
> Permission denied

EPERM is essentially "tools/hypervisor version mismatch" in most contexts.

[...]
> So, this leads me to wonder - as I'm sure MANY people get bitten by this
> - how to control (at least to shutdown) DomUs after an in-place upgrade?

You should evacuate the host before upgrading it, which is what I
suppose most people do as the first step in their maintenance window.
Evacuation might involve migrating VMs to another host (perhaps as part
of a pool rolling upgrade type manoeuvre) or just shutting them down.

> Even if no other functions are implemented other than shutdown, I would
> call that an acceptable functionality. At least this way, you're not
> hard killing running VMs on reboot.

I'd expect that it might be possible to arrange to connect to the VM
console and shut it down from within, or possibly to use the xenstore
CLI tools to initiate the shutdown externally.

After that then you would still end up with some zombie domains since
after they have shutdown actually reaping them would require toolstack
actions to talk to the hypervisor and you'd hit the version mismatch.

Ian.




Re: [Xen-devel] [PATCH for-4.6 0/5] xen: arm: Parse PCI DT nodes' ranges and interrupt-map

2015-02-18 Thread Julien Grall

Hi Suravee,

On 18/02/2015 05:28, Suravee Suthikulanit wrote:
> Actually, that seems to be more related to the PCI pass-through devices.
> Isn't the Cavium guys already done that work to support their PCI device
> pass-through?

They were working on it, but so far there is no patch series on the ML.
It would be nice to come with a common solution (i.e between GICv2m and
GICv3 ITS) for MSI.

> Anyways, at this point, I am able to generated Dom0 device tree with
> correct v2m node, and I can see Dom0 gicv2m driver probing and
> initializing correctly as it would on bare-metal.
>
> # Snippet from /sys/firmware/fdt showing dom0 GIC node
>  interrupt-controller {
>  compatible = "arm,gic-400", "arm,cortex-a15-gic";
>  #interrupt-cells = <0x3>;
>  interrupt-controller;
>  reg = <0x0 0xe111 0x0 0x1000 0x0 0xe112f000 0x0 0x2000>;
>  phandle = <0x1>;
>  #address-cells = <0x2>;
>  #size-cells = <0x2>;
>  ranges = <0x0 0x0 0x0 0xe110 0x0 0x10>;
>
>  v2m {
>  compatible = "arm,gic-v2m-frame";
>  msi-controller;
>  arm,msi-base-spi = <0x40>;
>  arm,msi-num-spis = <0x100>;
>  phandle = <0x5>;
>  reg = <0x0 0x8 0x0 0x1000>;
>  };
>  };
>
> linux:~ # dmesg | grep v2m
> [0.00] GICv2m: Overriding V2M MSI_TYPER (base:64, num:256)
> [0.00] GICv2m: Node v2m: range[0xe118:0xe1180fff], SPI[64:320]
>
> So, during setting up v2m in hypervisor, I also call
> route_irq_to_guest() for the all SPIs used for MSI (i.e. 64-320 on
> Seattle), which will force the MSIs to Dom0. However, we would need to
> figure out how to detach and re-route certain interrupts to a specific
> DomU in case of passing through PCI devices in the future.

Who decide to assign the MSI n to the SPI x? DOM0 or Xen?

Wouldn't it be possible to route the SPI dynamically when the domain
decide to use the MSI n? We would need to implement PHYSDEVOP_map_pirq
for MSI.

[..]

> And there you have it GICv2m MSI(-X) supports in Dom 0 for Seattle
> ;)  Thanks to Ian's PCI patch, which makes porting much simpler.
>
> Next, I'll clean up the code and send out Xen patch for review. I'll
> also push Linux changes (for adding ARM64 PCI Generic host controller
> supports and MSI IRQ domain from Marc) into my Linux tree on Github.
> Then you could give it a try on your Seattle box.

Congrats!

--
Julien Grall



Re: [Xen-devel] [PATCH v2] introduce and use relaxed cpumask bitops

2015-02-18 Thread Ian Campbell
On Wed, 2015-02-11 at 13:42 +, Jan Beulich wrote:
> Using atomic (LOCKed on x86) bitops for certain of the operations on
> cpumask_t is overkill when the variables aren't concurrently accessible
> (e.g. local function variables, or due to explicit locking). Introduce
> alternatives using non-atomic bitops and use them where appropriate.
> 
> Note that this
> - adds a volatile qualifier to cpumask_test_and_{clear,set}_cpu()
>   (should have been there from the beginning, like is the case for
>   cpumask_{clear,set}_cpu())
> - replaces several cpumask_clear()+cpumask_set_cpu(, n) pairs by the
>   simpler cpumask_copy(, cpumask_of(n)) (or just cpumask_of(n) if we
>   can do without copying)
> 
> Signed-off-by: Jan Beulich 
> Acked-by: George Dunlap 

AIUI there is no need for any arm changes (you reuse the existing
__clear_bit etc), so:
Acked-by: Ian Campbell 

I suppose at some point we might want to switch xen/arch/arm to use the
relaxed ops where appropriate, but no need for you to worry about that.

Ian
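
To make the atomic versus relaxed distinction from the patch description
concrete, here is a small sketch using C11 atomics on a single-word mask. It is
only an illustration under stated assumptions: the function names are
hypothetical, nr must be smaller than the number of bits in unsigned long, and
Xen's real cpumask helpers operate on multi-word bitmaps with their own bitop
primitives.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Atomic test-and-set, for masks that other CPUs may touch concurrently. */
static inline bool test_and_set_bit_atomic(unsigned int nr,
                                           _Atomic unsigned long *mask)
{
    unsigned long bit = 1UL << nr;

    return (atomic_fetch_or(mask, bit) & bit) != 0;
}

/* Relaxed variant, for local variables or masks protected by a lock. */
static inline bool test_and_set_bit_relaxed(unsigned int nr,
                                            unsigned long *mask)
{
    unsigned long bit = 1UL << nr;
    bool was_set = (*mask & bit) != 0;

    *mask |= bit;
    return was_set;
}
```

Both variants return the previous state of the bit. Only the first one pays for
an atomic read-modify-write, which is the cost the patch avoids where the mask
cannot be accessed concurrently.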





[Xen-devel] [PATCH v1] xen/arm: Do not allocate pte entries for MAP_SMALL_PAGES

2015-02-18 Thread vijay . kilari
From: Vijaya Kumar K 

On x86, for the pages mapped with PAGE_HYPERVISOR attribute
non-leaf page tables are allocated with valid pte entries.
and with MAP_SMALL_PAGES attribute only non-leaf page tables are
allocated with invalid (valid bit set to 0) pte entries.
However on arm this is not the case. On arm for the pages
mapped with PAGE_HYPERVISOR and MAP_SMALL_PAGES both
non-leaf and leaf level page table are allocated with valid bit
in pte entries.

This behaviour in arm makes common vmap code fail to
allocate memory beyond 128MB as described below.

In vmap_init, map_pages_to_xen() is called for mapping
vm_bitmap. Initially one page of vm_bitmap is allocated
and mapped using PAGE_HYPERVISOR attribute.
For the rest of vm_bitmap pages, MAP_SMALL_PAGES attribute
is used to map.

In ARM for both PAGE_HYPERVISOR and MAP_SMALL_PAGES, valid bit
is set to 1 in pte entry for these mapping.

In vma_alloc(), map_pages_to_xen() is failing for >128MB because
for this next vm_bitmap page the mapping is already set in vm_init()
with valid bit set in pte entry. So map_pages_to_xen() in
ARM returns error.

With this patch, the MAP_SMALL_PAGES attribute allocates
non-leaf page tables only.

Here we use bit[16] in the attribute flag to know if leaf page
tables should be allocated or not.

This bit is set only for MAP_SMALL_PAGES attribute.

Signed-off-by: Vijaya Kumar K
---
 xen/arch/arm/mm.c  |9 ++---
 xen/include/asm-arm/page.h |8 +++-
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 7d4ba0c..a12f3f5 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -865,9 +865,12 @@ static int create_xen_entries(enum xenmap_operation op,
addr, mfn);
 return -EINVAL;
 }
-pte = mfn_to_xen_entry(mfn, ai);
-pte.pt.table = 1;
-write_pte(&third[third_table_offset(addr)], pte);
+if ( !(ai & PTE_INVALID) )
+{
+pte = mfn_to_xen_entry(mfn, (ai & 0x));
+pte.pt.table = 1;
+write_pte(&third[third_table_offset(addr)], pte);
+}
 break;
 case REMOVE:
 if ( !third[third_table_offset(addr)].pt.valid )
diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
index 3e7b0ae..80415b3 100644
--- a/xen/include/asm-arm/page.h
+++ b/xen/include/asm-arm/page.h
@@ -61,10 +61,16 @@
 #define DEV_WCBUFFERABLE
 #define DEV_CACHEDWRITEBACK
 
+/* bit 16 in the Attribute index can be used to know if
+ * PTE entry should be added or not. This is useful
+ * when ONLY non-leaf page table entries need to allocated
+ */
+#define PTE_INVALID   (0x1 << 16)
+
 #define PAGE_HYPERVISOR (WRITEALLOC)
 #define PAGE_HYPERVISOR_NOCACHE (DEV_SHARED)
 #define PAGE_HYPERVISOR_WC  (DEV_WC)
-#define MAP_SMALL_PAGES PAGE_HYPERVISOR
+#define MAP_SMALL_PAGES (PAGE_HYPERVISOR | (PTE_INVALID))
 
 /*
  * Stage 2 Memory Type.
-- 
1.7.9.5




Re: [Xen-devel] [PATCH v1] xen/arm: Do not allocate pte entries for MAP_SMALL_PAGES

2015-02-18 Thread Julien Grall

Hello Vijay,

On 18/02/2015 12:56, vijay.kil...@gmail.com wrote:

From: Vijaya Kumar K 

On x86, for the pages mapped with PAGE_HYPERVISOR attribute
non-leaf page tables are allocated with valid pte entries.
and with MAP_SMALL_PAGES attribute only non-leaf page tables are
allocated with invalid (valid bit set to 0) pte entries.
However on arm this is not the case. On arm for the pages
mapped with PAGE_HYPERVISOR and MAP_SMALL_PAGES both
non-leaf and leaf level page table are allocated with valid bit
in pte entries.

This behaviour in arm makes common vmap code fail to
allocate memory beyond 128MB as described below.

In vmap_init, map_pages_to_xen() is called for mapping
vm_bitmap. Initially one page of vm_bitmap is allocated
and mapped using PAGE_HYPERVISOR attribute.
For the rest of vm_bitmap pages, MAP_SMALL_PAGES attribute
is used to map.

In ARM for both PAGE_HYPERVISOR and MAP_SMALL_PAGES, valid bit
is set to 1 in pte entry for these mapping.

In vma_alloc(), map_pages_to_xen() is failing for >128MB because
for this next vm_bitmap page the mapping is already set in vm_init()
with valid bit set in pte entry. So map_pages_to_xen() in
ARM returns error.

With this patch, MAP_SMALL_PAGES attribute will only allocate
non-leaf page tables only.

Here we use bit[16] in the attribute flag to know if leaf page
tables should be allocated or not.

This bit is set only for MAP_SMALL_PAGES attribute.

Signed-off-by: Vijaya Kumar K
---
  xen/arch/arm/mm.c  |9 ++---
  xen/include/asm-arm/page.h |8 +++-
  2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 7d4ba0c..a12f3f5 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -865,9 +865,12 @@ static int create_xen_entries(enum xenmap_operation op,
 addr, mfn);
  return -EINVAL;
  }
-pte = mfn_to_xen_entry(mfn, ai);
-pte.pt.table = 1;
-write_pte(&third[third_table_offset(addr)], pte);
+if ( !(ai & PTE_INVALID) )


You could do if ( ai & PTE_INVALID ) break; instead. It would avoid a new
level of indentation.


Also, I would rename ai to flags as the variable is no longer an
attribute index.



+{
+pte = mfn_to_xen_entry(mfn, (ai & 0x));


Please introduce a new macro for the mask.


+pte.pt.table = 1;
+write_pte(&third[third_table_offset(addr)], pte);
+}
  break;
  case REMOVE:
  if ( !third[third_table_offset(addr)].pt.valid )
diff --git a/xen/include/asm-arm/page.h b/xen/include/asm-arm/page.h
index 3e7b0ae..80415b3 100644
--- a/xen/include/asm-arm/page.h
+++ b/xen/include/asm-arm/page.h
@@ -61,10 +61,16 @@
  #define DEV_WCBUFFERABLE
  #define DEV_CACHEDWRITEBACK

+/* bit 16 in the Attribute index can be used to know if
+ * PTE entry should be added or not. This is useful
+ * when ONLY non-leaf page table entries need to allocated
+ */
+#define PTE_INVALID   (0x1 << 16)


It makes more sense to introduce a PTE_PRESENT flag than
PTE_INVALID. The former has more meaning than the latter.
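
The points above about the mask macro and the flags naming can be sketched
roughly as follows. This is a minimal illustration, not the posted patch:
ATTR_INDEX_MASK and both helper names are hypothetical, and the 0xffff mask
width is an assumption. Only PTE_INVALID and its position at bit 16 come from
the patch itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed: the low 16 bits of "flags" carry the attribute index. */
#define ATTR_INDEX_MASK  0xffffu
/* Bit 16, as defined in the posted patch. */
#define PTE_INVALID      (0x1u << 16)

/* Hypothetical helper: strip the software-only bits before building a pte. */
static inline unsigned int flags_to_attr_index(unsigned int flags)
{
    return flags & ATTR_INDEX_MASK;
}

/* Hypothetical helper: should a leaf entry be written for this mapping? */
static inline bool flags_want_leaf_entry(unsigned int flags)
{
    return !(flags & PTE_INVALID);
}
```

With helpers like these the INSERT case could test flags_want_leaf_entry()
and break early, which keeps the existing indentation level.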


Regards,

--
Julien Grall



Re: [Xen-devel] [PATCH] x86: Adjust rdtsc inline assembly

2015-02-18 Thread Jan Beulich
>>> On 18.02.15 at 13:25,  wrote:
> The single use of the old rdtsc() in emulate_privileged_op() is altered to use
> the new rdtsc() and the rdmsr_writeback path to set eax/edx appropriately.

I'm not entirely sure about this one - the current code surely is slightly
faster than the replacement. Question is how much this matters. I'd
suggest to be on the safe side and simply open-code the asm() there.
If you decide to follow that, there are a few more cosmetic things which
otherwise I would adjust while committing:

> @@ -1426,13 +1426,13 @@ static void __init tsc_check_writability(void)
>  if ( boot_cpu_has(X86_FEATURE_TSC_RELIABLE) )
>  return;
>  
> -rdtscll(tsc);
> +tsc = rdtsc();
>  if ( wrmsr_safe(MSR_IA32_TSC, 0) == 0 )
>  {
>  uint64_t tmp, tmp2;
> -rdtscll(tmp2);
> +tmp2 = rdtsc();

Blank line between declaration and statements. And perhaps the
assignment could become the variable's initializer.

> @@ -1973,8 +1973,7 @@ void tsc_set_info(struct domain *d,
>  else {
>  /* when using native TSC, offset is nsec relative to power-on
>   * of physical machine */
> -uint64_t tsc = 0;
> -rdtscll(tsc);
> +uint64_t tsc = rdtsc();
>  d->arch.vtsc_offset = scale_delta(tsc,&d->arch.vtsc_to_ns) -

Blank line missing again.

> --- a/xen/include/asm-x86/msr.h
> +++ b/xen/include/asm-x86/msr.h
> @@ -71,17 +71,14 @@ static inline int wrmsr_safe(unsigned int msr, uint64_t 
> val)
>  return _rc;
>  }
>  
> -#define rdtsc(low,high) \
> - __asm__ __volatile__("rdtsc" : "=a" (low), "=d" (high))
> +static inline uint64_t rdtsc(void)
> +{
> +uint32_t low, high;
>  
> -#define rdtscl(low) \
> - __asm__ __volatile__("rdtsc" : "=a" (low) : : "edx")
> +__asm__ __volatile__("rdtsc" : "=a" (low), "=d" (high));
>  
> -#define rdtscll(val) do { \
> - unsigned int _eax, _edx; \
> - asm volatile("rdtsc" : "=a" (_eax), "=d" (_edx)); \
> - (val) = ((unsigned long)_eax) | (((unsigned long)_edx)<<32); \
> -} while(0)
> +return (uint64_t)high << 32 | low;

Parentheses around the << please.
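
For reference, a minimal sketch of the requested form, written as a plain
helper so it can be compiled anywhere. The name tsc_from_halves() is
hypothetical and the rdtsc instruction itself is left out. Since << already
binds tighter than |, the parentheses do not change the result; they only
make the grouping explicit.

```c
#include <stdint.h>

/* Combine the two 32-bit halves (edx:eax style) into one 64-bit value. */
static inline uint64_t tsc_from_halves(uint32_t low, uint32_t high)
{
    /* Widen before shifting so the high half is not shifted out of 32 bits. */
    return ((uint64_t)high << 32) | low;
}
```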

Jan




Re: [Xen-devel] [Qemu-devel] [v2][PATCH] libxl: add one machine property to support IGD GFX passthrough

2015-02-18 Thread Ian Campbell
On Wed, 2015-02-11 at 10:45 +0800, Chen, Tiejun wrote:
> On 2015/2/9 19:05, Ian Campbell wrote:
> > On Mon, 2015-02-09 at 14:28 +0800, Chen, Tiejun wrote:
> >
> >> What about this?
> >
> > > I've not read the code in detail, since I'm travelling, but from a quick
> > glance it looks to be implementing the sort of thing I meant, thanks.
> 
> Thanks for your time.
> 
> >
> > A couple of higher level comments:
> >
> > I'd suggest to put the code for reading the vid/did into a helper
> > function so it can be reused.
> 
> Looks good.
> 
> >
> > > You might like to optionally consider adding a forcing option somehow so
> > > that people with new devices not in the list can control things without
> > > the need to recompile (e.g. gfx_passthru_kind_override?). Perhaps that
> > > isn't needed for a first cut though, and it would be a libxl API so
> > > thought is required.
> 
> What about 'gfx_passthru_force'? What we're doing is making sure that, if
> we have such an IGD that needs the workaround, we apply it by posting a
> parameter to qemu. So in the case of non-listed devices we just need to
> provide a bool to force this regardless of the real device.

If we are going to do this then I think we need to arrange for the
interface to be able to express the need to force the workarounds for a
particular device. IOW a boolean will not suffice since it doesn't
indicate that IGD workarounds are needed.

Probably it would be simplest to just leave this functionality out for
the time being and revisit if/when maintaining the list becomes an
annoyance or an end user trips over it.

Ian.




[Xen-devel] [ovmf test] 34650: regressions - trouble: broken/fail/pass

2015-02-18 Thread xen . org
flight 34650 ovmf real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/34650/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-ovmf-amd64 7 debian-hvm-install fail REGR. vs. 33686
 test-amd64-i386-xl-qemuu-ovmf-amd64  7 debian-hvm-install fail REGR. vs. 33686
 test-amd64-i386-pair 4 host-install/dst_host(4) broken REGR. vs. 33686

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel             9 guest-start           fail never pass
 test-amd64-amd64-xl-pvh-amd               9 guest-start           fail never pass
 test-amd64-amd64-libvirt                 10 migrate-support-check fail never pass
 test-amd64-i386-libvirt                  10 migrate-support-check fail never pass
 test-amd64-i386-xl-qemut-win7-amd64      14 guest-stop            fail never pass
 test-amd64-amd64-xl-pcipt-intel           9 guest-start           fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64     14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64     14 guest-stop            fail never pass
 test-amd64-i386-xl-winxpsp3              14 guest-stop            fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64      14 guest-stop            fail never pass
 test-amd64-i386-xl-win7-amd64            14 guest-stop            fail never pass
 test-amd64-amd64-xl-win7-amd64           14 guest-stop            fail never pass
 test-amd64-amd64-xl-winxpsp3             14 guest-stop            fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3        14 guest-stop            fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop            fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop            fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1       14 guest-stop            fail never pass
 test-amd64-i386-xl-qemut-winxpsp3        14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3       14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3        7 windows-install       fail never pass

version targeted for testing:
 ovmf 085cfdf2be3769f547a34ef9178e867be86835f0
baseline version:
 ovmf 447d264115c476142f884af0be287622cd244423


People who touched revisions under test:
  "Gao, Liming" 
  "Long, Qin" 
  "Yao, Jiewen" 
  Aaron Pop 
  Abner Chang 
  Alex Williamson 
  Anderw Fish 
  Andrew Fish 
  Anthony PERARD 
  Ard Biesheuvel 
  Ard Biesheuvel 
  Ari Zigler 
  Brendan Jackman 
  Bruce Cran 
  Cecil Sheng 
  Chao Zhang 
  Chao, Zhang 
  Chen Fan 
  Chris Phillips 
  Chris Ruffin 
  Cinnamon Shia 
  Daryl McDaniel  
  Daryl McDaniel 
  daryl.mcdaniel 
  daryl.mcdan...@intel.com
  darylm503 
  David Wei 
  David Woodhouse 
  Deric Cole 
  Dong Eric 
  Dong Guo 
  Dong, Guo 
  Elvin Li 
  Eric Dong 
  Erik Bjorge 
  Eugene Cohen 
  Feng Tian 
  Feng, Bob C 
  Fu Siyuan 
  Fu, Siyuan 
  Gabriel Somlo 
  Gao, Liming 
  Gao, Liming liming.gao 
  Gao, Liming liming@intel.com
  Garrett Kirkendall 
  Gary Lin 
  Grzegorz Milos 
  Hao Wu 
  Harry Liebel 
  Hess Chen 
  Hot Tian 
  isakov-sl 
  isakov...@bk.ru
  Jaben Carsey 
  jcarsey 
  jcarsey 
  Jeff Bobzin (jeff.bobzin 
  Jeff Bobzin (jeff.bob...@insyde.com)
  Jeff Fan 
  Jiewen Yao 
  Joe Peterson 
  Jordan Justen 
  jyao1 
  jyao1 
  Kinney, Michael D 
  Larry Cleeton 
  Laszlo Ersek 
  Leandro G. Biss Becker 
  Lee Leahy 
  Leif Lindholm 
  leroy.p.leahy 
  leroy.p.le...@intel.com
  lhauch 
  Li, Elvin 
  Liming Gao 
  Long Qin 
  Long, Qin  
  Long, Qin 
  lpleahy  leroy.p.leahy 
  lpleahy  leroy.p.le...@intel.com
  Mang Guo 
  Mark Salter 
  Matt Fleming 
  Mauro Faccenda 
  Michael Casadevall 
  Michael Kinney  
  Michael Kinney 
  Mike Maslenkin 
  Ni Ruiyu 
  Nikolai Saoukh 
  Olivier Martin 
  Olivier Martin olivier.martin 
  oliviermartin 
  Paolo Bonzini 
  Parmeshwr Prasad 
  Paulo Alcantara 
  Peter Jones 
  Qin Long 
  Qiu Shumin 
  Qiu, Shumin 
  qlong 
  Randy Pawell 
  Reece R. Pollack 
  Reza Jelveh 
  Ronald Cron 
  Roy Franz 
  Ruiyu Ni 
  Ruiyu Ni 
  Ryan Harkin 
  Samer El-Haj-Mahmoud 
  Samer El-Haj-Mahmoud 
  Samer El-Haj-Mahmoud elhaj 
  Samer El-Haj-Mahmoud el...@hp.com
  Samuel Thibault 
  Scott Duplichan 
  Seiji Aguchi 
  Sergey Isakov 
  Shifei Lu 
  Shumin Qiu 
  Star Zeng 
  Stefan Kaeser 
  Steven Kinney 
  Steven Smith 
  Tapan Shah 
  Tian, Feng 
  Tian, Hot 
  Tim He 
  Tycho Nightingale 
  Victor Gouveia 
  Wang, Yu 
  Wu Jiaxin 
  Wu Jiaxin 
  Yao Jiewen 
  Yao, Jiewen 
  Ye Ting  
  Ye Ting 
  Yi Li 
  Yingke D Liu 
  Yingke Liu 
  Zeng, Star 


jobs:
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt 

Re: [Xen-devel] [PATCH] domctl: do away with tool stack based retrying

2015-02-18 Thread Ian Campbell
On Fri, 2015-02-13 at 14:34 +, Wei Liu wrote:
> On Wed, Feb 11, 2015 at 01:47:09PM +, Jan Beulich wrote:
> > XEN_DOMCTL_destroydomain so far is being special cased in libxc to
> > reinvoke the operation when getting back EAGAIN. Quite a few other
> > domctl-s have gained continuations, so I see no reason not to use them
> > here too.
> > 
> > Signed-off-by: Jan Beulich 
> > 
> 
> Acked-by: Wei Liu 

Acked-by: Ian Campbell 





Re: [Xen-devel] [PATCH V6 04/13] xen: Rename mem_event to vm_event

2015-02-18 Thread Jan Beulich
>>> On 18.02.15 at 13:21,  wrote:
> On Wed Feb 18 2015 10:46:02 AM CET, Jan Beulich  wrote:
> 
>> > > > On 18.02.15 at 01:11,  wrote:
>> > diff --git a/xen/common/mem_event.c b/xen/common/vm_event.c
>> > similarity index 59%
>> > rename from xen/common/mem_event.c
>> > rename to xen/common/vm_event.c
>> 
>> Looking at this already quite huge delta I can't really see why
>> adjusting white space at once would make it much worse. In any
>> case better than leaving white space damage behind.
> 
> I did that in the first version of the patch and the feedback I got was that 
> it is unreviewable that way.

Not really - in the first version of the patch the old file gets removed
as a whole, and the new file added. Same in v2 afaics, where indeed
you got such a comment.

Jan




Re: [Xen-devel] [PATCH 0/2] arm/arm64: Detect Xen support earlier

2015-02-18 Thread Ian Campbell
On Wed, 2015-02-18 at 12:13 +, Julien Grall wrote:
> 
> On 18/02/2015 12:03, Ian Campbell wrote:
> > On Wed, 2015-02-18 at 11:51 +, Julien Grall wrote:
> >> Hi Ian,
> >>
> >> On 18/02/2015 11:30, Ian Campbell wrote:
> >>> On Thu, 2015-02-12 at 06:34 +, Julien Grall wrote:
>  Hello,
> 
>  This small patch series move the detection of running on Xen earlier. 
>  This is
>  required in order to support earlyprintk via Xen and selecting the 
>  preferred
>  console.
> >>>
> >>> Thanks for doing this, having all of the init done in an initcall (even
> >>> a relatively early one) has been a niggle I've wanted to address for ages,
> >>> for exactly earlyprintk and preferred console reasons.
> >>>
> >>> I had a very minor comment on #1 but nonetheless both patches:
> >>> Acked-by: Ian Campbell 
> >>
> >> Can I keep your ack on the first one with the __read_mostly dropped?
> >
> > Sure.
> >
> > BTW, when reposting you might want to CC the arch/arm* maintainers on
> > this intro mail as well as just the second patch.
> 
> Ok. I wasn't sure if these patches should go via the Xen tree or ARM one.

That's a question for all relevant maintainers to discuss -- and replies
to the 0/N mail is a good place to do that. I suggest you explicitly ask
the question in the intro next time around.

Ian.




Re: [Xen-devel] [PATCH 5/5] xen: arm: handle PCI DT node ranges and interrupt-map properties

2015-02-18 Thread Ian Campbell
On Tue, 2015-02-17 at 17:33 +, Julien Grall wrote:
> Hi Ian,
> 
> On 24/10/14 10:58, Ian Campbell wrote:
> > These properties are defined in ePAPR and the OpenFirmware PCI Bus Binding
> > Specification (IEEE Std 1275-1994).
> > 
> > This replaces the xgene specific mapping. Tested on Mustang and on a model 
> > with
> > a PCI virtio controller.
> 
> I'm wondering why you chose to map everything at Xen boot time rather
> than implementing PHYSDEVOP_pci_device_add to do the job?

Does pci_device_add contain sufficient information to do so?

The regions which are being mapped are essentially the PCI host
controllers MMIO, IO and CFG "windows" which are then consumed by the
various bars of the devices on the bus.

So mapping based on pci_device_add would require us to go from the SBDF
to a set of BARS which need mapping, which is a whole lot more complex
than just mapping all of the resources owned by the root complex through
to the h/w domain.

Or perhaps I've misunderstood what you were suggesting?

> This would allow us to re-use most of the interrupts/mmio decoding
> provided in the device tree library. It would also avoid missing support
> of cascade ranges/interrupt-map.

I *think* (if I'm remembering right) I decided we don't need to worry
about cascades of these things because the second level resources are
all fully contained within the first (top level) one and so with the
approach I've taken here are all fully mapped already. That's why I made
this patch stop descending into children when such a "bus node" is
found.

Ian.






Re: [Xen-devel] [PATCH for-4.6 0/5] xen: arm: Parse PCI DT nodes' ranges and interrupt-map

2015-02-18 Thread Ian Campbell
On Sun, 2015-02-15 at 21:49 -0600, Suravee Suthikulpanit wrote:
> On 10/24/2014 04:58 AM, Ian Campbell wrote:
> > 
> > This series adds parsing of the DT ranges and interrupt-map properties
> > for PCI devices, these contain the MMIOs and IRQs used by children on
> > the bus. This replaces the specific mapping on xgene which should also
> > mean that it works on xgene devices other than mustang which use a
> > different PCI root controller (the xgene has several, we only map the
> > first).
> >
> > I've tested on Mustang and on a FastModel with virtio-pci based rootfs.
> >
> > This is *not* for 4.5.
> >
> > Ian.
> >
> >
> 
> Ian,
> 
> I have tested this on Seattle w/ the ToT Xen-4.6 unstable branch (commit 
> 001324547356af86875fad5003f679571a6b8f1c), and I have PCI working in 
> Dom0. Are you planning on committing this series at some point?

At some point, yes.

I originally had some doubts that this was the correct approach, but I
a) can't remember what they were and b) think I've convinced myself this
is the right way to go after all...

Since this is now in your way I'll bump revisiting it up my stack a bit.

Ian.




[Xen-devel] [PATCH v2 0/3] arm/arm64: Detect Xen support earlier

2015-02-18 Thread Julien Grall
Hello,

This small patch series moves the detection of running on Xen earlier. This is
required in order to support earlyprintk via Xen and selecting preferred 
console.

The last patch of this series adds HVC0 as the preferred console
when running on Xen. The patch [1] was previously sent separately by Ard.

For the maintainers: should this patch series go upstream via the ARM
tree or the Xen tree?

Regards,

[1] https://patches.linaro.org/44633/ 

Tested-by: Ard Biesheuvel 

Ard Biesheuvel (1):
  xen/arm: allow console=hvc0 to be omitted for guests

Julien Grall (1):
  arm/xen: Correctly check if the event channel interrupt is present

Stefano Stabellini (1):
  arm,arm64/xen: move Xen initialization earlier

 arch/arm/include/asm/xen/hypervisor.h |  8 +
 arch/arm/kernel/setup.c   |  2 ++
 arch/arm/xen/enlighten.c  | 62 ++-
 arch/arm64/kernel/setup.c |  2 ++
 4 files changed, 51 insertions(+), 23 deletions(-)

-- 
2.1.4




[Xen-devel] [PATCH v2 1/3] arm/xen: Correctly check if the event channel interrupt is present

2015-02-18 Thread Julien Grall
The function irq_of_parse_and_map returns 0 when the IRQ is not found.

Furthermore, move the check before notifying the user that we are running on
Xen.

Signed-off-by: Julien Grall 
Acked-by: Ian Campbell 

---
Changes in v2:
- Add Ian's ack
- Re-add __read_mostly
---
 arch/arm/xen/enlighten.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 263a204..c8d3a17 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -51,7 +51,7 @@ EXPORT_SYMBOL_GPL(xen_have_vector_callback);
 int xen_platform_pci_unplug = XEN_UNPLUG_ALL;
 EXPORT_SYMBOL_GPL(xen_platform_pci_unplug);
 
-static __read_mostly int xen_events_irq = -1;
+static __read_mostly unsigned int xen_events_irq;
 
 /* map fgmfn of domid to lpfn in the current domain */
 static int map_foreign_page(unsigned long lpfn, unsigned long fgmfn,
@@ -251,12 +251,14 @@ static int __init xen_guest_init(void)
return 0;
grant_frames = res.start;
xen_events_irq = irq_of_parse_and_map(node, 0);
+   if (!xen_events_irq) {
+   pr_debug("Xen event channel interrupt not found\n");
+   return -ENODEV;
+   }
+
pr_info("Xen %s support found, events_irq=%d gnttab_frame=%pa\n",
version, xen_events_irq, &grant_frames);
 
-   if (xen_events_irq < 0)
-   return -ENODEV;
-
xen_domain_type = XEN_HVM_DOMAIN;
 
xen_setup_features();
-- 
2.1.4
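
The reasoning behind the changed check can be sketched outside the kernel as
below. This is only an illustration under stated assumptions:
fake_parse_and_map() is a hypothetical stand-in for irq_of_parse_and_map(),
modelling just the "0 means not found" convention described above, and the
value 42 is arbitrary.

```c
#include <errno.h>

/* Hypothetical stand-in: returns 0 when no IRQ is found, like the real API. */
static unsigned int fake_parse_and_map(int present)
{
    return present ? 42u : 0u;
}

/* Old check: a missing IRQ (0) is never negative, so it slips through. */
static int check_old(int present)
{
    int irq = (int)fake_parse_and_map(present);

    if (irq < 0)
        return -ENODEV;
    return 0;
}

/* New check: treat 0 as "not found". */
static int check_new(int present)
{
    unsigned int irq = fake_parse_and_map(present);

    if (!irq)
        return -ENODEV;
    return 0;
}
```

With a missing interrupt, check_old() reports success while check_new()
returns -ENODEV.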




[Xen-devel] [PATCH v2 3/3] xen/arm: allow console=hvc0 to be omitted for guests

2015-02-18 Thread Julien Grall
From: Ard Biesheuvel 

This patch registers hvc0 as the preferred console if no console
has been specified explicitly on the kernel command line.

The purpose is to allow platform agnostic kernels and boot images
(such as distro installers) to boot in a Xen/ARM domU without the
need to modify the command line by hand.

Signed-off-by: Ard Biesheuvel 
Reviewed-by: Julien Grall 
---
 arch/arm/xen/enlighten.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 1660432..904bd2d 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -255,6 +256,9 @@ void __init xen_early_init(void)
xen_start_info->flags |= SIF_INITDOMAIN|SIF_PRIVILEGED;
else
xen_start_info->flags &= ~(SIF_INITDOMAIN|SIF_PRIVILEGED);
+
+   if (!console_set_on_cmdline && !xen_initial_domain())
+   add_preferred_console("hvc", 0, NULL);
 }
 
 static int __init xen_guest_init(void)
-- 
2.1.4




[Xen-devel] [PATCH v2 2/3] arm, arm64/xen: move Xen initialization earlier

2015-02-18 Thread Julien Grall
From: Stefano Stabellini 

Currently, Xen is initialized/discovered in an initcall. This doesn't
allow us to support earlyprintk or choosing the preferred console when
running on Xen.

The current function xen_guest_init is now split in 2 parts:
- xen_early_init: Check if there is a Xen node in the device tree
and setup domain type
- xen_guest_init: Retrieve the information from the device node and
initialize Xen (grant table, shared page...)

The former is called in setup_arch, while the latter is an initcall.

Signed-off-by: Stefano Stabellini 
Signed-off-by: Julien Grall 
Acked-by: Ian Campbell 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 

---
It's based on a patch sent by Stefano nearly 2 years ago [1].

[1] http://lists.xen.org/archives/html/xen-devel/2013-08/msg02960.html

Changes in v2:
- Add Ian's ack
---
 arch/arm/include/asm/xen/hypervisor.h |  8 +
 arch/arm/kernel/setup.c   |  2 ++
 arch/arm/xen/enlighten.c  | 58 ---
 arch/arm64/kernel/setup.c |  2 ++
 4 files changed, 46 insertions(+), 24 deletions(-)

diff --git a/arch/arm/include/asm/xen/hypervisor.h 
b/arch/arm/include/asm/xen/hypervisor.h
index 1317ee4..04ff8e7 100644
--- a/arch/arm/include/asm/xen/hypervisor.h
+++ b/arch/arm/include/asm/xen/hypervisor.h
@@ -1,6 +1,8 @@
 #ifndef _ASM_ARM_XEN_HYPERVISOR_H
 #define _ASM_ARM_XEN_HYPERVISOR_H
 
+#include 
+
 extern struct shared_info *HYPERVISOR_shared_info;
 extern struct start_info *xen_start_info;
 
@@ -18,4 +20,10 @@ static inline enum paravirt_lazy_mode 
paravirt_get_lazy_mode(void)
 
 extern struct dma_map_ops *xen_dma_ops;
 
+#ifdef CONFIG_XEN
+void __init xen_early_init(void);
+#else
+static inline void xen_early_init(void) { return; }
+#endif
+
 #endif /* _ASM_ARM_XEN_HYPERVISOR_H */
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index e55408e..8b59d0d 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -46,6 +46,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -936,6 +937,7 @@ void __init setup_arch(char **cmdline_p)
 
arm_dt_init_cpu_maps();
psci_init();
+   xen_early_init();
 #ifdef CONFIG_SMP
if (is_smp()) {
if (!mdesc->smp_init || !mdesc->smp_init()) {
diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index c8d3a17..1660432 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -53,6 +53,8 @@ EXPORT_SYMBOL_GPL(xen_platform_pci_unplug);
 
 static __read_mostly unsigned int xen_events_irq;
 
+static __initdata struct device_node *xen_node;
+
 /* map fgmfn of domid to lpfn in the current domain */
 static int map_foreign_page(unsigned long lpfn, unsigned long fgmfn,
unsigned int domid)
@@ -222,42 +224,28 @@ static irqreturn_t xen_arm_callback(int irq, void *arg)
  * documentation of the Xen Device Tree format.
  */
 #define GRANT_TABLE_PHYSADDR 0
-static int __init xen_guest_init(void)
+void __init xen_early_init(void)
 {
-   struct xen_add_to_physmap xatp;
-   static struct shared_info *shared_info_page = 0;
-   struct device_node *node;
int len;
const char *s = NULL;
const char *version = NULL;
const char *xen_prefix = "xen,xen-";
-   struct resource res;
-   phys_addr_t grant_frames;
 
-   node = of_find_compatible_node(NULL, NULL, "xen,xen");
-   if (!node) {
+   xen_node = of_find_compatible_node(NULL, NULL, "xen,xen");
+   if (!xen_node) {
pr_debug("No Xen support\n");
-   return 0;
+   return;
}
-   s = of_get_property(node, "compatible", &len);
+   s = of_get_property(xen_node, "compatible", &len);
if (strlen(xen_prefix) + 3  < len &&
!strncmp(xen_prefix, s, strlen(xen_prefix)))
version = s + strlen(xen_prefix);
if (version == NULL) {
pr_debug("Xen version not found\n");
-   return 0;
-   }
-   if (of_address_to_resource(node, GRANT_TABLE_PHYSADDR, &res))
-   return 0;
-   grant_frames = res.start;
-   xen_events_irq = irq_of_parse_and_map(node, 0);
-   if (!xen_events_irq) {
-   pr_debug("Xen event channel interrupt not found\n");
-   return -ENODEV;
+   return;
}
 
-   pr_info("Xen %s support found, events_irq=%d gnttab_frame=%pa\n",
-   version, xen_events_irq, &grant_frames);
+   pr_info("Xen %s support found\n", version);
 
xen_domain_type = XEN_HVM_DOMAIN;
 
@@ -267,10 +255,32 @@ static int __init xen_guest_init(void)
xen_start_info->flags |= SIF_INITDOMAIN|SIF_PRIVILEGED;
else
xen_start_info->flags &= ~(SIF_INITDOMAIN|SIF_PRIVILEGED);
+}
+
+static int __init xen_guest_init(void)
+{
+   struct xen_add_to_physmap xatp;
+   struct 

Re: [Xen-devel] [PATCH] x86: Adjust rdtsc inline assembly

2015-02-18 Thread Andrew Cooper
On 18/02/15 13:11, Jan Beulich wrote:
>>>> On 18.02.15 at 13:25,  wrote:
>> The single use of the old rdtsc() in emulate_privileged_op() is altered to 
>> use
>> the new rdtsc() and the rdmsr_writeback path to set eax/edx appropriately.
> I'm not entirely sure about this one - the current code surely is slightly
> faster than the replacement. Question is how much this matters. I'd
> suggest to be on the safe side and simply open-code the asm() there.

It is a matter of 4 instructions on a fault slowpath, which is only
ever executed in the unlikely case that vtsc is enabled on a domain, or
the kernel has set CR4.TSD.

I would not lose sleep over the introduced inefficiency.

~Andrew




[Xen-devel] [PATCH v2] x86: Adjust rdtsc inline assembly

2015-02-18 Thread Andrew Cooper
Currently there are three related rdtsc macros, all of which are lowercase and
not obviously macros, which write by value to their parameters.

This is non-intuitive to program with, being contrary to C semantics for code
appearing to be a regular function call.  It also causes Coverity to
conclude that __udelay() has an infinite loop, as all of its loop conditions
are constant.

Two of these macros (rdtsc() and rdtscl()) have only a handful of uses while
the vast majority of code uses the rdtscll() variant.  rdtsc() and rdtscll()
are equivalent, while rdtscl() discards the high word.

Replace all 3 macros with a static inline which returns the complete tsc.

Most of this patch is a mechanical change of

  - rdtscll($FOO);
  + $FOO = rdtsc();

And a diff of the generated assembly confirms that this is no change at all.

The single use of the old rdtsc() in emulate_privileged_op() is altered to use
the new rdtsc() and the rdmsr_writeback path to set eax/edx appropriately.

The pair of uses of rdtscl() in __udelay() are extended to use full 64bit
values, which makes the overflow edge condition (and early exit from the loop)
far rarer.
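
The edge condition can be illustrated with a small standalone sketch. It
assumes the old rdtscl() left only the low 32 bits of the counter in a 64-bit
variable; the helper names are hypothetical and the counter values are passed
in rather than read from hardware.

```c
#include <stdint.h>

/* Old behaviour: only the low 32 bits are kept, but held in 64-bit variables. */
static uint64_t elapsed_low32(uint64_t tsc_start, uint64_t tsc_end)
{
    uint64_t s = (uint32_t)tsc_start;
    uint64_t e = (uint32_t)tsc_end;

    /* If the low word wrapped between the reads, e < s and this is huge. */
    return e - s;
}

/* New behaviour: subtract the full 64-bit counter values. */
static uint64_t elapsed_full(uint64_t tsc_start, uint64_t tsc_end)
{
    return tsc_end - tsc_start;
}
```

When the low word wraps between the two reads, elapsed_low32() yields a value
far above any realistic tick count, so a loop of the form
while ((e-s) < ticks) would exit early; elapsed_full() only misbehaves if the
whole 64-bit counter wraps.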

Signed-off-by: Andrew Cooper 
CC: Keir Fraser 
CC: Jan Beulich 

---
v2: Style adjustments, suggested by Jan
---
 xen/arch/x86/apic.c   |4 ++--
 xen/arch/x86/cpu/mcheck/mce.c |2 +-
 xen/arch/x86/delay.c  |4 ++--
 xen/arch/x86/hvm/hvm.c|4 ++--
 xen/arch/x86/hvm/save.c   |4 ++--
 xen/arch/x86/hvm/svm/svm.c|2 +-
 xen/arch/x86/platform_hypercall.c |4 ++--
 xen/arch/x86/smpboot.c|2 +-
 xen/arch/x86/time.c   |   34 --
 xen/arch/x86/traps.c  |5 -
 xen/include/asm-x86/msr.h |   15 ++-
 xen/include/asm-x86/time.h|4 +---
 12 files changed, 40 insertions(+), 44 deletions(-)

diff --git a/xen/arch/x86/apic.c b/xen/arch/x86/apic.c
index 39cd9e5..3217bdf 100644
--- a/xen/arch/x86/apic.c
+++ b/xen/arch/x86/apic.c
@@ -1148,7 +1148,7 @@ static int __init calibrate_APIC_clock(void)
  * We wrapped around just now. Let's start:
  */
 if (cpu_has_tsc)
-rdtscll(t1);
+t1 = rdtsc();
 tt1 = apic_read(APIC_TMCCT);
 
 /*
@@ -1159,7 +1159,7 @@ static int __init calibrate_APIC_clock(void)
 
 tt2 = apic_read(APIC_TMCCT);
 if (cpu_has_tsc)
-rdtscll(t2);
+t2 = rdtsc();
 
 /*
  * The APIC bus clock counter is 32 bits only, it
diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
index 05a86fb..3a3b4dc 100644
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -235,7 +235,7 @@ static void mca_init_bank(enum mca_source who,
 
 if (who == MCA_CMCI_HANDLER) {
 mib->mc_ctrl2 = mca_rdmsr(MSR_IA32_MC0_CTL2 + bank);
-rdtscll(mib->mc_tsc);
+mib->mc_tsc = rdtsc();
 }
 }
 
diff --git a/xen/arch/x86/delay.c b/xen/arch/x86/delay.c
index bc1772e..ef6bc5d 100644
--- a/xen/arch/x86/delay.c
+++ b/xen/arch/x86/delay.c
@@ -21,10 +21,10 @@ void __udelay(unsigned long usecs)
 unsigned long ticks = usecs * (cpu_khz / 1000);
 unsigned long s, e;
 
-rdtscl(s);
+s = rdtsc();
 do
 {
 rep_nop();
-rdtscl(e);
+e = rdtsc();
 } while ((e-s) < ticks);
 }
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index a52c6e0..72e383f 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -292,7 +292,7 @@ void hvm_set_guest_tsc_fixed(struct vcpu *v, u64 guest_tsc, u64 at_tsc)
 }
 else
 {
-rdtscll(tsc);
+tsc = rdtsc();
 }
 
 delta_tsc = guest_tsc - tsc;
@@ -326,7 +326,7 @@ u64 hvm_get_guest_tsc_fixed(struct vcpu *v, uint64_t at_tsc)
 }
 else
 {
-rdtscll(tsc);
+tsc = rdtsc();
 }
 
 return tsc + v->arch.hvm_vcpu.cache_tsc_offset;
diff --git a/xen/arch/x86/hvm/save.c b/xen/arch/x86/hvm/save.c
index 6af19be..61f780d 100644
--- a/xen/arch/x86/hvm/save.c
+++ b/xen/arch/x86/hvm/save.c
@@ -36,7 +36,7 @@ void arch_hvm_save(struct domain *d, struct hvm_save_header *hdr)
 hdr->gtsc_khz = d->arch.tsc_khz;
 
 /* Time when saving started */
-rdtscll(d->arch.hvm_domain.sync_tsc);
+d->arch.hvm_domain.sync_tsc = rdtsc();
 }
 
 int arch_hvm_load(struct domain *d, struct hvm_save_header *hdr)
@@ -71,7 +71,7 @@ int arch_hvm_load(struct domain *d, struct hvm_save_header *hdr)
 hvm_set_rdtsc_exiting(d, 1);
 
 /* Time when restore started  */
-rdtscll(d->arch.hvm_domain.sync_tsc);
+d->arch.hvm_domain.sync_tsc = rdtsc();
 
 /* VGA state is not saved/restored, so we nobble the cache. */
 d->arch.hvm_domain.stdvga.cache = 0;
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 018dd70..c83c483 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -805,7 +805,7 @@ static void svm_set_tsc_offset(struct vcpu
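
The commit message above mentions an overflow edge condition that causes an early exit from a delay loop. A minimal sketch of that effect follows, assuming the old rdtscl() kept only the low 32 bits of the TSC while __udelay() compared the samples in 64-bit unsigned longs. The function names are hypothetical, and plain integers stand in for real TSC reads so the example runs anywhere.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Old-style check (assumed behaviour): each TSC sample is truncated to its
 * low 32 bits, but the subtraction is carried out in 64 bits.
 */
static bool delay_done_low32(uint64_t tsc_start, uint64_t tsc_now,
                             uint64_t ticks)
{
    uint64_t s = (uint32_t)tsc_start;
    uint64_t e = (uint32_t)tsc_now;

    /* If the low 32 bits wrapped between the samples, e < s and the
     * difference becomes a huge 64-bit value. */
    return (e - s) >= ticks;
}

/* New-style check: full 64-bit samples, so a wrap is practically unreachable. */
static bool delay_done_full64(uint64_t tsc_start, uint64_t tsc_now,
                              uint64_t ticks)
{
    return (tsc_now - tsc_start) >= ticks;
}
```

With a start sample of 0xFFFFFFF0 and a sample taken 32 cycles later, the truncated variant reports a 1000-tick delay as finished, while the 64-bit variant keeps waiting.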

Re: [Xen-devel] freemem-slack and large memory environments

2015-02-18 Thread Ian Campbell
On Tue, 2015-02-10 at 14:34 -0700, Mike Latimer wrote:
> On Monday, February 09, 2015 06:27:54 PM Mike Latimer wrote:
> > While testing commit 2563bca1, I found that libxl_get_free_memory returns 0
> > until there is more free memory than required for freemem-slack. This means
> > that during the domain creation process, freed memory is first set aside for
> > freemem-slack, then marked as truly free for consumption.
> > 
> > On machines with large amounts of memory, freemem-slack can be very high
> > (26GB on a 2TB test machine). If freeing this memory takes more time than
> > allowed during domain startup, domain creation fails with ERROR_NOMEM.
> > (Commit 2563bca1 doesn't help here, as free_memkb remains 0 until
> > freemem-slack is satisfied.)
> > 
> > There is already a 15% limit on the size of freemem-slack (commit a39b5bc6),
> > but this does not take into consideration very large memory environments,
> > where this limit is not hit (26GB is only 1.2% of 2TB).

Stefano,

What is "freemem-slack" for? It seems to have been added in 7010e9b7 but
the commit log makes no mention of it whatsoever. Was it originally just
supposed to be the delta between the host memory and dom0 memory at
start of day?

This seems to then change in a39b5bc64, to add an arbitrary cap which
seems to be working around an invalid configuration (dom0_mem +
autoballooning on).

Now that we autodetect the use of dom0_mem and set autoballooning
correctly perhaps we should just revert a39b5bc64?

Ian.

> > 
> > It seems that there are two approaches to resolve this:
> > 
> >  - Introduce a hard limit on freemem-slack to avoid unnecessarily large
> > reservations
> >  - Increase the retry count during domain creation to ensure enough time is
> > set aside for any cycles spent freeing memory for freemem-slack (on the test
> > machine, doubling the retry count to 6 is the minimum required)
> > 
> > Which is the best approach (or did I miss something)?
> 
> Sorry - forgot to CC relevant maintainers.
> 
> -Mike
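
The interaction between the 15% limit and a possible hard limit can be illustrated with a small sketch. This is not the libxl implementation: slack_kb() and its parameters are hypothetical, and the sketch assumes the slack starts out as the difference between host memory and dom0 memory, as guessed earlier in this thread.

```c
#include <stdint.h>

/*
 * Hypothetical slack calculation, all values in KiB.
 * hard_cap_kb == 0 means "no hard limit".
 */
static uint64_t slack_kb(uint64_t host_kb, uint64_t dom0_kb,
                         uint64_t hard_cap_kb)
{
    /* Assumed starting point: memory not given to dom0 at start of day. */
    uint64_t slack = host_kb > dom0_kb ? host_kb - dom0_kb : 0;

    /* Existing percentage limit: at most 15% of host memory. */
    uint64_t pct_cap = host_kb * 15 / 100;
    if (slack > pct_cap)
        slack = pct_cap;

    /* Proposed hard limit, to avoid very large reservations. */
    if (hard_cap_kb != 0 && slack > hard_cap_kb)
        slack = hard_cap_kb;

    return slack;
}
```

On a 2TB host the 15% limit is roughly 307GB, so a 26GB slack passes through unchanged unless a hard limit is supplied.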



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 5/5] xen: arm: handle PCI DT node ranges and interrupt-map properties

2015-02-18 Thread Julien Grall

Hi Ian,

On 18/02/2015 13:50, Ian Campbell wrote:
> On Tue, 2015-02-17 at 17:33 +, Julien Grall wrote:
>> Hi Ian,
>>
>> On 24/10/14 10:58, Ian Campbell wrote:
>>> These properties are defined in ePAPR and the OpenFirmware PCI Bus Binding
>>> Specification (IEEE Std 1275-1994).
>>>
>>> This replaces the xgene specific mapping. Tested on Mustang and on a model with
>>> a PCI virtio controller.
>>
>> I'm wondering why you choose to map everything at Xen boot time rather
>> than implementing PHYSDEVOP_pci_device_add to do the job?
>
> Does pci_device_add contain sufficient information to do so?

Hmmm... for the interrupt the SBDF is enough. Although for the MMIO it
looks like there is no difference between PCI bars.

> The regions which are being mapped are essentially the PCI host
> controllers MMIO, IO and CFG "windows" which are then consumed by the
> various bars of the devices on the bus.
>
> So mapping based on pci_device_add would require us to go from the SBDF
> to a set of BARS which need mapping, which is a whole lot more complex
> than just mapping all of the resources owned by the root complex through
> to the h/w domain.

I had a look at the code which parses the host bridge resources (see
of_pci_get_host_bridge_resources). They seem to re-use the
of_translate_* functions. Would it not be possible to do the same?

> Or perhaps I've misunderstood what you were suggesting?

I was suggesting to do it via pci_add_device but it looks like it's only
possible for IRQ not MMIO.

>> This would allow us to re-use most of the interrupts/mmio decoding
>> provided in the device tree library. It would also avoid missing support
>> of cascade ranges/interrupt-map.
>
> I *think* (if I'm remembering right) I decided we don't need to worry
> about cascades of these things because the second level resources are
> all fully contained within the first (top level) one and so with the
> approach I've taken here are all fully mapped already. That's why I made
> this patch stop descending into children when such a "bus node" is
> found.

I don't understand this paragraph, sorry.

The address range you decoded via the PCI bus may be an intermediate
address which needs to be translated into the physical hardware address.


Regards,

--
Julien Grall



Re: [Xen-devel] [PATCH 5/5] xen: arm: handle PCI DT node ranges and interrupt-map properties

2015-02-18 Thread Ian Campbell
On Wed, 2015-02-18 at 14:19 +, Julien Grall wrote:
> Hi Ian,
> 
> On 18/02/2015 13:50, Ian Campbell wrote:
> > On Tue, 2015-02-17 at 17:33 +, Julien Grall wrote:
> >> Hi Ian,
> >>
> >> On 24/10/14 10:58, Ian Campbell wrote:
> >>> These properties are defined in ePAPR and the OpenFirmware PCI Bus Binding
> >>> Specification (IEEE Std 1275-1994).
> >>>
> >>> This replaces the xgene specific mapping. Tested on Mustang and on a 
> >>> model with
> >>> a PCI virtio controller.
> >>
> >> I'm wondering why you choose to map everything at Xen boot time rather
> >> than implementing PHYSDEVOP_pci_device_add to do the job?
> >
> > Does pci_device_add contain sufficient information to do so?
> 
> Hmmm... for the interrupt the SBDF is enough. Although for the MMIO it 
> looks like there is no difference between PCI bars.
> 
> > The regions which are being mapped are essentially the PCI host
> > controllers MMIO, IO and CFG "windows" which are then consumed by the
> > various bars of the devices on the bus.
> >
> > So mapping based on pci_device_add would require us to go from the SBDF
> > to a set of BARS which need mapping, which is a whole lot more complex
> > than just mapping all of the resources owned by the root complex through
> > to the h/w domain.
> 
> I had a look at the code which parses the host bridge resources (see
> of_pci_get_host_bridge_resources). They seem to re-use the
> of_translate_* functions. Would it not be possible to do the same?
> 
> > Or perhaps I've misunderstood what you were suggesting?
> 
> I was suggesting to do it via pci_add_device but it looks like it's only 
> possible for IRQ not MMIO.

I think so, and we probably should consider the two cases separately
since the right answer could reasonably differ for different resource
types.

I am reasonably convinced that for MMIO (+IO+CFG space) we should map
everything as described by the ranges property of the top most node, it
can be considered an analogue to / extension of the reg property of that
node.

For IRQ I'm not so sure, it's possible that routing the IRQ at
pci_add_device time might be better, or fit in better with e.g. the ACPI
architecture, but mapping everything described in interrupt-map at start
of day is also an option and a reasonably simple one, probably.

(My memory is fuzzy, but I think the concerns I had with this patch were
precisely to do with IRQs and how to parse the interrupt-map without a
specific SBDF in hand -- but only because the existing helper functions
assume an SBDF is present)

> >> This would allow us to re-use most of the interrupts/mmio decoding
> >> provided in the device tree library. It would also avoid missing support
> >> of cascade ranges/interrupt-map.
> >
> > I *think* (if I'm remembering right) I decided we don't need to worry
> > about cascades of these things because the second level resources are
> > all fully contained within the first (top level) one and so with the
> > approach I've taken here are all fully mapped already. That's why I made
> > this patch stop descending into children when such a "bus node" is
> > found.
> 
> I don't understand this paragraph, sorry.
> 
> The address range you decoded via the PCI bus may be an intermediate
> address which needs to be translated into the physical hardware address.

This isn't to do with IPA->PA translations but to do with translations
between different PA addressing regimes, i.e. the different addressing
schemes of different buses.

Let's say we have a system with a PCI-ROOT device exposing a PCI bus,
which in turn contains a PCI-BRIDGE which for the sake of argument let's
say is a PCI-FOOBUS bridge.

Let's just consider the MMIO hole for now, but IRQ is basically the same.

The ranges property on a node describes a mapping from a "parent"
address space into a "child" address space.

For PCI-ROOT "parent" is the host physical address space and "child" is
the PCI MMIO/IO/CFG address spaces.

For PCI-BRIDGE "parent" is the PCI-ROOT's child address space (i.e. PCI
MMIO/IO/CFG) and "child" is the FOOBUS address space.

The inputs ("parents") of the PCI-BRIDGE ranges property must therefore
by definition be valid outputs of the PCI-ROOT ranges property (i.e. be
"child" addresses).

Therefore if we map all of the input/parent ranges described by
PCI-ROOT's ranges property we do not need to recurse further and
consider PCI-BRIDGE's ranges property -- we've effectively already dealt
with it.

Does that make more sense?

Ian.
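
As a rough illustration of the argument above, a "ranges" property can be thought of as a list of windows, each pairing a base address on the child bus with a base address on the parent bus. The sketch below translates a single child-bus address through one such list. struct dt_range and range_translate() are hypothetical names, and real entries are cell arrays sized by #address-cells and #size-cells rather than plain 64-bit integers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One decoded window of a "ranges" property (simplified, hypothetical). */
struct dt_range {
    uint64_t child_base;   /* start of the window on the child bus */
    uint64_t parent_base;  /* start of the window on the parent bus */
    uint64_t size;         /* length of the window in bytes */
};

/*
 * Translate an address on the child bus into the parent's address space.
 * Returns false if no window covers the address.
 */
static bool range_translate(const struct dt_range *r, size_t n,
                            uint64_t child_addr, uint64_t *parent_addr)
{
    for (size_t i = 0; i < n; i++) {
        if (child_addr >= r[i].child_base &&
            child_addr - r[i].child_base < r[i].size) {
            *parent_addr = r[i].parent_base +
                           (child_addr - r[i].child_base);
            return true;
        }
    }
    return false;
}
```

Anything a bridge below PCI-ROOT can reach has to fall inside one of these windows, which is why mapping every PCI-ROOT window to the hardware domain already covers the bridge's resources.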




Re: [Xen-devel] [PATCH 5/5] xen: arm: handle PCI DT node ranges and interrupt-map properties

2015-02-18 Thread Julien Grall



On 18/02/2015 14:37, Ian Campbell wrote:
> On Wed, 2015-02-18 at 14:19 +, Julien Grall wrote:
> I think so, and we probably should consider the two cases separately
> since the right answer could reasonably differ for different resource
> types.
>
> I am reasonably convinced that for MMIO (+IO+CFG space) we should map
> everything as described by the ranges property of the top most node, it
> can be considered an analogue to / extension of the reg property of that
> node.

Agreed.

> For IRQ I'm not so sure, it's possible that routing the IRQ at
> pci_add_device time might be better, or fit in better with e.g. the ACPI
> architecture, but mapping everything described in interrupt-map at start
> of day is also an option and a reasonably simple one, probably.

I agree that it's simple. Are we sure that we would be able to get a
"better" solution later without modifying the kernel?

If not, we may need to keep this solution forever.

> This isn't to do with IPA->PA translations but to do with translations
> between different PA addressing regimes, i.e. the different addressing
> schemes of different buses.

I meant bus address. The name "intermediate address" was misused, sorry.

> Let's say we have a system with a PCI-ROOT device exposing a PCI bus,
> which in turn contains a PCI-BRIDGE which for the sake of argument let's
> say is a PCI-FOOBUS bridge.

> Let's just consider the MMIO hole for now, but IRQ is basically the same.
>
> The ranges property on a node describes a mapping from a "parent"
> address space into a "child" address space.
>
> For PCI-ROOT "parent" is the host physical address space and "child" is
> the PCI MMIO/IO/CFG address spaces.
>
> For PCI-BRIDGE "parent" is the PCI-ROOT's child address space (i.e. PCI
> MMIO/IO/CFG) and "child" is the FOOBUS address space.
>
> The inputs ("parents") of the PCI-BRIDGE ranges property must therefore
> by definition be valid outputs of the PCI-ROOT ranges property (i.e. be
> "child" addresses).
>
> Therefore if we map all of the input/parent ranges described by
> PCI-ROOT's ranges property we do not need to recurse further and
> consider PCI-BRIDGE's ranges property -- we've effectively already dealt
> with it.
>
> Does that make more sense?

I'm still confused, what prevents the PCI-ROOT device from being
connected to another bus?

In device tree format, that would give something like:

/ {

  soc {
    ranges = "...";

    pcie {
      ranges = "...";
    }
  }
}

The address retrieved from the PCI-ROOT would be a bus address and not a
physical address.


Regards,

--
Julien Grall



Re: [Xen-devel] [PATCH 5/5] xen: arm: handle PCI DT node ranges and interrupt-map properties

2015-02-18 Thread Julien Grall



On 18/02/2015 14:37, Ian Campbell wrote:
> I am reasonably convinced that for MMIO (+IO+CFG space) we should map
> everything as described by the ranges property of the top most node, it
> can be considered an analogue to / extension of the reg property of that
> node.

BTW, the CFG space is part of the "reg" property, which is already
mapped. The "ranges" property only covers the IO/MMIO BARs.


--
Julien Grall



Re: [Xen-devel] [PATCH 5/5] xen: arm: handle PCI DT node ranges and interrupt-map properties

2015-02-18 Thread Julien Grall



On 18/02/2015 15:05, Julien Grall wrote:
> On 18/02/2015 14:37, Ian Campbell wrote:
>> On Wed, 2015-02-18 at 14:19 +, Julien Grall wrote:
>> I think so, and we probably should consider the two cases separately
>> since the right answer could reasonably differ for different resource
>> types.
>>
>> I am reasonably convinced that for MMIO (+IO+CFG space) we should map
>> everything as described by the ranges property of the top most node, it
>> can be considered an analogue to / extension of the reg property of that
>> node.
>
> Agreed.
>
>> For IRQ I'm not so sure, it's possible that routing the IRQ at
>> pci_add_device time might be better, or fit in better with e.g. the ACPI
>> architecture, but mapping everything described in interrupt-map at start
>> of day is also an option and a reasonably simple one, probably.
>
> I agree that it's simple. Are we sure that we would be able to get a
> "better" solution later without modifying the kernel?
>
> If not, we may need to keep this solution forever.
>
>> This isn't to do with IPA->PA translations but to do with translations
>> between different PA addressing regimes, i.e. the different addressing
>> schemes of different buses.
>
> I meant bus address. The name "intermediate address" was misused, sorry.
>
>> Let's say we have a system with a PCI-ROOT device exposing a PCI bus,
>> which in turn contains a PCI-BRIDGE which for the sake of argument let's
>> say is a PCI-FOOBUS bridge.
>
>> Let's just consider the MMIO hole for now, but IRQ is basically the same.
>>
>> The ranges property on a node describes a mapping from a "parent"
>> address space into a "child" address space.
>>
>> For PCI-ROOT "parent" is the host physical address space and "child" is
>> the PCI MMIO/IO/CFG address spaces.
>>
>> For PCI-BRIDGE "parent" is the PCI-ROOT's child address space (i.e. PCI
>> MMIO/IO/CFG) and "child" is the FOOBUS address space.
>>
>> The inputs ("parents") of the PCI-BRIDGE ranges property must therefore
>> by definition be valid outputs of the PCI-ROOT ranges property (i.e. be
>> "child" addresses).
>>
>> Therefore if we map all of the input/parent ranges described by
>> PCI-ROOT's ranges property we do not need to recurse further and
>> consider PCI-BRIDGE's ranges property -- we've effectively already dealt
>> with it.
>>
>> Does that make more sense?
>
> I'm still confused, what prevents the PCI-ROOT device from being
> connected to another bus?
>
> In device tree format, that would give something like:
>
> / {
>
>   soc {
>     ranges = "...";
>
>     pcie {
>       ranges = "...";
>     }
>   }
> }

Actually the device tree of the x-gene board has something similar.

/ {
  soc {
    ranges;

    pcie {
      ranges = "...";
    }
  }
}

"ranges;" means that no translation is necessary. But nothing prevents
the "ranges" property from having a value set.


Regards,

--
Julien Grall



Re: [Xen-devel] [PATCH 5/5] xen: arm: handle PCI DT node ranges and interrupt-map properties

2015-02-18 Thread Ian Campbell
On Wed, 2015-02-18 at 15:05 +, Julien Grall wrote:
> 
> On 18/02/2015 14:37, Ian Campbell wrote:
> > On Wed, 2015-02-18 at 14:19 +, Julien Grall wrote:
> > I think so, and we probably should consider the two cases separately
> > since the right answer could reasonably differ for different resource
> > types.
> >
> > I am reasonably convinced that for MMIO (+IO+CFG space) we should map
> > everything as described by the ranges property of the top most node, it
> > can be considered an analogue to / extension of the reg property of that
> > node.
> 
> Agreed.
> 
> > For IRQ I'm not so sure, it's possible that routing the IRQ at
> > pci_add_device time might be better, or fit in better with e.g. the ACPI
> > architecture, but mapping everything described in interrupt-map at start
> > of day is also an option and a reasonably simple one, probably.
> 
> I agree that it's simple. Are we sure that we would be able to get a 
> "better" solution later without modifying the kernel?
> 
> If not, we may need to keep this solution forever.

True. I suppose feature flags would be one way out, but not a very
convenient one..

> > This isn't to do with IPA->PA translations but to do with translations
> > between different PA addressing regimes, i.e. the different addressing
> > schemes of different buses.
> 
> I meant bus address. The name "intermediate address" was misused, sorry.
> 
> > Let's say we have a system with a PCI-ROOT device exposing a PCI bus,
> > which in turn contains a PCI-BRIDGE which for the sake of argument let's
> > say is a PCI-FOOBUS bridge.
> 
> > Let's just consider the MMIO hole for now, but IRQ is basically the same.
> >
> > The ranges property on a node describes a mapping from a "parent"
> > address space into a "child" address space.
> >
> > For PCI-ROOT "parent" is the host physical address space and "child" is
> > the PCI MMIO/IO/CFG address spaces.
> >
> > For PCI-BRIDGE "parent" is the PCI-ROOT's child address space (i.e. PCI
> > MMIO/IO/CFG) and "child" is the FOOBUS address space.
> >
> > The inputs ("parents") of the PCI-BRIDGE ranges property must therefore
> > by definition be valid outputs of the PCI-ROOT ranges property (i.e. be
> > "child" addresses).
> >
> > Therefore if we map all of the input/parent ranges described by
> > PCI-ROOT's ranges property we do not need to recurse further and
> > consider PCI-BRIDGE's ranges property -- we've effectively already dealt
> > with it.
> >
> > Does that make more sense?
> 
> I'm still confused, what prevents the PCI-ROOT device from being
> connected to another bus?
>
> In device tree format, that would give something like:
> 
> / {
> 
>   soc {
>     ranges = "...";
> 
>     pcie {
>       ranges = "...";
>     }
>   }
> }
> 
> The address retrieved from the PCI-ROOT would be a bus address and not a 
> physical address.

Hrm, nothing, I see what you are getting at now.

Either soc has a device_type property which we understand, in which case
we would handle it and stop recursing or (more likely for an soc) it
does not, in which case we would handle the pcie ranges property, but it
needs to be translated through the ranges property of soc, which the
patch doesn't do and probably it should.

Ian.
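
The missing step described above, translating the pcie ranges through the ranges property of soc, amounts to walking up the tree and applying each ancestor's windows in turn. Here is a minimal sketch under simplified assumptions: struct bus_node, struct dt_range and translate_to_root() are hypothetical and are not the helpers in Xen's common/device_tree.c. An empty "ranges;" is treated as an identity mapping, and a node without any ranges property as untranslatable.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One decoded window of a "ranges" property (simplified, hypothetical). */
struct dt_range {
    uint64_t child_base;
    uint64_t parent_base;
    uint64_t size;
};

/* Hypothetical bus node: only what the translation walk needs. */
struct bus_node {
    const struct bus_node *parent;   /* NULL for the root node "/" */
    bool has_ranges;                 /* false: no "ranges" property */
    const struct dt_range *ranges;   /* decoded windows */
    size_t nr_ranges;                /* 0 with has_ranges: empty "ranges;" */
};

/*
 * Translate an address on node's child bus up to the root address space,
 * applying the ranges of node and of every ancestor below the root.
 */
static bool translate_to_root(const struct bus_node *node, uint64_t addr,
                              uint64_t *out)
{
    for (; node->parent != NULL; node = node->parent) {
        bool hit = false;

        if (!node->has_ranges)
            return false;            /* bus is not memory mapped */
        if (node->nr_ranges == 0)
            continue;                /* "ranges;": 1:1 mapping */

        for (size_t i = 0; i < node->nr_ranges && !hit; i++) {
            const struct dt_range *r = &node->ranges[i];

            if (addr >= r->child_base && addr - r->child_base < r->size) {
                addr = r->parent_base + (addr - r->child_base);
                hit = true;
            }
        }
        if (!hit)
            return false;            /* address outside every window */
    }

    *out = addr;
    return true;
}
```

With a pcie node under soc, a PCI address is first rewritten into a soc bus address and then into a host physical address. With the x-gene style empty "ranges;" on soc, the second step leaves the address unchanged.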




[Xen-devel] [PATCH] xen: arm64: more useful logging on bad trap.

2015-02-18 Thread Ian Campbell
Dump the register state before panicking so we have some clue where the
issue occurred. Also decode the ESR register a bit to save having to
grab a pen and paper.

ESR_EL2 is a 32-bit register, so use READ_SYSREG32 not READ_SYSREG64, as
we already do correctly in the main trap handler.

While here notice that do_trap_serror is never called and remove it.

Signed-off-by: Ian Campbell 
Cc: jint...@cs.columbia.edu
---
Jintack, since you have a system which is exhibiting SError issues I
wonder if I could prevail on you to give this patch a try on your
system and report on the output. I've only compile tested this myself.
---
 xen/arch/arm/arm64/traps.c |   13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/xen/arch/arm/arm64/traps.c b/xen/arch/arm/arm64/traps.c
index 1693b5d..89b8eb3 100644
--- a/xen/arch/arm/arm64/traps.c
+++ b/xen/arch/arm/arm64/traps.c
@@ -24,11 +24,6 @@
 
 #include 
 
-asmlinkage void do_trap_serror(struct cpu_user_regs *regs)
-{
-panic("Unhandled serror trap");
-}
-
 static const char *handler[]= {
 "Synchronous Abort",
 "IRQ",
@@ -38,11 +33,13 @@ static const char *handler[]= {
 
 asmlinkage void do_bad_mode(struct cpu_user_regs *regs, int reason)
 {
-uint64_t esr = READ_SYSREG64(ESR_EL2);
-printk("Bad mode in %s handler detected, code 0x%08"PRIx64"\n",
-   handler[reason], esr);
+union hsr hsr = { .bits = READ_SYSREG32(ESR_EL2) };
+printk("Bad mode in %s handler detected, code 0x%08"PRIx32","
+   " EC=%"PRIx32", IL=%"PRIx32" ISS=%"PRIx32"\n",
+   handler[reason], hsr.bits, hsr.ec, hsr.len, hsr.iss);
 
 local_irq_disable();
+show_execution_state(regs);
 panic("bad mode");
 }
 
-- 
1.7.10.4
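
For reference, the fields printed by the patch above can also be extracted with plain shifts and masks. This is only a sketch: struct esr_fields and esr_decode() are hypothetical names, whereas the patch itself relies on the bitfields of Xen's union hsr. The bit positions are the architectural ESR_ELx layout: EC in bits [31:26], IL in bit [25] and ISS in bits [24:0].

```c
#include <stdint.h>

/* Decoded view of a 32-bit ESR_ELx value (hypothetical struct name). */
struct esr_fields {
    uint32_t ec;   /* Exception Class, bits [31:26] */
    uint32_t il;   /* Instruction Length, bit [25] */
    uint32_t iss;  /* Instruction Specific Syndrome, bits [24:0] */
};

static struct esr_fields esr_decode(uint32_t esr)
{
    struct esr_fields f;

    f.ec  = (esr >> 26) & 0x3f;
    f.il  = (esr >> 25) & 0x1;
    f.iss = esr & 0x1ffffff;

    return f;
}
```

For example, 0x96000045 decodes to EC=0x25 (data abort taken without a change in exception level), IL=1, ISS=0x45, and an EC of 0x2f indicates an SError interrupt.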




Re: [Xen-devel] [PATCH 5/5] xen: arm: handle PCI DT node ranges and interrupt-map properties

2015-02-18 Thread Ian Campbell
On Wed, 2015-02-18 at 15:13 +, Julien Grall wrote:
> 
> On 18/02/2015 14:37, Ian Campbell wrote:
> > I am reasonably convinced that for MMIO (+IO+CFG space) we should map
> > everything as described by the ranges property of the top most node, it
> > can be considered an analogue to / extension of the reg property of that
> > node.
> 
> BTW, the CFG space is part of the "reg" property, which is already 
> mapped. The "ranges" property only covers the IO/MMIO BARs.

Right, I keep forgetting...

Ian.





[Xen-devel] [qemu-mainline test] 34656: regressions - FAIL

2015-02-18 Thread xen . org
flight 34656 qemu-mainline real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/34656/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemuu-debianhvm-amd64   7 debian-hvm-install fail REGR. vs. 33480
 test-amd64-i386-xl-qemuu-winxpsp3          7 windows-install    fail REGR. vs. 33480
 test-amd64-amd64-xl-winxpsp3               7 windows-install    fail REGR. vs. 33480
 test-amd64-i386-xl-winxpsp3                7 windows-install    fail REGR. vs. 33480
 test-amd64-i386-qemuu-rhel6hvm-intel       7 redhat-install     fail REGR. vs. 33480
 test-amd64-i386-xl-win7-amd64              7 windows-install    fail REGR. vs. 33480
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1   7 windows-install    fail REGR. vs. 33480
 test-amd64-amd64-xl-win7-amd64             7 windows-install    fail REGR. vs. 33480
 test-amd64-i386-xl-qemuu-ovmf-amd64        7 debian-hvm-install fail REGR. vs. 33480
 test-amd64-i386-xl-qemuu-win7-amd64        7 windows-install    fail REGR. vs. 33480
 test-amd64-i386-xl-winxpsp3-vcpus1         7 windows-install    fail REGR. vs. 33480
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  7 debian-hvm-install fail REGR. vs. 33480
 test-amd64-amd64-xl-qemuu-ovmf-amd64       7 debian-hvm-install fail REGR. vs. 33480
 test-amd64-i386-freebsd10-i386             8 guest-start        fail REGR. vs. 33480
 test-amd64-i386-freebsd10-amd64            8 guest-start        fail REGR. vs. 33480
 test-amd64-amd64-xl-qemuu-winxpsp3         7 windows-install    fail REGR. vs. 33480
 test-amd64-i386-qemuu-rhel6hvm-amd         7 redhat-install     fail REGR. vs. 33480
 test-amd64-i386-rhel6hvm-amd               7 redhat-install     fail REGR. vs. 33480
 test-amd64-i386-rhel6hvm-intel             7 redhat-install     fail REGR. vs. 33480
 test-amd64-amd64-xl-qemuu-win7-amd64       7 windows-install    fail REGR. vs. 33480

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt 13 guest-destroy                   fail blocked in 33480
 test-amd64-i386-libvirt  13 guest-destroy                   fail blocked in 33480
 test-amd64-i386-pair     17 guest-migrate/src_host/dst_host fail like 33480

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel             9 guest-start           fail never pass
 test-armhf-armhf-xl-sedf                 10 migrate-support-check fail never pass
 test-armhf-armhf-xl                      10 migrate-support-check fail never pass
 test-armhf-armhf-libvirt                 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-midway               10 migrate-support-check fail never pass
 test-armhf-armhf-xl-sedf-pin             10 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu            10 migrate-support-check fail never pass
 test-amd64-i386-libvirt                  10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt                 10 migrate-support-check fail never pass
 test-amd64-amd64-xl-pvh-amd               9 guest-start           fail never pass
 test-armhf-armhf-xl-credit2              10 migrate-support-check fail never pass
 test-amd64-amd64-xl-pcipt-intel           9 guest-start           fail never pass
 test-amd64-i386-xl-qemut-win7-amd64      14 guest-stop            fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3       14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64     14 guest-stop            fail never pass
 test-amd64-i386-xl-qemut-winxpsp3        14 guest-stop            fail never pass

version targeted for testing:
 qemuu cd2d5541271f1934345d8ca42f5fafff1744eee7
baseline version:
 qemuu 1e42c353469cb58ca4f3b450eea4211af7d0b147


People who touched revisions under test:
  Alberto Garcia 
  Alex Suykov 
  Alex Williamson 
  Alexander Graf 
  Alexey Kardashevskiy 
  Alistair Francis 
  Amit Shah 
  Andreas Färber 
  Aurelien Jarno 
  Avi Kivity 
  Bastian Koppelmann 
  Ben Taylor 
  Benjamin Herrenschmidt 
  Bharata B Rao 
  Blue Swirl 
  Chen Fan 
  Chen Gang 
  Chen Gang S 
  Christian Borntraeger 
  Christophe Lyon 
  Claudio Fontana 
  Cornelia Huck 
  Daniel P. Berrange 
  Denis V. Lunev 
  Dinar Valeev 
  Don Koch 
  Don Slutz 
  Dr. David Alan Gilbert 
  Ed Swierk 
  Eduardo Habkost 
  Eduardo Otubo 
  Fabrice Bellard 
  Fam Zheng 
  Felix Janda 
  Francesco Romani 
  Frank Blaschka 
  Gerd Hoffmann 
  Gonglei 
  Greg Bellows 
  Greg Kurz 
  Guan Xuetao 
  Igor Mammedov 
  Ildar Isaev 
  Jan Kiszka 
  Jason Wang 
  Jeff Cody 
  Jiri Slaby 
  John Arbuckle 
  Juan Quintela 
  Kevin Wolf 
  Kirill Batuzov 
  Laszlo Ersek 
  Laurent Desnogues 
  Leon Yu 
  Marc-André Lureau 
  Mark Cave-Ayland 
  Markus Armbruster 
  Markus Armbruster 
  Max Filippov 
  Max Reitz 
  Maxim Ostapenko 
  Michael S. Tsirkin 
  Michael Tokarev 
  Paolo Bonzini 
  Paul Brook 
  Paul Durrant 
  Peter Lieven 
  Peter Maydell 
  Peter Wu 
  Pranavkumar Sawargaonkar 

Re: [Xen-devel] BUG - xen-netback stats interface limited to 32-bit values on 64 bit systems

2015-02-18 Thread Ian Campbell
create ^
thanks

On Thu, 2015-02-12 at 20:47 +0100, Atom2 wrote:
> Hi guys,
> I am forwarding this message after initially having confirmed with Ian 
> Campbell on the user list that there's really an issue - please see 
> further below.
> 
> I am currently running xen-4.3.3 on gentoo (dom0 is based on kernel 
> 3.17.7) and I am happy to help out by applying and testing patches on my 
> version.

FWIW based on e00f85bec0a9 this seems like it would be a reasonably
straightforward introductory kernel hacking exercise (mostly copying
e00f85bec0a9). If you are interested I could help guide/mentor you
through it.

Ian.

> 
> Thanks Atom2
> 
> Forwarded Message
> Subject: Re: [Xen-users] BUG? vif RX/TX byte counters limited to 32-bit
> values on 64 bit systems
> Date: Mon, 9 Feb 2015 11:09:08 +
> From: Ian Campbell
> To: Atom2
> CC: xen-us...@lists.xen.org
> 
> On Mon, 2015-02-09 at 00:37 +0100, Atom2 wrote:
> > Hi guys,
> > I recently experienced that ifconfig executed within dom0 wraps around
> > byte counters after reaching the 32-bit max value (2^32) for XEN vif
> > interfaces. Specifically I was able to observe this for a XEN vif
> > interface connected to a HVM domU running FreeBSD 10.0.
> 
> Looks like xen-netback was never converted to the 64 bit stats interface
> like e.g. netfront was (see commit e00f85bec0a9 in ~v3.1).
> 
> I could have sworn netback changed eons ago -- I was clearly mistaken.
> 
> Ian.
> 
> 
> 
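
Until the counters themselves are 64 bits wide, a monitoring tool in dom0 can work around the wrap described above by accumulating deltas between successive samples. The following is only a sketch with hypothetical names, and it assumes the counter is sampled often enough that it wraps at most once between two reads.

```c
#include <stdint.h>

/* 64-bit running total built from a 32-bit counter that wraps at 2^32. */
struct counter64 {
    uint32_t last;   /* previous 32-bit sample */
    uint64_t total;  /* accumulated 64-bit byte count */
};

static void counter64_update(struct counter64 *c, uint32_t sample)
{
    /* Unsigned subtraction is modulo 2^32, so a single wrap between
     * samples still yields the right delta. */
    c->total += (uint32_t)(sample - c->last);
    c->last = sample;
}
```

Starting from a sample of 0xFFFFFF00, a later sample of 0x00000100 adds 0x200 bytes to the total rather than making it appear to go backwards.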





Re: [Xen-devel] [PATCH v2 3/3] xen/arm: allow console=hvc0 to be omitted for guests

2015-02-18 Thread Ian Campbell
On Wed, 2015-02-18 at 13:51 +, Julien Grall wrote:
> From: Ard Biesheuvel 
> 
> This patch registers hvc0 as the preferred console if no console
> has been specified explicitly on the kernel command line.
> 
> The purpose is to allow platform agnostic kernels and boot images
> (such as distro installers) to boot in a Xen/ARM domU without the
> need to modify the command line by hand.
> 
> Signed-off-by: Ard Biesheuvel 
> Reviewed-by: Julien Grall 

Acked-by: Ian Campbell 





[Xen-devel] Processed: Re: BUG - xen-netback stats interface limited to 32-bit values on 64 bit systems

2015-02-18 Thread xen
Processing commands for x...@bugs.xenproject.org:

> create ^
Created new bug #49 rooted at `<54dd036e.4060...@web2web.at>'
Title: `Re: BUG - xen-netback stats interface limited to 32-bit values on 64 
bit systems'
> thanks
Finished processing.

Modified/created Bugs:
 - 49: http://bugs.xenproject.org/xen/bug/49 (new)

---
Xen Hypervisor Bug Tracker
See http://wiki.xen.org/wiki/Reporting_Bugs_against_Xen for information on 
reporting bugs
Contact xen-bugs-ow...@bugs.xenproject.org with any infrastructure issues



Re: [Xen-devel] [PATCH 5/5] xen: arm: handle PCI DT node ranges and interrupt-map properties

2015-02-18 Thread Julien Grall



On 18/02/2015 15:18, Ian Campbell wrote:

On Wed, 2015-02-18 at 15:05 +, Julien Grall wrote:


On 18/02/2015 14:37, Ian Campbell wrote:

On Wed, 2015-02-18 at 14:19 +, Julien Grall wrote:
I think so, and we probably should consider the two cases separately
since the right answer could reasonably differ for different resource
types.

I am reasonably convinced that for MMIO (+IO+CFG space) we should map
everything as described by the ranges property of the top most node, it
can be considered an analogue to / extension of the reg property of that
node.


Agreed.


For IRQ I'm not so sure, it's possible that routing the IRQ at
pci_add_device time might be better, or fit in better with e.g. the ACPI
architecture, but mapping everything described in interrupt-map at start
of day is also an option and a reasonably simple one, probably.


I agree that it's simple. Are we sure that we would be able to get a
"better" solution later without modifying the kernel?

If not, we may need to keep this solution forever.


True. I suppose feature flags would be one way out, but not a very
convenient one..


This isn't to do with IPA->PA translations but to do with translations
between different PA addressing regimes. i.e. the different addressing
schemes of different buses.


I meant bus address. The name "intermediate address" was misused, sorry.


Let's say we have a system with a PCI-ROOT device exposing a PCI bus,
which in turn contains a PCI-BRIDGE which, for the sake of argument, let's
say is a PCI-FOOBUS bridge.



Let's just consider the MMIO hole for now, but IRQ is basically the same.

The ranges property on a node describes a mapping from a "parent"
address space into a "child" address space.

For PCI-ROOT "parent" is the host physical address space and "child" is
the PCI MMIO/IO/CFG address spaces.

For PCI-BRIDGE "parent" is the PCI-ROOT's child address space (i.e. PCI
MMIO/IO/CFG) and "child" is the FOOBUS address space.

The inputs ("parents") of the PCI-BRIDGE ranges property must therefore
by definition be valid outputs of the PCI-ROOT ranges property (i.e. be
"child" addresses).

Therefore if we map all of the input/parent ranges described by
PCI-ROOT's ranges property we do not need to recurse further and
consider PCI-BRIDGE's ranges property -- we've effectively already dealt
with it.

Does that make more sense?
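
The containment argument above can be sketched in C. This is an
illustration only, not code from the patch or from Xen: struct dt_range,
child_window_covers() and bridge_inputs_covered() are hypothetical names,
and each ranges entry is assumed to be already decoded into a
(child base, parent base, size) triple.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one decoded "ranges" entry. */
struct dt_range {
    uint64_t child_base;   /* start in the node's child address space */
    uint64_t parent_base;  /* start in the node's parent address space */
    uint64_t size;         /* length of the window in bytes */
};

/* True if [addr, addr + len) lies inside one child-side window of r[]. */
static bool child_window_covers(const struct dt_range *r, size_t n,
                                uint64_t addr, uint64_t len)
{
    for ( size_t i = 0; i < n; i++ )
    {
        /* Written with subtractions so that addr + len cannot overflow. */
        if ( addr >= r[i].child_base &&
             len <= r[i].size &&
             addr - r[i].child_base <= r[i].size - len )
            return true;
    }
    return false;
}

/*
 * Check that every parent-side window of the bridge is a valid child-side
 * address range of the root. If so, mapping the root's windows already
 * covers everything behind the bridge.
 */
static bool bridge_inputs_covered(const struct dt_range *root, size_t nroot,
                                  const struct dt_range *bridge,
                                  size_t nbridge)
{
    for ( size_t i = 0; i < nbridge; i++ )
    {
        if ( !child_window_covers(root, nroot,
                                  bridge[i].parent_base, bridge[i].size) )
            return false;
    }
    return true;
}
```

If bridge_inputs_covered() holds, there is nothing further to map for
PCI-BRIDGE once PCI-ROOT's windows are mapped, which is the point made
above.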


I'm still confused: what prevents the PCI-ROOT device from being
connected to another bus?

In device tree format, that would give something like:

/ {
    soc {
        ranges = "...";

        pcie {
            ranges = "...";
        }
    }
}

The address retrieved from the PCI-ROOT would be a bus address and not a
physical address.


Hrm, nothing, I see what you are getting at now.

Either soc has a device_type property which we understand, in which case
we would handle it and stop recursing or (more likely for an soc) it
does not, in which case we would handle the pcie ranges property, but it
needs to be translated through the ranges property of soc, which the
patch doesn't do and probably it should.
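
A minimal sketch of that extra translation step, assuming the ranges
entries of each node are already decoded: an address taken from the pcie
node is pushed through every ancestor's ranges until the top of the tree
is reached. struct dt_bus, translate_one() and translate_to_host() are
hypothetical names for illustration, not the helpers in
common/device_tree.c, and the sketch ignores details such as an empty
ranges property meaning an identity mapping.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded "ranges" entry: child window -> parent window. */
struct dt_range {
    uint64_t child_base;
    uint64_t parent_base;
    uint64_t size;
};

/* Hypothetical bus node: its decoded ranges plus a link to its parent. */
struct dt_bus {
    const struct dt_bus *parent;   /* NULL once the top level is reached */
    const struct dt_range *ranges;
    size_t nr_ranges;
};

/* Translate *addr from the child space of 'bus' into its parent space. */
static bool translate_one(const struct dt_bus *bus, uint64_t *addr)
{
    for ( size_t i = 0; i < bus->nr_ranges; i++ )
    {
        const struct dt_range *r = &bus->ranges[i];

        if ( *addr >= r->child_base && *addr - r->child_base < r->size )
        {
            *addr = r->parent_base + (*addr - r->child_base);
            return true;
        }
    }
    return false; /* not covered by any window of this bus */
}

/* Walk up the tree, applying each level's ranges in turn. */
static bool translate_to_host(const struct dt_bus *bus, uint64_t *addr)
{
    for ( ; bus != NULL; bus = bus->parent )
    {
        if ( !translate_one(bus, addr) )
            return false;
    }
    return true;
}
```

With a pcie node under soc, translate_to_host() first applies the pcie
ranges and then the soc ranges, so the result is a host physical address
rather than an soc bus address.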


The code to do it is quite complicated and hard to maintain (it's
actually a copy of the Linux one). It would be good if you could reuse
the translation functions in common/device_tree.c.


I think we may have the same problem for interrupts too.

Regards,

--
Julien Grall



[Xen-devel] [PATCH v3] vsprintf: Make sure argument to %pX specifier is valid

2015-02-18 Thread Boris Ostrovsky
If an invalid pointer (i.e. something smaller than HYPERVISOR_VIRT_START)
is passed for the %*ph/%pv/%ps/%pS format specifiers, then print the value
of the pointer in parentheses.

For example:

 struct vcpu *v0 = NULL;
 struct vcpu *v1 = (void *)0xffUL;
 unsigned val = 0xab;
 unsigned *ptr = &val;
 unsigned *badptr = (void *)0xab;
 printk("v0 = %pv, v1 = %pv, curr = %pv\n", v0, v1, current);
 printk("badptr = %*ph, ptr = %*ph\n", 1, badptr, 1, ptr);

will produce
 v0 = (0), v1 = (ff), curr = d0v3
 badptr = (ab), ptr = ab

Signed-off-by: Boris Ostrovsky 
---
 xen/common/vsprintf.c |   23 ++-
 1 files changed, 22 insertions(+), 1 deletions(-)

v3:
 * Print value of the bad pointer in parentheses.
   (I understand Andrew's dislike of additional switch but I
   think this is the cleanest way)

v2:
 * Print "(NULL)" instead of specifier-specific string
 * Consider all addresses under HYPERVISOR_VIRT_START as invalid. (I think
   this is true for both x86 and ARM but I don't have an ARM platform to
   test on).

diff --git a/xen/common/vsprintf.c b/xen/common/vsprintf.c
index 065cc42..5ab61a1 100644
--- a/xen/common/vsprintf.c
+++ b/xen/common/vsprintf.c
@@ -269,7 +269,28 @@ static char *pointer(char *str, char *end, const char **fmt_ptr,
 {
     const char *fmt = *fmt_ptr, *s;
 
-    /* Custom %p suffixes. See XEN_ROOT/docs/misc/printk-formats.txt */
+    /*
+     * For custom %p suffixes (see XEN_ROOT/docs/misc/printk-formats.txt)
+     * if arg pointer is bogus then print pointer value in parentheses.
+     */
+    if ( (unsigned long)arg < HYPERVISOR_VIRT_START )
+    {
+        switch (fmt[1])
+        {
+        case 'h':
+        case 's':
+        case 'S':
+        case 'v':
+            ++*fmt_ptr;
+            if ( str < end )
+                *str++ = '(';
+            str = number(str, end, (unsigned long)arg, 16, -1, -1, ZEROPAD);
+            if ( str < end )
+                *str++ = ')';
+            return str;
+        }
+    }
+
     switch ( fmt[1] )
     {
     case 'h': /* Raw buffer as hex string. */
-- 
1.7.1
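
The behaviour described in the commit message can be tried outside the
hypervisor with a small userspace sketch. This is not the patch itself:
SKETCH_VIRT_START is an assumed stand-in for HYPERVISOR_VIRT_START (whose
real value is architecture-specific), format_if_bogus() is a hypothetical
helper, and snprintf() stands in for Xen's internal number() routine.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed threshold for this sketch only; not Xen's actual constant. */
#define SKETCH_VIRT_START 0xffff800000000000ULL

/*
 * For the %p suffixes that dereference their argument ('h', 's', 'S',
 * 'v'), write "(<hex value>)" into buf when addr is below the threshold
 * and return 1. Otherwise leave buf untouched and return 0, meaning the
 * caller should go on to format the pointer normally.
 */
static int format_if_bogus(char *buf, size_t len, char suffix, uint64_t addr)
{
    if ( addr >= SKETCH_VIRT_START )
        return 0;

    switch ( suffix )
    {
    case 'h':
    case 's':
    case 'S':
    case 'v':
        snprintf(buf, len, "(%llx)", (unsigned long long)addr);
        return 1;
    }

    return 0; /* plain %p and other suffixes are left alone */
}
```

Calling format_if_bogus() with suffix 'v' and the values 0 and 0xff fills
the buffer with the same "(0)" and "(ff)" strings shown in the example
output of the commit message.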



